Machine Learning (ML) Service - ProceZeus Documentation

1. Overview

The machine learning service is responsible for predicting the outcomes of a user's case.

Outcomes can either be categorized as either being True/False or by a numerical value. Whether a given outcome is boolean or integer is evaluated by a human and then given to the system beforehand (See section 1.6). Therefore, this sub-system makes use of both classifiers and regressors to make predictions. The inputs for both the classifier the and regressor are the facts obtained by the user's inputs. An array of outcomes is then returned.

The input and output data are all represented numerically despite having the potential to be boolean values. Below illustrates how values are treated:

0 --> False / Null 1 --> True (n > 1) --> True AND Numerical

Numerical Values consist of:

Dates / Time (in months)
Money (in $)

The inputs are stored in a numpy array consisting of only integers with the possible values listed in section 1.1. Every index of the array represents a different fact/input data point which will be used by the machine learning. The indexes of the facts are determined once the precedents are tagged (they are subject to change orders upon re-tagging the data). An input array will look as such:

[fact_1, fact_2, ..., fact_n]

Here is an example to retrieve the labels for each column:

from feature_extraction.post_processing.regex.regex_tagger import TagPrecedents

indexes = TagPrecedents().get_intent_index()

# print sample of the content
for i in index['outcomes_vector'][:3]:
    print(i)

output:

(0, 'additional_indemnity_money', 'bool')
(1, 'declares_resiliation_is_correct', 'bool')
(2, 'landlord_serious_prejudice', 'bool')

structure for 'indexes' variable:

{
    'outcomes_vector': [
        (array_index, column_label, column_type),
        (array_index, column_label, column_type)
    ],
    'facts_vector': [
        (array_index, column_label, column_type),
        (array_index, column_label, column_type)
    ]
}

Similarly to section 1.2, the output will be an array of integers of the size of all the number of outcomes supported by the system. Please refer to section 1.1 for other inquiries.

A multiclassifier is used to predict all outcomes. In the background, SkLearn uses a different estimator per outcome in order to perform this task. When obtaining a prediction, ALL outcomes are either classified as True or False. Even the numerical outcomes are classified as such. If an outcome is expected to be a numerical value AND that outcome is True then the input is passed to the appropriate regressor in order to predict the outcome's integer value. If the previous condition isn't met then no further data manipulation is necessary for a given outcome and the classifier's prediction is simply returned for this column.

Adding a new classifier New classifiers will be automatically trained upon adding regexes. See section 1.6.

The regressors are only used if the classifier predicted an outcome as True. The reason for this implementation is because the regressors are trained on bias data where we know the outcome was True. Therefore the input data must also be biased towards the same end goal.

During training, only for regression, the average values of every fact of the data set is obtained. The vector will look as such:

[average_column_1, average_column_2, ..., average_column_n]

This vector is kept in binary format and can be retrieved this way:

from util.file import Load
mean_facts_vector = Load.load_binary('model_metrics.bin')['regressor'][<name of the regressor>]['mean_facts_vector']

Regression fine tuning When making a regressive prediction, the user's input is entered as an array of numerical values as in section 1.2.

Wherever a 0 is encountered in the user's input, we replace it with the average value of it's column.The purpose of this strategy is to predict more accurate results when the regressor is used. When a prediction is performed with missing input we then replace that missing input with it's average value to get a better fit on the curve.
During training, outliers in the dataset are removed. Outliers are determined by:

abs(outcome - average_of_outcomes) > (2 * std_of_outcomes)

Adding a new regressor The regressor's estimators are crafted manually as opposed to using the SkLearn's wrapper as in section 1.4. Because the regressors require much more discreet attention, this approach was necessary. A custom wrapper is instead written, and every new regressor can inherit the AbstractRegressor Class.

Code new regressor (inherit abstract_regressor.py)
Update multi_output_regression.py to accomodate new class

Adding new columns is fairly simply. In the feature_extraction/post_processing/regex/regex_lib.py file simply append your regex to the regex_facts or regex_outcomes list. The syntax is the following:

regex_facts = [
    (
        <column_label>, [
            re.compile(<regex_1>, re.IGNORECASE),
            re.compile(<regex_2>, re.IGNORECASE),
            re.compile(<regex_n>, re.IGNORECASE)
        ],
        <data_type>),
    ),
    (
        <column_label>, [
            re.compile(<regex_1>, re.IGNORECASE),
            re.compile(<regex_2>, re.IGNORECASE),
            re.compile(<regex_n>, re.IGNORECASE)
        ],
        <data_type>),
    ),
]

Type as many regular expressions as needed to cover all the dataset. Upon tagging the data a percentage of lines tagged will be displayed.

Note: <data_type> are the following strings:

"BOOLEAN"
"MONEY"
"DATE"

The newly added columns in the regex_lib.py file will then automatically be used the next time the machine learning performs its training on the condition that the data has be re-post-processed. Be sure to create a regressor if you want to predict "DATE" or "MONEY" though (See section 1.5).

2. DATA

All persistent machine learning data are stored as binaries. In order to centralize this information it is advised to upload the models on a server. These models may then be fetched in the init.py script in the source directory (do not confuse with __init__.py script). Simply append your download link to the binary_urls list found in this file.

To load any binary files, first make sure it is stored in the binary/data/ folder. This should be performed automatically by the init.py. Then simply use the following:

from util.file import Load

Load.load_binary(<binary_file_name>)

To save a binary file use the following:

from util.file import Save

Save().save_binary(<desired_binary_file_name>, model)

The output directory will be binary/data/ by default.

Some global variables are listed in util/constant.py

2.4 Binary file content

classifier_labels.bin

{
   outcome_index_0 <int>: (
       column_label <str>,
       column_type <str>
   ),
   outcome_index_n <int>: (
       column_label <str>,
       column_type <str>
   ),
}

model_metrics.bin

{
    'data_set':{
        'size': <int>
    },
    'classifier':{
        classifier_name_0 <str>: {
            'prediction_accuracy': <float>
        },
        classifier_name_n <str>: {
            'prediction_accuracy': <float>
        }
    },
    'regressor':{
        regressor_name_0 <str> :{
            'std': <float>,
            'variance': <float>,
            'mean_facts_vector': <numpy.array>
        },
        regressor_name_n <str> :{
            'std': <float>,
            'variance': <float>,
            'mean_facts_vector': <numpy.array>
        }
    }

multi_class_svm_model.bin Used to predict classifier results

from util.file import Load
from sklearn.preprocessing import binarize

model = Load.load_binary("multi_class_svm_model.bin")
classifier_labels = Load.load_binary('classifier_labels.bin')
input_vector = [fact_1, fact_2, fact_n, ...]
data = binarize([input_vector], threshold=0)
prediction = model.predict(data)

precedent_vectors.bin

{
    <precedent_id> <str>:{
        'outcomes_vector': numpy.array,
        'facts_vector': numpy.array,
        'file_number': <str>,
        'name': AZ-********.txt <str>
    }
}

similarity_case_numbers.bin An array of all case numbers. This is used to map the indices (returned by the similarity model) to case numbers

[
    'AZ-XXXXXX',
    'AZ-XXXXXX',
    'AZ-XXXXXX',
    'AZ-XXXXXX'
    ...
]

similarity_model.bin

Case similarity comparator. Uses NearestNeighbour algorithm. Set to return the 5 nearest neighbours.

Input: A vector, which is the concatenation of the vector containing facts and the vector containing outcomes Output: The indices (which have a direct mapping to case numbers using similarity_case_numbers [see above]) of the 5 most similar cases

from util.file import Load

model = Load.load_binary("similarity_model.bin")

facts_vector = [fact_1, fact_2, fact_n, ...]
outcomes_vector = [outcome_1, outcome_2, outcome_n, ...]
input_vector = facts_vector + outcomes_vector

model.kneighbors(input_vector)

*_scaler.bin Every machine learning model requires a scaler to transform the data into values which will exponentially increase training time.

*_regressor.bin Models used to predict regressive results

from util.file import Load
from keras.models import load_model
import os

file_path = os.path.join(Path.binary_directory, '<regressor_name>')
regressor = load_model(file_path)
scaler = Load.load_binary('<your_scaler>')
model = AbstractRegressor._create_pipeline(scaler, regressor)
input_data = [fact_1, fact_2, ..., fact_n]
prediction = model.predict([input_data])

3. Installation Instructions

Add Cyberjustice Lab username as environment variables: export CJL_USER={USERNAME} either to your .bashrc or run it as a command
Add Cyberjustice Lab password as environment variables: export CJL_PASS={PASSWORD} either to your .bashrc or run it as a command
Run pip3 install -r requirements.txt
Run pip3 install -r requirements_test.txt

4. File Structure


----| data <all data input and output>
--------| raw
------------| text_bk <extract precedents here>

--------| binary <all saved binarized model/data>
--------| cache <temp files>
--------| test <used for unit testing>

----| feature_extraction <all data manipulation before supervised training>
--------| feature_extraction.py <driver for feature extraction (using 3 drivers above)>
--------| pre_processing
------------| pre_processing_driver
----------------| filter_precedent
--------------------| precedent_directory_cleaner.py
--------| post_processing
------------| post_processing_driver.py <driver for post_processing>
----------------| regex
--------------------| regex_entity_extraction.py
--------------------| regex_lib.py
--------------------| regex_tagger.py

----| model_learning <supervised training>
------------| classifier
----------------| classifier_dirver.py
----------------| multi_output
--------------------| multi_class_svm.py
------------| regression
----------------| regression_driver.py
----------------| single_output_regression
--------------------| abtract_regressor.py
--------------------| tenant_pays_landlord.py
--------------------| additional_indemnity.py
----------------| multi_output
--------------------| multi_output_regression.py
------------| similar_finder
----------------| similar_finder.py

----| util <common tool>
------------| log.py <logging tool>
------------| file.py <file save and load>
------------| constant.py <global variables>

----| web
--------| ml_controller.py

init.py
main.py <driver for the pipeline (feature extraction + model training>

5. ML API

Predict the outcome based on given facts and demands. Returns an array of predicted outcomes as well as similar precedents. The precedents have distances assigned to them. The lower the distance, the more similar it is.

URL : /predict

Method : POST

Data constraints

Provide facts_vector and demands_vector, with key values for each fact/demand.

{
  "facts" : {
    "absent" : 1,
    "apartment_impropre" : 0,
    "apartment_infestation" : 1,
    "asker_is_landlord" : 1,
    "asker_is_tenant" : 1,
    "bothers_others" : 1,
    "disrespect_previous_judgement" : 1,
    "incorrect_facts" : 1,
    "landlord_inspector_fees" : 1,
    "landlord_notifies_tenant_retake_apartment" : 1,
    "landlord_pays_indemnity" : 1,
    "landlord_prejudice_justified" : 1,
    "landlord_relocation_indemnity_fees" : 1,
    "landlord_rent_change" : 1,
    "landlord_rent_change_doc_renseignements" : 1,
    "landlord_rent_change_piece_justification" : 1,
    "landlord_rent_change_receipts" : 0,
    "landlord_retakes_apartment" : 1,
    "landlord_retakes_apartment_indemnity" : 1,
    "landlord_sends_demand_regie_logement" : 0,
    "landlord_serious_prejudice" : 1,
    "lease" : 1,
    "proof_of_late" : 1,
    "proof_of_revenu" : 0,
    "rent_increased" : 1,
    "tenant_bad_payment_habits" : 1,
    "tenant_continuous_late_payment" : 1,
    "tenant_damaged_rental" : 1,
    "tenant_dead" : 1,
    "tenant_declare_insalubre" : 1,
    "tenant_financial_problem" : 0,
    "tenant_group_responsability" : 1,
    "tenant_individual_responsability" : 1,
    "tenant_is_bothered" : 1,
    "lack_of_proof" : 1,
    "tenant_landlord_agreement" : 0,
    "tenant_lease_fixed" : 1,
    "tenant_lease_indeterminate" : 1,
    "tenant_left_without_paying" : 0,
    "tenant_monthly_payment" : 1,
    "tenant_negligence" : 1,
    "tenant_not_request_cancel_lease" : 1,
    "tenant_owes_rent" : 1,
    "tenant_refuses_retake_apartment" : 1,
    "tenant_rent_not_paid_less_3_weeks" : 1,
    "tenant_rent_not_paid_more_3_weeks" : 0,
    "tenant_rent_paid_before_hearing" : 1,
    "tenant_violence" : 1,
    "tenant_withold_rent_without_permission" : 1,
    "violent" : 1
  }
}

Code : 200 OK

Content examples

{
  "outcomes_vector": {
    "additional_indemnity_money": "221",
    "authorize_landlord_retake_apartment": "0",
    "declares_housing_inhabitable": "0",
    "declares_resiliation_is_correct": "0",
    "landlord_prejudice_justified": "1",
    "landlord_retakes_apartment_indemnity": "0",
    "landlord_serious_prejudice": "0",
    "orders_expulsion": "1",
    "orders_immediate_execution": "1",
    "orders_landlord_notify_tenant_when_habitable": "0",
    "orders_resiliation": "1",
    "orders_tenant_pay_first_of_month": "0",
    "tenant_ordered_to_pay_landlord": "643",
    "tenant_ordered_to_pay_landlord_legal_fees": "80"
  },
  "probabilities_vector": {
    "additional_indemnity_money": "0.93",
    "authorize_landlord_retake_apartment": "1.0",
    "declares_housing_inhabitable": "1.0",
    "declares_resiliation_is_correct": "0.94",
    "landlord_prejudice_justified": "0.74",
    "landlord_retakes_apartment_indemnity": "1.0",
    "landlord_serious_prejudice": "1.0",
    "orders_expulsion": "0.88",
    "orders_immediate_execution": "0.72",
    "orders_landlord_notify_tenant_when_habitable": "1.0",
    "orders_resiliation": "0.91",
    "orders_tenant_pay_first_of_month": "0.99",
    "tenant_ordered_to_pay_landlord": "0.99",
    "tenant_ordered_to_pay_landlord_legal_fees": "0.91"
  },
  "similar_precedents": [
    {
      "distance": 0.3423500835013649,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "900.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "970.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "70.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "970.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "88.0"
      },
      "precedent": "AZ-51211608"
    },
    {
      "distance": 0.3429019324281239,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "735.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "873.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "0.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "873.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "80.0"
      },
      "precedent": "AZ-51176404"
    },
    {
      "distance": 0.49114649102172725,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "770.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "1360.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "590.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "1360.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "81.0"
      },
      "precedent": "AZ-51212451"
    },
    {
      "distance": 0.49200755901067444,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "945.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "1290.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "345.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "1290.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "72.0"
      },
      "precedent": "AZ-51201834"
    },
    {
      "distance": 0.4933548500076463,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "800.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "1400.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "0.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "0.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "92.0"
      },
      "precedent": "AZ-51391660"
    }
  ]
}

Code : 400 Bad Request - Inputs not provided

Code : 404 Not Found - Conversation doesn"t exist

Get the weights of every outcome sorted by descending order of importance

URL: /weights

Method: GET

Data constraints

None

Code : 200 OK

Content examples

{
    "additional_indemnity_money": {
        "important_facts": [
            "asker_is_landlord",
            "tenant_withold_rent_without_permission",
            "tenant_refuses_retake_apartment",
            "tenant_monthly_payment",
            "tenant_not_paid_lease_timespan"
        ],
        "additional_facts": [
            "tenant_financial_problem",
            "tenant_owes_rent",
            "asker_is_tenant",
            "tenant_damaged_rental",
            "tenant_individual_responsability",
            "signed_proof_of_rent_debt",
            "tenant_lease_indeterminate",
            "tenant_dead",
            "tenant_is_bothered",
            "bothers_others"
        ]
    }
}

Get the anti facts

Left hand side always initialized to 1 and right hand side to 0

URL: /antifacts

Method: GET

Data constraints

None

Code : 200 OK

Content examples

{
    "tenant_rent_not_paid_less_3_weeks": "tenant_rent_not_paid_more_3_weeks",
    "tenant_lease_fixed": "tenant_lease_indeterminate",
    "not_violent": "violent",
    "tenant_individual_responsability": "tenant_group_responsability"
}

Get the ml stats

Used to obtain:

Size of data set
Variance of regression outcomes
Standard deviation of regression outcomes
Mean of regression outcomes
Prediction accuracy of each classifier

URL: /statistics

Method: GET

Data constraints

None

Code : 200 OK

Content examples

{
    "classifier": {
        "additional_indemnity_money": {
            "prediction_accuracy": 79.8400199975003
        },
        "authorize_landlord_retake_apartment": {
            "prediction_accuracy": 99.48756405449319
        },
        "declares_housing_inhabitable": {
            "prediction_accuracy": 99.95000624921884
        },
        "declares_resiliation_is_correct": {
            "prediction_accuracy": 91.83852018497687
        },
        "landlord_prejudice_justified": {
            "prediction_accuracy": 81.07736532933383
        },
        "landlord_retakes_apartment_indemnity": {
            "prediction_accuracy": 99.72503437070367
        },
        "landlord_serious_prejudice": {
            "prediction_accuracy": 96.35045619297588
        },
        "orders_expulsion": {
            "prediction_accuracy": 91.55105611798525
        },
        "orders_immediate_execution": {
            "prediction_accuracy": 84.32695913010873
        },
        "orders_landlord_notify_tenant_when_habitable": {
            "prediction_accuracy": 100
        },
        "orders_resiliation": {
            "prediction_accuracy": 93.48831396075491
        },
        "orders_tenant_pay_first_of_month": {
            "prediction_accuracy": 98.05024371953506
        },
        "tenant_ordered_to_pay_landlord": {
            "prediction_accuracy": 83.82702162229721
        },
        "tenant_ordered_to_pay_landlord_legal_fees": {
            "prediction_accuracy": 90.32620922384702
        }
    },
    "data_set": {
        "size": 40003
    },
    "regressor": {
        "additional_indemnity_money": {
            "mean": 1477.7728467101024,
            "std": 1927.8147997893939,
            "variance": 3716469.9022870203
        },
        "tenant_pays_landlord": {
            "mean": 2148.867088064977,
            "std": 2129.510243010276,
            "variance": 4534813.8750856845
        }
    }
}

6. Using the Command Line

* denotes optional arguments

From the source directory JusticeAi/src/ml_service/ you may run:

Pre Processing python main.py -pre [number of files | empty for all]
Post Processing i. Each fact and outcome is listed with their number of occurences ii. % of tagged lines is displayed iii. python3 main.py -post [number of files | empty for all]
Training **Note: Always train svm before the sf and the svr Testing results are displayed: i. classifier: accuracy, F1, precision, recall ii. regression: absolute error, r2 arguments: i. --svm: classifier ii. --svr: regressor iii. --sf: similarity finder iv. --all: classifier, regressor, similarity finder python3 main.py -train [data size | empty for all] --svm* --sf* --svr* --all*

ProceZeus Documentation

Machine Learning Service

0. Table of Contents

1. Overview

1.1 Data Representation

1.2 Facts / Input

1.3 outcomes / output

1.4 Classification

1.5 Regression

1.6 Adding new columns (input/output)

2. DATA

2.1 Accessing binary data

2.2 Saving binary data

2.3 Global Variables

2.4 Binary file content

3. Installation Instructions

4. File Structure

5. ML API

5.1 Predict Outcome

Success Response

Error Response

5.2 Get Fact Weights

Success Response

5.3 Get Anti Facts

Success Response

5.4 Get Machine Learning Statistics

Success Response

6. Using the Command Line