You can learn more about ProceZeus by visiting the official documentation page. These docs were generated with mdBook.

All of the project's services are split into separate Docker images. All application dependencies are contained within the Docker images. The dependencies required to run this project locally are:

docker
docker-compose

To install Docker and Docker Compose, you can follow the instructions here.

We've developed a script to help with running the entire application with all its components. It's a thin wrapper around docker-compose, so all docker-compose commands also work with it. All you need is:

./cjl build && ./cjl up

To suppress output, run the containers in the background:

./cjl up -d

Docker doesn't always detect changes between image builds. If you end up with stale images, you can destroy all Docker images on your host with:

./cjl clean [-y]

To run all tests and lints for all services:

./cjl test

To try to fix all linting errors for all services:

./cjl lint-fix

In order to shut down all containers:

./cjl down

Finally, if you want to reset the database (which helps with inconsistent database states), you can run:

./cjl reset-db

The cjl script also takes any other command that docker-compose can take.

The following services can be run or tested individually:

Deployment is done via Travis CI and Ansible. The most current version of the master branch is reflected on the Demo Page.

The following architecture diagram represents the various services and the relationships they have with one another.

High Level Architecture

See CONTRIBUTING.md for details.

See the releases tab

The following is a list of team members that are contributing to the project:

This project is licensed under the MIT License - see the LICENSE file for details

The project is split into several modules, one per micro-service. Each micro-service is described below:

This module is responsible for responding to the web client's API queries. It is also the primary point of contact for the other micro-services.

This module is responsible for everything related to predicting outcomes and classifying based on precedent data.

This module contains the web UI that users interact with.

PostgreSQL

This module contains the data persistence layer of our system.

This module is responsible for all natural language processing of the user's input.

The ProceZeus application has many components and requires lots of binaries for the various machine learning models. We use docker and docker-compose to manage all of that complexity. Although you'll need to have these tools installed to build ProceZeus, we've hidden the gory details from you by providing a handy cjl script.

cjl is a thin wrapper around the docker-compose command, so any command that would work for docker-compose should also work for cjl. However, cjl contains utility functions to automatically lint your code, run tests, reset the database, remove all Docker images, and more! You can read more about the functions available in cjl in the "Getting Started" section.

docker-compose configuration files are currently split by environment. Running ./cjl build && ./cjl up will default to a dev environment, which uses the docker-compose.dev.yml configuration. This can be changed by specifying the environment you want with the COMPOSE_FILE environment variable. For example, the CI environment uses this technique to run tests and upload the code coverage reports instead of running the application.
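The environment selection described above can be sketched in Python. This is an illustration only: the function name compose_file_for is hypothetical, and the docker-compose.<env>.yml naming pattern is an assumption extrapolated from docker-compose.dev.yml. The actual logic lives in the cjl script.

```python
import os


def compose_file_for(environ=None):
    """Return the compose file name for the selected environment.

    Assumption: files follow the docker-compose.<env>.yml pattern and the
    environment falls back to "dev" when COMPOSE_FILE is unset or empty.
    """
    environ = os.environ if environ is None else environ
    env = environ.get("COMPOSE_FILE") or "dev"
    return f"docker-compose.{env}.yml"


print(compose_file_for({}))                      # default (dev) environment
print(compose_file_for({"COMPOSE_FILE": "ci"}))  # CI environment
```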

We're using Travis CI as a continuous integration service to run our tests on the latest changes pushed to Git. The details are in .travis.yml, but Travis uses a build matrix to test each service in a separate process. We've also added a constraint that every commit message must begin with a reference to an issue (e.g. [#123]) to improve the traceability of our work. If a build fails, GitHub will not let you merge your changes.
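As a rough illustration of the commit message constraint, the check could look like the sketch below. The function name and the exact pattern are assumptions; the real check is defined by the project's CI configuration, which may be stricter or looser.

```python
import re

# Assumption: the reference sits at the very start of the message, as in "[#123]".
ISSUE_REFERENCE = re.compile(r"^\[#\d+\]")


def has_issue_reference(commit_message):
    """Return True if the commit message begins with an issue reference."""
    return ISSUE_REFERENCE.match(commit_message) is not None


print(has_issue_reference("[#123] Add conversation progress"))
print(has_issue_reference("Add conversation progress"))
```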

Once a service's tests have finished running, the line coverage is uploaded to CodeCov.io. This service helps ensure that we maintain a reasonable level of test coverage over time. This check is informational and is not required to pass for a build to be merged.

The server at capstone.cyberjustice.ca is fronted by nginx. It serves static files and the machine learning model binaries required at build time at https://capstone.cyberjustice.ca/data. The client currently runs at https://capstone.cyberjustice.ca; nginx proxies the user's requests to the server running in the Docker web_client service. All requests to https://capstone.cyberjustice.ca/api are also proxied and passed through to the backend_service, which provides the client's REST API.
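The routing described above can be summarized with a small Python sketch. It only mirrors the prose: the route function is hypothetical, prefix matching is an assumption, and the real behavior is defined by the nginx configuration, which is not shown here.

```python
def route(path):
    """Map a request path to the component that handles it (sketch)."""
    if path == "/data" or path.startswith("/data/"):
        return "nginx static files"
    if path == "/api" or path.startswith("/api/"):
        return "backend_service"
    # Everything else is proxied to the web client.
    return "web_client"


for example in ("/", "/api/new", "/data/model.bin"):
    print(example, "->", route(example))
```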

We are using the default PostgreSQL docker image as a persistent data store. At the moment, we do not keep track of database migrations, and the database is wiped and rebuilt on every deployment. This can also be done manually on your local machine with ./cjl reset-db. The services that require DB access run an init.py script during build time that constructs the database models.

Overview

The web client is a Vue.js application that integrates all micro-services and provides a user interface to end users. The main focus of the web client is to 1) provide an on-screen chatbot experience to users and 2) deliver the system's features as a whole.

The major technologies we use in web client are:

Build Tool: Webpack
Framework: Vue.js
UI Library: ElementUI

File Structure

Below is a list of the most important files and directories:

-/build                             // includes all webpack build configuration
-/config                            // includes all webpack environment configuration
-/src                               // source code of the vue.js application
-/test                              // unit test
index.html                          // main index file
Dockerfile                          // docker configuration
package.json                        // application dependencies

When developing UI and features, you should mostly work in the src folder without touching the other directories and files.

Installation Instruction

The web client does not work unless the other micro-services are running concurrently. In production and continuous integration, all services, including the web client, are built in Docker; however, when developing locally, Docker does not build the web client. The reason is that Docker doesn't rebuild the image when the web client is updated, so working in the Docker environment is not very efficient.

To start working on the web client, make sure you have installed Node.js 8 (do not install v9.0+), then follow these steps:

If you have not built the docker images for other micro-services yet, run ./cjl up in the root directory of the repository. For more information, check the main README.
Once the micro-services are up, run npm install in web client directory.
When the installation is finished, run npm run start to start the application.
When the application is running, you can edit the source code. The latest changes will be shown in the browser.

Develop on Components

Under the src directory, you should see the application source code with the following folders:

-/assets                            // static assets such as images
-/components                        // reusable components
-/router                            // url router
-/theme                             // styling

Vue.js is a component-based JavaScript framework, so each .vue file creates a reusable component. Each component can be run independently.

So far in our application, we have:

Landing.vue: the landing page component is used to handle first-time users
Dashboard.vue: the main component, which contains the Sidebar.vue and Chat.vue components. While Sidebar.vue handles the data display on the UI, Chat.vue handles all the logic related to the chatbot.
Legal.vue: the legal page component is used to fetch and show the latest Privacy Policy and End User License Agreement.
Eventbus.js: a bus for component communications.

A .vue file usually contains all the code necessary for a component (JavaScript, HTML, and CSS). To make our lives easier, all styling is written in SASS and stored in the theme folder. To change the styling of the UI, you only need to edit the corresponding .scss file without touching the functional code.

Because the application is simple, we did not implement a state management architecture. As mentioned above, we use Eventbus.js to handle component communication. If you plan a major refactoring in the future, you can check out Vuex.

We use ElementUI as the UI library. We consider it one of the best libraries available for Vue.js. For best practice and code consistency, you should always check whether a feature can be implemented using an Element component.

Testing

The web client's unit tests use the default Vue.js unit test setup, which is built on Mocha. To test the application locally, run npm run test.

All unit test files are stored in the test/unit directory. Each .spec.js file contains the unit tests for the corresponding component. You should always make sure your changes are well tested. Once you run the tests, the test report is generated in test/unit/coverage. You can open test/unit/coverage/lcov-report/index.html to see the visual report.

Due to the scope of the project, we did not implement E2E automated tests. If you want to add them, check out Nightwatch.js.

Reference

export COMPOSE_FILE=ci
./cjl up -d && ./cjl run backend_service

Initializes a new conversation

URL : /new

Method : POST

Data constraints

Provide the user's name and person type.

{
    "name": "[unicode 40 chars max]",
    "person_type": "(TENANT|LANDLORD)"
}

Code : 200 OK

Content examples

{
    "conversation_id": 1
}

Code : 400 Bad Request - Invalid person_type provided
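A client can validate the data constraints before calling /new. The following Python sketch builds the request body; the function and constant names are hypothetical, and the server remains the authority on what it accepts.

```python
import json

VALID_PERSON_TYPES = {"TENANT", "LANDLORD"}
MAX_NAME_LENGTH = 40  # "unicode 40 chars max" from the data constraints


def build_new_conversation_body(name, person_type):
    """Validate the inputs and return the JSON body for POST /new."""
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("name must be at most 40 characters")
    if person_type not in VALID_PERSON_TYPES:
        raise ValueError("person_type must be TENANT or LANDLORD")
    return json.dumps({"name": name, "person_type": person_type})


print(build_new_conversation_body("Tim Timmens", "TENANT"))
```

Rejecting an invalid person_type locally avoids a round trip that would end in the 400 response documented above.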

Stores the user confirmation or text supplied in order to confirm whether an NLP prediction was accurate. This request is sent when the user either accepts or rejects an intent classification via the interface buttons.

URL : /store-user-confirmation

Method : POST

Data constraints

Provide the conversation id and confirmation text of the user.

{
    "conversation_id": 1,
    "confirmation": true | false | "$5000"
}

Code : 200 OK

Content examples

{
    "message": "User confirmation stored successfully"
}

Code : 400 Bad Request

Code : 404 Not Found
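Because confirmation accepts either a boolean or a string, a client-side helper can guard against other types. This is a minimal sketch with a hypothetical function name; it assumes that only booleans and strings are valid, as the data constraints above suggest.

```python
import json


def build_confirmation_body(conversation_id, confirmation):
    """Return the JSON body for POST /store-user-confirmation."""
    # Assumption: only true, false, or free text such as "$5000" is accepted.
    if not isinstance(confirmation, (bool, str)):
        raise TypeError("confirmation must be a boolean or a string")
    return json.dumps({"conversation_id": conversation_id,
                       "confirmation": confirmation})


print(build_confirmation_body(1, True))
print(build_confirmation_body(1, "$5000"))
```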

Send a message

Sends a message to the bot. The response contains the message that the bot displays to the user.

URL : /conversation

Method : POST

Data constraints

Provide the conversation_id and a message. The message should be an empty string for the first call.

{
    "conversation_id": "[integer]",
    "message": "[unicode]"
}

Code : 200 OK

Content examples

Simple response containing a message and conversation progress. Messages may contain pipe characters (|) which indicate that a sentence should be split into separate conversation windows. Note that the first message will not contain a pipe character. Each subsequent sentence will begin with a pipe character if it is meant to be split.

Conversation progress will be null at the beginning of the conversation, and also if the user asks a FAQ question (FAQs have no progress). However, if the user asks a question that requires the bot to resolve facts, progress indicates, as a percentage, how far along the conversation is toward getting a prediction.

{
    "conversation_id": 1,
    "message": "Hello Tim Timmens!|What kind of problem are you having?",
    "progress": null
}

This is formatted in the web interface as:

Message 1: Hello Tim Timmens!

Message 2: What kind of problem are you having?

{
    "conversation_id": 5,
    "message": "Oh I see you're having problems with lease termination!|I have a few questions for you.|Do you have a lease?",
    "progress": 0
}

This is formatted in the web interface as:

Message 1: Oh I see you're having problems with lease termination!

Message 2: I have a few questions for you.

Message 3: Do you have a lease?
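The pipe-splitting rule can be sketched in a few lines of Python. The function name is hypothetical, and the web client implements its own version of this logic; the sketch only shows the rule applied to the example messages above.

```python
def split_bot_message(message):
    """Split a bot message on pipe characters into separate chat windows."""
    # Empty parts are dropped so a stray leading or trailing pipe is harmless.
    return [part for part in message.split("|") if part]


reply = ("Oh I see you're having problems with lease termination!"
         "|I have a few questions for you.|Do you have a lease?")
for number, text in enumerate(split_bot_message(reply), start=1):
    print(f"Message {number}: {text}")
```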

{
    "conversation_id": 1,
    "file_request": {
        "document_type": "LEASE"
    },
    "message": "Could you please upload your lease if you have it, Tim Timmens?"
}

LEASE: A lease for a dwelling

Code : 404 Not Found

Get a conversation history

Gets the message history for a conversation

URL : /conversation/:conversation_id

Method : GET

Code : 200 OK

Content examples

{
    "claim_category": "NONPAYMENT",
    "bot_state": "RESOLVING_FACTS",
    "current_fact": {
        "name": "landlord_retakes_apartment",
        "summary": "Landlord intends to retake dwelling",
        "type": "BOOLEAN"
    },
    "fact_entities": [
        {
            "fact": {
                "name": "apartment_impropre",
                "summary": "Dwelling unfit for habitation",
                "type": "BOOLEAN"
            },
            "id": 1,
            "value": "false"
        },
        {
            "fact": {
                "name": "landlord_relocation_indemnity_fees",
                "summary": "Relocation reimbursed following inhabitability",
                "type": "BOOLEAN"
            },
            "id": 2,
            "value": "true"
        }
    ],
    "files": [],
    "id": 1,
    "messages": [
        {
            "enforce_possible_answer": true,
            "file_request": null,
            "id": 1,
            "possible_answers": "[\"Yes\"]",
            "relevant_fact": null,
            "sender_type": "BOT",
            "text": "Hello Bobby! Before we start, I want to make it clear that I am not a replacement for a lawyer and any information I provide you with is not meant to be construed as legal advice. Always check in with your legal professional. You can read more about our terms of use <a href='/legal' target='_blank'>here</a>. Do you accept these conditions?",
            "timestamp": "2017-12-20T01:27:35.993932+00:00"
        },
        {
            "enforce_possible_answer": null,
            "file_request": null,
            "id": 2,
            "possible_answers": null,
            "relevant_fact": null,
            "sender_type": "USER",
            "text": "Yes",
            "timestamp": "2017-12-20T01:27:39.023317+00:00"
        },
        {
            "enforce_possible_answer": false,
            "file_request": {
                "document_type": "LEASE"
            },
            "id": 3,
            "possible_answers": null,
            "relevant_fact": null,
            "sender_type": "BOT",
            "text": "I see you're a tenant, Bobby. If you have it on hand, it would be very helpful if you could upload your lease. What issue can I help you with today?",
            "timestamp": "2017-12-20T01:27:39.040375+00:00"
        },
        {
            "enforce_possible_answer": null,
            "file_request": null,
            "id": 4,
            "possible_answers": null,
            "relevant_fact": null,
            "sender_type": "USER",
            "text": "I am being kicked out",
            "timestamp": "2017-12-20T01:27:40.694884+00:00"
        },
        {
            "enforce_possible_answer": false,
            "file_request": null,
            "id": 5,
            "possible_answers": null,
            "relevant_fact": {
                "name": "apartment_impropre",
                "summary": "Dwelling unfit for habitation",
                "type": "BOOLEAN"
            },
            "sender_type": "BOT",
            "text": "Oh yes, I know all about problems with nonpayment. Would you deem the apartment unfit for habitation?",
            "timestamp": "2017-12-20T01:27:40.794129+00:00"
        },
        {
            "enforce_possible_answer": null,
            "file_request": null,
            "id": 6,
            "possible_answers": null,
            "relevant_fact": {
                "name": "apartment_impropre",
                "summary": "Dwelling unfit for habitation",
                "type": "BOOLEAN"
            },
            "sender_type": "USER",
            "text": "No",
            "timestamp": "2017-12-20T01:28:46.591573+00:00"
        },
        {
            "enforce_possible_answer": false,
            "file_request": null,
            "id": 7,
            "possible_answers": null,
            "relevant_fact": {
                "name": "landlord_relocation_indemnity_fees",
                "summary": "Relocation reimbursed following inhabitability",
                "type": "BOOLEAN"
            },
            "sender_type": "BOT",
            "text": "Have moving expenses been compensated when the apartment was deemed inhabitable?",
            "timestamp": "2017-12-20T01:28:46.652110+00:00"
        },
        {
            "enforce_possible_answer": null,
            "file_request": null,
            "id": 8,
            "possible_answers": null,
            "relevant_fact": {
                "name": "landlord_relocation_indemnity_fees",
                "summary": "Relocation reimbursed following inhabitability",
                "type": "BOOLEAN"
            },
            "sender_type": "USER",
            "text": "Yes",
            "timestamp": "2017-12-20T01:28:51.529825+00:00"
        }
    ],
    "name": "Bobby",
    "person_type": "TENANT"
}

Code : 400 Bad Request

Code : 404 Not Found
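To show how a client might consume the history response, here is a small Python sketch. The helper names are hypothetical. It relies on two things visible in the example above: each message carries an id, sender_type, and text, and possible_answers is a JSON-encoded string (or null) rather than a nested array.

```python
import json


def possible_answers(message):
    """Decode possible_answers, which arrives as a JSON string or null."""
    raw = message.get("possible_answers")
    return json.loads(raw) if raw is not None else []


def transcript(conversation):
    """Return one "SENDER: text" line per message, ordered by message id."""
    ordered = sorted(conversation["messages"], key=lambda m: m["id"])
    return [f"{m['sender_type']}: {m['text']}" for m in ordered]


# Trimmed-down response in the same shape as the example above.
history = {
    "messages": [
        {"id": 2, "sender_type": "USER", "text": "Yes",
         "possible_answers": None},
        {"id": 1, "sender_type": "BOT",
         "text": "Do you accept these conditions?",
         "possible_answers": "[\"Yes\"]"},
    ]
}
print(transcript(history))
print(possible_answers(history["messages"][1]))
```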

Retrieves a report for a conversation once at least one prediction has been returned. Will return 404 if a report has not been generated yet.

URL : /conversation/:conversation_id/report

Method : GET

Code : 200 OK

Content examples

{
    "report": {
        "accuracy": 0.8114285714285714,
        "curves": {
            "additional_indemnity_money": {
                "mean": 1477.7728467101024,
                "outcome_value": 6038,
                "std": 1927.8147997893939,
                "variance": 3716469.9022870203
            }
        },
        "data_set": 8,
        "outcomes": {
            "additional_indemnity_money": 6038,
            "landlord_prejudice_justified": true,
            "orders_expulsion": true,
            "orders_immediate_execution": true,
            "orders_resiliation": true,
            "tenant_ordered_to_pay_landlord": 3092,
            "tenant_ordered_to_pay_landlord_legal_fees": 80
        },
        "similar_case": 5,
        "similar_precedents": [
            {
                "distance": 2.6080129205467784,
                "facts": {
                    "landlord_relocation_indemnity_fees": 0,
                    "tenant_dead": false,
                    "tenant_is_bothered": false,
                    "tenant_left_without_paying": false,
                    "tenant_owes_rent": 0,
                    "tenant_rent_not_paid_more_3_weeks": true
                },
                "outcomes": {
                    "additional_indemnity_money": 5850,
                    "landlord_prejudice_justified": true,
                    "orders_expulsion": true,
                    "orders_immediate_execution": true,
                    "orders_resiliation": true,
                    "tenant_ordered_to_pay_landlord": 7150,
                    "tenant_ordered_to_pay_landlord_legal_fees": 74
                },
                "precedent": "AZ-51412066"
            },
            {
                "distance": 2.6543730465072035,
                "facts": {
                    "landlord_relocation_indemnity_fees": 0,
                    "tenant_dead": false,
                    "tenant_is_bothered": false,
                    "tenant_left_without_paying": false,
                    "tenant_owes_rent": 2460,
                    "tenant_rent_not_paid_more_3_weeks": true
                },
                "outcomes": {
                    "additional_indemnity_money": 3620,
                    "landlord_prejudice_justified": true,
                    "orders_expulsion": true,
                    "orders_immediate_execution": true,
                    "orders_resiliation": true,
                    "tenant_ordered_to_pay_landlord": 2460,
                    "tenant_ordered_to_pay_landlord_legal_fees": 81
                },
                "precedent": "AZ-51163532"
            },
            {
                "distance": 2.6969256661279988,
                "facts": {
                    "landlord_relocation_indemnity_fees": 0,
                    "tenant_dead": false,
                    "tenant_is_bothered": false,
                    "tenant_left_without_paying": false,
                    "tenant_owes_rent": 0,
                    "tenant_rent_not_paid_more_3_weeks": true
                },
                "outcomes": {
                    "additional_indemnity_money": 2463,
                    "landlord_prejudice_justified": true,
                    "orders_expulsion": true,
                    "orders_immediate_execution": true,
                    "orders_resiliation": true,
                    "tenant_ordered_to_pay_landlord": 2886,
                    "tenant_ordered_to_pay_landlord_legal_fees": 83
                },
                "precedent": "AZ-51395624"
            },
            {
                "distance": 2.719885995641093,
                "facts": {
                    "landlord_relocation_indemnity_fees": 0,
                    "tenant_dead": false,
                    "tenant_is_bothered": false,
                    "tenant_left_without_paying": false,
                    "tenant_owes_rent": 0,
                    "tenant_rent_not_paid_more_3_weeks": true
                },
                "outcomes": {
                    "additional_indemnity_money": 3180,
                    "landlord_prejudice_justified": true,
                    "orders_expulsion": true,
                    "orders_immediate_execution": true,
                    "orders_resiliation": true,
                    "tenant_ordered_to_pay_landlord": 3830,
                    "tenant_ordered_to_pay_landlord_legal_fees": 74
                },
                "precedent": "AZ-51395655"
            },
            {
                "distance": 2.7806288394504484,
                "facts": {
                    "landlord_relocation_indemnity_fees": 0,
                    "tenant_dead": false,
                    "tenant_is_bothered": false,
                    "tenant_left_without_paying": false,
                    "tenant_owes_rent": 3600,
                    "tenant_rent_not_paid_more_3_weeks": true
                },
                "outcomes": {
                    "additional_indemnity_money": 3750,
                    "landlord_prejudice_justified": true,
                    "orders_expulsion": true,
                    "orders_immediate_execution": true,
                    "orders_resiliation": true,
                    "tenant_ordered_to_pay_landlord": 3600,
                    "tenant_ordered_to_pay_landlord_legal_fees": 78
                },
                "precedent": "AZ-51187376"
            }
        ]
    }
}

Code : 400 Bad Request
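The report fields can be combined in a few ways on the client side. The Python sketch below makes two assumptions that this documentation does not state explicitly: that mean and std in curves describe the distribution of that outcome across precedents, and that a smaller distance means a more similar precedent. The helper names are hypothetical.

```python
def outcome_z_score(curve):
    """Number of standard deviations between outcome_value and the mean."""
    return (curve["outcome_value"] - curve["mean"]) / curve["std"]


def closest_precedent(report):
    """Identifier of the similar precedent with the smallest distance."""
    nearest = min(report["similar_precedents"], key=lambda p: p["distance"])
    return nearest["precedent"]


# Values copied from the example response above (precedents trimmed to two).
report = {
    "curves": {
        "additional_indemnity_money": {
            "mean": 1477.7728467101024,
            "outcome_value": 6038,
            "std": 1927.8147997893939,
            "variance": 3716469.9022870203,
        }
    },
    "similar_precedents": [
        {"distance": 2.6543730465072035, "precedent": "AZ-51163532"},
        {"distance": 2.6080129205467784, "precedent": "AZ-51412066"},
    ],
}
curve = report["curves"]["additional_indemnity_money"]
print(round(outcome_z_score(curve), 2))
print(closest_precedent(report))
```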

Gets only the list of resolved facts for the conversation

URL : /conversation/:conversation_id/resolved

Method : GET

Code : 200 OK

Content examples

{
    "fact_entities": [
        {
            "fact": {
                "name": "apartment_impropre",
                "summary": "Dwelling unfit for habitation",
                "type": "BOOLEAN"
            },
            "id": 1,
            "value": "false"
        },
        {
            "fact": {
                "name": "landlord_relocation_indemnity_fees",
                "summary": "Relocation reimbursed following inhabitability",
                "type": "BOOLEAN"
            },
            "id": 2,
            "value": "true"
        }
    ]
}

Code : 400 Bad Request

Code : 404 Not Found
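A client will often want the resolved facts as a simple name-to-value mapping. The Python sketch below assumes, based on the example above, that BOOLEAN facts carry the strings "true" and "false"; other fact types are passed through unchanged. The function name is hypothetical.

```python
def resolved_facts(response):
    """Map each resolved fact name to its value."""
    facts = {}
    for entity in response["fact_entities"]:
        fact = entity["fact"]
        value = entity["value"]
        if fact["type"] == "BOOLEAN":
            # Assumption: booleans are serialized as "true" / "false".
            value = value == "true"
        facts[fact["name"]] = value
    return facts


response = {
    "fact_entities": [
        {"fact": {"name": "apartment_impropre",
                  "summary": "Dwelling unfit for habitation",
                  "type": "BOOLEAN"},
         "id": 1, "value": "false"},
        {"fact": {"name": "landlord_relocation_indemnity_fees",
                  "summary": "Relocation reimbursed following inhabitability",
                  "type": "BOOLEAN"},
         "id": 2, "value": "true"},
    ]
}
print(resolved_facts(response))
```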

Removes a resolved fact from the conversation

URL : /conversation/:conversation_id/resolved/:fact_id

Method : DELETE

Code : 200 OK

Content examples

{
    "success": true
}

Code : 400 Bad Request

Code : 404 Not Found

Upload a file that serves as evidence for a particular conversation.

URL : /conversation/:conversation_id/files

Method : POST

Headers

Content-Type: multipart/form-data

Data constraints

Provide 'file' form key with file data.

Code : 200 OK

Content examples

{
    "name": "leaky_pipes.png",
    "type": "image/png",
    "timestamp": "2017-10-24T00:01:27.806730+00:00"
}

Code : 400 Bad Request

Code : 404 Not Found

Gets a list of file metadata for a conversation

URL : /conversation/:conversation_id/files

Method : GET

Code : 200 OK

Content examples

{
    "files": [
        {
            "name": "leaky_pipes.png",
            "type": "image/png",
            "timestamp": "2017-10-24T00:01:27.000000+00:00"
        },
        {
            "name": "my_least.pdf",
            "type": "application/pdf",
            "timestamp": "2017-10-24T00:01:30.000000+00:00"
        }
    ]
}

Code : 400 Bad Request

Code : 404 Not Found

Obtains information and contents of the latest legal documents

URL : /legal

Method : GET

Code : 200 OK

Content examples

[
    {
        "abbreviation": "EULA",
        "html": {
            "content": [
                {
                    "subtitle": "TL;DR",
                    "summary": "no purse as fully me or point. Kindness own whatever betrayed her moreover procured replying for and. Proposal indulged no do do sociable he throwing settling. Covered ten nor comfort offices carried. Age she way earnestly the fulfilled extremely.",
                    "text": "Prevailed sincerity behaviour to so do principle mr. As departure at no propriety zealously my. On dear rent if girl view. First on smart there he sense. Earnestly enjoyment her you resources. Brother chamber ten old against. Mr be cottage so related minuter is. Delicate say and blessing ladyship exertion few margaret. Delight herself welcome against smiling its for. Suspected discovery by he affection household of principle perfectly he.",
                    "title": "DESCRIPTION OF SERVICE"
                },
                {
                    "subtitle": "TL;DR",
                    "summary": "Scarcely on striking packages by so property in delicate. Up or well must less rent read walk so be. Easy sold at do hour sing spot. Any meant has cease too the decay. Since party burst am it match. By or blushes between besides offices noisier as.",
                    "text": "It prepare is ye nothing blushes up brought. Or as gravity pasture limited evening on. Wicket around beauty say she. Frankness resembled say not new smallness you discovery. Noisier ferrars yet shyness weather ten colonel. Too him himself engaged husband pursuit musical. Man age but him determine consisted therefore. Dinner to beyond regret wished an branch he. Remain bed but expect suffer little repair.",
                    "title": "ACCEPTANCE OF TERMS"
                },
                {
                    "subtitle": "TL;DR",
                    "summary": "Luckily friends do ashamed to do suppose. Tried meant mr smile so. Exquisite behaviour as to middleton perfectly.",
                    "text": "He my polite be object oh change. Consider no mr am overcame yourself throwing sociable children. Hastily her totally conduct may. My solid by stuff first smile fanny. Humoured how advanced mrs elegance sir who. Home sons when them dine do want to. Estimating themselves unsatiable imprudence an he at an. Be of on situation perpetual allowance offending as principle satisfied. Improved carriage securing are desirous too.",
                    "title": "MODIFICATION OF TERMS"
                },
                {
                    "subtitle": "TL;DR",
                    "summary": "Improved own provided blessing may peculiar domestic. Sight house has sex never. No visited raising gravity outward subject my cottage mr be. Hold do at tore in park feet near my case.",
                    "text": "Extremely we promotion remainder eagerness enjoyment an. Ham her demands removal brought minuter raising invited gay. Contented consisted continual curiosity contained get sex. Forth child dried in in aware do. You had met they song how feel lain evil near. Small she avoid six yet table china. And bed make say been then dine mrs. To household rapturous fulfilled attempted on so. ",
                    "title": "REGISTRATION"
                }
            ],
            "header": "End User License Agreement",
            "subheader": "Savings her pleased are several started females met. Short her not among being any. Thing of judge fruit charm views do. Miles mr an forty along as he. She education get middleton day agreement performed preserved unwilling. Do however as pleased offence outward beloved by present. By outward neither he so covered amiable greater. Juvenile proposal betrayed he an informed weddings followed. Precaution day see imprudence sympathize principles. At full leaf give quit to in they up."
        },
        "time_created": "2017-10-26T20:52:41-04:00",
        "type": "End User License Agreement",
        "version": 1
    }
]

./cjl test nlp_service

Warning: You need at least 8 GB of RAM available to run the nlp_service tests.

We recommend an i5 Broadwell CPU or better if you wish to run the tests locally.

While working on natural language processing, the team used atomic commits and pushes so that the tests ran on its continuous integration tool (Travis in this case).

pip3 install -r requirements_test.txt

Extract a claim category from a user's message. Returns a question based on the claim category found, or a clarification question.

URL : /claim_category

Method : POST

Data constraints

Provide the conversation id and the message.

{
    "conversation_id": 1,
    "message": "I am being evicted"
}

Code : 200 OK

Content examples

{
    "message": "I see you're having problems with lease termination. Have you kept up with your rent payments?",
    "progress": 0
}

Code : 400 Bad Request - Inputs not provided

Code : 404 Not Found - Conversation doesn't exist

Submit message

Submits a user input to the NLP service. Returns the next question to ask, or a clarification question.

URL : /submit_message

Method : POST

Data constraints

Provide the conversation id and the message.

{
    "conversation_id": 1,
    "message": "My rent is $900 per month."
}

Code : 200 OK

Content examples

{
    "message": "Have you kept up with your rent payments?",
    "progress": 10
}

Code : 400 Bad Request - Inputs not provided

Code : 404 Not Found - Conversation doesn't exist

The util.parse_dataset.py module and the associated CreateJson class can be used to create JSON training data for RASA NLU.

[meta]
() = entity_name1, entity_extractor(optional)
{} = entity_name2, entity_extractor(optional)

[regex_features]
name:regex

[entity_synonyms]
entity:synonym1, synonym2

[common_examples: intent_name1]
sentence1
sentence2

[common_examples: intent_name2]
sentence1
sentence2

[] are reserved characters used to identify sections
meta section allows for the definition of meta-characters that define entities
regex_features are simply regex features
entity_synonyms are simply entity synonyms
common_examples:intent_name are common examples for a particular intent

[meta]
() = money, ner_duckling

[regex_features]
money:$\d(.)?+|\d(.)?+$

[common_examples: true]
my landlord increased my rent by ($500)
i owe my landlord (40 dollars)

[common_examples: false]
i don't owe my landlord any money
i dont have any debts
no
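
The section layout described above can be read with a few lines of Python. The sketch below only groups lines under their [section] header; it is not the CreateJson implementation, and the function name split_sections is made up for this illustration.

```python
import re

# A section header is a whole line wrapped in the reserved [ ] characters
SECTION_HEADER = re.compile(r"^\[(.+)\]$")


def split_sections(text):
    """Group the non-empty lines of a dataset file under their [section] header."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        header = SECTION_HEADER.match(line)
        if header:
            current = header.group(1).strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


sample = """
[meta]
() = money, ner_duckling

[common_examples: true]
my landlord increased my rent by ($500)
i owe my landlord (40 dollars)

[common_examples: false]
no
"""

sections = split_sections(sample)
```

Here sections maps "meta", "common_examples: true" and "common_examples: false" to their respective lines.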

python3 -m util.parse_dataset <read_dir> <write_dir>

python3 -m util.parse_dataset ~/Documents/ ~/Documents/Json/

DO NOT FORGET THE '/' AT THE END OF YOUR DIRECTORY

As of April 10th 2018 the outlier detection is not being used by the NLP service

This is due to a lack of data on what is considered an "outlier answer".

There are two kinds of claim categories: developed claim categories and FAQs.

Developed claim categories have:
- A series of questions that the user answers to resolve facts
- Multiple outcomes dynamically calculated by the ml_service
- A conclusive view with a dashboard containing the resolved facts and the legal cases most similar to the user's

FAQs have:
- One long, developed answer summarized from websites such as Regie du logement, Educaloi or LikeHome.

Add the claim category to nlp_service/controllers/nlp_controller.py in "conversation.claim_category" inside of the "classify_claim_category" function
Define the new claim category inside of the class "ClaimCategory" in postgresql_db/models.py
Define the new category inside the *.txt file in nlp_service/rasa/text/category, depending on whether it belongs to a tenant (category_tenant.txt) or a landlord (category_landlord.txt). We recommend keeping track of FAQ vs developed categories by naming FAQ categories "faq_AbbreviationOfSource_factname"
If the claim category you added is an FAQ, write its response in nlp_service/services/response_strings.py
At this stage you should either have a complete FAQ or an empty developed claim category, which you'll have to add facts to! (following section)

Adding a new fact (includes adding new questions)

Add new fact to postgresql_db/models.py as well as the type of answer you are expecting from it and the summary (displayed definition on the front-end)
Add your new fact to nlp_service/services/response_strings.py in "fact_questions" by adding the question trying to answer the fact
If not answerable by a generic "yes or no" add the fact as a {name_of_fact}.txt file in nlp_service/rasa/text/fact/individual
If answerable by a generic "yes or no", add the fact name to nlp_service/init_rasa.py in "fact_names"

Add the outcome(s) you want to be checked by the ml_service to the desired developed claim categories in nlp_service/services/fact_service.py in "outcome_mapping"
Tell the system what to say if the ml_service returns the outcome as "True" (it will happen) or "False" (it won't happen) in nlp_service/services/response_strings.py in "prediction"

The models are retrained every time the project is (re)built.

The training is initiated by init.py whenever the train function's force_train parameter inside of nlp_service/rasa/rasa_classifier.py is set to true. The models are loaded in nlp_service/controllers/nlp_controller.py, where force_train is initialized as false and initialize_interpreter is initialized as true.

The team uses RASA NLU as a core part of its Natural Language Processing component. Documentation available here. Active Gitter channel available here.

The team experimented with multiple pipelines and considered Spacy 2.0 by far superior to MITIE. Our config file can be found at ~/nlp_service/rasa/config/rasa_config.json

Components:

nlp_spacy: initializes spacy structures
tokenizer_spacy: creation of tokens using Spacy
intent_entity_featurizer_regex: uses regular expressions to aid in intent and entity classification (ONLY SUPPORTED BY NER_CRF)
ner_crf: entity extractor using conditional random fields
ner_synonyms: maps two or more entities to be extracted to have the same value
intent_classifier_sklearn: classifies intents of the text being parsed
duckling: extraction of pre-trained entities such as money, time, dates, etc.

We do not recommend "ner_spacy" as a replacement to "ner_crf" due to its absence of confidence scores for the entity extraction. We also strongly advise against using more than 1 thread or more than 1 process due to stability issues with duckling.

Things to know that are not mentioned in RASA documentation:

Proper usage of the intent_entity_featurizer_regex will often drastically improve intent confidence percentage (up to 40%)
- Apply regex to sections of common examples that are unique to a specific intent (e.g. a regex on the word "tax", which is very likely to appear only when the user wants information concerning their RL-31 slip)
- Regex only actually helps with intent confidence ratio, not entity confidence. (This bit of information was obtained after a conversation with RASA contributors on gitter)
Working with common examples
- I'm and Im and I am count as different words with Spacy. Avoid using those words in common examples.
- Capitals matter. Lowercasing our data sets, and always lowercasing the user's input before NLP, improved the confidence percentage drastically
- Avoid fluff (stop words) in the common examples so that a proper word vector is calculated (e.g. deleting "can you help me with this?" at the end of a common example changes the vector calculated for that example.)
Working with entities
- We strongly suggest using entity_synonyms not only for different variations of the entity you are attempting to extract but also for common spelling mistakes of the entities


1. Overview

The machine learning service is responsible for predicting the outcomes of a user's case.

Outcomes are categorized either as True/False or as a numerical value. Whether a given outcome is boolean or integer is decided by a human and given to the system beforehand (see section 1.6). This sub-system therefore uses both classifiers and regressors to make predictions. The inputs for both the classifier and the regressor are the facts obtained from the user's inputs. An array of outcomes is then returned.

The input and output data are all represented numerically despite having the potential to be boolean values. Below illustrates how values are treated:

0 --> False / Null
1 --> True
(n > 1) --> True AND Numerical

Numerical Values consist of:

Dates / Time (in months)
Money (in $)
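
To make the convention above concrete, here is a minimal sketch that interprets a single entry. The function name decode_value and the returned dictionary layout are illustrative assumptions, not part of the service.

```python
def decode_value(n):
    """Interpret one numeric entry using the convention listed above."""
    if n < 0:
        raise ValueError("negative values are not part of the convention")
    if n == 0:
        return {"is_true": False, "amount": None}
    if n == 1:
        return {"is_true": True, "amount": None}
    # n > 1: the entry is True and also carries a number (months or dollars)
    return {"is_true": True, "amount": n}


absent = decode_value(0)
present = decode_value(1)
rent = decode_value(900)
```

One consequence of this encoding is that a numerical value of exactly 1 cannot be told apart from a plain True.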

The inputs are stored in a numpy array consisting only of integers, with the possible values listed in section 1.1. Every index of the array represents a different fact (input data point) used by the machine learning. The indexes of the facts are determined once the precedents are tagged (their order may change when the data is re-tagged). An input array looks like this:

[fact_1, fact_2, ..., fact_n]

Here is an example to retrieve the labels for each column:

from feature_extraction.post_processing.regex.regex_tagger import TagPrecedents

indexes = TagPrecedents().get_intent_index()

# print a sample of the content
for i in indexes['outcomes_vector'][:3]:
    print(i)

output:

(0, 'additional_indemnity_money', 'bool')
(1, 'declares_resiliation_is_correct', 'bool')
(2, 'landlord_serious_prejudice', 'bool')

structure for 'indexes' variable:

{
    'outcomes_vector': [
        (array_index, column_label, column_type),
        (array_index, column_label, column_type)
    ],
    'facts_vector': [
        (array_index, column_label, column_type),
        (array_index, column_label, column_type)
    ]
}
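
Given that structure, an input array can be assembled from a dictionary of known facts. The following is a sketch under a few assumptions: the helper name build_facts_vector is made up, the index entries are hypothetical, array indexes are taken to run from 0 to n-1, and a plain list stands in for the numpy array.

```python
def build_facts_vector(facts_index, known_facts):
    """Place each known fact at its array index; unknown facts default to 0."""
    vector = [0] * len(facts_index)
    for array_index, column_label, _column_type in facts_index:
        vector[array_index] = int(known_facts.get(column_label, 0))
    return vector


# Hypothetical index, shaped like indexes['facts_vector']
facts_index = [
    (0, "asker_is_landlord", "bool"),
    (1, "tenant_monthly_payment", "int"),
    (2, "tenant_owes_rent", "int"),
]

vector = build_facts_vector(
    facts_index, {"asker_is_landlord": True, "tenant_owes_rent": 970}
)
```

Looking labels up through the index rather than hard-coding positions keeps the code valid when the data is re-tagged and the order of the columns changes.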

As in section 1.2, the output is an array of integers whose size is the number of outcomes supported by the system. Refer to section 1.1 for how the values are interpreted.

A multiclassifier is used to predict all outcomes. In the background, SkLearn uses a different estimator per outcome in order to perform this task. When obtaining a prediction, ALL outcomes are either classified as True or False. Even the numerical outcomes are classified as such. If an outcome is expected to be a numerical value AND that outcome is True then the input is passed to the appropriate regressor in order to predict the outcome's integer value. If the previous condition isn't met then no further data manipulation is necessary for a given outcome and the classifier's prediction is simply returned for this column.
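
The two-stage flow described above can be sketched as follows. The function name, its parameters and the stand-in models are assumptions for illustration; the real service wires trained models together in its own way.

```python
def predict_outcomes(facts, classify, regressors, outcome_types):
    """Classify every outcome, then refine the numerical ones that came back True.

    classify      -- callable returning one 0/1 value per outcome
    regressors    -- dict mapping an outcome position to a callable regressor
    outcome_types -- one type string per outcome ('bool' or 'int' here)
    """
    predictions = list(classify(facts))
    for position, outcome_type in enumerate(outcome_types):
        if outcome_type != "bool" and predictions[position] == 1:
            predictions[position] = int(regressors[position](facts))
    return predictions


# Stand-ins for trained models, for illustration only
fake_classifier = lambda facts: [1, 0, 1]
fake_regressors = {0: lambda facts: 221.4, 1: lambda facts: 999.0}

result = predict_outcomes(
    [1, 0, 900], fake_classifier, fake_regressors, ["int", "int", "bool"]
)
```

The second outcome is numerical but was classified as False, so its regressor is never called and the 0 is returned unchanged.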

Adding a new classifier

New classifiers will be automatically trained upon adding regexes. See section 1.6.

The regressors are only used if the classifier predicted an outcome as True. This is because the regressors are trained on biased data in which the outcome is known to be True, so the input data must be biased towards the same end goal.

During training, for regression only, the average value of every fact in the data set is computed. The resulting vector looks like this:

[average_column_1, average_column_2, ..., average_column_n]

This vector is kept in binary format and can be retrieved this way:

from util.file import Load
mean_facts_vector = Load.load_binary('model_metrics.bin')['regressor'][<name of the regressor>]['mean_facts_vector']

Regression fine tuning

When making a regressive prediction, the user's input is entered as an array of numerical values as in section 1.2.

Wherever a 0 is encountered in the user's input, it is replaced with the average value of its column. Replacing missing inputs with the column average gives a better fit on the curve, and therefore more accurate results when the regressor is used.
During training, outliers in the dataset are removed. Outliers are determined by:

abs(outcome - average_of_outcomes) > (2 * std_of_outcomes)
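
Both fine-tuning steps can be sketched with the standard library. The function names are illustrative, and the use of the population standard deviation is an assumption, since the text does not say which definition the training code uses.

```python
from statistics import mean, pstdev


def fill_missing_with_mean(input_vector, mean_facts_vector):
    """Replace every 0 in the input with the mean of the matching column."""
    return [
        column_mean if value == 0 else value
        for value, column_mean in zip(input_vector, mean_facts_vector)
    ]


def remove_outliers(outcomes):
    """Keep outcomes within two standard deviations of the mean."""
    average = mean(outcomes)
    std = pstdev(outcomes)  # assumption: population standard deviation
    return [o for o in outcomes if abs(o - average) <= 2 * std]


filled = fill_missing_with_mean([1, 0, 900], [0.6, 0.2, 850.0])
kept = remove_outliers([100, 110, 90, 105, 95, 100, 1000])
```

In the second call the value 1000 lies more than two standard deviations from the mean and is dropped; the other six values are kept.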

Adding a new regressor

The regressors' estimators are crafted manually rather than through the SkLearn wrapper used in section 1.4. This approach was necessary because each regressor requires much more individual attention. A custom wrapper is written instead, and every new regressor can inherit from the AbstractRegressor class.

Code new regressor (inherit abstract_regressor.py)
Update multi_output_regression.py to accommodate the new class

Adding new columns is fairly simple. In the feature_extraction/post_processing/regex/regex_lib.py file, append your regex to the regex_facts or regex_outcomes list. The syntax is the following:

regex_facts = [
    (
        <column_label>, [
            re.compile(<regex_1>, re.IGNORECASE),
            re.compile(<regex_2>, re.IGNORECASE),
            re.compile(<regex_n>, re.IGNORECASE)
        ],
        <data_type>
    ),
    (
        <column_label>, [
            re.compile(<regex_1>, re.IGNORECASE),
            re.compile(<regex_2>, re.IGNORECASE),
            re.compile(<regex_n>, re.IGNORECASE)
        ],
        <data_type>
    ),
]

Type as many regular expressions as needed to cover all the dataset. Upon tagging the data a percentage of lines tagged will be displayed.

Note: <data_type> are the following strings:

"BOOLEAN"
"MONEY"
"DATE"

The newly added columns in the regex_lib.py file will then automatically be used the next time the machine learning performs its training, on the condition that the data has been re-post-processed. Be sure to create a regressor if you want to predict "DATE" or "MONEY" though (See section 1.5).
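
For a sense of how such entries behave, the sketch below defines two hypothetical columns in the shape shown above and reports the percentage of lines tagged. The patterns, labels and the tag_lines helper are invented for this example and are not taken from regex_lib.py or the real tagger.

```python
import re

# Hypothetical entries in the (label, [patterns], data_type) shape shown above
regex_facts = [
    ("tenant_owes_rent", [re.compile(r"owes? .*rent", re.IGNORECASE)], "BOOLEAN"),
    ("tenant_monthly_payment", [re.compile(r"rent of \$?\d+", re.IGNORECASE)], "MONEY"),
]


def tag_lines(lines, regex_columns):
    """Return the labels matched on each line and the percentage of lines tagged."""
    tagged = []
    for line in lines:
        labels = [
            label
            for label, patterns, _data_type in regex_columns
            if any(pattern.search(line) for pattern in patterns)
        ]
        tagged.append(labels)
    matched = sum(1 for labels in tagged if labels)
    percent = 100.0 * matched / len(lines) if lines else 0.0
    return tagged, percent


lines = [
    "The tenant owes the landlord rent",
    "Monthly rent of $900",
    "The hearing was postponed",
    "Tenant owes rent of 500",
]
tagged, percent = tag_lines(lines, regex_facts)
```

Three of the four sample lines match at least one pattern, so the coverage is 75%. Adding further patterns for the untagged line would raise that figure.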

2. DATA

All persistent machine learning data are stored as binaries. In order to centralize this information it is advised to upload the models on a server. These models may then be fetched in the init.py script in the source directory (do not confuse with __init__.py script). Simply append your download link to the binary_urls list found in this file.

To load a binary file, first make sure it is stored in the binary/data/ folder. This should be done automatically by init.py. Then use the following:

from util.file import Load

Load.load_binary(<binary_file_name>)

To save a binary file use the following:

from util.file import Save

Save().save_binary(<desired_binary_file_name>, model)

The output directory will be binary/data/ by default.

Some global variables are listed in util/constant.py

2.4 Binary file content

classifier_labels.bin

{
   outcome_index_0 <int>: (
       column_label <str>,
       column_type <str>
   ),
   outcome_index_n <int>: (
       column_label <str>,
       column_type <str>
   ),
}

model_metrics.bin

{
    'data_set':{
        'size': <int>
    },
    'classifier':{
        classifier_name_0 <str>: {
            'prediction_accuracy': <float>
        },
        classifier_name_n <str>: {
            'prediction_accuracy': <float>
        }
    },
    'regressor':{
        regressor_name_0 <str> :{
            'std': <float>,
            'variance': <float>,
            'mean_facts_vector': <numpy.array>
        },
        regressor_name_n <str> :{
            'std': <float>,
            'variance': <float>,
            'mean_facts_vector': <numpy.array>
        }
    }
}

multi_class_svm_model.bin

Used to predict classifier results.

from util.file import Load
from sklearn.preprocessing import binarize

model = Load.load_binary("multi_class_svm_model.bin")
classifier_labels = Load.load_binary('classifier_labels.bin')
input_vector = [fact_1, fact_2, ..., fact_n]
data = binarize([input_vector], threshold=0)
prediction = model.predict(data)
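
The prediction comes back as a row of values without names. Assuming classifier_labels has the structure documented for classifier_labels.bin, the values can be paired with their labels as sketched here; the helper name and the sample contents are hypothetical.

```python
def label_prediction(prediction_row, classifier_labels):
    """Pair each predicted value with the column label stored at the same index."""
    return {
        classifier_labels[index][0]: int(value)
        for index, value in enumerate(prediction_row)
    }


# Hypothetical contents, shaped like classifier_labels.bin
classifier_labels = {
    0: ("additional_indemnity_money", "int"),
    1: ("orders_expulsion", "bool"),
}

labelled = label_prediction([1, 0], classifier_labels)
```

With a real model, prediction[0] would be passed as prediction_row, since predict returns one row per input vector.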

precedent_vectors.bin

{
    <precedent_id> <str>:{
        'outcomes_vector': numpy.array,
        'facts_vector': numpy.array,
        'file_number': <str>,
        'name': AZ-********.txt <str>
    }
}

similarity_case_numbers.bin

An array of all case numbers. This is used to map the indices (returned by the similarity model) to case numbers.

[
    'AZ-XXXXXX',
    'AZ-XXXXXX',
    'AZ-XXXXXX',
    'AZ-XXXXXX',
    ...
]

similarity_model.bin

Case similarity comparator. Uses NearestNeighbour algorithm. Set to return the 5 nearest neighbours.

Input: A vector, which is the concatenation of the vector containing facts and the vector containing outcomes

Output: The indices (which have a direct mapping to case numbers using similarity_case_numbers [see above]) of the 5 most similar cases

from util.file import Load

model = Load.load_binary("similarity_model.bin")

facts_vector = [fact_1, fact_2, ..., fact_n]
outcomes_vector = [outcome_1, outcome_2, ..., outcome_n]
input_vector = facts_vector + outcomes_vector

# kneighbors expects a 2D array: one row per query
model.kneighbors([input_vector])
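
Assuming the model is a scikit-learn nearest-neighbours estimator, kneighbors returns a (distances, indices) pair with one row per query. The sketch below maps such a pair to case numbers; the helper name and the sample values are stand-ins for illustration.

```python
def nearest_cases(distances, indices, case_numbers):
    """Translate one kneighbors() result row into (case_number, distance) pairs."""
    return [
        (case_numbers[int(index)], float(distance))
        for distance, index in zip(distances[0], indices[0])
    ]


# Stand-in values shaped like the (distances, indices) pair from kneighbors()
distances = [[0.34, 0.49]]
indices = [[2, 0]]
case_numbers = ["AZ-000001", "AZ-000002", "AZ-000003"]

cases = nearest_cases(distances, indices, case_numbers)
```

In practice case_numbers would be loaded from similarity_case_numbers.bin, and the lower the distance, the more similar the case.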

*_scaler.bin

Every machine learning model requires a scaler to transform the data into values that significantly reduce training time.

*_regressor.bin

Models used to predict regressive results.

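import os

from keras.models import load_model
from util.file import Load
# Assumed import locations for Path and AbstractRegressor; adjust to your checkout
from util.constant import Path
from model_learning.regression.single_output_regression.abstract_regressor import AbstractRegressor

file_path = os.path.join(Path.binary_directory, '<regressor_name>')
regressor = load_model(file_path)
scaler = Load.load_binary('<your_scaler>')
# Chain the scaler and the regressor so raw facts can be passed directly
model = AbstractRegressor._create_pipeline(scaler, regressor)
input_data = [fact_1, fact_2, ..., fact_n]
prediction = model.predict([input_data])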

3. Installation Instructions

Set your Cyberjustice Lab username as an environment variable: either add export CJL_USER={USERNAME} to your .bashrc or run it as a command
Set your Cyberjustice Lab password as an environment variable: either add export CJL_PASS={PASSWORD} to your .bashrc or run it as a command
Run pip3 install -r requirements.txt
Run pip3 install -r requirements_test.txt
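
A quick way to confirm that both variables are visible to Python is sketched below. The helper read_cjl_credentials is hypothetical and is not part of the service; it only shows how a missing variable could be caught early.

```python
import os


def read_cjl_credentials(environ=os.environ):
    """Return (username, password) or raise if either variable is missing."""
    missing = [name for name in ("CJL_USER", "CJL_PASS") if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return environ["CJL_USER"], environ["CJL_PASS"]


# Example with an explicit mapping instead of the real environment
credentials = read_cjl_credentials({"CJL_USER": "alice", "CJL_PASS": "secret"})
```

Calling read_cjl_credentials() without arguments checks the real environment.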

4. File Structure


----| data <all data input and output>
--------| raw
------------| text_bk <extract precedents here>

--------| binary <all saved binarized model/data>
--------| cache <temp files>
--------| test <used for unit testing>

----| feature_extraction <all data manipulation before supervised training>
--------| feature_extraction.py <driver for feature extraction (using 3 drivers above)>
--------| pre_processing
------------| pre_processing_driver
----------------| filter_precedent
--------------------| precedent_directory_cleaner.py
--------| post_processing
------------| post_processing_driver.py <driver for post_processing>
----------------| regex
--------------------| regex_entity_extraction.py
--------------------| regex_lib.py
--------------------| regex_tagger.py

----| model_learning <supervised training>
------------| classifier
----------------| classifier_driver.py
----------------| multi_output
--------------------| multi_class_svm.py
------------| regression
----------------| regression_driver.py
----------------| single_output_regression
--------------------| abstract_regressor.py
--------------------| tenant_pays_landlord.py
--------------------| additional_indemnity.py
----------------| multi_output
--------------------| multi_output_regression.py
------------| similar_finder
----------------| similar_finder.py

----| util <common tool>
------------| log.py <logging tool>
------------| file.py <file save and load>
------------| constant.py <global variables>

----| web
--------| ml_controller.py

init.py
main.py <driver for the pipeline (feature extraction + model training)>

5. ML API

Predict the outcome based on given facts and demands. Returns an array of predicted outcomes as well as similar precedents. The precedents have distances assigned to them. The lower the distance, the more similar it is.

URL : /predict

Method : POST

Data constraints

Provide facts_vector and demands_vector, with key values for each fact/demand.

{
  "facts" : {
    "absent" : 1,
    "apartment_impropre" : 0,
    "apartment_infestation" : 1,
    "asker_is_landlord" : 1,
    "asker_is_tenant" : 1,
    "bothers_others" : 1,
    "disrespect_previous_judgement" : 1,
    "incorrect_facts" : 1,
    "landlord_inspector_fees" : 1,
    "landlord_notifies_tenant_retake_apartment" : 1,
    "landlord_pays_indemnity" : 1,
    "landlord_prejudice_justified" : 1,
    "landlord_relocation_indemnity_fees" : 1,
    "landlord_rent_change" : 1,
    "landlord_rent_change_doc_renseignements" : 1,
    "landlord_rent_change_piece_justification" : 1,
    "landlord_rent_change_receipts" : 0,
    "landlord_retakes_apartment" : 1,
    "landlord_retakes_apartment_indemnity" : 1,
    "landlord_sends_demand_regie_logement" : 0,
    "landlord_serious_prejudice" : 1,
    "lease" : 1,
    "proof_of_late" : 1,
    "proof_of_revenu" : 0,
    "rent_increased" : 1,
    "tenant_bad_payment_habits" : 1,
    "tenant_continuous_late_payment" : 1,
    "tenant_damaged_rental" : 1,
    "tenant_dead" : 1,
    "tenant_declare_insalubre" : 1,
    "tenant_financial_problem" : 0,
    "tenant_group_responsability" : 1,
    "tenant_individual_responsability" : 1,
    "tenant_is_bothered" : 1,
    "lack_of_proof" : 1,
    "tenant_landlord_agreement" : 0,
    "tenant_lease_fixed" : 1,
    "tenant_lease_indeterminate" : 1,
    "tenant_left_without_paying" : 0,
    "tenant_monthly_payment" : 1,
    "tenant_negligence" : 1,
    "tenant_not_request_cancel_lease" : 1,
    "tenant_owes_rent" : 1,
    "tenant_refuses_retake_apartment" : 1,
    "tenant_rent_not_paid_less_3_weeks" : 1,
    "tenant_rent_not_paid_more_3_weeks" : 0,
    "tenant_rent_paid_before_hearing" : 1,
    "tenant_violence" : 1,
    "tenant_withold_rent_without_permission" : 1,
    "violent" : 1
  }
}

Code : 200 OK

Content examples

{
  "outcomes_vector": {
    "additional_indemnity_money": "221",
    "authorize_landlord_retake_apartment": "0",
    "declares_housing_inhabitable": "0",
    "declares_resiliation_is_correct": "0",
    "landlord_prejudice_justified": "1",
    "landlord_retakes_apartment_indemnity": "0",
    "landlord_serious_prejudice": "0",
    "orders_expulsion": "1",
    "orders_immediate_execution": "1",
    "orders_landlord_notify_tenant_when_habitable": "0",
    "orders_resiliation": "1",
    "orders_tenant_pay_first_of_month": "0",
    "tenant_ordered_to_pay_landlord": "643",
    "tenant_ordered_to_pay_landlord_legal_fees": "80"
  },
  "probabilities_vector": {
    "additional_indemnity_money": "0.93",
    "authorize_landlord_retake_apartment": "1.0",
    "declares_housing_inhabitable": "1.0",
    "declares_resiliation_is_correct": "0.94",
    "landlord_prejudice_justified": "0.74",
    "landlord_retakes_apartment_indemnity": "1.0",
    "landlord_serious_prejudice": "1.0",
    "orders_expulsion": "0.88",
    "orders_immediate_execution": "0.72",
    "orders_landlord_notify_tenant_when_habitable": "1.0",
    "orders_resiliation": "0.91",
    "orders_tenant_pay_first_of_month": "0.99",
    "tenant_ordered_to_pay_landlord": "0.99",
    "tenant_ordered_to_pay_landlord_legal_fees": "0.91"
  },
  "similar_precedents": [
    {
      "distance": 0.3423500835013649,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "900.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "970.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "70.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "970.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "88.0"
      },
      "precedent": "AZ-51211608"
    },
    {
      "distance": 0.3429019324281239,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "735.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "873.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "0.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "873.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "80.0"
      },
      "precedent": "AZ-51176404"
    },
    {
      "distance": 0.49114649102172725,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "770.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "1360.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "590.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "1360.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "81.0"
      },
      "precedent": "AZ-51212451"
    },
    {
      "distance": 0.49200755901067444,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "945.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "1290.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "345.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "1290.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "72.0"
      },
      "precedent": "AZ-51201834"
    },
    {
      "distance": 0.4933548500076463,
      "facts": {
        "apartment_dirty": false,
        "asker_is_landlord": true,
        "asker_is_tenant": false,
        "bothers_others": false,
        "disrespect_previous_judgement": false,
        "landlord_inspector_fees": "0.0",
        "landlord_notifies_tenant_retake_apartment": false,
        "landlord_pays_indemnity": false,
        "landlord_relocation_indemnity_fees": "0.0",
        "landlord_rent_change": false,
        "landlord_rent_change_doc_renseignements": false,
        "landlord_retakes_apartment": false,
        "landlord_sends_demand_regie_logement": false,
        "rent_increased": false,
        "signed_proof_of_rent_debt": false,
        "tenant_continuous_late_payment": false,
        "tenant_damaged_rental": false,
        "tenant_dead": false,
        "tenant_financial_problem": false,
        "tenant_group_responsability": false,
        "tenant_individual_responsability": true,
        "tenant_is_bothered": false,
        "tenant_lease_indeterminate": false,
        "tenant_left_without_paying": false,
        "tenant_monthly_payment": "800.0",
        "tenant_not_paid_lease_timespan": "0.0",
        "tenant_owes_rent": "1400.0",
        "tenant_refuses_retake_apartment": false,
        "tenant_rent_not_paid_more_3_weeks": true,
        "tenant_sends_demand_regie_logement": false,
        "tenant_withold_rent_without_permission": false,
        "violent": false
      },
      "outcomes": {
        "additional_indemnity_money": "0.0",
        "authorize_landlord_retake_apartment": false,
        "declares_housing_inhabitable": false,
        "declares_resiliation_is_correct": false,
        "landlord_prejudice_justified": true,
        "landlord_retakes_apartment_indemnity": false,
        "landlord_serious_prejudice": false,
        "orders_expulsion": true,
        "orders_immediate_execution": true,
        "orders_landlord_notify_tenant_when_habitable": false,
        "orders_resiliation": true,
        "orders_tenant_pay_first_of_month": false,
        "tenant_ordered_to_pay_landlord": "0.0",
        "tenant_ordered_to_pay_landlord_legal_fees": "92.0"
      },
      "precedent": "AZ-51391660"
    }
  ]
}

Code : 400 Bad Request - Inputs not provided

Code : 404 Not Found - Conversation doesn't exist
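
Because the predicted outcomes arrive as strings, a client usually converts them before use. The sketch below is one possible way to read a /predict response; the helper name is an assumption and the sample is a trimmed-down copy of the response shown above.

```python
def summarize_prediction(response):
    """Split predicted outcomes into booleans and amounts, and rank precedents."""
    granted, amounts = [], {}
    for label, raw_value in response["outcomes_vector"].items():
        value = float(raw_value)  # values arrive as strings, e.g. "221"
        if value > 1:
            amounts[label] = value
        elif value == 1:
            granted.append(label)
    closest = sorted(response["similar_precedents"], key=lambda p: p["distance"])
    return granted, amounts, [p["precedent"] for p in closest]


# Trimmed-down response using values from the example above
response = {
    "outcomes_vector": {
        "additional_indemnity_money": "221",
        "orders_expulsion": "1",
        "orders_resiliation": "1",
        "landlord_serious_prejudice": "0",
    },
    "similar_precedents": [
        {"distance": 0.49114649102172725, "precedent": "AZ-51212451"},
        {"distance": 0.3423500835013649, "precedent": "AZ-51211608"},
    ],
}

granted, amounts, precedents = summarize_prediction(response)
```

Sorting by distance puts the most similar precedent first, whatever order the service returns them in.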

Get the weights of every outcome sorted by descending order of importance

URL: /weights

Method: GET

Data constraints

None

Code : 200 OK

Content examples

{
    "additional_indemnity_money": {
        "important_facts": [
            "asker_is_landlord",
            "tenant_withold_rent_without_permission",
            "tenant_refuses_retake_apartment",
            "tenant_monthly_payment",
            "tenant_not_paid_lease_timespan"
        ],
        "additional_facts": [
            "tenant_financial_problem",
            "tenant_owes_rent",
            "asker_is_tenant",
            "tenant_damaged_rental",
            "tenant_individual_responsability",
            "signed_proof_of_rent_debt",
            "tenant_lease_indeterminate",
            "tenant_dead",
            "tenant_is_bothered",
            "bothers_others"
        ]
    }
}
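A minimal sketch of how a client might read this response, assuming the payload has the shape shown above. The helper name `ranked_facts` is hypothetical, and the sample is a trimmed copy of the example.

```python
import json

# Trimmed copy of the example response above.
SAMPLE = """
{
    "additional_indemnity_money": {
        "important_facts": [
            "asker_is_landlord",
            "tenant_withold_rent_without_permission"
        ],
        "additional_facts": [
            "tenant_financial_problem",
            "tenant_owes_rent"
        ]
    }
}
"""


def ranked_facts(weights, outcome):
    """Return the facts for one outcome, important facts first.

    Assumes each list is already sorted by descending importance,
    as the endpoint description states.
    """
    entry = weights[outcome]
    return entry["important_facts"] + entry["additional_facts"]


weights = json.loads(SAMPLE)
facts = ranked_facts(weights, "additional_indemnity_money")
print(facts[0])
```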

Get the anti facts

In each pair, the left-hand side (the key) is always initialized to 1 and the right-hand side (the value) to 0.

URL: /antifacts

Method: GET

Data constraints

None

Code : 200 OK

Content examples

{
    "tenant_rent_not_paid_less_3_weeks": "tenant_rent_not_paid_more_3_weeks",
    "tenant_lease_fixed": "tenant_lease_indeterminate",
    "not_violent": "violent",
    "tenant_individual_responsability": "tenant_group_responsability"
}
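The sketch below shows one reading of the initialization rule, assuming the left-hand side is the JSON key and the right-hand side is its value. The function name `default_fact_values` is hypothetical and is not part of the service.

```python
# Example response from /antifacts, copied from above.
ANTIFACTS = {
    "tenant_rent_not_paid_less_3_weeks": "tenant_rent_not_paid_more_3_weeks",
    "tenant_lease_fixed": "tenant_lease_indeterminate",
    "not_violent": "violent",
    "tenant_individual_responsability": "tenant_group_responsability",
}


def default_fact_values(antifacts):
    """Build initial values: each key (left) gets 1, each value (right) gets 0."""
    defaults = {}
    for left, right in antifacts.items():
        defaults[left] = 1
        defaults[right] = 0
    return defaults


defaults = default_fact_values(ANTIFACTS)
print(defaults["not_violent"], defaults["violent"])
```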

Get the ML statistics

Used to obtain:

Size of data set
Variance of regression outcomes
Standard deviation of regression outcomes
Mean of regression outcomes
Prediction accuracy of each classifier

URL: /statistics

Method: GET

Data constraints

None

Code : 200 OK

Content examples

{
    "classifier": {
        "additional_indemnity_money": {
            "prediction_accuracy": 79.8400199975003
        },
        "authorize_landlord_retake_apartment": {
            "prediction_accuracy": 99.48756405449319
        },
        "declares_housing_inhabitable": {
            "prediction_accuracy": 99.95000624921884
        },
        "declares_resiliation_is_correct": {
            "prediction_accuracy": 91.83852018497687
        },
        "landlord_prejudice_justified": {
            "prediction_accuracy": 81.07736532933383
        },
        "landlord_retakes_apartment_indemnity": {
            "prediction_accuracy": 99.72503437070367
        },
        "landlord_serious_prejudice": {
            "prediction_accuracy": 96.35045619297588
        },
        "orders_expulsion": {
            "prediction_accuracy": 91.55105611798525
        },
        "orders_immediate_execution": {
            "prediction_accuracy": 84.32695913010873
        },
        "orders_landlord_notify_tenant_when_habitable": {
            "prediction_accuracy": 100
        },
        "orders_resiliation": {
            "prediction_accuracy": 93.48831396075491
        },
        "orders_tenant_pay_first_of_month": {
            "prediction_accuracy": 98.05024371953506
        },
        "tenant_ordered_to_pay_landlord": {
            "prediction_accuracy": 83.82702162229721
        },
        "tenant_ordered_to_pay_landlord_legal_fees": {
            "prediction_accuracy": 90.32620922384702
        }
    },
    "data_set": {
        "size": 40003
    },
    "regressor": {
        "additional_indemnity_money": {
            "mean": 1477.7728467101024,
            "std": 1927.8147997893939,
            "variance": 3716469.9022870203
        },
        "tenant_pays_landlord": {
            "mean": 2148.867088064977,
            "std": 2129.510243010276,
            "variance": 4534813.8750856845
        }
    }
}
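As a rough illustration of how these statistics could be used, the sketch below finds the least accurate classifier and checks that a regressor's standard deviation is the square root of its variance. The values are a trimmed copy of the example above; the helper names are hypothetical.

```python
import math

# Trimmed copy of the example response above.
STATS = {
    "classifier": {
        "additional_indemnity_money": {"prediction_accuracy": 79.8400199975003},
        "orders_expulsion": {"prediction_accuracy": 91.55105611798525},
        "orders_landlord_notify_tenant_when_habitable": {"prediction_accuracy": 100},
    },
    "regressor": {
        "additional_indemnity_money": {
            "mean": 1477.7728467101024,
            "std": 1927.8147997893939,
            "variance": 3716469.9022870203,
        },
    },
}


def least_accurate(stats):
    """Return the name of the classifier with the lowest prediction accuracy."""
    classifiers = stats["classifier"]
    return min(classifiers, key=lambda name: classifiers[name]["prediction_accuracy"])


def std_matches_variance(entry, rel_tol=1e-6):
    """Check that std is (approximately) the square root of variance."""
    return math.isclose(entry["std"], math.sqrt(entry["variance"]), rel_tol=rel_tol)


print(least_accurate(STATS))
print(std_matches_variance(STATS["regressor"]["additional_indemnity_money"]))
```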

6. Using the Command Line

* denotes optional arguments

From the source directory JusticeAi/src/ml_service/ you may run:

Pre Processing
python main.py -pre [number of files | empty for all]

Post Processing
i. Each fact and outcome is listed with its number of occurrences
ii. % of tagged lines is displayed
python3 main.py -post [number of files | empty for all]

Training
Note: Always train the svm before the sf and the svr
Testing results are displayed:
i. classifier: accuracy, F1, precision, recall
ii. regression: absolute error, r2
Arguments:
i. --svm: classifier
ii. --svr: regressor
iii. --sf: similarity finder
iv. --all: classifier, regressor, similarity finder
python3 main.py -train [data size | empty for all] --svm* --sf* --svr* --all*

PostgreSQL Database

Bring up all services:

./cjl up -d

Connect via psql:

./cjl run --rm postgresql_db "psql -h postgresql_db -U postgres"

The above command will prompt you to enter the database password.

SQL script backup via pg_dump:

export PGPASSWORD=$(printf '%s' "$POSTGRES_PASSWORD")
./cjl run --rm -e PGPASSWORD='$PGPASSWORD' postgresql_db "pg_dump -h postgresql_db -U postgres -p 5432"

export COMPOSE_FILE=ci
./cjl up -d && ./cjl run ocr_service

Extract Text

From provided image data, returns the text extracted from this data as a string.

URL : /ocr/extract_text

Method : POST

Headers : multipart/form-data

Data constraints

Provide the 'file' key with the image data as the value.

Code : 200 OK

Code : 400 Bad Request - No file key or no image data provided
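The following sketch builds, without sending, a multipart/form-data request carrying the 'file' key, using only the Python standard library. The host, port, file name, and function name are assumptions made for illustration; only the path /ocr/extract_text and the 'file' key come from the description above.

```python
import urllib.request
import uuid


def build_ocr_request(base_url, filename, image_bytes, content_type="image/png"):
    """Build (but do not send) a multipart/form-data POST for /ocr/extract_text."""
    boundary = uuid.uuid4().hex
    # A single form part named "file", as required by the data constraints.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        base_url + "/ocr/extract_text",
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Hypothetical host and port; point this at wherever the OCR service is reachable.
request = build_ocr_request("http://localhost:8080", "lease.png", b"fake image bytes")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that call is left out so the sketch runs without a live service.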

export COMPOSE_FILE=ci
./cjl up -d && ./cjl run web_client

The following technologies are in use in this service:

Bootstrap is an open source front end framework developed by Twitter. It contains styling for various common web components, such as forms and inputs, as well as providing a convenient grid system that greatly facilitates web page styling and layout.

Alternatives: Foundation Framework, pure.css, skeleton
Reason Chosen:
- Team member’s past experiences
- Industry standard

[Vue.js](https://vuejs.org/) is an open source front end framework for building single page applications. It leverages component based architecture that allows for the creation of an interactive website. Its primary purpose will be to power the visible portion of the chatbot, displaying messages, sending messages to the server, and prompting the user for various interactions such as answering questions or providing files to use as evidence.

Alternatives: AngularJS, Angular 4, ReactJs
Reason Chosen:
- Low learning curve
- High performance
- Small footprint and minimal API

Requires the following installed on the host system:

SQLite
Python3

pip install -r requirements.txt
FLASK_APP=app.py flask run

`POST /question`

Inserts a new user-generated question. Returns that user's ID.

{
    "question": "Is it okay for a landlord to ask for a security deposit?"
}

{
    "id": "5cd8a900-8a18-41b3-abb8-bf0307918afc"
}

415 if request does not contain valid JSON
422 if the question key is not present
422 if the question value is too long
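The error rules above can be summarized in a small sketch. This is not the service's code: the length limit `MAX_QUESTION_LENGTH`, the success status 200, and the function name are assumptions made for illustration.

```python
import json

# Hypothetical limit; the actual maximum length is not documented here.
MAX_QUESTION_LENGTH = 2000


def question_status(raw_body):
    """Return the HTTP status the rules above imply for a POST /question body."""
    try:
        payload = json.loads(raw_body)
    except (TypeError, ValueError):
        return 415  # request does not contain valid JSON
    if not isinstance(payload, dict) or "question" not in payload:
        return 422  # question key is not present
    if len(str(payload["question"])) > MAX_QUESTION_LENGTH:
        return 422  # question value is too long
    return 200  # assumed success status


print(question_status('{"question": "Is it okay for a landlord to ask for a security deposit?"}'))
print(question_status("not json"))
```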

`PUT /email`

Updates a user's email address based on their ID.

If an ID is provided, that ID's record is updated. If no ID is provided, a new record is created.

{
    "id": "5c17bfd0-87d0-4493-a312-f3f32323fff2",
    "email": "test@test.com"
}

{
    "id": "5c17bfd0-87d0-4493-a312-f3f32323fff2"
}

415 if request does not contain valid JSON
422 if the email key is not present
422 if the email value is too long

`PUT /subscription`

Updates a user's subscription status based on their ID. 1 is subscribed, 0 is not subscribed.

If an ID is provided, that ID's record is updated. If no ID is provided, a new record is created.

{
    "id": "5c17bfd0-87d0-4493-a312-f3f32323fff2",
    "is_subscribed": 1
}

{
    "id": "5c17bfd0-87d0-4493-a312-f3f32323fff2"
}

415 if request does not contain valid JSON
422 if the is_subscribed key is not present
422 if the is_subscribed key is not an integer

`PUT /legal`

Updates a user's status on whether they are a legal professional based on their ID. 1 is a legal professional, 0 is not.

If an ID is provided, that ID's record is updated. If no ID is provided, a new record is created.

{
    "id": "5c17bfd0-87d0-4493-a312-f3f32323fff2",
    "is_legal_professional": 1
}

{
    "id": "5c17bfd0-87d0-4493-a312-f3f32323fff2"
}

415 if request does not contain valid JSON
422 if the is_legal_professional key is not present
422 if the is_legal_professional key is not an integer

This page presents users and devs alike with an amalgam of quick fixes for the various issues we ran into.

Docker remove all images:
docker rmi $(docker images -a -q)

Docker remove all containers:
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)

Docker remove all volumes:
docker volume rm $(docker volume ls -q)

Docker remove all except volumes:
sudo docker system prune

Docker remove all volumes:
sudo docker volume prune

Add the docker group if it doesn't already exist:

sudo groupadd docker

Add the $USER you'd like to use to the docker group

sudo gpasswd -a $USER docker

a. Log yourself into the new docker group:

newgrp docker

b. Log out and log back in as the user you just added to the docker group.
Test whether you can run docker without sudo privileges by typing:

docker run hello-world

PostgreSQL

In the root directory, run ./cjl reset-db

Get into postgres container: docker exec -it <CONTAINER_ID> bash
Enter postgres command line: psql postgres postgres (if asked for password enter DEV_PASS_NOT_SECRET)
Type in order: DROP SCHEMA public CASCADE; CREATE SCHEMA public; GRANT ALL ON SCHEMA public TO postgres; GRANT ALL ON SCHEMA public TO public;
Type: \q to exit psql

Ensure that you did not build as root
Ensure the environment variables are set up in ~/.bashrc
If you built as root, run ./cjl clean as root, then ./cjl build --no-cache as a non-root user

We've noticed that the database sometimes takes about 30 seconds to create the models at runtime and become ready to accept connections. In the meantime, the application services throw an error because they require a database connection at startup. Running ./cjl down && ./cjl up makes the problem go away, but it may be worth having the application servers wait and perform a health check on the database before attempting to create a database connection.
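The wait-and-check idea could be sketched as below. This is an illustration, not code from the project: the function name, retry count, and delay are assumptions. It only checks that the TCP port accepts connections, which does not guarantee that the models have been created.

```python
import socket
import time


def wait_for_port(host, port, attempts=30, delay=1.0):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if every attempt fails. Sleeps `delay` seconds
    between attempts.
    """
    for attempt in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=delay):
                return True
        except OSError:
            # Not reachable yet; pause before retrying unless this was the last try.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

An application service could call `wait_for_port("postgresql_db", 5432)` before opening its database connection, using the host name and port that appear in the pg_dump command earlier in this page.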
