Model deployment¶
In ABM you can deploy your model in two ways:
- By scoring a new dataset
- By generating scoring code and using it to score data in your custom process
Scoring a new data¶
After your predictive model is built, you can calculate scores for the given dataset.
For instance, if the goal of your project was to select the best target group for your marketing campaign, you would like to know the response probability for each customer in your database. In this task, ABM will assign the response probability (calculated based on the predictive model built by ABM) to your customers (identified by an ID variable which must be unique for each customer).
To start the scoring data task:
- From the Homepage level, open the selected project by clicking on its name or click the Open project button
- Click the Score data button (from the Project’s reports level)
The scoring data task is divided into six steps for classification projects and five steps for approximation projects.
Step 1: Select a file for scoring
Step 2: Input detailed information about the file selected for scoring
Step 3: Select threshold (this step is omitted for approximation projects)
Step 4: Choose the ID variable(s) to which score values will be assigned (e.g. customer’s id)
Step 5: Start the data scoring process
Step 6: Download scoring result
And that’s it!
Step 1: Selecting file for scoring¶
Select an already uploaded file with the data that will be used for scoring. Currently, the .csv format is accepted (other formats are planned).
Remark: It’s crucial that the file for scoring and for model construction have the same structure:
- identical number of variables
- variables should have the same names and types
You can also add a new file by clicking the Browse for files button and by selecting the file from your computer. Upload the file with the Upload button and wait until it’s sent to the ABM server.
Choose the file by clicking its name and the Next button if you want to proceed to Step 2.
Step 2: Entering file settings¶
In Step 2 you will be asked to indicate what separators were used in the chosen file, including:
- Columns separators. For instance, if columns in CSV file are separated by a comma (e.g. name, email, age), write , in the first field
- Decimal separators. For instance, if a decimal character is a dot (e.g. 4.25) in CSV file, then write . in the second field
- Text separators. For instance, if a quotation mark is used to indicate the beginning and end of the text (e.g.’Baker Street 221b’), write ‘ in the third field
- File encoding. Choose the character encoding used in your file. If you need to add specific encoding to the list, contact us at abm_support@algolytics.com
Click the Next button if you want to proceed to Step 3 or the Back button if you want to change the previous project settings.
Step 3: Selecting threshold (classification projects)¶
In step 3 you can set the classification threshold for scoring. ABM suggests the optimal threshold based on the final model and the quality measure chosen when building the model. The user can manually change the threshold (for instance based on the model quality measures such as KS Score, Equal TPR TNR Score, Min Distance Score or the values on the ROC curve).
Step 4: Selecting identity variable¶
In Step 4 you will be asked to indicate the ID variable(s) name(s). The ID variable is a variable with unique values which is used for identification of observations (e.g. customer ID). The target predicted value will be assigned to this variable.
Click the Next button if you want to proceed or the Back button if you want to change the previous project settings.
Step 5: Running data scoring process¶
You’re almost there. Check if the information provided is correct. Click the Run button to start the scoring process or the Back button if you want to change the scoring process settings.
Step 6: Downloading scoring results¶
When the scoring process is finished you will see the following screen:
Click the Download button to see the scoring result. For classification projects the output file is a .csv file containing three columns divided by commas:
- PositiveTargetProb: the probability of a positive value of the target variable (the situation when the event occurred, e.g. 1, the customer churned)
- Identity variable name: name of the chosen ID variable to which the probability will be assigned (e.g. customer ID)
- PredictedTarget: the predicted value of the target variable. By default, if the threshold = 0.5, PredictedTarget will be equal to ‘1’ if the PositiveTargetProb >= 0.5, and to ‘0’ otherwise
For approximation projects the output file is a .csv file containing two columns divided by commas:
- PredictedTargetValue: the predicted value of your target variable
- Identity variable name: name of the chosen ID variable to which the predicted value will be assigned (e.g. product id)
You can also access the output file with the scoring results from the Repository.
Click the Back button to change the scoring process settings and to start the scoring process again. You can also return to model’s reports or go to ABM Homepage.
Generating scoring code¶
You can also download the scoring code for the final model. The scoring code is a code (in case of ABM: SQL or Java code) that given the input variables, calculates the model’s output. It includes the code to perform all the necessary variable transformations and formulas to produce the final output variables (e.g. PositiveTargetProb, PredictedTarget for classification). In order to generate the scoring code, follow the below steps:
Open project and click the Scoring code button.
Select what language your scoring code should be generated in. The following options are available:
- Java
- general SQL (should be modified accordingly to your database)
- SQL for Mysql
- SQL for Teradata
- SQL for Oracle
- SQL for Postgres
Click the Generate button and then download the file.