Adding, running and deleting projects¶
Adding a new project¶
In order to create a new project, go to the Homepage and click the Add New Project button.
In the next five steps you will be asked to:
Step 1: Enter the name for your project and select a source file
Step 2: Select the Working mode
Step 3: Enter variables settings
Step 4: Enter building model process settings
Step 5: Check if the information provided is correct
In the next chapters, you will learn how to specify the settings of your projects with reference to the above steps.
Step 1: Naming a new project and selecting source data¶
In the first field, enter a project name (something easy to remember).
Remark: The project name should be no longer than 32 characters
Then select the file that will be used to build your model. You can choose an already imported file from the list. If no data is available, or you want to import new data, click the Add new data button and follow the steps described in the chapter Importing files.
Click the Next button to proceed.
Step 2: Selecting Working Mode¶
There are three Working modes available:
- The Quick mode enables fast creation of a base model. It uses the Regularized Regression algorithm in the modelling process, which helps you obtain an accurate model in a relatively short time. The algorithm automatically imputes missing values, selects the most significant modelling variables and applies optimal variable transformations
- The Advanced mode is an extension of the Quick mode. It uses more advanced methods for feature selection and data preparation (discretization techniques, binarization of categorical variables, imputation of missing values, and identification of outliers)
- The Gold mode is an extension of the Advanced mode with a more in-depth search through possible predictive modelling paths. It requires more time for the modelling process and also more time for scoring, because the final result is provided as an ensemble of several models
Click the name of the working mode you want to use in your project in order to proceed to the next step.
Step 3: Entering variables settings¶
In this step you will be asked to:
- Choose a target variable (e.g. TARGET) and define which value of the target variable is the positive one (e.g. 1)
- (optional) Specify how ABM should use selected variables by indicating their type (ACTIVE, INACTIVE, OBLIGATORY, ID) and/or role (CATEGORICAL, NUMERICAL)
Here is a short description of what particular types and roles mean:
- ACTIVE: if a variable is active, it means that it will be taken into account during the model building process (but it doesn’t mean it will be selected for the model)
- INACTIVE: if a variable is inactive, it means that it will be ignored during the model building process
- OBLIGATORY: if a variable is obligatory, it means that it will be chosen during the feature selection stage, but not necessarily included in the final model
- ID: an ID variable is a variable with unique values used to identify observations (e.g. customer ID). In the scoring process, scores are assigned to the observations identified by the ID variable
- CATEGORICAL: a variable whose value is one of several possible categories (e.g. gender, occupation, eye colour). Categorical variables have no numerical meaning
- NUMERICAL: a variable naturally measured as a number (e.g. age, income, temperature) to which arithmetic operations can be applied
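The types and roles described above can be pictured as a small data model. The sketch below is only an illustration: the names `VarType`, `VarRole`, `VariableSetting` and `modelling_candidates`, as well as the defaults (ACTIVE, NUMERICAL), are hypothetical and are not part of ABM.

```python
from dataclasses import dataclass
from enum import Enum


class VarType(Enum):
    ACTIVE = "ACTIVE"          # considered during model building
    INACTIVE = "INACTIVE"      # ignored during model building
    OBLIGATORY = "OBLIGATORY"  # always passes the feature selection stage
    ID = "ID"                  # identifies observations, not a predictor


class VarRole(Enum):
    CATEGORICAL = "CATEGORICAL"
    NUMERICAL = "NUMERICAL"


@dataclass
class VariableSetting:
    name: str
    var_type: VarType = VarType.ACTIVE   # assumed default, for illustration only
    role: VarRole = VarRole.NUMERICAL    # assumed default, for illustration only


def modelling_candidates(settings):
    """Return names of variables that may enter model building."""
    usable = {VarType.ACTIVE, VarType.OBLIGATORY}
    return [s.name for s in settings if s.var_type in usable]


settings = [
    VariableSetting("CUSTOMER_ID", VarType.ID, VarRole.CATEGORICAL),
    VariableSetting("AGE"),
    VariableSetting("GENDER", VarType.OBLIGATORY, VarRole.CATEGORICAL),
    VariableSetting("NOTES", VarType.INACTIVE, VarRole.CATEGORICAL),
]
print(modelling_candidates(settings))
```

In this example only AGE and GENDER remain candidates for the model: CUSTOMER_ID is used for identification and NOTES is ignored.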
In order to assign a specific type and/or role to a single variable:
- Filter rows to see only the selected variable
- Choose from the list the role and/or type that the variable should have
If you want to set a specific type/role for multiple variables:
- Filter rows to see only the selected variables
- Choose from the list the role and/or type that the selected variables should have
- Click the Set Role and/or the Set Type button to apply the changes
Click the Next button if you want to proceed or the Back button if you want to change the previous project settings.
Step 4: Building model process settings¶
In this step, you will be asked to enter various settings that specify the modelling process.
- Sampling Mode: determines the sample selection method during the data sampling stage. The following modes are available (default: MANUAL):
  - MANUAL: the user sets the sample size manually
  - AUTO_ADVANCED: the sample size is selected automatically
- (optional) Sample size: when Sampling Mode is set to MANUAL, the user enters the sample size in this field. The default setting is 30 000
- Stratification Mode: ABM supports stratified sampling, which is especially useful when the proportion of positive target values in the data set is small. This setting determines how ABM behaves when it is not possible to ensure both the specified sample size (Sample size) and the specified proportion of positive target values (Positive Target Category Ratio). The following modes are available (default: CONST_NUM):
  - CONST_RATIO: ABM preserves the proportion of the positive target; if there are not enough positive target values in the data set, the resulting sample will be smaller than specified
  - CONST_NUM: ABM preserves the sample size; if there are not enough positive target values in the data set, their proportion in the resulting sample will be smaller than specified
  - NONE: stratified sampling is turned off and the original proportion of the positive target is preserved
  - OVERSAMPLING: when there are not enough positive target values in the data set, positive target observations are drawn multiple times in order to maintain both the proportion of the positive target in the sample and the sample size itself
- Positive Target Category Ratio: stratified sampling setting which specifies the proportion of the positive target in the resulting sample; if Stratification Mode is set to NONE, this option is ignored (default: 0.5)
- Classification Model Quality Measure: the measure used to assess model quality (default: LIFT)
  - LIFT: a measure of the effectiveness of a classification model, calculated as the ratio between the results obtained with and without the model. Lift shows how much more likely we are to find a positive target value when using the model than when selecting a random sample
  - ACCURACY: the proportion of the total number of predictions that were correct
  - CAPTURED_RESPONSE: the percentage of all positive cases that fall within a given top percentage of all cases, sorted by decreasing score
  - PRECISION: the proportion of cases predicted as positive that are actually positive
  - RECALL: the proportion of actual positive cases that are correctly identified
- Cut-off: when LIFT or CAPTURED_RESPONSE is selected as the Classification Model Quality Measure, the user enters the cut-off value for the selected measure, that is, the percentile at which ABM will optimise the model (default: 0.1)
- Classification Threshold: the probability value above which the model classifies an observation as the positive target; an observation whose probability is below this value is classified as the negative target (default: 0.5)
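To make the four Stratification Modes easier to compare, the sketch below computes how many positive and negative observations a sample would contain under each mode. It follows only the descriptions given above; it is not ABM's implementation. The function name `sample_composition` is hypothetical, and the sketch assumes that the ratio lies strictly between 0 and 1 and that enough negative observations are available.

```python
def sample_composition(n_pos, n_neg, sample_size=30_000, ratio=0.5, mode="CONST_NUM"):
    """Return (positives, negatives) drawn under the given Stratification Mode.

    Assumes 0 < ratio < 1 and a sufficient number of negative observations.
    """
    wanted_pos = round(sample_size * ratio)

    if mode == "NONE":
        # No stratification: keep the original proportion of positives.
        size = min(sample_size, n_pos + n_neg)
        pos = round(size * n_pos / (n_pos + n_neg))
        return pos, size - pos

    if mode == "OVERSAMPLING":
        # Positives may be drawn more than once, so both targets are met.
        return wanted_pos, sample_size - wanted_pos

    pos = min(wanted_pos, n_pos)  # cannot draw more positives than exist

    if mode == "CONST_RATIO":
        # Keep the ratio; the sample shrinks when positives are scarce.
        return pos, round(pos * (1 - ratio) / ratio)

    if mode == "CONST_NUM":
        # Keep the sample size; the share of positives drops instead.
        return pos, sample_size - pos

    raise ValueError(f"unknown Stratification Mode: {mode}")


# Example: 5 000 positives and 200 000 negatives, default size and ratio.
for mode in ("CONST_RATIO", "CONST_NUM", "NONE", "OVERSAMPLING"):
    print(mode, sample_composition(5_000, 200_000, mode=mode))
```

With the default sample size of 30 000 and ratio of 0.5, this data set has too few positives for a balanced sample, so each mode resolves the conflict differently: CONST_RATIO returns a smaller sample, CONST_NUM a less balanced one, and OVERSAMPLING reuses positive observations.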
Click the Next button if you want to proceed or the Back button if you want to change the previous project settings.
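The quality measures, Cut-off and Classification Threshold from this step can be illustrated on a toy data set. The sketch below uses the standard textbook definitions; the function names `threshold_measures` and `cutoff_measures` are hypothetical, and the exact way ABM computes these values (for example, how ties or a score equal to the threshold are handled) is an assumption here.

```python
def threshold_measures(y_true, scores, threshold=0.5):
    """ACCURACY, PRECISION and RECALL at a given Classification Threshold."""
    # Assumption: a score strictly above the threshold means "positive".
    predicted = [1 if s > threshold else 0 for s in scores]
    pairs = list(zip(y_true, predicted))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    return {
        "ACCURACY": (tp + tn) / len(pairs),
        "PRECISION": tp / (tp + fp) if tp + fp else 0.0,
        "RECALL": tp / (tp + fn) if tp + fn else 0.0,
    }


def cutoff_measures(y_true, scores, cutoff=0.1):
    """LIFT and CAPTURED_RESPONSE in the top `cutoff` share of cases.

    Assumes y_true contains at least one positive (1) value.
    """
    ranked = [y for _, y in sorted(zip(scores, y_true),
                                   key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    k = max(1, round(cutoff * n))   # number of top-scored cases considered
    hits = sum(ranked[:k])          # positives among the top k cases
    total_pos = sum(ranked)
    return {
        "CAPTURED_RESPONSE": hits / total_pos,
        # (share of positives in the top k) / (share of positives overall)
        "LIFT": (hits * n) / (k * total_pos),
    }


y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
print(threshold_measures(y_true, scores, threshold=0.5))
print(cutoff_measures(y_true, scores, cutoff=0.2))
```

In this toy example, the two highest-scored cases are both positive, so at a cut-off of 0.2 the model captures half of the four positives and is 2.5 times better than random selection.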
Step 5: Summary¶
You are almost ready to build a predictive model with ABM. In this final step, check whether all the information provided in Steps 1 to 4 is correct.
If you have no remarks, click the Finish button and you will be sent to the Homepage where you can run the process. If not, click the Back button to make changes.
Changing project settings¶
Sometimes you may come to the conclusion that the settings of the project you have just added are not right and you wish to change them. You can do this before running the process:
- From the Homepage, click the Open project button or click the project name
- Click the Change settings button and follow Steps 1 to 5 described in the previous chapters
Remark: You can also change settings after the model is built to check how it performs with other parameters. However, to avoid overwriting results you have already obtained, we suggest adding another project with the new parameters and comparing the models built within both projects.
Running a project¶
You can run the model building process based on the imported source file and settings in two ways:
- From the Homepage by clicking the Run project button
- By choosing the project name and then clicking the Run button
The model building process may take a while depending on the working mode you have chosen (Gold mode takes the longest). So it’s time to grab a cup of coffee and wait until it finishes :)
You do not have to wait for the whole process to finish: you can explore the report for a particular process stage as soon as ABM finishes calculating its statistics. Open the project by clicking on its name and then click the icon to see the stage result.
You can monitor the progress of the model calculation using the progress bar available after opening the project.
Go to the Models and variables statistics Chapter to see what information is available in each of the seven reports.
After your predictive model is built, you can calculate scores for new data. Go to the Data Scoring Chapter to see details.
Deleting a project¶
In order to delete a project, click the Delete button available from the Homepage.
You will be asked for confirmation.