Adding, running and deleting projects

Adding a new project

In order to create a new project, go to the Homepage and click the Add New Project button.

_images/homepageaddnewproject.jpg

In the next five steps you will be asked to:

Step 1: Enter the name for your project and select a source file

Step 2: Select the Working mode

Step 3: Enter variables settings

Step 4: Enter building model process settings

Step 5: Check if the information provided is correct

In the next chapters, you will learn how to specify the settings of your projects with reference to the above steps.

Step 1: Naming a new project and selecting source data

_images/addingnewprojectstep1.jpg

In the first field, enter a project name (something easy to remember).

Remark: The project name should be no longer than 32 characters
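The length limit from the remark above can be checked before you type the name into the form. The snippet below is only an illustrative sketch and is not part of ABM: the function name is hypothetical, and rejecting empty names is an assumption, since the guide only states the 32-character limit.

```python
MAX_PROJECT_NAME_LENGTH = 32  # limit stated in the remark above


def is_valid_project_name(name: str) -> bool:
    """Return True if the name fits within the 32-character limit.

    Assumption: a name that is empty or only whitespace is also rejected.
    """
    stripped = name.strip()
    return 0 < len(stripped) <= MAX_PROJECT_NAME_LENGTH


print(is_valid_project_name("churn_model_spring_campaign"))
print(is_valid_project_name("x" * 33))
```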

Then indicate the file that will be used to build your model by choosing one of the already imported files from the list. If no data is available or you want to import new data, click the Add new data button and follow the steps described in the Importing files chapter.

Click the Next button to proceed.

Step 2: Selecting Working Mode

_images/addingnewprojectstep2.jpg

There are three Working modes available:

  • The Quick mode enables fast creation of the base model. Because it uses the Regularized Regression algorithm in the modelling process, you obtain an accurate model in a relatively short time. The algorithm automatically handles imputation of missing values, selection of the most significant modelling variables, and application of optimal variable transformations
  • The Advanced mode is an extension of the Quick mode. It uses more advanced methods for feature selection and data preparation (discretization techniques, binarization of categorical variables, imputation of missing values, and identification of outliers)
  • The Gold mode is an extension of the Advanced mode with a more in-depth search through possible predictive modelling paths. It requires more time for the modelling process and also more time for scoring, as the final result is provided as a combination (ensemble) of several models

Click the name of the working mode you want to use in your project in order to proceed to the next step.

Step 3: Entering variables settings

In this step you will be asked to:

  1. Choose a target variable (e.g. TARGET) and define which value of the target variable counts as positive (e.g. 1)
_images/addingnewprojectstep31.jpg
  2. (Optional) Specify how ABM should use selected variables by indicating their type (ACTIVE, INACTIVE, OBLIGATORY, ID) and/or role (CATEGORICAL, NUMERICAL)

Here is a short description of what particular types and roles mean:

  • ACTIVE: if a variable is active, it means that it will be taken into account during the model building process (but it doesn’t mean it will be selected for the model)
  • INACTIVE: if a variable is inactive, it means that it will be ignored during the model building process
  • OBLIGATORY: if a variable is obligatory, it means that it will be chosen during the feature selection stage, but not necessarily included in the final model
  • ID: ID variable is a variable with unique values used for identification of observations (e.g. customer ID). In the scoring process, scoring points will be assigned to the ID variable
  • CATEGORICAL: a variable whose value is one of several possible categories (e.g. gender, occupation, eye colour). Categorical variables have no numerical meaning
  • NUMERICAL: a variable naturally measured as a number (e.g. age, income, temperature), to which arithmetic operations can be applied
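
To make the distinction between types and roles concrete, here is a small sketch that models the settings as plain data. It is not ABM code: the class and function names and the example variables are hypothetical, and treating ID variables as identifiers only (not as model inputs) is an assumption based on the description above.

```python
from dataclasses import dataclass
from enum import Enum


class VarType(Enum):
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"
    OBLIGATORY = "OBLIGATORY"
    ID = "ID"


class VarRole(Enum):
    CATEGORICAL = "CATEGORICAL"
    NUMERICAL = "NUMERICAL"


@dataclass
class VariableSetting:
    name: str
    var_type: VarType
    role: VarRole


def modelling_candidates(settings):
    """Return names of variables taken into account during model building.

    Assumption: only ACTIVE and OBLIGATORY variables are considered;
    INACTIVE variables are ignored and ID variables only identify rows.
    """
    considered = {VarType.ACTIVE, VarType.OBLIGATORY}
    return [s.name for s in settings if s.var_type in considered]


# Hypothetical example variables
settings = [
    VariableSetting("customer_id", VarType.ID, VarRole.CATEGORICAL),
    VariableSetting("age", VarType.ACTIVE, VarRole.NUMERICAL),
    VariableSetting("occupation", VarType.OBLIGATORY, VarRole.CATEGORICAL),
    VariableSetting("internal_note", VarType.INACTIVE, VarRole.CATEGORICAL),
]

print(modelling_candidates(settings))  # ['age', 'occupation']
```

Note that every variable carries both a type and a role: the type controls whether the variable is used, while the role controls how its values are interpreted.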

In order to assign a specific type and/or role to selected variable(s):

  1. Filter the rows to see only the selected variable(s)
  2. From the list, choose the role and/or type that the variable should have
_images/addingnewprojectstep32.jpg

If you want to set a specific type/role to multiple variables:

  1. Filter the rows to see only the selected variables
  2. From the list, choose the role and/or type that the selected variables should have
  3. Click the Set Role and/or Set Type button to apply the changes
_images/addingnewprojectstep33.jpg

Click the Next button if you want to proceed or the Back button if you want to change the previous project settings.

Step 4: Building model process settings

In this step, you will be asked to enter various settings that specify the modelling process.

  1. Sampling Mode: determines the sample selection method during the data sampling stage. The following modes are available (default: MANUAL):
    1. MANUAL: the user sets the sample size manually
    2. AUTO_ADVANCED: sample size is selected automatically
_images/addingnewprojectstep41.jpg
  2. (Optional) Sample size: in conjunction with MANUAL Sampling Mode, the user enters the sample size in this field. The default setting is 30 000
_images/addingnewprojectstep42.jpg
  3. Stratification Mode: ABM supports the use of stratified sampling. This is especially useful when the proportion of the positive target values is small within the data set. The user can determine how ABM behaves when it is not possible to ensure both the specified sample size (user specified: Sample Size) and the proportion of positive target values (user specified: Positive Target Category Ratio). The following modes are available (default: CONST_NUM):
    1. CONST_RATIO: ABM will preserve the proportion of the positive target; if there are not enough positive target values in the data set, the resulting sample size will be smaller than specified
    2. CONST_NUM: ABM will preserve the sample size; if there are not enough positive target values in the data set, their proportion in the resulting sample will be smaller
    3. NONE: stratified sampling is turned off, the original proportion of the positive target will be preserved
    4. OVERSAMPLING: when there are not enough positive target values in the data set, positive target samples will be drawn multiple times in order to maintain the proportion of the positive target in the sample as well as the sample size itself
_images/addingnewprojectstep43.jpg
  4. Positive Target Category Ratio: stratified sampling setting which specifies the proportion of the positive target in the resulting sample; if Stratification Mode is set as NONE, this option is ignored (default: 0.5)
_images/addingnewprojectstep44.jpg
  5. Classification Model Quality Measure: the user can select the best way of measuring model quality (default: LIFT)
    1. LIFT: lift is a measure of the effectiveness of a classification model calculated as the ratio between the results obtained with and without the model. Lift shows how much more likely we are to receive positive target value when using a model than if we select a random sample
    2. ACCURACY: the proportion of the total number of predictions that were correct
    3. CAPTURED RESPONSE: the percentage of all positive cases found within the given top percentage of cases, sorted by decreasing score
    4. PRECISION: the proportion of cases predicted as positive that are actually positive
    5. RECALL: the proportion of actual positive cases which are correctly identified
_images/addingnewprojectstep45.jpg
  6. Cut-off: when LIFT or CAPTURED_RESPONSE is selected as Classification Model Quality Measure, the user enters the cut-off value for the selected measure (the percentile at which ABM will optimise the model) (default: 0.1)
_images/addingnewprojectstep46.jpg
  7. Classification Threshold: threshold probability value above which an observation is classified as the positive target by the model; probabilities returned by the model which are below this value mean that the observation is classified as the negative target (default: 0.5)
_images/addingnewprojectstep47.jpg
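
The quality measures, the Cut-off and the Classification Threshold described above can be illustrated on a tiny hand-made data set. The sketch below uses the standard textbook definitions and is not ABM's internal implementation; all function names and the example data are hypothetical. How a probability exactly equal to the threshold is treated is an assumption (here it counts as negative).

```python
def classify(probabilities, threshold=0.5):
    """Label an observation 1 when its probability is above the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]


def _true_positives(actual, predicted):
    return sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)


def accuracy(actual, predicted):
    """Share of all predictions that were correct."""
    return sum(1 for a, p in zip(actual, predicted) if a == p) / len(actual)


def precision(actual, predicted):
    """Share of predicted positives that are actually positive."""
    predicted_pos = sum(predicted)
    return _true_positives(actual, predicted) / predicted_pos if predicted_pos else 0.0


def recall(actual, predicted):
    """Share of actual positives that were predicted as positive."""
    actual_pos = sum(actual)
    return _true_positives(actual, predicted) / actual_pos if actual_pos else 0.0


def _positives_in_top(actual, probabilities, cutoff):
    # Sort by decreasing score and keep the top `cutoff` fraction of cases.
    n_top = max(1, round(len(actual) * cutoff))
    ranked = sorted(zip(probabilities, actual), key=lambda pair: pair[0], reverse=True)
    return sum(a for _, a in ranked[:n_top]), n_top


def captured_response(actual, probabilities, cutoff=0.1):
    """Share of all positives found in the top `cutoff` fraction of cases."""
    in_top, _ = _positives_in_top(actual, probabilities, cutoff)
    total_pos = sum(actual)
    return in_top / total_pos if total_pos else 0.0


def lift(actual, probabilities, cutoff=0.1):
    """Positive rate in the top fraction divided by the overall positive rate."""
    in_top, n_top = _positives_in_top(actual, probabilities, cutoff)
    base_rate = sum(actual) / len(actual)
    return (in_top / n_top) / base_rate if base_rate else 0.0


# Hypothetical data: 10 observations, 3 of them positive
actual = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.3, 0.2, 0.1, 0.1]

predicted = classify(probabilities, threshold=0.5)
print("accuracy:", accuracy(actual, predicted))
print("precision:", precision(actual, predicted))
print("recall:", recall(actual, predicted))
print("captured response at 0.2:", captured_response(actual, probabilities, cutoff=0.2))
print("lift at 0.2:", lift(actual, probabilities, cutoff=0.2))
```

In this example the top 20% of cases (two observations) are both positive, so the lift at that cut-off is about 3.3: the model finds positives roughly three times more often than a random selection of the same size would.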

Click the Next button if you want to proceed or the Back button if you want to change the previous project settings.
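
To see how the Stratification Mode options in this step differ, the following sketch computes how many positive and negative observations would end up in the sample under each mode. It is a simplified reading of the descriptions above, not ABM's actual sampling code. The function name is hypothetical, and the sketch assumes that enough negative observations are always available and that the data set is at least as large as the sample.

```python
def stratified_counts(n_pos, n_neg, sample_size=30_000, ratio=0.5, mode="CONST_NUM"):
    """Return (positives, negatives) to draw for the sample.

    n_pos, n_neg: positive and negative observations available in the data.
    sample_size:  Sample size setting (default 30 000).
    ratio:        Positive Target Category Ratio (default 0.5).
    mode:         Stratification Mode (default CONST_NUM).
    """
    if mode == "NONE":
        # Stratification off: keep the original proportion; ratio is ignored.
        pos = round(sample_size * n_pos / (n_pos + n_neg))
        return pos, sample_size - pos

    if not 0 < ratio < 1:
        raise ValueError("ratio must be between 0 and 1 (exclusive)")
    wanted_pos = round(sample_size * ratio)

    if mode == "OVERSAMPLING":
        # Positives may be drawn multiple times, so size and ratio are both kept.
        return wanted_pos, sample_size - wanted_pos
    if mode == "CONST_NUM":
        # Keep the sample size; the positive share shrinks if positives run out.
        pos = min(n_pos, wanted_pos)
        return pos, sample_size - pos
    if mode == "CONST_RATIO":
        # Keep the ratio; the sample shrinks if positives run out.
        pos = min(n_pos, wanted_pos)
        total = round(pos / ratio)
        return pos, total - pos
    raise ValueError(f"unknown stratification mode: {mode}")


# Hypothetical data set: 2 000 positives among 100 000 observations
for mode in ("CONST_NUM", "CONST_RATIO", "NONE", "OVERSAMPLING"):
    print(mode, stratified_counts(2_000, 98_000, mode=mode))
# CONST_NUM (2000, 28000)
# CONST_RATIO (2000, 2000)
# NONE (600, 29400)
# OVERSAMPLING (15000, 15000)
```

With only 2 000 positives available, CONST_NUM keeps all 30 000 rows but the positive share drops below 0.5, CONST_RATIO keeps the 0.5 share but shrinks the sample to 4 000 rows, and OVERSAMPLING keeps both by repeating positive observations.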

Step 5: Summary

You are almost ready to build a predictive model with ABM. In this final step, check whether all the information provided in Steps 1 to 4 is correct.

_images/addingnewprojectstep5.jpg

If everything is correct, click the Finish button and you will be sent to the Homepage, where you can run the process. If not, click the Back button to make changes.

Changing project settings

If you decide that the settings of a project you have just added are not right, you can change them before running the process:

  • From the Homepage, click the Open project button or click the project name
_images/changingsettingsHstep1.jpg _images/changingsettingsHstep2.jpg

Remark: You can also change settings after the model is built to check how it performs with other parameters. However, to avoid overwriting results you have already obtained, we suggest adding another project with the new parameters and comparing the models built within both projects.

Running a project

You can run the model building process based on the imported source file and settings in two ways:

  • From the Homepage by clicking the Run project button
_images/runningproject1.jpg
  • By choosing the project name and then clicking the Run button
_images/runningproject2.jpg

The model building process may take a while depending on the working mode you have chosen (Gold mode takes the longest). So it’s time to grab a cup of coffee and wait until it finishes :)

You do not have to wait for the whole process to finish, though: you can explore the report for a particular process stage as soon as ABM finishes calculating its statistics. Open the project by clicking on its name and then click the icon to see the stage result.

_images/runningproject3.jpg

You can monitor the progress of calculating the model thanks to the progress bar available after opening the project.

Go to the Models and variables statistics Chapter to see what information is available in each of the seven reports.

After your predictive model is built, you can calculate scores for new data. Go to the Data Scoring Chapter to see details.

Deleting a project

In order to delete a project, click the Delete button available from the Homepage.

_images/deletingproject1.jpg

You will be asked for confirmation.

_images/deletingproject2.jpg