A sample is a subset of historical data that you extract by applying a selection or sampling method to the data source. Constructing a sample produces the development and, optionally, validation data sets that you use for analysis and modeling.
Select the weight field if present.
A weight field is typically available when the data was already sampled before you used it in the Analytics Center portal. If you do not specify a weight field, each case counts as one.
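As a rough illustration of how a weight field changes case counts, the following Python sketch sums the weights over a set of records and counts each record as one case when no weight field is given. The record layout, the field name `w`, and the function `total_cases` are hypothetical and are not part of Analytics Center.

```python
def total_cases(records, weight_field=None):
    """Return the number of cases that the records represent.

    With no weight field, each record counts as one case.
    """
    if weight_field is None:
        return len(records)
    return sum(record[weight_field] for record in records)


# Hypothetical pre-sampled data: each record stands for `w` original cases.
records = [
    {"id": 1, "w": 10},
    {"id": 2, "w": 10},
    {"id": 3, "w": 1},
]

print(total_cases(records))       # no weight field: 3
print(total_cases(records, "w"))  # weighted: 21
```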
In the Select the fields to sample grid, set the field type and define the fields that you want to include in the sample.
To exclude a field from the sample, select the NOT USED type.
Optional: In the User defined field, type a new name for the field.
Optional: In the Description field, type a description.
Select a sampling method.
One method samples a simple proportion of cases. It fills the sample table with a random selection of records from the source. The probability of selection is set to achieve the percentage or number of cases that you specify.
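The simple-proportion method can be sketched in Python as follows. This is a minimal illustration, not the product's implementation: it assumes that each record is selected independently with the same probability, so the resulting sample size is approximate. The function `uniform_sample` and its parameters are hypothetical names.

```python
import random


def uniform_sample(records, percentage=None, count=None, seed=None):
    """Select each record independently with a fixed probability.

    Specify either `percentage` (0-100) or a target `count` of cases.
    The sample size is approximate because each record is included
    or excluded at random.
    """
    if (percentage is None) == (count is None):
        raise ValueError("specify exactly one of percentage or count")
    if not records:
        return []
    if percentage is not None:
        probability = percentage / 100.0
    else:
        # Convert the target count into a selection probability.
        probability = min(1.0, count / len(records))
    rng = random.Random(seed)
    return [r for r in records if rng.random() < probability]


source = list(range(10_000))
by_percentage = uniform_sample(source, percentage=10, seed=42)
by_count = uniform_sample(source, count=500, seed=42)
print(len(by_percentage))  # close to 1000, but rarely exactly 1000
print(len(by_count))       # close to 500
```

If an exact sample size matters more than independent selection, `random.Random(seed).sample(records, k)` draws exactly `k` records without replacement instead.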
The other method samples a different proportion of each value of a selected field (the stratum) that represents the behavior to be predicted. It fills the sample table with random selections from each class.
Select the stratum field.
In the Ratio column for each stratum value, set the ratio of population cases to source records.
A population is a group of cases with known behavior that is consistent with the group of cases whose behavior you want to predict. You use the population to extract data samples for modeling and validation.
In the Sample percentage column for each stratum value, set the percentage of records that you want to sample.
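The per-stratum settings can be illustrated with a short Python sketch. It is one possible interpretation under stated assumptions: each stratum value has a ratio and a sample percentage, records are selected independently at the stratum's rate, and each sampled record receives a weight of ratio × 100 / sample percentage so that weighted counts approximate the population. That weight formula, the function `stratified_sample`, and the field names are assumptions made for this example and are not taken from Analytics Center.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_field, settings, seed=None):
    """Sample each stratum value at its own rate.

    `settings` maps a stratum value to a (ratio, sample_percentage) pair.
    Assumption: a sampled record gets the weight
    ratio * 100 / sample_percentage.
    """
    rng = random.Random(seed)
    sample = []
    for record in records:
        ratio, percentage = settings[record[stratum_field]]
        if percentage > 0 and rng.random() < percentage / 100.0:
            weight = ratio * 100.0 / percentage
            sample.append({**record, "weight": weight})
    return sample


# Hypothetical source: 9,000 "good" and 1,000 "bad" records.
source = [{"outcome": "good"} for _ in range(9000)]
source += [{"outcome": "bad"} for _ in range(1000)]

# Keep 10% of the common class and all of the rare class.
settings = {"good": (1.0, 10), "bad": (1.0, 100)}
sample = stratified_sample(source, "outcome", settings, seed=7)

counts = defaultdict(int)
for record in sample:
    counts[record["outcome"]] += 1
print(dict(counts))  # all 1000 "bad" records; roughly 900 "good" records
```

Sampling the rare class at a higher rate keeps enough examples of the behavior to be predicted, while the weights preserve the original class proportions.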
Define the sample percentage that you want to use for validation and testing.
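To make the effect of the validation and testing percentages concrete, the following Python sketch shuffles a sample and splits it into development, validation, and test sets. How Analytics Center partitions the sample internally is not described here, so the shuffle-and-slice approach and the name `split_sample` are assumptions for illustration only.

```python
import random


def split_sample(sample, validation_pct, test_pct, seed=None):
    """Shuffle the sample, then split it into three disjoint sets."""
    if validation_pct < 0 or test_pct < 0 or validation_pct + test_pct > 100:
        raise ValueError("percentages must be non-negative and total at most 100")
    shuffled = list(sample)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_pct / 100)
    n_test = round(len(shuffled) * test_pct / 100)
    validation = shuffled[:n_validation]
    test = shuffled[n_validation:n_validation + n_test]
    # Whatever remains is used to develop the model.
    development = shuffled[n_validation + n_test:]
    return development, validation, test


development, validation, test = split_sample(range(1000), 20, 10, seed=1)
print(len(development), len(validation), len(test))  # 700 200 100
```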
Click Next.