In the Sample construction step, determine which data to use to train the model and which data to use to test the model's accuracy.

During the training of a text extraction model, the Maximum Entropy algorithm is applied to the training data, and the model learns to predict labels. The data that you designate for testing is not used to train the model. Instead, Pega Platform uses this data to check whether the labels that you defined (for example, Complain or Purchase) match the labels that the model predicted.
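
As a rough illustration of this comparison, the following Python sketch counts how many test records have a predicted label that matches the label you defined. It is not Pega Platform code: the function name `accuracy` and the sample data are hypothetical, and the scores that Pega Platform actually reports may be computed differently.

```python
def accuracy(defined_labels, predicted_labels):
    """Return the fraction of test records whose predicted label matches the defined label."""
    if len(defined_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    if not defined_labels:
        raise ValueError("the test set is empty")
    matches = sum(d == p for d, p in zip(defined_labels, predicted_labels))
    return matches / len(defined_labels)


# Hypothetical test records: the labels you defined and the labels a model predicted.
defined = ["Complain", "Purchase", "Purchase", "Complain"]
predicted = ["Complain", "Purchase", "Complain", "Complain"]

# Three of the four predictions match the defined labels.
test_accuracy = accuracy(defined, predicted)
```

Because the test records are held out of training, a score computed this way indicates how the model behaves on data that it has not seen.
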
- If you want to keep the split between the training and testing data as defined in the file that you uploaded, in the Construct training and test sets using field, select User-defined sampling based on "Type" column.
 
- If you want to ignore the split that is defined in the file and customize that split according to your business needs, perform the following actions:
  - Select Uniform sampling.
  - In the Training set field, specify the percentage of records that is randomly assigned to the training sample.
       
- Click Next.
 
- In the Model creation step, make sure that the Maximum Entropy check box is selected.
 
- Click Next.

The model training and testing process starts.
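
The two sampling options in the Sample construction step can be pictured with the following Python sketch. It is a minimal illustration under stated assumptions, not the Pega Platform implementation: the function names, the `Content` column, and the `Train` and `Test` values in the `Type` column are hypothetical, and the way Pega Platform randomizes records is not described here.

```python
import random


def split_by_type(records, type_key="Type"):
    """Keep the split defined in the file: records marked "Test" go to the test set.

    Assumption: every other value in the Type column is treated as training data.
    """
    train, test = [], []
    for record in records:
        if record[type_key].strip().lower() == "test":
            test.append(record)
        else:
            train.append(record)
    return train, test


def split_uniform(records, training_percent, seed=None):
    """Ignore the file's split and randomly assign training_percent % of records to training."""
    if not 0 <= training_percent <= 100:
        raise ValueError("training_percent must be between 0 and 100")
    rng = random.Random(seed)  # seed is only here to make the sketch repeatable
    shuffled = list(records)   # copy so that the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * training_percent / 100)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical uploaded file: 10 records, 2 of them marked as test data.
records = [
    {"Content": f"sample text {i}", "Type": "Test" if i % 5 == 0 else "Train"}
    for i in range(10)
]

file_train, file_test = split_by_type(records)                 # 8 training, 2 test
uniform_train, uniform_test = split_uniform(records, 70, seed=42)  # 7 training, 3 test
```

In both cases every record ends up in exactly one of the two sets, so no test record is also used for training.
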