Creating machine learning topic models
Efficiently connect your customers with the right consultant by providing training data to a topic model.
A machine learning topic model learns from the training data that you provide, and then analyzes text on its own. The training data contains sample customer messages, each with an assigned topic category. For example, the model assigns the message "I want to book a ticket from New York to Warsaw" to the "Booking a flight" category.
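To make the idea of learning from labeled messages concrete, the following is a minimal sketch of a word-count (naive Bayes) classifier in plain Python. It is an illustration only: the sample records, the "Lost baggage" topic, and the `train` and `predict` helpers are assumptions made for this example, and the algorithm is not necessarily the one that Prediction Studio uses.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training records: (customer message, assigned topic).
TRAINING_DATA = [
    ("I want to book a ticket from New York to Warsaw", "Booking a flight"),
    ("Please book me a flight to Boston next Monday", "Booking a flight"),
    ("My suitcase did not arrive with my flight", "Lost baggage"),
    ("The airline lost my bag in Warsaw", "Lost baggage"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(records):
    """Count messages per topic and word occurrences per topic."""
    topic_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for message, topic in records:
        topic_counts[topic] += 1
        for word in tokenize(message):
            word_counts[topic][word] += 1
            vocabulary.add(word)
    return topic_counts, word_counts, vocabulary


def predict(model, message):
    """Return the topic with the highest smoothed log-probability."""
    topic_counts, word_counts, vocabulary = model
    total_messages = sum(topic_counts.values())
    best_topic, best_score = None, -math.inf
    for topic, count in topic_counts.items():
        score = math.log(count / total_messages)
        # Add-one smoothing so that unseen words do not zero out a topic.
        denominator = sum(word_counts[topic].values()) + len(vocabulary)
        for word in tokenize(message):
            if word in vocabulary:
                score += math.log((word_counts[topic][word] + 1) / denominator)
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


model = train(TRAINING_DATA)
print(predict(model, "book a ticket to Warsaw"))
```

With only four records the sketch is easy to fool, which is why a model needs a sufficient volume of training data before it operates efficiently.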
If you do not have enough training data for the machine learning model to start operating efficiently, consider creating a keyword-based topic model as a temporary substitute. For more information about the differences between topic model types, see Comparing keyword-based and machine learning topic detection.
Before you begin:
- Ensure that the system locale language settings are set to UTF-8.
- Specify a repository for text analytics models. For more information, see Specifying a database for Prediction Studio records.
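As a quick sanity check for the UTF-8 prerequisite, the following sketch reports the preferred encoding that a Python process sees on the machine. It assumes that Python is available on the host, and it reflects only the operating system locale as seen by that process, which might differ from the settings that the application server uses. The `is_utf8` helper is a hypothetical name introduced for this example.

```python
import codecs
import locale


def is_utf8(encoding_name):
    """Return True if the name is any recognized alias of the UTF-8 codec."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # The name is not a codec that Python knows.
        return False


# False: read the current setting without modifying the process locale.
preferred = locale.getpreferredencoding(False)
print(f"Preferred encoding: {preferred} (UTF-8: {is_utf8(preferred)})")
```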
To create a topic model based on machine learning, perform the following procedures:
- Setting up a machine learning topic model
Start building a topic model based on machine learning by specifying the model name, language, and corresponding ruleset.
- Uploading data for training and testing of the topic model
Upload sample records to train the model and to test whether the model assigns the topics correctly.
- Defining the training and testing samples for topic detection
Split the uploaded data into a set for training the model and a set for testing the model accuracy.
- Reviewing the taxonomy for machine learning topic detection
Verify the correctness of the taxonomy of topics that Prediction Studio generated from the training data. If you updated an older version of a model, the taxonomy might include topics from that version. Clean up your model by deleting topics that have no training data, and improve the model's predictions by adding keywords.
- Training and testing the topic model
Select the algorithms that Prediction Studio uses to build the model, and then start the building process.
- Reviewing the topic model
Review the created model by analyzing the results of testing against the provided training data.
- Saving the topic model
Save the model to use it as part of the Pega Platform text analytics feature. You can also download a file that contains the model that you created.
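The splitting, taxonomy review, and model review steps above can be sketched outside the tool to show the underlying logic. The following plain-Python example is an assumption-laden illustration, not how Prediction Studio implements these steps: the function names, the 25% test fraction, and the per-topic precision and recall metrics are all choices made for this sketch.

```python
import random
from collections import Counter, defaultdict


def split_records(records, test_fraction=0.25, seed=42):
    """Shuffle each topic's records and hold out a fraction for testing."""
    by_topic = defaultdict(list)
    for message, topic in records:
        by_topic[topic].append((message, topic))
    rng = random.Random(seed)
    training, testing = [], []
    for topic in sorted(by_topic):
        group = by_topic[topic]
        rng.shuffle(group)
        # Always keep at least one record of each topic for training.
        held_out = min(int(len(group) * test_fraction), len(group) - 1)
        testing.extend(group[:held_out])
        training.extend(group[held_out:])
    return training, testing


def topics_without_training_data(taxonomy, training):
    """List taxonomy topics that no training record is assigned to."""
    trained = {topic for _, topic in training}
    return sorted(set(taxonomy) - trained)


def precision_recall(expected, predicted):
    """Return {topic: (precision, recall)} from parallel lists of labels."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for exp, pred in zip(expected, predicted):
        if exp == pred:
            true_pos[exp] += 1
        else:
            false_pos[pred] += 1
            false_neg[exp] += 1
    result = {}
    for topic in set(expected) | set(predicted):
        p_den = true_pos[topic] + false_pos[topic]
        r_den = true_pos[topic] + false_neg[topic]
        result[topic] = (
            true_pos[topic] / p_den if p_den else 0.0,
            true_pos[topic] / r_den if r_den else 0.0,
        )
    return result


# Hypothetical uploaded data: 8 booking messages and 4 baggage messages.
records = [(f"booking message {i}", "Booking a flight") for i in range(8)]
records += [(f"baggage message {i}", "Lost baggage") for i in range(4)]

training, testing = split_records(records)
print(len(training), "training records,", len(testing), "testing records")
print(topics_without_training_data(
    ["Booking a flight", "Lost baggage", "Refunds"], training))
```

Splitting per topic keeps every category represented in the training set, and a topic that appears in the taxonomy but not in the training data (here the hypothetical "Refunds") is a candidate for cleanup.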