Verify the correctness of the taxonomy of topics that Prediction Studio generated from the training data. If you updated an older
version of a model, the taxonomy might include topics from that version. Clean up your model
by deleting topics that have no training data, and improve the model's predictions by adding
keywords.
Keywords influence the behavior of a machine learning model, but they are not exact rules.
The "Should," "Must," and "And" words act as positive features for matching a text to a topic,
while the "Not" words act as negative features. The training and testing data have the
greatest impact on your machine learning model, while keywords have a smaller impact.
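The distinction between a feature and a rule can be illustrated with a minimal sketch. This is not how Prediction Studio computes predictions internally; the `TopicKeywords` class, the `keyword_adjustment` and `combined_score` functions, and the `weight` value are all hypothetical names and numbers chosen for illustration. The sketch assumes that a trained model already produces a score for a topic, and that keywords only nudge that score up or down by a small amount.

```python
from dataclasses import dataclass, field


@dataclass
class TopicKeywords:
    """Hypothetical container for the four keyword categories of one topic."""
    should: set[str] = field(default_factory=set)
    must: set[str] = field(default_factory=set)
    and_words: set[str] = field(default_factory=set)
    not_words: set[str] = field(default_factory=set)


def keyword_adjustment(text: str, kw: TopicKeywords, weight: float = 0.1) -> float:
    """Return a small score adjustment derived from keyword matches.

    Assumption: each matched "Should", "Must", or "And" word adds `weight`,
    and each matched "Not" word subtracts `weight`. Matching is a simple
    lowercase whitespace split, so only single-word keywords are handled.
    """
    tokens = set(text.lower().split())
    positive = len(tokens & (kw.should | kw.must | kw.and_words))
    negative = len(tokens & kw.not_words)
    return weight * (positive - negative)


def combined_score(model_score: float, text: str, kw: TopicKeywords,
                   weight: float = 0.1) -> float:
    """Blend a score learned from training data with the keyword adjustment.

    The learned score dominates; keywords shift it but never decide the
    outcome on their own, which is what separates a feature from a rule.
    """
    return model_score + keyword_adjustment(text, kw, weight)


phone_keywords = TopicKeywords(
    should={"phone", "telephone", "mobile"},
    and_words={"call"},
    not_words={"email"},
)
boosted = combined_score(0.6, "Please call my mobile phone", phone_keywords)
lowered = combined_score(0.6, "I sent an email", phone_keywords)
```

In this sketch, a text that contains a "Not" word still receives a nonzero score if the learned model score is high enough, so the keyword lowers the likelihood of a match without forbidding it.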
You cannot add topics in this step. If you want to add topics, go back to the
Source selection step. For more information, see Uploading data for training and testing of the topic detection model.
-
In the Taxonomy review wizard step, review the taxonomy details,
and then expand the taxonomy to view the topics.
The hierarchy of the taxonomy is used to group topics. Do not add training data or
keywords to grouping topics.
-
Review the summary of training and test data for individual topics by selecting the
topics in the list.
- Optional:
To add positive or negative features for matching a text to a topic, add keywords to
the topic:
-
Select the topic, and then click the Manage keywords
tab.
-
In the Keywords section, enter keywords to influence the
model's predictions.
Keywords can be words or phrases. You can enter several keywords in each
category.
For example:
- Should words: phone, telephone, mobile
- And words: call
- Optional:
To delete topics that do not contain any training data, select a topic, and then click
Delete.
Topics without any training data might appear in the taxonomy when you start with a
keyword-based model, and then update it to a machine learning model. If the training data
that you use to train the new model covers fewer topics than the original keyword-based
model, only the topics that appear in the training data are trained, and the remaining
topics have no training data.
-
Click Next.
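Before you open the wizard, you can identify the candidates for the optional deletion step from the training file itself. The following is a minimal sketch under stated assumptions: it does not use any Prediction Studio API, and the function name, the `(text, topic)` record shape, and the `grouping_topics` parameter are hypothetical. Grouping topics are excluded because they are not expected to carry training data.

```python
from collections import Counter


def topics_without_training_data(taxonomy_topics, training_records,
                                 grouping_topics=()):
    """List taxonomy topics that have no training records.

    taxonomy_topics:  iterable of topic names in the taxonomy (assumed shape)
    training_records: iterable of (text, topic) pairs (assumed shape)
    grouping_topics:  topics used only to group other topics; skipped
    """
    counts = Counter(topic for _, topic in training_records)
    skip = set(grouping_topics)
    return [t for t in taxonomy_topics if t not in skip and counts[t] == 0]


taxonomy = ["Services", "Billing", "Phone", "Internet"]
records = [
    ("My invoice is wrong", "Billing"),
    ("I cannot make calls", "Phone"),
]
candidates = topics_without_training_data(
    taxonomy, records, grouping_topics=["Services"]
)
```

Here `candidates` contains only `Internet`, the one non-grouping topic with no records. Review such a list manually before you delete anything in the wizard.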
What to do next: Select the algorithms that Prediction Studio uses to build the model, and then start the building process. For more information, see
Training and testing the topic detection model.