Creating a taxonomy for keyword-based topic detection
After you create a topic model, define the topics that you want to detect in a piece of text. For each topic, add a list of keywords that define the topic. Based on these keywords, topic detection then assigns topics to an analyzed piece of text.
In the Taxonomy workspace, create a list of topics that you want to detect:
Create a parent topic by clicking Add topic.
To add a child topic, select a parent topic, and then click the control for adding a child topic. You can add multiple levels of topics, depending on your use case and classification approach. For example, you can break down the parent category Support into In-store support and Phone support.
For each topic, enter a list of keywords that pertain to that topic. A keyword can consist of multiple words. To separate keywords, press the Tab+Enter keyboard shortcut. You can specify the following types of keywords:
- Should words
- If any of the Should words appear in a piece of text, topic detection assigns that text to the corresponding topic. To achieve accurate results, create an exhaustive list of Should words. For example, for a Support topic, you can specify the following Should words: help, assistance, support, aid, guidance, assist, advice, and so on.
- Must words
- If all Must words appear in a piece of text, topic detection assigns that text to the corresponding topic. You can specify whether you want all Must words to appear at sentence level, or in the text as a whole. Use Must words to narrow down your topic detection conditions. For example, you can specify that a piece of text must contain the word help to be assigned to the Support parent category.
- And words
- If a piece of text contains both And words and Should words, topic detection assigns that text to the corresponding topic. Use And words to distinguish between similar categories and to increase the accuracy of topic detection. For example, you can specify the same Should words for the In-store support and Phone support topics, but then add premises, store, and office as keywords specific to the In-store support topic, and phone and call as keywords specific to Phone support.
- Not words
- If a Not word appears in a piece of text, the text is not assigned to the corresponding topic. For example, enter phone or call as words that prevent topic detection from assigning a piece of text to the In-store support topic.
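The way the four keyword types interact can be sketched in a few lines of Python. This is an illustrative sketch only, not the product's actual matching algorithm. The Topic class, the contains, must_ok, and matches functions, and the must_per_sentence flag are hypothetical names. The sketch assumes that a Not word always blocks a match, that all Must words are required, that at least one Should word and at least one And word are required when those lists are not empty, and that sentences can be split naively on end punctuation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical container for one topic's keyword lists."""
    name: str
    should: list = field(default_factory=list)
    must: list = field(default_factory=list)
    and_words: list = field(default_factory=list)
    not_words: list = field(default_factory=list)
    must_per_sentence: bool = False  # assumed switch: sentence level vs. whole text


def contains(text, keyword):
    # Case-insensitive whole-word match; a keyword can consist of multiple words.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def must_ok(topic, text):
    # All Must words have to appear together, either in one sentence or in the whole text.
    if not topic.must:
        return True
    units = re.split(r"(?<=[.!?])\s+", text) if topic.must_per_sentence else [text]
    return any(all(contains(unit, w) for w in topic.must) for unit in units)


def matches(topic, text):
    # Assumed combination rule; the real product may combine the lists differently.
    if any(contains(text, w) for w in topic.not_words):
        return False
    if not must_ok(topic, text):
        return False
    if topic.should and not any(contains(text, w) for w in topic.should):
        return False
    if topic.and_words and not any(contains(text, w) for w in topic.and_words):
        return False
    # A topic without Should or Must words never matches in this sketch.
    return bool(topic.should or topic.must)


SHOULD = ["help", "assistance", "support", "aid", "guidance", "assist", "advice"]
in_store = Topic("In-store support", should=SHOULD,
                 and_words=["premises", "store", "office"],
                 not_words=["phone", "call"])
phone = Topic("Phone support", should=SHOULD, and_words=["phone", "call"])
```

Under these assumptions, matches(in_store, "I need help at your store") is true, while matches(in_store, "I asked for help on a call") is false because the Not word call appears in the text.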
To detect child topics only when the corresponding parent topic is detected, for the parent topic, select the Match child topics only if the current topic matches check box.
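The effect of this check box can be illustrated with a small recursive sketch. The Node class, the detect function, and the children_require_parent flag are hypothetical names used only for illustration, and the matching is simplified to single-word keywords where any one keyword is enough for a match.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical topic node; keywords are treated like simple Should words."""
    name: str
    keywords: list
    children: list = field(default_factory=list)
    children_require_parent: bool = False  # models the check box on the parent topic


def detect(node, text):
    # Return the names of all topics detected in the text, parent first.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    matched = any(k in tokens for k in node.keywords)
    found = [node.name] if matched else []
    # Child topics are skipped when the parent is gated and did not match.
    if matched or not node.children_require_parent:
        for child in node.children:
            found.extend(detect(child, text))
    return found


support = Node("Support", ["help", "support"],
               children=[Node("Phone support", ["phone", "call"])],
               children_require_parent=True)
```

With the flag set, the text "Please call me back" yields no topics because Support itself does not match. With the flag cleared, the same text yields Phone support on its own.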
To test your taxonomy, select the option for testing it. Always test your taxonomy on a number of text samples to determine whether it accurately assigns topics. Depending on the results, you might refine your taxonomy, for example, by increasing the number of Should words to accommodate additional use cases, or by adding Not words to help differentiate between similar categories.
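The same kind of check can be approximated outside the product with a handful of labeled samples. The following sketch is an assumption-laden illustration, not the built-in test feature: TAXONOMY, SAMPLES, classify, and evaluate are hypothetical names, and the matching uses only single-word Should and Not words.

```python
import re

# Hypothetical taxonomy: each topic has Should words and Not words only.
TAXONOMY = {
    "In-store support": {"should": {"help", "assistance"}, "not": {"phone", "call"}},
    "Phone support": {"should": {"help", "assistance"}, "not": {"store", "office"}},
}

# Labeled samples: the text and the set of topics a reviewer expects.
SAMPLES = [
    ("I need help at the store", {"In-store support"}),
    ("Can someone aid me over the phone?", {"Phone support"}),
]


def classify(text):
    # A topic matches when a Should word appears and no Not word appears.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {name for name, rule in TAXONOMY.items()
            if tokens & rule["should"] and not tokens & rule["not"]}


def evaluate(samples):
    # Collect every sample whose detected topics differ from the expected ones.
    report = []
    for text, expected in samples:
        got = classify(text)
        if got != expected:
            report.append({"text": text,
                           "missing": expected - got,
                           "unexpected": got - expected})
    return report


report = evaluate(SAMPLES)
```

In this sketch, a topic listed under missing suggests adding Should words (here, aid is not yet a Should word), and a topic listed under unexpected suggests adding Not words.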
To export the taxonomy as an .xlsx file, select the export option.
Save the taxonomy by clicking Save. You can use the taxonomy as part of a machine learning topic model or directly in Text Analyzers to perform keyword-based topic detection.