Creating a taxonomy for keyword-based topic detection

After you create a topic detection model, define the topics that you want to detect in a piece of text. For each topic, add a list of keywords that define the topic. Based on these keywords, topic detection then assigns topics to an analyzed piece of text.

Before you begin: Create a keyword-based topic detection model by specifying the model name, language, and corresponding ruleset. For more information, see Setting up a keyword-based topic detection model.

  1. In the Taxonomy workspace, create a list of topics that you want to detect:

    1. Create a parent topic by clicking Add topic.

    2. Optional:

      To add a child topic, select a parent topic, and then click Manage > Add Child.

      You can add multiple levels of topics, depending on your use case and classification approach. For example, you can break down the parent category Support into In-store support and Phone support.
    3. Repeat the two preceding substeps to create a complete hierarchy of the topics that you want to detect.

  2. For each topic, enter a list of keywords that pertain to that topic.

    A keyword can consist of multiple words. To separate keywords, press the Tab+Enter keyboard shortcut.
    You can specify the following types of keywords:
    Should words
      If any of the Should words appear in a piece of text, topic detection assigns that text to the corresponding topic. To achieve accurate results, create an exhaustive list of Should words. For example, for a Support topic, you can specify the following Should words: help, assistance, support, aid, guidance, assist, advice, and so on.
    Must words
      If all Must words appear in a piece of text, topic detection assigns that text to the corresponding topic. You can specify whether you want all Must words to appear at sentence level, or in the text as a whole. Use Must words to narrow down your topic detection conditions. For example, you can specify that a piece of text must contain the word help to be assigned to the Support parent category.
    And words
      If a piece of text contains both And words and Should words, topic detection assigns that text to the corresponding topic. Use And words to distinguish between similar categories and to increase the accuracy of topic detection. For example, you can specify the same Should words for the In-store support and Phone support topics, but then add premises, store, and office as keywords specific to the In-store support topic, and phone and call as keywords specific to Phone support.
    Not words
      If a Not word appears in a piece of text, the text is not assigned to the corresponding topic. For example, enter phone or call as words that prevent topic detection from assigning a piece of text to the In-store support topic.
  3. To detect child topics only when the corresponding parent topic is detected, for the parent topic, select the Match child topics only if the current topic matches check box.

  4. Optional:

    To test your taxonomy, select Actions > Test.

    Always test your taxonomy on a number of text samples to determine whether it accurately assigns topics. Depending on the results, you might refine your taxonomy, for example, by increasing the number of Should words to accommodate additional use cases, or by adding Not words to help differentiate between similar categories.
  5. Optional:

    To export the taxonomy as an .xlsx file, select Actions > Export.

  6. Save the taxonomy by clicking Save.

    You can use the taxonomy as part of a machine-learning topic detection model or directly in Text Analyzers to perform keyword-based topic detection.
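
To reason about how the keyword types in the procedure above interact before you test a taxonomy in the product, it can help to model them in a few lines of code. The following Python sketch is only an illustration of the rules as this page describes them, not the product's actual matching engine. The Topic class, the gate_children flag (standing in for the Match child topics only if the current topic matches check box), and the functions contains, topic_matches, and detect are all hypothetical names. The sketch also makes several assumptions that the page does not confirm: matching is case-insensitive and whole-word, Must words are checked against the text as a whole rather than per sentence, "contains And words" means at least one And word, and all defined keyword types must be satisfied together.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical in-memory stand-in for one topic in a taxonomy."""
    name: str
    should: list[str] = field(default_factory=list)
    must: list[str] = field(default_factory=list)
    and_words: list[str] = field(default_factory=list)
    not_words: list[str] = field(default_factory=list)
    children: list["Topic"] = field(default_factory=list)
    # Stands in for "Match child topics only if the current topic matches".
    gate_children: bool = False


def contains(text: str, keyword: str) -> bool:
    """Case-insensitive whole-word or whole-phrase search (an assumption)."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def topic_matches(topic: Topic, text: str) -> bool:
    """Apply the four keyword types under the assumptions stated above."""
    # Not words: any hit blocks the topic.
    if any(contains(text, w) for w in topic.not_words):
        return False
    # Must words: every one has to appear (checked at text level here).
    if not all(contains(text, w) for w in topic.must):
        return False
    # Should words: at least one has to appear, if any are defined.
    if topic.should and not any(contains(text, w) for w in topic.should):
        return False
    # And words: at least one has to appear alongside the Should words.
    if topic.and_words and not any(contains(text, w) for w in topic.and_words):
        return False
    # In this sketch, a topic without Should or Must words never matches.
    return bool(topic.should or topic.must)


def detect(topics: list[Topic], text: str) -> list[str]:
    """Walk the hierarchy and collect the names of matching topics."""
    found = []
    for topic in topics:
        matched = topic_matches(topic, text)
        if matched:
            found.append(topic.name)
        # Skip children only when the parent gates them and did not match.
        if matched or not topic.gate_children:
            found.extend(detect(topic.children, text))
    return found


# Example taxonomy built from the keywords mentioned in the procedure.
shared = ["help", "assistance", "support", "advice"]
taxonomy = [
    Topic(
        name="Support",
        should=shared,
        gate_children=True,
        children=[
            Topic("In-store support", should=shared,
                  and_words=["premises", "store", "office"],
                  not_words=["phone", "call"]),
            Topic("Phone support", should=shared,
                  and_words=["phone", "call"]),
        ],
    )
]

print(detect(taxonomy, "I need help with a purchase I made in your store."))
# ['Support', 'In-store support']
print(detect(taxonomy, "Please call me, I need assistance."))
# ['Support', 'Phone support']
print(detect(taxonomy, "The weather is nice today."))
# []
```

In the second sample, the Not word call keeps the text out of In-store support even though the shared Should word assistance is present, which is the differentiation between similar categories that And words and Not words are meant to provide. Results from a sketch like this can differ from the product, so the Test action remains the authoritative check.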
