Tutorial: Configuring a topic detection model for discovering keywords

Create a model that matches a piece of text to a predefined set of topics, based on a taxonomy of topics and keywords of different types. Use topic detection to classify text into semantic categories in various domains, for example, customer support or complaint routing.

Use case

The uPlusTelco company releases a new product called uPlusPhone10. The company wants to track social media responses to the release of this product and determine which uPlusPhone10 features are most popular.

 

Creating a topic detection model rule

Use Prediction Studio to configure a uPlusPhone10 topic detection model.

  1. In Prediction Studio, click New and select Text Categorization.
  2. In the Create Text Categorization Model window, set the model parameters:
    • Name: uPlusPhone10
    • Detection type: Topic
    • Creation: Use category keywords
    • Language: English
  3. Select the context of the model by specifying the applicable class, ruleset, and ruleset version.
  4. Click Create.
Figure: Creating a topic detection model in Prediction Studio

Defining a taxonomy

After you create a rule that contains a topic detection model, define a taxonomy of topics and associated keywords. Each topic represents a category into which you can classify text. uPlusTelco wants to classify information based on the phone's features, performance, and specifications.

  1. Add a parent topic by clicking Add top-most.
  2. Enter Features and click Create.
  3. Repeat steps 1 and 2 to create the Performance and Specifications parent categories.
  4. Add child topics that correspond to the features of uPlusPhone10, such as applications, camera, connectivity, and games, by performing the following actions:
    1. Select the Features topic.
    2. Click Manage > Add child.
    3. Enter the name of the new topic and click Add.
  5. Repeat step 4 to create additional child topics.
  6. For each topic, add keywords that are specific to that topic by performing the following actions:
    1. Select a topic.
    2. Optional: To make child topics depend on the keywords of the parent topic, select Allow sub topics to inherit keywords. For example, if you select this option for the Features topic, the Applications, Camera, Connectivity, and Games topics cannot be detected unless the phrase uPlusPhone10 is detected first.
      If a parent topic has an empty list of keywords, the topic detection model automatically looks for a match among the child topics.
    3. Specify Should words, Must words, And words, and Not words that apply to the Camera, Connectivity, and Games topics. For more information about keyword categories, see Defining a taxonomy.
      To view the full taxonomy, download the uPlusPhone10 taxonomy.
    4. Click Save.
  7. Test your taxonomy by performing the following actions:
    1. Click Actions > Test.
    2. In the Test window, paste a piece of text in the Sample text box, for example, "uPlusPhone10 handset and its applications are quite amazing!"
    3. Click Test and view the results.
      Always test your taxonomy to ensure that text analytics produces the expected results and, if needed, improve the taxonomy by adding more categories or modifying keywords. For example, can you improve the uPlusPhone10 taxonomy so that the sentence "I wish uPlusPhone10 came with more games" is classified under the Features > Games topic instead of only the Features topic?
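
To reason about how a keyword taxonomy behaves before you test it in Prediction Studio, a small stand-alone sketch can help. The following Python code is not the Pega implementation and does not call any Pega API. It models only the two behaviors described in the steps above: child topics that are gated by the parent keywords when inheritance is selected, and parents with an empty keyword list that defer to their child topics. The `Topic` class, the `detect` function, the keyword lists, and the meaning given to Should, Must, and Not words (at least one, all, none) are assumptions made for illustration. And words are left out because this tutorial does not define them.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One node of a hypothetical keyword taxonomy (not a Pega class)."""
    name: str
    should: set[str] = field(default_factory=set)     # assumed: at least one must appear
    must: set[str] = field(default_factory=set)       # assumed: all must appear
    not_words: set[str] = field(default_factory=set)  # assumed: none may appear
    inherit: bool = False  # models "Allow sub topics to inherit keywords"
    children: list[Topic] = field(default_factory=list)


def _matches(topic: Topic, tokens: set[str]) -> bool:
    """Apply the assumed Should/Must/Not rules to a set of lowercase tokens."""
    if topic.not_words & tokens:
        return False
    if not topic.must <= tokens:
        return False
    if topic.should and not (topic.should & tokens):
        return False
    return True


def _walk(topics: list[Topic], tokens: set[str], prefix: str) -> list[str]:
    found = []
    for topic in topics:
        path = prefix + topic.name
        has_keywords = bool(topic.should or topic.must)
        own_match = has_keywords and _matches(topic, tokens)
        if topic.inherit and has_keywords and not own_match:
            continue  # children cannot be detected without the parent keywords
        child_hits = _walk(topic.children, tokens, path + " > ")
        if child_hits:
            found.extend(child_hits)  # prefer the most specific topics
        elif own_match:
            found.append(path)
    return found


def detect(topics: list[Topic], text: str) -> list[str]:
    """Return the paths of all topics detected in the text."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return _walk(topics, tokens, "")


# Hypothetical keywords; the real lists are in the downloadable taxonomy.
TAXONOMY = [
    Topic("Features", should={"uplusphone10"}, inherit=True, children=[
        Topic("Applications", should={"applications", "apps"}),
        Topic("Camera", should={"camera"}),
        Topic("Connectivity", should={"wifi", "bluetooth"}, not_words={"router"}),
    ]),
    Topic("Performance"),
    Topic("Specifications"),
]

if __name__ == "__main__":
    print(detect(TAXONOMY, "uPlusPhone10 handset and its applications are quite amazing!"))
    print(detect(TAXONOMY, "The camera is great"))
```

Under these assumptions, the first sample resolves to Features > Applications, while the second sample returns no topic because the parent keyword uPlusPhone10 is missing. Matching on whole tokens instead of substrings is a deliberate simplification; it avoids accidental hits inside longer words but also means that plural or inflected forms must be listed explicitly.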

Conclusions

You created a topic detection model in Prediction Studio and defined a taxonomy of topics with the associated words and phrases that are specific to each topic. You tested the model by applying it to real-life samples and identified areas of improvement to boost the accuracy of your model.

Next steps

Build a machine-learning (ML) topic detection model. In Pega Platform™, you can use a keyword-based topic detection model in association with an ML-based model to maximize the accuracy and reliability of topic detection. For more information, see Creating machine learning-based topic detection models and Best practices for creating categorization models.

Published June 15, 2018 — Updated October 16, 2018

