A sentiment lexicon is a list of semantic features for words and phrases. Use lexicons for creating machine learning-based sentiment and intent analysis models.
Lexicons determine whether a particular word or phrase carries any emotional load, that is, whether it belongs to the SW (sentiment word) category. If so, the lexicon provides the sentiment (polarity) value for that word or phrase. Additionally, the lexicon determines which words are filtered out before the text is processed (IGNORE) and which words are used in negations (NEGATIVE). Applying the semantic features of lexicon items that are identified in the training data enhances the model's prediction accuracy.
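The three word categories can be illustrated with a minimal Python sketch. This is not Pega Platform code: the `TOY_LEXICON` dictionary, its words and polarity labels, and the `annotate` function are hypothetical names invented for this example. Only the category names SW, IGNORE, and NEGATIVE come from the description above.

```python
# Hypothetical miniature lexicon: token -> (word type, polarity).
# The type names SW, IGNORE, and NEGATIVE come from the text above;
# the words and polarity labels are made up for illustration.
TOY_LEXICON = {
    "great": ("SW", "positive"),
    "awful": ("SW", "negative"),
    "the": ("IGNORE", None),
    "is": ("IGNORE", None),
    "not": ("NEGATIVE", None),
}


def annotate(text):
    """Drop IGNORE words and tag the remaining tokens with lexicon features."""
    annotated = []
    for token in text.lower().split():
        # Tokens that are not in the lexicon carry no lexicon features.
        word_type, polarity = TOY_LEXICON.get(token, (None, None))
        if word_type == "IGNORE":
            continue  # filtered out before further processing
        annotated.append((token, word_type, polarity))
    return annotated


for item in annotate("The service is not great"):
    print(item)
```

In this sketch, only sentiment words receive a polarity value. Negation words are kept and tagged so that a later step can take them into account.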
Pega Platform provides the default pySentimentLexicon lexicon, which supports English, Spanish, Italian, Dutch, German, French, and Portuguese.
Each lexicon entry consists of the following elements:
- A word or a phrase.
- The associated sentiment value. The available values are highly negative, negative, mildly negative, neutral, mildly positive, positive, highly positive, and positive, negative.
- The language of the word or phrase.
- The type of word or phrase that, in correlation with the value of the pySentiment property, determines the overall sentiment of the analyzed phrase or document. For example, the number of features whose pyWordType property is NEGATIVE (for example, no, not, isn't, cannot) can indicate the overall negative sentiment of a document, because negations occur more often in negative phrases or documents.
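The four elements of a lexicon entry, and the negation-count feature from the example above, might be modeled as in the following sketch. The `LexiconEntry` class, its field names, the `"en"` language code, the `"neutral"` value assigned to the negation words, and the `count_negations` function are all assumptions made for illustration. They do not reflect the actual storage format of pySentimentLexicon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """Hypothetical record that mirrors the four elements listed above."""
    word: str        # a word or a phrase
    sentiment: str   # for example "negative", "neutral", "highly positive"
    language: str    # language code (assumed format)
    word_type: str   # for example "SW", "IGNORE", "NEGATIVE"


# Sample entries; the sentiment values of the negation words are assumed.
SAMPLE_ENTRIES = [
    LexiconEntry("no", "neutral", "en", "NEGATIVE"),
    LexiconEntry("not", "neutral", "en", "NEGATIVE"),
    LexiconEntry("isn't", "neutral", "en", "NEGATIVE"),
    LexiconEntry("cannot", "neutral", "en", "NEGATIVE"),
    LexiconEntry("good", "positive", "en", "SW"),
]


def count_negations(tokens, entries, language):
    """Count the tokens that match NEGATIVE entries for the given language."""
    negations = {
        entry.word
        for entry in entries
        if entry.word_type == "NEGATIVE" and entry.language == language
    }
    return sum(1 for token in tokens if token.lower() in negations)


tokens = "This is not good and I cannot recommend it".split()
print(count_negations(tokens, SAMPLE_ENTRIES, "en"))
```

A count such as this one could serve as one of several features for a classifier. On its own it does not determine the sentiment of a document.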
- Preparing data for text extraction
In the Source selection step of the text extraction model creation wizard, select the extraction type and provide the data for training and testing of your text extraction model.
- Preparing data for sentiment analysis
In the Lexicon selection step, select the sentiment lexicon to use for sentiment analysis. Sentiment lexicons contain features that are used to enhance the accuracy of the model.