Pattern recognition and extraction in Pega Platform help you detect all entities whose
structure matches a pattern that you define. For example, you can detect and mark strings
that contain the at sign (@) and .com as email addresses.
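
To make the idea of structural matching concrete, here is a minimal sketch in plain Python. It is not Apache Ruta and does not use any Pega API; the regular expression, the `extract_emails` helper, and the sample sentence are assumptions made only for illustration. It marks every span that contains an at sign and ends in .com, which is the kind of result a pattern extraction model produces.

```python
import re
from typing import List, Tuple

# Illustrative pattern (an assumption, not a Pega template):
# a local part, an at sign, one or more domain labels, and a final ".com".
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)*\.com\b")


def extract_emails(text: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, matched_text) for every span that matches the pattern."""
    return [(m.start(), m.end(), m.group()) for m in EMAIL_PATTERN.finditer(text)]


if __name__ == "__main__":
    # Hypothetical sample text, used only to exercise the helper.
    sample = "Contact jane.doe@example.com or the help desk at support@pega-demo.com today."
    for start, end, value in extract_emails(sample):
        print(f"{start}-{end}: {value}")
```

A pattern this simple deliberately ignores addresses outside the .com domain, which mirrors the example above; a production model would normally need a broader rule.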
- Make sure that you can access the Analytics Center. You can
do this by starting the pyDecisionAnalytics portal. To make the portal available, add it to
the list of portals in your access group. For more information, see Access Group form - Completing the Definition tab.
- Make sure that the system locale language settings are set
to UTF-8.
You write pattern extraction models by using the Apache Ruta script language. For more
information, see the official Apache UIMA Ruta Guide and Reference online help.
For an example use case, see Creating entity extraction rules for text analytics
on the PDN.
- In Designer Studio, click .
- In the Analytics Center, click Create, and then click Text extraction.
- In the Create Text Extraction Model window, enter the name of the text extraction model.
- In the Creation section, select Rule.
- In the Language section, expand the drop-down list and select a language for the model.
- In the Template field, expand the drop-down list and select a template that contains Apache Ruta script that you can modify.
  Note: Use the provided templates only as the starting point for creating your own pattern extraction models.
- In the Rule script field, modify the provided Apache Ruta script to create a custom pattern extraction model.
- In the Save model section, finalize the creation of the pattern extraction model by providing its application context:
  - To use the default rule context for decision data rules that contain sentiment analysis models, select Use default context.
  - To specify the Applies to class, ruleset, and ruleset version parameters of the new rule, select Specify context.
- Click Create.