Pattern recognition and extraction in Pega Platform help you
detect all entities whose structure matches a pattern that you define. For example, you can
detect and mark strings that contain the at sign (@) and .com as email addresses.
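The email example can be illustrated outside Pega Platform with a minimal Python sketch. It is not Apache Ruta syntax and is not taken from a Pega template. The names EMAIL_PATTERN and extract_emails are hypothetical, and the regular expression is a deliberately simple assumption that accepts only addresses ending in .com.

```python
import re

# Illustrative pattern only (an assumption, not a Pega or Ruta rule):
# a local part, an at sign, one or more domain labels, and a literal ".com".
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)*\.com\b")


def extract_emails(text):
    """Return (start, end, matched_text) for every span that matches the pattern."""
    return [(m.start(), m.end(), m.group()) for m in EMAIL_PATTERN.finditer(text)]


sample = "Contact anna@example.com or sales@shop.example.com, not anna at example."
for start, end, value in extract_emails(sample):
    # Each hit is a candidate entity together with its character offsets.
    print(start, end, value)
```

Returning character offsets along with the matched text mirrors the idea of marking an entity in place rather than only reporting that a match exists.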
Make sure that the system locale language settings are set
to UTF-8.
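As a rough way to check this prerequisite, the following Python sketch reports whether the preferred encoding of the host it runs on is UTF-8. It inspects only the locale of the process that runs it, which is an assumption about how you verify the setting. It does not query Pega Platform, and the helper name is_utf8 is hypothetical.

```python
import codecs
import locale


def is_utf8(encoding_name):
    """Return True if the encoding name is an alias of UTF-8 (for example "utf8")."""
    try:
        # codecs.lookup normalizes aliases such as "UTF-8" and "utf8" to "utf-8".
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # Unknown encoding names are treated as not UTF-8.
        return False


# Preferred encoding of the current process, derived from the locale settings.
current = locale.getpreferredencoding(False)
print(current, "is UTF-8" if is_utf8(current) else "is NOT UTF-8")
```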
You write pattern extraction models by using the Apache Ruta script language. For more
information, see the official Apache UIMA Ruta Guide and Reference online help.
For an example use case, see Creating entity extraction rules for text analytics
on the Pega Community.
-
-
-
In the Create Text Extraction Model window, enter the name of the text extraction model.
-
In the Creation section, select Rule.
-
In the Language section, expand the drop-down list and select a
language for the model.
-
In the Template field, expand the drop-down list and
select a template that contains Apache Ruta script that you can modify.
Note: Use the provided templates only as the starting point for creating your own pattern
extraction models.
-
In the Rule script field, modify the provided Apache
Ruta script to create a custom pattern extraction model.
-
In the Save model section, finalize the creation of the
pattern extraction model by providing its application context:
-
To use the default rule context for decision data rules that contain
sentiment analysis models, select Use default context.
-
To specify the Applies to class, ruleset, and ruleset
version parameters of the new rule, select Specify context.
-
Click Create.