Tips for optimizing decision strategies

Decision strategies that determine the next-best-action in your application can include components with complex configurations of different rule types. These configurations can lead to poor performance.

When a decision strategy performance issue occurs, it can be attributed to a combination of the strategy design and the customer data that is used at run time. Use the following guidelines to understand the flow of data in your decision strategy so that you can diagnose and eliminate bottlenecks and other performance issues.

Data flow metrics

When you encounter a decision strategy performance issue, you can usually see it on the data flow run page, where the Strategy component accounts for a large share of the run time. You can interpret it through the following two metrics:

Time percentage taken among the overall data flow
In a typical decision management scenario, the strategy is the most time-consuming (CPU-intensive) component. This metric can reach 90%-95% of the total data flow execution time. A percentage this high means that strategy execution is shown as the bottleneck, which is expected and does not by itself signify a performance problem. However, if you see a relatively low percentage, it might indicate that other parts of the system (for example, the database or Decision Data Store) can be tuned better.
Average time taken by Strategy shape for every record
This metric records the time spent on the Strategy shape in the data flow per record. If the strategy execution time exceeds the threshold, the PEGA0063 alert (decision strategy execution time above threshold) is triggered. For more information, see Pega alerts for Cassandra. This metric consists mainly of three parts (visible after you enable detailed metrics for a data flow run):
  • Preprocessing, such as loading interaction history or interaction history summary caches for a batch run.
  • Strategy execution, which should account for most of this metric in a healthy scenario.
  • Postprocessing, which invokes the data flow synchronously to save strategy results and monitoring information to the pxDecisionResults data set or another built-in destination, depending on your configuration.
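
The two metrics above are simple ratios. The following Python sketch shows how they could be computed from per-shape timings. The `ShapeTiming` structure, the function names, and all numbers are hypothetical illustrations and are not part of any Pega API.

```python
from dataclasses import dataclass


@dataclass
class ShapeTiming:
    """Time spent in one data flow shape, in milliseconds (hypothetical structure)."""
    name: str
    total_ms: float


def time_percentage(shapes: list[ShapeTiming], shape_name: str) -> float:
    """Share of the total data flow time spent in one shape, as a percentage."""
    total = sum(s.total_ms for s in shapes)
    if total == 0:
        return 0.0
    spent = sum(s.total_ms for s in shapes if s.name == shape_name)
    return 100.0 * spent / total


def average_ms_per_record(total_ms: float, records: int) -> float:
    """Average time per record for a shape; 0.0 when no records were processed."""
    return total_ms / records if records else 0.0


# Illustrative numbers only, not measurements from a real run.
shapes = [
    ShapeTiming("Source", 400.0),
    ShapeTiming("Strategy", 9000.0),
    ShapeTiming("Destination", 600.0),
]
strategy_share = time_percentage(shapes, "Strategy")
strategy_avg = average_ms_per_record(9000.0, 1000)
print(f"Strategy share: {strategy_share:.1f}%, average: {strategy_avg:.1f} ms/record")
```

In this made-up run, the Strategy shape takes most of the total time, which matches the expected pattern described above and is not a problem in itself.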

Troubleshooting decision strategy performance

To troubleshoot decision strategy performance:

  1. Look at the data flow metrics and alerts in Pega Predictive Diagnostic Cloud (PDC) for guidance and direction on the performance challenge. If performance suddenly degrades in a staging or production environment, check which metric is affected.
  2. Analyze the alerts for typical strategy-related issues:
    • PEGA0063 (strategy execution time)
    • PEGA0064 (maximum number of strategy results processed per strategy component)
    • PEGA0075 (Decision Data Store interaction time)
    • PEGA0058 and PEGA0059 (interaction history reading/writing time)
    For more information, see Pega alerts for Cassandra.
  3. Enable detailed data flow metrics by setting the pyUseDetailedMetrics property on the RunOptions page. This property is part of Data-Decision-DDF-RunOptions. When it is set to true, detailed metrics for the execution of each shape are calculated and made available on the progress page.

Perform test runs and simulations

You can test strategies to find performance issues by performing strategy test runs. Statistics such as the processing speed of records or decisions, the time spent in each component, the throughput, and the number of processed decisions or records can help you assess the health of a strategy. For example, the Time spent statistics show how much time is spent on processing data in each strategy component, so you can judge whether that amount of time is justified or whether the component uses complex processing logic that you can optimize.

To test a strategy, you provide input data to the strategy components and then run a single case or batch set of cases. Data transforms, data sets, and data flows support the generation of the data objects that contain input data for test runs. The data processing power that is provided by data sets and data flows is best suited for validating your design against sample data from one or more data sources.

To understand the impact of strategy changes on the overall strategy execution time, run a simulation or a batch run to check the differences between versions of strategies.

To run a simulation, use the Revision Management performance check tool. This simulation runs on the same audience and top-level strategy, so you can collect the average processing speed for each record in each revision. You can compare the results with the previous revision and report on the changes in performance. For more information on Revision Manager, see Simulating your revision changes. You can replicate this approach in a batch run or manually, by using a simulation test from the landing page in the Pega Customer Decision Hub portal. You can monitor the performance of your strategy through the data flow metrics. Because each run uses the same strategy and the same audience, you can track changes in strategy performance by comparing metrics from run to run.
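
As an illustration of comparing runs, the following Python sketch computes the relative change in average per-record time between two runs on the same audience. The function name and the timings are hypothetical; the real values come from your data flow metrics.

```python
def percent_change(baseline_ms_per_record: float, candidate_ms_per_record: float) -> float:
    """Relative change in average per-record time, as a percentage of the baseline."""
    if baseline_ms_per_record <= 0:
        raise ValueError("baseline must be a positive duration")
    delta = candidate_ms_per_record - baseline_ms_per_record
    return 100.0 * delta / baseline_ms_per_record


# Hypothetical averages for the previous and the new revision.
previous_revision_ms = 8.0
new_revision_ms = 10.0
change = percent_change(previous_revision_ms, new_revision_ms)
print(f"Average per-record time changed by {change:+.1f}%")
```

A positive result means that the newer revision is slower per record. The comparison is only meaningful when both runs use the same audience and the same top-level strategy.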

For more information, see Unit testing strategy rules or Configuring batch case runs.

Test your strategy on multiple records. In most cases, 100 records are enough to give you a reliable indication of the performance of your strategy.
The current batch test capability has two limitations:
  • It shows the result for the current strategy only, so you need to run a performance test on each substrategy to collect metrics.
  • It provides detailed metrics only for legacy components when running in optimized mode.
Strategy test run panel and test results
A performance indication is displayed for each shape on the strategy canvas.

Use the strategy execution profile

The strategy execution profile is the traditional Pega Platform test run page, which accepts a data transform to initialize the customer page for executing the strategy. In Pega Customer Decision Hub, you can directly use the data transform rules that are used for persona testing to generate a report that you can download for offline analysis.

To use the strategy execution profile:

  1. Open the strategy rule under test (normally, this would be the top next best action strategy).
  2. Click Actions > Run.
  3. Initialize the primary page context with a data transform or copy it from another page, whichever is appropriate based on your setup.
  4. Click Run.
  5. Download the strategy execution profile report.
Strategy execution profile report
Report showing next-best-action strategy execution parameters.

The report includes the total strategy execution time and a breakdown by strategy and component. When the strategy runs on the new SSA engine, only the unoptimized components are measured directly (pages in, pages out, and execution time) and are listed under their own component names. The row with the component name All represents the total time spent within that strategy execution, including substrategy executions when applicable. The Optimized row covers everything that is included in All, minus the non-optimized components. For a substrategy component, the execution time is cumulative.
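
One way to read the relationship between the All, Optimized, and unoptimized rows is as simple subtraction. The following Python sketch illustrates that reading only; the function name, component names, and timings are hypothetical and are not taken from a real report.

```python
def optimized_time_ms(all_ms: float, unoptimized_ms: dict[str, float]) -> float:
    """Time attributed to optimized components: the All row minus the unoptimized rows."""
    return all_ms - sum(unoptimized_ms.values())


# Hypothetical rows from one strategy in a profile report.
all_row_ms = 120.0
unoptimized_rows_ms = {"Adaptive Model": 45.0, "Interaction History": 30.0}

optimized_row_ms = optimized_time_ms(all_row_ms, unoptimized_rows_ms)
print(f"Optimized: {optimized_row_ms:.1f} ms of {all_row_ms:.1f} ms")
```

Because substrategy times are cumulative, apply this kind of arithmetic within a single strategy rather than summing rows across nested strategies.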

Apply filters early

Apply filters in a strategy as early as possible to eliminate data from the strategy flow that is not required to issue a decision. This solution reduces the amount of memory that is needed to process a strategy and decreases the processing time.

Applying filters early in the flow of strategy
A strategy canvas with multiple shapes. Filter components are placed right after the first component, which is Data Import.
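
The effect of filter placement can be sketched outside of Pega. The following Python example is an analogy, not strategy configuration: `enrich` stands in for any expensive per-record component, and all names and data are hypothetical. Both orderings return the same result, but filtering first calls the expensive step less often.

```python
calls = {"enrich": 0}


def enrich(record: dict) -> dict:
    """Stand-in for an expensive per-record step (hypothetical)."""
    calls["enrich"] += 1
    return {**record, "score": record["age"] * 2}


def is_eligible(record: dict) -> bool:
    """Filter condition that depends only on input data, so it can run first."""
    return record["age"] >= 18


def filter_late(records: list[dict]) -> list[dict]:
    """Enrich every record, then discard the ineligible ones."""
    return [r for r in map(enrich, records) if is_eligible(r)]


def filter_early(records: list[dict]) -> list[dict]:
    """Discard ineligible records first, then enrich only what remains."""
    return [enrich(r) for r in records if is_eligible(r)]


customers = [{"age": a} for a in (12, 25, 40, 16, 31)]

late_result = filter_late(customers)
late_calls = calls["enrich"]

calls["enrich"] = 0
early_result = filter_early(customers)
early_calls = calls["enrich"]

print(f"filter late: {late_calls} calls, filter early: {early_calls} calls")
```

This only works when the filter condition does not depend on the output of the expensive step, which is the case for eligibility checks on input data.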

Avoid computing inactive paths in data joins

In complex decision strategies that contain multiple layers of substrategies, you can encounter Data Join components that are always triggered, regardless of their validity in the decision path. This type of design can needlessly extend the strategy processing time and is not recommended.

To illustrate this problem, see the following example strategy:

An analysis of a strategy in terms of the number of decisions and time spent in each component
The image shows the time spent and the number of decisions made at each component in a strategy.

In the preceding strategy, the condition that is configured in the Data Join shape states that the data is matched only if the value of the SubjectID property of the input records is the same. However, even if the processing of the Filter shape results in no output records, the substrategy is still processed, which results in the unnecessary addition of 1.56 seconds to the total processing time.

To process the strategy only when required, use the Switch and Group By components. The Group By component counts the customer records that pass through the Filter component. If at least one customer record passes through the Filter component, the strategy is processed; otherwise, the strategy is not processed.

Strategy optimized to avoid processing unused paths
Strategy is not executed if there are no input records. Switch and Group By components are added for optimization.
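
The guard pattern described above can be sketched in Python as follows. This is a simplified analogy rather than strategy configuration: counting the filtered records plays the role of the Group By component, the `if` statement plays the role of the Switch component, and `expensive_substrategy` is a hypothetical stand-in for the substrategy.

```python
from typing import Callable

Record = dict


def run_guarded(
    records: list[Record],
    passes_filter: Callable[[Record], bool],
    substrategy: Callable[[list[Record]], list[Record]],
) -> list[Record]:
    """Run the substrategy only if at least one record passes the filter."""
    filtered = [r for r in records if passes_filter(r)]

    # Group By role: count the records that passed the filter.
    passed_count = len(filtered)

    # Switch role: choose the path based on that count.
    if passed_count == 0:
        return []  # inactive path, the substrategy is never invoked
    return substrategy(filtered)


calls = {"substrategy": 0}


def expensive_substrategy(rows: list[Record]) -> list[Record]:
    """Hypothetical stand-in for a costly substrategy."""
    calls["substrategy"] += 1
    return [{**r, "Offer": "Gold"} for r in rows]


inactive = [{"SubjectID": "C1", "Active": False}, {"SubjectID": "C2", "Active": False}]
no_results = run_guarded(inactive, lambda r: r["Active"], expensive_substrategy)
calls_when_empty = calls["substrategy"]

mixed = inactive + [{"SubjectID": "C3", "Active": True}]
some_results = run_guarded(mixed, lambda r: r["Active"], expensive_substrategy)
calls_when_needed = calls["substrategy"]
```

In the first call no record passes the filter, so the substrategy is skipped entirely. In the second call it runs once, for the single record that passed.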

For more information, see Strategy rule form - Completing the Strategy tab.

Cache time-consuming expressions

You can cache the global parts of an expression that are not required for each decision. For example, the following Set Property component takes 525.76 milliseconds to compute, which is 12 percent of the total strategy processing time. To a strategy designer, this amount of time might indicate that this element requires optimization.

Set Property component that takes an excessive amount of time to process
A Set Property component in a strategy takes 525 ms to process, which is 12% of the total processing time.

The Set Property component, named Set Valid, sets properties as stated by the following expression:

.D_Date_Start <= DateTime.D_Today     &&
.D_Date_End   >= DateTime.D_Today     &&
.D_Time_Start <= DateTime.D_TimeOfDay &&
.D_Time_End   >= DateTime.D_TimeOfDay

Based on the preceding expression, the DateTime.D_Today and DateTime.D_TimeOfDay properties are retrieved from the clipboard page for each decision. This time-consuming process can be optimized by caching the two properties through an additional Set Property component.

Reducing the processing time in a strategy through property caching
A Data Cache component is added after a Set Property component.

The new DataCache Set Property component sets temporary D_Today and D_TimeOfDay properties. This component reduces the processing time of the Set Valid component from 12 percent of the total strategy processing time to 1 percent by using the following expression:

.D_Date_Start <= DataCache.D_Today     &&
.D_Date_End   >= DataCache.D_Today     &&
.D_Time_Start <= DataCache.D_TimeOfDay &&
.D_Time_End   >= DataCache.D_TimeOfDay
You cannot apply this approach to all time-consuming strategy components. For example, a component might consume most of the total processing time of the strategy because it handles an increased number of records. You can view the number of records that pass through each strategy component by selecting the Number of processed records statistics in a batch case test run.
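
The caching idea from this section can be sketched in Python as an analogy. The `Clipboard` class is a hypothetical stand-in for a page whose property reads are costly, and it counts reads so that the difference is visible; none of these names are Pega APIs.

```python
from datetime import date, time


class Clipboard:
    """Hypothetical stand-in for a page whose property reads are costly."""

    def __init__(self, today: date, time_of_day: time) -> None:
        self._values = {"D_Today": today, "D_TimeOfDay": time_of_day}
        self.reads = 0

    def get(self, name: str):
        self.reads += 1  # count every lookup to make the cost visible
        return self._values[name]


def is_valid(offer: dict, today: date, now: time) -> bool:
    """Same shape as the Set Valid expression: date and time windows must both match."""
    return (offer["D_Date_Start"] <= today <= offer["D_Date_End"]
            and offer["D_Time_Start"] <= now <= offer["D_Time_End"])


def validate_uncached(offers: list[dict], page: Clipboard) -> list[bool]:
    """Reads both properties from the page again for every decision."""
    return [is_valid(o, page.get("D_Today"), page.get("D_TimeOfDay")) for o in offers]


def validate_cached(offers: list[dict], page: Clipboard) -> list[bool]:
    """Reads both properties once, then reuses the cached values."""
    today, now = page.get("D_Today"), page.get("D_TimeOfDay")
    return [is_valid(o, today, now) for o in offers]


window = {"D_Time_Start": time(8, 0), "D_Time_End": time(20, 0)}
offers = [
    {"D_Date_Start": date(2024, 1, 1), "D_Date_End": date(2024, 12, 31), **window},
    {"D_Date_Start": date(2025, 1, 1), "D_Date_End": date(2025, 12, 31), **window},
    {"D_Date_Start": date(2024, 6, 1), "D_Date_End": date(2024, 6, 30), **window},
]

slow_page = Clipboard(date(2024, 6, 15), time(12, 0))
slow_result = validate_uncached(offers, slow_page)

fast_page = Clipboard(date(2024, 6, 15), time(12, 0))
fast_result = validate_cached(offers, fast_page)

print(f"uncached reads: {slow_page.reads}, cached reads: {fast_page.reads}")
```

The results are identical, but the cached version performs a fixed number of lookups regardless of how many decisions are evaluated. As noted above, this does not help a component that is slow simply because it processes many records.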

Frequently asked questions when troubleshooting strategy performance

Which node type does a strategy run on when it is executed through a data flow? Can it be controlled to run on specific node types?
There is no dedicated node type for a strategy, because it normally gets executed under a data flow.
Which components, when used in a strategy design, are most likely to lead to performance degradation?
Unoptimized components are typically the ones that need more attention when debugging. Typical components that might run into performance issues:
  • Adaptive Model, which is by nature a relatively expensive component. If you use interaction history predictors, adaptive models can be extremely expensive when there is an issue with the Decision Data Store (for example, a PEGA0075 alert).
  • Interaction history, when there are many records to be loaded.
  • Data import or decision data, when importing a large list of pages.
  • Embedded Strategy, when iterating over a large page list.
  • Data Join, when the component is wrongly configured and causes an explosion in the number of result pages (for example, a Cartesian product).
  • MarkerContainer, which is an internal technical representation of all data that needs to be propagated along with the service request page, for example, adaptive decision manager model results and monitoring data. It is transparent to the strategy designer, but if there are too many service request pages or the strategy logic is incorrectly configured, it might cause long garbage collection (GC) pauses. In this case, select Exclude model results from Assisted Channel Eligibility in the Data Join shape properties, when applicable.
What are the important things to look at while tracing strategies?
Tracing with the built-in tracer is not recommended with the optimized decision engine, because the optimizations mean that the order of execution cannot be guaranteed.
