
Support Play: Troubleshooting agent and service activity performance

Summary

Some performance issues occur only when an agent or a service runs. Analyzing such problems differs from analyzing an interactive user's performance issue. (For performance analysis for interactive users, see Support Play: A methodology for troubleshooting performance.)

Quick Links

Troubleshooting with PAL Tools

Adding Troubleshooting Steps to an Existing Activity

     Troubleshooting Tools

     Troubleshooting Process Using PALLog

Additional Resources

Need Further Help?


 

 

Suggested Approach

Use this support play after you have followed the steps in the Troubleshooting Performance support play and determined that a performance problem is in an agent or service.

This article describes special ways to run PAL or DB Trace to capture information about agent or service performance.  See Troubleshooting Performance for the actual analysis of the captured data.

You can run PAL on any requestor, including agent requestors and service requestors.  However, a difficulty when using PAL on an agent or service interaction is that their activities may execute so quickly that you cannot catch the data you want before it is replaced by newer data.  As a result, it may be necessary to add custom steps into the agent or service activities to catch the PAL data.

Unlike PAL, you cannot run the DB Trace tool directly on agent or service requestors.  The only way to get DB Trace output for these interactions is to add custom steps to the activities, as described below.

Troubleshooting with PAL Tools  

The Performance tool (PAL, or Performance AnaLyzer) provides a collection of counters and timer readings that you can use to analyze some performance issues in a system.  This tool captures the information necessary to identify processing inefficiencies or excessive use of resources in your application and stores this data in “PAL counters” or “PAL readings” for one requestor or node.  

Use PAL to gain insight into where the system is spending resources and to identify areas of poor performance, such as delays in processing, refreshing screens, submitting work objects, or other application functions.  For details on how to take and analyze PAL readings, see the Troubleshooting Performance support play.

Performance statistics for all requestors in the node are available through other tools.  Depending upon your system version, use one of these tools:

  • V4.2:  System Console/Monitor Servlet
  • V5.1:  System Management Application

The first step in analyzing agent performance is to select the requestor for the agent or service in question. The Name of any requestor shows its hash name. 

  • If the name begins with “H”, this requestor is being used by a user (HTTP interaction).
  • If the name begins with “A”, this requestor is being used by a listener or service rule.
  • If the name begins with “B”, this is a batch requestor (used by agent processing).

In addition, the Client Address column shows the IP address of the computer that is sending information to the requestor.  For agents, the “address” is a label (“Master Agent,” “Usage Daemon,” etc.).
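The prefix rule above can be sketched as a small lookup. This is a standalone illustration, not Process Commander code; the function names and the "unknown" fallback are assumptions made for the example.

```python
# Prefix-to-type mapping taken from the list above.
REQUESTOR_TYPES = {
    "H": "user (HTTP interaction)",
    "A": "listener or service rule",
    "B": "batch (agent processing)",
}


def requestor_type(requestor_name: str) -> str:
    """Return the kind of requestor implied by the first letter of its name."""
    if not requestor_name:
        return "unknown"
    # Any other first letter is reported as "unknown" (an assumption of this sketch).
    return REQUESTOR_TYPES.get(requestor_name[0], "unknown")


def batch_requestors(names):
    """Keep only the requestors that are used by agent processing."""
    return [name for name in names if name.startswith("B")]
```

When scanning a long requestor list for agents, filtering on the "B" prefix first narrows the candidates before you compare the Client Address labels.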


Using the System Console/Monitor Servlet (Version 4.2)

The System Console has a Requestor Status page, which shows all the requestors running on that node.  Once the appropriate requestor has been identified, click the clock icon to the left of the requestor.

The PAL Detail window appears for this requestor.

Note that the PAL counters on this window are in a different order than on the PAL Detail window available in the portal, and are labeled slightly differently; however, the properties tracked are the same.


Using the System Management Application (Version 5.1)  

The System Management Application has a Requestor Management page, which displays information about any of the requestors running on this node of the system.  When the appropriate agent requestor has been identified, click the radio button next to that requestor, and then click the Performance Details button at the top of the window.

The PAL Detail window appears for this requestor.

For details on the System Management Application in 5.1, see the System Management Reference Guide (PDF).


Adding Troubleshooting Steps to an Existing Activity

As stated above, when troubleshooting agent or service performance, you may be unable to catch the activity run in the PAL window, because the process may run quite fast; in addition, DB Trace cannot be run directly on agent or service requestors at all.  Therefore, it may be more efficient to add troubleshooting steps inside the activity itself, to gather more data on possible problems.


Troubleshooting Tools

The tools that you can add as steps in an activity include:

  • getPAL Java method
  • PALLog activities
  • SetRequestorLevelDBTrace activity

PAL Details using the getPAL Method

The getPAL method allows you to capture PAL information by making the PAL data collection part of the actual agent activity.

getPAL is a method found in the PublicAPI class.

The PAL class has several Java methods, including:

  • getStats
  • clearStats

You can call these methods from a Java step in any activity.

For an example, inspect the code in the standard activity named Code-Pega-PAL.PALDataGet.  The first Java step in that activity calls the getStats method.  Other activities defined on the Code-Pega-PAL class demonstrate how to instrument specific processes in an application to capture specific performance data.


PAL Details using the PALLog RuleSet

If you want to avoid coding Java steps, download the PALLog RuleSet (24136_PALLog.ZIP).  This RuleSet contains two activities:

  • ClearPALData
  • SavePALData

These activities call the getPAL methods, and may be used to gather the PAL data inside agent or service activities. 

ClearPALData


To use this activity, add a step at the beginning of your own activity to call the ClearPALData activity, to clear the PAL statistics.  This step is analogous to clicking the Reset Data link in the standard user tracing steps.

SavePALData

Similarly, add a step to your activity that calls the SavePALData activity at each place where you want to save the PAL reading values. The SavePALData activity takes two parameters:

  • SnapShotName – the name of a PAL snapshot file
  • PALFilePath – the path to which the file will be written on the server

Important:  Some standard agent activities run frequently (once every 30 seconds or so). Placing these troubleshooting steps into an agent activity may result in a number of PAL data files being created.  To give the files unique names so that they do not overwrite each other, you can include a timestamp in the SnapShotName parameter:

"PALforSLA_" + Lib(Pega-RULES:DateTime).CurrentDateTime()

This guarantees that the filenames for multiple PAL readings will be unique. For example:

  • PALforSLA_20061115T100602_B0D643C767a0824A580D522AF379DC85F4.log
  • PALforSLA_20061115T100634_B3824DVED4C8864E593C721540098C58E.log
  • PALforSLA_20061115T100704_B03E086F25981219891A4CEDC40329FD7.log

(The final portion of these file names is the requestor ID hash code, which is probably but not certainly unique; thus, the timestamp is added.)
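As a rough illustration of why the timestamp keeps successive files apart, the sketch below builds names shaped like the examples above. It is plain Python, not a Process Commander expression; the function names are hypothetical, and the layout (snapshot name, underscore, requestor ID, ".log") is an assumption read off the example file names.

```python
from datetime import datetime, timedelta


def snapshot_name(prefix: str, now: datetime) -> str:
    """Append a second-resolution timestamp such as 20061115T100602 to the prefix."""
    return prefix + now.strftime("%Y%m%dT%H%M%S")


def pal_file_name(prefix: str, requestor_id: str, now: datetime) -> str:
    """Assumed layout: <snapshot name>_<requestor ID>.log, as in the examples above."""
    return f"{snapshot_name(prefix, now)}_{requestor_id}.log"


# Two agent runs roughly 30 seconds apart on the same requestor.
first_run = datetime(2006, 11, 15, 10, 6, 2)
second_run = first_run + timedelta(seconds=32)
requestor = "B0D643C767a0824A580D522AF379DC85F4"

names = {
    pal_file_name("PALforSLA_", requestor, first_run),
    pal_file_name("PALforSLA_", requestor, second_run),
}
```

Because the timestamp in this sketch has one-second resolution, two snapshots taken by the same requestor within the same second would still share a name; for an agent that runs every 30 seconds or so, that is not a practical concern.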

Each time this activity is called to take PAL readings during the running of an agent or service activity, it creates a file containing the detail PAL properties and their values.  This file contains the same type of data that you see when you click the Save Data link from the summary PAL display.

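The PAL data collected is cumulative: each call to SavePALData reports everything measured since the statistics were last cleared, so what a snapshot covers depends on where the calls are placed.  For example, an activity under investigation has 9 steps.  To instrument it, four steps are added: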

  1. call ClearPALData
  2. (original) Step 1
  3. Step 2
  4. Step 3
  5. call SavePALData   (first)
  6. Step 4
  7. Step 5
  8. Step 6
  9. call SavePALData   (second)
  10. Step 7
  11. Step 8
  12. Step 9
  13. call SavePALData   (third)

The above “activity” results in PAL measurements for the first three original steps from the first call to SavePALData.  The second call would report PAL statistics for the first six steps; and the third call to SavePALData would measure the data for all 9 steps.

These measurements can be broken down further:

  1. call ClearPALData
  2. (original) Step 1
  3. Step 2
  4. Step 3
  5. call SavePALData   (first)
  6. call ClearPALData
  7. Step 4
  8. Step 5
  9. Step 6
  10. call SavePALData   (second)
  11. call ClearPALData
  12. Step 7
  13. Step 8
  14. Step 9
  15. call SavePALData   (third)

In the above case, the first call to SavePALData would result in PAL measurements for the first three steps.  Since the PAL data is then cleared, the second call would result in measurements for original steps 4 through 6 only, and the final run of SavePALData would measure data for original steps 7 through 9.
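The difference between the two layouts can be sketched with a toy model. The class below is a stand-in written for this illustration and is not the Pega PAL API: it only records which original steps have run since the last clear, and what each save would therefore cover.

```python
class PalCounter:
    """Toy stand-in for one requestor's PAL statistics (not the Pega API)."""

    def __init__(self):
        self.steps_measured = []  # original steps counted since the last clear
        self.snapshots = []       # what each save call would have covered

    def clear(self):
        """Plays the role of ClearPALData."""
        self.steps_measured = []

    def run_step(self, number):
        """An original activity step doing some work."""
        self.steps_measured.append(number)

    def save(self):
        """Plays the role of SavePALData."""
        self.snapshots.append(list(self.steps_measured))


def run(clear_after_each_save: bool):
    """Run nine original steps, saving after steps 3, 6, and 9."""
    pal = PalCounter()
    pal.clear()
    for step in range(1, 10):
        pal.run_step(step)
        if step % 3 == 0:
            pal.save()
            if clear_after_each_save and step < 9:
                pal.clear()
    return pal.snapshots


cumulative = run(clear_after_each_save=False)
per_segment = run(clear_after_each_save=True)
```

In this model, `cumulative` covers steps 1-3, 1-6, and 1-9, while `per_segment` covers steps 1-3, 4-6, and 7-9, matching the two instrumented layouts described above.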


DB Trace - Using the SetRequestorLevelDBTrace Activity  

The property .pyDBTraceEnabled (on the pxRequestor page) determines whether database tracing is enabled for a particular requestor.

NOTE:  This property value has no effect when Global DB Trace is enabled.  Global DB Trace enables DB Trace on all requestors in the system, and is best used for system-wide problems. 

To set the .pyDBTraceEnabled property, call the standard activity Code-Pega-Requestor.SetRequestorLevelDBTrace.

Pass the boolean parameter enabled, set to true to start DB Tracing or to false to end it.

When DB Trace runs, it creates a DB Trace text file like the one created directly through the interactive Performance tools, which you can then analyze (as explained in Troubleshooting Performance).  The name of this file has several parts:

  • user ID
  • hash value
  • date/timestamp 

 Example:

WorkUser_AcmeCo.com_F8D6EFA61117A446D2467AB669B352D3_20070227T192928_938_GMT.txt

NOTES:

  • In this example, the user ID is WorkUser@AcmeCo.com.  Instead of using the “@” symbol in the file name (which could cause problems), an underscore (“_”) was substituted.
  • Unlike the PAL activities, you cannot direct the DB Trace data to a specific directory.  The DB Trace data is always stored in the ServiceExport directory.  (The exact location of this directory varies depending upon your application server. For example, the Apache Tomcat path is /contextRoot/work/Catalina/localhost/prweb/StaticContent/global/ServiceExport, where contextRoot is the path defined for this application.)
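When many trace files accumulate in the ServiceExport directory, it can help to split their names back into the parts listed above. The following is a minimal sketch with hypothetical helper names; it assumes that every file name follows the shape of the single example shown, with the date/timestamp occupying the last three underscore-separated fields.

```python
def sanitize_user_id(user_id: str) -> str:
    """Replace "@" with "_" as described for the DB Trace file name."""
    return user_id.replace("@", "_")


def split_trace_file_name(file_name: str):
    """Split a name shaped like the example into (user part, hash, timestamp)."""
    stem = file_name[: -len(".txt")] if file_name.endswith(".txt") else file_name
    # The user part may itself contain underscores, so split from the right.
    # Assumption: the timestamp is always <date/time>_<fraction>_<zone>.
    user_part, hash_value, date_time, fraction, zone = stem.rsplit("_", 4)
    return user_part, hash_value, f"{date_time}_{fraction}_{zone}"


example = "WorkUser_AcmeCo.com_F8D6EFA61117A446D2467AB669B352D3_20070227T192928_938_GMT.txt"
user_part, hash_value, timestamp = split_trace_file_name(example)
```

Splitting from the right is the safer choice here because the substituted underscore makes the user part ambiguous when read from the left.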


Troubleshooting Process Using PALLog

Application Agent

The process for troubleshooting agent performance using the PALLog activities is as follows:

  1. Make a new copy of the activity in another RuleSet or RuleSet version.
  2. Add a step at the beginning of the activity to start DB Trace.
  3. Add a step at the beginning of the activity to clear the PAL statistics.
  4. Add steps as desired in the middle of or at the end of the activity to capture new PAL statistics.
  5. Add a step at the end of the activity to stop DB Trace.
  6. Run the agent that calls this activity.
  7. Review the generated performance data.

In the following example, a copy of the standard activity Assign-.ProcessServiceLevelEvents is instrumented with additional steps.

A new first step added to the example activity starts the DB Trace by calling the SetRequestorLevelDBTrace activity.  The parameter enabled is checked (true) to start DB Tracing.

Next, the developer adds a step 2 that clears the PAL statistics.


Steps 3 through 9 in this example activity are the unaltered original steps.  The developer adds Step 10 at the end of the processing to write the PAL statistics to a file.

  • The SnapShotName parameter contains the name of the PAL snapshot file, and includes the timestamp to give the file a unique name.
  • The PALFilePath identifies a directory where this file will be created. 

Finally, the developer adds Step 11 at the end of the activity to end the DB Trace, clearing the enabled box, and saves the Activity form.
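The resulting step order can be sketched as follows. This is only an outline of the sequence described above, written in Python for illustration; the step labels are placeholders rather than the actual contents of the standard activity.

```python
def instrument(original_steps):
    """Wrap the original steps with the DB Trace and PAL troubleshooting calls."""
    return (
        [
            "Call SetRequestorLevelDBTrace (enabled = true)",   # start DB Trace
            "Call ClearPALData",                                # reset PAL statistics
        ]
        + list(original_steps)                                  # unaltered steps
        + [
            "Call SavePALData",                                 # write PAL snapshot
            "Call SetRequestorLevelDBTrace (enabled = false)",  # stop DB Trace
        ]
    )


# Seven placeholder steps stand in for the original activity steps.
original = [f"original step {n}" for n in range(1, 8)]
steps = instrument(original)
for number, step in enumerate(steps, start=1):
    print(number, step)
```

With seven original steps, the instrumented activity has eleven steps: DB Trace starts in step 1, PAL is cleared in step 2, the original steps occupy 3 through 9, the PAL snapshot is written in step 10, and DB Trace stops in step 11.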


After the activity is edited, the developer runs the process that calls the activity. 

IMPORTANT:  Before running the newly edited activity, update the agent's access group to make sure the agent has access to the new activity.

After the activity runs, you can retrieve the output files that were created and analyze them.

Finally, if you edited an existing application activity, remember at the end of this process to either comment out or delete the troubleshooting steps added above.


Pega Agent

It’s possible that the performance issue isn’t in an agent which is part of your application, but is in one of the standard Pega-****- RuleSets.  In this case, you can't update the agent activity to add new steps, because it belongs to a locked RuleSet Version.

To instrument standard activities, the procedure has a few additional twists:

  1. Create a copy of the standard agent activity.
  2. Create a copy of the agent’s Rule-Agent-Queue instance.
  3. Add steps as above.
  4. Disable the standard agent.
  5. Run the new agent.
  6. Review the generated performance data.

Create a copy of the Pega agent activity

 If the agent activity is not set to FINAL availability, make a copy of the activity and save it with the same name into a higher “working” RuleSet in the RuleSet List (perhaps an open custom RuleSet, or the developer’s troubleshooting RuleSet), so that it will get chosen by rule resolution.   If it is possible to copy the activity and keep the same name, then it is not necessary to change anything else – rule resolution will make sure this new activity is used.

Important:  When making a copy of the activity, make sure that the RuleSet the activity is saved in is accessible to the agent!  Check the agent’s access group, and if this RuleSet is not part of the access group, add it.

If the agent activity is set to Final, and so can’t be overridden, then you can save it into a higher RuleSet with a slightly different name.  In this case, it is necessary to disable the original agent, in either of two ways:

  • Disable the agent in the Monitor Servlet (for Version 4.2) or the System Management Application (Version 5.1).
  • Find the Data-Agent-Queue instance which contains the standard agent activity, and disable that agent activity. 

After you disable the standard activity, add the new activity (with the new name) to one of the custom agents for the application.  Again, make certain that this activity is in a RuleSet which is accessible to the agent (through its access group).

Follow the procedure above for editing the activity, running the agent, and reviewing the data. 

REMEMBER:  After the troubleshooting is completed, re-enable the original agent, and disable or delete the new activity with the troubleshooting steps.


Service - Version 4.2

Version 4.2 contains specific tools for troubleshooting services.  The best method for getting standard PAL data for services is to run PAL from the service activity, using the same process as described above for an agent activity.


Service - Version 5.1  

Version 5.1 includes a number of PAL statistics that provide further information about service interactions:

  • CPU time to process parse rules – When mapping data, rules of the following rule types may be used: Rule-Parse-Delimited, Rule-Parse-Structured, and Rule-Parse-XML.  This reading measures CPU time spent processing parse rules.  If this measurement is over 0.5 seconds, review the data being parsed to see if there are issues (for example, a problem with the data structures in the file, or a change in the structure for some of the records).
  • Elapsed time to process parse rules – This reading measures the elapsed (total) time spent processing the parse rules.
  • Number of parse rules – This reading counts the number of parse rules executed.
  • CPU Inbound Mapping Time – Whenever a Rule-Service- rule receives a request, data must be mapped from that request to Process Commander properties.  This reading measures the CPU time spent mapping the inbound data.
  • Elapsed Inbound Mapping Time – This reading measures the elapsed (total) time spent mapping the inbound data for a Rule-Service request.
  • CPU Outbound Mapping Time – Whenever a Rule-Service receives and processes a request, the response data must be mapped from properties to the form the external system expects.  This reading measures the CPU time spent mapping the outbound data.
  • Elapsed Outbound Mapping Time – This reading measures the elapsed (total) time spent mapping the outbound data for a response to an external system request.
  • CPU Activity Time – Whenever a Rule-Service receives and processes a request, after the data is mapped for the response, the system runs a “service” activity.  This reading measures the CPU time spent running a service activity.
  • Elapsed Activity Time – This reading measures the elapsed (total) time spent when the system runs a service activity.
  • Number of records in file – This reading counts the number of records in files processed by File Listeners.
  • Number of Bytes received by the Server through Services – This reading displays the amount of data received by the server through a service request, measured in Kbytes.

These PAL counters provide a detailed picture of where time is spent during a service interaction.

For full details on how these can be used to troubleshoot services, see Testing Services and Connectors in Version 5.1.


Additional Resources  


Need Further Help?

If you have followed this Support Play, but require additional help, contact Global Customer Support by logging a Support Request.


Pega Platform 7.1.1 - 8.3.1 Business Architect System Architect System Administrator System Administration