PEGA0019 alert: Long-running requestor detected

Normally, each interaction that a requestor processes takes between a millisecond and a few seconds of elapsed time to complete. The PEGA0019 alert indicates that a requestor has been servicing one interaction for a long elapsed interval, such as 30 minutes. This delay can arise from an infinite loop, or from a request to a database or external system that has failed, is disconnected, or is otherwise unavailable, so that no response is received within the time limit.

Occasionally, requestors become stuck on the server. Beginning with version 5.2, the master agent can end these stuck requestors, freeing resources.

The master agent checks whether each requestor is busy by attempting to acquire a lock on that requestor. The check runs periodically, at the interval that is set in the interval setting of the prconfig.xml file (120 seconds by default).

If the lock attempt fails (the requestor is busy), the master agent keeps track of this requestor for the amount of time that is set in the requesttime setting before generating the alert.

The master agent keeps generating alerts until it reaches the maximum number of notifications that is set in the notifications setting of the prconfig.xml file. After that, no more alerts are generated, even if the requestor is still running.
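The timing described above can be sketched as a small simulation. This is an illustrative model only, not Pega code: the function name alert_times and its parameters are hypothetical, and it assumes that once a requestor has been busy for requesttime seconds, one alert is written at each subsequent master agent check until the notifications limit is reached. The actual spacing between repeated alerts on a real system may differ.

```python
def alert_times(busy_seconds, interval=120, requesttime=600, notifications=3):
    """Return the check times (seconds since the requestor became busy) at
    which an alert is written under the simplified model described above.

    Assumption: the first check happens one interval after the requestor
    became busy, and every check after requesttime produces one alert.
    """
    alerts = []
    t = interval
    while t <= busy_seconds and len(alerts) < notifications:
        if t >= requesttime:       # tracked long enough to count as long-running
            alerts.append(t)
        t += interval              # next periodic master agent check
    return alerts


if __name__ == "__main__":
    # A requestor that stays busy for 30 minutes with the default settings.
    print(alert_times(30 * 60))
```

Under this model, a requestor that finishes before requesttime elapses produces no alert, and a requestor that never finishes produces at most notifications alerts.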

Example message text

Long running interaction detected (Requestor:H3C02956618E8749DA74FCDDF78FFE168,Java Thread:http-bio-8080-exec-6,Last Access Time:20170428T051715.409 GMT,User:Admin.Lokas,Access Group Name:Lokas:Administrators,Application Name:Lokas 01.01.01,Last Known Processing:Activity=Pega-RunRecord.pzRunRecordExecute)

The alert captures and displays detailed information about the issue:

  • Requestor – The ID of the long-running requestor
  • Java Thread – The name of the thread that is associated with the long-running requestor
  • Last Access Time – The time when the long-running requestor was last accessed
  • User – The name of the user that is associated with the long-running requestor
  • Access Group Name – The name of the user access group
  • Application Name – The application name that is associated with the long-running requestor
  • Last Known Processing – The last known action that the requestor performed
  • The stack trace from the thread that is associated with the long-running requestor
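When reviewing many alerts, it can help to split the message text into the fields listed above. The following is a minimal sketch that assumes the message has the same layout as the example message text; the function names parse_pega0019 and parse_last_access_time are hypothetical and are not part of any Pega API. Because values such as the access group name can themselves contain colons, the sketch splits only on commas that are followed by a known field name.

```python
import re
from datetime import datetime, timezone

# Field names as they appear in the example message text.
FIELDS = [
    "Requestor", "Java Thread", "Last Access Time", "User",
    "Access Group Name", "Application Name", "Last Known Processing",
]

# Split on a comma only when a known field name and a colon follow it.
_SPLIT = re.compile(r",(?=(?:%s):)" % "|".join(re.escape(f) for f in FIELDS))


def parse_pega0019(message):
    """Return a dict of field name -> value from the parenthesized part."""
    body = message[message.index("(") + 1 : message.rindex(")")]
    fields = {}
    for part in _SPLIT.split(body):
        name, _, value = part.partition(":")  # split on the first colon only
        fields[name] = value
    return fields


def parse_last_access_time(value):
    """Convert a value such as '20170428T051715.409 GMT' to an aware datetime."""
    parsed = datetime.strptime(value, "%Y%m%dT%H%M%S.%f GMT")
    return parsed.replace(tzinfo=timezone.utc)
```

Passing the example message to parse_pega0019 yields one entry per field, for example the value Lokas:Administrators under Access Group Name. The stack trace is not part of the message text and is not handled here.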

Default prconfig.xml settings

<env name="alerts/longrunningrequests/enabled" value="true" />

  • Toggles the alert on (true) or off (false)

<env name="alerts/longrunningrequests/notifications" value="3" />

  • A positive integer indicating the maximum number of alerts/notifications that are generated when a long-running requestor is detected by the master agent. The default is 3.

<env name="alerts/longrunningrequests/requesttime" value="600" />

  • A positive integer indicating the number of seconds for which the system tracks a busy requestor before generating the alert. The default is 600 seconds (10 minutes).

When you modify a default value by editing the prconfig.xml file, the new value applies only for the node on which the configuration file is modified. Ensure that the same modified value is propagated to all nodes in the cluster and that all nodes reflect the same threshold value in their configuration files.

You must restart the nodes for the changes to the prconfig.xml file to take effect.
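Because each node reads its own copy of the file, a quick consistency check across nodes can catch a value that was changed on one node only. The following is a minimal sketch, not a Pega utility: the function names are hypothetical, and it assumes only that each prconfig.xml file contains env elements with name and value attributes, as in the settings shown above. How you collect the file contents from each node is left to your environment.

```python
import xml.etree.ElementTree as ET

PREFIX = "alerts/longrunningrequests/"


def long_running_settings(xml_text):
    """Return {setting name: value} for the long-running request settings."""
    root = ET.fromstring(xml_text)
    return {
        env.get("name"): env.get("value")
        for env in root.iter("env")          # env elements at any depth
        if (env.get("name") or "").startswith(PREFIX)
    }


def find_mismatches(configs):
    """configs maps a node label to the text of that node's prconfig.xml.

    Returns {setting name: {node label: value or None}} for every setting
    whose value is not identical on all nodes (None means not set).
    """
    per_node = {node: long_running_settings(text) for node, text in configs.items()}
    names = set().union(*per_node.values())
    mismatches = {}
    for name in sorted(names):
        values = {node: settings.get(name) for node, settings in per_node.items()}
        if len(set(values.values())) > 1:
            mismatches[name] = values
    return mismatches
```

An empty result means that every node in the comparison has the same values for these settings. A setting that is missing on one node is reported as a mismatch, because that node falls back to the default.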

Reasons for the alert

Use the System Management Application (SMA) > Requestor Management page to check whether the requestor is still running. If it is, check with the user who is identified in the alert to determine whether the user is running a legitimate query or whether there is a problem with the requestor activity. If there is a problem, use the SMA to terminate the requestor.

Other alert types, such as the PEGA0005 database operation threshold, run within a requestor context and are added to the Alert log only when a response occurs. As a result, PEGA0005 alerts cannot report indefinite looping or failures. For example, if the database server fails completely or the server loses connectivity to the database server, the PEGA0019 alert is generated after 10 minutes (the default requesttime), but the PEGA0005 alert is not generated.
