PEGA0019 alert: Long-running requestor detected
Normally, each interaction that a requestor processes takes between a millisecond and a few seconds of elapsed time to complete. The PEGA0019 alert indicates that a requestor has been servicing one interaction for a long elapsed interval, such as 30 minutes. This delay can arise from an infinite loop, or from a request to a database or external system that has failed, is disconnected, or is otherwise unavailable, so that no response is received from that system within the time limit.
Occasionally, requestors are stuck in the server. Beginning with version 5.2, the master agent can end these stuck requestors, freeing resources.
The master agent checks whether each requestor is busy by attempting to acquire a lock on that requestor. The check is performed periodically, at the time interval that is set in the interval setting of the prconfig.xml file (120 seconds by default).
If the lock attempt fails (the requestor is busy), the master agent keeps track of this requestor for the amount of time that is set in the requesttime setting before generating the alert.
The master agent keeps generating alerts until it reaches the maximum number of notifications that is set in the notifications setting of the prconfig.xml file. After that, no more alerts are generated, even if the requestor is still running.
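The interplay of the three settings can be sketched as a small simulation. This is only an illustrative model, not the master agent's actual implementation: the function name `alert_times` is hypothetical, and it assumes that the first check happens one interval after the requestor becomes busy and that, once the requesttime threshold is crossed, one alert is written on every subsequent check until the notification limit is reached.

```python
def alert_times(busy_seconds, interval=120, requesttime=600, notifications=3):
    """Return the elapsed times (in seconds) at which an alert would be written.

    Assumptions (not confirmed by the product documentation):
      - checks occur at interval, 2*interval, ... after the requestor became busy
      - every check at or after `requesttime` writes one alert
      - alerts stop once `notifications` alerts have been written
    """
    times = []
    elapsed = interval
    while elapsed <= busy_seconds and len(times) < notifications:
        if elapsed >= requesttime:
            times.append(elapsed)
        elapsed += interval
    return times
```

Under these assumptions and the default values, a requestor that stays busy for an hour would produce alerts at 600, 720, and 840 seconds and none afterward, while a requestor that finishes in under 10 minutes would produce none.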
Example message text
Long running interaction detected (Requestor:H3C02956618E8749DA74FCDDF78FFE168,Java Thread:http-bio-8080-exec-6,Last Access Time:20170428T051715.409 GMT,User:Admin.Lokas,Access Group Name:Lokas:Administrators,Application Name:Lokas 01.01.01,Last Known Processing:Activity=Pega-RunRecord.pzRunRecordExecute)
The alert captures and displays detailed information about the issue:
- Requestor – The ID of the long-running requestor
- Java Thread – The name of the thread that is associated with the long-running requestor
- Last Access Time – The time when the long-running requestor was last accessed
- User – The name of the user that is associated with the long-running requestor
- Access Group Name – The name of the user access group
- Application Name – The application name that is associated with the long-running requestor
- Last Known Processing – The last known action that the requestor performed
- The stack trace from the thread that is associated with the long-running requestor
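When triaging many alerts, it can help to split the message text into the fields listed above. The following is a minimal sketch that assumes the message always follows the layout of the example (a parenthesized, comma-separated list of `Label:value` pairs, with a timestamp such as `20170428T051715.409 GMT`). The function names are hypothetical and are not part of any Pega API.

```python
import re
from datetime import datetime, timezone

# Field labels as they appear in the example message text.
LABELS = [
    "Requestor", "Java Thread", "Last Access Time", "User",
    "Access Group Name", "Application Name", "Last Known Processing",
]
# A label counts only at the start of the list or directly after a comma,
# so colons inside values (for example "Lokas:Administrators") are left alone.
_LABEL_RE = re.compile(r"(?:^|,)(" + "|".join(re.escape(l) for l in LABELS) + r"):")


def parse_pega0019(message):
    """Return a dict of field label -> value from a PEGA0019 message."""
    start, end = message.find("("), message.rfind(")")
    if start == -1 or end < start:
        raise ValueError("no parenthesized field list found")
    body = message[start + 1:end]
    hits = list(_LABEL_RE.finditer(body))
    fields = {}
    for i, hit in enumerate(hits):
        value_end = hits[i + 1].start() if i + 1 < len(hits) else len(body)
        fields[hit.group(1)] = body[hit.end():value_end].strip()
    return fields


def parse_last_access(value):
    """Convert a 'Last Access Time' value to a timezone-aware datetime (GMT)."""
    parsed = datetime.strptime(value, "%Y%m%dT%H%M%S.%f GMT")
    return parsed.replace(tzinfo=timezone.utc)
```

Splitting on known labels rather than on every comma keeps values intact even when they contain commas or colons of their own.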
Default prconfig.xml settings
<env name="alerts/longrunningrequests/enabled" value="true" />
- Toggles the alert on (true) or off (false)
<env name="alerts/longrunningrequests/notifications" value="3" />
- A positive integer indicating the maximum number of alerts/notifications that are generated when a long-running requestor is detected by the master agent. The default is 3.
<env name="alerts/longrunningrequests/requesttime" value="600" />
- A positive integer indicating the number of seconds for which the system tracks a busy requestor before generating the alert. The default is 600 seconds (10 minutes).
When you modify a default value by editing the prconfig.xml file, the new value applies only for the node on which the configuration file is modified. Ensure that the same modified value is propagated to all nodes in the cluster and that all nodes reflect the same threshold value in their configuration files.
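One way to verify that all nodes carry the same values is to compare the relevant `env` entries from each node's prconfig.xml file. The sketch below is a hypothetical helper, not a Pega-provided tool. It assumes that you have already collected the text of each node's file, and it only looks at settings whose names start with `alerts/longrunningrequests/`.

```python
import xml.etree.ElementTree as ET

PREFIX = "alerts/longrunningrequests/"


def read_settings(xml_text):
    """Return {setting name: value} for the long-running-request settings."""
    root = ET.fromstring(xml_text)
    return {
        env.get("name"): env.get("value")
        for env in root.iter("env")
        if (env.get("name") or "").startswith(PREFIX)
    }


def find_mismatches(node_configs):
    """node_configs maps a node label to the text of its prconfig.xml file.

    Returns {setting name: {node label: value or None}} for every setting
    whose value is missing on some node or differs between nodes.
    """
    per_node = {node: read_settings(text) for node, text in node_configs.items()}
    names = set().union(*per_node.values())
    mismatches = {}
    for name in sorted(names):
        values = {node: settings.get(name) for node, settings in per_node.items()}
        if len(set(values.values())) > 1:
            mismatches[name] = values
    return mismatches
```

An empty result means that every node defines the same values for these settings. A setting that is absent on a node is reported with the value `None`, because that node falls back to the default.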
Reasons for the alert
Use the Requestor Management page of the System Management Application (SMA) to check whether the requestor is still running. If it is, check with the user who is identified in the alert to determine whether the user is running a legitimate query or whether there is a problem with the requestor activity. If there is a problem, use the SMA to terminate the requestor.
Other alert types, such as the PEGA0005 database operation threshold, run within a requestor context and are added to the Alert log only when a response occurs. As a result, PEGA0005 alerts cannot report indefinite loops or failures that never return a response. For example, if the database server fails completely or the server loses connectivity to the database server, the PEGA0019 alert is generated after 10 minutes (the default requesttime value), but no PEGA0005 alert is generated.