Support Article
Pega agents hang and fail to do their jobs
SA-2825
Summary
Various agent threads are not doing their intended jobs. The agents do not go down completely, and they cannot be stopped or interrupted from the System Management Console; attempts to do so have no effect.
Error Messages
N/A
Steps to Reproduce
N/A
Root Cause
The root cause of this problem is in a third-party product integrated with PRPC. Inspection of thread dumps shows that the agent threads were stuck within the Rule-Utility-Function SendEmailMessage, specifically in the portion of that RUF that attempts to establish a connection to the SMTP email server. Within the JavaMail API, the smtp.connect() method has no timeout by default, so when the connect attempt hangs (for unknown reasons) the thread waits forever. The thread is blocked inside the connect call and is NOT in a state where it can be stopped or interrupted.
Resolution
This issue is resolved by hotfix item HFix-7384, which changes the Rule-Utility-Function SendEmailMessage to apply a more reasonable timeout (60 seconds by default, configurable by a System Setting). With the hotfix applied, if communication with the email server hangs, the PRPC thread times out and throws an exception. The expectation is that the communication problem is sporadic and that the send will succeed on the next retry of the same agent queue.
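For illustration only, the general technique of bounding a JavaMail SMTP connection can be sketched as follows. This is not the code shipped in HFix-7384, and this article does not describe how the hotfix sets its timeout internally. The class name SmtpTimeoutConfig, the method smtpProperties, and the host smtp.example.com are hypothetical. The property names mail.smtp.connectiontimeout and mail.smtp.timeout are standard JavaMail SMTP settings, expressed in milliseconds, and both are unbounded when left unset.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of PRPC or HFix-7384. */
final class SmtpTimeoutConfig {

    /** 60 seconds, matching the default this article quotes for the hotfix. */
    static final int DEFAULT_TIMEOUT_SECONDS = 60;

    /**
     * Builds JavaMail session properties with bounded SMTP timeouts.
     * JavaMail reads both timeout values as milliseconds.
     */
    static Properties smtpProperties(String host, int timeoutSeconds) {
        if (timeoutSeconds <= 0) {
            throw new IllegalArgumentException("timeoutSeconds must be positive");
        }
        String millis = String.valueOf(timeoutSeconds * 1000L);

        Properties props = new Properties();
        props.setProperty("mail.smtp.host", host);
        // Upper bound on establishing the socket connection to the server.
        props.setProperty("mail.smtp.connectiontimeout", millis);
        // Upper bound on waiting for a reply once the socket is connected.
        props.setProperty("mail.smtp.timeout", millis);
        return props;
    }

    public static void main(String[] args) {
        // smtp.example.com is a placeholder host, not a real server.
        Properties props = smtpProperties("smtp.example.com", DEFAULT_TIMEOUT_SECONDS);
        // A real sender would pass these to javax.mail.Session.getInstance(props).
        System.out.println(props.getProperty("mail.smtp.connectiontimeout"));
        System.out.println(props.getProperty("mail.smtp.timeout"));
    }
}
```

With properties like these on the mail session, a hung connection attempt should end in an exception once the timeout elapses instead of holding the thread indefinitely, which is the behavior the Resolution describes.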
Published January 31, 2016 - Updated October 8, 2020