Support Article
Production system hangs under heavy volume
SA-29155
Summary
The system begins to hang under a larger volume of service calls. The user implemented the first phase of the project without issue and is now moving to the next phase, which brings a much larger volume.
Based on the sizing estimate, the user should have enough CPU, heap, and servers to handle the load. However, after running for a while, the servers start to hang, sometimes with out-of-memory errors and sometimes with timeout errors.
Because the issue appears to be load dependent, the user is running the system with only the 2 JVMs that support internal users, and not the 2 JVMs that support external users.
Volume testing had indicated that the volumes that caused the hangs should have been supportable.
Error Messages
Out-of-memory errors and timeout errors appear in the logs.
Steps to Reproduce
Run the system under heavy volume.
Root Cause
A defect in Pegasystems’ code or rules: As part of the SetResponse DataFlow, a call is made to R-U-F pzStoreFactRecords.
The retry looping logic in that function is incorrect and can cause a scenario in which the same record is written repeatedly until the fifth Database Exception occurs, at which point the process completes. Subsequent re-reading of the massive number of inserted IH records caused the Out of Memory conditions.
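The following is a minimal sketch of this class of defect, not the actual pzStoreFactRecords code, which is not shown in this article. It assumes a retry loop whose only exit condition is the exception count, so a successful write does not end the loop. All names (RetryLoopSketch, SimulatedStore, DatabaseException, storeFaulty, storeFixed) and the failure pattern (every tenth write fails) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RetryLoopSketch {
    // Assumed limit, taken from the "fifth Database Exception" in the root cause.
    static final int MAX_DB_EXCEPTIONS = 5;

    /** Hypothetical stand-in for a database exception. */
    static class DatabaseException extends Exception {
        DatabaseException(String message) { super(message); }
    }

    /** Hypothetical store: every failEvery-th write fails, all others are persisted. */
    static class SimulatedStore {
        final List<String> rows = new ArrayList<>();
        private final int failEvery;
        private int calls = 0;

        SimulatedStore(int failEvery) { this.failEvery = failEvery; }

        void write(String record) throws DatabaseException {
            calls++;
            if (calls % failEvery == 0) {
                throw new DatabaseException("simulated database exception");
            }
            rows.add(record);
        }
    }

    /** Faulty pattern: the loop ends only after the fifth exception, never on success. */
    static void storeFaulty(SimulatedStore store, String record) {
        int exceptions = 0;
        while (exceptions < MAX_DB_EXCEPTIONS) {
            try {
                store.write(record);
                // BUG: no exit here, so the same record is written again.
            } catch (DatabaseException e) {
                exceptions++;
            }
        }
    }

    /** Corrected pattern: stop on the first successful write, retry only on failure. */
    static boolean storeFixed(SimulatedStore store, String record) {
        int exceptions = 0;
        while (exceptions < MAX_DB_EXCEPTIONS) {
            try {
                store.write(record);
                return true; // success ends the retry loop
            } catch (DatabaseException e) {
                exceptions++;
            }
        }
        return false; // gave up after MAX_DB_EXCEPTIONS failures
    }

    public static void main(String[] args) {
        SimulatedStore faulty = new SimulatedStore(10);
        storeFaulty(faulty, "fact-record-1");
        System.out.println("faulty loop rows: " + faulty.rows.size()); // 45

        SimulatedStore fixed = new SimulatedStore(10);
        storeFixed(fixed, "fact-record-1");
        System.out.println("fixed loop rows: " + fixed.rows.size()); // 1
    }
}
```

In this sketch a single logical record produces 45 rows in the faulty loop and 1 row in the corrected loop. The rarer the exceptions, the more duplicates the faulty loop writes before it stops, which is consistent with the large number of inserts described above.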
Resolution
Apply HFix-29495.
Published November 15, 2016 - Updated October 8, 2020