Support Article
Restart of all PRPC nodes fails on initial attempt
SA-31368
Summary
PRPC initialization intermittently fails on the first attempt when all nodes in a multi-node environment are restarted.
Error Messages
From App Server STDOUT log:
JBAS013412: Timeout after [300] seconds waiting for service container stability. Operation will roll back.
From PegaRULES log:
2016-11-15 20:55:56,352 [server.com] [ STANDARD] [ ] [ ] ( internal.mgmt.PRNodeImpl) INFO - Starts joining cluster
2016-11-15 21:01:21,366 [server.com] [ STANDARD] [ ] [ ] ( internal.mgmt.PREnvironment) ERROR - java.lang.IllegalStateException: Node failed to start!
2016-11-15 21:01:21,371 [server.com] [ STANDARD] [ ] [ ] ( etier.impl.EngineStartup) ERROR - PegaRULES initialization failed. Server: server.com
com.pega.pegarules.pub.context.InitializationFailedError: PRNodeImpl init failed
at com.pega.pegarules.session.internal.mgmt.PREnvironment.getThreadAndInitialize(PREnvironment.java:388)
at com.pega.pegarules.session.internal.PRSessionProviderImpl.getThreadAndInitialize(PRSessionProviderImpl.java:1998)
Caused by: java.lang.IllegalStateException: Node failed to start!
at com.hazelcast.instance.HazelcastInstanceImpl.<init>(HazelcastInstanceImpl.java:125)
Steps to Reproduce
Not Applicable
Root Cause
A configuration setting in the operating environment, the amount of time that the Controller Boot Thread waits for application server startup, was set too low. When startup took longer than this limit (300 seconds), the application server rolled back the operation and the servers shut down again.
Resolution
Make the following change to the operating environment: increase the value of the "jboss.as.management.blocking.timeout" property in the JBoss configuration file (standalone-*.xml or domain.xml, depending on the clustering setup). The value was initially set to 300 seconds; the recommended value is 600 seconds.
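As a rough illustration of the change, the fragment below sets the property as a system property in a standalone configuration file. It is a sketch, not the exact configuration from the affected environment: it assumes the property is declared in a system-properties element placed directly after the extensions element. The surrounding contents of the file are omitted. Confirm the placement against the documentation for your JBoss version, particularly for domain.xml.

```xml
<!-- Sketch only: assumes a standalone-*.xml file; other content omitted. -->
<!-- Place this element directly after the closing </extensions> tag. -->
<system-properties>
    <!-- Time in seconds to wait for service container stability. -->
    <!-- 600 is the value recommended in this article (previously 300). -->
    <property name="jboss.as.management.blocking.timeout" value="600"/>
</system-properties>
```

Restart the application server after the change so the new value takes effect.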
Published December 31, 2016 - Updated October 8, 2020