
This content has been archived and is no longer being updated. Links may not function; however, this content may be relevant to outdated versions of the product.

Support Article

Hazelcast timeout occurs on multiple nodes

SA-59955

Summary



Pega nodes hang sporadically after a multi-node cluster is configured. Hazelcast timeouts occur on multiple nodes.


Error Messages



[erClockSynchDaemon-0] [STANDARD] [ ] [ ] ( spi.impl.BasicInvocation) WARN - [xyz]:5701 [2daa41af6f9d6825ddcdf7697eb3f0ca] [3.4.1] No response for 120000 ms. BasicInvocationFuture{invocation=BasicInvocation{ serviceName='hz:impl:executorService', op=Operation{serviceName='hz:impl:executorService', callId=26516, invocationTime=1527167115304, waitTimeout=-1, callTimeout=60000}, partitionId=-1, replicaIndex=0, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeout=60000, target=Address[xyz]:5701, backupsExpected=0, backupsCompleted=0}, response=null, done=false}


Steps to Reproduce



Configure a three-node cluster on Pega 7.2.2.


Root Cause



A defect or configuration issue in the operating environment.
Manually generated thread dumps showed multiple threads blocked in the Oracle JDBC driver code, as shown below:


at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
...
at oracle.jdbc.driver.T4CTTIoping.doOPING(T4CTTIoping.java:50)
at oracle.jdbc.driver.T4CConnection.doPingDatabase(T4CConnection.java:5221)
  - locked <0x00000006c9bf9658> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.PhysicalConnection.pingDatabase(PhysicalConnection.java:7015)
at oracle.jdbc.driver.PhysicalConnection.pingDatabase(PhysicalConnection.java:7036)
at oracle.jdbc.driver.OracleConnection.isValid(OracleConnection.java:222)
at org.apache.tomcat.dbcp.dbcp2.DelegatingConnection.isValid(DelegatingConnection.java:916)
at org.apache.tomcat.dbcp.dbcp2.PoolableConnection.validate(PoolableConnection.java:282)


The affected environment uses the Oracle 12.1.0.2.0 JDBC driver, which causes these issues. For Oracle 12c, using the following version can cause synchronization failure when an offline attachment is added.
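The manual thread-dump inspection described above could also be approximated in code. The following is a minimal sketch, not part of the original article: the class name StuckThreadScan and its methods are hypothetical, and it assumes the check runs inside the same JVM as the affected node, because ThreadMXBean only sees threads of the current process. It lists the threads that have at least one stack frame in a given package, here the oracle.jdbc.driver package seen in the trace.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists live threads whose stack contains a frame from a given package. */
final class StuckThreadScan {

    /** Returns true if any frame's class name starts with the given prefix. */
    static boolean hasFrameFrom(StackTraceElement[] stack, String classPrefix) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().startsWith(classPrefix)) {
                return true;
            }
        }
        return false;
    }

    /** Names of all live threads in this JVM with at least one frame under classPrefix. */
    static List<String> threadsInside(String classPrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        // false, false: skip locked-monitor and synchronizer details to keep the dump cheap
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && hasFrameFrom(info.getStackTrace(), classPrefix)) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Default prefix is taken from the stack trace in this article; override via the first argument.
        String prefix = args.length > 0 ? args[0] : "oracle.jdbc.driver.";
        for (String name : threadsInside(prefix)) {
            System.out.println("Thread inside " + prefix + ": " + name);
        }
    }
}
```

A single scan only shows where threads are at that instant. Threads that appear under the same prefix across several scans taken a few seconds apart are the likelier candidates for being blocked.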


Resolution



Perform the following local change:
  1. Update the driver to Oracle 12.2.0.1.
  2. Add the validationQuery="select 1 from dual" parameter so that the connection pool validates connections with this query instead of invoking the Oracle driver's isValid method.
  3. Add the following parameters to avoid potential issues with firewalls:

    testOnBorrow=true,validationQueryTimeout=10,testWhileIdle=true,timeBetweenEvictionRunsMillis=30

Below is an example of the resource in prcontext.xml with these settings:

<Resource name="jdbc/PegaRULES"
          auth="Container"
          type="javax.sql.DataSource"
          driverClassName="oracle.jdbc.OracleDriver"
          url="jdbc:oracle:thin:@localhost:1521/xgc.pega"
          username=""
          password=""
          maxActive="100"
          maxIdle="30"
          maxWait="10000"
          validationQuery="select 1 from dual"
          testOnBorrow="true"
          validationQueryTimeout="10"
          testWhileIdle="true"
          timeBetweenEvictionRunsMillis="30"
/>
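
After the driver update in step 1, it can be useful to confirm which driver version the application actually loads. The following is a minimal sketch, not part of the original article: the class DriverVersionCheck and its methods are hypothetical, and it assumes the driver reports a purely numeric dotted version string such as 12.1.0.2.0 (any other format makes the comparison throw NumberFormatException). The version is read through the standard JDBC call DatabaseMetaData.getDriverVersion().

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper: checks whether a JDBC driver version meets a required minimum. */
final class DriverVersionCheck {

    /** Compares dotted numeric versions; missing trailing components count as 0. */
    static int compareVersions(String a, String b) {
        String[] left = a.trim().split("\\.");
        String[] right = b.trim().split("\\.");
        int length = Math.max(left.length, right.length);
        for (int i = 0; i < length; i++) {
            int l = i < left.length ? Integer.parseInt(left[i]) : 0;
            int r = i < right.length ? Integer.parseInt(right[i]) : 0;
            if (l != r) {
                return Integer.compare(l, r);
            }
        }
        return 0;
    }

    /** True when actual is the same as or newer than minimum. */
    static boolean isAtLeast(String actual, String minimum) {
        return compareVersions(actual, minimum) >= 0;
    }

    /** Reads the version string the driver reports for an open connection. */
    static String reportedVersion(Connection connection) throws SQLException {
        return connection.getMetaData().getDriverVersion();
    }

    public static void main(String[] args) {
        // Versions taken from this article: the affected driver and the recommended one.
        String affected = "12.1.0.2.0";
        String recommended = "12.2.0.1";
        System.out.println(affected + " meets " + recommended + ": " + isAtLeast(affected, recommended));
    }
}
```

In a running node, reportedVersion would be called with a connection borrowed from the jdbc/PegaRULES data source, and the result passed to isAtLeast together with 12.2.0.1.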

Published December 29, 2018 - Updated October 8, 2020
