Support Article

RequestorLockException and multiple thread dumps, Hazelcast lock

SA-27255

Summary



Users cannot log in to the application on a particular node. Many thread dumps are generated on that node because request threads are waiting on a com.hazelcast.spi.impl.BasicInvocationFuture lock.

Error Messages



2016-08-08 08:30:01,539 [ WebContainer : 5] [ ] [ ] [ ] (ngineinterface.service.HttpAPI) ERROR : com.pega.pegarules.pub.context.RequestorLockException 
com.pega.pegarules.pub.context.RequestorLockException: Unable to synchronize on requestor H30CC6091A0E70A7DABD35C7727110DD9 within 120 seconds: (thisThread = WebContainer : 5) (originally locked by = WebContainer : 3) (finally locked by = WebContainer : 3) 

"WebContainer : 3" Id=652 in TIMED_WAITING on lock=com.hazelcast.spi.impl.BasicInvocationFuture@b306ee51 (running in native) 
BlockedCount : 1, BlockedTime : -1, WaitedCount : 35919, WaitedTime : -1 
at java.lang.Object.wait(Native Method) 
at java.lang.Object.wait(Object.java:196) 
at com.hazelcast.spi.impl.BasicInvocationFuture.pollResponse(BasicInvocationFuture.java:265) 
- locked com.hazelcast.spi.impl.BasicInvocationFuture@b306ee51 
at com.hazelcast.spi.impl.BasicInvocationFuture.waitForResponse(BasicInvocationFuture.java:216) 
at com.hazelcast.spi.impl.BasicInvocationFuture.get(BasicInvocationFuture.java:193) 
at com.hazelcast.spi.impl.BasicInvocationFuture.get(BasicInvocationFuture.java:173) 
at com.hazelcast.spi.impl.BasicInvocationFuture.getSafely(BasicInvocationFuture.java:185) 
at com.hazelcast.util.FutureUtil.retrieveValue(FutureUtil.java:301) 
at com.hazelcast.util.FutureUtil.executeWithDeadline(FutureUtil.java:289) 
at com.hazelcast.util.FutureUtil.waitWithDeadline(FutureUtil.java:278) 
at com.hazelcast.util.FutureUtil.waitWithDeadline(FutureUtil.java:252) 
at com.hazelcast.spi.impl.EventServiceImpl.invokeRegistrationOnOtherNodes(EventServiceImpl.java:208) 
at com.hazelcast.spi.impl.EventServiceImpl.registerListenerInternal(EventServiceImpl.java:159) 
at com.hazelcast.spi.impl.EventServiceImpl.registerListener(EventServiceImpl.java:143) 
at com.hazelcast.map.impl.AbstractMapServiceContextSupport.addEventListener(AbstractMapServiceContextSupport.java:196) 
at com.hazelcast.map.impl.DefaultMapServiceContext.addEventListener(DefaultMapServiceContext.java:28) 
at com.hazelcast.map.impl.proxy.MapProxySupport.addEntryListenerInternal(MapProxySupport.java:901) 
at com.hazelcast.map.impl.proxy.MapProxyImpl.addEntryListener(MapProxyImpl.java:430) 
at com.pega.pegarules.cluster.internal.PRHazelcastDistributedMapImpl.<init>(PRHazelcastDistributedMapImpl.java:84) 
at com.pega.pegarules.cluster.internal.PRHazelcastDistributedObjectManagerImpl.getDistributedMap(PRHazelcastDistributedObjectManagerImpl.java:107) 
at com.pega.pegarules.exec.internal.async.ClusterSubscriptionsManager.getSubscriberNodesMap(ClusterSubscriptionsManager.java:86) 
at com.pega.pegarules.exec.internal.async.ClusterSubscriptionsManager.lockAndGetNodes(ClusterSubscriptionsManager.java:65) 
at com.pega.pegarules.exec.internal.async.ClusterSubscriptionsManager.subscribeNodeForChannel(ClusterSubscriptionsManager.java:54) 
at com.pega.pegarules.exec.internal.async.PRHazelcastPublishServiceImpl.addSubscription(PRHazelcastPublishServiceImpl.java:37) 

"hz._hzInstance_1_83e67edb8c7e06ffb34406fb73d423d7.generic-operation.thread-0" Id=156 in WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3da88849 (running in native) 
BlockedCount : 0, BlockedTime : -1, WaitedCount : 2360, WaitedTime : -1 
at sun.misc.Unsafe.park(Native Method) 
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:182) 
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1998) 
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:410) 
at com.hazelcast.spi.impl.BasicOperationScheduler$OperationThread.doRun(BasicOperationScheduler.java:445) 
at com.hazelcast.spi.impl.BasicOperationScheduler$OperationThread.run(BasicOperationScheduler.java:432)
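
When many thread dumps are collected, it can help to list which threads are waiting on the Hazelcast future. The following is a minimal sketch and not Pega or Hazelcast tooling: the class name HazelcastWaiters and the method threadsWaitingOn are hypothetical, and the code assumes that each thread header line uses the format shown in the dump above ("name" Id=N in STATE on lock=...).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HazelcastWaiters {
    // Assumed header format: "WebContainer : 3" Id=652 in TIMED_WAITING on lock=some.Class@hash
    private static final Pattern HEADER =
        Pattern.compile("^\"([^\"]+)\" Id=(\\d+) in (\\S+) on lock=(\\S+)");

    /** Returns the names of threads waiting on a lock whose name starts with lockPrefix. */
    public static List<String> threadsWaitingOn(String dump, String lockPrefix) {
        List<String> names = new ArrayList<>();
        for (String line : dump.split("\\R")) {
            Matcher m = HEADER.matcher(line.trim());
            if (m.find() && m.group(4).startsWith(lockPrefix)) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a saved thread dump file (hypothetical usage).
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        for (String name : threadsWaitingOn(dump, "com.hazelcast.spi.impl.BasicInvocationFuture")) {
            System.out.println(name);
        }
    }
}
```

Run against the excerpt above, this would report the WebContainer : 3 thread and skip the Hazelcast generic-operation thread, which is parked on a different lock.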


Steps to Reproduce



Not Applicable


Root Cause



A defect in Pegasystems’ code or rules, combined with a known defect in all versions of Hazelcast.
A clock drift issue causes one of the nodes to become inactive.

Resolution

Make the following local change to increase the Hazelcast operation call timeout.
For example, add this JVM argument:


-Dhazelcast.operation.call.timeout.millis=120000
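
To confirm that a JVM actually received the setting, the system property can be read back. This is a minimal sketch: the class name HazelcastTimeoutCheck and the method parseTimeoutMillis are hypothetical, and the check reflects only the JVM in which it runs, so it must be started with the same JVM arguments as the application server to be meaningful.

```java
public class HazelcastTimeoutCheck {
    // Property name taken from the JVM argument above.
    static final String PROPERTY = "hazelcast.operation.call.timeout.millis";

    /** Parses the property value; returns -1 if it is unset or not a number. */
    public static long parseTimeoutMillis(String raw) {
        if (raw == null) {
            return -1L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        long timeout = parseTimeoutMillis(System.getProperty(PROPERTY));
        if (timeout < 0) {
            System.out.println(PROPERTY + " is not set (or not numeric) on this JVM");
        } else {
            System.out.println(PROPERTY + " = " + timeout + " ms");
        }
    }
}
```

For example, it could be started as: java -Dhazelcast.operation.call.timeout.millis=120000 HazelcastTimeoutCheck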

Published August 26, 2016 - Updated October 21, 2016
