Support Article

Multinode PRPC system fails to create a cluster

SA-10267

Summary

In a multinode PRPC system, the nodes take a long time to start up and fail to form a cluster.

Error Messages

[3/19/15 12:52:42:063 CDT] 00000082 SystemOut O 2015-03-19 12:52:42,063 [ HOST] [ STANDARD] [ ] ( external.async.IStartupTask) INFO - load Declarative Page Definition Cache ..done
[3/19/15 12:52:42:349 CDT] 00000082 SystemOut O 2015-03-19 12:52:42,349 [ HOST] [ STANDARD] [ ] (rnal.assembly.RulesAspectCache) WARN - Assembler was unable to be found for Rule-HC.
Pega-supplied assembler 'com.pega.nci.rules.AdjustmentMethod_Inclusions' could not be located for aspect Action.
Did you forget to import a jar file?
[3/19/15 12:52:43:243 CDT] 00000082 SystemOut O 2015-03-19 12:52:43,243 [ HOST] [ STANDARD] [ ] ( internal.mgmt.PRNodeImpl) INFO - Node Identification: "HOST aes_trn 2015-03-19 17:51:26.568 GMT"; Node Id: 29a2d6a58b744aa2aa707cd37768bd51
[3/19/15 12:52:44:295 CDT] 0000006f DefaultAddres I com.hazelcast.instance.DefaultAddressPicker Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [<IP>, <IP>]
[3/19/15 12:52:44:302 CDT] 0000006f DefaultAddres I com.hazelcast.instance.DefaultAddressPicker Prefer IPv4 stack is true.
[3/19/15 12:52:44:302 CDT] 0000006f DefaultAddres W com.hazelcast.instance.DefaultAddressPicker Could not find a matching address to start with! Picking one of non-loopback addresses.
[3/19/15 12:52:44:306 CDT] 0000006f DefaultAddres I com.hazelcast.instance.DefaultAddressPicker Picked Address[<IP>]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
[3/19/15 12:52:44:807 CDT] 0000006f system I com.hazelcast.system [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Hazelcast Community Edition 3.2 (20140321) starting at Address[<IP>]:5701
[3/19/15 12:52:44:808 CDT] 0000006f system I com.hazelcast.system [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Copyright (C) 2008-2014 Hazelcast
[3/19/15 12:52:44:811 CDT] 0000006f Node I com.hazelcast.instance.Node [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Creating TcpIpJoiner
[3/19/15 12:52:44:814 CDT] 0000006f LifecycleServ I com.hazelcast.core.LifecycleService [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Address[<IP>]:5701 is STARTING
[3/19/15 12:52:45:010 CDT] 0000006f TcpIpJoiner I com.hazelcast.cluster.TcpIpJoiner [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Connecting to possible member: Address[<IP>]:5701
[3/19/15 12:52:45:020 CDT] 0000006f TcpIpJoiner I com.hazelcast.cluster.TcpIpJoiner [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Connecting to possible member: Address[<IP>]:5702
[3/19/15 12:52:45:020 CDT] 00000098 SocketConnect I com.hazelcast.nio.SocketConnector [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Connecting to /<IP>:5701, timeout: 0, bind-any: true
[3/19/15 12:52:45:021 CDT] 00000099 SocketConnect I com.hazelcast.nio.SocketConnector [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Connecting to /<IP>:5702, timeout: 0, bind-any: true
[3/19/15 12:52:45:067 CDT] 00000097 SocketAccepto I com.hazelcast.nio.SocketAcceptor [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Accepting socket connection from /<IP>:37696
[3/19/15 12:52:45:082 CDT] 00000098 TcpIpConnecti I com.hazelcast.nio.TcpIpConnectionManager [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] 40193 accepted socket connection from /<IP>:5701
[3/19/15 12:52:45:082 CDT] 00000099 TcpIpConnecti I com.hazelcast.nio.TcpIpConnectionManager [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] 40841 accepted socket connection from /<IP>:5702
[3/19/15 12:52:45:082 CDT] 00000097 TcpIpConnecti I com.hazelcast.nio.TcpIpConnectionManager [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] 5701 accepted socket connection from /<IP>:37696
[3/19/15 12:53:22:043 CDT] 0000006f TcpIpJoiner W com.hazelcast.cluster.TcpIpJoiner [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Couldn't join to the master : Address[<IP>]:5702
[3/19/15 12:53:22:044 CDT] 0000006f TcpIpJoiner W com.hazelcast.cluster.TcpIpJoiner [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Failed to connect, node joined= false, allConnected= false to all other members after 0 seconds.
[3/19/15 12:53:22:044 CDT] 0000006f TcpIpJoiner W com.hazelcast.cluster.TcpIpJoiner [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] Rebooting after 10 seconds.

[3/19/15 12:59:48:014 CDT] 00000099 BasicInvocati W com.hazelcast.spi.impl.BasicInvocation [<IP>]:5701 [109ee35d0bbf6aad53fa33b9a3fcf0bc] [3.2] While asking 'is-executing': InvocationFuture{invocation=BasicInvocation{ serviceName='hz:core:clusterService', op=com.hazelcast.cluster.JoinCheckOperation@4c10549c, partitionId=-1, replicaIndex=0, tryCount=1, tryPauseMillis=500, invokeCount=1, callTimeout=60000, target=Address[<IP>]:5702}, done=false}
java.util.concurrent.ExecutionException: com.hazelcast.spi.exception.TargetNotMemberException: Not Member! target:Address[<IP>]:5702, partitionId: -1, operation: com.hazelcast.spi.impl.BasicInvocation$IsStillExecuting, service: hz:core:clusterService
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.resolveResponseOrThrowException(BasicInvocation.java:792)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:696)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.isOperationExecuting(BasicInvocation.java:875)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.waitForResponse(BasicInvocation.java:753)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:695)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:674)
at com.hazelcast.cluster.ClusterServiceImpl.checkJoinInfo(ClusterServiceImpl.java:206)
at com.hazelcast.cluster.TcpIpJoiner.searchForOtherClusters(TcpIpJoiner.java:468)
at com.hazelcast.cluster.SplitBrainHandler.searchForOtherClusters(SplitBrainHandler.java:47)
at com.hazelcast.cluster.SplitBrainHandler.run(SplitBrainHandler.java:37)
at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:186)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1177)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.lang.Thread.run(Thread.java:795)
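
When the server log is long, it can help to pull out only the join failures. The following is a minimal sketch in Python, assuming the warning text matches the Hazelcast 3.2 TcpIpJoiner message shown in the excerpt above; other Hazelcast versions may word it differently. The function name find_join_failures is hypothetical and is not part of Pega or Hazelcast.

```python
import re

# Pattern taken from the TcpIpJoiner warning in the log excerpt above.
# Assumption: the message wording is the same in your Hazelcast version.
JOIN_FAILURE = re.compile(
    r"Couldn't join to the master : Address\[(?P<host>[^\]]*)\]:(?P<port>\d+)"
)


def find_join_failures(lines):
    """Return (host, port) for every 'Couldn't join to the master' warning."""
    failures = []
    for line in lines:
        match = JOIN_FAILURE.search(line)
        if match:
            failures.append((match.group("host"), int(match.group("port"))))
    return failures
```

Passing the lines of the server log to find_join_failures returns the master address and port that each failed join attempt targeted, which indicates which node and port to check first.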

Steps to Reproduce

1. Stop both nodes.
2. Restart the nodes one after another.

Root Cause

The root cause of this problem is a defect or misconfiguration in the operating environment.
The cluster is configured with one node in the DMZ. The network infrastructure blocked communication between the nodes, which caused many repeated attempts to communicate.

Resolution

Open the firewall so that the nodes can communicate with each other over the configured Hazelcast ports. In the log above, the nodes use ports 5701 and 5702.
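
To confirm that the firewall change took effect, you can test TCP reachability of the Hazelcast ports from each node to its peers. The following is a minimal sketch in Python using only the standard library. The function names can_connect and check_members are hypothetical, and the hosts and ports you pass in are assumptions that must match your own cluster configuration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


def check_members(hosts, ports, timeout=3.0):
    """Map each (host, port) pair to whether it is reachable from this machine."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

For example, run check_members with the address of the peer node and the ports 5701 and 5702 on each node in turn. A False result for a port that the peer is listening on suggests that the firewall is still blocking traffic in that direction. A successful TCP connection shows only that the port is reachable; it does not prove that the node joined the cluster.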

Tags:

Pega Platform

Pega Platform 7.1.7

Case Management

Published June 12, 2015 - Updated October 8, 2020
