Support Article
System hangs or crashes with native OOM
SA-20852
Summary
A few hours after start-up, with no users logged in, the JVM hangs or crashes with a native out-of-memory (OOM) error.
Error Messages
Detail "java/lang/OutOfMemoryError" "Failed to create a thread: retVal -1073741830, errno 11" received
Steps to Reproduce
Not Applicable
Root Cause
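The Java core (thread dump) shows thousands of threads in the parked state, all with the call stack shown below. This indicates that PR_SYS_STATUSNODES contains entries for nodes that the JVM cannot reach during the Elasticsearch auto-discovery process. The connection attempts to those nodes queue up behind a lock, the threads accumulate, and the JVM eventually fails to create a new native thread.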
"PegaRULES-Search[[unicast_connect]][T#4]" J9VMThread:0x0000000008DD6B00, j9thread_t:0x00007F421087D240, java/lang/Thread:0x000000078A72A978, state:P, prio=5
(java/lang/Thread getId:0xEC4, isDaemon:true)
(native thread ID:0x3552, native priority:0x5, native policy:UNKNOWN, vmstate:P, vm thread flags:0x00020001)
(native stack address range from:0x00007F420D8A5000, to:0x00007F420D8E6000, size:0x41000)
CPU usage total: 0.000106099 secs
Parked on: java/util/concurrent/locks/[email protected] Owned by: "PegaRULES-Search[[unicast_connect]][T#2]" (J9VMThread:0x000000000554E800, java/lang/Thread:0x00000007860E16A0)
Heap bytes allocated since last GC cycle=0 (0x0)
Java callstack:
at sun/misc/Unsafe.park(Native Method)
at java/util/concurrent/locks/LockSupport.park(LockSupport.java:198(Compiled Code))
at java/util/concurrent/locks/AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:846(Compiled Code))
at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:879(Compiled Code))
at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1209(Compiled Code))
at java/util/concurrent/locks/ReentrantLock$NonfairSync.lock(ReentrantLock.java:230(Compiled Code))
at java/util/concurrent/locks/ReentrantLock.lock(ReentrantLock.java:306(Compiled Code))
at com/pega/elasticsearch/common/util/concurrent/KeyedLock.acquire(KeyedLock.java:65(Compiled Code))
at com/pega/elasticsearch/transport/netty/NettyTransport.connectToNode(NettyTransport.java:634(Compiled Code))
at com/pega/elasticsearch/transport/netty/NettyTransport.connectToNodeLight(NettyTransport.java:610(Compiled Code))
at com/pega/elasticsearch/transport/TransportService.connectToNodeLight(TransportService.java:133(Compiled Code))
at com/pega/elasticsearch/discovery/zen/ping/unicast/UnicastZenPing$3.run(UnicastZenPing.java:279(Compiled Code))
at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1177(Compiled Code))
at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642(Compiled Code))
at java/lang/Thread.run(Thread.java:857(Compiled Code))
No native callstack available for this thread
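To confirm the same pattern in another Java core, the thread headers can be tallied by pool name and state. The following is a minimal sketch rather than a supported tool: it assumes thread headers look like the one above (a quoted thread name followed by J9VMThread and state:), and the function name count_threads_by_pool is hypothetical.

```python
import re
from collections import Counter

# Matches a thread header such as:
#   "PegaRULES-Search[[unicast_connect]][T#4]" J9VMThread:0x..., ..., state:P, prio=5
# Assumption: the header layout matches the excerpt in this article.
THREAD_HEADER = re.compile(r'"(?P<name>[^"]+)"\s+J9VMThread:.*?\bstate:(?P<state>\w+)')

# Trailing per-thread counter, e.g. "[T#4]", removed so threads group by pool.
POOL_SUFFIX = re.compile(r'\[T#\d+\]$')


def count_threads_by_pool(javacore_text):
    """Return a Counter keyed by (pool name, thread state)."""
    counts = Counter()
    for line in javacore_text.splitlines():
        match = THREAD_HEADER.search(line)
        if match:
            pool = POOL_SUFFIX.sub("", match.group("name"))
            counts[(pool, match.group("state"))] += 1
    return counts
```

Calling count_threads_by_pool on the text of a Java core and then most_common(5) on the result lists the largest groups. A count in the thousands for the PegaRULES-Search[[unicast_connect]] pool in state P would match the symptom described here.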
Resolution
Perform the following local change:
- Delete all entries from PR_SYS_STATUSNODES (SQL> TRUNCATE TABLE PR_SYS_STATUSNODES;)
- Recycle all JVMs.
- When you log in again, you should see only one index host configuration, which comes from the dynamic system settings (DSS) indexing/hostid and indexing/explicitindexdir.
- If the DSS values above are incorrect, change the index host settings and submit.
- Note that at runtime, search settings are read from the PR_SYS_STATUSNODES table.
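Before clearing the table, it may be useful to record which of the listed nodes are actually unreachable. The sketch below is only an illustration: this article does not describe how hosts and ports are stored in PR_SYS_STATUSNODES, so the list of (host, port) pairs is assumed to be collected by hand, and the function name check_nodes is hypothetical.

```python
import socket
from typing import Dict, Iterable, Tuple


def check_nodes(nodes: Iterable[Tuple[str, int]],
                timeout: float = 2.0) -> Dict[Tuple[str, int], bool]:
    """Try a plain TCP connection to each (host, port) pair.

    Returns a dict mapping each pair to True if the connection succeeded
    and False otherwise. A TCP connect only shows that something is
    listening; it does not prove the listener is a search node.
    """
    results = {}
    for host, port in nodes:
        try:
            # create_connection raises OSError (refused, timed out,
            # unresolvable host name) when the node cannot be reached.
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = True
        except OSError:
            results[(host, port)] = False
    return results
```

Pass in the host names and search transport ports of the nodes you expect to be present. Any pair that maps to False is a candidate for the stale entries described under Root Cause.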
Published March 16, 2016 - Updated October 8, 2020