
PEGA0126 alert: Service registry is entering safe mode

The PEGA0126 alert triggers when a Pega node cannot update its heartbeat in the service registry database for longer than the heartbeat timeout (90 seconds by default). The service registry then enters safe mode: the node stops all cluster-related activity and declares itself unhealthy. Safe mode guarantees consistency across all Pega nodes and prevents split-brain syndrome, which occurs when the nodes in a cluster lose awareness of one another and separate into multiple clusters.

The heartbeat is a process in which each Pega node updates its last seen time parameter in the service registry database every 30 seconds to indicate that the node is still running and available. The service registry is a mechanism for discovery and coordination of distributed components in a Pega cluster. For more information, see About service registry.
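The timing relationship described above can be sketched in a few lines of Python. This is an illustrative model only, not Pega Platform code: the function names, the constant names, and the strict "greater than" comparison are assumptions, and only the 30-second interval and the 90-second default timeout come from the text.

```python
# Illustrative model only; this is not Pega Platform code.
HEARTBEAT_INTERVAL_S = 30  # seconds between heartbeat updates (from the text)
HEARTBEAT_TIMEOUT_S = 90   # default heartbeat timeout in seconds (from the text)


def should_enter_safe_mode(last_success_s: float, now_s: float,
                           timeout_s: float = HEARTBEAT_TIMEOUT_S) -> bool:
    """Return True when the last successful heartbeat is older than the timeout.

    Assumption: the boundary is exclusive ("for longer than the timeout").
    """
    return (now_s - last_success_s) > timeout_s


def missed_heartbeats(last_success_s: float, now_s: float,
                      interval_s: float = HEARTBEAT_INTERVAL_S) -> int:
    """Count full heartbeat intervals elapsed since the last successful update."""
    return max(0, int((now_s - last_success_s) // interval_s))
```

With the default values, the timeout spans three heartbeat intervals, so in this model a single slow update does not by itself put the node into safe mode.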

Reason for the alert

PEGA0126 alerts are always preceded by one or more PEGA0125 alerts and are caused indirectly by the issues behind those alerts, such as a slow or unavailable database, or another process that blocks the heartbeat.

Alert text

Service registry is entering safe mode

Logs to recognize the alert

In the PegaRULES.log file, the logs relevant to the PEGA0126 alert contain the following entry:
Too many failed heartbeats... entering safe mode on sessions
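To locate this entry in a large log file, a small script along the following lines may help. It is a minimal sketch: the function name and the constant name are hypothetical, and it matches only the two fixed fragments of the entry quoted above, because the full line format (timestamp, thread, and the text elided by "...") is not shown here.

```python
from typing import Iterable, List, Tuple

# Fixed fragments of the log entry quoted above. The text between them is
# elided in the source, so the fragments are matched separately.
SAFE_MODE_MARKERS = ("Too many failed heartbeats", "entering safe mode")


def find_safe_mode_entries(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that contain every marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if all(marker in line for marker in SAFE_MODE_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits


# Example usage (the file path is an assumption; adjust it to your environment):
#
#     with open("PegaRULES.log", encoding="utf-8", errors="replace") as log:
#         for number, line in find_safe_mode_entries(log):
#             print(f"{number}: {line}")
```

The line numbers in the result make it easier to scroll back to the preceding PEGA0125 entries and the heartbeat threads around each match.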

Recommendations

  1. Review the service registry heartbeat threads and any other related threads in the PegaRULES.log file.
    For more information, see Log files tool.
  2. Identify and resolve the issues behind the PEGA0125 alerts.
    For more information, see PEGA0125 alert: Service registry heartbeat failed.
  3. In on-premises environments, perform the following actions:
    1. Ping your Pega Platform™ instance and look for unhealthy nodes.
      For more information, see Verifying that an instance is running.
    2. Restart the unhealthy nodes.
    In Pega Cloud environments, a failed health check triggers an automatic restart of the unhealthy nodes that are in safe mode.