Best practices for starting and stopping nodes in a Pega cluster
The best practice for maintaining high availability in a cluster is to dedicate a couple of nodes to offline processing, such as BIX and search indexing, and to exclude those nodes from end-user session load balancing. This method minimizes the impact of agent processing on production performance for end-user sessions. These nodes can serve as the Elasticsearch host nodes and should be isolated from restarts of the other production nodes. This configuration provides a stable environment because, for example, the Node IDs of the Elasticsearch host nodes rarely change, if at all.
If your environment is not set up in this way, follow the best practice procedures described in the rest of this article to obtain the best high availability possible.
Full system restart
Nodes should be shut down in the following order:
- Non-search nodes
- Secondary search nodes
- Primary search node
Shutting down the search nodes last minimizes the amount of time that search is unavailable.
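The shutdown order above can be sketched as a simple sort over a node inventory. This is a minimal illustration only: the Node record, the role labels, and the node names are hypothetical and are not Pega identifiers or APIs.

```python
from dataclasses import dataclass

# Hypothetical role labels (not Pega identifiers). Lower rank stops earlier.
SHUTDOWN_RANK = {"non-search": 0, "secondary-search": 1, "primary-search": 2}


@dataclass(frozen=True)
class Node:
    name: str
    role: str  # must be one of the keys in SHUTDOWN_RANK


def shutdown_order(nodes):
    """Return nodes ordered so that search nodes stop last, primary last of all."""
    # sorted() is stable, so nodes that share a role keep their input order.
    return sorted(nodes, key=lambda n: SHUTDOWN_RANK[n.role])


# Example inventory with made-up node names.
cluster = [
    Node("search-1", "primary-search"),
    Node("web-1", "non-search"),
    Node("search-2", "secondary-search"),
    Node("web-2", "non-search"),
]

print([n.name for n in shutdown_order(cluster)])
```

Reversing the rank values would give the corresponding startup order, because startup is the mirror image of shutdown.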
Nodes should be started in the following order:
- Primary search node
- Secondary search nodes, after the primary node has started
- Remaining nodes
Starting nodes in this order ensures that search remains accessible to all nodes during cluster startup. It also ensures that every node started after the search nodes discovers the search node or nodes immediately during its own initialization.
If a node is started before a search node is available, its startup is delayed because its ping request to the search node must time out before initialization can continue.
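The startup sequence can be sketched as a gated loop: start the primary search node, wait until it reports ready, and only then start the secondary search nodes and the remaining nodes. The start and is_ready callables below are hypothetical hooks that you would supply for your own environment (for example, a wrapper around your application server's start script and a health check). They are assumptions for illustration, not Pega or Elasticsearch APIs, and the timeout values are arbitrary.

```python
import time


def wait_until_ready(is_ready, timeout_s=60.0, poll_s=1.0):
    """Poll is_ready() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(poll_s)
    return is_ready()  # one final check at the deadline


def start_cluster(primary, secondaries, others, start, is_ready,
                  timeout_s=60.0, poll_s=1.0):
    """Start the primary search node, wait for it, then start everything else.

    start(node) and is_ready(node) are caller-supplied, hypothetical hooks.
    Returns the node names in the order they were started.
    """
    started = []
    start(primary)
    started.append(primary)
    # Gate on the primary so later nodes do not wait on a ping timeout.
    if not wait_until_ready(lambda: is_ready(primary), timeout_s, poll_s):
        raise TimeoutError(f"primary search node {primary!r} did not become ready")
    # Secondary search nodes come next, followed by the remaining nodes.
    for node in list(secondaries) + list(others):
        start(node)
        started.append(node)
    return started


# In-memory stand-ins for real start and health-check hooks.
up = set()
order = start_cluster("search-1", ["search-2"], ["web-1", "web-2"],
                      start=up.add, is_ready=lambda n: n in up,
                      timeout_s=5.0, poll_s=0.1)
print(order)
```

This sketch gates only on the primary search node. Waiting for each secondary search node before starting the remaining nodes would be a more conservative variant.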
Single node restart
A single node can be restarted on its own unless it is the only search node. If it is the only search node, stop the search node first, and then stop all other nodes. Then start the search node first, followed by the rest of the nodes.
For high availability, configure more than one node as a search node. This configuration eliminates the need for a full restart when a search node must be restarted, as long as at least one configured search node remains active; that is, not all search nodes are quiesced at the same time.
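The decision described in this section can be sketched as two small helpers: one that returns the restart steps for a given node, and one that checks whether search stays available for a given set of quiesced nodes. The function names and node names are hypothetical, and the helpers only model the rules stated above; they do not call any Pega API.

```python
def restart_plan(target, search_nodes, all_nodes):
    """Return ordered (action, node) steps for restarting target.

    If target is the only search node, every node is stopped and started,
    search node first in both phases, as described in this section.
    """
    if target not in all_nodes:
        raise ValueError(f"unknown node: {target!r}")
    only_search_node = set(search_nodes) == {target}
    if not only_search_node:
        # Any other node can simply be restarted on its own.
        return [("stop", target), ("start", target)]
    others = [n for n in all_nodes if n != target]
    steps = [("stop", target)] + [("stop", n) for n in others]
    steps += [("start", target)] + [("start", n) for n in others]
    return steps


def search_stays_available(search_nodes, quiesced):
    """True if at least one configured search node is not quiesced."""
    return any(n not in quiesced for n in search_nodes)


nodes = ["search-1", "web-1", "web-2"]
print(restart_plan("web-1", ["search-1"], nodes))
print(restart_plan("search-1", ["search-1"], nodes))
print(search_stays_available(["search-1", "search-2"], {"search-1"}))
```

With two or more search nodes configured, restart_plan returns the short two-step plan for any single node, which is the high-availability benefit described above.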