PEGA0090 alert: Hazelcast partition was lost
The PEGA0090 alert is generated when partition data is lost from a node that ungracefully shut down.
Hazelcast Partition Lost Listener
(HazelcastPartitionLostListener) WARN -Node [10.123.0.79]:5701 indicated that partition 189 was lost. The lost backup count is 0.
The message identifies the node that shut down, the partition that was lost, and the lost backup count.
The lost backup count indicates how many backup copies were lost in addition to the owner's copy:
- 0 means only the owning node's copy of the data was lost.
- 1 means the owning node and a backup were lost.
- 2 means the owner and two backups were lost, and so forth.
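As a rough illustration of reading these fields, the following Python sketch extracts the node, partition, and lost backup count from a log line. The regular expression is derived only from the sample message above, so it is an assumption that other releases use the same wording. The names `parse_partition_lost` and `describe_lost_copies` are hypothetical helpers, not part of any Pega or Hazelcast API.

```python
import re

# Pattern inferred from the sample message above (assumption: the wording
# of the alert text is the same in your release).
PARTITION_LOST = re.compile(
    r"Node \[(?P<host>[^\]]+)\]:(?P<port>\d+) indicated that partition "
    r"(?P<partition>\d+) was lost\. The lost backup count is (?P<count>\d+)\."
)


def parse_partition_lost(line):
    """Return the fields of a partition-lost message, or None if no match."""
    m = PARTITION_LOST.search(line)
    if m is None:
        return None
    return {
        "node": f"{m['host']}:{m['port']}",
        "partition": int(m["partition"]),
        "lost_backup_count": int(m["count"]),
    }


def describe_lost_copies(count):
    """Translate the lost backup count into the meaning listed above."""
    if count == 0:
        return "owner copy lost"
    noun = "backup" if count == 1 else "backups"
    return f"owner copy and {count} {noun} lost"


sample = (
    "(HazelcastPartitionLostListener) WARN -Node [10.123.0.79]:5701 "
    "indicated that partition 189 was lost. The lost backup count is 0."
)
event = parse_partition_lost(sample)
summary = describe_lost_copies(event["lost_backup_count"])
```

For the sample message, `event` holds node `10.123.0.79:5701`, partition `189`, and a lost backup count of `0`, which corresponds to the owner copy alone being lost.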
All maps have at least two backups, and some maps are replicated on all nodes. Because the data is distributed across the nodes, there is no definitive way to determine which node held which data.
Partition data was lost from a node that ungracefully shut down.
What it means
Hazelcast did not have enough time to migrate the data to another node before the ungraceful shutdown. In nearly all cases, this message is benign because all distributed maps have multiple backup copies stored across the cluster.
When this message appears, a backup copy becomes the new owner, a new backup copy is created, and the cluster repartitions the data to prevent data loss.
What to do
In most cases, you do not have to do anything. This message indicates the possibility of data loss as a result of a node ungracefully shutting down before the cluster could migrate data. The message is actionable only if multiple nodes have ungracefully shut down. The more nodes that are taken out of the cluster, the higher the likelihood that all copies of a partition have been lost, resulting in data loss.
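One way to apply this guidance to a batch of log lines is sketched below in Python. It flags a batch for review when more than one node reported lost partitions, or when a lost backup count reaches the minimum number of backups. The `triage` function, its `min_backups` parameter, and the review threshold are illustrative assumptions based on the statement that all maps have at least two backups. They are not a documented Pega or Hazelcast rule, and the message pattern is taken only from the sample message in this article.

```python
import re

# Pattern inferred from the sample message in this article (assumption).
PATTERN = re.compile(
    r"Node \[([^\]]+)\]:(\d+) indicated that partition (\d+) was lost\. "
    r"The lost backup count is (\d+)\."
)


def triage(lines, min_backups=2):
    """Summarize partition-lost messages and flag batches worth reviewing.

    min_backups is a hypothetical parameter: a lost backup count at or above
    it suggests the owner and every backup of a minimally backed-up map
    may be gone.
    """
    nodes = set()
    at_risk = set()
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # not a partition-lost message
        host, port, partition, count = m.groups()
        nodes.add(f"{host}:{port}")
        if int(count) >= min_backups:
            at_risk.add(int(partition))
    return {
        "nodes": sorted(nodes),
        "at_risk_partitions": sorted(at_risk),
        "needs_review": len(nodes) > 1 or bool(at_risk),
    }


single_node = triage([
    "(HazelcastPartitionLostListener) WARN -Node [10.123.0.79]:5701 "
    "indicated that partition 189 was lost. The lost backup count is 0.",
])
```

With only the sample message as input, `single_node["needs_review"]` is `False`, which matches the guidance that a single node shutting down ungracefully is normally benign.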