Modifying Cassandra node routing policies
Maintain high performance and short write times by changing the default node routing policies that limit Cassandra-to-Cassandra network activity.
By default, when Pega Platform connects to Cassandra, the DataStax token aware policy routes requests to Cassandra nodes. The goal of that policy is to always route requests to nodes that hold the requested data, which reduces the amount of Cassandra-to-Cassandra network activity through the following actions:
- Calculating the token for the request by computing a Murmur3 hash of the partition key for the requested or written data.
- Determining the list of potential nodes to which to send the request by collecting the nodes whose token range contains the calculated token.
- Choosing one of the nodes in the list to which to send the request, with the local data center as the priority.
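The three actions above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DataStax driver code: the ring layout, node names, and data center names are hypothetical, the replica selection is simplified to "the next nodes on the ring", and an MD5-based stand-in replaces the Murmur3 hash so that the sketch needs only the standard library.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Stand-in token function (NOT Murmur3): maps a key to a signed 64-bit token."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


# Hypothetical ring, sorted by token: (last token the node owns, node name, data center).
RING = [
    (-6_000_000_000_000_000_000, "node-a", "dc1"),
    (-2_000_000_000_000_000_000, "node-b", "dc2"),
    (2_000_000_000_000_000_000, "node-c", "dc1"),
    (6_000_000_000_000_000_000, "node-d", "dc2"),
]


def replicas_for(token, ring=RING, replication_factor=2):
    """Return (node, data center) pairs whose token range contains the token.

    Assumption: replicas are the owning node plus the next nodes on the ring.
    """
    tokens = [t for t, _, _ in ring]
    # The owner is the first node whose token is >= the request token (wrapping around).
    start = bisect.bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1:] for i in range(replication_factor)]


def choose_node(partition_key, local_dc="dc1"):
    """Pick one replica for the request, giving priority to the local data center."""
    candidates = replicas_for(token_for(partition_key))
    local = [node for node, dc in candidates if dc == local_dc]
    return local[0] if local else candidates[0][0]
```

In this sketch, `choose_node("customer-42")` returns a replica in `dc1` whenever one holds the key, and falls back to a remote replica only when no local replica exists.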
Enable the token range partitioner by setting the prconfig/dnode/dds_partitioner_class/default dynamic system setting to com.pega.dsm.dnode.impl.dataset.cassandra.TokenRangePartitioner. When the DDS data set browse operation is part of a data flow, the DDS data set breaks up the retrieved data into chunks so that the chunk requests can be spread across the batch data flow nodes. By default, these chunks are defined as evenly split token ranges that do not take into account where the data resides. In a large cluster, a single token range might require data from multiple nodes. By configuring this dynamic system setting, you ensure that no chunk range query requires data from more than one Cassandra node.
Enable the extended token aware policy by setting the prconfig/dnode/cassandra_use_extended_token_aware_policy/default dynamic system setting to true. When a Cassandra range query runs, the extended token aware policy selects a token from the token range to determine the Cassandra node to which to send the request. This approach is effective when the token range partitioner is configured.
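The difference between evenly split chunks and node-aligned chunks, and the way a range query can be routed by a token taken from its range, can be illustrated with a small Python sketch. It does not reproduce the Pega TokenRangePartitioner or the extended token aware policy: the three-node ring is hypothetical, replication is ignored, and using the end token of the range as the routing token is an assumption made for the example.

```python
import bisect

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

# Hypothetical ring, sorted by token: (last token the node owns, node name).
RING = [(-4 * 10**18, "node-a"), (10**18, "node-b"), (MAX_TOKEN, "node-c")]


def owner(token, ring=RING):
    """Node whose token range contains the given token."""
    tokens = [t for t, _ in ring]
    return ring[bisect.bisect_left(tokens, token) % len(ring)][1]


def even_chunks(count):
    """Equal-width (start, end] ranges that ignore where the data resides."""
    width = (MAX_TOKEN - MIN_TOKEN) // count
    bounds = [MIN_TOKEN + i * width for i in range(count)] + [MAX_TOKEN]
    return list(zip(bounds[:-1], bounds[1:]))


def node_aligned_chunks(ring=RING):
    """(start, end] ranges cut at ownership boundaries: one owner per chunk."""
    chunks, start = [], MIN_TOKEN
    for end, _ in ring:
        chunks.append((start, end))
        start = end
    return chunks


def owners_of(chunk, ring=RING):
    """All nodes that own at least one token inside the (start, end] chunk."""
    start, end = chunk
    owners, prev = set(), MIN_TOKEN
    for last, node in ring:
        if prev < end and last > start:
            owners.add(node)
        prev = last
    return owners


def route_range(chunk):
    """Assumption: route the range query by the end token of the chunk."""
    return owner(chunk[1])
```

With this ring, `even_chunks(2)` produces chunks that each touch two nodes, while every chunk from `node_aligned_chunks()` has a single owner, so `route_range` sends each range query to the one node that holds all of its data.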
Enable the additional latency aware routing policy by setting the prconfig/dnode/cassandra_latency_aware_policy/default dynamic system setting to true. In Cassandra clusters, individual node performance might vary significantly because of internal operations on the node (for example, repair or compaction). The latency aware routing policy is an additional DataStax client mechanism that can be loaded on top of the token aware policy to route queries away from slower nodes.
To configure the additional latency aware routing policy parameters, set the following dynamic system settings. For more information, see the Apache Cassandra documentation.
Specify when the policy excludes a slow node from queries by setting the prconfig/dnode/cassandra_latency_aware_policy/exclusion_threshold/default dynamic system setting to a number that represents how many times slower than the fastest node a node must be to get excluded. For example, if you set the exclusion threshold to 3, the policy excludes the nodes that are more than 3 times slower than the fastest node.
Specify how the weight of older latencies decreases over time by setting the prconfig/dnode/cassandra_latency_aware_policy/scale/default dynamic system setting to a number of milliseconds.
Specify how long the policy can exclude a node before retrying a query by setting the prconfig/dnode/cassandra_latency_aware_policy/retry_period/default dynamic system setting to a number of seconds.
Specify how often the minimum average latency is recomputed by setting the prconfig/dnode/cassandra_latency_aware_policy/update_rate/default dynamic system setting to a number of milliseconds.
Specify the minimum number of measurements per host to consider for the latency aware policy by setting the prconfig/dnode/cassandra_latency_aware_policy/min_measure/default dynamic system setting.
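As a rough illustration of how the exclusion threshold and the minimum number of measurements interact, consider the following Python sketch. It is not the DataStax policy implementation: the `NodeStats` type, the node names, and the latency figures are hypothetical, and the scale, retry period, and update rate settings are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    average_latency_ms: float  # current average latency observed for the node
    measurements: int          # number of latency measurements collected so far


def nodes_to_query(stats, *, exclusion_threshold, min_measure):
    """Return the nodes that remain eligible for queries, sorted by name.

    Assumption: nodes with fewer than min_measure measurements are neither
    excluded nor used to determine the fastest node.
    """
    measured = {n: s for n, s in stats.items() if s.measurements >= min_measure}
    if not measured:
        return sorted(stats)
    fastest = min(s.average_latency_ms for s in measured.values())
    excluded = {
        n for n, s in measured.items()
        if s.average_latency_ms > exclusion_threshold * fastest
    }
    return sorted(n for n in stats if n not in excluded)


# Hypothetical measurements for four nodes.
example_stats = {
    "node-a": NodeStats(average_latency_ms=10, measurements=100),
    "node-b": NodeStats(average_latency_ms=25, measurements=100),
    "node-c": NodeStats(average_latency_ms=40, measurements=100),
    "node-d": NodeStats(average_latency_ms=500, measurements=3),
}

eligible = nodes_to_query(example_stats, exclusion_threshold=3, min_measure=50)
# eligible == ['node-a', 'node-b', 'node-d']
```

With an exclusion threshold of 3, node-c is excluded because its average latency is more than 3 times that of node-a, while node-d stays eligible because it does not yet have enough measurements to be judged.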
- Configuring multiple data centers
Ensure the continuity of your online services by adding a secondary Cassandra data center.
- Configuring the Cassandra cluster
Pega Platform comes with an internal Cassandra cluster to which you can connect through a Decision Data Store data set. Before connecting to the cluster through Pega Platform, perform the following steps to achieve optimal performance and data consistency across the nodes in the cluster.
- Creating a dynamic system setting
Add a dynamic system settings rule to change default system behavior.
- Configuring dynamic system settings
As a best practice, set system configuration settings by using dynamic system settings data instances. For example, you can use dynamic system settings to configure which fields are available in full-text search. Dynamic system settings are stored in the Pega Platform database and are used by all nodes that share that database.