Best practices for disk space management


You can maintain the high performance of decision services in your application by following best practices for allocating disk space to the Decision Data Store (DDS) nodes.

  • For DDS nodes, perform the following actions:

    • Assign a maximum of 1 TB of DDS data per Cassandra node, with a maximum of 100 GB per node for a single table.
      To avoid very long compaction procedures and the resulting build-up of SSTables, you can configure the compaction settings for SSTables. For more information, see Configuring compaction settings for SSTables.
    • Facilitate compaction by ensuring at least 2 TB of disk space.

      Use an HDD with a maximum capacity of 1 TB, or an SSD with a maximum capacity of between 2 and 5 TB.

      To avoid issues when compacting the largest SSTables, ensure that the disk space that you provide for Cassandra is at least double the size of the data stored in your Cassandra cluster. A single DDS node running out of disk space does not affect service availability, but it might degrade performance and eventually cause the node to fail. For more information, see Sizing a Cassandra cluster.
    • Ensure that all DDS nodes have the same disk capacity.
    • Store the commit log and caches on separate disks by configuring the following properties: dnode/yaml/commitlog_directory and dnode/yaml/saved_caches_directory.
  • For application data, perform the following actions:

    • Avoid distributing data unequally across nodes by limiting the size of a single data record to less than 100 MB.

      For DDS data sets, when the size of a data record exceeds the threshold, Pega Platform triggers the PEGA0079 alert. For more information, see PEGA0079 alert.

    • Avoid splitting records across nodes by writing short rows.

      For example, do not write to a table as a ping test by using the same partition key repeatedly.

  • Monitor the available disk space on a regular basis.

    For more information, see Verifying the available disk space.
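
The sizing guidelines above can be checked with a short script. The following Python sketch is an illustration only and is not part of Pega Platform: the function names are hypothetical, the limits are the figures quoted in this article, decimal units (1 TB = 10^12 bytes) are assumed, and the "double the size" rule is applied per node, which is one possible reading of that guideline.

```python
import shutil

GB = 1000 ** 3  # decimal units are an assumption
TB = 1000 ** 4

MAX_DATA_PER_NODE = 1 * TB      # maximum DDS data per Cassandra node
MAX_TABLE_PER_NODE = 100 * GB   # maximum per node for a single table
HEADROOM_FACTOR = 2             # disk space at least double the data size


def check_node_sizing(data_bytes, largest_table_bytes, disk_bytes):
    """Return a list of guideline violations for one node (empty if none)."""
    problems = []
    if data_bytes > MAX_DATA_PER_NODE:
        problems.append("DDS data on the node exceeds 1 TB")
    if largest_table_bytes > MAX_TABLE_PER_NODE:
        problems.append("a single table exceeds 100 GB on the node")
    if disk_bytes < HEADROOM_FACTOR * data_bytes:
        problems.append("disk space is less than double the data size")
    return problems


def disk_total_and_free(path):
    """Return (total, free) bytes for the file system that contains path."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.free


if __name__ == "__main__":
    # Hypothetical figures for one node: 600 GB of data, 80 GB largest table.
    # "." stands in for the Cassandra data directory, which is site-specific.
    total, free = disk_total_and_free(".")
    for problem in check_node_sizing(600 * GB, 80 * GB, total):
        print("WARNING:", problem)
    print(f"free disk space: {free / GB:.1f} GB")
```

Running a check like this on a schedule on each DDS node is one way to monitor disk space regularly; the data and table sizes would have to come from your own monitoring tools.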

  • Configuring compaction settings for SSTables

    Maintain the good health of the Cassandra cluster by tuning compaction throughput for write-intensive workloads.

  • Configuring Cassandra compression

    You can customize the compression settings for Cassandra SSTables to best suit your application's requirements. By using compression, you reduce the size of the data written to disk and increase read and write throughput.

  • Managing decision management nodes

    Manage the decision management nodes in your application by running certain actions for them, for example, repair or clean-up.

  • Configuring the Cassandra cluster

    Pega Platform comes with an internal Cassandra cluster to which you can connect through a Decision Data Store data set. Before connecting to the cluster through Pega Platform, perform the following steps to achieve optimal performance and data consistency across the nodes in the cluster.
