PegaRULES table partitioning
Large Pega 7 Platform deployments typically store and maintain sizable volumes of work data in the PegaRULES database schema. Partitioning is an effective technique for dividing large work tables and indexes into smaller and more manageable tablespaces. For instance, by partitioning a 100 GB table into one hundred 1 GB data partitions, the following benefits can be achieved:
- Each table spends less time in backup mode. If there is a failure during backup, partitioning reduces the amount of manual recovery or restores.
- Each partition can be reorganized independently of other partitions.
- Queries take less time to run against a partition than against the entire table.
- Unused partitioned data can be taken offline or archived on inexpensive storage subsystems.
The benefits of partitioning are often greatest when the partitioning strategy closely models underlying business processes. For example, assume that the data is accessed or modified according to region or country. Regional partitions enable your enterprise to take advantage of the different time zones of different regions. For example, Europe's night hours overlap U.S. business hours. The European partitions can be maintained while the U.S. partitions remain in full operation.
Basic partitioning methods, suggestions for which PegaRULES and CPM work tables to partition, and a list of considerations for using table partitions are described below.
The basic methods are:
Not all databases can implement all these partitioning methods. The databases that can implement the method are shown in parentheses.
Range (Oracle, DB2/zOS, DB2/LUW, MS SQL Server)
Range partitions are based on a range of column values. This method is useful for segmenting a table along a continuous domain, such as time. The scope of each partition is defined by a partition bound (non-inclusive). The key value commonly used to bound a partition is a date field.
For example, assume that a table called PEGASAMPLE has a field called ORDER_DATE. Also assume that there is five-year history in this table. To create partitions for every quarter, create 20 partitions in the table, one for each quarter. The first partition has the lowest bound and the last one has the highest bound. A query that seeks data for the first quarter of 2009 goes directly to this partition instead of going through all the data in the table.
List (Oracle, MS SQL Server)
List partitions are based on the list of values. This method is useful for segmenting a table along a discrete domain, such as product, country, employee, or region. Each partition in a list-partitioned table contains data that matches a list of discrete partition-key values.
For example, suppose that your database contains data for many different countries and has a large table named SALES_HISTORY by country. You can create list partitions that are based on regions such as europe, north_america, south_america, and asia. Rows for the country values defined in each partition are automatically placed in the appropriate regional partitions. Therefore, you could define the north_america partition to contain rows with VALUE US, Canada, and Mexico.
Hash partitions are based on a value that is derived from a hashing algorithm. This method is useful when there are no obvious partition-key values such as customer number, Social Security number (SSN), or product ID. It can also be used if range or list partitioning causes uneven distribution of data and requires manual rebalancing,
Composite partitions use the range method and, within each partition, subpartitions that are defined by using the hash or list methods. The composite method supports historical operations data at the partition level and parallelism (parallel DML), and data placement at the subpartition level. The partitions of a composite-partitioned table or index are logical structures only. Their data is stored in the segments of their subpartitions.
For example, the ORDERS table is range-partitioned on the ORDER_DATE key in four separate ranges that represent quarters of the year. Each range partition is further hash partitioned on the PRODUCTID key into eight sub-partitions, for a total of 32 subpartitions. Each tablespace contains one subpartition from each partition.
Like partitioned tables, partitioned indexes improve manageability, availability, performance, and scalability. They can either be partitioned independently (global indexes) or automatically linked to a table's partitioning method (local indexes).
Local indexes (Oracle, DB2/zOS, DB2/LUW, MS SQL Server)
Local indexes are separate indexes on each partition of the partitioned table. The local index is created and destroyed when the table partition is added or removed from the partitioned table. Local indexes are typically used when the key of the index is also part of the primary partition key.
Global partitioned Indexes (Oracle, DB2/LUW)
Global partitioned Indexes can be partitioned by using a different partitioning strategy from the main table and therefore Index lookups might be faster when the index key is not part of the partition table key.
Global non-partitioned indexes (Oracle, DB2/zOS, DB2/LUW)
The index structures are not partitioned. They are the same as ordinary indexes created on non-partitioned tables.
Work tables to partition
Partitioning is recommended for the following base PegaRULES and CPM work tables.
|Table name||Partition method||Key columns|
For table descriptions, see Database tables in the PegaRULES database.
|Table name||Partition method||Key columns||Description|
|PCA_INTENT_USAGE||N/A||Table is purged nightly by CaAggregateData CPM agent.||Intent usage data; used for adaptive suggestions.|
|PCA_INTENT_CONTEXT||N/A||Table is purged nightly by CaAggregateData CPM agent.||Demographic interaction snapshot data; used for adaptive suggestions.|
|PCA_INTENT_INTERACTIONS||N/A||Table is purged nightly by CaAggregateData CPM agent.||Index of interaction objects to support fast data retrieval.|
|Interaction and service work objects.|
|Quality review work objects.|
|Call data from the telephony environment.|
|PA_CTI_IVRDATA||N/A||Originally intended for IVR (interactive voice response) configurations. Rarely used in production.||Additional call data that is not attached to the call.|
When using partitions, keep these considerations in mind:
- The Pega 7 Platform does not include any partitioned tables. During an upgrade, partitioned tables are treated as non-partitioned tables.
- Queries that are included with the Pega 7 Platform are not appended to the customer's partition key column, which means that you might not realize a real-time query benefit by partition pruning.
- Queries that are included with the Pega 7 Platform use a global index. The cost of maintaining the global index must be considered when partitioning. The maintenance options include: online, using a drop partition; update global indexes, or offline rebuild. Oracle 12c also has an asynchronous global index rebuilding feature.
- Do not change the primary key to a local partitioned index. Doing so will trigger locks on all index partitions that use DML and burden the real-time application.
- Partition pruning with the local partition index occurs if a custom SQL's predicate includes the partition key column.
- You can use a partition to archive data. Migrating a partition into a separate database can benefit system performance by reducing real-time system data volume.
- It is not recommended that you use pxUpdateDateTime or pyassignmentstatus as partition key columns because they are volatile.
- When you partition across work tables, if a work table is partitioned by pzInsKey and pxObjClass, other work tables such as Assign- and WorkAttach- can be partitioned by pxRefObjectKey and pxRefObjectClass. This is the same as partitioning by pxCreateDateTime.