Planning your case archiving and purging process
Plan your case archiving and purging process to ensure that it is efficiently implemented.
- Run a report on the cases that you plan to archive. You can use the final reports to verify that the process succeeded.
- To search your archived cases properly, optimize the properties in your case worktable that you want to search when the case data is archived in Pega Cloud File Storage. If you purchased and enabled Business Intelligence Exchange (BIX) to extract your case data, in your Filter Criteria, ensure that you clear Use last updated time as start before optimizing properties for search.
Use the pxUpdateDateTime filter instead to enable incremental extractions. This avoids creating new indexes that cause all your properties to become optimized, leading to performance issues.
You may need to submit a cloud-case request to add an index on pxUpdateDateTime to improve search performance.
For more information, see Creating and running an Extract: Filter Criteria.
- Ensure that you do not need to reopen the cases that you archive. Reopening an archived case requires contacting Pega Support and providing a compelling business reason to complete the action. For more information, see Reviewing archived case data.
- Resolve all cases below the top level in the case hierarchy, where the top-level case is resolved according to its archival policy. For more information, see Case archiving and purge overview.
- If you have imported a Rule-Admin-Product (RAP) rule with the Data-Retention-Policy class instances, review the archiving and purging policies for your cases, as this import overrides the case archival policies.
- To check the performance of your archival process (see steps c. and d.), make a cloud change request to add custom indexes to the case type worktable and to the pr_metadata table by using My Support Portal.
Include the following index statements in the cloud change request:
- Index on the worktable for the case type: (pxobjclass, pyresolvedtimestamp) INCLUDE (pystatuswork, pzinskey) WHERE pxcoverinskey IS NULL
- Index on pr_metadata: (pyisparent, pyarchivestatus) INCLUDE (pzinskey)
Running a successful archiving process means archiving and purging cases faster than the rate at which the cases become eligible for archiving. When you start the archiving and purge process, you might have a backlog of cases that are eligible for archiving and purge. This backlog grows as more cases become eligible for archiving and purge every day.
When running archival jobs and purge jobs, use the following settings:
- Configure dataarchival/LimitPerPolicy to change the rate at which you archive eligible cases. For more information about archive settings, see Creating and configuring case archive settings.
- Configure dataarchival/purgeQueryLimit to change the rate at which you delete records after they have been archived to Pega Cloud File Storage. For more information about archive and purge settings, see Configuring purge cycle settings.
Identify the number of cases in your initial backlog and the rate at which cases become eligible for archiving, then create an archival policy for those cases. For example, your system contains a backlog of 10,000 cases and adds 500 cases a day.
For more information about creating an archival policy, see Creating an archival policy.
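As a rough planning aid, the backlog arithmetic can be sketched as follows. This is a minimal estimate, not part of the product: the function name and the figure of 1,500 cases per run are assumptions for illustration, and the model assumes a constant daily inflow and a constant number of cases processed per run.

```python
import math


def days_to_clear_backlog(backlog, daily_inflow, cases_per_run, runs_per_day=1):
    """Estimate the days needed to clear a backlog (hypothetical helper).

    Returns None when the daily throughput does not exceed the daily
    inflow, because the backlog then never shrinks.
    """
    if backlog <= 0:
        return 0
    # Net number of cases removed from the backlog each day.
    net_reduction = cases_per_run * runs_per_day - daily_inflow
    if net_reduction <= 0:
        return None
    return math.ceil(backlog / net_reduction)


# Backlog of 10,000 cases, 500 new eligible cases a day,
# and an assumed 1,500 cases archived and purged per daily run.
print(days_to_clear_backlog(10_000, 500, 1_500))
```

A result of None signals that the chosen rate only keeps pace with, or falls behind, the cases that become eligible each day.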
Schedule an archive and purge process that finishes during a period of low system load. For example, your system might experience low system load every day for five hours or every weekend for 12 hours.
For more information about scheduling an archival policy, see Scheduling a case archival policy.
Determine the upper range for archiving and purge processing by adjusting the dataarchival/LimitPerPolicy and dataarchival/purgeQueryLimit settings. You can run an experiment to determine the necessary values by performing the following steps:
- Set a low rate in the dataarchival/LimitPerPolicy and dataarchival/purgeQueryLimit dynamic system settings, respectively.
- Run a process with the following jobs, using the low value for the number of cases:
For more information, see Case archiving and purging overview.
- Monitor the progress of your case archiving process. For more information, see Monitoring the progress of your case archival process.
- Evaluate the performance impact and time that the low rate of archive and purge process has on your system.
- Increase the archiving and purging settings, and then run the archiving process again with the increased values.
A faster rate of archiving impacts system resources more, and takes longer to complete. The pyPegaPurger job can take a significant amount of time, and even a small value, such as 5000, for the dataarchival/purgeQueryLimit setting can cause timeout errors. If you encounter an error, use a smaller value for dataarchival/purgeQueryLimit and then schedule multiple runs of only the pyPegaPurger job until pr_metadata is empty. This indicates that all cases processed by pyPegaCopier and pyPegaIndexer have been purged by pyPegaPurger.
- Continue to increase the rate to determine the maximum value with which you can complete the process within a low system load duration and with an acceptable system impact.
Clear your backlog by using the maximum rate found above. The maximum rate of archiving and purging determines the time frame for clearing your backlog.
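The experiment described above amounts to a stepwise search for the largest setting that still fits the low-load window. The sketch below illustrates that logic only. `run_trial` is a hypothetical callback that stands in for manually running the jobs with a given limit and recording the outcome; it is not a Pega API, and the fake trial data is invented for the demonstration. Judging whether the system impact is acceptable remains a manual step.

```python
from typing import Callable, NamedTuple, Optional


class TrialResult(NamedTuple):
    duration_hours: float  # how long the archive and purge run took
    timed_out: bool        # whether the run hit a timeout error


def find_max_limit(
    run_trial: Callable[[int], TrialResult],
    start_limit: int,
    window_hours: float,
    growth: float = 2.0,
    max_trials: int = 10,
) -> Optional[int]:
    """Raise the limit step by step and return the last value that
    finished inside the low-load window without a timeout."""
    best = None
    limit = start_limit
    for _ in range(max_trials):
        result = run_trial(limit)
        if result.timed_out or result.duration_hours > window_hours:
            break  # this value is too high; keep the previous one
        best = limit
        limit = int(limit * growth)
    return best


def fake_trial(limit: int) -> TrialResult:
    # Invented behaviour for illustration: one hour per 1,000 cases,
    # with timeouts above 6,000 cases.
    return TrialResult(duration_hours=limit / 1000, timed_out=limit > 6000)


print(find_max_limit(fake_trial, start_limit=500, window_hours=5))
```

In practice each trial is a full scheduled run, so the number of trials is small and each result is recorded by hand before the next increase.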
Adjust the archiving and purging process after clearing your backlog.
If your rate does not exceed the rate at which cases become eligible for archiving, then run your archive and purge jobs twice a day instead of once.
- If your current rate of archiving and purging meets or exceeds the rate at which cases become eligible for archiving, keep that archiving rate.
- If your current rate of archiving is slower than the rate at which cases become eligible for archiving, plan to run your archival and purge jobs more frequently to archive faster.
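The choice between keeping the current schedule and running the jobs more often can be expressed as a small calculation. This is a sketch under the assumption that every run processes the same number of cases; the function name and the sample numbers are illustrative only.

```python
import math


def runs_per_day_needed(daily_inflow, cases_per_run):
    """Return the minimum number of daily runs that keeps pace with
    the cases that become eligible each day (hypothetical helper)."""
    if cases_per_run <= 0:
        raise ValueError("cases_per_run must be positive")
    return max(1, math.ceil(daily_inflow / cases_per_run))


# 500 cases become eligible each day and each run handles 300 cases,
# so a single daily run falls behind and two runs are needed.
print(runs_per_day_needed(500, 300))
```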
Plan to run your archive and purge jobs at an initially fast rate to clear your backlog.
- Enable a permanent deletion policy for your archived case data.
For more information, see step 5 of Creating an archival policy.
- Schedule the py_PegaExpunger job to run after the pyPegaPurger job completes.
When the job runs, it permanently deletes the eligible data according to your data retention policy.