Planning your case archiving process
Plan your case archiving and purge process to ensure an efficient implementation.
- Run a report on the cases that you plan to archive. You can use the final reports to verify that the process succeeded.
- To search your archived cases properly, optimize the properties in your case worktable that you want to search after the case data is archived in Pega Cloud File Storage. If you purchased and enabled Business Intelligence Exchange (BIX) to extract your case data, in your Filter Criteria, ensure that you clear Use last updated time as start before optimizing properties for search.
Use the pxUpdateDateTime filter instead to enable incremental extractions. Using this filter avoids creating new indexes that optimize all of your properties, which can lead to performance issues.
You might need to submit a cloud change request to add an index on pxUpdateDateTime to improve search performance.
For more information, see Creating and running an Extract: Filter Criteria.
- Ensure that you do not need to reopen the cases that you archive. Reopening an archived case requires contacting Pega Support and providing a compelling business reason. For more information, see Reviewing archived case data.
- Resolve all cases below the top level in the case hierarchy, so that a resolved top-level case can be archived according to its archival policy. For more information, see Case archiving and purge overview.
- If you have imported a Rule-Admin-Product (RAP) rule with Data-Retention-Policy class instances, review the archiving and purging policies for your cases, because the import overrides the existing case archival policies.
- To check the performance of your archival process, submit a cloud change request through My Support Portal to add custom indexes to the case type worktable and to the pr_metadata table.
Include the following index statements in the cloud change request:
- Index on the worktable for the case type: (pxobjclass, pyresolvedtimestamp) INCLUDE (pystatuswork, pzinskey) WHERE pxcoverinskey IS NULL
- Index on pr_metadata: (pyisparent, pyarchivestatus) INCLUDE (pzinskey)
Running a successful archiving and purge process means archiving and purging cases faster than the rate that the cases become eligible for archiving. When you start the archiving and purge process, you might have a backlog of cases that are eligible for archiving and purge; this backlog keeps growing with more cases becoming eligible for archiving and purge every day.
When running archival jobs, use the dataarchival/dailyLimitPerPolicy setting to change the rate at which you archive eligible cases.
When running purge jobs, use the dataarchival/purgeQueryLimit setting to change the rate at which a subquery deletes records.
Plan to run your archive and purge jobs at an initial faster rate to clear your backlog.
Find out the number of cases in your initial backlog and the rate at which cases become eligible for archiving.
Determine a period of low system load, and then schedule an archive and purge process that finishes during that period.
For example, your system might experience low system load every day for five hours or every weekend for 12 hours.
For more information, see Scheduling the case archival process.
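To put numbers on this plan, a short back-of-the-envelope calculation can help. The sketch below is illustrative only and is not part of Pega Platform; the backlog size, eligibility rate, and archival rate are hypothetical values that you would replace with measurements from your own system.

```python
# Back-of-the-envelope estimate of how long clearing an archival backlog takes.
# All figures below are hypothetical; substitute your own measurements.

def days_to_clear_backlog(backlog, eligible_per_day, archived_per_day):
    """Days until the backlog reaches zero, given a net daily drain rate."""
    net_drain = archived_per_day - eligible_per_day
    if net_drain <= 0:
        raise ValueError("Archival rate must exceed the eligibility rate to clear a backlog")
    # Round up: a partial day still requires a scheduled run.
    return -(-backlog // net_drain)

# Example: 200,000 eligible cases already queued, 5,000 new cases become
# eligible each day, and the jobs archive 15,000 cases per day.
print(days_to_clear_backlog(200_000, 5_000, 15_000))  # 20 days
```

The calculation also makes the constraint in this section concrete: if the archival rate does not exceed the eligibility rate, the net drain is zero or negative and the backlog never clears.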
Determine how high your rate of archiving and purge can go by performing an experiment:
- Start the experiment by setting a low rate in the dataarchival/dailyLimitPerPolicy and dataarchival/purgeQueryLimit dynamic system settings.
- Run a process with the archiving and purge jobs (pyPegaCopier, pyPegaIndexer, and pyPegaPurger), using the low value for the number of cases.
For more information, see Case archiving and purge overview.
- Monitor the progress of your case archiving process. For more information, see Monitoring the progress of your case archival process.
- Determine the performance impact and the time that the process takes at this low rate.
- Run the archiving process again with an increased rate of
archiving and purging.
A faster rate of archiving has a greater impact on system resources and takes longer to complete. In Pega Platform version 8.5.2 and earlier, the pyPegaPurger job can take so long that even a small value, such as 5000, for the dataarchival/purgeQueryLimit setting can cause timeout errors. If you encounter an error, use a smaller value for dataarchival/purgeQueryLimit, and then schedule multiple runs of only the pyPegaPurger job until the pr_metadata table is empty. An empty pr_metadata table indicates that all cases processed by pyPegaCopier and pyPegaIndexer have been purged by pyPegaPurger.
- Continue to increase the rate to determine a maximum that completes the process within the low system load duration and within an acceptable system impact.
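The ramp-up experiment above can be sketched as a simple loop. This is a hypothetical illustration, not Pega functionality: run_archival_cycle stands in for adjusting the dynamic system settings, running the jobs, and measuring the elapsed time, which in practice you do manually against your own environment.

```python
# Illustrative sketch of the rate-finding experiment described above.
# run_archival_cycle is a hypothetical stand-in for running the archive and
# purge jobs at a given rate and measuring the elapsed time in hours.

def find_max_rate(run_archival_cycle, start_rate, max_window_hours, growth=2):
    """Increase the archival rate until a cycle no longer fits the low-load window."""
    best_rate = None
    rate = start_rate
    while True:
        elapsed_hours = run_archival_cycle(rate)
        if elapsed_hours > max_window_hours:
            return best_rate  # the previous rate was the highest acceptable one
        best_rate = rate
        rate *= growth

# Example with a fake cycle whose duration grows with the rate:
fake_cycle = lambda rate: rate / 4000  # pretend hours per cycle
print(find_max_rate(fake_cycle, start_rate=1000, max_window_hours=5))  # 16000
```

In a real experiment you would also stop increasing the rate when the system impact becomes unacceptable, not only when the cycle overruns the low-load window.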
Clear your backlog by using the maximum rate found above. The maximum rate of archiving and purging determines the time frame for clearing your backlog.
Adjust the archiving and purging process after clearing your backlog.
Run your archive and purge jobs twice a day instead of once if your rate does not exceed the rate at which cases become eligible for archiving.
- If your current rate of archiving and purging meets or exceeds the rate at which cases become eligible for archiving, keep that archiving rate.
- If your current rate of archiving is slower than the rate at which cases become eligible for archiving, plan to run your archival and purge jobs more frequently to archive faster.
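The steady-state decision above reduces to comparing two rates. The sketch below is an illustrative aid with hypothetical numbers, not a Pega utility; it computes the smallest number of daily job runs that keeps pace with the rate at which cases become eligible.

```python
# Illustrative decision helper for the steady-state schedule described above.
# The rates are hypothetical; measure your own system to obtain real values.

def runs_per_day_needed(eligible_per_day, archived_per_run):
    """Smallest number of daily job runs that keeps pace with eligibility."""
    if archived_per_run <= 0:
        raise ValueError("archived_per_run must be positive")
    return max(1, -(-eligible_per_day // archived_per_run))  # ceiling division

# Example: 12,000 cases become eligible per day, and one run archives 5,000.
print(runs_per_day_needed(12_000, 5_000))  # 3 runs per day
```

If the result is 1, your current schedule already keeps up and you can leave the archiving rate as it is; a larger result suggests running the archive and purge jobs more frequently, as described above.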