Data collected by Pega Predictive Diagnostic Cloud
Pega Predictive Diagnostic Cloud™ (PDC) receives, analyzes, and stores data that is sent from Pega Platform™. The information is gathered from several data sources and received as summary statistics or individual data.
The following topics contain information about the data collected by PDC:
- Communication channels
- Data that PDC collects from Pega Platform
- Data security
- Data retention policy
Pega Platform sends information to PDC from the following channels:
- Alerts and exceptions – Asynchronously sent to PDC and written to the local alert files. PDC receives most of the data that Pega Platform writes in the PegaRulesAlert log file. Business-sensitive data is filtered out and removed.
- PegaRULES ManagementDaemon thread – Sends a node health status message to PDC every two minutes. The health status message contains current health statistics, such as CPU utilization, memory utilization, agent count, requestor count, recent responsiveness, and the time when the last system pulse agent was triggered. Work data is not included in the health status message.
- PegaAESRemote agents – Run on all monitored nodes and periodically send information to PDC to assess the overall health and help in identifying any problems.
PDC collects the following data from monitored nodes:
- Alerts – Most of the data that PDC receives comes from Pega alerts that are saved in the PegaAlerts log file. Over 50 types of alerts are triggered when counts or elapsed times exceed a threshold during an interaction. Alerts contain metadata about an interaction. For more information about metadata that is included in alert logs, see Alert log message data. For more information about the types of alerts, see List of performance and security alerts in Pega Platform.
- Parameter page – Some parameters from the current parameter page are sent together with the alert. The parameter page contains important contextual information about the functions that run in the monitored application. The required parameters are added to an allow list in Pega Platform. All the remaining parameters are filtered out and excluded from the alert.
- Database alerts – The database query is sent as a part of the alert. All business data values are removed from the SQL calls.
- Exceptions – Exceptions are sent to PDC for analysis (INFO statements are not sent). You can view exceptions by looking at the error lines in the PegaRULES log file. Exceptions can contain some contextual data that is a subset of the fields that are sent for alerts.
- Performance statistics – The monitored application sends performance statistics hourly. PDC uses these statistics to identify overall performance and performance trends of your application’s systems, including statistics for average response time and unique user count.
- Database indexes – PDC gathers the database index information daily to create recommendations for improving query performance. You can see which indexes are currently used and determine whether you need additional indexes. The PushDBIndexes activity gathers the index information for all tables and sends it to PDC.
- Guardrail violation counts – Guardrail warnings indicate that the rules in your application do not follow Pega Platform best practices. PDC counts the total number of rules and the number of individual rules that have justified and unjustified warnings of severe, moderate, or caution type.
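To illustrate what removing business data values from a SQL call can look like, the following is a minimal sketch that replaces literals with placeholders. It is not the mechanism that Pega Platform uses, which this article does not describe; the function name `mask_sql_literals`, the regular expression, and the sample table and column names are assumptions made for illustration only.

```python
import re

# Hypothetical pattern: a single-quoted SQL string literal ('' is an escaped
# quote) or a standalone numeric literal. String literals are listed first so
# that digits inside a quoted value are consumed as part of the string.
_LITERAL = re.compile(r"'(?:''|[^'])*'|\b\d+(?:\.\d+)?\b")


def mask_sql_literals(sql: str) -> str:
    """Return the query text with literal values replaced by '?'."""
    return _LITERAL.sub("?", sql)


if __name__ == "__main__":
    query = "SELECT pyID FROM work_t WHERE pyStatusWork = 'Open' AND amount > 100.50"
    # The query shape is preserved for analysis; the values are not.
    print(mask_sql_literals(query))
```

The query structure (tables, columns, operators) stays available for diagnosing slow queries, while the values that might carry business data are dropped.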
In Pega Platform 7.3.1 and earlier versions, agents in the PegaAESRemote ruleset sent only the following information:
- Summarized usage data – The PushLogUsage agent runs a report against a log-usage class to assess user count, interaction count, and average response time.
- Schema data – Database table name and index definitions. This data assists with the analysis of slow database queries.
- Guardrail statistics – Rule warning count.
New features might require your system to send data that earlier versions of the PegaAESRemote ruleset did not support. To ensure that you have the most recent version of the ruleset, install the appropriate hotfix or product rule for your version of Pega Platform. These enhancements consist of rules only, and you can apply them by using the Hotfix Manager or the Import wizard without interrupting the availability of your system. For more information, see Enhanced PegaAESRemote ruleset to support the latest Pega Predictive Diagnostic Cloud features.
Beginning with Pega Platform 7.4, agents in the PegaAESRemote ruleset gather the following data for PDC:
- Database query statistics – An agent gathers, resets, and sends PostgreSQL database query statement statistics. PDC administrators can access this information to assess database load that might be running in the background.
- Database table statistics – An agent gathers and sends PostgreSQL table sizes and access statistics. This information helps PDC administrators understand and explore resource usage and improve analysis of slow database queries.
- Pega run-time environment – An agent gathers and sends Java and Pega configuration options to assist with solving problems through PDC.
- JVM – An agent gathers the basic run-time information about the JVM that is used on the node.
- Agent status – An agent gathers information about the status of agents in your system. PDC uses this information to detect agents that have failed or have been stopped manually.
- Queue status – An agent gathers information about the status of queues. As a result, PDC can detect queue failures and identify latency in the search index updates.
- Listener status – An agent gathers information about the status of listeners. You can use this information to assess listeners' health and performance.
- Hotfix history – An agent sends the list of hotfixes that are installed in your system, and the information about the hotfixes' status. This information is available if at least one hotfix scan was performed in the system.
- Node information – An agent sends information that improves debugging of your system, for example, the version of Pega Platform and the PegaAESRemote ruleset installed on the node, node type and production level, the time when the system was started, and total memory.
- Conflicting queries in the database – An agent detects when conflicting queries cause a blockage in the PostgreSQL database system. PDC uses this information to provide an enhanced description and advice that helps you to resolve the problem.
- Elasticsearch status – An agent detects when full-text search is not available.
- Enhanced schema data – In addition to the data collected in earlier versions, an agent sends information about the number of inserts, updates, and deletes in the database table. This data assists with the analysis of slow database queries.
Beginning with Pega Platform 8.4, agents in the PegaAESRemote ruleset gather the following data for PDC:
- Job scheduler status – The job details and other information that you can access on the landing page. For more information, see Tools for monitoring system resources in Pega Predictive Diagnostic Cloud.
- Queue processor status – The queue processor details on the landing page. For more information, see Tools for monitoring system resources in Pega Predictive Diagnostic Cloud.
- PostgreSQL database index usage – The information on the Tables > Index Used Info tab of the landing page. For more information, see Tools for monitoring system resources in Pega Predictive Diagnostic Cloud.
- PostgreSQL database cache effectiveness – The information in a section of the Metrics tab on the landing page.
- PostgreSQL database connection tracking – The information on the chart and the Connection details tab on the Metrics tab of the landing page.
- Hourly usage data – The statistics that you can access on the landing page. For more information, see Usage statistics for your Pega Platform systems provided by Pega Predictive Diagnostic Cloud.
- System and rule changes – Changes in your system and in the node topology. For more information, see Changes Summary landing page overview in Pega Predictive Diagnostic Cloud.
- Enhanced conflicting query detection – This enhancement provides more details about conflicting queries and makes the threshold for the PEGA0106 alert configurable. For more information, see PEGA0106: Conflicting queries in the PostgreSQL database system.
Pega Platform uses an allow list for sending clipboard parameter data. This means that Pega Platform sends only the parameters that are required for analysis and that have known and safe content. If a parameter is not listed as safe, the parameter value is removed.
The following parameters are on the allow list by default: AJAXTrackID, ActivityClassToExecute, ActivityNameToExecute, CustomActivityClassName, CustomActivityName, FlowClass, FlowType, Format, InsKey, RuleClass, RuleObjClass, StreamClass, StreamName, TaskStatus, ViewClass, ViewInsKey, ViewOwner, ViewPurpose, action, actionName, activityName, contentID, currentLockOwner, dynamicContainerID, flowType, harnessName, inStandardsMode, insName, objClass, openHandle, originalLockOwner, portal, portalName, portalThreadName, preActivity, primaryPageClass, productName, productVersion, pxObjClass, pyAction, pyActivity, pyClassName, pyDefinitionKey, pyExecuteOnDataPage, pyForEachCount, pyPageName, pyReportClass, pyReportName, pyRuleset, pyRunType, pyStream, pyStreamName, pyTempPlaceHolder, pzTransactionID, requestorID, tabIndex.
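The allow-list behavior described above can be sketched as a simple filter. This is an illustration only, not Pega Platform code: the function name `filter_parameters` and the sample parameter `CustomerSSN` are hypothetical, the set below is just a subset of the default list, and the sketch assumes that parameters that are not listed are dropped entirely and that names are matched case-sensitively (the default list contains both `FlowType` and `flowType`).

```python
# A subset of the default allow list from this article, for illustration.
ALLOW_LIST_SUBSET = frozenset({
    "FlowClass", "FlowType", "flowType", "harnessName",
    "pyActivity", "pyClassName", "pyStreamName", "requestorID",
})


def filter_parameters(params: dict, allow_list=ALLOW_LIST_SUBSET) -> dict:
    """Keep only the parameters whose names are on the allow list."""
    return {name: value for name, value in params.items() if name in allow_list}


if __name__ == "__main__":
    parameter_page = {
        "pyActivity": "pzProcessAssignment",   # on the allow list: kept
        "harnessName": "Perform",              # on the allow list: kept
        "CustomerSSN": "000-00-0000",          # hypothetical, not listed: dropped
    }
    print(filter_parameters(parameter_page))
```

Defaulting to exclusion means that a new or custom parameter is never sent unless someone explicitly marks it as safe.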
All communication with PDC is fully encrypted because the data is transmitted as SOAP messages over HTTPS. Your application sends SOAP messages to PDC, but PDC cannot connect to your application for additional information. This means that the information exchange between your monitored application and PDC is one-way. The multitenant features in Pega Platform ensure that the client (tenant) data is accessed only by that client. For more information about multitenancy, see Multitenancy. Users can access PDC by logging in directly through a web browser over an encrypted HTTPS connection, or by subscribing to emailed reports.
PDC is an application built on Pega Platform and securely hosted in a dedicated private cloud on Pega Cloud. PDC uses multitenancy capabilities by giving each client a unique URL, user name, and password. All clients are segregated: each client has access only to its own system database through its unique URL and can never view information or data about other clients.
Data that PDC handles is stored in persisted memory and encrypted with a 256-bit AES key. The keys are automatically rotated periodically, securely stored in an encrypted key management system (KMS), and managed by Pega Cloud® Services.
The main purpose of PDC is to help you identify and resolve current and recent issues. Depending on the type of information, PDC stores raw data only for a limited time, and then purges the data from the database.
The following table lists the periods after which PDC removes particular types of data:
| Type of data | Retention period |
|---|---|
| Cases | Cases stay open for as long as associated events occur. PDC automatically resolves a case 12 days after the last associated event. After resolving a case, PDC retains the case data for 95 days before purging the data, unless you choose to ignore similar cases. Cases with the Resolved-Ignore status stay in the database indefinitely so that PDC can ignore similar cases in the future. |
| Node health statistics (received every two minutes) | 14 days |
| Database indexes | 30 days |
| Summarized usage data | 30 days |
| Schema data | 30 days |
| Database query statistics | 30 days |
| Database table statistics | 30 days |
| Agent status | 3 days |
| Queue status | 14 days |
| Listener status | 14 days |
| Job scheduler status | 14 days |
| Agent queue | 14 days |
| Node information | 14 days |
| Elasticsearch status | 14 days |
| Dynamic system settings (DSS) | Current snapshot |
| CPU utilization by JVM | 30 days |
| Heap utilization | 30 days |
| Active requestor count | 30 days |
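As a rough illustration of how a fixed retention period translates into a purge date, the following sketch computes when a record becomes eligible for removal. It does not describe how PDC implements purging; the names `RETENTION_DAYS`, `purge_after`, and `is_expired` are hypothetical, and only a few rows of the table are included.

```python
from datetime import datetime, timedelta, timezone

# A few retention periods taken from the table above (in days).
RETENTION_DAYS = {
    "Node health statistics": 14,
    "Queue status": 14,
    "Database indexes": 30,
}


def purge_after(data_type: str, received_at: datetime) -> datetime:
    """Return the moment at which a record of this type may be purged."""
    return received_at + timedelta(days=RETENTION_DAYS[data_type])


def is_expired(data_type: str, received_at: datetime, now: datetime) -> bool:
    """Return True if the retention period for the record has elapsed."""
    return now >= purge_after(data_type, received_at)


if __name__ == "__main__":
    received = datetime(2024, 1, 1, tzinfo=timezone.utc)
    now = datetime(2024, 1, 20, tzinfo=timezone.utc)
    print(is_expired("Node health statistics", received, now))  # 19 days old, limit 14
    print(is_expired("Database indexes", received, now))        # 19 days old, limit 30
```

In practice, this means that data you need for longer-term trend analysis should be exported or reviewed before its retention period ends.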