Data collected by Pega Predictive Diagnostic Cloud

Pega Predictive Diagnostic Cloud™ (PDC) receives, analyzes, and stores data that is sent from Pega Platform™. The information is gathered from several data sources and received as summary statistics or individual data.

The following topics contain information about the data collected by PDC:

Communication channels

Pega Platform sends information to PDC from the following channels:

  • Alerts and exceptions – Sent asynchronously to PDC and written to the local alert files. PDC receives most of the data that Pega Platform writes to the PegaRulesAlert log file. Business-sensitive data is filtered out.
  • PegaRULES ManagementDaemon thread – Sends a node health status message to PDC every two minutes. The health status message contains current health statistics, such as CPU utilization, memory utilization, agent count, requestor count, recent responsiveness, and the time when the last system pulse agent ran. Work data is not included in the health status message.
  • PegaAESRemote agents – Run on all monitored nodes and periodically send information to PDC to assess the overall health and help in identifying any problems.

Data that PDC collects from Pega Platform

PDC collects the following data from monitored nodes:

  • Alerts – Most of the data that PDC receives comes from Pega alerts that are written to the PegaAlerts log file. Over 50 types of alerts are triggered when counts or elapsed times exceed a threshold during an interaction. Alerts contain metadata about an interaction. For more information about metadata that is included in alert logs, see Alert log message data. For more information about the types of alerts, see List of performance and security alerts in Pega Platform.
  • Parameter page – Some parameters from the current parameter page are sent along with the alert. The parameter page contains important contextual information about the functions that run in the monitored application. Only the parameters that are listed on a whitelist in Pega Platform are sent. All remaining parameters are filtered out. You can configure the whitelist to send additional parameters as necessary to provide more context for your alerts.
  • Database alerts – The database query is sent as a part of the alert. All business data values are removed from SQL calls that contain INSERT INTO statements.
  • Exceptions – Exceptions are sent to PDC for analysis (DEBUG or INFO statements are not sent). You can view exceptions by looking at the error lines in the PegaRULES log file. Exceptions can contain some contextual data that is a subset of the fields sent for alerts.
  • Performance statistics – The monitored application sends performance statistics hourly. PDC uses these statistics to identify overall performance and performance trends of your application’s systems, including statistics for average response time and unique user count.
  • Database indexes – PDC gathers the database index information daily to create recommendations for improving query performance. You can see which indexes are currently used and determine whether you need additional indexes. The PushDBIndexes activity gathers the index information for all tables and sends it to PDC.
  • Guardrail violation counts – Guardrail warnings indicate that the rules in your application do not follow all Pega Platform best practices. PDC counts the total number of rules and the number of individual rules that have justified and unjustified warnings at the severe, moderate, or caution severity level.
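
The threshold behavior described for alerts can be illustrated with a minimal sketch. This is not Pega Platform code: the Alert class, the check_threshold function, the alert identifier, and the threshold value are all hypothetical names and numbers chosen for illustration. The sketch only shows the general idea that an alert is produced when a measured count or elapsed time exceeds its configured threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    """Hypothetical alert record; field names are illustrative only."""
    alert_id: str         # placeholder identifier, not a real Pega alert ID
    kpi_value: float      # measured count or elapsed time
    kpi_threshold: float  # configured limit for this measurement


def check_threshold(alert_id: str, measured: float, threshold: float) -> Optional[Alert]:
    """Return an Alert only when the measured value exceeds the threshold."""
    if measured > threshold:
        return Alert(alert_id, measured, threshold)
    return None


# An interaction that took 1500 ms against an assumed 1000 ms limit raises an alert.
slow = check_threshold("EXAMPLE0001", measured=1500, threshold=1000)
# An interaction that stays within the limit produces nothing.
fast = check_threshold("EXAMPLE0001", measured=400, threshold=1000)
print(slow)
print(fast)
```

In this sketch a value equal to the threshold does not raise an alert, which matches the wording "exceed a threshold"; the exact comparison that Pega Platform uses for each alert type is not stated here.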

Data collected by PegaAESRemote ruleset agents

In Pega Platform 7.3.1 and earlier versions, the PegaAESRemote agents sent only the following information:

  • Summarized usage data – The PushLogUsage agent runs a report against a log-usage class to assess user count, interaction count, and average response time.
  • Schema data – Database table name and index definitions. This data assists with the analysis of slow database queries.
  • Guardrail statistics – Rule warning count.

Beginning with Pega Platform 7.4, a series of PegaAESRemote agents gather the following data for PDC:

  • Database query statistics – An agent gathers, resets, and sends PostgreSQL database query statement statistics. PDC administrators can access this information to assess database load that might be running in the background.
  • Database table statistics – An agent gathers and sends PostgreSQL table sizes and access statistics. This information helps PDC administrators understand and explore resource usage and improve analysis of slow database queries.
  • Pega run-time environment – An agent gathers and sends Java and Pega configuration options to assist with solving problems through PDC.
  • Java virtual machine – An agent gathers the basic run-time information about the Java virtual machine that is used on the node.
  • Agent status – An agent gathers information about the status of agents in your system. PDC uses this information to detect agents that have failed or have been stopped manually.
  • Queue status – An agent gathers information about the status of queues. As a result, PDC can detect queue failures and identify latency in the search index updates.
  • Listener status – An agent gathers information about the status of listeners. You can use this information to assess listeners' health and performance.
  • Hotfix history – An agent sends the list of hotfixes that are installed in your system, and the information about the hotfixes' status. This information is available if at least one hotfix scan has been performed in the system.
  • Node information – An agent sends information that improves debugging of your system, for example, the version of Pega Platform and the PegaAESRemote ruleset installed on the node, node type and production level, the time when the system was started, and total memory.
  • Conflicting queries in the database – An agent detects when conflicting queries cause a blockage in the PostgreSQL database system. PDC uses this information to provide an enhanced description and advice that helps you to resolve the problem.
  • Elasticsearch status – An agent detects when full-text search is not available.
  • Enhanced schema data – In addition to the data collected in earlier versions, an agent sends information about the number of inserts, updates, and deletes in the database table. This data assists with the analysis of slow database queries.

Data security

Pega Platform uses a whitelist model for sending clipboard parameter data. This means that Pega Platform sends only the parameters that are required for analysis and that have known and safe content. If a parameter is not listed as safe, its value is removed.

The following parameters are on the whitelist by default: AJAXTrackID, ActivityClassToExecute, ActivityNameToExecute, CustomActivityClassName, CustomActivityName, FlowClass, FlowType, Format, InsKey, RuleClass, RuleObjClass, StreamClass, StreamName, TaskStatus, ViewClass, ViewInsKey, ViewOwner, ViewPurpose, action, actionName, activityName, contentID, currentLockOwner, dynamicContainerID, flowType, harnessName, inStandardsMode, insName, objClass, openHandle, originalLockOwner, portal, portalName, portalThreadName, preActivity, primaryPageClass, productName, productVersion, pxObjClass, pyAction, pyActivity, pyClassName, pyDefinitionKey, pyExecuteOnDataPage, pyForEachCount, pyPageName, pyReportClass, pyReportName, pyRuleset, pyRunType, pyStream, pyStreamName, pyTempPlaceHolder, pzTransactionID, requestorID, tabIndex.
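
The whitelist model can be sketched in a few lines. This is an illustration under stated assumptions, not the Pega Platform implementation: the function name filter_parameters, the extra argument, and the sample parameter page are hypothetical, and the sketch drops non-whitelisted entries entirely as a simplification. The whitelist shown is a small subset of the default list above.

```python
# Subset of the default whitelist, copied from the list above.
DEFAULT_WHITELIST = frozenset({
    "ActivityNameToExecute",
    "FlowClass",
    "FlowType",
    "flowType",
    "harnessName",
    "pyActivity",
    "pzTransactionID",
})


def filter_parameters(params, whitelist=DEFAULT_WHITELIST, extra=()):
    """Keep only parameters whose names are on the whitelist.

    Matching is case-sensitive in this sketch, because the default list
    contains both "FlowType" and "flowType" as separate entries.
    `extra` stands in for additional parameters that an administrator
    has configured (hypothetical argument).
    """
    allowed = set(whitelist) | set(extra)
    return {name: value for name, value in params.items() if name in allowed}


# Hypothetical parameter page with one business-sensitive entry.
page = {
    "pyActivity": "ShowHarness",
    "harnessName": "Perform",
    "CustomerAccountNumber": "0000-1111",
}
print(filter_parameters(page))
```

Because the filter is based on what is allowed rather than on what is forbidden, a new parameter that nobody has reviewed is excluded by default.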

All communication with PDC is fully encrypted because the data is transmitted as SOAP messages over HTTPS. Your application sends SOAP messages to PDC, but PDC cannot reach back to your application for additional information. This means that the information flow between your monitored application and PDC is one-way. The multitenant features in Pega Platform ensure that the customer (tenant) data can be accessed only by that customer. Users can access PDC by logging in directly through a web browser over an encrypted HTTPS connection, or by subscribing to emailed reports. For more information about multitenancy, see Multitenancy.

PDC is an application built on Pega Platform and securely hosted in a dedicated private cloud on Pega Cloud. PDC uses multitenancy capabilities by giving each customer a unique URL, user name, and password. All customers are segregated and have access only to their system database on their unique URL and can never see information or data about other customers.

Data that PDC handles is stored in persisted memory and encrypted with a 256-bit AES key. The keys are automatically rotated periodically, securely stored in an encrypted key management system (KMS), and managed by the Pega Cloud service.

