Configuring custom monitoring thresholds in Pega Autonomic Event Services 7.3
The Enterprise Health console displays a set of health indicators, each showing a status that represents the collective health of a system's nodes. The default thresholds for the statuses are designed to fully utilize available resources and optimize system performance. After you configure a system for monitoring with Pega® Autonomic Event Services, you can customize the values of these thresholds by modifying the underlying decision table. You can customize values for a single node or for a single system. You can also change the enterprise-wide default threshold values for the health indicators.
For example, the default threshold for the number of running agents is seven. If your system must run 31 agents, then Pega Autonomic Event Services indicates a critical condition for agents even though the number is correct. You can change the threshold values so that 31 is treated as the normal condition, and any other number is treated as critical.
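The agent example above can be sketched as a small first-match decision table. This is only an illustration of the logic, not Pega code: in the product, the thresholds are edited in the decision table through the portal, and the names `Row`, `evaluate`, and `agent_rows` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Row:
    """One row of a simplified decision table (hypothetical structure)."""
    condition: Callable[[int], bool]  # test applied to the measured value
    status: str                       # status returned when the test passes


def evaluate(rows: List[Row], value: int, otherwise: str = "Critical") -> str:
    """Return the status of the first matching row, else the fallback status."""
    for row in rows:
        if row.condition(value):
            return row.status
    return otherwise


# Customized table: exactly 31 running agents is normal; anything else is critical.
agent_rows = [Row(lambda n: n == 31, "Normal")]

print(evaluate(agent_rows, 31))
print(evaluate(agent_rows, 7))
```

With this table, a count of 31 evaluates to Normal, and any other count falls through to the Critical fallback.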
- Log in to the Pega Autonomic Event Services Manager Portal as aesmanager.
- On the navigation pane, click Enterprise.
- Click the system for which you want to customize monitoring thresholds.
- Click the node for which you want to customize monitoring thresholds. If you want to customize monitoring thresholds for a system, click any node in that system.
In the Criteria Scope column, you can see which decision table is currently used to calculate the status of an indicator. The possible options are:
- Default – Default, enterprise-wide thresholds are used for this indicator.
- System – System-specific thresholds have been configured for the indicator.
- Node – Node-specific thresholds have been configured for the indicator.
- In the row with the appropriate health indicator, click the system link or the node link in the Maintenance column, or click the default link in the Criteria Scope column.
- Update the underlying decision table with values to define the Normal, Warn, and Critical status. Depending on the indicator, the values have the following meanings:
- Requestors – Number of active requestors
- Agents – Number of running agents
- Memory – Percentage of JVM memory in use
- Pulse – Last time of system pulse
- CPU – Process CPU usage
- Database – Number of database connections or the occurrence of SQL exceptions
- Cache – Rule cache enabled (yes or no)
- HTTP Response – Average HTTP (browser or portal requestor) response time (in seconds)
- Save the decision table.
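The relationship between the Default, System, and Node criteria scopes and the Normal, Warn, and Critical statuses can be sketched as follows for the Memory indicator. This is a minimal illustration, not the product's implementation. It assumes that the most specific configured scope takes precedence (node over system over default), and every system name, node name, and threshold value is hypothetical.

```python
from typing import Dict, Tuple

# Thresholds for the Memory indicator as (warn_at, critical_at), expressed as
# percentages of JVM memory in use. All values here are illustrative only.
Thresholds = Tuple[float, float]

DEFAULT: Thresholds = (75.0, 90.0)
SYSTEM_OVERRIDES: Dict[str, Thresholds] = {"orders": (80.0, 95.0)}
NODE_OVERRIDES: Dict[Tuple[str, str], Thresholds] = {
    ("orders", "node-2"): (60.0, 70.0),
}


def resolve_scope(system: str, node: str) -> Tuple[str, Thresholds]:
    """Return the criteria scope and thresholds, assuming the most specific wins."""
    if (system, node) in NODE_OVERRIDES:
        return "Node", NODE_OVERRIDES[(system, node)]
    if system in SYSTEM_OVERRIDES:
        return "System", SYSTEM_OVERRIDES[system]
    return "Default", DEFAULT


def memory_status(system: str, node: str, percent_used: float) -> str:
    """Classify a memory reading using the thresholds resolved for the node."""
    _, (warn_at, critical_at) = resolve_scope(system, node)
    if percent_used >= critical_at:
        return "Critical"
    if percent_used >= warn_at:
        return "Warn"
    return "Normal"


print(resolve_scope("orders", "node-2")[0], memory_status("orders", "node-2", 65.0))
print(resolve_scope("orders", "node-1")[0], memory_status("orders", "node-1", 65.0))
```

In this sketch, the same 65 percent reading is a warning on the node with its own stricter thresholds and normal on a node that inherits the system-level thresholds, which is why the Criteria Scope column is worth checking before you edit a table.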