Working with Agents in Pega 7.2.2
Understanding agents and agent queues is critical to managing your Pega® deployment effectively. This article explains Pega Platform 7.2.2 agents and how they work. Of note is the information about Pega Integration agents for SAML. Get the answers to commonly asked questions about migrating modified agent rules to different environments.
- How Agents work
- Pega-Integration agents for SAML require enablement on one node of a cluster
- Migrating modified Agent rules
- Does the activity Security > Usage need to be Rule Connect?
- When does the Data-Agent-Queue (DAQ) instance get created with the updated changes to RAQ?
- Three conditions to consider
- FAQs: Actions to take
Before you read this article, you should have fundamental knowledge of the Pega 7 Platform, especially the following Pega 7 concepts, tasks, and reference information:
- How to develop rules for an application
- Agent concepts and terms and related topics
- What Agents are available in Pega 7.2.2 and how to upgrade to Pega 7.2.2
- How to create and schedule Agents in Pega 7.2.2 on multiple nodes of a cluster (two or more Web Application Servers)
- How to create, modify, and schedule Pega 7 Agent Queues
- Information in the Pega 7.2.2 Help topics, Standard agents and Agent schedule data instances, and the related Help topics
Refer to Related Content for links to all prerequisite information and other references cited throughout this article.
Agents are designed to run processes in the cluster of a multi-node environment to achieve a specific purpose. All agents are first defined as an instance of the class Rule-Agent-Queue (RAQ) for a given ruleset and the class that it operates on. At system startup time, Pega reads all of the Rule-Agent-Queue instances and determines which nodes each RAQ should be enabled on.
To achieve this, a Data-Agent-Queue (DAQ) instance is created for the same rule. The DAQ is the agent schedule data instance. It allows node-level overrides, eliminating the need for manual check-out or check-in of rule changes. The key to the DAQ instance is the ruleset and the NodeID assigned to the node. These DAQ instances are then used going forward to run the processes for the node based on the specified settings.
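The relationship between a rule and its per-node schedule instances can be pictured with a small model. The sketch below is illustrative only and is not Pega code: the class names `AgentRule` and `AgentSchedule`, their fields, and the node IDs are hypothetical, and it assumes one schedule per ruleset and node, as described above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentRule:
    """Hypothetical stand-in for a Rule-Agent-Queue (RAQ) instance."""
    ruleset: str
    enabled: bool = True
    interval_seconds: int = 30


@dataclass(frozen=True)
class AgentSchedule:
    """Hypothetical stand-in for a Data-Agent-Queue (DAQ) instance."""
    ruleset: str
    node_id: str
    enabled: bool
    interval_seconds: int

    @property
    def key(self) -> tuple[str, str]:
        # The DAQ is keyed by the ruleset plus the NodeID of the node.
        return (self.ruleset, self.node_id)


def schedules_for(rule: AgentRule, node_ids: list[str]) -> dict[tuple[str, str], AgentSchedule]:
    """Create one schedule per node, copying the defaults from the rule."""
    schedules = {}
    for node_id in node_ids:
        daq = AgentSchedule(rule.ruleset, node_id, rule.enabled, rule.interval_seconds)
        schedules[daq.key] = daq
    return schedules


rule = AgentRule(ruleset="MyRuleSetName")
schedules = schedules_for(rule, ["node-a", "node-b"])

# Node-level override: disable the agent on node-b without touching the rule.
key_b = ("MyRuleSetName", "node-b")
schedules[key_b] = replace(schedules[key_b], enabled=False)
```

The point of the model is that the override lives on the per-node schedule, so the rule itself stays unchanged and no check-out or check-in is involved.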
If you do not need an agent to run on one or more nodes, you can safely shut it down on those nodes and keep it enabled on the other nodes that need it. Some agents are required to run on one node only; other agents are required to run on all nodes. Refer to the Help topics for the Pega-provided agents for more information about specific agents and where they are enabled or disabled by default.
In Pega 7.2.2, the Rule-Agent-Queue instance in Pega-Integration enables, by default, all six agents for SAML authentication that call the same activity. The RAQ passes a different class to the activity to operate on and clears the associated Data-Admin-DB-Table class mapping from the database.
Ideally, this cleanup would require only one (1) agent queue, with the cleanup logic and the parameters for each class handled in a single activity. That design would be consistent with other Pega-provided Agents and would avoid the overhead of running more agents than are actually required. Until this product enhancement is delivered in a future product release, you must enable all six agents on one node of a cluster. These are legacy agents that can and should run on one node only in a multi-node cluster.
Agent rule name
Calls the pyCleanupReplayCache activity to delete instances in the database table pr_data_saml_replaycache for Data-Admin-ReplayCache class for SAML configurations.
Needs to be enabled on one (1) node only in a cluster. The Rule-Agent-Queue passes a parameter classname of Data-Admin-ReplayCache to the agent to process on.
Calls the pyCleanupSSONodeInfoInstances activity to delete instances in the database table pr_data_saml_requestorinfo for Data-Admin-Security-SSO-SAML-NodeInfo class for SAML configurations.
Needs to be enabled on one (1) node only in a cluster. The Rule-Agent-Queue passes a parameter classname of Data-Admin-Security-SSO-SAML-NodeInfo to the agent to process on.
Calls the pyCleanupWebSSO activity to delete instances in the database table pr_data_saml_sessioninfo for Data-Admin-SessionInfo class older than xx minutes for SAML configurations.
Needs to be enabled on one (1) node only in a cluster. The Rule-Agent-Queue passes a parameter classname of Data-Admin-SessionInfo to the agent to process on.
Calls the pyCleanupWebSSO activity to delete instances in the database table pr_data_saml_logoutreqinfo for Data-Admin-LogoutRequestInfo class older than xx minutes for SAML configurations.
Needs to be enabled on one (1) node only in a cluster. The Rule-Agent-Queue passes a parameter classname of Data-Admin-LogoutRequestInfo to the agent to process on.
Calls the pyCleanupWebSSO activity to delete instances in the database table pr_data_saml_authreqcontext for Data-Admin-Security-SSO-SAML-AuthRequestContext class for SAML configurations.
Needs to be enabled on one (1) node only in a cluster. The Rule-Agent-Queue passes a parameter classname of Data-Admin-Security-SSO-SAML-AuthRequestContext to the agent to process on.
Calls the pyCleanupWebSSO activity to delete instances in the database table pr_data_saml_logininfo for Data-Admin-LoginInfo class older than xx minutes for SAML configurations.
Needs to be enabled on one (1) node only in a cluster. The Rule-Agent-Queue passes a parameter classname of Data-Admin-LoginInfo to the agent to process on.
Legacy agents run on one node only in a multi-node cluster
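The one-node requirement for these agents can be checked mechanically once you know which agents are enabled on which nodes. The following is a minimal sketch, not a Pega API: the class, activity, and table names come from the descriptions above, while the function names, the node names, and the `cluster` input format are assumptions made for illustration.

```python
# Inventory taken from the descriptions above:
# class name -> (cleanup activity, database table).
SAML_CLEANUP_AGENTS = {
    "Data-Admin-ReplayCache": ("pyCleanupReplayCache", "pr_data_saml_replaycache"),
    "Data-Admin-Security-SSO-SAML-NodeInfo": (
        "pyCleanupSSONodeInfoInstances",
        "pr_data_saml_requestorinfo",
    ),
    "Data-Admin-SessionInfo": ("pyCleanupWebSSO", "pr_data_saml_sessioninfo"),
    "Data-Admin-LogoutRequestInfo": ("pyCleanupWebSSO", "pr_data_saml_logoutreqinfo"),
    "Data-Admin-Security-SSO-SAML-AuthRequestContext": (
        "pyCleanupWebSSO",
        "pr_data_saml_authreqcontext",
    ),
    "Data-Admin-LoginInfo": ("pyCleanupWebSSO", "pr_data_saml_logininfo"),
}


def nodes_enabled_per_agent(enabled_by_node):
    """Count, for each SAML cleanup class, the nodes where its agent is enabled."""
    counts = {class_name: 0 for class_name in SAML_CLEANUP_AGENTS}
    for enabled_classes in enabled_by_node.values():
        for class_name in enabled_classes:
            if class_name in counts:
                counts[class_name] += 1
    return counts


def misconfigured_agents(enabled_by_node):
    """Return the classes whose agent is not enabled on exactly one node."""
    counts = nodes_enabled_per_agent(enabled_by_node)
    return sorted(name for name, count in counts.items() if count != 1)


# Hypothetical cluster: all six agents on node-a, one of them also on node-b.
cluster = {
    "node-a": set(SAML_CLEANUP_AGENTS),
    "node-b": {"Data-Admin-LoginInfo"},
}
print(misconfigured_agents(cluster))
```

In this hypothetical cluster only the Data-Admin-LoginInfo agent is reported, because it is enabled on two nodes instead of one.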
If you make a change to a Rule-Agent-Queue instance, for example, agentXYZ, you must migrate the changes to other environments so that the same agent can be run on one or more nodes. The activity that agentXYZ initiates creates some pages using an RDB-Delete method on each page. This method calls Rule-Connect-SQL to delete content from application-specific database tables as part of a cleanup process.
Here are answers to commonly asked questions.
Does the activity Security > Usage need to be Rule Connect?
No: On the Security tab of the Activity rule form, Usage serves mainly for reporting and for categorizing the type of agent. You can choose any one of the Usage choices that are available for selection:
- Rule Connect
- Load Data Page
Continuing with the example, the Rule-Agent-Queue (RAQ) agentXYZ was migrated using Rule-Admin-Product from Development and Rules Move to Staging. However, the DAQ instance knows nothing about the changes just imported or created in that environment. When you open the DAQ instance for agentXYZ (if it already exists), there is no Refresh from RAQ action to take on it. So what actions are required to be able to use the new or updated agentXYZ agent?
The master agent wakes up, checks the agent schedules to determine which agents need to run, and then updates the Data-Agent-Queue instances if the Rule-Agent-Queue instance has changed since it was created. How soon this happens depends on the last run time; it is usually about 10 minutes. If the master agent finds no corresponding DAQ instance, it creates the DAQ instance.
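The master agent behavior described here amounts to a simple reconciliation pass. The sketch below is a conceptual model only, not the Pega implementation: the `Raq` and `Daq` classes, their fields, and the `reconcile` function are hypothetical, and it assumes that "changed since it was created" means the RAQ was updated after the DAQ was created.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Raq:
    """Hypothetical stand-in for a Rule-Agent-Queue instance."""
    ruleset: str
    updated: datetime
    interval_seconds: int


@dataclass
class Daq:
    """Hypothetical stand-in for a Data-Agent-Queue instance on one node."""
    ruleset: str
    node_id: str
    created: datetime
    interval_seconds: int


def reconcile(raq, daqs, node_id, now):
    """Run one polling pass for a single RAQ on a single node.

    Returns "created", "updated", or "unchanged".
    """
    key = (raq.ruleset, node_id)
    daq = daqs.get(key)
    if daq is None:
        # No corresponding DAQ instance: create it from the RAQ.
        daqs[key] = Daq(raq.ruleset, node_id, now, raq.interval_seconds)
        return "created"
    if raq.updated > daq.created:
        # The RAQ changed after the DAQ was created: refresh the DAQ.
        daq.interval_seconds = raq.interval_seconds
        daq.created = now
        return "updated"
    return "unchanged"


daqs = {}
raq = Raq("MyRuleSetName", updated=datetime(2016, 11, 30, 9, 0), interval_seconds=30)

first = reconcile(raq, daqs, "node-a", now=datetime(2016, 11, 30, 9, 10))
second = reconcile(raq, daqs, "node-a", now=datetime(2016, 11, 30, 9, 20))

# Simulate importing a modified RAQ, then the next polling pass.
raq.updated = datetime(2016, 11, 30, 9, 25)
raq.interval_seconds = 60
third = reconcile(raq, daqs, "node-a", now=datetime(2016, 11, 30, 9, 30))
```

Under these assumptions the three passes create the DAQ, leave it unchanged, and then pick up the migrated change, all without a node restart.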
When migrating modified Rule-Agent-Queue instances, consider the following conditions between RAQ and DAQ instances:
- Neither a Rule-Agent-Queue instance nor a Data-Agent-Queue instance exists.
This is a newly created agent process.
- A Rule-Agent-Queue instance exists, but the Data-Agent-Queue instance does not exist.
This is not a new process, but perhaps a new node is being added to process the agents on.
- Both a Rule-Agent-Queue instance and a Data-Agent-Queue instance exist.
This is not a new process, but a change to an existing one.
This is either a change to an existing Agent in a ruleset or a new Agent being added to the ruleset.
Continuing with the example, the agent that was created does not use queued items (work items queued up for a specific agent) and, therefore, it will not work if the Mode of the RAQ instance is set to Legacy or Standard. Because the agent processes rows in the database directly by SQL, without any work item to process, the RAQ must be defined as an Advanced Mode Agent. Otherwise, when you review the agent’s activity in the System Management Application (SMA) or on the Agent Management landing page in Designer Studio, it will appear as if the agent ran (indicated by Start and Stop dates and times), but the activity itself will not run.
The only way to know if this is the case is by reviewing the RAQ instance and taking one of the following actions:
- Check to see what Mode was configured
- Turn on logging for com.pega.pegarules.session.internal.async.agent.QueueProcessor and set the level to DEBUG
After logging is enabled and the DAQ runs, you will see status lines similar to the examples below.
The first line indicates that the agent was found but was not determined to be an agent needed when the system first starts up. The next line indicates a problem reading an item from the queue to process. Because no items are expected to be processed, it correctly indicates No item to process. Changing the mode to Advanced Mode Agent will then get the desired results because you are not working with a work item.
30 Nov 2016 09:33:45,623 [PegaRULES-MasterAgent] (internal.async.AgentsForRuleset) DEBUG - Not a StartupAgentMyNewAgentName
30 Nov 2016 09:34:00,019 [ PegaRULES-Batch-2] ( async.agent.QueueProcessor) DEBUG - No item to process: MyRuleSetName #7: WorkItemClass-NameHere.MyNewAgentActivityName"
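When many agents are involved, it can help to scan the log for these QueueProcessor lines instead of reading them by eye. The following is a minimal sketch that assumes the log layout matches the two sample lines above; the regular expressions and the function name are illustrative, and the meaning of the number after `#` is not stated in this article, so the code only captures it.

```python
import re

# Layout assumed from the sample lines: timestamp [thread] (logger) LEVEL - message
LINE = re.compile(
    r"^(?P<timestamp>\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[\s*(?P<thread>[^\]]+?)\s*\] "
    r"\(\s*(?P<logger>[^)]+?)\s*\) "
    r"(?P<level>[A-Z]+)\s+- (?P<message>.*)$"
)

# "No item to process: <ruleset> #<number>: <class.activity>"
NO_ITEM = re.compile(
    r"^No item to process: (?P<ruleset>.+?) #(?P<number>\d+): (?P<target>[^\s\"]+)"
)


def agents_with_no_items(lines):
    """Return (ruleset, number, target) for each 'No item to process' line."""
    hits = []
    for line in lines:
        parsed = LINE.match(line.strip())
        if not parsed or not parsed["logger"].endswith("QueueProcessor"):
            continue
        detail = NO_ITEM.match(parsed["message"])
        if detail:
            hits.append((detail["ruleset"], int(detail["number"]), detail["target"]))
    return hits


sample = [
    "30 Nov 2016 09:33:45,623 [PegaRULES-MasterAgent] (internal.async.AgentsForRuleset) "
    "DEBUG - Not a StartupAgentMyNewAgentName",
    "30 Nov 2016 09:34:00,019 [ PegaRULES-Batch-2] ( async.agent.QueueProcessor) "
    'DEBUG - No item to process: MyRuleSetName #7: WorkItemClass-NameHere.MyNewAgentActivityName"',
]
hits = agents_with_no_items(sample)
```

An agent that appears in `hits` on every run, yet is expected to do work, is a candidate for the Mode check described above.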
Here are answers to commonly asked questions about managing Agents in multi-node, clustered environments.
- To recognize changes made to both RAQ and DAQ instances:
- Do all nodes in a cluster that need the agent enabled require a restart?
- Or do all nodes in a cluster need to be restarted, regardless of where the agent is enabled?
For example, the agent is enabled on one node, but there are six nodes.
No: The master agent polls for changes and updates the appropriate DAQ instance.
- Is there a way to force (manually create) the DAQ instance without restarting the nodes to pick up changes made or migrated in a RAQ instance?
Not at this time.
- Is there a way to force (manually update) an existing DAQ instance without restarting the nodes to pick up changes made or migrated in a RAQ instance?
Not at this time.
- Are there any role-specific settings that an agent must have in order to work properly?
Not at this time.
- Is there specific debug logging that can be turned on for agent processing to determine if the agent ran successfully (performed what it was instructed to do) or not and to see the errors and warnings being reported?
What normally appears in the logs is an agent starting up, shutting down, or triggering an error exception.
Not all information (warnings or errors) is shown by default in the log files. Therefore, you need to enable logging to get more granular detail.
The best approach is to start with com.pega.pegarules.session.internal.async.agent.QueueProcessor to determine which agents are processing correctly.