Error - "Flow Not At Task" while running a flow.
Users have a flow with two assignments. The second assignment is configured with an SLA and is always processed by the system using resume flow. While running the flow, they intermittently encounter the error "Flow not at Task", and the case gets stuck at the second assignment.
"Flow not at Task"
Steps to Reproduce
The error is intermittent, occurring 3 to 5 times per day, and only in Production. It could not be replicated internally.
1. The flow has two assignments; the second assignment is configured with an SLA.
2. The second assignment (routed to a workbasket) is always processed by the system using resume flow.
3. Intermittently, the "Flow not at Task" error occurs and the case gets stuck at the second assignment, which is routed to a workbasket.
The root cause was established by adding logging for the deferred list. A PRPC Trigger was configured to execute a logging activity immediately whenever a Commit method occurs and the previous method for the instance was Obj-Save (Committed Save in the form). The Trigger activity logged the details of the DB deferred operations list to the PegaRULES log file, to show exactly what was being saved.
The logs produced when the issue occurred confirmed the incorrect save, as shown in the example extract below:
2015-02-02 13:19:41,422 [ WebContainer : 32] [1_WorkTab0] [ xxxx:xx.xx.xx] (xxxxxx) INFO xxxxxxxxx|xx.xxx.xxx.xx [email protected] *** Save of YYYYYY AAA-11111 with update timestamp 20150202T131941.188 GMT on page named pyWorkPage
2015-02-02 13:19:41,422 [ WebContainer : 32] [1_WorkTab0] [ xxxx:xx.xx.xx] (xxxxxx) INFO xxxxxxxxx|xx.xxx.xxx.xx [email protected] *** Save of YYYYYY AAA-11111 with update timestamp 20150202T094421.070 GMT on page named [NOTHING]
As the extract shows, the save operation for case AAA-11111 (redacted) was duplicated. At the same execution time (2015-02-02 13:19:41,422), a second save entry appears on the deferred list. This second entry, which overrides the first, is incorrect: its update timestamp is earlier (20150202T094421.070 GMT, before the expected 20150202T131941.188 GMT) and its associated page is empty.
This is the root cause of the problem: an unexpected and incorrect duplicate entry on the deferred save list is saved and committed to the database.
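When reviewing a large log extract, duplicate saves of this kind can be spotted mechanically. The following is a minimal sketch, not part of PRPC or the hot fix. It assumes the log lines follow the "*** Save of <class> <case ID> with update timestamp <timestamp> GMT on page named <page>" layout shown in the extract, and that the lines passed in all belong to a single commit. The class name DuplicateSaveScanner and the method findDuplicateSaves are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans diagnostic log lines for cases saved more than once.
public class DuplicateSaveScanner {

    // Assumed layout, taken from the extract above:
    // "*** Save of <class> <case ID> with update timestamp <ts> GMT on page named <page>"
    private static final Pattern SAVE_LINE = Pattern.compile(
            "\\*\\*\\* Save of (\\S+) (\\S+) with update timestamp (\\S+) GMT on page named (.+)$");

    /** Returns case ID -> logged update timestamps, only for case IDs that appear more than once. */
    public static Map<String, List<String>> findDuplicateSaves(List<String> logLines) {
        Map<String, List<String>> byCase = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = SAVE_LINE.matcher(line);
            if (m.find()) {
                // group(2) is the case ID, group(3) the update timestamp
                byCase.computeIfAbsent(m.group(2), k -> new ArrayList<>()).add(m.group(3));
            }
        }
        // Keep only the cases that were logged two or more times
        byCase.values().removeIf(timestamps -> timestamps.size() < 2);
        return byCase;
    }
}
```

Any case ID returned with more than one timestamp is a candidate for the duplicated deferred save described here; an older timestamp paired with an empty page name matches the faulty entry.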
The issue was resolved via the application of PRPC hot fix HFix-8658 for PRPC 6.2 SP1.
The hot fix provided updated versions of the following engine class files:
The changes made within these class files result in the following behaviour:
Upon activation (the term describing re-activation of a passivated requestor), check the deferred list to ensure that each pzInsKey appears only once in the list of deferred pages (there may be many pages, but each must have a unique pzInsKey).
If that is not the case:
1. Report the fact that a mismatch has been encountered.
2. Retain the first encountered page and drop the rest.
This change corrected the problem behaviour observed and recorded in the diagnostic logging, which had established that two entries for the same case were present on the deferred list.
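The check performed on activation can be illustrated with a short sketch. This is not the engine code delivered in the hot fix, which is not shown in this article. It is an assumption-based illustration that reads "retain the first encountered page and drop the rest" as applying per pzInsKey. The DeferredEntry type, its fields, and the keepFirstPerKey method are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: de-duplicates a deferred save list by instance key.
public class DeferredListCheck {

    /** Hypothetical stand-in for one entry on the deferred save list. */
    public static final class DeferredEntry {
        public final String insKey;          // plays the role of pzInsKey
        public final String updateTimestamp;
        public final String pageName;

        public DeferredEntry(String insKey, String updateTimestamp, String pageName) {
            this.insKey = insKey;
            this.updateTimestamp = updateTimestamp;
            this.pageName = pageName;
        }
    }

    /**
     * Keeps the first entry seen for each instance key and drops later ones,
     * adding one message to 'report' for every dropped entry.
     */
    public static List<DeferredEntry> keepFirstPerKey(List<DeferredEntry> deferred,
                                                      List<String> report) {
        Map<String, DeferredEntry> firstSeen = new LinkedHashMap<>();
        for (DeferredEntry entry : deferred) {
            // putIfAbsent returns the existing entry when the key is already present
            DeferredEntry kept = firstSeen.putIfAbsent(entry.insKey, entry);
            if (kept != null) {
                report.add("Mismatch: duplicate deferred entry for " + entry.insKey
                        + " (update timestamp " + entry.updateTimestamp + ") dropped");
            }
        }
        // LinkedHashMap preserves the original order of the retained entries
        return new ArrayList<>(firstSeen.values());
    }
}
```

Applied to the two entries in the log extract, this sketch would retain the entry on pyWorkPage with timestamp 20150202T131941.188 and drop the later entry with the older timestamp and empty page.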