Creating an external data flow run
You can specify where to run external data flows, and manage and monitor those runs, on the External processing tab of the Data Flows landing page. External data flows run in an external environment (data set) that is referenced by a Hadoop record on Pega Platform.
Before you can create an external data flow run, you must:
- Create a Hadoop record that references the external data set on which you want to run the data flow.
- Create an external data flow rule that you want to run on an external data set.
To specify where to run an external data flow:
In the header of Dev Studio, click.
On the form that opens, provide details about where to run the external data flow:
- Applies to – The class on which the external data flow is defined.
- Access group – An instance of the Data-Admin-Operator-AccessGroup rule.
- External data flow – The name of the external data flow rule that you want to use for external processing.
- Hadoop – The Data-Admin-Hadoop record instance where you want to run the data flow. This field is auto-populated with the Hadoop record that is configured as the source for the selected external data flow rule. You can configure multiple instances of a Hadoop record that point to the same external data set but have different run-time settings.
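The relationship between the form fields can be pictured with a minimal sketch. This is an illustrative model only, not Pega Platform code or an API: the class names, the function `build_run_request`, and the sample values are all hypothetical. It shows the Hadoop field defaulting to the record configured as the rule's source, and an alternative record (for example, one with different run-time settings) being chosen instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExternalDataFlowRule:
    """Hypothetical stand-in for an external data flow rule."""
    name: str
    applies_to: str      # class on which the rule is defined
    source_hadoop: str   # Hadoop record configured as the rule's source


@dataclass(frozen=True)
class RunRequest:
    """Hypothetical model of the values entered on the form."""
    applies_to: str
    access_group: str
    external_data_flow: str
    hadoop: str


def build_run_request(rule: ExternalDataFlowRule,
                      access_group: str,
                      hadoop: Optional[str] = None) -> RunRequest:
    """Fill the form fields; Hadoop defaults to the rule's source record."""
    return RunRequest(
        applies_to=rule.applies_to,
        access_group=access_group,
        external_data_flow=rule.name,
        # Mirrors the auto-populated field: use the source record
        # unless another Hadoop record is selected explicitly.
        hadoop=hadoop if hadoop is not None else rule.source_hadoop,
    )


# Sample values are invented for illustration.
rule = ExternalDataFlowRule("ScoreCustomers", "MyOrg-Data-Customer", "HadoopMain")
default_run = build_run_request(rule, "MyApp:Administrators")
tuned_run = build_run_request(rule, "MyApp:Administrators", hadoop="HadoopMainTuned")
```

In this sketch, `default_run` uses the rule's source record, while `tuned_run` targets a second record that is assumed to point to the same external data set with different run-time settings.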
Click Create. The run object is created and listed on the External processing tab.
In the External Data Flow Run window that is displayed, click Start to run the external data flow. In this window, you can view the details of the run. Depending on the current status of the external data flow, you can also stop or restart it from this window or on the External processing tab of the Data Flows landing page.
On the External processing tab, click a run object to monitor its status in the External Data Flow Run window.
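The idea that the available actions (start, stop, restart) depend on the current status of a run can be sketched as a simple lookup table. The status names and the mapping below are placeholders chosen for illustration; this page does not list the actual statuses or which action each one allows, so treat every value here as an assumption rather than Pega Platform behavior.

```python
from enum import Enum


class RunStatus(Enum):
    # Placeholder status names, not Pega Platform's actual values.
    NEW = "new"
    RUNNING = "running"
    STOPPED = "stopped"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed mapping from a run's status to the actions offered for it.
ALLOWED_ACTIONS = {
    RunStatus.NEW: {"start"},
    RunStatus.RUNNING: {"stop"},
    RunStatus.STOPPED: {"restart"},
    RunStatus.COMPLETED: {"restart"},
    RunStatus.FAILED: {"restart"},
}


def can_perform(status: RunStatus, action: str) -> bool:
    """Return True if the action is offered for a run in this status."""
    return action in ALLOWED_ACTIONS[status]
```

For example, under these assumptions `can_perform(RunStatus.NEW, "start")` is true, while `can_perform(RunStatus.RUNNING, "start")` is false.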
- Managing external data flow runs
You can manage existing external data flows on the External processing tab of the Data Flows landing page. For each external data flow, you can view its ID, the external data flow rule, the start and end time, the current execution stage, and the status information. You can also start, stop, or restart an external data flow, depending on its current status.
- External Data Flow Run window
You can monitor and manage each instance of running an external data flow from the External Data Flow Run window. This window gives you detailed information about each stage that an external data flow advances through to completion.