External Data Flow Run window

You can monitor and manage each instance of running an external data flow from the External Data Flow Run window. This window gives you detailed information about each stage that an external data flow advances through to completion.

Run settings

In this section, you can view the following information:

  • Data flow – The external data flow rule instance that is used in this run.
  • Hadoop – The Hadoop record that references the external data set where the external data flow rule instance is running.

Run details

In this section, you can view the following information:

  • Status – The status of running the external data flow. This field can have the following values:
    • New
    • Pending start
    • In progress
    • Completed
    • Pending stop
    • Stopped
    • Failed
  • Info – Additional feedback regarding the current status of running the external data flow. For example, this information can explain the cause of a run failure.
  • Overall progress – A bar that shows the progress of running the external data flow.
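The status values listed above can be modeled as a simple enumeration. The following is a minimal sketch, assuming a plain Python representation; the class and property names are illustrative and not part of any Pega Platform API. Only the status strings themselves come from this page:

```python
from enum import Enum


class EdfRunStatus(Enum):
    """Status values an external data flow run can report (from Run details)."""
    NEW = "New"
    PENDING_START = "Pending start"
    IN_PROGRESS = "In progress"
    COMPLETED = "Completed"
    PENDING_STOP = "Pending stop"
    STOPPED = "Stopped"
    FAILED = "Failed"

    @property
    def is_terminal(self) -> bool:
        # Completed, Stopped, and Failed are end states; the others are transient.
        return self in (EdfRunStatus.COMPLETED,
                        EdfRunStatus.STOPPED,
                        EdfRunStatus.FAILED)
```

Because the enum values match the display strings, a status read from the window (for example, "In progress") can be mapped back with `EdfRunStatus("In progress")`.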

Execution plan

In this section, you can view the following stages of running the external data flow:

  • Script generation – Generates the Pig Latin script. The Pig Latin script is a set of statements that reflects the configuration of the external data flow that you use as part of this run.
  • Resources preparation – Copies JAR resources from the Pega Platform engine to the Hadoop environment. You can view the Pig Latin script that was generated for running this external data flow.
  • Deployment – Launches the YARN application that deploys the external data flow in the Hadoop environment. You can view the YARN Application Master ID for the application that runs the external data flow in the Hadoop environment.
  • Script execution – Runs the external data flow by executing the Pig Latin script in the Hadoop environment. You can monitor whether this stage completed successfully.
  • Cleanup – Removes all resources that were deployed as part of running the external data flow from the Hadoop environment. These resources include the YARN application launcher, the working directory, the Pega Platform JAR resources, and so on.
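The five stages above run in order, with Cleanup removing deployed resources whether or not Script execution succeeds. A minimal sketch of that flow, assuming each stage is a caller-supplied callable (the function names are placeholders, not Pega Platform APIs):

```python
def run_external_data_flow(generate_script, prepare_resources,
                           deploy, execute, cleanup):
    """Run the execution-plan stages in order; always attempt cleanup.

    Each argument is a callable standing in for one stage of the plan.
    Returns the list of stage names that completed.
    """
    completed = []
    try:
        generate_script()        # Script generation: build the Pig Latin script
        completed.append("Script generation")
        prepare_resources()      # Resources preparation: copy JAR resources to Hadoop
        completed.append("Resources preparation")
        deploy()                 # Deployment: launch the YARN application
        completed.append("Deployment")
        execute()                # Script execution: run the Pig Latin script
        completed.append("Script execution")
    finally:
        cleanup()                # Cleanup: remove deployed resources from Hadoop
        completed.append("Cleanup")
    return completed
```

The `try`/`finally` structure mirrors the behavior described on this page: resources deployed to the Hadoop environment are removed even when an earlier stage fails.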

  • Creating an external data flow run

    You can specify where external data flows run, and manage and monitor their runs, on the External processing tab of the Data Flows landing page. External data flows run in an external environment (data set) that is referenced by a Hadoop record in Pega Platform.

  • Managing external data flow runs

    You can manage existing external data flows on the External processing tab of the Data Flows landing page. For each external data flow, you can view its ID, the external data flow rule, the start and end time, the current execution stage, and status information. You can also start, stop, or restart an external data flow, depending on its current status.
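Which of the start, stop, and restart actions is available depends on the run's current status. This page does not list the exact rules, so the mapping below is an assumption for illustration only, sketched in Python:

```python
# Illustrative only: the page states that available actions depend on the
# run's status but does not specify the rules, so this mapping is an
# assumption, not documented Pega Platform behavior.
AVAILABLE_ACTIONS = {
    "New": {"start"},
    "Pending start": set(),
    "In progress": {"stop"},
    "Completed": {"restart"},
    "Pending stop": set(),
    "Stopped": {"start", "restart"},
    "Failed": {"restart"},
}


def actions_for(status: str) -> set:
    """Return the actions assumed to be available for a given run status."""
    return AVAILABLE_ACTIONS.get(status, set())
```

A hypothetical caller would consult this mapping before enabling the corresponding controls, falling back to no actions for an unrecognized status.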
