Processing data with data flows
Data flows are scalable and resilient data pipelines that you can use to ingest, process, and move data from one or more sources to one or more destinations. Each data flow consists of components that transform data in the pipeline and enrich data processing with event strategies, strategies, and text analysis. The components run concurrently to handle data starting from the source and ending at the destination.
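As a conceptual illustration only, the source-to-destination idea can be sketched in plain Python. This is not Pega Platform code or API. The stage names (`source`, `filter_stage`, `enrich_stage`, `run_flow`), the record fields, and the balance threshold are all hypothetical. The sketch also streams records one at a time through chained generators and does not model the concurrent execution of components.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def source(records: Iterable[Record]) -> Iterator[Record]:
    """Hypothetical source stage: emit records one at a time."""
    for record in records:
        yield record


def filter_stage(
    records: Iterable[Record], predicate: Callable[[Record], bool]
) -> Iterator[Record]:
    """Pass through only the records that satisfy the predicate."""
    for record in records:
        if predicate(record):
            yield record


def enrich_stage(records: Iterable[Record]) -> Iterator[Record]:
    """Add a derived field to each record (assumed threshold of 1000)."""
    for record in records:
        segment = "high" if record["balance"] >= 1000 else "standard"
        yield {**record, "segment": segment}


def run_flow(input_records: Iterable[Record], destination: List[Record]) -> int:
    """Chain the stages and write every resulting record to the destination."""
    pipeline = enrich_stage(
        filter_stage(source(input_records), lambda r: bool(r["active"]))
    )
    processed = 0
    for record in pipeline:
        destination.append(record)
        processed += 1
    return processed


# Illustrative input data; the field names are assumptions.
customers = [
    {"id": 1, "active": True, "balance": 2500},
    {"id": 2, "active": False, "balance": 400},
    {"id": 3, "active": True, "balance": 120},
]
results: List[Record] = []
processed = run_flow(customers, results)
```

Each stage consumes the output of the previous one, so a record moves from the source through the filter and enrichment steps before it is written to the destination list.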
- Creating a data flow
Create a data flow to process and move data between data sources. Customize your data flow by adding data flow shapes and by referencing other business rules to perform more complex data operations. For example, a simple data flow can move data from a single data set, apply a filter, and save the results in a different data set. More complex data flows can be sourced by other data flows, can apply strategies for data processing, and can open a case or trigger an activity as the final outcome of the data flow.
- Creating external data flows
External Data Flow (EDF) is a rule for defining the flow of data on the graphical canvas and executing that flow on an external system. With EDF, you can run predictive analytics models in a Hadoop environment and use its infrastructure to process large numbers of records, which limits the data transfer between Hadoop and the Pega Platform.
- Making decisions in data flow runs
- Managing data flow runs
Control record processing in your application by starting, stopping, or restarting data flows. Monitor data flow status to achieve a better understanding of data flow performance.
- Data flow methods
Data flows can be run, monitored, and managed through a rule-based API. Data-Decision-DDFRunOptions is the container class for the API rules and provides the properties required to programmatically configure data flow runs. Additionally, with the DataFlow-Execute method you can perform several operations, which depend on the design of the data flow that you invoke.
- Decision data methods
Decision data records are designed to be run through a rule-based API. When you run a decision data record, you test the data that it provides.
- External data flow methods
External data flows can be run, monitored, and managed through a rule-based API. Data-Decision-EDF-RunOptions and Pega-DM-EDF-Work are the container classes for the API rules, and provide the properties required to programmatically configure external data flow runs.