Configuring the Data Flow service

The Data Flow service enables running data flow instances on decision data nodes. To enable the Data Flow service, you must add at least one decision data node on the Data Flow tab of the Services landing page. You can add more decision data nodes to scale the processing of data. Each new data flow instance is processed on all of the nodes that are listed on this tab.

  1. In the header of Dev Studio, click Configure > Decisioning > Infrastructure > Services > Data Flow.
    Note: Test runs of data flows are always processed on the local Pega Platform node, so you do not configure the Data Flow service for test runs.
  2. Select a type of service.
    Nodes in the Batch service process the batch runs of data flows. Nodes in the Real Time service process the real-time runs of data flows. Each service uses its own set of nodes, independent of the other.
  3. Add a node.
    1. Click Add node.
    2. In the window that is displayed, select the check box for the decision data node that you want to add. You can add multiple nodes at once.
    3. Click Submit.
    4. Refresh the landing page until the status of the node changes to NORMAL.
  4. Optional: Click Edit settings to change settings for data flow nodes.
    1. Specify the number of Pega Platform threads that are assigned to running data flows. By default, the thread count is the number of threads created by Pega Platform.
    Note: The number of threads for running data flows is the same across all decision data nodes that are configured for the Data Flow service.
  5. Optional: Display the details for a particular node by clicking the row for that node. You can view the node ID number, status details, and the number of active data flow runs.
  6. Optional: Manage a node by selecting an action from the Execute menu.
    Note: If you decommission a node that has active data flow runs, the status of that node changes to LEAVING, and it is not decommissioned until all active data flow runs are finished.
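
The decommissioning behavior described in the note for step 6 can be pictured as a small state model. The following Python sketch is illustrative only and is not Pega Platform code or a Pega API: the class DataFlowNode, its methods, and the DECOMMISSIONED state name are hypothetical. Only the NORMAL and LEAVING statuses come from the procedure above.

```python
from dataclasses import dataclass
from enum import Enum


class NodeStatus(Enum):
    NORMAL = "NORMAL"                  # status named in the procedure
    LEAVING = "LEAVING"                # status named in the procedure
    DECOMMISSIONED = "DECOMMISSIONED"  # hypothetical label for the final state


@dataclass
class DataFlowNode:
    """Hypothetical model of a node in the Data Flow service."""
    node_id: str
    active_runs: int = 0
    status: NodeStatus = NodeStatus.NORMAL

    def decommission(self) -> None:
        """Request decommissioning; stay in LEAVING while runs are active."""
        if self.active_runs > 0:
            self.status = NodeStatus.LEAVING
        else:
            self.status = NodeStatus.DECOMMISSIONED

    def finish_run(self) -> None:
        """Finish one active run and complete a pending decommission."""
        if self.active_runs > 0:
            self.active_runs -= 1
        if self.status is NodeStatus.LEAVING and self.active_runs == 0:
            self.status = NodeStatus.DECOMMISSIONED


# Usage: a node with two active runs is decommissioned only after both finish.
node = DataFlowNode("node-1", active_runs=2)
node.decommission()   # status becomes LEAVING
node.finish_run()     # one run still active, status stays LEAVING
node.finish_run()     # no active runs remain, decommission completes
```

In this model, a node with no active runs is decommissioned immediately, whereas a busy node waits in LEAVING until its last run ends, which mirrors the behavior that the note describes.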