The Data Flow service enables running data flow work items on decision data nodes. To enable the Data Flow service, you must add at least one decision data node on the Data Flow tab of the Services landing page. You can add more decision data nodes to scale data processing. Each new data flow work item is processed on all nodes that are listed on this tab.
Test runs of data flows are always processed on the local Pega Platform node, so you do not configure the Data Flow service for test runs.
Add a node.
Specify the number of Pega Platform threads that are assigned to running data flows, and the batch scalability factor that allows idle threads to be used for running data flows.
For example, when the source of a data flow is divided into five partitions, the data flow run is divided into five assignments that can be processed simultaneously on separate threads if there are enough threads.
The number of available threads is calculated by multiplying the per-node thread count by the number of nodes. With two nodes and five threads per node, 10 threads are available: the data flow run uses five threads and five threads remain idle. If you set the batch scalability factor to two, all 10 threads are used to process the five assignments.
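The following Java snippet is a minimal sketch of this arithmetic only; it does not use any Pega API, and the variable names (threadCountPerNode, scalabilityFactor, and so on) are assumptions used to reproduce the two-node, five-thread example above.

import static java.lang.Math.min;

public class DataFlowThreadMath {
    public static void main(String[] args) {
        int threadCountPerNode = 5;  // threads configured for the Data Flow service on each node
        int nodeCount = 2;           // decision data nodes listed on the Data Flow tab
        int partitions = 5;          // source partitions, which become assignments
        int scalabilityFactor = 2;   // batch scalability factor

        int availableThreads = threadCountPerNode * nodeCount;                        // 5 * 2 = 10
        int usedWithoutFactor = min(partitions, availableThreads);                    // 5 threads used
        int usedWithFactor = min(partitions * scalabilityFactor, availableThreads);   // 10 threads used

        System.out.println("Available threads: " + availableThreads);
        System.out.println("Used without scalability factor: " + usedWithoutFactor
                + " (idle: " + (availableThreads - usedWithoutFactor) + ")");
        System.out.println("Used with scalability factor " + scalabilityFactor + ": " + usedWithFactor);
    }
}

Running the sketch prints 10 available threads, 5 used (5 idle) without the scalability factor, and all 10 used once the factor is set to 2, matching the example above.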
Enter the number of threads.
The number of threads for running data flows is the same across all decision data nodes that are configured for the Data Flow service.
Enter the batch scalability factor.
If you decommission a node that has active data flow runs, the status of that node changes to LEAVING, and it is not decommissioned until all active data flow runs are finished.
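To illustrate this draining behavior, here is a purely conceptual Java sketch, not a Pega API; the NodeStatus values, the activeRuns counter, and the decommission method are assumptions that only model the LEAVING state described above.

import java.util.concurrent.atomic.AtomicInteger;

public class NodeDrainSketch {
    // Hypothetical statuses that model the documented behavior.
    enum NodeStatus { NORMAL, LEAVING, DECOMMISSIONED }

    private volatile NodeStatus status = NodeStatus.NORMAL;
    private final AtomicInteger activeRuns = new AtomicInteger(3); // pretend 3 runs are in flight

    void decommission() throws InterruptedException {
        status = NodeStatus.LEAVING;      // the node is marked LEAVING immediately...
        while (activeRuns.get() > 0) {    // ...but is not removed until active runs finish
            Thread.sleep(50);
        }
        status = NodeStatus.DECOMMISSIONED;
    }

    public static void main(String[] args) throws InterruptedException {
        NodeDrainSketch node = new NodeDrainSketch();
        // Simulate the active data flow runs finishing on another thread.
        new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                try { Thread.sleep(100); } catch (InterruptedException e) { return; }
                node.activeRuns.decrementAndGet();
            }
        }).start();
        node.decommission();
        System.out.println("Final status: " + node.status); // DECOMMISSIONED
    }
}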