Support Article

Partitioning does not work for data flows

SA-56764

Summary

After you set the partition key and run a data flow that is based on a data set, the data flow is not partitioned.

Error Messages

Not Applicable

Steps to Reproduce

Create a database table.
Map the data set to the database table.
Create a data flow that reads from the data set.
Set the partition key and run the data flow.

Root Cause

An issue in the custom application code or rules.

Resolution

Here’s the explanation for the reported behavior:

Partitioning depends not only on the number of nodes but also on whether the hardware on each node (CPU, memory, and disk I/O) is powerful enough to run multiple threads.
Data flow execution splits the work into multiple assignments, and the partition key is one of the factors that determine the split. A higher number of partitions is preferable.

The following formula limits the number of assignments that are created, for better parallelism:

Number of assignments = Number of nodes in the data flow cluster * Number of configured threads * 2

Ensure that the configuration (such as the number of assignments to process, the number of nodes, and the number of threads) is consistent with this formula.
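
As an illustration, the formula can be worked through with a short Python sketch. This is not product code. The function names and the sample values (2 nodes, 5 threads, 8 partitions) are hypothetical. The second function assumes that the number of assignments cannot exceed the number of available partitions, which this article does not state explicitly.

```python
def max_assignments(nodes: int, threads_per_node: int) -> int:
    """Apply the formula from this article: nodes * configured threads * 2."""
    if nodes < 1 or threads_per_node < 1:
        raise ValueError("nodes and threads_per_node must be positive")
    return nodes * threads_per_node * 2


def usable_assignments(partitions: int, nodes: int, threads_per_node: int) -> int:
    """Assumption (not from the article): the number of assignments cannot
    exceed the number of partitions that the partition key produces."""
    if partitions < 1:
        raise ValueError("partitions must be positive")
    return min(partitions, max_assignments(nodes, threads_per_node))


if __name__ == "__main__":
    # Hypothetical cluster: 2 data flow nodes, 5 configured threads each.
    limit = max_assignments(nodes=2, threads_per_node=5)
    print(f"Assignment limit from the formula: {limit}")

    # Hypothetical data set whose partition key yields only 8 partitions.
    print(f"Usable assignments with 8 partitions: {usable_assignments(8, 2, 5)}")
```

Under that assumption, a partition key that produces fewer partitions than the limit from the formula would leave some threads without work. This is one way to read the recommendation to have a higher number of partitions.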

Published August 24, 2018 - Updated December 2, 2021
