I have a Data Flow Task that moves a bunch of data from multiple sources to multiple destinations, about 50 in all. The data moves from one database to another, with varying row counts and column sets in each flow.
While I believe I understand the basic idea behind the Data Flow Task's DefaultBufferMaxRows and DefaultBufferSize as it relates to Rows per Batch and Maximum insert commit size of the Destination, it's not clear to me what happens when there are multiple unrelated source and destination flows.
What I'm wondering is which of the following makes the most sense:
- Divide out all the source and destination flows into separate Data Flow Tasks
- Divide them into groups that have roughly the same size and number of rows
- Leave it as is and just make sure DefaultBufferMaxRows and DefaultBufferSize are set high enough, while setting Rows per batch and Maximum insert commit size on each individual destination
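To make the third option concrete, here's roughly how I understand the properties relate (the numbers shown are just the SSIS defaults, not my actual values):

```
Data Flow Task properties (apply to the whole task):
  DefaultBufferMaxRows = 10000       -- max rows per pipeline buffer
  DefaultBufferSize    = 10485760    -- max bytes per buffer (10 MB)

OLE DB Destination properties (set per destination):
  Rows per batch             = <empty>      -- hint passed to the destination
  Maximum insert commit size = 2147483647   -- rows per commit (default: commit all at once)
```

My uncertainty is whether the task-level buffer settings are shared or negotiated in some way across the unrelated flows in the same task, or whether each execution tree gets its own buffers sized from these values independently.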
I believe I read somewhere that it's better to have each source and destination in its own Data Flow Task, but I am unable to find the link at this time.
Most examples I've been able to locate online seem to always be for one source to one or more destinations, or just one to one.