
I have a package that pulls records from an Oracle database into our SQL Server database using a Data Flow Task. The package runs every day and takes around 45 minutes. It refreshes about 15 tables; except for one, they are all incremental updates, so almost every task finishes in 2 to 10 minutes.

The one table that is a full replacement takes up to 25 minutes. I want to tune this Data Flow Task to run faster.

There are only about 400k rows in the table. I have read some articles about DefaultBufferSize and DefaultBufferMaxRows, and I have the following doubts:

  1. I understand DefaultBufferSize can be set up to 100 MB. Is there any place to look, or a way to analyse, how much I can safely provide?

  2. DefaultBufferMaxRows is set to 10k. If I raise it to 50k but DefaultBufferSize is 10 MB, which can only hold around 20k rows, what will SSIS do? Will it just ignore the other 30k rows, or will it still pull all 50k rows (spooling)?

  3. Can I use the logging options to work out the proper limits?

Maximus
  • Thanks for your response. I agree that approach will surely help me some other day, but that article doesn't answer my questions. If you know anything about DefaultBufferSize and DefaultBufferMaxRows, please reply back. – Maximus Feb 28 '13 at 12:40
  • Most readers here will not know what a *Lakh* is. (100,000 if I recall correctly). – RBarryYoung Feb 28 '13 at 16:19
  • FYI, in my experience `DefaultBufferSize = 10000` is usually too *high* for network transfers, not too low. Try setting it down to 100. – RBarryYoung Feb 28 '13 at 16:22
  • Hi Young, yes you are right. Only about 2.5k records are transferred in each buffer. So my guess is that it doesn't matter how large we make DefaultBufferMaxRows unless we set it to a very low number of rows. – Maximus Mar 01 '13 at 04:15

1 Answer


As a general practice (and if you have enough memory), a smaller number of large buffers is better than a larger number of small buffers, but only up to the point where you start paging to disk (which is bad for obvious reasons).

To test it, you can log the event BufferSizeTuning, which will show you how many rows are in each buffer.
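As a hedged aside (not part of the original answer): if the package is configured with the SSIS log provider for SQL Server, the BufferSizeTuning entries end up in the dbo.sysssislog table of the logging database, and a short script can pull them back out. The server, database, and driver names below are placeholders, and the event-name filter is kept loose because the event may be recorded with a `User:` prefix.

```python
# Minimal sketch: read BufferSizeTuning entries written by the SSIS log
# provider for SQL Server (table dbo.sysssislog). The connection details are
# placeholders for your own environment.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=my_sql_server;DATABASE=SSISLogDB;Trusted_Connection=yes;"
)

query = """
SELECT source, starttime, message
FROM dbo.sysssislog
WHERE event LIKE '%BufferSizeTuning%'  -- may appear as 'User:BufferSizeTuning'
ORDER BY starttime DESC;
"""

# Each message states how many rows the engine decided to put in buffers of a
# given type and why it adjusted that number.
for row in conn.cursor().execute(query):
    print(row.starttime, row.source, row.message)
```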

Also, before you begin adjusting the sizing of the buffers, the most important improvement that you can make is to reduce the size of each row of data by removing unneeded columns and by configuring data types appropriately.
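To make the sizing trade-off concrete, here is a rough back-of-the-envelope sketch (my own illustration, not the engine's exact algorithm): the engine estimates a row width from the column metadata and then places roughly min(DefaultBufferMaxRows, DefaultBufferSize ÷ row width) rows in each buffer, so a DefaultBufferMaxRows value larger than what fits is simply never reached; rows are not dropped or spooled because of it.

```python
# Back-of-the-envelope approximation of rows per pipeline buffer.
# This illustrates the trade-off; it is not the exact SSIS algorithm.

def rows_per_buffer(estimated_row_bytes: int,
                    default_buffer_max_rows: int = 10_000,
                    default_buffer_size: int = 10 * 1024 * 1024) -> int:
    """Approximate number of rows the engine will put in one buffer."""
    rows_that_fit = default_buffer_size // estimated_row_bytes
    return min(default_buffer_max_rows, rows_that_fit)

# A wide ~4 KB row with the default settings gives about 2,560 rows per
# buffer, close to the ~2.5k rows the asker observed:
print(rows_per_buffer(4_096))                              # 2560

# Raising DefaultBufferMaxRows to 50,000 changes nothing unless
# DefaultBufferSize grows as well (100 MB is the configurable maximum here):
print(rows_per_buffer(4_096, 50_000))                      # 2560
print(rows_per_buffer(4_096, 50_000, 100 * 1024 * 1024))   # 25600
```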

Diego
  • Hi Diego, when I was watching the package execution I could see around 2.5k records transferred in every batch (we can watch the row count increment). The server has 3 GB free space on the C drive and 28 GB free space on the D drive. Please advise me. And yes, I only used the required columns in the DFT. – Maximus Mar 01 '13 at 04:17
  • Also, this is the only package or job running at this time. – Maximus Mar 01 '13 at 05:28
  • Can anyone please answer this question? – Maximus Mar 05 '13 at 05:25
  • Free space on the drive should not affect this setting; you are working with memory, not space on your physical drive. – Diego Mar 07 '13 at 13:46
  • My direct question is: if I set DefaultBufferSize to 50 MB and imagine it runs out of memory, what will happen? Will the package fail, or will it give a warning? Is there any way I can see the exceeded memory size? – Maximus Mar 12 '13 at 05:19
  • No, you won't. There is a thing in SSIS called the "backpressure mechanism". It is used when the source or an asynchronous transformation is too fast compared to the next transformation: it is suspended once it creates too many buffers (currently the limit is 5), so you can process fast sources without worrying about opening too many buffers and eventually running out of memory (see the sketch after these comments). – Diego Mar 12 '13 at 09:07
  • Thanks Diego. I was asking about spooling. How can I know whether the data is spooling? – Maximus Mar 12 '13 at 09:49
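To make the backpressure idea from the comments above concrete, here is a purely illustrative sketch (not SSIS internals): a fast producer pushes "buffers" into a bounded queue and blocks once five of them are outstanding, so memory use stays bounded and nothing needs to spool to disk while the slower consumer catches up. The buffer limit of five is taken from the comment above; everything else is invented for illustration.

```python
# Toy illustration of backpressure: a bounded queue of at most 5 in-flight
# "buffers" forces a fast source to wait for a slower destination instead of
# allocating memory without limit. This is an analogy, not SSIS code.
import queue
import threading
import time

BUFFER_LIMIT = 5                      # in-flight buffer cap from the comment above
buffers = queue.Queue(maxsize=BUFFER_LIMIT)

def fast_source():
    for i in range(20):
        buffers.put(f"buffer {i}")    # blocks while 5 buffers are outstanding
        print(f"produced buffer {i}")
    buffers.put(None)                 # signal end of the rowset

def slow_destination():
    while True:
        item = buffers.get()
        if item is None:
            break
        time.sleep(0.1)               # the slow side of the pipeline
        print(f"consumed {item}")

producer = threading.Thread(target=fast_source)
consumer = threading.Thread(target=slow_destination)
producer.start()
consumer.start()
producer.join()
consumer.join()
```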