3

I just started using the SSIS tool and I need quick help to load data quarterly

Here`s my scenario:

I came up with a query ( source Database: DB2 ) that will extract data from 2010-01-01 to 2021-12-31,(11 years of data) however the data volume is too high ( around 300 M), so I would like to split the data source query to load data into quarter wise. I tried year wise and still, I am getting more volume of data which my SSIS server is not able to handle. I have created a year loop to loop it through, in that created a script task into it followed by a data flow task.

For example,

select * from tab1 where start_date >= '2010-01-01' and end_Date <= '2010-12-31'

This I would like to loop it as, ( 4 times load, 1 for each quarter)


select from tab1 where start_date >= '2010-01-01' and end_Date <= '2010-03-31'
select from tab1 where start_date >= '2010-04-01' and end_Date <= '2010-06-30'
select from tab1 where start_date >= '2010-07-01' and end_Date <= '2010-09-30'
select from tab1 where start_date >= '2010-10-01' and end_Date <= '2010-12-31'

Year-wise perfectly works fine, however, I am not getting any idea how do I need to load the data into quarter-wise. I want to pass each quarter parameters to the source query as parameters, so overall I need to loop to 48 times ( 2010 to 2021 = 11 yrs * 4 quarters) Any help is greatly appreciated. I can send screenshots of what I have created for the year loop which is working perfectly fine.

Ben Thul
  • 31,080
  • 4
  • 45
  • 68
  • "data volume is too high... which my SSIS server is not able to handle" - based on what metric? I've funneled billions of rows through SSIS in one shot before without an issue. – Ben Thul Jan 27 '22 at 00:10
  • Are you sure you have to filter on BOTH start and end dates, that both are in the same year/quarter, else you will miss some lines ? I think using limit/offset is a better solution but you can filter a quarter using DB2 something like `this_quarter(start_date) = date '2018-04-01'` – nfgl Jan 27 '22 at 09:56
  • Usually you would target a single column to get values in a range. Something like updateDate between [@start] and [@end] – KeithL Jan 27 '22 at 17:27

1 Answers1

1

I think the solution is to use the OFFSET FETCH clause to iterate over data. Why looping over data quarterly while using a number of rows is more precise (each iteration will handle the same amount of data). A step-by-step guide is provided in the following article:

One thing worth mentioning is that the article handles an SQL Server source, while you are using DB2. Then you should take into consideration any syntax difference while using the OFFSET FETCH clause:


Similar questions:

Hadi
  • 36,233
  • 13
  • 65
  • 124