
Overview: I am running an SSIS package to copy data from SQL Server 2016 to pipe-delimited flat files with quoted identifiers and escape-sequence handling. I am using a Script Component as the source in a Data Flow Task to write the data to the flat file, because I have to copy data from more than 100 tables with different schemas and this was the only dynamic approach that worked for me. The data volume is also huge: most of the tables have 100+ columns and 5 million+ records. My master package calls the same child package 12 times in parallel for different tables (managed through SQL tables and parameters), and on top of that the child package creates the files in batches, implemented using For Loop containers and parameters.
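
For context, the core of that Script Component follows roughly this pattern (shown here as a standalone sketch; the connection string, query, batch filter, and file path are hypothetical, since the real package resolves them from SQL tables and package parameters):

```csharp
using System;
using System.Data.SqlClient;
using System.IO;
using System.Text;

class Export
{
    // Quote every field and double any embedded quotes (" -> "") so the
    // pipe delimiter and line breaks inside values survive the round trip.
    static string Quote(object value)
    {
        string s = value == DBNull.Value ? "" : value.ToString();
        return "\"" + s.Replace("\"", "\"\"") + "\"";
    }

    static void Main()
    {
        // Hypothetical names -- the real package resolves table, batch,
        // and file path from SQL tables and package parameters.
        const string conn = "Server=.;Database=SourceDb;Integrated Security=true;";
        const string sql  = "SELECT * FROM dbo.a WHERE BatchId = 1;";
        const string path = @"\\fileserver\exports\a_batch1.txt";

        using (var cn = new SqlConnection(conn))
        using (var cmd = new SqlCommand(sql, cn))
        // A large buffer reduces round trips to a network share; the
        // periodic Flush() below makes progress visible in the file.
        using (var sw = new StreamWriter(path, false, Encoding.UTF8, 1 << 20))
        {
            cn.Open();
            using (var rdr = cmd.ExecuteReader())
            {
                long n = 0;
                var sb = new StringBuilder();
                while (rdr.Read())
                {
                    sb.Clear();
                    for (int i = 0; i < rdr.FieldCount; i++)
                    {
                        if (i > 0) sb.Append('|');
                        sb.Append(Quote(rdr.GetValue(i)));
                    }
                    sw.WriteLine(sb.ToString());
                    if (++n % 50000 == 0) sw.Flush();
                }
            }
        }
    }
}
```

The periodic Flush() is what makes rows appear in the file as they are processed; without it, the writer's buffer and the OS cache decide when data becomes visible on disk.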

Problem Statement: When I run the package from SSDT it starts writing data to the file immediately, as records are processed. But when I run the same package via a SQL Server Agent job, it takes a lot longer and writes the data to the file only once all records have been processed.

Example: Say table 'a' has 4 million records and I am generating 4 files of 1 million records each. With the same parameters, SSDT starts writing rows to the file in chunks of ~50K-60K (possibly depending on buffer size) as soon as they are processed, but the same package with the same configuration, run from a SQL Server Agent job, processes all 1 million records first and then tries to write them all at once.

Issue: Because it writes 1 million records in one go, file creation takes a long time: ~5-10 minutes just to write 1 million records, varying with the number of columns in the table. From SSDT the same table is much quicker, ~2-5 minutes.
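
One way to separate SSIS from raw I/O is to time the same write pattern outside the package, once against a disk local to the server running the job and once against the network share. This is only a diagnostic sketch; both paths, the row count, and the flush interval are assumptions:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

class WriteTiming
{
    // Write `rows` synthetic pipe-delimited lines to `path`, flushing
    // every 50K rows, and return the elapsed time.
    static TimeSpan TimeWrite(string path, int rows)
    {
        var clock = Stopwatch.StartNew();
        using (var w = new StreamWriter(path, false, Encoding.UTF8, 1 << 20))
        {
            for (int i = 0; i < rows; i++)
            {
                w.WriteLine("\"" + i + "\"|\"value " + i + "\"");
                if (i % 50000 == 0) w.Flush();
            }
        }
        return clock.Elapsed;
    }

    static void Main()
    {
        // Hypothetical paths: one local to the server running the job,
        // one on the share the package targets.
        Console.WriteLine("local : " + TimeWrite(@"D:\staging\timing.txt", 1000000));
        Console.WriteLine("share : " + TimeWrite(@"\\fileserver\exports\timing.txt", 1000000));
    }
}
```

If the share run is dramatically slower, the bottleneck is the destination I/O rather than the package or the Agent.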

Can anybody suggest what settings I should check to make this run faster? The table the data is selected from is well indexed, and the same query for 1 million records takes ~2-4 minutes when run against the database in SSMS.

  • Are SSDT and the SQL Server on different hosts? Is the target flat file on a different or same host? – Nick.Mc Sep 12 '19 at 00:46
  • 1
  • For a table that takes 5-10 minutes, where is it spending the time? Make a single run with the Logging level turned up to Verbose and then look at the component timings. Otherwise, we'll need to see more of your code and environment to diagnose the problem. It could be that the server is overtaxed and when you run N packages in parallel, the data flow task spills to disk. Your local setup doesn't have this problem. It could be, as Nick.McDermaid implies, that your IO/network system is to blame. Too many variables at the moment, so see if you can't narrow the scope. – billinkc Sep 12 '19 at 01:55
  • 1
  • The other thing: if it's truly just `select * from table` to a flat file destination, that's probably less than an hour's worth of work to generate individual packages using Biml. – billinkc Sep 12 '19 at 01:57
  • @Nick.McDermaid, yes, SSDT and SQL Server are on different machines, and the flat file destination is on a different host as well. – BIDeveloper Sep 12 '19 at 10:18
  • @billinkc, the 5-10 mins covers selecting the data from the database and writing it to the flat file, and the timings above are for a single package, not packages running in parallel. Verbose logging is already enabled, and from it I learned that it stalls in the Data Flow Task while writing the file. – BIDeveloper Sep 12 '19 at 10:23
  • So your two different cases have different network configurations. I suggest that you first try writing to a file local to the SQL Server and see if that is quicker. – Nick.Mc Sep 13 '19 at 00:36
  • @Nick.McDermaid, I have tried that, and yes, it writes much faster locally than to a location on the network. – BIDeveloper Sep 16 '19 at 10:15
  • I suggest you now copy the file and see if a local export + copy is quicker than trying to export to a remote share. – Nick.Mc Sep 16 '19 at 10:16
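
For reference, the local export + copy pattern suggested in the last comment looks roughly like this (both paths are hypothetical):

```csharp
using System.IO;

class LocalThenCopy
{
    static void Main()
    {
        // Hypothetical paths -- substitute the package's real targets.
        const string local  = @"D:\staging\a_batch1.txt";
        const string remote = @"\\fileserver\exports\a_batch1.txt";

        // 1. Run the export (e.g. the first sketch above) against the
        //    local path, so every buffer flush hits a local disk instead
        //    of the network. A placeholder file stands in for it here.
        File.WriteAllText(local, "\"1\"|\"sample row\"\r\n");

        // 2. Ship the finished file to the share in one streamed copy,
        //    which is far cheaper than many small remote writes.
        File.Copy(local, remote, overwrite: true);
        File.Delete(local);
    }
}
```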
