
I am generating a CSV file from a SQL Server view with PowerShell so that users can download it from our ASP.NET Web Forms site:

PowerShell script:

Push-Location $PWD
Import-Module -Name SQLPS
# SQLPS changes the current location on import, hence Push-/Pop-Location;
# the 'SELECT 1' appears to be a throwaway warm-up query.
Invoke-Sqlcmd -Query 'SELECT 1' -Database *** -Username '***' -Password '***' -ServerInstance localhost | Out-Null
Pop-Location
Invoke-Sqlcmd -Query 'query here' -Database *** -Username '***' -Password '***' -ServerInstance *** |
    Export-Csv -NoTypeInformation -Path 'C:\****\products.csv' -Encoding UTF8

C# code:

using System.Collections.ObjectModel;
using System.Management.Automation;

// Runs the script in an in-process PowerShell runspace.
using (PowerShell PowerShellInstance = PowerShell.Create())
{
    PowerShellInstance.AddScript(powerShellScript);
    Collection<PSObject> PSOutput = PowerShellInstance.Invoke();
}

There are 71,784 rows in the view and the CSV file is about 12 MB. Generating the file from the Web Forms site takes about 15 seconds, during which CPU usage increases by about 40%.

When the same script is run directly from the PowerShell console, it seems that the culprit is actually PowerShell itself:

[screenshot of CPU usage during the run omitted]

Obviously this is not acceptable for production. We have hundreds of users who will need to generate this file, each with user-specific parameters (I cannot serve the same file to everyone).

So far I have tried:

Thread thread1 = new Thread(builder.GenerateCsvFile); 
thread1.Priority = System.Threading.ThreadPriority.Lowest;
thread1.Start(); 

I am not sure that using a Thread is a good idea in the first place; in any case, CPU usage is about the same when using the Thread. Any advice on how to decrease CPU usage from C# code, or any other low-resource approach to this?

  • Rather than delegating this to the (somewhat black-box) `Invoke-Sqlcmd/Export-Csv` in PS, why not generate the CSV from code in your webapp that queries the DB directly. Then you'll be able to measure hot code and make optimizations. – spender Oct 05 '18 at 15:31
  • @spender I concluded that SQL server should do the job faster than me doing the same stuff from code. – Nick Oct 05 '18 at 15:32
  • So this ticket sounds like it could be closed. Please write your solution and close this :) – ArcSet Oct 05 '18 at 15:54
  • What you want to do for this to be efficient is read directly from the server in the C# code, and write directly into the response stream, where nothing ever touches the file system. That means "No PowerShell". – Joel Coehoorn Oct 05 '18 at 16:02
  • The first thing that I would do in this case is check which part of the code causes the CPU usage increase: the SQL query part or generating the CSV file. On top of that, it seems that you are overcomplicating things for a trivial task (execute a SQL query and generate a CSV) by calling a PowerShell script from your C# code. Is there a special reason for this construction? – bluuf Oct 05 '18 at 16:15

1 Answer


As mentioned in the comments, your best bet is to query the DB and generate the CSV in your C# code instead of through a PowerShell script. Each time you call the script there is some overhead, because a new PowerShell process is spun up on your web server. If you have hundreds of calls this will quickly cause slowdowns and add memory pressure, even if you optimize the PS script. An example of how to do this in C# is here.
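
A minimal sketch of that approach, assuming placeholder values for the connection string, query, and output path; it streams rows from the SqlDataReader straight to the writer, so the full result set is never held in memory:

using System;
using System.Data.SqlClient;
using System.IO;
using System.Text;

public static class CsvExporter
{
    // Streams the query result into a CSV file one row at a time,
    // so the full result set is never held in memory.
    public static void ExportToCsv(string connectionString, string query, string path)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(query, connection))
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                // Header row from the result set's column names.
                var header = new string[reader.FieldCount];
                for (int i = 0; i < reader.FieldCount; i++)
                    header[i] = Escape(reader.GetName(i));
                writer.WriteLine(string.Join(",", header));

                // Data rows.
                var fields = new string[reader.FieldCount];
                while (reader.Read())
                {
                    for (int i = 0; i < reader.FieldCount; i++)
                        fields[i] = Escape(Convert.ToString(reader.GetValue(i)));
                    writer.WriteLine(string.Join(",", fields));
                }
            }
        }
    }

    // Quotes a field if it contains a comma, quote, or line break.
    private static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0)
            return "\"" + field.Replace("\"", "\"\"") + "\"";
        return field;
    }
}

You would call this from the page code-behind with the user-specific query, e.g. CsvExporter.ExportToCsv(connectionString, userSpecificQuery, outputPath) (names hypothetical).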

If you do want to continue using the PowerShell solution, I would look into the SqlServer module instead of SQLPS. Per Microsoft, the SQLPS module is no longer being updated, and you may get better performance from the SqlServer version of Invoke-Sqlcmd. Even so, when I tested your PowerShell script on my PC with my data, both Invoke-Sqlcmd and Export-Csv caused an equal spike in CPU utilization.

  • My main concern was the amount of data: a 12 MB CSV file, 71,784 rows to retrieve from the DB. Do you think it is more efficient to do everything in C#? – Nick Oct 05 '18 at 18:01
  • Yes. You will have much more flexibility to scale it up and improve the efficiency of the specific tasks that are consuming too many resources. For example, you could read/write in smaller batches to reduce the memory overhead of reading all the data and writing it all to the file in one go. And since you mentioned multiple users will access the site regularly (possibly concurrently?), you will also have better control over queuing/throttling multiple requests to generate the file than if you just fired off a PowerShell script each time a user requests the file. – Brandon McClure Oct 05 '18 at 19:32
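
Combining that batching advice with Joel Coehoorn's comment about never touching the file system, a minimal sketch of streaming the CSV straight into the Web Forms response; the page class, view name, and per-user filter are all assumptions:

using System;
using System.Data.SqlClient;
using System.Text;
using System.Web.UI;

// Hypothetical download page: streams the CSV into the response so nothing
// is written to disk and memory stays flat regardless of row count.
public partial class DownloadProducts : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Response.Clear();
        Response.ContentType = "text/csv";
        Response.AddHeader("Content-Disposition", "attachment; filename=products.csv");

        using (var connection = new SqlConnection("<connection string>"))
        using (var command = new SqlCommand(
            "SELECT * FROM ProductsView WHERE UserId = @UserId", connection)) // hypothetical view/filter
        {
            command.Parameters.AddWithValue("@UserId", User.Identity.Name); // per-user parameter (assumption)
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                // (Header row omitted for brevity; see the file-based sketch above.)
                int rowCount = 0;
                var row = new StringBuilder();
                while (reader.Read())
                {
                    row.Clear();
                    for (int i = 0; i < reader.FieldCount; i++)
                    {
                        if (i > 0) row.Append(',');
                        row.Append(Escape(Convert.ToString(reader.GetValue(i))));
                    }
                    Response.Output.WriteLine(row.ToString());

                    // Flush in batches so the response buffer never holds the full file.
                    if (++rowCount % 1000 == 0) Response.Flush();
                }
            }
        }
        Response.End();
    }

    // Quotes a field if it contains a comma, quote, or line break.
    private static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0)
            return "\"" + field.Replace("\"", "\"\"") + "\"";
        return field;
    }
}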