
I have a PowerShell script that connects to a database and pulls a list of user data, then loops over each row with a foreach loop to run a script for that user.

This works, but it's slow: the result set can be 1000+ entries, and Script.bat has to complete for User A before it can start for User B. Script.bat runs independently for each user and takes ~30s per user.

Is there a way to speed this up at all? I've been playing with -Parallel, ForEach-Object, and workflow, but I can't get them to work, likely because I'm a PowerShell noob.

foreach ($row in $Dataset.tables[0].rows)
{
   $UserID=$row.value
   $DeviceID=$row.value1
   $EmailAddress=$row.email_address

   cmd.exe /c "`"$PSScriptRoot`"\bin\Script.bat -c `" -Switch $UserID`" >> `"$PSScriptRoot`"\${FileName3}_REST_${DateTime}.txt 2> nul";
}
    So you say you want to speed this up, but the only mention of a bottleneck appears to be the .BAT file in question. _What is the .BAT doing that takes half a minute?_ Seems like that would be the pertinent area to investigate, not the above snippet. – gravity Jan 31 '20 at 16:50
  • The for loop is the issue I need to correct. The .bat file is expected to take 30 sec; the issue is that the for loop needs to call the .bat file 1000 times, one after another. I am trying to find a way to call the .bat file and start the next loop iteration without PowerShell waiting for the .bat execution to finish — trying to get the for loop to multitask :) – rcmpayne Jan 31 '20 at 17:07
  • 2
    Try using jobs: https://www.sconstantinou.com/powershell-jobs/ – Wasif Jan 31 '20 at 17:11

1 Answer


You said it yourself: your bottleneck is the batch file your script calls, not the loop itself. foreach (as opposed to ForEach-Object) is already the fastest loop mechanism in PowerShell. Investigate your batch file to find out why it takes 30 seconds to complete, and optimize it where you can.
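As a quick illustration of that claim (not part of the original answer), you can time the two loop styles yourself with Measure-Command; the exact numbers vary by machine, but foreach avoids the per-item pipeline overhead of ForEach-Object:

```powershell
# Compare foreach vs. ForEach-Object over the same data.
$items = 1..100000

$tForeach = Measure-Command {
    foreach ($i in $items) { $null = $i * 2 }
}

$tPipeline = Measure-Command {
    $items | ForEach-Object { $null = $_ * 2 }
}

"foreach:        $($tForeach.TotalMilliseconds) ms"
"ForEach-Object: $($tPipeline.TotalMilliseconds) ms"
```

Either way, a few milliseconds of loop overhead is noise next to a 30-second batch file, which is why the loop itself isn't worth optimizing here.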


Using Jobs

Note: Start-Job runs the job in another process. If you have PowerShell Core, you can use the Start-ThreadJob cmdlet in lieu of Start-Job; it runs your job on another thread of the same process instead of spawning a new process, which has much lower startup overhead.
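A minimal Start-ThreadJob sketch (assuming the ThreadJob module is available — it ships with PowerShell 7+, or can be installed with Install-Module ThreadJob on Windows PowerShell 5.1; the -ThrottleLimit value of 2 is arbitrary):

```powershell
# Start 5 thread jobs, with at most 2 running concurrently;
# the rest queue until a slot frees up.
$threadJobs = foreach ($n in 1..5) {
    Start-ThreadJob -ThrottleLimit 2 -ScriptBlock {
        Start-Sleep -Milliseconds 200
        "job $using:n done"
    }
}

# Wait for all jobs, then collect their output
$results = $threadJobs | Wait-Job | Receive-Job
$results
```

Because thread jobs share the caller's process, `$using:` works the same way as with Start-Job, but without paying a new-process cost per user.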

If you can't optimize your batch script, or can't optimize it enough to meet your needs, then you can consider using Start-Job to kick off each execution asynchronously, and then check the result and collect any output from it using Receive-Job. For example:

# Master list of jobs you need to check the result of later
$jobs = New-Object System.Collections.Generic.List[System.Management.Automation.Job]

# Run your script for each row
foreach ($row in $Dataset.tables[0].rows)
{
   $UserID=$row.value
   $DeviceID=$row.value1
   $EmailAddress=$row.email_address

   # Use Start-Job here to kick off the script and store the job information
   # for later retrieval.
   # The $using: scope modifier allows you to make use of variables that were
   # defined in the session calling Start-Job
   $job = Start-Job -ScriptBlock { cmd.exe /c "`"${using:PSScriptRoot}`"\bin\Script.bat -c `" -Switch ${using:UserID}`" >> `"${using:PSScriptRoot}`"\${using:FileName3}_REST_${using:DateTime}.txt 2> nul"; }

   # Add the execution to the $jobs list to check the result of later
   # Casting to void here prevents the Add method from returning the object
   # we've added.
   [void]$jobs.Add($job)
}

# Wait for the jobs to be done
Write-Host 'Waiting for all jobs to complete...'
while( $jobs | Where-Object { $_.State -eq 'Running' } ){
  Start-Sleep -s 10
}

# Retrieve the output of the jobs
foreach( $j in $jobs ) {
  Receive-Job $j
}

Note: Since you need to execute this script ~1000 times, you may want to write your logic to run only a certain number of jobs at a time. The example above starts every job at once, without limiting how many execute concurrently.


For more information about jobs and the properties you can inspect on a running/completed job, see the about_Jobs help topic (Get-Help about_Jobs) and the Start-Job, Receive-Job, and Wait-Job cmdlet documentation.

* The documentation states that the `$using:` scope modifier can only be used with remote sessions, but it seems to work fine with Start-Job even when the job runs locally.

codewario
  • I just tried this, but it seems the variables defined outside of `$job = Start-Job -ScriptBlock { }` are not available inside the script block. From the example you have, $UserID is blank when the command within the job runs. – rcmpayne Jan 31 '20 at 19:04
  • I updated my answer. You have to pass the arguments into the job. My sample arguments set `UserID` on the `$jobArgs` object, but you can also add the values of `$DateTime` and `$FileName3` as well. Note that my sample payload uses strings for these values, but they can be any object type. I also modified the `cmd` string to use sub-expressions instead of string interpolation, so we can get the property off of the `$args` object within the string. – codewario Jan 31 '20 at 19:17
  • Actually, I did another test, and it looks like `Start-Job` lets you use the `$using:` modifier to access variables from the parent session, even if you aren't executing on a remote machine. I've updated my answer to recommend the `$using:` modifier here. – codewario Jan 31 '20 at 19:26
  • I got it working. I added `$job = Start-Job -ScriptBlock { } -ArgumentList $UserID,$DeviceID,$EmailAddress,$Track1,$Track2,$PSScriptRoot,$FileName3,$DateTime`, and inside the script block I use $args[0], $args[1], etc. – rcmpayne Jan 31 '20 at 19:51
  • Yeah, that's why I prefer either using the hashmap trick I originally added, or making use of the `using` scope modifier. Having to pass in named arguments and reference them *positionally* is not very readable IMO. – codewario Jan 31 '20 at 19:55
  • Now I need to figure out how to run in batches of 5-10 instead of all ~1000 hitting at the same time. I just ran it with a TOP 50 on my SQL query and pegged my system :) – rcmpayne Jan 31 '20 at 21:45