
I see many examples along the lines of:

Get-ChildItem -Filter "*.txt" | ForEach-Object { sajb {ren $_.fullname ($_.directoryname + "\" + "temp_" + $_.name + ".newext") } }

or what I think should be equivalent, using start-job -scriptblock:

Get-ChildItem -Filter "*.txt" | ForEach-Object { Start-Job -ScriptBlock {ren $_.fullname ($_.directoryname + "\" + "temp_" + $_.name + ".newext") } }

But these don't work for me. I get output like this and nothing happens to the files:

[image: screenshot of the output]

or this for my actual use case:

[image: screenshot of the output for the actual use case]

If I remove the sajb block, then it works fine in series and does exactly what you'd expect. It's only when I try to run all commands in the loop in parallel that it fails.

The same operations do work fine from the Command prompt using:

for %x in ("*.txt") do (start "Convert" cmd /c "ren "%x" "temp_%x.newext"")

My purpose is to do this for some commands that run slowly in series but would run quickly in parallel using ffmpeg and sox, which also work fine in the for loop from the command prompt. I just can't get it working in PowerShell, even in the simple case like the file rename example above. What am I doing wrong with the start-job / sajb?

If it matters, I want to run this as a single PowerShell command from the PS prompt. I do not want to create a PowerShell script.

I see other posts with what look like similar questions, but I don't think they ever received working answers, or if those answers are correct, I don't understand how to apply them to my situation:

Powershell start-job scriptblock not executing

Powershell Start-Job not executing scriptblock

  • What makes you think using `Start-Job` will improve your code in any possible way? It will actually make it slower and consume many times more memory – Santiago Squarzon Jun 29 '23 at 21:43
  • @SantiagoSquarzon, if it's like the start command from the command prompt, meaning it launches an instance per loop, then it makes it run about 10x faster. This is because SoX is very slow per conversion (can take over 1 minute per file) and can't run an individual file conversion on more than one thread, so even with a 24 core CPU, it's no faster. But running a dozen+ conversions in parallel, each in its own thread, all finish in about the same amount of time as one, hence 10x or more faster (roughly 1x for each additional CPU core). – GraniteStateColin Jun 29 '23 at 22:00
  • To clarify, the examples in my question are just the simplest cases I could devise to test the start-job function, and even those fail. My actual use case is significantly more complicated, involving running a mix of ffmpeg and sox commands to adjust audio files in batches. Everything works fine in series, just not in parallel. And everything works fine using a for loop and start from a command prompt. It's only the start-job version from PowerShell that I can't get to work. – GraniteStateColin Jun 29 '23 at 22:05
  • seems like you just need to change `$_.name` and `$_.fullname` and so on to `($using:_).name` and `($using:_).fullname` and so on. the jobs can't see and do not know what `$_` is, you need to pass it to that scope with the `using:` modifier. – Santiago Squarzon Jun 29 '23 at 22:21

1 Answer

  • sajb is simply a built-in alias of the Start-Job cmdlet.

  • Two asides:

    • The Start-ThreadJob cmdlet offers a lightweight, much faster thread-based alternative to the child-process-based regular background jobs created with Start-Job. It comes with PowerShell (Core) 7+ and in Windows PowerShell can be installed on demand with, e.g., Install-Module ThreadJob -Scope CurrentUser. In most cases, thread jobs are the better choice, both for performance and type fidelity - see the bottom section of this answer for why.

    • In PowerShell (Core) 7+, the simplest solution is to use ForEach-Object with the -Parallel parameter, which combines parallel execution with direct access to pipeline input via the automatic $_ variable:

      1..3 |
        ForEach-Object -Parallel { "`$_ is: $_" }
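
      Applied to the rename example from the question, this might look like the following minimal sketch. It assumes PowerShell 7+; the scratch directory and the sample file names are hypothetical and exist only to make the snippet self-contained.

      ```powershell
      # Requires PowerShell 7+; ForEach-Object -Parallel is not available in Windows PowerShell 5.1.
      # Create a scratch directory with three sample files (hypothetical names).
      $dir = Join-Path ([IO.Path]::GetTempPath()) ("parallel_demo_" + [guid]::NewGuid())
      $null = New-Item -ItemType Directory -Path $dir
      1..3 | ForEach-Object { Set-Content -Path (Join-Path $dir "file$_.txt") -Value "sample $_" }

      # (...) collects the full file list before any rename starts.
      # Inside the -Parallel block, $_ refers directly to the current pipeline input object.
      (Get-ChildItem -LiteralPath $dir -Filter '*.txt') |
        ForEach-Object -Parallel {
          Rename-Item -LiteralPath $_.FullName -NewName ('temp_' + $_.Name + '.newext')
        } -ThrottleLimit 4

      # List the resulting file names.
      $renamed = Get-ChildItem -LiteralPath $dir | Sort-Object Name | ForEach-Object Name
      $renamed
      ```

      -ThrottleLimit caps how many script blocks run at once; the value 4 here is arbitrary and would typically be tuned to the number of CPU cores available for the real workload.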
      
  • As Santiago Squarzon notes, any Start-Job (as well as Start-ThreadJob) call inside a (non-parallel) ForEach-Object call will not automatically see the automatic $_ variable reflecting the current pipeline input object, given that it executes in a different runspace (in the case of Start-Job, a runspace in a different process); therefore, you must reference / pass its value explicitly:

    • Either: use $using:_, via the $using: scope:

       1..3 |
         ForEach-Object { Start-Job { $using:_ } } | 
         Receive-Job -Wait -AutoRemoveJob
      
      • Note: Unexpectedly, up to at least PowerShell 7.3.5 (current as of this writing), calling methods - as opposed to accessing properties - on $using: references requires enclosure in (...) - see GitHub issue #10876
    • Or: pass any values to the job via the -ArgumentList (-Args) parameter, which the job can then access via the automatic $args variable:

       1..3 |
         ForEach-Object { Start-Job { $args[0] } -ArgumentList $_ } | 
         Receive-Job -Wait -AutoRemoveJob
      
    • Additionally, note that in Windows PowerShell (the legacy PowerShell edition whose latest and last version is 5.1), Start-Job script blocks use a default working directory rather than inheriting the caller's - see the bottom section of this answer for details.
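
Putting these points together for the rename example from the question, a Start-Job version could be sketched as follows. This is an illustration under stated assumptions, not the only way to do it: the scratch directory and sample file names are hypothetical, and the job works only with full paths so that it does not depend on the job's working directory.

```powershell
# Create a scratch directory with three sample files (hypothetical names).
$dir = Join-Path ([IO.Path]::GetTempPath()) ("startjob_demo_" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $dir
1..3 | ForEach-Object { Set-Content -Path (Join-Path $dir "file$_.txt") -Value "sample $_" }

# Start one background job per file. The job cannot see the caller's $_,
# so its value is copied into the job with $using:_ and stored in a local variable.
$jobs = (Get-ChildItem -LiteralPath $dir -Filter '*.txt') |
  ForEach-Object {
    Start-Job -ScriptBlock {
      $file = $using:_
      # Full paths only, so the job's working directory does not matter.
      Rename-Item -LiteralPath $file.FullName -NewName ('temp_' + $file.Name + '.newext')
    }
  }

# Wait for all jobs, surface any output or errors, and clean the jobs up.
$jobs | Receive-Job -Wait -AutoRemoveJob

# List the resulting file names.
$renamed = Get-ChildItem -LiteralPath $dir | Sort-Object Name | ForEach-Object Name
$renamed
```

The -ArgumentList form is equivalent: pass `-ArgumentList $_.FullName, $_.Name` to Start-Job and refer to `$args[0]` and `$args[1]` inside the script block. If Start-ThreadJob is available, it can be substituted for Start-Job in the same shape.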

mklement0
  • Your response got me looking into the start-threadjob. I had previously tried the -parallel option, which had not worked before, probably because I had missed the ($using). I had been running the default PowerShell in Windows 11, version 5.1. I upgraded to Version 7 and then was able to use "ForEach-Object -parallel" without modifying the rest of the command and it worked perfectly. And yes, it's SO MUCH FASTER than without the -parallel. For anyone else with this problem, I recommend upgrading PS so you can use the -parallel option without making other changes to your code/command/script. – GraniteStateColin Jun 30 '23 at 01:55
  • Glad to hear it, @GraniteStateColin, but note that with `ForEach-Object -Parallel { ... }` (PS v7+) you do _not_ need the `$using:` scope; you can directly use `$_` to refer to the _pipeline_ input; you do, however, need `$using:` to refer to any other values from the caller's scope. – mklement0 Jun 30 '23 at 02:23
  • Yes, no need for $using: with PowerShell version 7. However, that didn't seem to be the case with version 5.1. All my problems went away with upgrading to the newer version of PowerShell. – GraniteStateColin Jun 30 '23 at 02:27