25

I can't find a way to pass a function into a ForEach-Object -Parallel script block, only variables.

Any ideas that don't require defining the function inside the ForEach-Object loop?

function CustomFunction {
    Param (
        $A
    )
    Write-Host $A
}

$List = "Apple", "Banana", "Grape" 
$List | ForEach-Object -Parallel {
    Write-Host $using:CustomFunction $_
}

smark91
  • 2
    Either package your function in a module, or (re-)define it _inside_ the `-Parallel` block – Mathias R. Jessen Apr 17 '20 at 14:03
  • As an aside: [`Write-Host` is typically the wrong tool to use](http://www.jsnover.com/blog/2013/12/07/write-host-considered-harmful/), unless the intent is to write _to the display only_, bypassing the success output stream and with it the ability to send output to other commands, capture it in a variable, redirect it to a file. To output a value, use it _by itself_; e.g., `$value` instead of `Write-Host $value` (or use `Write-Output $value`, though that is rarely needed). See also: the bottom section of https://stackoverflow.com/a/50416448/45375 – mklement0 Apr 17 '20 at 14:21

4 Answers

36

The solution isn't quite as straightforward as one would hope:

# Sample custom function.
function Get-Custom {
  Param ($A)
  "[$A]"
}

# Get the function's definition *as a string*
$funcDef = ${function:Get-Custom}.ToString()

"Apple", "Banana", "Grape"  | ForEach-Object -Parallel {
  # Define the function inside this thread...
  ${function:Get-Custom} = $using:funcDef
  # ... and call it.
  Get-Custom $_
}

Note: This answer contains an analogous solution for using a script block from the caller's scope in a ForEach-Object -Parallel script block.

  • Note: If your function were defined in a module that is placed in one of the locations known to the module-autoloading feature, your function calls would work as-is with ForEach-Object -Parallel, without extra effort - but each thread would incur the cost of (implicitly) importing the module.

  • The above approach is necessary, because - aside from the current location (working directory) and environment variables (which apply process-wide) - the threads that ForEach-Object -Parallel creates do not see the caller's state, notably neither with respect to variables nor functions (and also not custom PS drives and imported modules).

  • As of PowerShell 7.2.x, an enhancement is being discussed in GitHub issue #12240 to support copying the caller's state to the parallel threads on demand, which would make the caller's functions automatically available.
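To illustrate the isolation described in the points above, here is a minimal sketch, assuming PowerShell 7+ and no module that happens to export a command named Get-Custom. The names $callerVar and $seen are made up for this illustration. It checks, from inside a parallel thread, whether a caller-defined variable and function are visible, and shows that $using: still delivers the variable's value.

```powershell
# Caller-side state: one variable and one function.
$callerVar = 'visible only to the caller'
function Get-Custom { param($A) "[$A]" }

$seen = 1 | ForEach-Object -Parallel {
    [pscustomobject]@{
        # A plain variable reference finds nothing in the thread's runspace.
        VariableSeen = $null -ne $callerVar
        # The caller's function is not defined in the thread either.
        FunctionSeen = $null -ne (Get-Command Get-Custom -ErrorAction Ignore)
        # $using: copies the caller's variable value into the thread.
        UsingValue   = $using:callerVar
    }
}

$seen
```

Under these assumptions, VariableSeen and FunctionSeen should both come back as False, while UsingValue carries the caller's string.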

Note that redefining the function in each thread via a string is crucial. An attempt to do without the auxiliary $funcDef variable and to redefine the function with ${function:Get-Custom} = ${using:function:Get-Custom} fails, because ${function:Get-Custom} is a script block, and the use of script blocks with the $using: scope specifier is explicitly disallowed in order to avoid cross-thread (cross-runspace) issues.

  • However, ${function:Get-Custom} = ${using:function:Get-Custom} would work with Start-Job; see this answer for an example.

  • It would not work reliably with Start-ThreadJob. Start-ThreadJob currently syntactically allows & ${using:function:Get-Custom} $_, even though it shouldn't, because ${using:function:Get-Custom} is preserved as a script block there (unlike with Start-Job, where it is deserialized as a string, which is itself surprising behavior; see GitHub issue #11698). Direct cross-thread use of [scriptblock] instances causes obscure failures, which is why ForEach-Object -Parallel prevents it in the first place.

  • A similar loophole that leads to cross-thread issues even with ForEach-Object -Parallel is using a command-info object obtained in the caller's scope with Get-Command as the function body in each thread via the $using: scope: this too should be prevented, but isn't as of PowerShell 7.2.7 - see this post and GitHub issue #16461.
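For comparison, the Start-Job variant mentioned above could be sketched as follows. This is only an illustration of the behavior described in this answer (the $using: value arriving in the job as a string), not a tested recipe for every PowerShell version. The names $job and $jobResult are made up here.

```powershell
function Get-Custom { param($A) "[$A]" }

$job = Start-Job {
    # Per the answer, the function body arrives here as a string
    # (the job runs in a separate process), so this assignment
    # defines the function anew inside the job.
    ${function:Get-Custom} = ${using:function:Get-Custom}
    Get-Custom 'Apple'
}

# Wait for the job, collect its output, and clean up.
$jobResult = $job | Receive-Job -Wait -AutoRemoveJob
$jobResult
```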

${function:Get-Custom} is an instance of namespace variable notation, which allows you to both get a function (its body as a [scriptblock] instance) and to set (define) it, by assigning either a [scriptblock] or a string containing the function body.
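A small sketch of both directions of namespace variable notation, using a hypothetical function name (Get-Greeting) that is not part of the question:

```powershell
# Set: define a function by assigning its body as a string.
${function:Get-Greeting} = 'param($Name) "Hello, $Name"'

# Get: read the function back; the result is a [scriptblock].
$body = ${function:Get-Greeting}
$body.GetType().Name

# The function can now be called like any other.
$greeting = Get-Greeting 'World'
$greeting
```

Calling $body.ToString() yields the body as a string again, which is the form that can safely be handed to other threads via $using:.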

mklement0
  • Thank you very much. It is not the clean solution I was hoping for, but it works. Performance-wise, every iteration basically instantiates a new function. It is like inserting the function inside the ForEach-Object loop, but visually cleaner, right? – smark91 Apr 17 '20 at 15:02
  • Glad to hear it was helpful, @smark91. The technique is primarily useful if you have a preexisting function that you want to use in the `ForEach-Object -Parallel` block; directly inserting the function definition is probably faster, though I'm not sure it makes much difference in practice. – mklement0 Apr 17 '20 at 15:10
  • 2
    This is all great for one-offs but if you have several modules imported, more functions defined, variables up in the air, essentially a whole house of cards going, it's too much trouble and too prone to error. Here's to hoping the PowerShell Core crew decide to make runspace copying an option. – Max Cascone Apr 21 '21 at 18:48
2

I added a whole set of custom functions to parallel processes via a ps1 file by using an include inside the loop. This keeps things very clean and neat.

ForEach-Object -Parallel {
    # Include custom functions inside parallel scope
    . $using:PSScriptRoot\CustomFunctions.ps1
    # Now you can reference any function defined in the file
    My-CustomFunction
    ....

This does incur the overhead of loading the functions in each parallel process, but in my case it was minuscule relative to the overall processing time.
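A self-contained sketch of the same idea, assuming PowerShell 7+. The file name, its location in the temp directory, and the Get-Custom function are placeholders chosen for this illustration, not the actual CustomFunctions.ps1 from my project:

```powershell
# Write a throwaway script file that defines one function.
$scriptPath = Join-Path ([IO.Path]::GetTempPath()) 'CustomFunctions.Sample.ps1'
Set-Content -LiteralPath $scriptPath -Value 'function Get-Custom { param($A) "[$A]" }'

$results = 'Apple', 'Banana', 'Grape' | ForEach-Object -Parallel {
    # Copy the path into a local variable, then dot-source the file
    # so its functions are defined in this thread's runspace.
    $path = $using:scriptPath
    . $path
    Get-Custom $_
}

# Output order is not guaranteed with -Parallel, so sort for display.
$results | Sort-Object

Remove-Item -LiteralPath $scriptPath
```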

sharme202
1

I just figured out another way using Get-Command, which works with the call operator. $a ends up being a FunctionInfo object.

EDIT: I'm told this isn't thread safe, but I don't understand why.

function hi { 'hi' }
$a = get-command hi
1..3 | foreach -parallel { & $using:a }

hi
hi
hi
mklement0
js2010
  • 3
    Actually, it turns out there are indeed thread-safety issues, even with function bodies that do not rely on the caller's state; this can lead to obscure failures - see [this question](https://stackoverflow.com/q/74257757/45375). – mklement0 Oct 31 '22 at 02:17
0

So I figured out another little trick that may be useful for people trying to add the functions dynamically, particularly if you might not know the name of it beforehand, such as when the functions are in an array.

# Store the current function list in a variable
$initialFunctions=Get-ChildItem Function:

# Source all .ps1 files in the current folder and all subfolders
Get-ChildItem . -Recurse | Where-Object { $_.Name -like '*.ps1' } |
     ForEach-Object { . "$($_.FullName)" }

# Get only the functions that were added above, and store them in an array
$functions = @()
Compare-Object $initialFunctions (Get-ChildItem Function:) -PassThru |
    ForEach-Object { $functions = @($functions) + @($_) }

1..3 | ForEach-Object -Parallel {
    # Pull the $functions array from the outer scope and set each function
    # to its definition
    $using:functions | ForEach-Object {
        Set-Content "Function:$($_.Name)" -Value $_.Definition
    }
    # Call one of the functions in the sourced .ps1 files by name
    SourcedFunction $_
}

The main "trick" of this is using Set-Content with Function: plus the function name, since PowerShell essentially treats each entry of Function: as a path.

This makes sense when you consider the output of Get-PSDrive, since each of those entries can be used as a "drive" in the same way (i.e., with the colon).
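Since the comment below points out that handing function objects to the threads isn't thread-safe, here is a hedged variant that passes only strings. It is a sketch assuming PowerShell 7+. The two sample functions and the $funcDefs and $shouted names are made up for illustration, and Set-Item with [scriptblock]::Create is swapped in for Set-Content so that each script block is created inside the thread that uses it.

```powershell
# Two sample functions standing in for the dot-sourced ones.
function Get-Custom { param($A) "[$A]" }
function Get-Shout  { param($A) "$A!".ToUpper() }

# Build a name -> body map that contains strings only.
$funcDefs = @{}
foreach ($name in 'Get-Custom', 'Get-Shout') {
    $funcDefs[$name] = (Get-Item -LiteralPath "Function:$name").ScriptBlock.ToString()
}

$shouted = 'Apple', 'Banana' | ForEach-Object -Parallel {
    # Recreate each function in this thread from its string body.
    $defs = $using:funcDefs
    foreach ($name in $defs.Keys) {
        Set-Item -LiteralPath "Function:$name" -Value ([scriptblock]::Create($defs[$name]))
    }
    Get-Shout (Get-Custom $_)
}

# Output order is not guaranteed with -Parallel, so sort for display.
$shouted | Sort-Object
```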

Shenk
  • Promising, though for the reasons discussed in https://stackoverflow.com/q/74257757/45375, this isn't thread-safe and can lead to subtle or not-so-subtle failures at runtime. To make this work robustly, you need to pass the function bodies _as strings_ to the parallel runspaces. – mklement0 Nov 15 '22 at 16:54