Let's say I am downloading a large file using Net.WebClient's DownloadFile method:

$uri1 = "blabla.com/distro/blabla_2gb.exe"
$localfile1 = "$Env:userprofile\Downloads\blabla_2gb.exe"

$wbcl = New-Object System.Net.WebClient
$wbcl.DownloadFile($uri1, $localfile1)
$wbcl.Dispose()

In this case, I can terminate my script at any moment with something like Alt + F4. The download stops, and $wbcl is disposed of automatically.

But if I do the same thing inside of a job:

Start-Job -ScriptBlock `
{
  #SAME CODE AS ABOVE
} | Out-Null

#SOME PARALLEL ACTIVITY

Wait-Job -ID 1 | Out-Null

the download continues even after the parent script is closed. According to the documentation, terminating the parent script should stop all of its jobs. So why does the download continue?

P.S. I know I can avoid starting a job here by using DownloadFileAsync, but I am really eager to understand this mechanism :)

  • Computers and software, though amazing tools, are stupid (well, maybe compliant is a better word) and will only do what they are told/programmed to do. – postanote Apr 19 '21 at 04:15

2 Answers

I believe this is because execution has flowed into a .NET method where PowerShell no longer has control of it.

For example, if I run...

Start-Job -ScriptBlock { Start-Sleep -Seconds 30 }

...or...

Start-Job -ScriptBlock { while ($true) { } }

...I can see in Task Manager that there are two PowerShell processes. If I then click the close button of the PowerShell window (Alt + F4 doesn't work for me) both processes immediately disappear.

If I run...

Start-Job -ScriptBlock { [System.Threading.Thread]::Sleep([TimeSpan]::FromSeconds(30)) }

...then I also see two PowerShell processes in Task Manager. However, after closing the PowerShell window, only one of the PowerShell processes immediately disappears; the other disappears after the remainder of the 30 seconds. Interestingly, if I run exit instead of closing the PowerShell window, the window remains open with a blinking cursor until the job finishes.

Another way to observe this is with Stop-Job. In this script...

$job = Start-Job -ScriptBlock { Start-Sleep -Seconds 30 }
Start-Sleep -Seconds 1 # Give the job time to transition to the Running state
$job | Stop-Job

...Stop-Job returns immediately, whereas in this script...

$job = Start-Job -ScriptBlock { [System.Threading.Thread]::Sleep([TimeSpan]::FromSeconds(30)) }
Start-Sleep -Seconds 1 # Give the job time to transition to the Running state
$job | Stop-Job

...it takes 30 seconds.
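To put numbers on the difference, the two scripts above can be wrapped in a small timing helper. This is only a sketch: `Measure-StopJob` is a hypothetical name, not a built-in cmdlet, and the sleeps are shortened to 10 seconds here just to keep the run short.

```powershell
# Hypothetical helper (not a built-in cmdlet): starts a job, gives it a moment
# to begin running, then reports how long Stop-Job takes to return.
function Measure-StopJob {
    param(
        [Parameter(Mandatory)]
        [scriptblock] $ScriptBlock
    )

    $job = Start-Job -ScriptBlock $ScriptBlock
    try {
        Start-Sleep -Seconds 1   # Give the job time to transition to the Running state
        Measure-Command { $job | Stop-Job }
    }
    finally {
        Remove-Job -Job $job -Force
    }
}

# Same two job bodies as above, with shorter sleeps.
$psSleep  = Measure-StopJob { Start-Sleep -Seconds 10 }
$netSleep = Measure-StopJob { [System.Threading.Thread]::Sleep([TimeSpan]::FromSeconds(10)) }

'Start-Sleep job  : {0:N1} s' -f $psSleep.TotalSeconds
'Thread.Sleep job : {0:N1} s' -f $netSleep.TotalSeconds
```

The exact timings will vary by machine and PowerShell version, so treat the printed values as a rough comparison rather than a benchmark.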

I'm not too familiar with the low-level workings of PowerShell execution, but here is my understanding. In the first two snippets, the job process is running PowerShell code when the parent process is closed, so it can be interrupted at an arbitrary point and respond to the parent process's signal to terminate. In the third snippet, the job process is running .NET code and waiting for the method to return. I can't say whether the thread running the .NET code is the same thread that would communicate with the parent process, or whether it's a different thread and PowerShell is simply respecting the dangers of aborting another thread. (That PowerShell has no problem interrupting DownloadFile() to exit when it's run outside of a job suggests the former.) Either way, the result is the same: the job process doesn't terminate because it's "stuck" inside .NET code until that code completes.
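If that reasoning holds, one possible workaround is to keep the waiting in PowerShell code rather than inside a blocking .NET call. The sketch below starts the transfer with WebClient's DownloadFileTaskAsync() and polls the returned task with Start-Sleep, which is the kind of job that stopped promptly in the experiments above. `Start-InterruptibleDownload` is a hypothetical name, and the assumption that this makes the job stop promptly when the parent closes is not something I have verified in every host.

```powershell
# Hypothetical helper: runs the download in a job but keeps the waiting in
# PowerShell code instead of inside a blocking .NET call.
function Start-InterruptibleDownload {
    param(
        [Parameter(Mandatory)] [string] $Uri,
        [Parameter(Mandatory)] [string] $Path
    )

    Start-Job -ArgumentList $Uri, $Path -ScriptBlock {
        param($Uri, $Path)

        $client = New-Object System.Net.WebClient
        try {
            # Returns immediately; the transfer proceeds in the background.
            $task = $client.DownloadFileTaskAsync($Uri, $Path)

            # Poll from PowerShell so the pipeline stays stoppable (assumption).
            while (-not $task.IsCompleted) {
                Start-Sleep -Milliseconds 200
            }

            if ($task.IsFaulted) { throw $task.Exception.InnerException }
        }
        finally {
            $client.CancelAsync()   # Harmless if the download already finished
            $client.Dispose()
        }
    }
}
```

With the variables from the question it would be called as `Start-InterruptibleDownload -Uri $uri1 -Path $localfile1`, and the returned job can be passed to Wait-Job or Stop-Job as usual. A cancelled download may leave a partial file behind at the target path.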

This might also be related to why Ctrl + C doesn't (immediately) work when executing a .NET method. See Powershell AcceptTcpClient() cannot be interrupted by Ctrl-C.

One other point: make sure you call Dispose() inside a finally block to ensure it does get called even if DownloadFile() throws an exception...

$wbcl = New-Object System.Net.WebClient
try
{
    $wbcl.DownloadFile($uri1, $localfile1)
}
finally
{
    $wbcl.Dispose()
}
Lance U. Matthews
  • Many thanks for this research, sir. Your explanation, and the additional info at the reference page you provided, make the things clear enough for me. As far as I understand this is just somewhat called “by design”. – ivanthestupid Apr 18 '21 at 14:10
  • There are several ways to download with PS. Each has its pros and cons. – postanote Apr 19 '21 at 06:02

Simply put, this...

$uri1       = "blabla.com/distro/blabla_2gb.exe"
$localfile1 = "$Env:userprofile\Downloads\blabla_2gb.exe"

$wbcl = New-Object System.Net.WebClient
$wbcl.DownloadFile($uri1, $localfile1)
$wbcl.Dispose()

... is not a job. It runs in the interactive session itself. If you close or exit the session, you kill the activity.

This ...

Start-Job -ScriptBlock `
{
  #SAME CODE AS ABOVE
} | Out-Null

#SOME PARALLEL ACTIVITY

Wait-Job -ID 1 | Out-Null

... is a true background job, started from an interactive PowerShell session.

If you want to see what your code is calling then leverage...

Trace-Command

https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/trace-command?view=powershell-7.1

Trace-Command -Name metadata,parameterbinding,cmdlet -Expression {
    $uri1       = "blabla.com/distro/blabla_2gb.exe"
    $localfile1 = "$Env:userprofile\Downloads\blabla_2gb.exe"

    $wbcl = New-Object System.Net.WebClient
    $wbcl.DownloadFile($uri1, $localfile1)
    $wbcl.Dispose()
} -PSHost

Note that you get a ton of interactive trace output from the above.

# Results
<#
DEBUG: ParameterBinding Information: 0 : BIND NAMED cmd line args [New-Object]
DEBUG: ParameterBinding Information: 0 : BIND POSITIONAL cmd line args [New-Object]
DEBUG: ParameterBinding Information: 0 :     BIND arg [System.Net.WebClient] to parameter [TypeName]
DEBUG: ParameterBinding Information: 0 :         Executing VALIDATION metadata: [System.Management.Automation.ValidateTrustedDataAttribute]
DEBUG: ParameterBinding Information: 0 :         BIND arg [System.Net.WebClient] to param [TypeName] SUCCESSFUL
DEBUG: ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [New-Object]
DEBUG: ParameterBinding Information: 0 : CALLING BeginProcessing
DEBUG: ParameterBinding Information: 0 : CALLING EndProcessing
DEBUG: ParameterBinding Information: 0 : BIND NAMED cmd line args [Get-Module]
DEBUG: ParameterBinding Information: 0 :     BIND arg [True] to parameter [ListAvailable]
DEBUG: ParameterBinding Information: 0 :         COERCE arg to [System.Management.Automation.SwitchParameter]
DEBUG: ParameterBinding Information: 0 :             Trying to convert argument value from System.Boolean to System.Management.Automation.SwitchParameter...

#>

Doing the same with the job will not return anything interactively. You have to ask specifically for the state and details of the job.

Get-Job
# Results
<#
Id Name PSJobTypeName State     HasMoreData Location  Command 
-- ---- ------------- -----     ----------- --------  ------- 
1  Job1 BackgroundJob Completed True        localhost ... 
#>

Or

Get-Job -Name 'Job1' | 
Select-Object -Property '*' | 
Format-List -Force
# Results
<#
State         : Completed
HasMoreData   : True
StatusMessage : 
Location      : localhost
Command       : 
                    $uri1       = 'http://mirror.internode.on.net/pub/test/10meg.test'
                    $localfile1 = "$Env:userprofile\Downloads\10meg.test"
                
                    $wbcl = New-Object System.Net.WebClient
                    $wbcl.DownloadFile($uri1, $localfile1)
                    $wbcl.Dispose()
            
                    #SOME PARALLEL ACTIVITY
                    Wait-Job -ID 1 | 
                    Out-Null
            
JobStateInfo  : Completed
Finished      : System.Threading.ManualResetEvent
InstanceId    : 1af73ea0-c0bf-4cc1-b637-71b0e48862bc
Id            : 1
Name          : Job1
ChildJobs     : {Job2}
PSBeginTime   : 18-Apr-21 21:29:44
PSEndTime     : 18-Apr-21 21:29:46
PSJobTypeName : BackgroundJob
Output        : {}
Error         : {}
Progress      : {}
Verbose       : {}
Debug         : {}
Warning       : {}
Information   : {}
#>
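Besides inspecting the job object, whatever the job wrote to its output stream has to be fetched explicitly with Receive-Job. A minimal sketch, with a short sleep and a string standing in for the real download (both are placeholders, not part of the original script):

```powershell
$job = Start-Job -ScriptBlock {
    # Stand-in for the real download
    Start-Sleep -Seconds 1
    'download finished'
}

$job | Wait-Job | Out-Null      # Block until the job reaches a finished state
$state  = $job.State            # Job state after waiting
$output = $job | Receive-Job    # Fetch what the job wrote to its output stream
$job | Remove-Job               # Clean up the job object

"State : $state"
"Output: $output"
```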

As per my comment on the ways downloads can happen:

https://blog.jourdant.me/post/3-ways-to-download-files-with-powershell

  1. Invoke-WebRequest

Cons

Speed. This cmdlet is slow. From what I have observed, the HTTP response stream is buffered into memory. Once the file has been fully loaded, it is flushed to disk. This adds a huge performance hit and potential memory issues for large files. If anyone knows specifics on how this cmdlet operates, let me know!

Another potentially serious con for this method is the reliance on Internet Explorer. For example, this cmdlet cannot be used on Windows Server Core edition servers as the Internet Explorer binaries are not included by default. In some cases, you can use the -UseBasicParsing parameter, but it does not work in all cases.

  2. System.Net.WebClient

A common .NET class used for downloading files is the System.Net.WebClient class.

Cons

There is no visible progress indicator (or any way to query the progress mid-transfer). It essentially blocks the thread until the download completes or fails. This isn't a major con; however, sometimes it is handy to know how far through the transfer you are.

  3. Start-BitsTransfer

If you haven't heard of BITS before, check this out. BITS is primarily designed for asynchronous file downloads, but works perfectly fine synchronously too (assuming you have BITS enabled).

Cons

While BITS is enabled by default on many machines, you can't guarantee it is enabled on all (unless you are actively managing this). Also with the way BITS is designed, if other BITS jobs are running in the background, your job could be queued or run at a later time hindering the execution of your script.
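For comparison, the three approaches can be put side by side in one sketch. `Save-RemoteFile` and its `-Method` parameter are hypothetical names introduced here for illustration; the cmdlets and the WebClient call inside it are the ones discussed above.

```powershell
# Hypothetical wrapper: downloads $Uri to $Path using one of the three
# approaches listed above.
function Save-RemoteFile {
    param(
        [Parameter(Mandatory)] [string] $Uri,
        [Parameter(Mandatory)] [string] $Path,
        [ValidateSet('InvokeWebRequest', 'WebClient', 'Bits')]
        [string] $Method = 'WebClient'
    )

    switch ($Method) {
        'InvokeWebRequest' {
            Invoke-WebRequest -Uri $Uri -OutFile $Path
        }
        'WebClient' {
            $client = New-Object System.Net.WebClient
            try     { $client.DownloadFile($Uri, $Path) }
            finally { $client.Dispose() }
        }
        'Bits' {
            # Needs the BitsTransfer module, which is Windows-only
            Start-BitsTransfer -Source $Uri -Destination $Path
        }
    }
}
```

A call would look like `Save-RemoteFile -Uri $uri1 -Path $localfile1 -Method Bits`. Passing a full path is the safer choice for the WebClient branch, since .NET resolves relative paths against the process working directory, which can differ from the current PowerShell location.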

postanote
  • So basically, as per my newbie comprehension, the job is another kind of activity with its own internals? Not a straight session-inside-session construct. – ivanthestupid Apr 19 '21 at 07:55
  • Details: [About Jobs | MS Docs](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_jobs?view=powershell-7.1) PowerShell concurrently runs commands and scripts through jobs. There are three job types provided by PowerShell to support concurrency. 1. RemoteJob - Commands and scripts run on a remote session. For information, see about_Remote_Jobs. --- 2. BackgroundJob - Commands and scripts run in a separate process on the local machine. --- 3. PSTaskJob or ThreadJob - Commands and scripts run in a separate thread within the same process on the local machine. – postanote Apr 19 '21 at 08:14