
I am downloading a large ZIP file, unzipping it, and then deleting the ZIP to save space. Or at least trying to. Some non-trivial number of times, the delete that follows the unzip produces

System.Management.Automation.MethodInvocationException
Exception calling "Delete" with "1" argument(s): "The process cannot access the file 'C:\Users\Px_Install\AppData\Local\Temp\Revit_2023.zip' because it is being used by another process."

One thing I have used before in similar situations is to retry the delete over and over, with a couple-second delay between tries, and a total number of retries before giving up. But that's pretty ugly, especially because the exception type for the delete is pretty generic, so I can't really catch on JUST this condition.
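For context, the retry approach I mean looks roughly like this (the retry count and delay here are arbitrary):

```powershell
# Retry sketch: attempt the delete, waiting a few seconds between tries.
$maxRetries = 10
$retryDelaySeconds = 2
$deleted = $false

for ($i = 1; $i -le $maxRetries; $i++) {
    try {
        [System.IO.File]::Delete($downloadFilePath)
        $deleted = $true
        break
    } catch {
        # Too generic: this also catches unrelated failures, not just the lock.
        Start-Sleep -Seconds $retryDelaySeconds
    }
}

if (-not $deleted) {
    Write-Warning "Could not delete $downloadFilePath after $maxRetries tries"
}
```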

I am using

[IO.Compression.ZipFile]::ExtractToDirectory($downloadFilePath, $deploymentPath)

to do the unzip, and I wonder if there is anything I can do here to ensure that the file lock on the ZIP file is released immediately. Also, since this issue comes up in other situations too, I wonder if there is a more generic approach that would let me force a file lock to be released immediately, on a file-by-file basis. Failing that, there doesn't seem to be anything here I can .Dispose(), so maybe there is a different approach that doesn't use a static method, and which therefore provides something to .Dispose()?

EDIT: I am now trying to just identify what is holding the lock, per @santiago-squarzon's comment about AV. I can create a file that is locked, in the sense that I can't delete it, with this

#$path = '\\Mac\iCloud Drive\Px Tools\PS Concepts\FileLock and delete\New Text Document.txt'
$path = 'C:\New Text Document.txt'

$stream = [System.IO.StreamWriter]::new($path)

$stream.Close()

if $stream.Close() is commented out. As such, I modified the code there to just find the process that has the lock. No delete, and certainly no Stop-Process, since that process, in theory, should be the ISE where I am testing the code. So this

foreach ($process in Get-Process) { 
    if ($lockingProcess = ($process.Modules | where {$_.FileName -eq $path})) {
        break
    }
}
Write-Host "$($lockingProcess.Count)"
Write-Host "$($lockingProcess.Name)"

just before the Close. Nothing. I also tried

if ($lockingProcess = ($process.Modules | where {$_.FileName -contains $path})) {
    break
}

because that makes more logical sense to me, even though the linked code uses -eq, which may well work through some PowerShell magic. Still nothing, so then I tried

foreach ($process in Get-Process) { 
    Write-Host "$($process.Name)"
    foreach ($filename in $process.Modules.FileName) {
        Write-Host "  $filename"
    }
}

to just get a list of every file being used/locked by each process. Lots of files are listed, but never the target file. So now I wonder:

1: Is that linked code even viable? And

2: Am I really understanding what a locked file is?
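As far as I can tell now, the answer to 2 may be the problem: Get-Process only exposes .Modules, which lists the DLLs/EXEs mapped into a process, not arbitrary open file handles. A quick sketch that demonstrates the difference (the temp file name is arbitrary):

```powershell
# Hold a file open in the current process, then look for it in .Modules.
$testPath = Join-Path $env:TEMP 'locktest.txt'
$stream = [System.IO.StreamWriter]::new($testPath)

$hit = (Get-Process -Id $PID).Modules | Where-Object { $_.FileName -eq $testPath }

# $hit is $null: .Modules only lists loaded DLLs/EXEs, so a plain open
# file handle never shows up there, which is why the loops above find nothing.
Write-Host "Found in Modules: $($null -ne $hit)"

$stream.Close()
Remove-Item $testPath
```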

Gordon
  • The static method already handles the disposal; there is probably something else going on. Could there be something triggering AV? – Santiago Squarzon Nov 26 '22 at 13:29
  • As for the `catch`, you can examine the `.InnerException` in it and see if it's I/O – Santiago Squarzon Nov 26 '22 at 13:31
  • @santiago-squarzon I'll have to check, but yes it is at least possible it's AV. I am going to modify this solution (https://stackoverflow.com/questions/45713467/how-to-force-delete-an-open-file-using-powershell/45714289#45714289) to at least log WHAT has the lock. And I'll experiment with `.InnerException` a bit. Good thing to know about in general. The bummer is how intermittent the problem is, much harder to troubleshoot. – Gordon Nov 26 '22 at 13:40
  • @santiago-squarzon I managed to get a somewhat repeatable condition, with a very large ZIP file, and when the delete in PowerShell is having issues, a manual delete actually provides some interesting info. It specifically says `The action can't be completed because the folder is open in powershell.exe`, so it seems like PowerShell is not releasing after the unzip, but I am not finding any solution to handle that. – Gordon Nov 27 '22 at 12:12
  • @santiago-squarzon So, this seems curious, to me at least. I added `[Void]` in front of the `ExtractToDirectory()` line, and so far it seems to have solved the problem. Does that make sense, or is it possible/probably that I haven't solved the problem and it's just the intermittency of the issue that is making it look that way? – Gordon Nov 27 '22 at 12:52
  • Casting to `[void]` simply suppresses output. Don't think that has anything to with solving your problem – Santiago Squarzon Nov 27 '22 at 17:00
  • I figured, but I had a good couple of hours of tests work after I made that change on a lark. Now trying to figure out why `if ($_.Exception.InnerException -and ($_.Exception.InnerException -eq [System.IO.Exception])) {` doesn't work. Fun Sunday. :) – Gordon Nov 27 '22 at 18:06
  • Use `-is` not `-eq` – Santiago Squarzon Nov 27 '22 at 18:06
  • Doh! I know that. I probably shouldn't be desperately trying to solve this and sign code for a beta on a Sunday night. :) – Gordon Nov 27 '22 at 18:11
  • Don't use ISE to test your code, use the CLI (just read the updates in your question) – Santiago Squarzon Nov 27 '22 at 18:12
  • So, one of the two approaches I have found to isolating which process has the lock might actually work, but not in the ISE? Ugh. I'll give that a go. I tend to test little tidbits of code in the ISE before implementing and testing via CLI. But bad idea perhaps. Same limitation is true with VS Code? Or is that running in the CLI for all intents and purposes? Want to dump ISE for a host of reasons, that would be a good additional one. – Gordon Nov 27 '22 at 18:15
  • I would tell you to avoid ISE at all costs. Terrible coding experience. I use VS Code + PowerShell Preview Extension. – Santiago Squarzon Nov 27 '22 at 18:19
  • I'm an Architect who taught myself programming, and started with PS before VS Code was a thing. So I got used to the ISE and didn't realize how bad it is. I am beginning to understand. :) – Gordon Nov 27 '22 at 18:25

1 Answer


There are two things you could try while still using the static method. After your call to .ExtractToDirectory(...):

[GC]::Collect()
[GC]::WaitForPendingFinalizers()

Then try Remove-Item. If that doesn't help, see if the Sysinternals handle utility gives you more details as to what is holding the handle on the ZIP file. Also, ensure the ZIP file is not open in another program.
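Putting the first suggestion together with the extract call from the question, the whole sequence would look roughly like this (the Add-Type line is only needed on Windows PowerShell 5.1; the path variables are the ones from the question):

```powershell
# Windows PowerShell 5.1 needs the assembly loaded explicitly;
# PowerShell 7+ resolves it on demand.
Add-Type -AssemblyName System.IO.Compression.FileSystem

[IO.Compression.ZipFile]::ExtractToDirectory($downloadFilePath, $deploymentPath)

# Force a garbage-collection pass and wait for finalizers, so any stray
# handle on the archive is released before the delete runs.
[GC]::Collect()
[GC]::WaitForPendingFinalizers()

Remove-Item -LiteralPath $downloadFilePath
```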


As for the alternative to the static method, don't think this will solve your problem but at least this answers this part of your question:

...so maybe there is a different approach that doesn't use a static method, which then provides a way to .Dispose?

Do note that the static method is already disposing for you, hence why I don't believe this code will solve your problem, but it may be worth a try. The usage is pretty similar to Expand-Archive.

using namespace System.IO
using namespace System.IO.Compression

function Expand-ZipArchive {
    [CmdletBinding(DefaultParameterSetName = 'Path')]
    param(
        [Parameter(ParameterSetName = 'Path', Mandatory, Position = 0, ValueFromPipeline, ValueFromPipelineByPropertyName)]
        [string] $Path,

        [Parameter(ParameterSetName = 'LiteralPath', Mandatory, ValueFromPipelineByPropertyName)]
        [Alias('PSPath')]
        [string] $LiteralPath,

        [Parameter(Mandatory)]
        [string] $DestinationPath,

        [Parameter()]
        [switch] $PassThru
    )

    begin {
        Add-Type -AssemblyName System.IO.Compression
        $DestinationPath = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($DestinationPath)
    }
    process {
        $arguments = switch($PSCmdlet.ParameterSetName) {
            Path { $Path, $false, $false }
            LiteralPath { $LiteralPath, $false, $true }
        }

        foreach($item in $ExecutionContext.InvokeProvider.Item.Get.Invoke($arguments)) {
            try {
                $fileStream = $item.Open([FileMode]::Open)
                $zipArchive = [ZipArchive]::new($fileStream, [ZipArchiveMode]::Read)
                foreach($entry in $zipArchive.Entries) {
                    $destPath = [Path]::GetFullPath([Path]::Combine($DestinationPath, $entry.FullName))

                    # if it's a folder, create it and go next
                    if(-not $entry.Name) {
                        $null = [Directory]::CreateDirectory($destPath)
                        continue
                    }

                    $destParent = [Path]::GetDirectoryName($destPath)

                    # [Path]::Exists() only exists on .NET 7+; Directory.Exists works everywhere
                    if(-not [Directory]::Exists($destParent)) {
                        $null = [Directory]::CreateDirectory($destParent)
                    }

                    $childStream   = [File]::Create($destPath)
                    $wrappedStream = $entry.Open()
                    $wrappedStream.CopyTo($childStream)
                    $childStream, $wrappedStream | ForEach-Object Dispose

                    if($PassThru.IsPresent) {
                        $childStream.Name -as [FileInfo]
                    }
                }
            }
            catch {
                $PSCmdlet.WriteError($_)
            }
            finally {
                $zipArchive, $fileStream | ForEach-Object Dispose
            }
        }
    }
}
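Usage is then analogous to Expand-Archive (the paths here are illustrative):

```powershell
# Dot-source or paste the function above, then:
Expand-ZipArchive -Path 'C:\Temp\Revit_2023.zip' -DestinationPath 'C:\Deploy' -PassThru

# Because the function disposes its own streams in the finally block,
# the source archive can be removed immediately afterwards:
Remove-Item -LiteralPath 'C:\Temp\Revit_2023.zip'
```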
Santiago Squarzon
  • What is the difference between `[GC]::Collect()` & `[GC]::WaitForPendingFinalizers()`? I did try `[GC]::Collect()` and it makes a difference. But I have read that forcing GC is not a good idea, though the article explaining why went WAY over my head. My latest test does 50 six-second wait cycles between tries. With no GC I had a few occasions where it timed out at 50 cycles. But when I add the GC, only at 10 cycles, I have had tests that finish before that, but also a number of tests that finish immediately after the GC. – Gordon Nov 28 '22 at 12:56
  • So, I've been doing some more reading up on GC, and it seems whenever someone talks about not forcing GC, they are concerned with memory USE only, arguing that GC will trigger automatically if memory use is an issue, and that if it's not, forcing GC can affect the performance of a program because GC can stall the UI (seems like a crappy tradeoff to me). That said, I am not concerned about available RAM; I am concerned about a file lock caused by something GC can take care of. – Gordon Nov 28 '22 at 13:51
  • My program is a command line program, and the pause waiting for the lock to release is orders of magnitude longer than the pause caused by GC. So I added the two `[GC]` methods mentioned, immediately after the Extract, and the problem seems to be solved. In addition, I have another situation, post software install, where I want to either delete a folder full of bloated log files, or zip them and then delete. But again, a file lock because the installer hasn't released something. So, added the same two lines after the install line and running some tests. – Gordon Nov 28 '22 at 13:54
  • All that said, I am still not clear on what the second `[GC]` method does. It seems like I get the same benefit from only doing the `Collect()`. – Gordon Nov 28 '22 at 13:55
  • For anyone wondering, this thread (https://stackoverflow.com/questions/12265598/is-correct-to-use-gc-collect-gc-waitforpendingfinalizers) explains what that second `[GC]` method is doing. Using both together seems to be the safe way to go, though in my case the actual GC effort is pretty small so waiting is maybe not strictly necessary. Super frustrating that 100% of the info I have found deals ONLY with memory performance, and completely ignores the potential for a file lock, that can ONLY be solved by forcing GC. But now I know, and have been 100% successful in all tests since implementing. – Gordon Nov 30 '22 at 11:05