
This is what I am currently trying to execute:

$folderPath = 'M:\abc\WORKFORCE\Media\Attachments' 
 
Write-Host "Executing Script..." 
 
foreach ($file in Get-ChildItem $folderPath -file)
{
       # execute code 
}

However, when I execute the PowerShell script it freezes on me. It's been this way for an hour now. I'm assuming it might be because the directory has over 8 million items in it. Is there a more efficient way to move these items? Is waiting my only option? Or is it not possible to do this at all with PowerShell because of how large the directory is?

Eric Gumba
  • Assign your `Get-ChildItem` to `$folderpath` so it's not included in your foreach loop. Also, what's your end goal? – Abraham Zinala May 14 '21 at 03:37
  • You probably need to dispatch a job in parallel. Checkout my answer in this question: https://stackoverflow.com/questions/67393819/recursively-call-a-function-from-itself-inside-a-foreach-object-parallel-block/67395658#67395658 – PollusB May 14 '21 at 05:45
  • Use `Get-ChildItem |ForEach-Object {...}` instead – Mathias R. Jessen May 14 '21 at 07:36
  • Do you use only `file names` inside `#execute code` or you operate with file properties like size, attributes, etc ? – filimonic May 14 '21 at 08:14
  • Definitely the first comment holds part of the answer to your issue. `foreach ($file in Get-ChildItem ...)` has your script check through 8 million files on the disk. You will want to put the results of Get-ChildItem into a variable to enable much faster searching. Otherwise, you'll be running that command 7-8 million times, where you only really need to run it once. Let us know how much this input improves your script performance, and we'll take a deeper look at it. – Martin Sandgaard Rasmussen May 14 '21 at 08:33
  • You are performing `Get-ChildItem` for every file found :) Gather data once, then perform foreach on data. After that - script should painfully, but gracefully finish working :) PS. For long scripts I like to output something like `Write-Output "Working on file $file";` so that I know that the script IS working – Karolina Ochlik May 14 '21 at 08:37
  • The issue is more about [NTFS](https://stackoverflow.com/q/197162/503046) and less about Powershell. – vonPryz May 14 '21 at 09:43
  • @vonPryz, perhaps you can expand on that? Like, what knowledge would OP need to dive into, to obtain the highest level of Powershell-script perfomance to lift this task? – Martin Sandgaard Rasmussen May 14 '21 at 09:50

2 Answers


When you do not need any information except the file name, you should use [System.IO.Directory]::EnumerateFiles($folderPath, '*')

EnumerateFiles returns IEnumerable[String]. IEnumerable is a special type that can be used in foreach statements. It does not load the whole listing into memory; instead it fetches the next item only when requested. It starts returning results almost immediately.

So, your code will be

$filesIEnumerable = [System.IO.Directory]::EnumerateFiles($folderPath,'*')
foreach ($fullName in $filesIEnumerable) {
    # code here
    $fileName = [System.IO.Path]::GetFileName($fullName)
    # more code here
}

In case you want to keep the full list of files in memory instead of iterating once (for example, you need to iterate several times), EnumerateFiles is still faster and requires less memory than Get-ChildItem, because it does not fetch any extended file attributes:

$files = @([System.IO.Directory]::EnumerateFiles($folderPath,'*'))

Read more about EnumerateFiles at learn.microsoft.com
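
Since the question is about moving files, here is a minimal sketch of how a move could be combined with this enumeration. The destination folder F:\DestinationFolder is an assumption for illustration, not something from the question:

$folderPath = 'M:\abc\WORKFORCE\Media\Attachments'
$destinationPath = 'F:\DestinationFolder'  # hypothetical destination

foreach ($fullName in [System.IO.Directory]::EnumerateFiles($folderPath, '*')) {
    $fileName = [System.IO.Path]::GetFileName($fullName)
    # Move each file as soon as it is enumerated; no 8-million-item array is built first
    [System.IO.File]::Move($fullName, (Join-Path $destinationPath $fileName))
}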

filimonic
  • For perfectioning the answer, perhaps you could include a snip with how to move a file found with your enumeration. It seems that is what the OP needs, to close the case :) – Martin Sandgaard Rasmussen May 14 '21 at 09:54

Without further explanation of what the end goal of the script is, there cannot really be a complete solution to this question. However, a tip on performance can be given.

Original script:

$folderPath = 'M:\abc\WORKFORCE\Media\Attachments' 
 
Write-Host "Executing Script..." 
 
foreach ($file in Get-ChildItem $folderPath -file)
{
       # execute code 
}

Suggested approach:

$files = Get-ChildItem 'M:\abc\WORKFORCE\Media\Attachments' -File
$destinationPath = 'F:\DestinationFolder'

Write-Host "Executing Script..."

$files | ForEach-Object {
       # execute code
       # Write-Verbose "Moving $($_.Name)"
       # Move-Item -Path $_.FullName -Destination $destinationPath
}

That being said, it looks like filimonic's answer executes faster than my suggestion. (To expand on that, check this thread.)
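
As Mathias points out in the comments below, piping Get-ChildItem directly into ForEach-Object avoids materializing the full array before processing starts. A minimal sketch of that variant, using the same placeholder destination folder:

Get-ChildItem 'M:\abc\WORKFORCE\Media\Attachments' -File | ForEach-Object {
    # Each file is processed as it streams out of Get-ChildItem,
    # so work starts immediately instead of after a full directory scan
    Move-Item -LiteralPath $_.FullName -Destination 'F:\DestinationFolder'
}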

  • You will experience a lag, once you run the script, due to the $Files variable being built. Once that is done, you'll experience a lag while the array is being searched for your queries. Also, keep an eye on memory consumption, while running this. 8 Million files is quite a bit. – Martin Sandgaard Rasmussen May 14 '21 at 08:37
  • How is this different from the original (in which OP will also need to wait for the exact same 2 things sequentially)? – Mathias R. Jessen May 14 '21 at 08:50
  • I noticed your reply to the thread, and i don't see a problem with it. I read up on it, and it looks like your suggestion has quite an edge over the ForEach. Your approach seems better. Mind taking a look at the suggestion edit i made? ( For reference: https://devblogs.microsoft.com/scripting/weekend-scripter-powershell-speed-improvement-techniques/ ) – Martin Sandgaard Rasmussen May 14 '21 at 09:04
  • 1
    It is. The reason is that the pipeline doesn't block - ForEach-Object can start operating on the output stream immediately, it doesn't have to wait for the materialization of a 7M-item array :) see https://stackoverflow.com/a/48523475/712649 for a better explanation of pipeline semantics and timing – Mathias R. Jessen May 14 '21 at 09:14