
I'm trying to see if a solution with Robocopy could be faster than using Get-ChildItem to get a list of the files inside a given folder (and its subfolders).

In my code I'm using the Get-ChildItem cmdlet to get a list of all the files inside a specific folder in order to loop over each of these files:

$files = Get-ChildItem "C:\aaa" -Recurse | where {! $_.PSIsContainer} # ! because I don't want to list folders
foreach ($file in $files){
...
}

Now, I have a robocopy command that lists all the files; however, the output of robocopy is a string.

[string]$result = robocopy "C:\aaa" NULL /l /s /ndl /xx /nc /ns /njh /njs /fp

So, how can I use the output from the robocopy command to loop over each file (similar to what was done with Get-ChildItem)?

m_power
  • What version of PowerShell are you using? If it's 3.0 or above you could remove the `where {! $_.PSIsContainer}` and just do `$files = Get-ChildItem "C:\aaa" -Recurse -File` to retrieve only files, which might improve performance. – Lance U. Matthews Feb 05 '15 at 21:55
  • @BACON, I'm using 2.0. – m_power Feb 06 '15 at 19:06

3 Answers


If you're just looking for a faster way to get that list of files, the legacy dir command will do that:

$files = cmd /c dir c:\aaa /b /s /a-d
foreach ($file in $files){
...
}
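
Note that `dir /b /s /a-d` returns plain path strings rather than FileInfo objects, so if the loop body needs file properties you have to resolve each path yourself. A minimal sketch of that (the Get-Item call is my addition, not part of the original answer):

$files = cmd /c dir c:\aaa /b /s /a-d
foreach ($path in $files){
    # resolve the raw path string back to a FileInfo object when file properties are needed
    $file = Get-Item -LiteralPath $path
    $file.LastWriteTime   # placeholder for the real per-file work
}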

Edit: Some comparative performance tests:

(measure-command {gci -r |? {-not $_.psiscontainer } }).TotalMilliseconds
(measure-command {gci -r -file}).TotalMilliseconds
(measure-command {(robocopy . NULL /l /s /ndl /xx /nc /ns /njh /njs /fp) }).TotalMilliseconds
(measure-command {cmd /c dir /b /s /a-d }).TotalMilliseconds

627.5434
417.8881
299.9069
86.9364

The tested directory had 6812 files in 420 sub-directories.

mjolinor
  • Since I have to use that command on a folder with >100k files, your solution with `cmd dir` greatly improves the performance, so I selected your answer (the other answers also provide great information). – m_power Feb 06 '15 at 19:28
  • If you need it really fast, then have a look here: https://stackoverflow.com/questions/63956318/fastest-way-to-find-a-full-path-of-a-given-file-via-powershell/64029786#64029786 – Carsten Oct 20 '20 at 14:00
$array = $files -split '\r?\n'

I'm assuming $files is text separated by line breaks. This will split the string at the line breaks and assign the resulting array to $array.
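
A sketch of how that could be combined with the original robocopy call, assuming the output is piped through Out-String so the line breaks are preserved (the trim and the empty-line filter are my additions):

$result = robocopy "C:\aaa" NULL /l /s /ndl /xx /nc /ns /njh /njs /fp | Out-String
$array = $result -split '\r?\n' | ForEach-Object { $_.Trim() } | Where-Object { $_ }
foreach ($file in $array){
    $file   # each element is a single, trimmed path string; the real per-file work goes here
}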

briantist

It looks like the Robocopy output lines are padded with whitespace, so each line needs to be trimmed. This works:

(robocopy . NULL /l /s /ndl /xx /nc /ns /njh /njs /fp) | % {gci $_.trim()}
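
If you prefer the question's foreach pattern over the pipeline, the same idea could be written as below (a sketch; the empty-line check and the Get-Item call are my additions):

$lines = robocopy "C:\aaa" NULL /l /s /ndl /xx /nc /ns /njh /njs /fp
foreach ($line in $lines){
    $path = $line.Trim()              # strip the whitespace robocopy pads each line with
    if ($path) {
        Get-Item -LiteralPath $path   # placeholder: resolve to a FileInfo object and work with it
    }
}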

Whether it is quicker depends on how you filter for files. If your PS version supports -file for the gci cmdlet (thereby handing the filtering to the file system provider), then PS is fastest. Using Where-Object roughly doubles that time, whilst Robocopy is somewhere in between (for this example of 240 files):

measure-command {gci -r -file}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 24
Ticks             : 245746
TotalDays         : 2.84428240740741E-07
TotalHours        : 6.82627777777778E-06
TotalMinutes      : 0.000409576666666667
TotalSeconds      : 0.0245746
TotalMilliseconds : 24.5746



measure-command  {gci -r | ? {$_.PSIsContainer -eq $false}}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 48
Ticks             : 480647
TotalDays         : 5.56304398148148E-07
TotalHours        : 1.33513055555556E-05
TotalMinutes      : 0.000801078333333333
TotalSeconds      : 0.0480647
TotalMilliseconds : 48.0647



measure-command {(robocopy . NULL /l /s /ndl /xx /nc /ns /njh /njs /fp)}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 36
Ticks             : 365689
TotalDays         : 4.23251157407407E-07
TotalHours        : 1.01580277777778E-05
TotalMinutes      : 0.000609481666666667
TotalSeconds      : 0.0365689
TotalMilliseconds : 36.5689
ConanW