32

I want to write a PowerShell script that will recursively search a directory, but exclude specified files (for example, *.log, and myFile.txt), and also exclude specified directories, and their contents (for example, myDir and all files and folders below myDir).

I have been working with the Get-ChildItem CmdLet, and the Where-Object CmdLet, but I cannot seem to get this exact behavior.

Peter Mortensen
Sako73

6 Answers

60

I like Keith Hill's answer except it has a bug that prevents it from recursing past two levels. These commands manifest the bug:

New-Item level1/level2/level3/level4/foobar.txt -Force -ItemType file
cd level1
GetFiles . xyz | % { $_.fullname }

With Hill's original code you get this:

...\level1\level2
...\level1\level2\level3

Here is a corrected, and slightly refactored, version:

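function GetFiles($path = $pwd, [string[]]$exclude)
{
    foreach ($item in Get-ChildItem $path)
    {
        # Skip anything whose name matches an exclusion pattern; a skipped
        # directory is never entered, so its whole subtree is pruned.
        if ($exclude | Where {$item.Name -like $_}) { continue }

        $item
        # Pass the full path so the test does not depend on the current location.
        if (Test-Path $item.FullName -PathType Container)
        {
            GetFiles $item.FullName $exclude
        }
    }
}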

With that bug fix in place you get this corrected output:

...\level1\level2
...\level1\level2\level3
...\level1\level2\level3\level4
...\level1\level2\level3\level4\foobar.txt

I also like ajk's answer for its conciseness though, as he points out, it is less efficient. It is less efficient because Hill's algorithm stops traversing a subtree when it finds a prune target, while ajk's continues. But ajk's answer also suffers from a flaw, one I call the ancestor trap. Consider a path such as this one, which includes the same path component (i.e. subdir2) twice:

\usr\testdir\subdir2\child\grandchild\subdir2\doc

Set your location somewhere in between, e.g. cd \usr\testdir\subdir2\child, then run ajk's algorithm to filter out the lower subdir2 and you will get no output at all, i.e. it filters out everything because of the presence of subdir2 higher in the path. This is a corner case, though, and not likely to be hit often, so I would not rule out ajk's solution due to this one issue.
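To make the ancestor trap concrete, here is a small sketch that works purely on path strings, so it can be run without creating any directories. The sibling file other.txt and the variable names are made up for illustration; the first pattern follows the style of ajk's answer, and the relative-path variant is just one possible way around the trap.

```powershell
# Hypothetical paths, following the example above; nothing here touches the disk.
$startDir = '\usr\testdir\subdir2\child'
$paths = @(
    '\usr\testdir\subdir2\child\other.txt',              # should be kept
    '\usr\testdir\subdir2\child\grandchild\subdir2\doc'  # should be pruned
)

# Full-path test: the upper subdir2 matches too, so every path is rejected.
$fullPathResult = @($paths | Where-Object { $_ -notmatch '\\subdir2($|\\)' })

# Relative-path test: strip the starting directory first, then match whole components.
$relativeResult = @($paths | Where-Object {
    $relative = $_.Substring($startDir.Length + 1)
    $relative -notmatch '(^|\\)subdir2($|\\)'
})

"Full-path filter kept:     $($fullPathResult.Count)"
"Relative-path filter kept: $($relativeResult.Count)"
```

The full-path filter keeps nothing, while the relative-path filter keeps only other.txt.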

Nonetheless, I offer here a third alternative, one that does not have either of the above two bugs. Here is the basic algorithm, complete with a convenience definition for the path or paths to prune--you need only modify $excludeList to your own set of targets to use it:

$excludeList = @("stuff","bin","obj*")
Get-ChildItem -Recurse | % {
    $pathParts = $_.FullName.substring($pwd.path.Length + 1).split("\");
    if ( ! ($excludeList | where { $pathParts -like $_ } ) ) { $_ }
}

My algorithm is reasonably concise but, like ajk's, it is less efficient than Hill's (for the same reason: it does not stop traversing subtrees at prune targets). However, my code has an important advantage over Hill's--it can pipeline! It is therefore amenable to fit into a filter chain to make a custom version of Get-ChildItem while Hill's recursive algorithm, through no fault of its own, cannot. ajk's algorithm can be adapted to pipeline use as well, but specifying the item or items to exclude is not as clean, being embedded in a regular expression rather than a simple list of items that I have used.
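The test `$pathParts -like $_` relies on the fact that PowerShell comparison operators act as filters when the left operand is an array: they return the matching elements rather than a Boolean. Here is a minimal sketch of that check in isolation, using a made-up helper name (Test-Pruned) and made-up relative paths:

```powershell
$excludeList = @('stuff', 'bin', 'obj*')

# Test-Pruned is a hypothetical helper, not part of the code above.
function Test-Pruned([string]$RelativePath, [string[]]$ExcludeList) {
    $pathParts = $RelativePath.Split('\')
    # With an array on the left, -like returns the components that match the pattern.
    $hits = @($ExcludeList | Where-Object { $pathParts -like $_ })
    return ($hits.Count -gt 0)
}

Test-Pruned 'src\objects\main.cs' $excludeList   # True: "objects" matches obj*
Test-Pruned 'src\lib\main.cs' $excludeList       # False: no component matches
Test-Pruned 'src\lib\bin' $excludeList           # True: the leaf name is tested too
```

Note the last case: because every path component is tested, a file whose name matches an entry in the list is excluded as well, not only directories.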

I have packaged my tree pruning code into an enhanced version of Get-ChildItem. Aside from my rather unimaginative name--Get-EnhancedChildItem--I am excited about it and have included it in my open source PowerShell library. It includes several other new capabilities besides tree pruning. Furthermore, the code is designed to be extensible: if you want to add a new filtering capability, it is straightforward to do. Essentially, Get-ChildItem is called first, and pipelined into each successive filter that you activate via command parameters. Thus something like this...

Get-EnhancedChildItem -Recurse -Force -Svn `
    -Exclude *.txt -ExcludeTree doc*,man -FullName -Verbose

... is converted internally into this:

Get-ChildItem | FilterExcludeTree | FilterSvn | FilterFullName

Each filter must conform to certain rules: accepting FileInfo and DirectoryInfo objects as inputs, generating the same as outputs, and using stdin and stdout so it may be inserted in a pipeline. Here is the same code refactored to fit these rules:

filter FilterExcludeTree()
{
  $target = $_
  Coalesce-Args $Path "." | % {
    $canonicalPath = (Get-Item $_).FullName
    if ($target.FullName.StartsWith($canonicalPath)) {
      $pathParts = $target.FullName.substring($canonicalPath.Length + 1).split("\");
      if ( ! ($excludeList | where { $pathParts -like $_ } ) ) { $target }
    }
  }
} 

The only additional piece here is the Coalesce-Args function (found in this post by Keith Dahlby), which merely sends the current directory down the pipe in the event that the invocation did not specify any paths.
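For readers who have not used the `filter` keyword: a filter is a function whose body runs once per pipeline object, with `$_` bound to that object. Below is a pared-down sketch of the same idea with the root directory and exclusion list passed as explicit parameters. The name Skip-Tree, its parameters, and the sample objects are assumptions for illustration, not the library's actual code, and the sketch omits the StartsWith check and the Coalesce-Args handling shown above.

```powershell
# Skip-Tree is a hypothetical stand-in for FilterExcludeTree.
filter Skip-Tree([string]$Root, [string[]]$ExcludeTree) {
    $target = $_
    $rootPath = $Root.TrimEnd('\')
    # Ignore the root itself (or anything with a shorter path).
    if ($target.FullName.Length -le $rootPath.Length) { return }
    $pathParts = $target.FullName.Substring($rootPath.Length + 1).Split('\')
    # Emit the object only if no path component matches an exclusion pattern.
    if (-not ($ExcludeTree | Where-Object { $pathParts -like $_ })) { $target }
}

# Stand-in objects carrying only a FullName property (needs PowerShell 3.0 or later);
# in real use the input would come from Get-ChildItem -Recurse.
$items = 'C:\proj\src\a.cs', 'C:\proj\bin\a.dll', 'C:\proj\docs\obj1\x.txt' |
    ForEach-Object { [pscustomobject]@{ FullName = $_ } }

$kept = @($items | Skip-Tree -Root 'C:\proj' -ExcludeTree 'bin', 'obj*')
$kept | ForEach-Object { $_.FullName }
```

Only C:\proj\src\a.cs survives: the other two paths each contain a component (bin, obj1) that matches an exclusion pattern.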

Because this answer is getting somewhat lengthy, rather than go into further detail about this filter, I refer the interested reader to my recently published article on Simple-Talk.com entitled Practical PowerShell: Pruning File Trees and Extending Cmdlets where I discuss Get-EnhancedChildItem at even greater length. One last thing I will mention, though, is another function in my open source library, New-FileTree, that lets you generate a dummy file tree for testing purposes so you can exercise any of the above algorithms. And when you are experimenting with any of these, I recommend piping to % { $_.fullname } as I did in the very first code fragment for more useful output to examine.

Michael Sorens
  • +1 just noticed this answer because of a comment on my own. Really nice job! I knew my way had a bit of a duct tape and bubble gum flavor to it, but you've taken the solution to the next level. Thanks for pointing out the ancestor trap as well. While you're right that it's unlikely to come up very often, it's something you should be aware of *before* it bites you. – ajk May 31 '13 at 01:54
  • Thanks for the kind words @ajk. But do not sell yourself short; your answer definitely has merit for its brevity. – Michael Sorens May 31 '13 at 02:16
  • I have set up your test folder hierarchy, then tried your "corrected, and slightly refactored" version, but it still produces the same output as Keith Hill's version. That is, only level2 and level3 are displayed. I tried your "third alternative" and that one works. It displays all levels and the foobar.txt file. I am using PS version 2 on Win 7. FYI. – Sabuncu May 31 '13 at 20:39
  • @Sabuncu: You must be doing something slightly different--I just retested the example code in both V2 and V3 and it works as advertised. – Michael Sorens Jun 01 '13 at 14:51
  • Tested it again, no difference. (Did not type in your code, copied and pasted directly, and double checked.) – Sabuncu Jun 02 '13 at 10:31
  • Can you please explain the following line in your code: `if (!($excludeList | where { $pathParts -like $_ })) { $_ }` - is the context for the second `$_` different than the context for the first `$_`? Thanks. – Sabuncu Jun 10 '13 at 22:46
  • Yes, they are different. This is a sub-pipeline: `$excludeList | where { $pathParts -like $_ }` so `$_` takes on the value of each member of the exclusion list. Now let's rewrite the original line abstractly as `if (not_on_exclusion_list) { $_ }`. That is, if it passes the condition, output the member of the current pipeline, i.e. the current item returned by `Get-ChildItem`. – Michael Sorens Jun 11 '13 at 02:21
  • Thank you msorens, that was what was tripping me up. MUCH appreciated. +1 – Sabuncu Jun 11 '13 at 17:54
  • Get-EnhancedChildItem is now located [here](http://cleancode.sourceforge.net/api/powershell/CleanCode/FileTools/Get-EnhancedChildItem.html) – Eric Sabine Jan 22 '14 at 16:11
  • Thanks for that update, @EricSabine! I have updated the links in the body of my answer above as well. – Michael Sorens Jan 22 '14 at 21:35
  • An old post but may be worth clarifying. The _directory_ check above didn't work in my case i.e. `Test-Path $item -PathType Container`. I had to use `$item.PSIsContainer` instead. PS: I am using Get-ChildItem with -Recurse switch (in case it has any effect). – Tariq Feb 05 '14 at 19:42
  • @Tariq: The statement you referenced is _not_ from my code above; that is from Keith Hill's original answer and is, in fact, one of the issues my code addresses. If you look above at my GetFiles function you will see that I use `$item.FullName` rather than just `$item` as the first argument to `Test-Path`, which should be all you need to make it work for you. – Michael Sorens Feb 09 '14 at 00:07
28

The Get-ChildItem cmdlet has an -Exclude parameter that is tempting to use but it doesn't work for filtering out entire directories from what I can tell. Try something like this:

function GetFiles($path = $pwd, [string[]]$exclude) 
{ 
    foreach ($item in Get-ChildItem $path)
    {
        if ($exclude | Where {$item -like $_}) { continue }

        if (Test-Path $item.FullName -PathType Container) 
        {
            $item 
            GetFiles $item.FullName $exclude
        } 
        else 
        { 
            $item 
        }
    } 
}
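As a variation on the function above, the following sketch compares against `$item.Name` and uses the `PSIsContainer` property for the directory check, then exercises the pruning on a throwaway tree. The function name Get-PrunedChildItem and the demo file names are made up for illustration; treat it as one possible arrangement rather than a definitive implementation.

```powershell
# Get-PrunedChildItem is a hypothetical name; the logic mirrors GetFiles above.
function Get-PrunedChildItem {
    param([string]$Path = '.', [string[]]$Exclude)
    foreach ($item in Get-ChildItem -LiteralPath $Path) {
        $name = $item.Name
        # Compare the bare name, so the result does not depend on how the item converts to a string.
        if ($Exclude | Where-Object { $name -like $_ }) { continue }
        $item
        if ($item.PSIsContainer) {
            # Excluded directories never reach this point, so their subtrees are not read.
            Get-PrunedChildItem -Path $item.FullName -Exclude $Exclude
        }
    }
}

# Build a small throwaway tree under the temp directory to try it out.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("prune-demo-" + [guid]::NewGuid())
$demoFiles = @(
    'keep.txt',
    'skip.log',
    'myFile.txt',
    (Join-Path 'myDir' 'inner.txt'),
    (Join-Path (Join-Path 'sub' 'deep') 'keep2.txt')
)
foreach ($file in $demoFiles) {
    New-Item -Path (Join-Path $root $file) -ItemType File -Force | Out-Null
}

$names = @(Get-PrunedChildItem -Path $root -Exclude '*.log', 'myFile.txt', 'myDir' |
    ForEach-Object { $_.Name })
$names -join ', '

# Clean up the demo tree.
Remove-Item -LiteralPath $root -Recurse -Force
```

The result contains keep.txt, sub, deep and keep2.txt; skip.log, myFile.txt, myDir and everything below myDir are absent.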
Keith Hill
  • I like the way you use the if with a pipeline inside, excellent concise syntax and like @jonZ says you forgot the $exclude parameter in the recursive call – mjsr Nov 06 '11 at 14:07
  • @jonZ, yes that arg should get passed down with the recursive call. Good catch. – Keith Hill Nov 06 '11 at 16:54
  • An old post but may be worth clarifying. The _directory_ check above didn't work in my case i.e. `Test-Path $item -PathType Container`. I had to use `$item.PSIsContainer` instead. PS: I am using Get-ChildItem with -Recurse switch (in case it has any effect). – Tariq Feb 05 '14 at 19:40
  • This answer was mentioned in the blog post *[Practical PowerShell: Pruning File Trees and Extending Cmdlets](https://www.simple-talk.com/dotnet/.net-framework/practical-powershell-pruning-file-trees-and-extending-cmdlets/)*, including that it has a bug(?), the same bug mentioned in Michael Sorens' answer (same author?). – Peter Mortensen Nov 24 '15 at 15:43
12

Here's another option, which is less efficient but more concise. It's how I generally handle this sort of problem:

Get-ChildItem -Recurse .\targetdir -Exclude *.log |
  Where-Object { $_.FullName -notmatch '\\excludedir($|\\)' }

The '\\excludedir($|\\)' expression allows you to exclude the directory and its contents at the same time.
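One detail to watch when adapting the pattern: the directory name is interpreted as a regular expression, so a name containing metacharacters such as a dot should be escaped. A minimal sketch, assuming a hypothetical directory called my.dir and made-up paths:

```powershell
# 'my.dir' is a hypothetical directory name containing a regex metacharacter.
$excludeDir = 'my.dir'
$pattern = '\\' + [regex]::Escape($excludeDir) + '($|\\)'

'C:\src\my.dir\a.txt' -notmatch $pattern   # False: inside the excluded directory
'C:\src\myXdir\a.txt' -notmatch $pattern   # True: a different directory, so it is kept
'C:\src\my.dir'       -notmatch $pattern   # False: the directory itself
```

Without the call to `[regex]::Escape`, the dot would match any character and myXdir would be filtered out as well.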

Update: Please check the excellent answer from msorens for an edge case flaw with this approach, and a much more fleshed out solution overall.

ajk
  • +1 Can you please explain what the expression `\\excludedir($|\\)` does? Thank you. – Sabuncu May 30 '13 at 19:10
  • Sure thing! It is a regular expression that matches any file or folder whose full path contains `\excludedir`. The `($|\\)` part means the pattern matches the end of the full path name or a trailing backslash. So it would match `\dir1\dir2\excludedir` or `dir1\excludedir\dir2`. I highly recommend checking out the answer from @msorens. Aside from being an excellent answer in general, he points out a shortcoming in my approach. – ajk May 31 '13 at 01:50
  • +1 Thanks, really like this expression, have put it in my notebook. Please also see my comments to msorens' answer. FYI: Your solution also only goes two levels deep. Don't understand why this is. – Sabuncu Jun 02 '13 at 11:00
0

Recently, I explored how to parameterize both the folder to scan and the location where the result of the recursive scan is stored. At the end, the script also summarizes the number of folders scanned and the number of files inside them. I am sharing it with the community in case it helps other developers.

    ##Script Starts
    # Read the folder to scan and the folder in which to place the report

    $whichFolder = Read-Host -Prompt 'Which folder to Scan?'
    $whereToPlaceReport = Read-Host -Prompt 'Where to place Report'
    $totalFolders = 1    # counts the main folder itself
    $totalFiles = 0

    Write-Host "Process started..."

    # IMPORTANT: "?" is used as the field separator because a Windows file name cannot contain it

    # Get the top-level subfolders for the foreach loop below
    $DFSFolders = Get-ChildItem -Path $whichFolder | Where-Object { $_.PSIsContainer } | Select-Object Name, FullName

    # Files directly inside the main folder (the folder entered above, not a hard-coded path)
    $mainFiles = Get-ChildItem -Path $whichFolder -File
    ("Folder Path" + "?" + "Folder Name" + "?" + "File Name" + "?" + "File Length") | Out-File "$whereToPlaceReport\Report.csv" -Append

    # Loop through the files in the main folder
    foreach ($file in $mainFiles)
    {
        $totalFiles = $totalFiles + 1
        ($whichFolder + "?" + "Main Folder" + "?" + $file.Name + "?" + $file.Length) | Out-File "$whereToPlaceReport\Report.csv" -Append
    }


    foreach ($DFSfolder in $DFSfolders)
    {
        # Count each top-level subfolder
        $totalFolders = $totalFolders + 1

        Write-Host " Reading folder $($DFSfolder.FullName)"

        # Recurse through this subfolder; -File keeps nested folders out of the file count
        $files = Get-ChildItem -Path "$whichFolder\$($DFSfolder.Name)" -Recurse -File

        foreach ($file in $files)
        {
            $totalFiles = $totalFiles + 1
            ($DFSfolder.FullName + "?" + $DFSfolder.Name + "?" + $file.Name + "?" + $file.Length) | Out-File "$whereToPlaceReport\Report.csv" -Append
        }
    }


    # If running in the console, print the summary and wait for input before closing.
    if ($Host.Name -eq "ConsoleHost")
    {
        Write-Host ""
        Write-Host ""
        Write-Host ""

        Write-Host  "                            **Summary**"  -ForegroundColor Red
        Write-Host  "                            ------------" -ForegroundColor Red

        Write-Host  "                           Total Folders Scanned = $totalFolders "  -ForegroundColor Green
        Write-Host  "                           Total Files   Scanned = $totalFiles "     -ForegroundColor Green

        Write-Host ""
        Write-Host ""
        Write-Host "I have done my job. Press any key to exit" -ForegroundColor White
        $Host.UI.RawUI.FlushInputBuffer()   # Make sure buffered input doesn't "press a key" and skip the ReadKey().
        $Host.UI.RawUI.ReadKey("NoEcho,IncludeKeyUp") > $null
    }

##Output

[Image: console output of the script]

##Bat Code to run above powershell command

@ECHO OFF
SET ThisScriptsDirectory=%~dp0
SET PowerShellScriptPath=%ThisScriptsDirectory%MyPowerShellScript.ps1
PowerShell -NoProfile -ExecutionPolicy Bypass -Command "& {Start-Process PowerShell -ArgumentList '-NoProfile -ExecutionPolicy Bypass -File ""%PowerShellScriptPath%""' -Verb RunAs}";
haldo
0

A bit late, but try this one.

function Set-Files($Path) {
    if(Test-Path $Path -PathType Leaf) {
        # Do any logic on file
        Write-Host $Path
        return
    }

    if(Test-Path $path -PathType Container) {
        # Do any logic on folder use exclude on get-childitem
        # cycle again
        Get-ChildItem -Path $path | foreach { Set-Files -Path $_.FullName }
    }
}

# call
Set-Files -Path 'D:\myFolder'
Gravity API
0

Commenting here as this seems to be the most popular answer on the subject for searching for files whilst excluding certain directories in powershell.

To avoid issues with post-filtering of results (e.g. permission errors), I only needed to filter out top-level directories, and that is all this example does. It does not filter child directory names, but it could easily be made recursive to support that if you were so inclined.

Quick breakdown of how the snippet works

$folders << Uses Get-Childitem to query the file system and perform folder exclusion

$file << The pattern of the file I am looking for

foreach << Iterates the $folders variable performing a recursive search using the Get-Childitem command

$folders = Get-ChildItem -Path C:\ -Directory -Name -Exclude Folder1,"Folder 2"
$file = "*filenametosearchfor*.extension"

foreach ($folder in $folders) {
   Get-ChildItem -Path "C:\$folder" -Recurse -Filter $file | ForEach-Object { Write-Output $_.FullName }
}
timkly