How do I find all files in a folder whose names contain words from a list?

Question

I have a very large set of files whose names each contain a number, and a separate list of numbers. Using PowerShell (or any other Windows tool), I need to find the files whose names contain any of the numbers from that list.

I know how to search for one number at a time using:

Get-ChildItem | Where-Object {$_.Name -like "*123*"}

But I don't know how to search for the whole list without chaining conditions with the -or operator.

In a comment below you state, "my search list has hundreds of numbers, so even manually it would be a pain." - please add this requirement - not wanting to enumerate the numbers individually - to your question more explicitly. — mklement0, May 15 '20 at 17:36

js2010 · Answer 1 · 2020-05-15T18:25:53.443

3

get-childitem *123*,*456*,*789*

Patterns from a file:

get-childitem -name | select-string (get-content patterns.txt)
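Select-String interprets each pattern as a regular expression by default; when the entries should be matched literally, -SimpleMatch can be added. A small sketch, using in-memory stand-ins for the file names and for the contents of patterns.txt (the sample values are assumptions, not from the question):

```powershell
# Stand-ins for the output of Get-ChildItem -Name and the contents of patterns.txt.
$names    = 'invoice_123.pdf', 'invoice_999.pdf', 'scan_456.png'
$patterns = '123', '456'

# -SimpleMatch treats every pattern as a literal string rather than a regex.
$hits = $names | Select-String -SimpleMatch -Pattern $patterns

# Each hit is a MatchInfo object; its Line property holds the matching name.
$hits | ForEach-Object { $_.Line }
```

Because the results are MatchInfo objects rather than file objects, the Line property is what you would pass on to Get-Item or similar.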

edited May 15 '20 at 18:25

answered May 15 '20 at 13:54

js2010

mklement0 · Answer 2 · 2020-05-15T15:49:13.463

An efficient approach is to use -match, the regular-expression matching operator, with alternation (|) to search for any of multiple patterns in a single operation:

$numbers = 42, 43, 44 # ...
Get-ChildItem | Where-Object Name -match ($numbers -join '|')
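If the list entries might contain regex metacharacters, each one can be escaped before joining. A minimal sketch, assuming a scratch folder and sample file names that are purely illustrative:

```powershell
# Scratch folder with sample files; all names here are illustrative.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
'report_42.txt', 'report_57.txt', 'notes_44.log' |
    ForEach-Object { New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null }

# The list could equally come from a file, e.g. Get-Content .\numbers.txt (hypothetical).
$numbers = '42', '44'

# Escape each entry so it is matched literally, then join with the alternation operator.
$pattern = ($numbers | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Match against the file name only, in a single pass over the folder.
$found = Get-ChildItem -Path $dir -File |
    Where-Object Name -match $pattern |
    Select-Object -ExpandProperty Name

$found

# Clean up the scratch folder.
Remove-Item -Path $dir -Recurse -Force
```

One caveat: if the list is empty, the joined pattern is an empty string, which matches every name, so it may be worth guarding against that case.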

Alternatively, js2010's helpful answer shows that you can directly use Get-ChildItem's (implied) -Path parameter (whose type is [string[]], i.e., an array of paths), with an array of wildcard expressions:

$numbers = 42, 43, 44 # ...
Get-ChildItem ($numbers -replace '^|$', '*')

The above uses the -replace operator to enclose each number in *...*; that is, the above is the equivalent of:

Get-ChildItem *42*, *43*, *44*

score 0 · Answer 3 · answered May 15 '20 at 13:44

Try this:

$files = ( Get-ChildItem 'path' )

$numbers = 1 .. 100 # or your list contents

foreach( $n in $numbers ) {
    foreach( $f in $files.BaseName ) {
        if( $f -like "*$n*" ) {
            "Found $f"
        }
    }
}

answered May 15 '20 at 13:44

Vish

score 0 · Answer 4 · answered May 15 '20 at 20:55

As the helpful answers from js2010 and mklement0 mention, we can exploit the string array accepted by the Get-ChildItem -Path parameter to do our filtering. These are quick, elegant solutions that work well for limited sets of strings.

The complication comes from @JBourne's comment that he has hundreds of numbers to match. When hundreds of numbers must be checked against hundreds of file names, the work for these methods grows with the product of the two counts. @Vish's easy-to-understand answer demonstrates this: with, say, 100 numbers and 1,000 files, you perform 100 x 1,000 = 100,000 evaluations. I assume that the internal code for Get-ChildItem does something similar when handling a string[] array as input.

If we are interested in pure performance, arrays are a poor fit. Arrays are efficient for storing items and for accessing indexed locations, but a membership test requires a linear scan. What we could use instead is a slightly more complicated method based on regex and hashtables. Although hashtables are a key/value structure, and in this case we don't need a "value", they are highly efficient for looking up large numbers of keys, typically in O(1) time per lookup. With n numbers and f files, our example goes from an O(n*f) problem to an O(f) problem: we perform only 1 x 1,000 = 1,000 evaluations.

To start with, we need our list of keys:

$FileWithListOfNumbers = @"
123 = Matched file with 123
456 = Matched file with 456
789 = Matched file with 789
"@

$KeyHashtable = ConvertFrom-StringData $FileWithListOfNumbers

This will load our hashtable with a list of keys. Next, we iterate through our files and use Regex for matching our filenames:

Get-ChildItem | % {
    if($_.Name -match '\D*(\d+)\D*')
    {
        #Filename contains a number, perform a key lookup to see if it matches
        if($KeyHashtable.ContainsKey($Matches[1]))
        {
            Write-Host $_.Name
        }
    }
}

By using regex for matching (rather than a file system provider to filter) we can use a capture group to "pull" out the number. You may have to adjust the regex based on your specific needs and file naming convention, but the one used here breaks down as follows:

-match '\D*(\d+)\D*'

\D*    - Match 0 or more non-digits
 (     - Start of capture group
  \d+  - Match 1 or more digits
 )     - End of capture group
\D*    - Match 0 or more non-digits

The number we "pull" is stored in the automatic $Matches variable, a hashtable in which $Matches[0] holds the whole match and $Matches[1] holds the first capture group. We then perform a key lookup with that number to see if it matches anything we are looking for.
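Note that the pattern above captures only the first run of digits in each name, and that the hashtable stores values that are never used. One possible variation, offered as a sketch rather than as this answer's method, keeps the numbers in a HashSet[string] and tests every digit run returned by [regex]::Matches. The sample names and numbers are made up for illustration:

```powershell
# Wanted numbers as strings; in practice these might come from Get-Content (assumption).
$wanted = New-Object 'System.Collections.Generic.HashSet[string]'
'123', '456', '789' | ForEach-Object { [void]$wanted.Add($_) }

# Stand-ins for (Get-ChildItem).Name.
$names = 'a_123.txt', 'b_2020_456.txt', 'c_1234.txt', 'readme.md'

$matched = foreach ($name in $names) {
    # Check every run of digits in the name, not only the first one.
    foreach ($m in [regex]::Matches($name, '\d+')) {
        if ($wanted.Contains($m.Value)) {
            $name
            break   # one hit per file is enough; leaves the inner loop only
        }
    }
}

$matched
```

As with the hashtable version, a whole digit run must equal a listed number, so c_1234.txt is not reported for 123. That differs from the wildcard and -match approaches, which treat the numbers as substrings.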
