
EDIT: I think I now know what the issue is: the copy numbers are not REALLY part of the filename. So when a name is pulled from the array and used to read the file for matching, the file as named in the array does not exist; only the file name with no copy number does.

I tried writing a rename script but the same issue exists... only the few files I manually renamed (so they don't contain copy numbers) were renamed (successfully) by the script. All others are shown not to exist.

How can I get around this? I really do not want to manually work with 23000+ files. I am drawing a blank..

HELP PLEASE


I am trying to narrow down a folder full of emails (copies) that share the same name, "SCADA Alert.eml", "SCADA Alert[1].eml", ... up to [23110], based on their contents, and delete the emails that meet specific content criteria.

When I run it, I keep getting the error in the subject line above. It only sees the first file; for the rest, it says they do not exist...

The script reads through the folder and creates an array of names (it does this correctly). Then, for each $filename in the array, it creates a variable, $email, and assigns it the content of that file (this is where it breaks).

Then it should match the specific string I am looking for against the content of the $email var and return true or false. If true, I want it to remove the email, $filename, from the folder.

This would narrow down the emails I have to review. Any help here would be greatly appreciated.

This is what I have so far... (Folder is in the root of C:)

$array = Get-ChildItem -name -Path $FolderToRead   #| Get-Content | Tee C:\Users\baudet\desktop\TargetFile.txt

Foreach ($FileName in $array){

    $FileName          # Check File

    $email = Get-Content $FolderToRead\$FileName  
    $email             # Check Content

    $ContainsString = "False"                      # Set Var 
    $ContainsString                                # Verify Var
    $ContainsString = %{$email -match "SYS$,ROC"}  # Look for String
    $ContainsString                                # Verify result of match

        #if ($ContainsString -eq "True") {
            #Remove-Item $FolderToRead\$element
            #}
}
Brandon A

2 Answers


Here's a PowerShell-idiomatic solution that also resolves your original problems:

Get-ChildItem -File -LiteralPath $FolderToRead | Where-Object {
  (Get-Content -Raw -LiteralPath $_.FullName) -match 'SYS\$,ROC'
} | Remove-Item -WhatIf

Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.

Note how the $ character in the RHS regex of the -match operator is \-escaped in order to use it verbatim (rather than as metacharacter $, the end-of-input anchor).

Also, given that $ is also used in PowerShell's string interpolation, it's better to use '...' strings (single-quoted, verbatim strings) to represent regexes, assuming no actual up-front string expansion is needed before the regex engine sees the resulting string - see this answer for more information.
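
As a minimal sketch of the escaping point above: the sample text below is made up for illustration, and only the search string comes from the question. It compares the unescaped and escaped patterns, and shows [regex]::Escape() as one way to build the escaped form from a literal string.

```powershell
# Made-up sample text; only 'SYS$,ROC' comes from the question.
$sample = 'Alert raised: SYS$,ROC offline'

# Unescaped: $ is the end-of-input anchor, so ",ROC" can never follow it.
$unescaped = $sample -match 'SYS$,ROC'      # False

# Escaped: \$ matches a literal dollar sign.
$escaped = $sample -match 'SYS\$,ROC'       # True

# [regex]::Escape() derives the escaped pattern from the literal string.
$pattern = [regex]::Escape('SYS$,ROC')      # SYS\$,ROC
$viaEscape = $sample -match $pattern        # True
```

Letting [regex]::Escape() do the escaping is handy when the search string comes from user input or a variable rather than being typed into the script.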


As for what you tried:

  • The error message stemmed from the fact that Get-Content $FolderToRead\$FileName binds the file-name argument, $FolderToRead\$FileName, implicitly (positionally) to Get-Content's -Path parameter, which expects PowerShell wildcard patterns.

    • Since your file names literally contain [ and ] characters, they are misinterpreted by the (implied) -Path parameter, which can be avoided by using the -LiteralPath parameter instead (which must be specified explicitly, as a named argument).
  • %{$email -match "SYS$,ROC"} is unnecessarily wrapped in a ForEach-Object call (% is a built-in alias); while that doesn't do any harm in this case, it adds unnecessary overhead;
    $email -match "SYS$,ROC" is enough, though it needs to be corrected to
    $email -match 'SYS\$,ROC', as explained above.
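
The -Path versus -LiteralPath difference described above can be reproduced with a small sketch. The scratch folder and file below are hypothetical and created only for the demonstration; they are not the asker's real folder.

```powershell
# Hypothetical scratch folder and file, created only for this demonstration.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$null = New-Item -ItemType Directory -Path $dir
$file = Join-Path $dir 'SCADA Alert[1].eml'
[System.IO.File]::WriteAllText($file, 'SYS$,ROC')

# -Path treats [1] as a wildcard set matching the single character "1",
# so it looks for "SCADA Alert1.eml", which does not exist.
$viaPath = Test-Path -Path $file              # False

# -LiteralPath takes the name verbatim and finds the file.
$viaLiteral = Test-Path -LiteralPath $file    # True
$content = Get-Content -Raw -LiteralPath $file

# Clean up the scratch folder.
Remove-Item -LiteralPath $dir -Recurse -Force
```

This also explains why only the file without a copy number was found: "SCADA Alert.eml" contains no wildcard characters, so -Path and -LiteralPath resolve it identically.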

mklement0
  • Isn't the `Get-ChildItem | Remove-Item` pattern unreliable because you are pulling files away from underneath the directory iterator? – zett42 Feb 16 '21 at 17:58
  • @zett42, as far as I know, this would only be a problem if you piped to `Rename-Item` rather than `Remove-Item`, because the latter shouldn't interfere with the enumeration order (I don't know the plumbing, though), and even with `Rename-Item` - due to a PS Core implementation detail - it is strictly speaking only necessary in Windows PowerShell. However, to be safe you can force up-front collection (at the expense of increased memory use) by enclosing the `Get-ChildItem` call in `(...)` - see [this answer](https://stackoverflow.com/a/60302715/45375). – mklement0 Feb 16 '21 at 18:16
  • I remember we had a recent question where piping `Get-ChildItem` into `Remove-Item` caused a problem. Can't find it right now, but will continue searching. Maybe I remember wrong and it was actually `Rename-Item` that was piped into. Unreliable human brain that is. ;-) – zett42 Feb 16 '21 at 18:26
  • Thanks, @zett42, definitely let me know if you find a problem. Informal tests on Windows PowerShell tell me that the `Remove-Item` case is fine, but that's not proof. Unless the `(...)` is truly necessary, however, I'd like to keep it out of the answer - and of course it would be good to know for the future. – mklement0 Feb 16 '21 at 18:37
0
[System.IO.Directory]::EnumerateFiles($Folder) |
    Where-Object {$true -eq [System.IO.File]::ReadAllText($_, [System.Text.Encoding]::UTF8).Contains('SYS$,ROC') } |
    ForEach-Object {
        Write-Host "Removing $($_)"
        #[System.IO.File]::Delete($_)
    }

Your mistakes:

  1. %{$email -match "SYS$,ROC"} - What is % intended to do here? It is an alias for ForEach-Object.
  2. %{$email -match "SYS$,ROC"} - Why use -match? For a plain substring search, a regex match is much slower than -like or String.Contains().
  3. %{$email -match "SYS$,ROC"} - When using $ inside double quotes, you should escape it with a single backtick (I have `$100). Otherwise, what follows $ is treated as a variable name or subexpression: Hello, $username; It's $($weather.ToString()) today!
  4. Write debug output the right way: use Write-Debug, Write-Verbose, Write-Host, Write-Warning, Write-Error, or Write-Information.

Can be better:

  1. Avoid Get-ChildItem, because it returns file objects with attributes (mtime, atime, ctime, etc.). Fetching this extra information costs an additional request per file. When you only need a list of files, use the native .NET EnumerateFiles method of System.IO.Directory. This is a significant performance boost with huge numbers of files.
  2. Use ReadAllText, ReadAllLines, or ReadAllBytes from the System.IO.File static class to be more specific, instead of the general-purpose Get-Content.
  3. Use pipelines ;-)
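
To illustrate the literal-search options mentioned in the lists above, here is a small sketch on a made-up sample string (not taken from the asker's emails). One difference worth knowing when choosing between them: String.Contains() is case-sensitive, while -like and -match ignore case by default.

```powershell
# Made-up sample text; only 'SYS$,ROC' comes from the question.
$text = 'Alert raised: SYS$,ROC offline'

# Three ways to test for the literal substring.
$byContains = $text.Contains('SYS$,ROC')                  # True
$byLike     = $text -like '*SYS$,ROC*'                    # True
$byMatch    = $text -match [regex]::Escape('SYS$,ROC')    # True

# Case sensitivity differs between the approaches.
$containsLower = $text.Contains('sys$,roc')               # False
$likeLower     = $text -like '*sys$,roc*'                 # True
```

With -like, the $ needs no escaping because it is not a wildcard character; only *, ?, [ and ] are special there.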
filimonic