I am trying to search a .eml file for all email addresses and wrap each one in <> brackets. Here's the code I have. It outputs what should happen, but it does not write to the file. Note that I need to keep the existing data in the file (title, body, etc.) and only replace the email addresses.

$rawtext = [IO.File]::ReadAllText("c:\scripts\emailex.eml")
$regex = [regex]"(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b" 
$regex.Matches($rawtext) | ForEach-Object{ $_ -replace $($_.Value), "<$_>" }
  • What exactly is your question/issue? Is it that it isn't writing to the file? You have nothing in your code that writes to a file. – Edward Eisenhart Jan 14 '15 at 20:24
  • I am aware it does not write to the file but I am looking for help in how TO write it to file. When trying to write to a file it would always write the whole line instead of just the email for me. – scrypto Jan 15 '15 at 02:44

1 Answer

I am not convinced that your email regex is fully robust, but the main problem is that you only read the file and never write back to it.

I recommend using Get-Content and Set-Content and piping everything together, although this could get slow and memory intensive for very large files.

Something like:

(Get-Content C:\test\test.txt) | 
Foreach-Object {$_ -replace "(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b",'<$0>'} | 
Set-Content C:\test\test.txt

Running the above PowerShell turns the text file C:\test\test.txt from:

Hi my email is bob@gmail.com
I like to email sally@gmail.com

into

Hi my email is <bob@gmail.com>
I like to email <sally@gmail.com>
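
If you would rather keep the whole-file approach from the question, something along these lines should also work. This is only a sketch: the temp file name and the sample text are made up for the demo, and it uses the static [regex]::Replace method on the full text instead of looping over the matches. Note that WriteAllText without an encoding argument writes UTF-8, which may or may not suit your .eml files.

```powershell
# Demo file and contents are hypothetical; point $path at your own .eml file.
$path = Join-Path ([IO.Path]::GetTempPath()) 'emailex-demo.eml'
[IO.File]::WriteAllText($path, "Subject: hello`r`nContact bob@gmail.com or sally@gmail.com")

# Same pattern as in the question.
$pattern = '(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b'

# Read everything, replace on the full text, then write the result back.
$rawtext = [IO.File]::ReadAllText($path)
# '$0' (single quotes) stands for the entire match.
$updated = [regex]::Replace($rawtext, $pattern, '<$0>')
[IO.File]::WriteAllText($path, $updated)

Get-Content $path
```

The rest of the file (subject line, body text) is left alone because only the matched addresses are substituted.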
  • Thanks, I will check this out as soon as I can, I was over thinking it and the files won't be very large anyhow. – scrypto Jan 15 '15 at 02:46
  • Thanks Edward, seems to be working as you mentioned. How hard would it be to check for existing emails that contain <> and exclude them from the replace? What would I have to add? IF, ELSE? – scrypto Jan 15 '15 at 03:04
  • You could do two passes. The first replaces ALL emails and the second changes << and >> to singles. If your regex engine supports lookaheads and lookbehinds you could use those as well, and I am pretty certain -replace does. See if (?i)(?<![\w<])[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}(?![\w>]) works for you. See https://regex101.com/r/kL1oR8/1 – Edward Eisenhart Jan 15 '15 at 03:31
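
A quick way to try the lookaround pattern from the comment above without touching a file. This is a minimal sketch: the two sample lines are invented, one with an address that is already bracketed and one without.

```powershell
# Lookbehind rejects a match that starts right after '<' or a word character;
# lookahead rejects one that ends right before '>' or a word character.
$pattern = '(?i)(?<![\w<])[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}(?![\w>])'

# Hypothetical sample input.
$lines = 'Hi my email is <bob@gmail.com>', 'I like to email sally@gmail.com'

# Only the unbracketed address should get wrapped.
$result = $lines | ForEach-Object { $_ -replace $pattern, '<$0>' }
$result
```

If that behaves as expected on your data, the same pattern can be dropped into the Get-Content / Set-Content pipeline in place of the original one.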