
I have some PowerShell that works with mail from Outlook folders. Most emails have a footer that starts with the text "------". I want to discard that string and everything after it.

I have added an expression to Select-Object as follows:

$cleanser = {($_.Body).Substring(0, ($_.Body).IndexOf("------"))}
$someObj | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser}

This works when IndexOf() finds a match, but when there is no match my Select-Object outputs null.

How can I update my expression to return the original string when IndexOf() finds no match?

Adam

2 Answers


I agree with @mklement0 and @PetSerAl: regular expressions give the best answer. Yay! Regular expressions to the rescue!

Edit: Fixing my original post.

Going with @Adam's idea of using a script block in the expression, you simply need to add logic to the script block that checks the index before using it:

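$cleanser = {
    # IndexOf() returns -1 (not null) when the footer marker is absent.
    $index = $_.Body.IndexOf('------')
    if ($index -eq -1) {
        # No footer found: keep the whole body.
        $index = $_.Body.Length
    }
    $_.Body.Substring(0, $index)
}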

$someObj | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser}
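
To see the index check at work, here is a minimal sketch that runs the same logic against two made-up objects. The sample `$someObj` array and its `Body` values are assumptions for illustration only, standing in for real Outlook mail items:

```powershell
# Hypothetical sample data standing in for Outlook mail items.
$someObj = @(
    [pscustomobject]@{ Body = "Hello`n------`nLegal footer" }
    [pscustomobject]@{ Body = 'No footer here' }
)

$cleanser = {
    $index = $_.Body.IndexOf('------')
    # Fall back to the full length when the marker is missing.
    if ($index -eq -1) { $index = $_.Body.Length }
    $_.Body.Substring(0, $index)
}

$result = $someObj | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser }

# Brackets make trailing whitespace visible in the output.
$result | ForEach-Object { '[{0}]' -f $_.Body }
```

The first object keeps only the text before the marker (including the newline that precedes it), while the second keeps its body unchanged instead of becoming null.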
HAL9256
  • @MathiasR.Jessen, the problem wasn't that `$_.Body` isn't a string - the problem, which HAL9256 has since fixed, was that `$_.Body.IndexOf("------")` was called only _once_, outside the `$cleanser` script block that is used as part of a calculated property. – mklement0 Jul 13 '19 at 23:29
  • Thanks for this. I prefer the less verbose answer and I suspect (though have not tested) that it's probably more performant too. Thanks anyway :-) – Adam Jul 14 '19 at 14:23

PetSerAl, as countless times before, has provided the crucial pointer in a comment on the question:

Use PowerShell's -replace operator, which implements regex-based string replacement that returns the input string as-is if the regex doesn't match:

# The script block to use in a calculated property with Select-Object later.
$cleanser = { $_.Body -replace '(?s)------.*' }

If you want to ensure that ------ only matches at the start of a line, use (?sm)^------.*; if you also want to remove the preceding newline, use (?s)\r?\n------.*
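
The difference between these three patterns is easiest to see side by side. The following is only a sketch on a made-up `$body` string (an assumption, not data from the question) that contains one ------ in the middle of a line and one at the start of a line:

```powershell
# Hypothetical body: one ------ mid-line, one at the start of a line.
$body = "Intro -- see ------ inline`r`nMore text`r`n------`r`nFooter"

# Unanchored: cuts at the first ------ anywhere, even mid-line.
$anywhere = $body -replace '(?s)------.*'

# Anchored with (?m) and ^: only a ------ at the start of a line counts.
$lineStart = $body -replace '(?sm)^------.*'

# Also consumes the newline (LF or CRLF) that precedes the footer line.
$withNewline = $body -replace '(?s)\r?\n------.*'

# Brackets make the trailing characters of each result visible.
'[{0}]' -f $anywhere
'[{0}]' -f $lineStart
'[{0}]' -f $withNewline
```

With this input, the unanchored pattern truncates the first line, the anchored one keeps everything up to and including the line break before the footer, and the last one drops that line break as well.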

  • (?s) is an inline regex option that makes . match newlines too, so that .* effectively matches all remaining text, across lines.

  • By not specifying a replacement operand, '' (the empty string) is implied, which effectively removes the matching part from the input string (technically, a copy of the original string with the matching part removed is returned).

  • If regex '(?s)------.*' does not match, $_.Body is returned as-is (technically, it is the input string itself that is returned, not a copy).

The net effect is that anything starting with ------ is removed, if present.
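
As a rough end-to-end sketch, the script block can be plugged into Select-Object like this. The sample objects and the `Subject` property are hypothetical, added only to show that other properties can be passed through alongside the calculated one:

```powershell
# Hypothetical sample data; real input would be Outlook mail items.
$someObj = @(
    [pscustomobject]@{ Subject = 'With footer';    Body = "Hello`r`n------`r`nLegal footer" }
    [pscustomobject]@{ Subject = 'Without footer'; Body = 'No footer here' }
)

# Removes ------ and everything after it; a non-matching body passes through.
$cleanser = { $_.Body -replace '(?s)------.*' }

$result = $someObj |
    Select-Object -Property Subject, @{ Name = 'Body'; Expression = $cleanser }

$result | Format-List
```

No explicit "not found" branch is needed here, because -replace simply leaves a non-matching string alone.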

mklement0