7

I want to use Select-String to extract part of a file path, starting at a string value that is contained in a variable. Let me explain this with an abstracted example.

Let's assume this path: /docs/reports/test reports/document1.docx

Using a regular expression I can get the required string like so: '^.*(?=\/test\s)'

https://regex101.com/r/6mBhLX/5

The pattern matches '/docs/reports'; with that match trimmed off, the resulting string is '/test reports/document1.docx'.

Now, for this to work I have to use the literal string 'test'. However, I would like to know how to use a variable that contains 'test', e.g. $myString.

I already looked at "How do you use a variable in a regular expression?", but I couldn't figure out how to adapt this for PowerShell.

Peter Mortensen
colonel_claypoo

3 Answers

11

I suggest using $([regex]::escape($myString)) inside a double-quoted string literal:

$myString="[test]"
$pattern = "^.*(?=/$([regex]::escape($myString))\s)"

Or, if you do not want to worry about additional escaping inside the expandable string, use regular concatenation with the + operator:

$pattern = '^.*(?=/' + [regex]::escape($myString) +'\s)'

The resulting $pattern will look like ^.*(?=/\[test]\s). Since the $myString variable is a literal string, you need to escape all special regex metacharacters (with [regex]::escape()) that may be inside it for the regex engine to interpret it as literal chars.
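To illustrate why the escaping matters, here is a minimal sketch that compares the escaped and unescaped value. The two sample paths are assumptions made up for this demonstration; they do not come from the question.

```powershell
# Illustrative values; the sample paths are assumptions, not from the question.
$myString = '[test]'
$escaped  = [regex]::Escape($myString)   # \[test]

$literalPath = '/docs/[test] reports/a.docx'
$otherPath   = '/docs/t reports/a.docx'

# Escaped: the brackets are matched literally.
$escapedOnLiteral = $literalPath -match ('/' + $escaped + '\s')    # True
$escapedOnOther   = $otherPath   -match ('/' + $escaped + '\s')    # False

# Unescaped: [test] is a character class matching a single t, e, or s.
$rawOnLiteral = $literalPath -match ('/' + $myString + '\s')       # False
$rawOnOther   = $otherPath   -match ('/' + $myString + '\s')       # True
```

Without the escape, the pattern silently matches the wrong path and misses the intended one, which is harder to spot than an outright error.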

In your case, you may use

$s = '/docs/reports/test reports/document1.docx'
$myString="test"
$pattern = "^.*(?=/$([regex]::escape($myString))\s)"
$s -replace $pattern

Result: /test reports/document1.docx
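If the search term can be followed by something other than whitespace (for example 'test reports', which is followed by / in this path), the word-boundary variant mentioned in the comments below may be a better fit. The following is only a sketch: the helper name Get-PathFrom and its parameters are hypothetical, and \b assumes that the term ends in a word character.

```powershell
# Hypothetical helper; the name and parameters are illustrative assumptions.
function Get-PathFrom {
    param(
        [string] $Path,
        [string] $Term
    )
    # \b instead of \s: only requires that the term ends at a word boundary.
    $pattern = '^.*(?=/' + [regex]::Escape($Term) + '\b)'
    # -replace without a replacement operand removes the matched prefix.
    $Path -replace $pattern
}

$s = '/docs/reports/test reports/document1.docx'
$fromTest    = Get-PathFrom -Path $s -Term 'test'
$fromReports = Get-PathFrom -Path $s -Term 'test reports'
# Both calls yield: /test reports/document1.docx
```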

Wiktor Stribiżew
  • Thanks. I tried your code in PowerShell but unfortunately, it didn't return anything (no error though). I used the exact code you provided and then: "/docs/reports/test reports/document1.docx" | Select-String -Pattern $pattern – colonel_claypoo May 07 '18 at 13:37
  • @colonel_claypoo What did you expect to get? – Wiktor Stribiżew May 07 '18 at 13:46
  • My goal was to achieve the same as in this example: [link](https://regex101.com/r/6mBhLX/5). The matching string "/docs/reports" is trimmed from the original string "/docs/reports/test reports/document1.docx" if I use Select-String in PowerShell. Sorry if I had been a little unclear in my initial post. – colonel_claypoo May 07 '18 at 14:07
  • @colonel_claypoo If you want to get `/test reports/document1.docx`, use `$myString="test"` and then `$s -replace $pattern` – Wiktor Stribiżew May 07 '18 at 14:10
  • I appreciate it. It basically works but one thing I noticed is that once I change the search term from 'test' to 'test reports' the expression doesn't work anymore. I added *. after \s to allow for any following character in the pattern. – colonel_claypoo May 08 '18 at 06:58
  • @colonel_claypoo [`^.*(?=\/test reports\s)`](https://regex101.com/r/Zi7mkx/1/) cannot match anything in `/docs/reports/test reports/document1.docx` because `\s` requires a whitespace after `reports`, but there is a `/`. You may [try with a word boundary, `\b`](https://regex101.com/r/Zi7mkx/2). Note that using an optional pattern (pattern that may match an empty string, like `\s*`) at the end of the lookahead is meaningless. – Wiktor Stribiżew May 08 '18 at 07:06
  • Thanks for your suggestion, that absolutely helped. I looked it up as well and it makes more sense to me now, so I can agree that \s* at the end is unnecessary. – colonel_claypoo May 08 '18 at 10:08
6

Wiktor Stribiżew's helpful answer provides the crucial pointer:

Use [regex]::Escape() in order to escape a string for safe inclusion in a regex (regular expression) so that it is treated as a literal;
e.g., [regex]::Escape('$10?') yields \$10\? - the characters with special meaning to a regex were \-escaped.

However, I suggest using '...', i.e., building the regex from single-quoted aka verbatim strings:

$myString='test'
$regex = '^.*(?=/' + [regex]::escape($myString) + '\s)'

Using the -f operator, as in $regex = '^.*(?=/{0}\s)' -f [regex]::Escape($myString), works too and is perhaps visually cleaner, but note that -f, unlike string concatenation with +, is culture-sensitive, which can lead to different results (for instance, when formatting numbers or dates).
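A short sketch of the -f variant follows. The second pattern ($regex2) is an illustrative assumption, added only to show that literal braces, such as a regex quantifier, have to be doubled inside a -f format string.

```powershell
$myString = 'test'

# {0} is replaced by the escaped value.
$regex = '^.*(?=/{0}\s)' -f [regex]::Escape($myString)
# $regex is now: ^.*(?=/test\s)

# Literal braces must be doubled in a -f format string:
# {{3}} becomes the quantifier {3}, {0} becomes the escaped extension.
$regex2 = '^\d{{3}}{0}$' -f [regex]::Escape('.docx')
# $regex2 is now: ^\d{3}\.docx$

$matchesName = '123.docx' -match $regex2   # True
```

With plain + concatenation no brace doubling is needed, which is one more reason to prefer it when the pattern contains quantifiers.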

Using '...' strings in regex contexts in PowerShell is a good habit to form:

  • By avoiding "...", so-called expandable strings, you avoid additional up-front interpretation (interpolation, a.k.a. expansion) of the string, which can have unexpected effects, given that $ has special meaning in both contexts: the start of a variable reference or subexpression when string-expanding, and the end-of-input marker in regexes.

  • Using "..." can be especially tricky in the substitution operand of the regex-based -replace operator, where tokens such as $1 refer to capture-group results, and if you used "$1", PowerShell would try to expand a $1 PowerShell variable, which presumably doesn't exist, resulting in the empty string.

    • For a concise, but comprehensive overview of PowerShell's -replace operator, see this answer.
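The difference between '$1' and "$1" in the substitution operand can be sketched as follows. The capture-group pattern is an assumption built for this demonstration, and the second result presumes that no PowerShell variable named 1 is defined and that strict mode is off.

```powershell
$s = '/docs/reports/test reports/document1.docx'

# Capture everything from '/test ' to the end of the path in group 1.
$regex = '^.*(/' + [regex]::Escape('test') + '\s.*)$'

# Single quotes: the regex engine receives $1 and inserts capture group 1.
$verbatim = $s -replace $regex, '$1'
# $verbatim is: /test reports/document1.docx

# Double quotes: PowerShell expands $1 as a (nonexistent) variable first,
# so the substitution operand is already empty when -replace sees it.
$expanded = $s -replace $regex, "$1"
# $expanded is the empty string
```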
mklement0
  • Based on the directions contained herein, I devised my regex as follows for my PowerShell script. $regex = '^'+[regex]::escape($pageno)+'\s+|^1$' Here $pageno is a variable that goes on incrementing from 1. I also changed to using single quotes rather than double quotes. – Unnikrishnan Jan 16 '23 at 03:09
0


Just write the variable within double quotes ("pattern"), like this:

PS > $pattern = "^\d+\w+"
PS > "357test*&(fdnsajkfj" -match $pattern          # returns True
PS > "357test*&(fdnsajkfj" -match "$pattern.*\w+$"  # returns True
PS > "357test*&(fdnsajkfj" -match "$pattern\w+$"    # returns False

Please have a try. :)

Chenry Lee
  • This won't help the OP, because they want to include an unknown variable value in a regex pattern while ensuring that the value is treated as a _literal_. In other words: a mechanism is needed that _escapes_ any incidental regex metacharacters in the value. – mklement0 May 07 '18 at 14:30