
I am confused about how PowerShell's -replace operator works with regex. I've looked for documentation online but can't find any that goes beyond basic use: it looks for a string and replaces it with either another string (if one is given) or nothing. Great.

I want to do the same thing as the person in this question where the user wants to extract a simple program name from a complex string. Here is the code that I am trying to replicate:

 $string = '% O0033(SUB RAD MSD 50R III) G91G1X-6.4Z-2.F500 G3I6.4Z-8.G3I6.4 G3R3.2X6.4F500 G91G0Z5. G91G1X-10.4 G3I10.4 G3R5.2X10.4 G90G0Z2. M99 %'
 $program = $string -replace '^%\sO\d{4}\((.+?)\).+$','$1'
 $program

 SUB RAD MSD 50R III

As you can see, the output is the string the user wants, and everything else is filtered out. The only difference for me is that I want a string made up of six digits and nothing else. However, when I try it on my own string with my regex, I get this:

$string2 = '1_123456_1'
$program2 = $string2 -replace '(\d{6})','$1'
$program2

1_123456_1

There is no change. Why is this happening? What should my code be instead? Furthermore, what is the $1 used for in the code?

gsamerica

2 Answers


The -replace operator only replaces the part of the string that matches. A capture group matches some subset of the match (or all of it), and the capture group can be referenced in the replace string as you've seen.

Your second example only ever matches the part you want to extract, so replacing it with $1 puts the same text back. Instead, match the whole string but capture only the part you want to keep, then reference that capture in the replacement string:

$string2 = '1_123456_1'
$program2 = $string2 -replace '\d_(\d{6})_\d','$1'
$program2

How you match "the rest of the string" is up to you; it depends on what could be contained in it. So what I did above is just one possible way. Other possible patterns:

1_(\d{6})_1
[^_]*_(\d{6})_[^_]*
^.*?(\d{6}).*?$
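
As a quick check, here is a minimal sketch that runs each of the patterns above against the sample string from the question. The `$patterns` and `$results` names are only for illustration; each pattern matches the whole of `'1_123456_1'`, so each replacement should leave just the captured six digits.

```powershell
$string2 = '1_123456_1'

# The four patterns discussed above; each one captures the six digits in group 1.
$patterns = @(
    '\d_(\d{6})_\d',
    '1_(\d{6})_1',
    '[^_]*_(\d{6})_[^_]*',
    '^.*?(\d{6}).*?$'
)

# Replace the whole match with the contents of group 1 for every pattern.
$results = foreach ($pattern in $patterns) {
    $string2 -replace $pattern, '$1'
}

$results   # four lines, each 123456
```

Which pattern to prefer depends on how strict you want to be about the text around the digits.
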
briantist

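Capturing groups (pairs of unescaped parentheses) in the pattern give easy access to parts of a match; $1 in the replacement string refers to the text captured by the first group. When you use -replace on a string, all non-overlapping matches of the pattern are found, and only those substrings are replaced or removed. Everything outside a match is left untouched.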

In your case, -replace '(\d{6})', '$1' means you replace the whole match (that is equal to the first capture, since you enclosed the whole pattern with a capturing group) with itself.
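
To make that visible, here is a small sketch using the string from the question. Wrapping `$1` in brackets in the replacement shows exactly which region -replace touched; the variable names `$unchanged` and `$marked` are only for illustration.

```powershell
$string2 = '1_123456_1'

# The match is '123456' and group 1 is also '123456',
# so the match is replaced with itself and nothing changes.
$unchanged = $string2 -replace '(\d{6})', '$1'

# Adding brackets around $1 shows that only the six digits were matched;
# the leading '1_' and trailing '_1' are outside the match and stay as they are.
$marked = $string2 -replace '(\d{6})', '[$1]'

$unchanged   # 1_123456_1
$marked      # 1_[123456]_1
```
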

Use -match in cases like yours when you want to get a part of the string:

PS> $string2 = '1_123456_1'
PS> $string2 -match '[0-9]{6}'
True
PS> $Matches[0]
123456

-match returns True or False and, on success, stores the first match in $Matches[0], which is just what you want here.
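
In a script, you would usually test the result of -match before reading `$Matches`, since `$Matches` is only updated on a successful match. A minimal sketch, where the second pattern with a capture group is my own variation and not something from the question:

```powershell
$string2 = '1_123456_1'

# Guard on the Boolean result, then read the whole match from $Matches[0].
if ($string2 -match '[0-9]{6}') {
    $program2 = $Matches[0]
} else {
    $program2 = $null
}

# With a capture group, the captured part is available as $Matches[1].
if ($string2 -match '_(\d{6})_') {
    $captured = $Matches[1]
}

$program2   # 123456
$captured   # 123456
```
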

Use -replace when you need to get a modified string back (reformatting a string, inserting/removing chars and suchlike).
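
For contrast, a few sketches of the kind of job -replace is suited to, again on the string from the question. The replacement formats here are made up purely for illustration.

```powershell
$string2 = '1_123456_1'

# Swap one character for another everywhere it occurs.
$dashed = $string2 -replace '_', '-'

# Omit the replacement string to remove every match.
$stripped = $string2 -replace '_'

# Use several capture groups to rearrange the pieces of the string.
$reordered = $string2 -replace '^(\d)_(\d{6})_(\d)$', '$2 (prefix $1, suffix $3)'

$dashed      # 1-123456-1
$stripped    # 11234561
$reordered   # 123456 (prefix 1, suffix 1)
```
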

Wiktor Stribiżew
    `-match` definitely seems like the better of the two to use, so I think going forward with my code I'll be using that instead of `-replace`. Thanks very much for your answer. – gsamerica Jun 13 '17 at 14:10