
I have recently started learning regex and I am not quite sure how the following regex works:

str.replaceAll("(\\w)(\\w*)", "$2$1ay");

This allows us to do the following:

input string: "Hello World !"
return string: "elloHay orldWay !"

From what I know, \w is supposed to match all word characters, including 0-9 and underscore, and $ matches the end of the string.

Hanming497
  • Apart from reading the proposed duplicate, try inserting the text and regex into a site like https://regex101.com to see an explanation of what it is doing. – Nick Jun 03 '20 at 03:07
  • The return string is not because of the regex but because of the `replaceAll` method. It matches on every word that matches that regex and changes it to the second parameter. – leoOrion Jun 03 '20 at 03:26
  • See the explanation at https://regex101.com/r/clWQp1/1 – The fourth bird Jun 03 '20 at 10:45

2 Answers


In the replaceAll method, the first parameter is a regex. The method finds every substring that matches the regex and replaces it with the second parameter.

In simple cases replaceAll works like this:

String str = "I,am,a,person";
str.replaceAll(",", " "); // returns "I am a person"

It matched all the commas and replaced them with a space.
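A minimal sketch of the same simple case as a runnable program. The class and method names (SimpleReplaceDemo, commasToSpaces) are made up for illustration. It also shows that replaceAll returns a new string and leaves the original untouched, since Java strings are immutable:

```java
class SimpleReplaceDemo {
    // Hypothetical helper: replaces every comma with a space and returns the new string.
    static String commasToSpaces(String input) {
        return input.replaceAll(",", " ");
    }

    public static void main(String[] args) {
        String str = "I,am,a,person";
        String result = commasToSpaces(str);
        System.out.println(result); // I am a person
        System.out.println(str);    // I,am,a,person (the original is unchanged)
    }
}
```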

In your case, each match is a single word character (\w), meaning a letter, digit, or underscore, followed by zero or more further word characters (\w*).

The parentheses around \w and \w* create groups. So you have two groups: the first character and the remaining part. If you use regex101 or a similar website, you can see a visualization of this.

Your replacement is $2 (the second group, the remaining part), followed by $1 (the first group, the first character), followed by the literal text ay.
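If you would rather see the groups in code than on a website, here is a rough sketch using java.util.regex directly. The class and method names (GroupInspector, describeMatches) are invented for this example. It lists each match of the pattern together with what group 1 and group 2 captured:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupInspector {
    // Hypothetical helper: returns one line per match, listing the full match and both groups.
    static List<String> describeMatches(String input) {
        Pattern pattern = Pattern.compile("(\\w)(\\w*)");
        Matcher matcher = pattern.matcher(input);
        List<String> lines = new ArrayList<>();
        while (matcher.find()) {
            lines.add("match=" + matcher.group(0)
                    + " group1=" + matcher.group(1)
                    + " group2=" + matcher.group(2));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeMatches("Hello World !")) {
            System.out.println(line);
        }
        // match=Hello group1=H group2=ello
        // match=World group1=W group2=orld
    }
}
```

The spaces and the "!" never show up as matches because they are not word characters, which is why replaceAll leaves them alone.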

Hope this clears it up for you.

leoOrion

Enclosing a regular expression in parentheses () makes it a capturing group.

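Here you have 2 capturing groups: (\w) captures a single word character, and (\w*) captures zero or more word characters. In the replacement string, $1 and $2 refer to the first and second captured groups respectively; there, $ is not the end-of-string anchor that it is inside a pattern.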

Also, replaceAll handles each match separately. So in this example, for 'Hello', 'H' is the first captured group and 'ello' is the second. The match is replaced by a reordered version, $2$1, which swaps the captured groups, followed by the literal 'ay'. So '$2$1ay' turns 'Hello' into 'elloHay'.

The same happens for the next word, 'World'. The '!' is not a word character, so it is never matched and stays as it is.
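To make the two roles of $ concrete, here is a small sketch. The class and method names (DollarDemo, moveFirstLetter, replaceTrailingBang) and the second input string are made up for illustration. The first method uses $1 and $2 in the replacement as group references; the second uses $ inside the pattern as the end-of-input anchor:

```java
class DollarDemo {
    // "$2$1ay" in the replacement: group 2, then group 1, then the literal "ay".
    static String moveFirstLetter(String input) {
        return input.replaceAll("(\\w)(\\w*)", "$2$1ay");
    }

    // "$" in the pattern: an anchor, so only a "!" at the very end of the input matches.
    static String replaceTrailingBang(String input) {
        return input.replaceAll("!$", "?");
    }

    public static void main(String[] args) {
        System.out.println(moveFirstLetter("Hello World !"));      // elloHay orldWay !
        System.out.println(replaceTrailingBang("Hello! World !")); // Hello! World ?
    }
}
```

If you ever need a literal dollar sign in the replacement text, it has to be escaped (for example with Matcher.quoteReplacement), precisely because $ is reserved for group references there.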

Abin K Paul