I am investigating a regexp mystery. I am tired so I may be missing something obvious - but I can't see any reason for this.
In the examples below I use Perl, but I first saw this in Vim, so I am guessing it affects more than one regexp engine.
Assume we have this file:
$ cat data
1 =2 3 =4
5 =6 7 =8
We can then delete the whitespace in front of the '=' with...
$ cat data | perl -ne 's,(.)\s+=(.),\1=\2,g; print;'
1=2 3=4
5=6 7=8
Notice that on every line, all instances of the match are replaced: we used the /g modifier, which doesn't stop at the first replacement and instead keeps replacing until the end of the line.
For example, both the space before the '=2' and the space before the '=4' were removed, on the same line.
Why not use a simpler construct like 's, =,=,g'? Well, we were preparing for more difficult scenarios... where the right-hand sides of the assignments are quoted strings, and can be either single or double-quoted:
$ cat data2
1 ="2" 3 ='4 ='
5 ='6' 7 ="8"
To do the same work (remove the whitespace before the equal sign), we have to be careful, since the strings may contain the equal sign - so we mark the first quote we see, and look for it via back-references:
$ cat data2 | perl -ne 's,(.)\s+=(.)([^\2]*)\2,\1=\2\3\2,g; print;'
1="2" 3='4 ='
5='6' 7="8"
The idea is to use the back-reference \2 to match anything that is not the quote we first saw, any number of times ([^\2]*), and then to match the original quote itself (\2). If the match succeeds, back-references in the replacement refer to the captured parts.
Now look at this:
$ cat data3
posAndWidth ="40:5 =" height ="1"
posAndWidth ="-1:8 ='" textAlignment ="Right"
What we want here is to drop the last space character before each instance of '=' on every line. Like before, we can't use a simple 's, =",=",g', because the strings themselves may contain the equal sign.
So we follow the same pattern as we did above, and use back-references:
$ cat data3 | perl -ne "s,(\w+)(\s*) =(['\"])([^\3]*)\3,\1\2=\3\4\3,g; print;"
posAndWidth="40:5 =" height ="1"
posAndWidth="-1:8 ='" textAlignment ="Right"
It works... but only on the first match of each line! The space following 'textAlignment' was not removed, and neither was the one following 'height' on the line above it.
Basically, it seems that /g no longer has any effect: running the same replace command without /g produces exactly the same output:
$ cat data3 | perl -ne "s,(\w+)(\s*) =(['\"])([^\3]*)\3,\1\2=\3\4\3,; print;"
posAndWidth="40:5 =" height ="1"
posAndWidth="-1:8 ='" textAlignment ="Right"
It appears that in this regexp, the /g is ignored. Any ideas why?