Vim Regex Capture Groups [bau -> byau : ceu -> cyeu]

Question

I have a list of words:

bau
ceu
diu
fou
gau

I want to turn that list into:

byau
cyeu
dyiu
fyou
gyau

I unsuccessfully tried the command:

:%s/(\w)(\w\w)/\1y\2/g

Given that this doesn't work, what do I have to change to make the regex capture groups work in Vim?

possible duplicate of [Matching an expression including arbitrary lines with regex in Vim](http://stackoverflow.com/questions/17471929/matching-an-expression-including-arbitrary-lines-with-regex-in-vim) and http://stackoverflow.com/questions/18627893/vim-match-errors-out-with-regular-expression-ffelrf — Ingo Karkat, Nov 11 '13 at 08:50
It's a little bit off-topic so I put it here as a comment but… I'd do `:%norm ay`. — romainl, Nov 11 '13 at 09:12
In your case (if it's exactly as described), another option is to move to the 2nd column with `l`, enter Visual Block mode with `Ctrl+v`, mark the whole column with `Shift+g` followed by `l`, then enter Insert mode with `Shift+i` and input 'y'. That is 7 keystrokes including the finishing `Esc` to exit Insert mode. Not posting as an answer because it's not really about capture groups (which is what I searched for when I found this). :-) – LAFK 4Monica_banAI_modStrike, Aug 21 '16 at 11:23

johnsyweb · Accepted Answer · 2013-11-11T08:52:05.270

309

One way to fix this is to escape the parentheses that delimit each capture group:

:%s/\(\w\)\(\w\w\)/\1y\2/g

Slightly shorter (and more magic-al) is to use \v, meaning that in the pattern after it all ASCII characters except '0'-'9', 'a'-'z', 'A'-'Z' and '_' have a special meaning:

:%s/\v(\w)(\w\w)/\1y\2/g
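The escaped-versus-bare parentheses split is not unique to Vim. As a rough sketch outside Vim, assuming a sed that accepts -E, the basic syntax behaves like Vim's default mode and the extended syntax behaves like \v. The variable names are made up for illustration, and [[:alnum:]_] stands in for \w as an assumed ASCII equivalent.

```shell
# Sample input copied from the question; the variable names are hypothetical.
words='bau
ceu
diu
fou
gau'

# Basic regex syntax: groups are written \( \), as in Vim's default mode.
basic_result=$(printf '%s\n' "$words" |
  sed 's/\([[:alnum:]_]\)\([[:alnum:]_][[:alnum:]_]\)/\1y\2/')

# Extended syntax (-E): bare ( ) group, as in Vim after \v.
extended_result=$(printf '%s\n' "$words" |
  sed -E 's/([[:alnum:]_])([[:alnum:]_][[:alnum:]_])/\1y\2/')

printf '%s\n' "$basic_result"
```

Both variables should hold the same five lines, which is a quick way to check a pattern before running it on a real buffer.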

See:

edited Nov 11 '13 at 08:52

answered Nov 11 '13 at 08:46

johnsyweb


score 65 · Answer 2 · edited Aug 08 '16 at 13:34


You can also use this pattern which is shorter:

:%s/^./&y

% applies the substitution to every line of the file.
^. matches the first character of each line.
& in the replacement stands for the matched text, so &y writes the match back followed by y.
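The & shorthand also exists in sed, so the idea can be tried from a terminal. This is only a sketch with hypothetical variable names, comparing & against an explicit capture group on two of the question's words.

```shell
# "&" in the replacement stands for the whole match, here the first character.
with_ampersand=$(printf 'bau\nceu\n' | sed 's/^./&y/')

# Equivalent form with an explicit capture group.
with_group=$(printf 'bau\nceu\n' | sed 's/^\(.\)/\1y/')

printf '%s\n' "$with_ampersand"
```

The two forms are interchangeable here; & simply saves writing the group.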

edited Aug 08 '16 at 13:34

Peter Perháč


answered May 28 '15 at 15:38

Juan


It's amazing how after more than 10 years and quite a bit of expertise in vim, I still learn new tricks like using "&" to add rather than to substitute. Thanks – Kiteloopdesign Nov 01 '22 at 08:15

@Kiteloopdesign `&` is actually just another name for `\0`, which is the capture group containing the entire sequence that was matched. – cuddlebugCuller Apr 12 '23 at 05:07

score 56 · Answer 3 · answered Nov 11 '13 at 08:47


If you don't want to escape the capturing groups with backslashes (this is what you've missed), prepend \v to put Vim's regular expression engine into very magic mode:

:%s/\v(\w)(\w\w)/\1y\2/g

answered Nov 11 '13 at 08:47

Ingo Karkat


Ingo, sorry for placing a question in the wrong place: this works fine in `:exmode`; is there a way to do it in the gvim find/replace dialogue box? – JJoao May 05 '15 at 16:30

@JJoao: No, the find/replace box is for literal search and replacement only. You shouldn't be using that, anyway; it's just training wheels for Notepad users. – Ingo Karkat May 06 '15 at 06:50
Ingo, thank you (it is not for me: I am happy with exmode, but for linguist collaborators in a dictionary project): it almost works. With `\v...` the regexp works fine; in the replacement string, `&` works but `\ ` sequences are protected (`\1\r` are lost) – JJoao May 06 '15 at 08:11
@JJoao: Yes, that's what I found out while playing with it, too. I'm still skeptical whether using Vim without Ex mode is a good idea, but you could easily build your own search-and-replace dialog (internally powered by `:s`) via `inputdialog()` and a bit of Vimscript. – Ingo Karkat May 06 '15 at 08:32
Ingo: Thank you again; I agree with your skeptical opinion. Inputdialog+:s+vimscript is probably the way gvim's find/replace is built. For me, the `\1 \r ` treatment is a gvim bug. I will try to post it to some Vim-specific list. In the meantime I will try my own vimscript-inputdialog. – JJoao May 06 '15 at 09:10

score 17 · Answer 4 · edited Nov 11 '13 at 11:45


You also have to escape the grouping parentheses:

:%s/\(\w\)\(\w\w\)/\1y\2/g

That does the trick.

edited Nov 11 '13 at 11:45

Christian


answered Nov 11 '13 at 08:46

Henkersmann



Coming from Sublime Text 3, this is horrible. Why is the syntax like this? It doesn't make sense to escape characters that aren't literal, normal text. – Unknow0059 Jan 02 '21 at 20:09
@Unknow0059 The parentheses in this case aren't literal text. They are metacharacters that delimit the groups to save for the replace expression. Placing a non-escaped paren in an expression will match the literal character, as one would expect (this was what tripped up the OP). – Azure Heights Mar 02 '21 at 20:38

I'm a regular vim user and I also think this is terrible. @Unknow0059 – icedwater May 19 '21 at 09:48

@Unknow0059 because vim is older than the normal regex syntax that we all use nowadays. Most people that use vim just use the `\v` version described in other answers though, rather than escape every little thing in their regex – CoffeeTableEspresso Aug 07 '22 at 17:49

Victoria Stuart · Answer 5 · 2022-12-01T16:35:11.430

In Vim, on a selection, the following

:'<,'>s/^\(\w\+ - \w\+\).*/\1/

or

:'<,'>s/\v^(\w+ - \w+).*/\1/

parses

Space - Commercial - Boeing

to

Space - Commercial

Similarly,

apple - banana - cake - donuts - eggs

is parsed to

apple - banana

Explanation

^ : match start of line
Backslash-escape (, + and ) as in the first regex (accepted answer), or prepend \v (@ingo-karkat's answer)
\w\+ matches a whole word (\w alone matches a single word character); in this example, I search for a word followed by " - " followed by another word
.* after the capturing group matches the rest of the line, so that the remaining text is dropped from the result
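To see what the trailing .* contributes, here is a small sketch using sed rather than Vim, assuming a sed that accepts -E. The variable names are hypothetical and [[:alnum:]_] stands in for \w.

```shell
line='Space - Commercial - Boeing'  # sample line from the answer

# Without a trailing .*, only the captured prefix is replaced (by itself),
# so the rest of the line survives.
without_tail=$(printf '%s\n' "$line" |
  sed -E 's/^([[:alnum:]_]+ - [[:alnum:]_]+)/\1/')

# With .*, the match covers the whole line and only the group is written back.
with_tail=$(printf '%s\n' "$line" |
  sed -E 's/^([[:alnum:]_]+ - [[:alnum:]_]+).*/\1/')

printf '%s\n%s\n' "$without_tail" "$with_tail"
```

The first substitution leaves the line unchanged, while the second keeps only the first two items.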

Addendum. This is a bit off topic, but I would suggest that Vim is not well-suited for the execution of more complex regular expressions / captures. [I am doing something similar to the following, which is how I found this thread.]

In those instances, it is likely better to dump the lines to a text file and edit it "in place"

sed -i ...

or in a redirect

sed ... > out.txt

In a terminal (or BASH script, ...):


echo 'Space Sciences - Private Industry - Boeing' | sed -r 's/^((\w+ ){1,2}- (\w+ ){1,2}).*/\1/'

Space Sciences - Private Industry 

cat ~/in.txt

Space Sciences - Private Industry - Boeing

sed -r 's/^((\w+ ){1,2}- (\w+ ){1,2}).*/\1/' ~/in.txt > ~/out.txt

cat ~/out.txt 

Space Sciences - Private Industry

## Caution: without -i, sed leaves the source file untouched and only prints to the terminal;
## the > redirect is what saves the result. A later > redirect overwrites the output; use >> to append
## subsequent iterations to the output (preserving the previous output).
 
## To edit "in place" (`-i` argument/flag):

sed -i -r 's/^((\w+ ){1,2}- (\w+ ){1,2}).*/\1/' ~/in.txt

cat ~/in.txt

Space Sciences - Private Industry

sed -r 's/^((\w+ ){1,2}- (\w+ ){1,2}).*/\1/'

Note the {1,2} in the expression above: an interval {x,y} matches between x and y repetitions of the preceding group, here one or two words each followed by a space -- see https://www.gnu.org/software/sed/manual/html_node/Regular-Expressions.html .

Here, since my phrases are separated by -, I can simply tweak those parameters to get what I want.
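As a sketch of how the {1,2} interval copes with phrases of different lengths, the expression can be wrapped in a small function and fed one-word and two-word phrases. The function name first_two_phrases is hypothetical, -r is spelled -E, and \w is spelled [[:alnum:]_]; both spellings are assumptions made for portability.

```shell
# Hypothetical helper wrapping the expression from the answer,
# with \w spelled as [[:alnum:]_] and -r spelled as -E.
first_two_phrases() {
  sed -E 's/^(([[:alnum:]_]+ ){1,2}- ([[:alnum:]_]+ ){1,2}).*/\1/'
}

one_word=$(printf '%s\n' 'Space - Commercial - Boeing' | first_two_phrases)
two_words=$(printf '%s\n' 'Space Sciences - Private Industry - Boeing' | first_two_phrases)

# Brackets make the trailing space kept by the group visible.
printf '[%s]\n[%s]\n' "$one_word" "$two_words"
```

Because each word inside the group includes its following space, the captured text ends with a space in both cases.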
