I want to match words inside angle brackets (html tags):
<MatchedWord></MatchedWord>
This is what I have so far:
/\v\<\w+\>
The problem is that it matches the <> too, and the /.
How can I make it match only the word?
You can require text before and after the match without including it in the match by using Vim's special \zs (match start) and \ze (match end) atoms:
/<\/\?\zs\w\+\ze\/\?>
I've included an optional (\?) slash on both sides (e.g. </this> and <this/>). Also note that \w\+ isn't a completely correct expression for XML or HTML tag names (but it can be a good-enough approximation, depending on your data).
For most other regular expression engines, you need to use lookbehind and lookahead to achieve this. Vim has those, too (\@<= and \@=), but the syntax is more awkward, and the matching performance may be poorer.
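You don't need to escape angle brackets (square brackets are []) in most regex engines, since they are not special characters there. In Vim's very magic mode (\v), however, an unescaped < or > is a word boundary, so the escapes in your pattern are needed. You can use a capturing group: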
<\/?(.+)>
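To illustrate the capturing-group idea outside Vim, here is a minimal Python sketch. The names TAG, GREEDY_TAG and tag_contents are made up for this example, and swapping .+ for [^>]+ is my own assumption rather than part of the pattern above: it keeps one greedy match from swallowing two tags on the same line.

```python
import re

# Group 1 captures the text inside the brackets; the brackets and the
# optional leading slash are matched but stay outside the group.
# Assumption: [^>]+ is used instead of .+ so each tag is matched separately.
TAG = re.compile(r"</?([^>]+)>")

# The pattern as written in the answer, kept for comparison (greedy .+).
GREEDY_TAG = re.compile(r"</?(.+)>")


def tag_contents(text, pattern=TAG):
    """Return the captured group for every tag-like span in text."""
    return pattern.findall(text)


if __name__ == "__main__":
    sample = "<MatchedWord></MatchedWord>"
    print(tag_contents(sample))
    print(tag_contents(sample, GREEDY_TAG))
```

With the greedy version, the whole line is one match and the group runs from the first tag name to the last one, which is usually not what you want when several tags share a line.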
In a non-vim environment, this is achieved using positive lookbehind and lookahead as such:
/(?<=<).*?(?=>)/
This matches the following:
<test> // test
</content> // /content
<div id="box"> // div id="box"
<div id="lt>"> // div id="lt
So, as you can see from the final example, it's not perfect, but you are using regex on HTML, so you get what you pay for.
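If you want to try the lookaround pattern yourself, here is a small Python sketch that runs it over the four samples listed above. The names INSIDE and inside_brackets are hypothetical helpers for this example; Python's re module is used only because it supports this lookbehind and lookahead syntax.

```python
import re

# (?<=<) requires a "<" immediately before the match and (?=>) requires
# a ">" immediately after it; neither character is part of the result.
# .*? is lazy, so the match stops at the first ">" it can reach.
INSIDE = re.compile(r"(?<=<).*?(?=>)")


def inside_brackets(text):
    """Return every span found between a "<" and the next ">"."""
    return INSIDE.findall(text)


if __name__ == "__main__":
    samples = ["<test>", "</content>", '<div id="box">', '<div id="lt>">']
    for sample in samples:
        print(sample, "->", inside_brackets(sample))
```

The last sample shows the same limitation as in the list: the lazy match ends at the ">" inside the attribute value, because the pattern knows nothing about quoting.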