567

I have a big HTML file that has lots of markup that looks like this:

<p class="MsoNormal" style="margin: 0in 0in 0pt;">
  <span style="font-size: small; font-family: Times New Roman;">stuff here</span>
</p>

I'm trying to do a Vim search-and-replace to get rid of all class="" and style="" but I'm having trouble making the match ungreedy.

My first attempt was this:

%s/style=".*?"//g

but Vim doesn't seem to like the ?. Unfortunately removing the ? makes the match too greedy.

How can I make my match ungreedy?

Zombies
Mark Biek

8 Answers

870

Instead of .* use .\{-}.

%s/style=".\{-}"//g

Also, see :help non-greedy
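As a rough illustration outside Vim, here is the same greedy versus non-greedy difference in Python's `re` module. This is only a sketch: Python spells the non-greedy quantifier `.*?` (Perl style) where Vim spells it `.\{-}`, and the sample line is taken from the question.

```python
import re

line = '<p class="MsoNormal" style="margin: 0in 0in 0pt;">'

# Greedy: .* runs to the LAST double quote on the line,
# so the style attribute is swallowed along with class.
greedy = re.sub(r'class=".*"', '', line)

# Non-greedy: .*? stops at the FIRST closing quote
# (the Vim equivalent is .\{-}).
lazy = re.sub(r'class=".*?"', '', line)

print(greedy)
print(lazy)
```

The greedy form leaves only `<p >`, while the non-greedy form removes just the class attribute and keeps the style attribute intact.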

Randy Morris
  • 54
    Not very intuitive, is this something that only vim does? – Ehtesh Choudhury Dec 08 '12 at 05:08
  • 7
    Yes. Vim has its own regular expression language. – Randy Morris Dec 09 '12 at 00:12
  • 111
    Everything has its own regular expression language... that's one of the biggest issues with regex. – Patrick Farrell Apr 17 '13 at 22:22
  • 45
    Lots of these tools matured around the same time and independently developed their own dialect of a regular expression language. Many of these tools also were trying to solve different problems so it makes sense that the syntax could be -potentially wildly- different across these implementations. We have to accept that this is just how the real world works even though it sometimes makes our lives harder as developers. Luckily many tools at least provide a Perl-compatible implementation of regex these days. Unfortunately Vim is not one of them. – Randy Morris Apr 18 '13 at 17:16
  • 24
    If anyone like myself defaults their search to `\v` (very magic flag) you'll want to use `.{-}`. – jgillman Mar 12 '14 at 00:55
  • 64
    @Shurane @Ziggy Mnemonic: controls the number of repetitions like `{1,3}` does (braces). The minus sign `-` means: repeat as little as possible (little == minus) ;) – Ciro Santilli OurBigBook.com Mar 16 '14 at 21:05
  • 3
    I use regex with countless tools, and they're all pretty much the same. It's Vim's fault for having a nonstandard regex language, not regex's fault. – Glenn Maynard Apr 26 '14 at 03:32
  • 7
    @GlennMaynard You're wrong; check [this answer](http://stackoverflow.com/a/3604643/797744) to see why. – DBedrenko May 31 '14 at 07:35
  • Related: [Searching over multiple lines](http://vim.wikia.com/wiki/Search_across_multiple_lines#Searching_for_multiline_HTML_comments) at Vim Wikia – kenorb Oct 25 '15 at 19:11
  • 5
    I was looking for non-greedy one-or-more, like `/.+?/` in perl. The help file gives the syntax for this, which is `.\{-1,}`. (The 1 is the lower limit.) – tremby Jan 25 '16 at 01:24
  • 3
    Why do we have to escape the first `\{` but not the second `}`? – knub Nov 28 '16 at 09:29
  • 3
    @knub For the same reason that you don't have to escape both when you do, for instance, `a\{2}`. You are escaping the entire `{...}` atom, not the individual characters. – Randy Morris Nov 28 '16 at 11:38
72

Non-greedy search in Vim is done using the \{-} operator, like this:

%s/style=".\{-}"//g

just try:

:help non-greedy
Vilhelm Gray
49

What's wrong with the following?

%s/style="[^"]*"//g
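The negated character class `[^"]*` is written the same way in most regex dialects, so it can be tried outside Vim too. A minimal Python sketch follows, assuming attribute values never contain a double quote. The `title="x"` attribute is made up for illustration and does not appear in the question.

```python
import re

# Hypothetical line: a second attribute (title) follows style.
line = '<span style="font-size: small;" title="x">stuff here</span>'

# [^"]* can never cross a double quote, so the greedy * is harmless:
# the match is forced to end at the first closing quote.
result = re.sub(r'style="[^"]*"', '', line)

print(result)
```

Because the class excludes the delimiter itself, no non-greedy quantifier is needed.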
Paul Tomblin
19

If you're more comfortable with PCRE regex syntax, which

  1. supports the non-greedy operator ?, as you asked in the OP; and
  2. doesn't require backwhacking grouping and cardinality operators (an utterly counterintuitive vim syntax requirement since you're not matching literal characters but specifying operators);

and you have [g]vim compiled with the perl feature (run

    :ver

and inspect the features; if +perl is there you're good to go), try search/replace using

:perldo s///

Example. Swap src and alt attributes in img tag:

<p class="logo"><a href="/"><img src="/caminoglobal_en/includes/themes/camino/images/header_logo.png" alt=""></a></p>

:perldo s/(src=".*?")\s+(alt=".*?")/$2 $1/

<p class="logo"><a href="/"><img alt="" src="/caminoglobal_en/includes/themes/camino/images/header_logo.png"></a></p>
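For readers without a +perl build, the same swap can be checked in Python. This is a sketch rather than a drop-in replacement for `:perldo`: Python's `re.sub` uses `\1` and `\2` in the replacement where Perl uses `$1` and `$2`, and the tag is the one from the example above.

```python
import re

tag = '<img src="/caminoglobal_en/includes/themes/camino/images/header_logo.png" alt="">'

# Two non-greedy capture groups; \2 \1 emits them in swapped order.
swapped = re.sub(r'(src=".*?")\s+(alt=".*?")', r'\2 \1', tag)

print(swapped)
```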
FrDarryl
  • 1
    `perldo` works great, but unfortunately does not highlight the selected text while typing the regex. – mljrg Feb 08 '19 at 10:25
  • you can't use `perldo` for interactive regex find/replace like you can with the native vim substitute `s/`. Or is it possible? I'd love to be wrong about that. – Z4-tier Oct 05 '20 at 23:15
13

I've found that a good solution to this type of question is:

:%!sed ...

(or perl if you prefer). IOW, rather than learning vim's regex peculiarities, use a tool you already know. With perl, the ? modifier works and makes the match non-greedy.

William Pursell
  • 2
    good point, but being able to do `/pattern` to check that you're matching the pattern correctly before applying it and using `c` modifier in your vim regular expression is also nice :) – João Portela Dec 30 '10 at 15:00
  • This is correct. All solutions here are not close to non-greedy! If you have to match [0-9]\{7} in a line with lots of text and several occurrences of that pattern, no solution here will do. The solutions here only work for simple things (which, to be fair, is what was asked). But if you are doing a little more than searching till the next quotation mark, vim won't help. – gcb Jan 28 '14 at 17:54
5

With \v (as suggested in several comments)

:%s/\v(style|class)\=".{-}"//g
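To see what a combined style/class pattern does to the markup from the question, here is a hedged Python sketch. It uses `.*?` in place of Vim's `.{-}`. The leading `\s+` is an addition of mine, not part of the command above: it also removes the space in front of each attribute so that no stray gaps are left in the tags.

```python
import re

html = (
    '<p class="MsoNormal" style="margin: 0in 0in 0pt;">\n'
    '  <span style="font-size: small; font-family: Times New Roman;">stuff here</span>\n'
    '</p>'
)

# Alternation covers both attributes; .*? keeps each match inside
# its own pair of quotes. \s+ (an assumption added here) eats the
# whitespace that precedes the attribute.
cleaned = re.sub(r'\s+(?:style|class)=".*?"', '', html)

print(cleaned)
```

The indentation before `<span` survives because that whitespace is not followed by `style=` or `class=`.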
JJoao
4

Plugin eregex.vim handles Perl-style non-greedy operators *? and +?

bain
-3

G'day,

Vim's regexp processing is not too brilliant. I've found that the regexp syntax for sed is about the right match for vim's capabilities.

I usually set the search highlighting on (:set hlsearch) and then play with the regexp after entering a slash to enter search mode.

Edit: Mark, that trick to minimise greedy matching is also covered in Dale Dougherty's excellent book "Sed & Awk" (sanitised Amazon link).

Chapter Three "Understanding Regular Expression Syntax" is an excellent intro to the more primitive regexp capabilities involved with sed and awk. Only a short read and highly recommended.

HTH

cheers,

Rob Wells
  • 7
    Vim's regex processing is actually quite nice. It can do things that sed can't, like match on line/column numbers or match based on per-language classification of characters as keywords or identifiers or whitespace. It also has zero-width assertions and the ability to put expressions in the right side of a replacement. If you use `\v` it helps clean the syntax up a lot. – Brian Carper Aug 20 '09 at 17:08
  • 1
    @Brian, cheers. I'll do a help regex and see what I've been missing. – Rob Wells Aug 20 '09 at 18:22
  • @RobWells, _Sed & Awk_, which is indeed a very good book imho, does not explicitly spend any words on greedy/lazy quantifiers. As a proof, there is absolutely no occurrence of the words _greed_ or _greedy_ in the book, and there's only one, but unrelated, occurrence of the word _lazy_. – Enlico Apr 18 '20 at 18:30
  • @EnricoMariaDeAngelis it is but the example does not refer to the term explicitly. It is about how to tailor your regex to use the "not" operator to achieve non greedy matches. The term greedy and lazy arrived with Perl's NFA engine when they introduced operators to specifically modify greedy match behaviour. – Rob Wells Apr 20 '20 at 11:03