4

I am trying to match an attribute so I can do a search/replace. I am having trouble, though, because the match extends beyond the quotes of the attribute I want. For example, I want to remove xref="..." from here:

<a href="page.ashx" xref="somethingelse" title="something" class="image">

But when I do a RegEx like this: xref=\".*\", then it selects the attributes xref, title, AND class. How do I tell it to only select the xref attribute?

TruMan1

3 Answers

10

I strongly suggest using something other than regex for modifying markup; however, this should work:

xref="[^"]*"
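To see why the negated character class helps, here is a minimal sketch in Python (chosen only for illustration; the pattern itself is not Python-specific). The variable names and the optional trailing `\s*` are assumptions added for the example, not part of the answer above.

```python
import re

tag = '<a href="page.ashx" xref="somethingelse" title="something" class="image">'

# [^"]* can only consume characters that are not a double quote,
# so the match has to end at the first quote after xref=".
match = re.search(r'xref="[^"]*"', tag)
print(match.group(0))

# Appending \s* also swallows the whitespace left behind by the removal.
cleaned = re.sub(r'xref="[^"]*"\s*', "", tag)
print(cleaned)
```

The first print shows only the xref attribute; the second shows the tag with that attribute removed and the other attributes untouched.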
Daniel Haley
  • Sorry... Tried to remove my upvote because the coloring messed up my interpretation. +1 again. – agent-j Jun 26 '11 at 00:29
  • It depends on your source, though -- if you know it will be formatted this is fine. If your input might be any valid HTML, a regex is really not appropriate -- it is perfectly valid HTML to have something like xref = value (no quotes, or single quotes, or spaces around the equals sign...) – Rob Whelan Oct 03 '13 at 16:09
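If the input really can vary in the ways described in the comment above, a somewhat more tolerant pattern could look like the sketch below. This is Python for illustration only; `XREF_ATTR` and `strip_xref` are hypothetical names. It is still a regex, so it can also match text that merely looks like an attribute (inside another attribute's value, for instance), and a real HTML parser remains the safer choice for arbitrary input.

```python
import re

# Assumption: the attribute is preceded by whitespace and its value is
# double-quoted, single-quoted, or an unquoted run of non-space characters.
XREF_ATTR = re.compile(
    r"""\s+xref\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)

def strip_xref(html: str) -> str:
    """Remove xref attributes, including the whitespace in front of them."""
    return XREF_ATTR.sub("", html)

print(strip_xref('<a href="page.ashx" xref="somethingelse" title="something">'))
print(strip_xref("<a xref = 'x y' title=\"t\">"))
print(strip_xref("<a xref=value>"))
```

Consuming the leading whitespace rather than the trailing whitespace keeps the tag tidy even when xref is the last attribute.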
1

Use the non-greedy version: \".*?\"

.* is greedy: it matches as much as possible. Adding a ? makes it non-greedy (lazy), so it matches only as much as needed.
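The difference is easy to check side by side. A small sketch in Python (used here only for illustration; the variable names are made up for the example) running both quantifiers against the tag from the question:

```python
import re

tag = '<a href="page.ashx" xref="somethingelse" title="something" class="image">'

# Greedy: .* runs to the end of the line, then backtracks to the LAST quote.
greedy = re.search(r'xref=".*"', tag).group(0)

# Lazy: .*? expands one character at a time and stops at the FIRST quote.
lazy = re.search(r'xref=".*?"', tag).group(0)

print(greedy)  # xref="somethingelse" title="something" class="image"
print(lazy)    # xref="somethingelse"
```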

Máthé Endre-Botond
0

It looks like you're using .NET. In C#:

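// Requires: using System.Text.RegularExpressions;
// [^\"]* stops at the closing quote; Replace returns a new string, so keep the result.
Regex regex = new Regex("xref=\"[^\"]*\"\\s*", RegexOptions.IgnoreCase);
myHtml = regex.Replace(myHtml, "");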
agent-j