0

My input string looks something like:

var someString = 'This is a nice little string with <a target="_" href="/carSale/12/..">link1</a>. But there is more that we want to do with this. Lets insert another <a target="_" href="/carSale/13/..">link2</a> ';

My end goal is to match every anchor element that has "carSale" within its href attribute and replace it with the text inside the anchor.

For example, replace
<a target="_" href="/carSale/12/..">link1</a> with the string link1

but it should not replace 
<a target="_" href="/bikeSale/12/..">link3</a> 

since the above href does not contain the string "carSale"

I have created a regular expression object for this. But it seems to be performing a greedy match.

var regEx = /(<a.*carSale.*>)(.*)(<\/a>)/;

var someArr = someString.match(regEx);

console.log(someArr[0]);
console.log(someArr[1]);
console.log(someArr[2]);
console.log(someArr[3]);

Appending the 'g' modifier to the end of the regular expression gives bizarre results.

Fiddle here : http://jsfiddle.net/jameshans/54X5b/
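
To make the greedy behaviour concrete, here is a minimal sketch that runs the pattern from the question against the sample string (nothing beyond the question's own regex and string is assumed). Each `.*` grabs as much as it can, so group 1 stretches from the first opening tag through the second one, and only the last link text ends up in group 2.

```javascript
var someString = 'This is a nice little string with <a target="_" href="/carSale/12/..">link1</a>. But there is more that we want to do with this. Lets insert another <a target="_" href="/carSale/13/..">link2</a> ';

// The pattern from the question: every .* is greedy.
var greedy = /(<a.*carSale.*>)(.*)(<\/a>)/;
var m = someString.match(greedy);

console.log(m[1]); // starts at the first <a ...> and runs through the second opening tag
console.log(m[2]); // "link2"
console.log(m[3]); // "</a>"

// With the g flag, match() returns only the full matches (no capture groups),
// so indexing [1], [2], [3] no longer yields the groups.
var all = someString.match(/(<a.*carSale.*>)(.*)(<\/a>)/g);
console.log(all.length); // 1: both links are swallowed by a single match
```

This is probably the source of the odd results with `g`: the returned array holds whole matches, and here there is only one of them.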

runtimeZero

4 Answers

1

Online Demo

I am not sure what your matching groups are meant to be, but how about this expression:

/^<a.*href="((?:.*)carSale(?:.*))".*>(.*)<\/a>$/

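Note that this expression requires carSale to appear inside the href attribute, which I think is where you want it to match. Because of the ^ and $ anchors, it only matches when the whole string starts with the opening <a and ends with </a>.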

Since, as I understand it, you want to replace the whole expression, all you need to do is:

var result = '<a target="_" href="/carSale/12/..">link1</a>'.replace(/^<a.*href="((?:.*)carSale(?:.*))".*>(.*)<\/a>$/, "temp text");
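
For anchors embedded in a longer string, as in the question, a pattern anchored with ^ and $ will not match. The following is only a sketch of one possible adaptation, not part of the original answer: the function name `unwrapCarSaleLinks` is hypothetical, and the pattern assumes double-quoted href values and no nested anchors. It uses `[^>]*` and `[^"]*` so a match cannot run past the end of a tag or attribute value.

```javascript
// Hypothetical helper: replaces each anchor whose href contains "carSale"
// with the anchor's inner text. Assumes double-quoted attributes.
function unwrapCarSaleLinks(html) {
  // [^>]* stays inside the opening tag; [^"]* stays inside the href value.
  var pattern = /<a\b[^>]*\bhref="[^"]*carSale[^"]*"[^>]*>(.*?)<\/a>/g;
  // $1 is the first capture group, i.e. the text between <a ...> and </a>.
  return html.replace(pattern, '$1');
}

var input = 'Buy <a target="_" href="/carSale/12/..">link1</a> or <a target="_" href="/bikeSale/12/..">link3</a>';
console.log(unwrapCarSaleLinks(input));
// Buy link1 or <a target="_" href="/bikeSale/12/..">link3</a>
```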
Dalorzo
1

Or this one:

/(<a.*?carSale.*?>)(.*?)(<\/a>)/

The ? makes the quantifier non-greedy (lazy), so it eats as little as possible, versus the default behavior of *, which is to eat as much as possible. So with the ? added, the (.*?) will stop at the first </a> rather than the last one.
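
As a rough illustration, the lazy pattern can be combined with the `g` flag and `String.prototype.replace`, using `$2` (the second capture group) as the replacement. The sample string is the one from the question; the variable names `lazy` and `result` are just placeholders.

```javascript
var someString = 'This is a nice little string with <a target="_" href="/carSale/12/..">link1</a>. But there is more that we want to do with this. Lets insert another <a target="_" href="/carSale/13/..">link2</a> ';

// Lazy quantifiers plus the g flag: each anchor is matched separately.
var lazy = /(<a.*?carSale.*?>)(.*?)(<\/a>)/g;

// $2 refers to the second capture group, i.e. the link text.
var result = someString.replace(lazy, '$2');

console.log(result);
// This is a nice little string with link1. But there is more that we want to do with this. Lets insert another link2
```

One caveat: `.*?` can still cross tag boundaries, so if an anchor without carSale comes before one with it, the match starts at the earlier `<a` and swallows both. Using `[^>]*` in place of `.*?` inside the opening tag avoids that.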

secretformula
geert3
1

Rather than using a regular expression, use a parser. This won't break as easily and uses the native (native as in the browser's) parser so is less susceptible to bugs:

var div = document.createElement("div");
div.innerHTML = someString;

// Get all links (querySelectorAll returns a static list,
// so replacing nodes while iterating is safe)
var links = div.querySelectorAll("a");
for (var i = 0; i < links.length; ++i) {
    var a = links[i];
    // If the link's href contains the desired string
    if (a.href.indexOf("carSale") >= 0) {
        // Replace the element with its text; go through parentNode
        // because the link may not be a direct child of div
        a.parentNode.replaceChild(document.createTextNode(a.textContent), a);
    }
}

See http://jsfiddle.net/prankol57/d72Vr/

However, if you are confident that your HTML will always follow the pattern specified by your regex, then you can use it. For background, see the question "RegEx match open tags except XHTML self-contained tags".

Community
soktinpk
  • A quibble: The "native parser" is not part of the JavaScript language; it's part of a specific web client integration. – Dan Korn Jun 04 '14 at 20:12
  • 1
    @Dan Yes, but my point is that something could break his regular expression easily, for it [the regex] would have to grow quite complicated to handle all possible variations of an `a` tag that has a certain href property. However, if he knows where the element is coming from and is confident that the tag will follow his regex (as in, he doesn't need to cover all variations), then it's okay to use the regex. – soktinpk Jun 04 '14 at 20:20
  • I don't disagree with any of that. My only quibble was with the use of the word "native" in reference to a parser that's part of a web browser integration, not part of the JavaScript language itself. – Dan Korn Jun 04 '14 at 20:35
  • @soktinpk .. this surely turned out to be a better solution ..although i had to replace the replaceChild line with var a = A.parentNode.replaceChild(document.createElement("span"), A); – runtimeZero Jun 05 '14 at 20:50
0
(<a[^>]*(href=\"([^>]*(?=carSale)[^>]*)\")[^>]*>)([^<]*)(<\/a>)*

Groups 3 (the href value) and 4 (the link text) are the ones you are interested in.
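
A small sketch of how those groups could be read out, assuming a variant of the pattern above with the `g` flag added and the closing tag made mandatory. The sample string comes from the question; `pattern`, `found` and the `href`/`text` property names are illustrative only.

```javascript
var someString = 'This is a nice little string with <a target="_" href="/carSale/12/..">link1</a>. But there is more that we want to do with this. Lets insert another <a target="_" href="/carSale/13/..">link2</a> ';

// Group 3 captures the href value, group 4 the link text.
var pattern = /(<a[^>]*(href="([^>]*(?=carSale)[^>]*)")[^>]*>)([^<]*)(<\/a>)/g;

var found = [];
var m;
// exec() with the g flag advances through the string one match at a time.
while ((m = pattern.exec(someString)) !== null) {
  found.push({ href: m[3], text: m[4] });
}

console.log(found);
// [ { href: '/carSale/12/..', text: 'link1' },
//   { href: '/carSale/13/..', text: 'link2' } ]
```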

George