I am trying to extract an ID from a piece of text using a regex. For example, 05771292000P from:

<div class="product-price-container">     <a href="/rca-32-class-720p-60hz-led-hdtv-with-built-in-dvd-player-led32b30rqd/226660215" class="product-link subject-price" data-external-product-id="05771292000P">    <span class="save-story-box">   

I tried the regex (?=id=").*"> but it returns every word after id, which is not helpful.

Any idea what I am doing wrong?

r3x
  • 1
    Not considering using a proper parser? `(?= ... )` is lookahead, and you're using it like a lookbehind here (lookbehinds are not supported by javascript). You would be slightly better off with `id="([^"]*)"`. – Jerry Apr 04 '14 at 19:14
  • id="([^"]*)" returns id="05771292000P". I need only 05771292000P, this is why I tried using lookarounds. – r3x Apr 04 '14 at 19:22
  • 1
    Well, look at [this question](http://stackoverflow.com/q/22870673/1578604) then. – Jerry Apr 04 '14 at 19:23
  • 1
    [**I'd much rather look at one of the most upvoted questions on SO**](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – adeneo Apr 04 '14 at 19:40
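To make the point from the comments concrete, here is a minimal sketch comparing the regex from the question with the capture-group version Jerry suggested. The `html` string is a shortened stand-in for the markup in the question, and the variable names are only illustrative.

```javascript
// Shortened version of the markup from the question (an assumption for brevity).
const html = '<a class="product-link" data-external-product-id="05771292000P">    <span class="save-story-box">';

// The original attempt: (?=id=") only asserts that id=" comes next and consumes
// nothing, so the match starts at id=" and the greedy .* runs to the last ">.
const greedy = html.match(/(?=id=").*">/)[0];

// A capture group around a negated character class stops at the closing quote.
const captured = html.match(/id="([^"]*)"/);

console.log(greedy);      // id="05771292000P">    <span class="save-story-box">
console.log(captured[0]); // id="05771292000P"
console.log(captured[1]); // 05771292000P
```

The whole match (index 0) still contains `id="..."`, which is what r3x was seeing; the bare ID is in index 1.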

2 Answers

2

Try this regex:
id="([0-9A-Za-z]+)"

Atri
  • Same as the above comment. This returns id="05771292000P", and I need 05771292000P. Thanks anyways :) – r3x Apr 04 '14 at 19:23
  • 1
    The () are used to capture a value. You get 05771292000P when you do var result = regex.exec(yourString); here result is an array, and the captured value is result[1]. – Atri Apr 04 '14 at 19:32
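A short sketch of what that comment describes, assuming a shortened input string in place of the real markup (`yourString` is just the placeholder name from the comment):

```javascript
const regex = /id="([0-9A-Za-z]+)"/;
// Shortened stand-in for the markup in the question.
const yourString = '<a data-external-product-id="05771292000P">';

// exec returns an array-like result: index 0 is the whole match,
// index 1 is the text captured by the first pair of parentheses.
const result = regex.exec(yourString);

console.log(result[0]); // id="05771292000P"
console.log(result[1]); // 05771292000P
```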
2

Try

var text = '<div class="product-price-container">     <a href="/rca-32-class-720p-60hz-led-hdtv-with-built-in-dvd-player-led32b30rqd/226660215" class="product-link subject-price" data-external-product-id="05771292000P">    <span class="save-story-box">'
text.match(/data-external-product-id="([0-9A-Za-z]+)"/)[1]

match(...) returns an array with the whole match as the first element and the text captured by your group (i.e. [0-9A-Za-z]+) as the second element. If you can trust the source of the text, you can also use jQuery:

var text = '<div class="product-price-container">     <a href="/rca-32-class-720p-60hz-led-hdtv-with-built-in-dvd-player-led32b30rqd/226660215" class="product-link subject-price" data-external-product-id="05771292000P">    <span class="save-story-box">'

var id = $($.parseHTML(text)).find("a").attr("data-external-product-id")

alert(id) // 05771292000P

Please keep in mind that anyone who controls the parsed text can get malicious JavaScript executed through $.parseHTML. So only use the jQuery solution if you control the parsed text.
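For text you do not control, the regex approach from the start of this answer never builds DOM nodes. A minimal sketch follows; `extractProductId` is a hypothetical helper name, not part of the original answer, and it returns null instead of throwing when the attribute is missing.

```javascript
// Hypothetical helper (not from the original answer): returns the ID, or null
// when the text has no data-external-product-id attribute.
function extractProductId(text) {
  const match = text.match(/data-external-product-id="([0-9A-Za-z]+)"/);
  return match ? match[1] : null;
}

console.log(extractProductId('<a data-external-product-id="05771292000P">')); // 05771292000P
console.log(extractProductId('<a href="/no-id-here">'));                      // null
```

The null check matters because match(...) returns null when nothing matches, so indexing the result directly with [1] would throw a TypeError.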

Stephan Kulla