1

I am fetching raw HTML from a site in JavaScript.

Here is an example fragment:

<a href="/?Stat=5&item=10739">Coldwater</a></b>

I use exec to pull some data out with this pattern:

Stat=5&item=(\d.*)">(.*)<\/a><\/b>

It works fine in a regex tester (link). The problem is how to write it in JS. Currently I have this code, which returns null:

$.get(link,function(data) {
    var raw = data,
        pattern = / Stat=5&item=(\d.*)">(.*)<\/a><\/b>/gi,
        matches = pattern.exec(raw);
    console.log(matches);
});

Do I need to remove some single/double quotes or slashes from the raw HTML first?

Johncze
  • 1
    HTML is a tree-shaped data structure. Consider loading the string into a DOM (either in the browser or JSDOM on the server, depending on where your JavaScript is running) and querying it using DOM methods. – mikemaccana Sep 17 '14 at 09:03
  • 2
    Why is there a space before `S`? – Avinash Raj Sep 17 '14 at 09:04
  • 3
    Possible duplicate of http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags But seriously, don't use regex; see what Mr_Green said. Put your data into a new element and get the `href`s of all `a` tags. – Mardoxx Sep 17 '14 at 09:05

2 Answers

7

There is no need to use a regex here. You can achieve the same thing by creating a new element and letting the browser parse the HTML:

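// Parse the HTML string by assigning it to a detached element.
var a = document.createElement('div');
a.innerHTML = yourString;
// The first child is the <a>; .href is the resolved (absolute) URL,
// while a.children[0].getAttribute('href') returns the raw attribute.
var result = a.children[0].href;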
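
Once the href has been read from the element, the item id still has to be pulled out of the query string. A minimal sketch of one way to do that without a regex, assuming a runtime that provides the standard `URL` class (modern browsers and Node.js). The helper name `itemFromHref` and the placeholder base `http://example.com` are illustrative assumptions, not part of the original answer.

```javascript
// Hypothetical helper: returns the value of the "item" query parameter,
// or null when the parameter is absent.
function itemFromHref(href) {
  // The second argument is only used when href is relative;
  // http://example.com is a placeholder base, not the real site.
  const url = new URL(href, 'http://example.com');
  return url.searchParams.get('item');
}

console.log(itemFromHref('/?Stat=5&item=10739'));
```

Because `URL` ignores the base when the input is already absolute, the same helper should accept either the raw attribute value or the resolved `.href`.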
Mr_Green
1

Remove the space before `Stat` in the pattern:

> var str = '<a href="/?Stat=5&item=10739">Coldwater</a></b>';
undefined
> console.log(/Stat=5&item=(\d.*)">(.*)<\/a><\/b>/gi.exec(str)[0]);
Stat=5&item=10739">Coldwater</a></b>
> console.log(/Stat=5&item=(\d.*)">(.*)<\/a><\/b>/gi.exec(str)[1]);
10739
> console.log(/Stat=5&item=(\d.*)">(.*)<\/a><\/b>/gi.exec(str)[2]);
Coldwater
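
If the regex route is kept, a somewhat tighter variant may be worth considering. The sketch below is an illustration, not the pattern from the question: it swaps the greedy `(\d.*)` and `(.*)` groups for `(\d+)` and `([^<]*)`, so a match cannot run across several links on one line, and it uses `String.prototype.matchAll` to collect every match. It also drops the trailing `<\/b>`, on the assumption that not every link is wrapped in bold. The helper name `extractItems` is hypothetical.

```javascript
// Hypothetical helper: collects { id, name } pairs for every matching link.
function extractItems(html) {
  // \d+ captures only the digits of the item id;
  // [^<]* captures the link text up to the next tag.
  // matchAll requires the g flag on the pattern.
  const pattern = /Stat=5&item=(\d+)">([^<]*)<\/a>/gi;
  const items = [];
  for (const m of html.matchAll(pattern)) {
    items.push({ id: m[1], name: m[2] });
  }
  return items;
}

const sample = '<a href="/?Stat=5&item=10739">Coldwater</a></b>';
console.log(extractItems(sample));
```

Building a fresh pattern inside the function also sidesteps the `lastIndex` state that a reused regex with the `g` flag carries between `exec` calls.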
Avinash Raj