0

I got some HTML as the response from an AJAX call, and I want to get the content of a specific div. Here's the HTML:

<html>
     .
     .
   <div id="div-test">
          .
          .
   </div><!--/div-test-->
     .
     .
</html>

Note: I use the <!--/div-test--> comment because div#div-test contains more divs.

And that's my regex:

/<div[^.]*id=\"div\-test\"[^.]*>(.*?)<\/div><\!\-\-\/div\-test\-\->/

But it doesn't work at all. When I try to match it, all I get is a null value. So, is my regex wrong or is there anything else I need to do?

  • Put the response into a jQuery div fragment that is not appended to the DOM, and then use jQuery to find 'div-test' – joyBlanks Aug 18 '15 at 19:07
  • Or if you're not using jQuery, create a [DocumentFragment](https://developer.mozilla.org/en-US/docs/Web/API/DocumentFragment) from the HTML first, and then search it...using a regex to find HTML tags is not reliable if there's any chance the HTML might change i.e. become more complex. – Matt Browne Aug 18 '15 at 19:09

2 Answers

1

If you're looking for a non-regex approach and you don't want to append the content to the page directly, you can parse it in a detached element and search inside it:

var content = ""; // HTML from the AJAX response

// Parse the HTML in a detached element that is never added to the page
var div = document.createElement('div');
div.innerHTML = content;
// Elements have no getElementById, so use querySelector instead
var test_content = div.querySelector('#div-test').innerHTML;

As for a regex approach, as much as I would advise against it, this might fit your needs:

var search_id = "div-test";
// (?s) is not valid in JavaScript; [\s\S] matches any character, including newlines
var r = new RegExp("<div[^>]*?id=[\"']" + search_id + "[\"'][^>]*>([\\s\\S]*?)</div><!--/" + search_id + "-->");
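
As a hedged sketch of the regex approach: `(?s)` is not valid syntax in a JavaScript `RegExp`, and `[\s\S]*?` is a common way to match across line breaks instead. The helper name `extractDivContent` and the sample markup are made up for illustration, and the pattern still depends on the closing `<!--/id-->` marker comment being present.

```javascript
// Hypothetical helper: returns the inner HTML of <div id="..."> that is
// closed by a matching <!--/id--> marker comment, or null if not found.
function extractDivContent(html, id) {
  // Escape regex metacharacters in the id before embedding it in the pattern.
  var safeId = id.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var pattern = new RegExp(
    "<div[^>]*\\bid=[\"']" + safeId + "[\"'][^>]*>" + // opening tag
    "([\\s\\S]*?)" +                                   // content, newlines included
    "</div>\\s*<!--/" + safeId + "-->"                 // closing tag + marker comment
  );
  var match = pattern.exec(html);
  return match ? match[1] : null;
}

// Sample markup (assumed) with a nested div and line breaks.
var sample = [
  "<html><body>",
  "<div class=\"box\" id=\"div-test\">",
  "  <div>inner</div>",
  "</div><!--/div-test-->",
  "</body></html>"
].join("\n");

console.log(extractDivContent(sample, "div-test"));
```

The lazy `[\s\S]*?` skips past the nested `</div>` because only the outer closing tag is followed by the marker comment.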
iam-decoder
  • 1
    Since there is no reason to downvote this solution, I will +1. And whoever is downvoting, please at least give a reason. – Tushar Gupta Aug 18 '15 at 19:26
  • Perhaps somebody thought that the respondent explicitly suggested an approach that, though obvious and useful, was not what the original poster asked for, even though we all know [about regex-parsing HTML](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). – dakab Aug 18 '15 at 19:32
  • @dakab i answered as asked `"And I want to get the content of a specific div."`, `"So, is my regex wrong or is there anything else I need to do?"` – iam-decoder Aug 18 '15 at 19:34
  • First of all, I didn't downvote. Yes, I know I can do it without regex, and yes, I know parsing HTML using regex is not the best way. I just want to see why the regex from my question doesn't work. All answers are accepted and thanks, but I prefer a regex approach :) –  Aug 18 '15 at 19:45
  • @iam-decoder: I was just guessing the reason for the downvote. And true, the “anything else” part of the question opens it for other approaches! While I wouldn’t be afraid of HTML regexps, your `innerHTML` was the first thing that came to my mind too, to “get the content of a specific div”. – dakab Aug 18 '15 at 19:58
  • Yes it's true, the whole “anything else” thing was misleading. –  Aug 18 '15 at 20:05
-1

You can use this regex:

<div[^>]*?id='div-test'[^>]*?>(.*?)<\/div><!--\/div-test-->

Or, if the markup uses double quotes, you can use:

<div[^>]*?id=\"div-test\"[^>]*?>(.*?)<\/div><!--\/div-test-->
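
One likely reason a pattern like this returns `null` on a real response (an assumption, since the full HTML is not shown) is that `.` does not match line breaks in JavaScript unless the `s` flag is set, so `(.*?)` cannot span a multi-line div. The sketch below uses made-up sample markup to compare the two forms.

```javascript
// Sample markup (assumed): the target div spans several lines.
var html = '<div id="div-test">\n  <div>nested</div>\n</div><!--/div-test-->';

// `.` stops at line breaks, so this cannot reach the closing marker.
var dotPattern = /<div[^>]*?id="div-test"[^>]*?>(.*?)<\/div><!--\/div-test-->/;

// [\s\S] matches any character, including \n and \r.
var anyCharPattern = /<div[^>]*?id="div-test"[^>]*?>([\s\S]*?)<\/div><!--\/div-test-->/;

var dotResult = html.match(dotPattern);      // no match on multi-line input
var anyResult = html.match(anyCharPattern);

console.log(dotResult);
console.log(anyResult && anyResult[1]);
```

Swapping `.` for `[\s\S]` keeps the rest of the pattern unchanged and avoids having to flatten the HTML into a single line first.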

Tushar Gupta
  • What about the `<!--/div-test-->`? I need it; as I already said, there are some nested divs. Without the comment the regex match will fail. –  Aug 18 '15 at 19:12
  • hopefully he never uses `"` to denote the wrappings of the id attribute – iam-decoder Aug 18 '15 at 19:15
  • Well I do use. And this example still gives me null even if I use \ on " –  Aug 18 '15 at 19:19
  • what if there's both `'` and `"` on the page? it's a pretty frequent thing in html. I'd highly suggest you don't use regex for simple checks like this that need to be taken into account. regex for html searching is generally a bad idea, on small scale it's not too bad, but it looks like you're getting a full html document – iam-decoder Aug 18 '15 at 19:21
  • This doesn't call for a recommendation; the OP was stuck with regex, so I gave a solution for that... moreover, there is more than one way of doing things – Tushar Gupta Aug 18 '15 at 19:22
  • Should I stringify the ajax response before I do the regex match? –  Aug 18 '15 at 19:23
  • 1
    your regex is pretty strong, I didn't downvote it. whoever it was got me too XD – iam-decoder Aug 18 '15 at 19:23
  • Ok another question. If I stringify the response it will then contain `\t`, `\r`, etc. How to replace them with `''`? –  Aug 18 '15 at 19:25
  • make another regex to replace them like `content.replace(/\t|\r/g,'');` – iam-decoder Aug 18 '15 at 19:27
  • @TusharGupta i think it was failing for him because in order for your regex to work, the html needs to be in a single line. I think after he does that it'll work how he wants – iam-decoder Aug 18 '15 at 19:29
  • Okay now replace doesn't remove \r and \n. Any idea why? –  Aug 18 '15 at 19:33
  • This is what I got now: `jsonResp = JSON.stringify(resp);` `jsonResp.replace(/\r|\t|\n/g, '');` –  Aug 18 '15 at 19:34
  • without doing the replace or stringify, try this regex: `<div[^>]*?id=[^\"]*?[^']*?div-test[^\"]*?[^']*?[^>]*?((?s).*)<\/div>` – iam-decoder Aug 18 '15 at 19:59
  • Okay it worked like this: `jsonResp = jsonResp.replace(/(?:\\[rn])+/g, '');` and then match this regex: `/<div[^>]*?id=\\\"div-test\\\"[^>]*?>(.*?)<\/div><\!--\/div-test-->/` –  Aug 18 '15 at 20:02
  • Or better: `jsonResp = jsonResp.replace(/(?:\\[rn])+/g, '');` then `jsonResp = jsonResp.replace(/\\*/g, '');` and then match: `/<div[^>]*?id=\"div-test\"[^>]*?>(.*?)<\/div><\!--\/div-test-->/` –  Aug 18 '15 at 20:15
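
The comments above touch on two separate pitfalls, sketched here with assumed sample data. First, `String.prototype.replace` returns a new string rather than modifying the original, so its result has to be assigned. Second, `JSON.stringify` turns a real line break into the two characters `\` and `n`, which would explain why a pattern for an actual newline (`/\n/`) finds nothing in the stringified text.

```javascript
// Sample response text (assumed) containing a real line break.
var resp = '<div id="div-test">\n</div><!--/div-test-->';

// replace() does not change the string it is called on.
resp.replace(/\n/g, "");
var unchanged = resp.indexOf("\n") !== -1;     // resp still has the newline

// Assigning the result keeps the cleaned-up string.
var cleaned = resp.replace(/[\r\n\t]/g, "");

// JSON.stringify escapes the newline as backslash + "n" and adds quotes.
var jsonResp = JSON.stringify(resp);
var hasRealNewline = /\n/.test(jsonResp);      // looks for an actual line break
var hasEscapedNewline = /\\n/.test(jsonResp);  // looks for backslash followed by n

console.log(unchanged, cleaned, hasRealNewline, hasEscapedNewline);
```

If the response is already a string, matching it directly with `[\s\S]` in place of `.` may avoid the stringify step altogether.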