I'm using jQuery and have an entire HTML page stored in the variable page:

var page = '<html>...<div id="start">......</div><!-- start -->....</html>';

How can I extract only the section that starts with <div id="start"> and runs through the closing </div><!-- start -->, so that my output is:

<div id="start">......</div><!-- start -->
sullivan

6 Answers

$(page).find('#start').html();
AndreKR
  • This would get the inner HTML, and not necessarily matching the original string. – Nick Craver Nov 10 '10 at 21:58
  • Correct. I wondered whether to write this, too, but he wrote that he is using jQuery (wants to use?) and so this is the only way. If he ever needs the whole element (the outerHTML), he will find a way to wrap it. But what for? – AndreKR Nov 10 '10 at 22:03
  • Be VERY careful throwing `$(page)` around like that... You'll want to save a "copy" of that if you plan on searching within it very often. `var $page = $(page);` Otherwise you're going to create/destroy multiple copies of the DOM representation... Also, if you want to work with it in jQuery, you won't need the `.html()` -- i.e. `$page.find("#start").appendTo("#target");` – gnarf Nov 10 '10 at 22:20

If it's valid HTML, it would be easiest to just let the browser do it for you. Something like this would do the trick:

var page = '<html><head><title>foo</title></head><body><div id="stuff"><div id="start">blah<span>fff</span></div></div></body></html>';

// #start is the only child of #stuff, so the parent's inner HTML is #start's outer HTML
var start_div = $('#start', page).parent();
alert(start_div.html());

You can see this example in action at jsFiddle.

[edit] As @Nick pointed out above, this would probably not include the HTML comment at the end of the div. It also might not work in all browsers (I don't know), so you should test it. Post back and let us know.

Lee
var start = page.match(/(<div id="start">.*?<!-- start -->)/m)[1];
Ben Lee
  • This won't work if there is a newline between the opening and closing tags. – Ender Nov 10 '10 at 22:01
  • @Ender: Of course, forgot to add the "m" for multi-line mode. – Ben Lee Nov 10 '10 at 22:05
  • In any case, chances are the OP did not want what he asked for. A regex will return exactly what he asked for, but if what he really wants is an html component rather than that exact string, he should be using jQuery html parsing like many others here have suggested. – Ben Lee Nov 10 '10 at 22:05
  • Unfortunately, JS doesn't have that feature (or rather, it does, but not in the way you're thinking). Multi-line mode applies only to the start and end of string anchors. The /s flag (which is what allows the . to also match newlines in languages like Perl) isn't supported by javascript. – Ender Nov 10 '10 at 22:08
  • @Ender, you're right -- I always just assumed javascript multiline mode meant the same thing as it does in most regex, but a quick test shows it doesn't. – Ben Lee Nov 10 '10 at 22:32
  • I had to look it up myself, to be sure :) I found this question that clears it up, if you're interested: http://stackoverflow.com/questions/1068280/javascript-regex-multiline-flag-doesnt-work – Ender Nov 10 '10 at 22:34
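
The flag behaviour discussed in the comments above can be checked with a small sketch. The sample page string here is made up for illustration, with newlines placed inside the target div; it is not the asker's actual page.

```javascript
// Hypothetical sample input: the target div spans several lines.
var page = '<html><body>\n<div id="start">\nline one\nline two\n</div><!-- start -->\n</body></html>';

// The /m flag only changes how ^ and $ behave; "." still does not match "\n",
// so this pattern cannot cross the line breaks inside the div.
var withM = page.match(/<div id="start">.*?<!-- start -->/m);

// [\s\S] matches any character at all, newlines included.
var withClass = page.match(/<div id="start">[\s\S]*?<!-- start -->/);

console.log(withM);        // null
console.log(withClass[0]); // the div plus its trailing comment
```

Engines that implement ES2018 or later also accept an s flag that makes "." match newlines, but the [\s\S] form works in older engines as well.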

This should do it:

var result = $(page).find('#start')[0].outerHTML;
Tatu Ulmanen
  • [This isn't available in all browsers](http://www.quirksmode.org/dom/w3c_html.html#t05) - if it were, it wouldn't include the comment. – Nick Craver Nov 10 '10 at 21:58

Use a regex. Or the lazy way (which I don't recommend, but it is quick) would be to create a hidden div, put the page inside it, and select the element from there:

$('#myNewDiv').find('#start').html();
FatherStorm

An appropriate regular expression will get you what you are looking for. Try using a line like this:

var start = page.match(/(<div id="start">[\s\S]*?<\!-- start -->)/)[1];

This uses JavaScript's match method, which returns an array holding the full match followed by each parenthesized sub-match, and puts the first sub-match (in this case, your #start tag and the following comment) into start.

Here's a demo that shows this method working: http://jsfiddle.net/Ender/mphUj/
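
One caveat with the one-liner: match returns null when nothing matches, so indexing the result directly throws a TypeError if the page lacks the section. A minimal sketch of a guarded version follows; the helper name extractStart and the sample strings are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: returns the matched section, or null if it is absent.
function extractStart(page) {
  var match = page.match(/<div id="start">[\s\S]*?<!-- start -->/);
  return match ? match[0] : null;
}

// Made-up sample page with newlines inside the target div.
var sample = '<html><body><p>intro</p>\n<div id="start">\n<span>fff</span>\n</div><!-- start -->\n<p>outro</p></body></html>';

var section = extractStart(sample);                             // the div and its comment
var missing = extractStart('<html><body></body></html>');       // no section present

console.log(section);
console.log(missing);
```

Because the lazy quantifier stops at the first <!-- start --> marker, nested divs inside #start are not a problem as long as that trailing comment is present in the string.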

Ender