
I am trying to parse the HTML of a webpage into a DOM by loading it into an iframe and then searching that DOM. Here's the code:

function f(callback) {
    var tmp = document.createElement('iframe');
    $(tmp).hide();
    $(tmp).insertAfter($('foo'));
    $(tmp).attr('src', url);

    $(tmp).load(function() {
        var bdy = tmp.contentDocument.body;
        callback(bdy);
        $(tmp).remove();
    });
}

In the callback function, if I do something like the following

function callback(bdy) {
    alert($(bdy).find('bar').length);
}

sometimes it gives me the correct value, but sometimes it gives me 0 instead. However, if I do the following, it works:

var tmp = document.createElement('iframe');
$(tmp).hide();
$(tmp).insertAfter($('foo'));
$(tmp).attr('src', url);

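$(tmp).load(function() {
    // No `tmp` parameter here: it would shadow the iframe with the event object.
    setTimeout(function() {
        var bdy = tmp.contentDocument.body;
        callback(bdy);
        $(tmp).remove();
    }, 100); // delay in milliseconds, as a number
});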

Since a fixed setTimeout() delay depends on how fast the client is, I would like to know if there is a more reliable way to achieve the same goal. Thanks.

nichehole

1 Answer

Check this question out, along with its many answers and related questions.


Update: here's some code that waits for #bar to be loaded:

function f(callback) {
    var tmp = document.createElement('iframe'),
        $tmp = $(tmp);
    $tmp.hide()
        .insertAfter($('foo'))
        .attr('src', url);

    $tmp.load(function() {
        var bdy = tmp.contentDocument.body,
            $bdy = $(bdy); // small optimization

        // Poll every 50 ms until the page's own script has added #bar.
        var waitForBar = function() {
            if ($bdy.find('#bar').length > 0) {
                callback(bdy);
                $tmp.remove();
            } else {
                setTimeout(waitForBar, 50);
            }
        };
        waitForBar();
    });
}
Community
Simeon
  • I am using the accepted solution from there but it does not solve my problem. I've noticed that it doesn't work all the time even if I apply my function directly on the website instead of loading it into an iframe. – nichehole Jun 04 '11 at 13:26
  • In your callback, what do you get as a result if you try `alert($(bdy))`? Or better, if you are using FireBug or Chrome, do a `console.log($, bdy, $(bdy), $(bdy).find('#bar'))` and comment your results? – Simeon Jun 05 '11 at 19:34
  • If I do `alert($(bdy))` it always gives me `[object HTMLBodyElement]`. Using Chromium, a `console.log($, bdy, $(bdy), $(bdy).find('#bar'))` always gives me `function (a,b){return new e.fn.init(a,b,h)}` and then the `<body>` node in the console, but only occasionally gives me the node containing `'#bar'`. I suspect the webpage adds the node containing `'#bar'` to the DOM via some JavaScript, and sometimes jQuery considers the page finished loading before that happens? – nichehole Jun 06 '11 at 04:34
  • OK! If you do not have control over the iframe's contents and want to find a dynamically added element, you have no choice but to wait until the content has loaded. In order to determine the amount of wait time needed, see my updated answer. It shows how to wait for #bar to get added. – Simeon Jun 06 '11 at 16:10
  • Thanks. This works. One small question: according to [this question](http://stackoverflow.com/questions/729921/settimeout-or-setinterval), would it be better to use setTimeout recursively instead of setInterval here? – nichehole Jun 07 '11 at 16:21
  • The interval is disposed of when #bar is found, and the check never takes longer than 50 ms since it's a native `document.getElementById` call. However, your point is valid, since setInterval fires the function on a fixed schedule rather than waiting for the previous call to finish. Theoretically, a check could take longer than 50 ms if the client's computer was hit by lightning and recovered 51 ms afterwards. So I've edited my answer! – Simeon Jun 07 '11 at 17:11
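
The recursive setTimeout approach discussed in the comments above can be pulled out into a small helper. This is only a sketch, not part of the original answer: the name `pollUntil`, its `options` fields (`interval`, `maxAttempts`, `onTimeout`, `schedule`) and the attempt limit are all assumptions added for illustration. The limit is there so the loop stops if #bar never appears.

```javascript
// Hypothetical helper (not from the original answer): call `check` repeatedly
// until it returns a truthy value, then pass that value to `onFound`.
function pollUntil(check, onFound, options) {
    var opts = options || {};
    var interval = opts.interval || 50;          // ms between attempts
    var maxAttempts = opts.maxAttempts || 100;   // give up after this many tries
    var onTimeout = opts.onTimeout || function () {};
    // `schedule` is injectable (an assumption made for testability);
    // by default it simply wraps setTimeout.
    var schedule = opts.schedule || function (fn, ms) { setTimeout(fn, ms); };
    var attempts = 0;

    function attempt() {
        attempts += 1;
        var result = check();
        if (result) {
            onFound(result);
        } else if (attempts >= maxAttempts) {
            onTimeout(attempts);
        } else {
            // The next attempt is scheduled only after this one has finished,
            // so attempts never overlap, unlike with setInterval.
            schedule(attempt, interval);
        }
    }

    attempt();
}

// Possible use inside the iframe's load handler, assuming the `$bdy`, `bdy`,
// `$tmp` and `callback` names from the answer are in scope:
//
// pollUntil(
//     function () { return $bdy.find('#bar').length > 0; },
//     function () { callback(bdy); $tmp.remove(); },
//     { interval: 50, maxAttempts: 100, onTimeout: function () { $tmp.remove(); } }
// );
```

With the defaults above, the helper would give up after roughly five seconds (100 attempts, 50 ms apart); both values are arbitrary and would need tuning for the page being loaded.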