I'm currently working on a project and have a question about JavaScript, Node.js, request, and cheerio.

request(address, function (error, response, html) {
    if (!error && response.statusCode == 200) {
        var $ = cheerio.load(html);
        $('iframe').each(function (i, element) {
            var a = $(this).attr('src');
        });
    }
});

The code above scrapes exactly the data I want from some websites, and I want to render that data in a template later. However, the variable a seems to exist only inside that piece of code, and I can't find a way to make it global (I wouldn't mind that) or to return it somehow. Any ideas?

orde
Paweł Laskowski

1 Answer

Promises make it easy to extract data that is loaded asynchronously and to use it later. In the code snippet below, I've wrapped your logic in a function that returns a Promise, which resolves with the list of iframe sources:

const request = require('request');
const cheerio = require('cheerio');

function iframes(url) {
    return new Promise((resolve, reject) => {
        request(url, function (error, response, html) {
            if (!error && response.statusCode == 200) {
                const $ = cheerio.load(html);

                // Extract a list of each iframe's src attribute
                const sources = $('iframe').map((i, element) => {
                    return element.attribs['src'];
                }).get();

                // Resolve with the iframe sources
                resolve(sources);
                return;
            }

            // Reject with the request error, or with an Error carrying the status code
            reject(error || new Error('error loading url for iframe sources (status ' + response.statusCode + ')'));
        });
    });
}

And we can use this function like so:

iframes('http://www.w3schools.com/html/html_iframe.asp')
    .then(srcs => {
        // Can access the sources
        console.log(srcs);
    })
    .catch(err => console.log(err));
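The same call can also be written with async/await, assuming a Node.js version that supports it. The following is only a sketch: collectSources is a hypothetical name, and iframes is replaced here by a stand-in that returns made-up URLs so the example runs without network access.

```javascript
// Hypothetical stand-in for iframes(url) above, so the sketch runs offline.
// The returned URLs are invented for illustration.
function iframes(url) {
    return Promise.resolve([url + '/embed/1', url + '/embed/2']);
}

// collectSources is a hypothetical helper: it waits for the Promise and
// returns the sources, falling back to an empty list on failure.
async function collectSources(url) {
    try {
        return await iframes(url);
    } catch (err) {
        console.log(err);
        return [];
    }
}

// The sources are only available once the returned Promise settles.
collectSources('http://example.com').then(srcs => console.log(srcs));
```

Either way, the data is only reachable inside the then callback or after the await; it cannot be returned synchronously from the request callback.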
Calvin Belden
  • I will test it in a few minutes, thanks a lot for the effort though! Really appreciate it – Paweł Laskowski Mar 09 '16 at 10:36
  • Hey, it works like a charm! I have a follow-up question, if you don't mind. I have extracted srcs from the iframes function as you said, and now I'm trying to display them using ejs, so I'd assume they'd be available. I'm just calling render(toBeDisplayed.html.ejs), but it still says the variable is undefined. I tried omitting var etc., but still no luck. – Paweł Laskowski Mar 09 '16 at 11:26
  • When I'm trying to debug it I'm getting strange results as well: console.log('prior render delLater: ' + links[0]); <- links[0] is defined res.render('delLater.html.ejs'); console.log('after render: ' + link[0]); <- reference error – Paweł Laskowski Mar 09 '16 at 11:34
  • I'd have to see the code to know for sure, but I wonder if your render function is getting called before the Promise (with the sources) is resolved. – Calvin Belden Mar 09 '16 at 14:37
  • Hi, yes, that was the issue. I've fixed it simply by passing an argument to the render function. Thanks a lot! – Paweł Laskowski Mar 09 '16 at 15:52
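The fix described in the last comment might look roughly like the sketch below. The actual code was never posted, so everything here is an assumption: showLinks and the links local are hypothetical names, iframes is replaced by a stand-in with made-up data, and fakeRes only imitates the render, status, and send methods of an Express response so the example runs without Express.

```javascript
// Hypothetical stand-in for iframes(url) from the answer; returns fixed data.
function iframes(url) {
    return Promise.resolve(['http://example.com/embed/1']);
}

// Route-handler sketch: render only after the Promise resolves, and pass
// the data to the template explicitly as a local named "links".
function showLinks(req, res) {
    return iframes(req.query.url)
        .then(links => res.render('delLater.html.ejs', { links: links }))
        .catch(err => res.status(500).send(String(err)));
}

// Minimal fake response object so the sketch runs without Express.
const fakeRes = {
    render: (view, locals) => console.log(view, locals),
    status: function () { return this; },
    send: msg => console.log(msg)
};
showLinks({ query: { url: 'http://example.com' } }, fakeRes);
```

The template only sees what is passed as the second argument to render, which would explain why a variable defined elsewhere in the handler showed up as undefined inside the ejs file.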