
I have code that iterates over an array of URLs and, for each URL, loads the page and performs some task on the resulting HTML.

But I want the code to be completely synchronous, i.e. I want to execute all the steps sequentially: launch the browser, get the page, use the content.

I don't understand how to do that.

I'm trying to convert it to an async function using await, but I can't make it work.

const puppeteer = require('puppeteer');
const urlList = ["http://example.com", "http://demo.com"];
for (let i = 0; i < urlList.length; i++) {
  let pageContent = puppeteer
    .launch()
    .then(function (browser) {
      return browser.newPage();
    })
    .then(function (page) {
      return page.goto(urlList[i])
        .then(function () {
          return page.content();
        });
    })
    .then(function (html) {
      // do something with html
      console.log(html.length);
    })
    .catch(function (err) {
      // handle error
    });
}
Glasnhost

2 Answers


All you need to do is replace each then with an await:

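// The URL is passed in as a parameter, because the loop index i is not defined inside this function
async function myFunction(url)
{
  try
  {
     let browser = await puppeteer.launch();
     let page = await browser.newPage();
     await page.goto(url);
     let html = await page.content(); // content() returns a promise, so it needs await too
     // ...etc
  }
  catch (err)
  {
     //handle error
  }
}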

Also bear in mind that await can only be used inside an async function.

Also, this is NOT synchronous. It simply flattens the promise chain; the code still runs asynchronously. You cannot make an async function synchronous. See "How do I return the response from an asynchronous call?"
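
To make the "not synchronous" point concrete, here is a minimal sketch that runs without Puppeteer. fakeLoad is a hypothetical stand-in for an asynchronous call such as page.goto(); it is not part of any library. The events array records the order in which things happen.

```javascript
// Hypothetical stand-in for an asynchronous operation; resolves after 10 ms.
function fakeLoad(url) {
  return new Promise((resolve) => setTimeout(() => resolve(`<html>${url}</html>`), 10));
}

const events = [];

async function loadOne(url) {
  events.push('before await');
  const html = await fakeLoad(url); // suspends loadOne only, not the whole program
  events.push('after await');
  return html;
}

const pending = loadOne('http://example.com'); // returns a promise immediately
events.push('caller continues');

pending.then(() => console.log(events.join(' -> ')));
// prints: before await -> caller continues -> after await
```

The caller keeps running while loadOne is paused at its await, which is why the steps inside the async function are sequential but the function as a whole is still asynchronous.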

Liam
  • basically the same solution as the accepted one, but the accepted one works right away. In your code, I guess I have to call myFunction() to make it work. But yes, what I wanted to do was exactly to "flatten the promise chain". thx – Glasnhost Oct 20 '20 at 15:52

You can't make your function synchronous (well, actually you can in Node.js using the deasync npm package, but it's bad practice and rarely makes sense). But you can convert your code to async/await:

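const puppeteer = require('puppeteer');
// The semicolon below is required: without it, the "(" that opens the IIFE
// would be parsed as a call on the array literal and throw a TypeError.
const urlList = ["http://example.com", "http://demo.com"];
(async () => {
  for (let i = 0; i < urlList.length; i++) {
    let browser;
    try {
      browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto(urlList[i]);
      const html = await page.content();
      // do something with html
      console.log(html);
    } catch (err) {
      // handle error
    } finally {
      // shut down the browser that was started for this URL
      if (browser) await browser.close();
    }
  }
})();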

As you can see, you also have to wrap your code into an async function. I have used an IIFE, but you can make it a separate function and reuse it anywhere in your code.
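
As a rough sketch of the "separate function" variant, the loop could be moved into a reusable async function. The names processSequentially, loadHtml and stubLoad are hypothetical and chosen only for illustration. The page loader is passed in as a callback, so the example runs without a browser; with Puppeteer the callback would wrap page.goto() and page.content().

```javascript
// Visits each URL strictly one after another and returns the HTML lengths.
// loadHtml is a hypothetical async callback (url) => Promise<string>. With Puppeteer
// it could be: async (url) => { await page.goto(url); return page.content(); }
async function processSequentially(urls, loadHtml) {
  const lengths = [];
  for (const url of urls) {
    const html = await loadHtml(url); // the next URL is not started until this resolves
    lengths.push(html.length);
  }
  return lengths;
}

// Stub loader so the sketch runs without a browser; it records the call order.
const log = [];
async function stubLoad(url) {
  log.push(`start ${url}`);
  await new Promise((resolve) => setTimeout(resolve, 5));
  log.push(`end ${url}`);
  return `<html>${url}</html>`;
}

const result = processSequentially(['http://example.com', 'http://demo.com'], stubLoad);
result.then((lengths) => console.log(lengths, log));
```

In the recorded log, each "end" entry appears before the next "start" entry, which shows that the requests run in sequence even though the wrapper function itself returns a promise.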

Anton
  • I came to the same conclusion myself. Is it true that there is still one async function (the wrapper) but all the requests are called in sequence, that's what I meant with synchronous. – Glasnhost Oct 20 '20 at 15:50