I am using Puppeteer to build an automated shopping bot. I wrote a function, addToCart, that adds a product to the cart. After adding the product, I want to open a new URL in the same browser session, but it's not working.

const puppeteer = require("puppeteer");

const product_url =
  "https://www.amazon.com/gp/product/B08ZL7LZW3?pf_rd_r=J6QNRZDRJ7Z8FSF2HVXR&pf_rd_p=6fc81c8c-2a38-41c6-a68a-f78c79e7253f&pd_rd_r=d3adff00-5e6f-4456-8fd0-bd187f2ff86d&pd_rd_w=oO6kF&pd_rd_wg=aoGGq&ref_=pd_gw_unk";

const checkout_url = "https://www.amazon.com/gp/buy/shipoptionselect/handlers/display.html?hasWorkingJavascript=1";

async function givePage() {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  return page;
}

async function addToCart(page) {
  await page.goto(product_url);
  await page.waitForSelector(
    "button[class='single_add_to_cart_button button alt']"
  );

  await page.click(
    "button[class='single_add_to_cart_button button alt']",
    (elem) => elem.click()
  );
}

async function checkout() {
  var page = await givePage();
  await addToCart(page);
  await page.waitForNavigation();
  const page2 = await browser.newPage(); // open new tab
  await page2.goto("https://www.amazon.com/gp/buy/shipoptionselect/handlers/display.html?hasWorkingJavascript=1"); // go to the checkout page
  await page2.bringToFront(); // make the tab active
}

checkout();
    Your `browser` variable is declared *inside* `givePage()`, so you cannot use it in the `checkout()` function. – Pointy Jun 26 '21 at 15:27

1 Answer

Every call to givePage creates a new browser from scratch but discards the reference to the browser object when the function returns, so code outside givePage, such as checkout, has no browser variable to open new pages with or to close. (A page can reach its own browser through page.browser(), which is enough to close the browser locally, but that doesn't give the rest of the script a shared reference.)

You may want to make one browser, pull one page from it, and repeatedly access that same page from different functions.

Alternately, you may wish to use one browser, then generate new pages ("tabs") on demand, as appears to be the case here.

Caching promises in variables is one way to achieve these things. In an outer scope, you can write

const browserPromise = puppeteer.launch({headless: false});

Note that there's no await. The idea is to cache the promise that resolves to the browser. Whenever await browserPromise runs later in the code, the promise resolves to the same underlying browser instance. The same approach works for pages: after const pagePromise = browser.newPage();, awaiting pagePromise any number of times always gives you the same page object.
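To see the caching behavior in isolation, here is a minimal sketch that needs no browser. fakeLaunch is a hypothetical stand-in for puppeteer.launch (it is not part of Puppeteer) that resolves to a fresh object on every call, so you can compare awaiting one cached promise twice against calling the launcher again.

```javascript
// Hypothetical stand-in for puppeteer.launch(): resolves to a new object
// on every call and counts how many times it has been called.
let launchCount = 0;
const fakeLaunch = () => Promise.resolve({ id: ++launchCount });

// Cache the promise itself (no await), as with browserPromise above.
const cachedPromise = fakeLaunch();

async function demo() {
  // Awaiting the same promise twice yields the same resolved object.
  const first = await cachedPromise;
  const second = await cachedPromise;

  // Calling the launcher again, by contrast, creates a different object.
  const fresh = await fakeLaunch();

  return {
    sameObject: first === second,
    freshIsDifferent: fresh !== first,
    launchCount,
  };
}
```

Calling demo() should resolve with sameObject and freshIsDifferent both true, which is the difference between the cached browserPromise and the original givePage, which launched a new browser on every call.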

Here's an example. Error handling and getting past Amazon's robot blocker are left as exercises.

const puppeteer = require("puppeteer");

const product_url = "https://www.amazon.com/gp/product/B08ZL7LZW3?pf_rd_r=J6QNRZDRJ7Z8FSF2HVXR&pf_rd_p=6fc81c8c-2a38-41c6-a68a-f78c79e7253f&pd_rd_r=d3adff00-5e6f-4456-8fd0-bd187f2ff86d&pd_rd_w=oO6kF&pd_rd_wg=aoGGq&ref_=pd_gw_unk";
const checkout_url = "https://www.amazon.com/gp/buy/shipoptionselect/handlers/display.html?hasWorkingJavascript=1";

const browserPromise = puppeteer.launch({headless: false});

const newPage = () => browserPromise.then(browser => browser.newPage());

async function addToCart(page) {
  await page.goto(product_url);
  await page.waitForSelector(
    "button[class='single_add_to_cart_button button alt']"
  );

  // page.click takes a selector and an optional options object, not a callback
  await page.click(
    "button[class='single_add_to_cart_button button alt']"
  );
}

async function checkout() {
  const page = await newPage();
  await addToCart(page);
  await page.waitForNavigation();
  const page2 = await newPage(); // open new tab
  await page2.goto(checkout_url); // go to the checkout page
  await page2.bringToFront(); // make the tab active

  const browser = await browserPromise;
  await browser.close();
}

checkout();

That said, for small scripts like this you can always inline all of the code in one function. If your script performs a simple task and can be written in 10 or 20 lines of code, it's probably premature to create abstractions.

For larger scripts, you may want to increase abstraction and gather the browser and relevant page functions in a class or object, but whether you have an object property or a loose promise, the underlying approach of awaiting one browser or page promise repeatedly is the same.
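As a rough illustration of the class-based variant, here is one possible sketch. The BrowserSession name and its launch constructor parameter are assumptions made for this example, not Puppeteer APIs. The launcher is passed in so the class doesn't hard-code its launch options; with Puppeteer you would pass something like () => puppeteer.launch({headless: false}).

```javascript
// Sketch of a small wrapper that owns one cached browser promise.
// `launch` is any function that returns a promise for a browser-like
// object exposing newPage() and close(), as Puppeteer's Browser does.
class BrowserSession {
  constructor(launch) {
    this.launch = launch;
    this.browserPromise = null;
  }

  // Launch lazily on first use; later calls reuse the cached promise.
  browser() {
    if (!this.browserPromise) {
      this.browserPromise = this.launch();
    }
    return this.browserPromise;
  }

  // Open a new page ("tab") in the shared browser.
  async newPage() {
    const browser = await this.browser();
    return browser.newPage();
  }

  // Close the browser if it was ever launched, and clear the cache.
  async close() {
    if (this.browserPromise) {
      const browser = await this.browserPromise;
      this.browserPromise = null;
      await browser.close();
    }
  }
}
```

With this in place, checkout could call session.newPage() twice and then session.close(), and both tabs would come from the same browser. The mechanism is the same as the loose browserPromise above; the class only gives it a home.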

ggorlen
  • Hi @ggorlen, thanks for the detailed explanation. I tried this solution but got an unhandled promise rejection warning, and the new tab didn't open with the required URL. – Ahmad_alisaim Jun 26 '21 at 17:51
  • What was the error message, specifically? As I mentioned in the post, Amazon will probably detect that you're a robot and block the request, causing a timeout (```TimeoutError: waiting for selector `button[class='single_add_to_cart_button button alt']` failed: timeout 30000ms exceeded```). That's a different problem than making the browser object accessible by caching the promise. If you comment out `await addToCart(page);` and `await page.waitForNavigation();` you'll see that the fundamental pattern is sound, then follow the tips in the linked post to bypass the robot detector. – ggorlen Jun 26 '21 at 17:55