
I want to close the page and browser when the innerHTML of an element changes from "online" to "offline".

Currently I am doing this by checking every 10 seconds.

import puppeteer from 'puppeteer';

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example-chat-app.com');
console.log('chat website has opened')

const timer = setInterval(async () => {
    const status = await page.evaluate(`document.querySelector('#statusDiv').innerHTML`)
    if (status === 'offline') {
        clearInterval(timer) // stop polling once the status has changed
        await page.close()
        await browser.close()
    }
}, 1000 * 10);

I guess there should be a way using page.exposeFunction() and MutationObserver but I'm not sure if these will help in my case.

How can I create an event listener for this innerHTML change so I can avoid checking every 10 seconds?

ggorlen
Alok

1 Answer


It looks like .textContent would be more precise than .innerHTML here, since the element is only supposed to contain text. I'll assume that it's a text-only element going forward, but if for some reason there is some HTML you want to consider in the comparison, feel free to swap .innerHTML into the code below.


waitForFunction is the most general "poll on any predicate" tool in Puppeteer. You can think of waitForSelector, waitForResponse and so forth as specialized common cases of waitForFunction.

waitForFunction lets you poll on DOM mutation (polling: "mutation"), on requestAnimationFrame (polling: "raf"), or at a fixed interval in milliseconds, so it's a specialization of calling evaluate and installing your own polling mechanism that tests a predicate repeatedly.
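
Applied directly to the question's page, that idea might look roughly like the sketch below. The helper name closeWhenOffline is hypothetical (it is not part of Puppeteer), and it assumes the #statusDiv selector and the "offline" text from the question:

```javascript
// Hypothetical helper (not a Puppeteer API): waits until the status
// element's text becomes "offline", then closes the page and browser.
// Assumes `page` and `browser` come from puppeteer.launch()/newPage().
const closeWhenOffline = async (
  page,
  browser,
  sel = "#statusDiv",
  opts = {polling: "mutation", timeout: 30000}
) => {
  await page.waitForFunction(
    // Runs in the browser; with "mutation" polling it is re-checked
    // when the DOM changes rather than on a timer.
    sel => document.querySelector(sel)?.textContent.trim() === "offline",
    opts,
    sel
  );
  await page.close();
  await browser.close();
};
```

If the text never changes within the timeout, the waitForFunction promise rejects and nothing is closed, so the caller still has to decide how to clean up in that case.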

exposeFunction is an interesting idea but won't help much here, since you'll still need to inject code in the browser with an evaluate or similar method anyway. I don't use exposeFunction much.
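
For comparison, here is roughly what the exposeFunction plus MutationObserver route could look like. This is only a sketch: the helper name waitForTextViaObserver and the generated binding name are hypothetical, and it assumes the element already exists when the helper is called. Note that it still needs an evaluate call to install the observer, which is the overhead mentioned above:

```javascript
// Hypothetical helper: resolves with the element's text once it equals
// `expected`. Assumes `page` is a Puppeteer Page and `sel` matches an
// element that is already in the DOM.
const waitForTextViaObserver = async (page, sel, expected) => {
  // Generated name so repeated calls don't reuse an existing binding.
  const name = `__textChanged_${Date.now()}_${Math.floor(Math.random() * 1e6)}`;
  let resolveText;
  const done = new Promise(resolve => (resolveText = resolve));

  // Node-side callback, reachable from the page as window[name].
  await page.exposeFunction(name, text => resolveText(text));

  // Browser-side code: install the observer and report a match.
  await page.evaluate(
    (sel, expected, name) => {
      const el = document.querySelector(sel);
      if (!el) throw new Error(`No element matches ${sel}`);
      const check = () => {
        const text = el.textContent.trim();
        if (text === expected) {
          observer.disconnect();
          window[name](text);
        }
      };
      const observer = new MutationObserver(check);
      observer.observe(el, {
        childList: true,
        characterData: true,
        subtree: true,
      });
      check(); // covers the case where the text already matches
    },
    sel,
    expected,
    name
  );
  return done;
};
```

Unlike waitForFunction, this sketch has no timeout of its own, so the returned promise stays pending forever if the text never becomes the expected value.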

Here's a generic wrapper on waitForFunction to wait for text to change on an element:

const waitForTextChange = async (
  page,
  sel,
  opts={polling: "mutation", timeout: 30000}
) => {
  const el = typeof sel === "string" ?
    (await page.waitForSelector(sel)) : sel;
  const originalText = await el.evaluate(el => el.textContent);
  return page.waitForFunction(
    (el, originalText) => el.textContent !== originalText,
    opts,
    el,
    originalText,
  );
};

Sample usage:

const puppeteer = require("puppeteer"); // 19.1.0

const html = `
<div id="statusDiv">online</div>
<script>
setTimeout(
  () => document.querySelector("#statusDiv")
    .textContent = "offline",
  4000
);
</script>`;

let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setContent(html);
  const sel = "#statusDiv";
  console.log(await page.$eval(sel, el => el.textContent)); // => online
  await waitForTextChange(page, sel);
  console.log(await page.$eval(sel, el => el.textContent)); // => offline
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());

The benefit is that you don't need to specify what the text is changing to, only whether it changed.

If you want to be particular about the text you want to wait for, you can use page.waitForSelector("text/offline") or an XPath instead:

await page.waitForSelector('xpath///*[@id="statusDiv" and text()="offline"]');

or the negation:

await page.waitForSelector('xpath///*[@id="statusDiv" and not(text()="online")]');

See this answer for different approaches that allow for whitespace and substrings in your XPath.
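
As a rough sketch of two such variations, the standard XPath 1.0 functions normalize-space() and contains() can be combined with the id test. The helper name statusXPath is hypothetical, and it assumes neither the id nor the text contains a double quote:

```javascript
// Hypothetical helper: builds a Puppeteer "xpath/" selector for the element
// with the given id whose text matches `text`.
// Assumes `id` and `text` contain no double-quote characters.
const statusXPath = (id, text, {substring = false} = {}) => {
  const test = substring
    ? `contains(normalize-space(), "${text}")` // text appears anywhere
    : `normalize-space()="${text}"`; // exact match after trimming whitespace
  return `xpath///*[@id="${id}" and ${test}]`;
};

// Usage, given a Puppeteer `page`:
// await page.waitForSelector(statusXPath("statusDiv", "offline"));
// await page.waitForSelector(statusXPath("statusDiv", "off", {substring: true}));
```

normalize-space() also collapses runs of internal whitespace, so multi-word text should be written with single spaces.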

If you want to wait longer than the default 30 seconds, pass a larger timeout in milliseconds (10e7 ms is about 28 hours):

await page.waitForSelector(
  'xpath///*[@id="statusDiv" and text()="offline"]',
  {timeout: 10e7}
);

timeout: 0 waits forever, but I prefer a finite timeout as a safety hatch: it avoids a zombie process and eventually produces a stack trace so I can fix the problem.

Specifying the timeout works on the waitForTextChange or waitForFunction options as well.
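
When one of these waits does hit its timeout, the promise rejects. Here's a minimal sketch of telling that case apart from other failures. The helper name settleOnTimeout is hypothetical, and it assumes the rejection is an error whose name property is "TimeoutError":

```javascript
// Hypothetical helper: awaits a Puppeteer wait promise and returns
// {timedOut: false, value} on success or {timedOut: true} on timeout.
// Assumption: timeouts reject with an error whose `name` is "TimeoutError".
// Any other error is rethrown unchanged.
const settleOnTimeout = async waitPromise => {
  try {
    return {timedOut: false, value: await waitPromise};
  } catch (err) {
    if (err?.name === "TimeoutError") {
      return {timedOut: true};
    }
    throw err;
  }
};

// Usage, given a Puppeteer `page`:
// const result = await settleOnTimeout(
//   page.waitForSelector(
//     'xpath///*[@id="statusDiv" and text()="offline"]',
//     {timeout: 5000}
//   )
// );
// if (result.timedOut) console.error("status never changed to offline");
```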

ggorlen