
I am trying to click the "Create File" button on Facebook's "Download Your Information" page. I can currently navigate to the page and wait for the login process to finish. However, when I try to locate the button using

page.$x("//div[contains(text(),'Create File')]")

nothing is found. The same thing happens when I run the query in the Chrome DevTools console, both in the window Puppeteer controls and in a regular Chrome window outside of it:

[Screenshot: the XPath query returning no results in the DevTools console]

This is the html info for the element:

[Screenshot: HTML information for the offending element]

However, I am able to find the element after I have clicked on it with the Chrome DevTools element inspector:

[Screenshot] (the second print statement is from after I clicked on the element with the element inspector tool)

How should I select this element? I am new to Puppeteer and to XPath, so I apologize if I have missed something obvious.

A few links I remember looking at:

  1. Puppeteer can't find selector
  2. puppeteer cannot find element
  3. puppeteer: how to wait until an element is visible?

My Code:

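// puppeteer-extra wraps puppeteer so that plugins (such as stealth) can be registered with use()
const puppeteer = require("puppeteer-extra");
const StealthPlugin = require("puppeteer-extra-plugin-stealth");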

(async () => {
 let browser;
 try {
   puppeteer.use(StealthPlugin());

   browser = await puppeteer.launch({
     headless: false,
     // path: "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe",
     args: ["--disable-notifications"],
   });
   const pages = await browser.pages();
   const page = pages[0];

   const url = "https://www.facebook.com/dyi?referrer=yfi_settings";

   await page.goto(url);

   //Handle the login process. Since the login page is different from the url we want, I am going to assume the user
   //has logged in if they return to the desired page.
   //Wait for the login page to process
   await page.waitForFunction(
     (args) => {
       return window.location.href !== args[0];
     },
     { polling: "mutation", timeout: 0 },
     [url]
   );

   //Since multifactor auth can temporarily send the user back to the desired url, debounce the check
   //to make sure the user is completely done signing in and there is no further redirect for MFA
   await page.waitForFunction(
     async (args) => {
       // function to make sure there is a debouncing delay between checking the url
       // Taken from: https://stackoverflow.com/a/49813472/11072972
       function delay(delayInms) {
         return new Promise((resolve) => {
           setTimeout(() => {
             resolve(2);
           }, delayInms);
         });
       }

       if (window.location.href === args[0]) {
         await delay(2000);
         return window.location.href === args[0];
       }
       return false;
     },
     { polling: "mutation", timeout: 0 },
     [url]
   );
   // await page.waitForRequest(url, { timeout: 100000 });

   const requestArchiveXpath = "//div[contains(text(),'Create File')]";
   await page.waitForXPath(requestArchiveXpath);
   const [requestArchiveSelector] = await page.$x(requestArchiveXpath);
   // page.click() expects a selector string, so click the element handle directly
   await requestArchiveSelector.click();

   await page.waitForTimeout(3000);
 } catch (e) {
   console.log("End Error: ", e);
 } finally {
   if (browser) {
     await browser.close();
   }
 }
})();
Thomas Sloan
  • I suspect the element is inside an iframe. When you click on an element and inspect it in the DevTools, the console context usually toggles into iframe context and you can find it then. – vsemozhebuty Aug 12 '21 at 18:48
  • @vsemozhebuty Awesome, thank you. It is. I'll look into how to click in an iframe now. – Thomas Sloan Aug 12 '21 at 18:52
  • Please be aware that Facebook does not allow you to scrape them – WizKid Aug 16 '21 at 00:11

1 Answer

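Resolved using the comment above by @vsemozhebuty and source. The button lives inside an iframe, so the XPath query has to run against that iframe's content frame rather than against the page. Only the last few lines inside the try block need to change: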

    const iframeXpath = "//iframe[not(@hidden)]";
    const requestArchiveXpath = "//div[contains(text(),'Create File')]";

    //Wait for and get iframe
    await page.waitForXPath(iframeXpath);
    const [iframeHandle] = await page.$x(iframeXpath);

    //content frame for iframe => https://devdocs.io/puppeteer/index#elementhandlecontentframe
    const frame = await iframeHandle.contentFrame();
    
    //Wait for and get button
    await frame.waitForXPath(requestArchiveXpath);
    const [requestArchiveSelector] = await frame.$x(requestArchiveXpath);

    //click button 
    await requestArchiveSelector.click();
    await page.waitForTimeout(3000);
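
If it is not known in advance which iframe holds the button, one alternative to matching the iframe with "//iframe[not(@hidden)]" is to try every frame on the page. The following is only a sketch: findByXPathInFrames is a hypothetical helper name (it is not part of Puppeteer), and it assumes a Puppeteer version that still provides page.frames() and frame.$x(), as the code in this thread does.

```javascript
// Hypothetical helper (not a Puppeteer API): look for the first element that
// matches an XPath expression in the main frame and in every child frame.
// Assumes frame.$x() is available, as in the Puppeteer version used above.
async function findByXPathInFrames(page, xpath) {
  // page.frames() lists all frames attached to the page, main frame included
  for (const frame of page.frames()) {
    let handles;
    try {
      handles = await frame.$x(xpath);
    } catch (err) {
      // A frame can detach or navigate while it is being queried; skip it
      continue;
    }
    if (handles.length > 0) {
      // Return the frame too, so later waits can target the same frame
      return { frame, handle: handles[0] };
    }
  }
  return null; // no frame contained a match
}

// Possible usage inside the try block from the question:
//   const match = await findByXPathInFrames(page, "//div[contains(text(),'Create File')]");
//   if (match) await match.handle.click();
```

This does not wait for the element to appear, so it would still need to run after the iframe has loaded, for example after the waitForXPath call on the iframe shown above.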
Thomas Sloan