24

I'm trying something really simple:

  • Navigate to google.com
  • Fill the search box with "cheese"
  • Press enter on the search box
  • Print the text for the title of the first result

So simple, but I can't get it to work. This is the code:

const playwright = require('playwright');

(async () => {
  for (const browserType of ['chromium', 'firefox', 'webkit']) {
    const browser = await playwright[browserType].launch();
    try {
      const context = await browser.newContext();
      const page = await context.newPage();
      await page.goto('https://google.com');
      await page.fill('input[name=q]', 'cheese');
      await page.press('input[name=q]', 'Enter');
      await page.waitForNavigation();

      page.waitForSelector('div#rso h3')
          .then(firstResult => console.log(`${browserType}: ${firstResult.textContent()}`))
          .catch(error => console.error(`Waiting for result: ${error}`));
    } catch(error) {
      console.error(`Trying to run test on ${browserType}: ${error}`);
    } finally {
      await browser.close();
    }
  }
})();

At first I tried to get the first result with page.$(), but it didn't work. After investigating a little, I found page.waitForNavigation(), which I thought would be the solution, but it isn't.

I'm using the latest playwright version: 1.0.2.

hardkoded
  • 1
    This is probably not your issue but for anyone else googling this error: you get "target closed" if your test times out during a `waitForNavigation` call. In this case, it will also say `Timeout of XXXms exceeded.` a bit higher up in the console output. – ehrencrona Dec 21 '21 at 18:07

4 Answers

10

It seems to me that the only problem was your initial promise composition. I refactored the promise chain to async/await and used page.$eval to retrieve the textContent. With that change it works, and there are no more "Target closed" errors.

try {
      const context = await browser.newContext();
      const page = await context.newPage();
      await page.goto('https://google.com');
      await page.fill('input[name=q]', 'cheese');
      await page.press('input[name=q]', 'Enter');
      await page.waitForNavigation();

      // page.waitForSelector('div#rso h3').then(firstResult => console.log(`${browserType}: ${firstResult.textContent()}`)).catch(error => console.error(`Waiting for result: ${error}`));

      await page.waitForSelector('div#rso h3');
      const firstResult = await page.$eval('div#rso h3', firstRes => firstRes.textContent);
      console.log(`${browserType}: ${firstResult}`)
    } catch(error) {
      console.error(`Trying to run test on ${browserType}: ${error}`);
    } finally {
      await browser.close();
    }
  }

Output:

chromium: Cheese – Wikipedia
firefox: Cheese – Wikipedia
webkit: Cheese – Wikipedia

Note: Chromium and WebKit work, but Firefox fails on waitForNavigation for me. When I replaced it with await page.waitForTimeout(5000); Firefox worked as well. It might be an issue with Playwright's Firefox support for the navigation promise.
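
The promise composition problem can be reproduced without a browser. The sketch below is only an illustration: makeFakeBrowser and run are hypothetical stand-ins, not Playwright APIs. It assumes, as in the question's code, that the waitForSelector chain is not awaited, so the finally block closes the browser while the wait is still pending.

```javascript
// Hypothetical stand-in for a browser: waiting fails once close() has been called.
function makeFakeBrowser() {
  let closed = false;
  return {
    close: async () => { closed = true; },
    waitForSelector: selector => new Promise((resolve, reject) => {
      // The "element" shows up after 10 ms, unless the browser is already gone.
      setTimeout(() => {
        if (closed) reject(new Error('Target closed'));
        else resolve(selector);
      }, 10);
    }),
  };
}

// Runs the same try/finally shape as the question, with or without awaiting the wait.
async function run(awaitTheWait) {
  const browser = makeFakeBrowser();
  const log = [];
  let pending;
  try {
    pending = browser.waitForSelector('h3')
      .then(found => log.push(`found ${found}`))
      .catch(error => log.push(`error: ${error.message}`));
    if (awaitTheWait) await pending; // the fix: do not leave the promise floating
  } finally {
    await browser.close(); // runs immediately when nothing above was awaited
  }
  await pending; // only so this demo can report what happened
  return log;
}
```

Here run(false) ends with the "Target closed" error, while run(true) finds the element, because finally only runs after the wait has settled.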

theDavidBarton
3

If you await page.press('input[name=q]', 'Enter'); it might be too late for waitForNavigation to work.

You could remove the await on the press call. You need to wait for the navigation, not the press action.

const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://google.com');
await page.fill('input[name=q]', 'cheese');
page.press('input[name=q]', 'Enter');
await page.waitForNavigation();

var firstResult = await page.waitForSelector('div#rso h3');
console.log(`${browserType}: ${await firstResult.textContent()}`);

Also notice that you need to await textContent().
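
To illustrate why the await matters, here is a small sketch that needs no browser. fakeHandle is a hypothetical stand-in for the element handle returned by waitForSelector; the only assumption carried over from Playwright is that textContent() returns a promise.

```javascript
// Hypothetical stand-in for an element handle whose textContent() is async.
const fakeHandle = {
  textContent: async () => 'Cheese - Wikipedia',
};

// Interpolating the promise itself stringifies the Promise object, not the text.
function describeWithoutAwait(handle) {
  return `first result: ${handle.textContent()}`;
}

// Awaiting first interpolates the resolved text.
async function describeWithAwait(handle) {
  return `first result: ${await handle.textContent()}`;
}
```

describeWithoutAwait(fakeHandle) yields "first result: [object Promise]", which is what the question's console.log would print even if the selector were found.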

hardkoded
  • 2
    Whether or not you await page.press() is irrelevant, I think. I have tried your suggestion of not awaiting it and I still get the same error. That's not the problem. The problem is that it somehow loses the context after that navigation. – Daniel Hernández Alcojor May 22 '20 at 06:32
  • 1
    I'm having the same issue. It's using the context of the page before it navigates. Did you ever find a solution for this? – Justin Young Sep 23 '20 at 18:24
2
  1. In my case, the Playwright error "Target closed" appeared at the first attempt to retrieve text from the page. The error is misleading: the actual cause was that Basic Auth was enabled on the target site. Playwright could not open the page and just got stuck with "Target closed". The fix is to pass the credentials when creating the context:

    const options = {
       httpCredentials: { username: 'user', password: 'password' }
    };
    const context = await browser.newContext(options);
    
  2. One more issue was that local tests ran without a problem, including in Docker containers, while GitHub CI was failing with Playwright without any details except the above error.
    The cause was a special symbol in a GitHub secret. For example, the dollar sign $ is simply removed from the secret in GitHub Actions. To correct it, either use the env: section

    env:
      XXX: ${{ secrets.SUPER_SECRET }}
    

    or wrap the secret in single quotes:

    run: |
      export XXX='${{ secrets.YYY}}'
    

    A similar escaping quirk exists in Kubernetes, Docker and GitLab; $$ becomes $ and z$abc becomes z.

  3. Use the mcr.microsoft.com/playwright Docker image from Microsoft, which comes with Node, npm and Playwright pre-installed. Alternatively, when installing Playwright yourself, do not forget to install the system package dependencies by running npx playwright install-deps.

  4. A VM should have enough resources to handle browser tests. This is a common problem in CI/CD workflows.
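
Points 1 and 2 can be combined into a small guard that fails with a readable message when the credentials never reach the test run. This is a minimal sketch: buildContextOptions is a hypothetical helper, and the variable names BASIC_AUTH_USER and BASIC_AUTH_PASSWORD are assumptions for illustration. It can only catch a secret that arrives empty, not one that was silently altered (for example by a stripped $).

```javascript
// Hypothetical helper: builds newContext() options from environment variables.
// BASIC_AUTH_USER and BASIC_AUTH_PASSWORD are assumed names, not a convention.
function buildContextOptions(env = process.env) {
  const username = env.BASIC_AUTH_USER;
  const password = env.BASIC_AUTH_PASSWORD;
  if (!username || !password) {
    // Fail early with a clear message instead of a later "Target closed".
    throw new Error('Basic Auth credentials are missing; check how the CI secret is passed');
  }
  return { httpCredentials: { username, password } };
}

// Usage with a real browser (not executed here):
// const context = await browser.newContext(buildContextOptions());
```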

Artur A
  • Artur, are you able to share some configurations for browser launch and dockerfile? I am stuck on a similar problem. I am running automated browsers in GKE and seeing same error. Running them locally on my host laptop or running a docker container on my laptop works perfectly fine. – ashkaps Feb 25 '22 at 13:05
2

Be sure to await all promises and avoid combining then with await.

//vvv
await page.waitForSelector('div#rso h3')
//^^^

Note that await page.waitForNavigation(); can cause a race condition if called after the event that triggers the navigation. I generally avoid waitForNavigation in favor of waiting for a selector or condition that appears on the next page. This typically results in faster, shorter, more reliable code.
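
As a sketch of that approach, the flow from the question can be written without waitForNavigation at all. searchAndReadFirstTitle is a hypothetical helper name; page is assumed to be a Playwright Page, and the selectors are the ones from the question, which may not match Google's current markup.

```javascript
// Sketch: `page` is assumed to be a Playwright Page created elsewhere.
async function searchAndReadFirstTitle(page, query) {
  await page.goto('https://google.com');
  await page.fill('input[name=q]', query);
  await page.press('input[name=q]', 'Enter');
  // No waitForNavigation: wait for something that only exists on the results page.
  const firstResult = await page.waitForSelector('div#rso h3');
  return firstResult.textContent();
}

// Usage inside the question's loop (not executed here):
// console.log(`${browserType}: ${await searchAndReadFirstTitle(page, 'cheese')}`);
```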

If you do use waitForNavigation, start it alongside the trigger with Promise.all, or set it up before the nav trigger event (in this case, press).
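
A minimal sketch of the Promise.all variant follows. pressEnterAndWaitForNavigation is a hypothetical helper name, and page is assumed to be a Playwright Page. The point is only the ordering: the navigation wait is registered before the key press can trigger the navigation.

```javascript
// Sketch: start waiting for the navigation before triggering it.
async function pressEnterAndWaitForNavigation(page, selector) {
  await Promise.all([
    page.waitForNavigation(),      // registered first, so the event cannot be missed
    page.press(selector, 'Enter'), // then the action that causes the navigation
  ]);
}

// Usage (not executed here):
// await pressEnterAndWaitForNavigation(page, 'input[name=q]');
```

Both promises are created before either is awaited, which is what removes the race.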


After these adjustments, if your goal is to get the data as quickly and reliably as possible rather than test the steps along the way, there's room for improvement.

It's often unnecessary to navigate to a landing page, then type into a box in order to run a search. It's typically faster and less error-prone to navigate directly to the results page with your query encoded into the URL. In this case, your code can be reduced to

const url = "https://www.google.com/search?q=cheese";
await page.goto(url, {waitUntil: "networkidle"});
console.log(await page.textContent(".fP1Qef h3"));

If you notice that the text you want is in the static HTML as is the case here, you can go a step further and block JS and external resources:

const playwright = require("playwright"); // ^1.30.1

let browser;
let context;
(async () => {
  browser = await playwright.chromium.launch();
  context = await browser.newContext({javaScriptEnabled: false});
  const page = await context.newPage();
  const url = "https://www.google.com/search?q=cheese";
  await page.route("**", route => {
    if (route.request().url().startsWith(url)) {
      route.continue();
    }
    else {
      route.abort();
    }
  });

  // networkidle is a suboptimal way to handle redirection
  await page.goto(url, {waitUntil: "networkidle"});
  console.log(await page.locator(".fP1Qef h3").allTextContents());
})()
  .catch(err => console.error(err))
  .finally(async () => {
    await context?.close();
    await browser?.close();
  });

Once you block JS and all external resources, you can often go all the way to the holy grail of web scraping: skip browser automation entirely and use an HTTP request and a lightweight HTML parser instead:

const cheerio = require("cheerio"); // 1.0.0-rc.12

const query = "cheese";
const url = `https://www.google.com/search?q=${encodeURIComponent(query)}`;

fetch(url, { // Node 18 or install node-fetch
  headers: {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36",
  }
})
  .then(res => res.text())
  .then(html => {
    const $ = cheerio.load(html);
    console.log($(".LC20lb").first().text()); // first result
    console.log([...$(".LC20lb")].map(e => $(e).text())); // all results
  });
ggorlen