
I'm trying to click a link by its href in Puppeteer.

Here is my code:

await page.goto('https://example.net/search/images?search=real+estate+postcards');
await page.waitForSelector(".GridItem-item-tVq", {visible: true});

const els = await page.$$(".GridItem-item-tVq");

els.forEach(async el => {
  const href = await el.$eval('a', a => a.getAttribute('href'));
  await page.click(href)
})

When I console.log href, I get:

/gallery/93911757/High-End-Real-Estate-Postcard-Design/modules/542619723
/gallery/69359021/Postcard-Design/modules/492523237
/gallery/158597277/Post-Card-Design/modules/894845353
/gallery/126032017/Real-Estate-Agent-Postcard/modules/715550465

These are paths relative to the site, so for example example.com/gallery/93911757/High-End-Real-Estate-Postcard-Design/modules/542619723 would go to the page.

I attempt to click on it with this:

els.forEach(async el => {
  const href = await el.$eval('a', a => a.getAttribute('href'));
  await page.click(href)
})

 'Document': '/gallery/93911757/High-End-Real-Estate-Postcard-Design/modules/542619723' is not a valid selector.

Since it says it's not a valid selector, I also tried passing the element itself:

await page.click(el)

But that gives me a different error:

error: selector.startsWith is not a function

Obviously I'm not getting the correct element but how can I get the 'a' to click it?

ggorlen
FabricioG

1 Answer


page.click(href) doesn't make sense because page.click()'s parameter should be a CSS selector, not an href. Let's check the docs (emphasis mine):

class Page {
  click(selector: string, options?: Readonly<ClickOptions>): Promise<void>;
  //    ^~~~~~~~~~~~~~~~
}

page.click(element) makes less sense, because an element handle is not a string at all. You could use await element.click() on a handle, but this seems roundabout for your case since you don't have a handle to the <a> yet.
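
If you do want a real click, one option is to grab the <a> handle from inside the grid item first. This is only a sketch: clickAndWait is a hypothetical helper name, item is assumed to be one of the handles returned by page.$$(".GridItem-item-tVq"), and it assumes the click triggers a full navigation.

```javascript
// Hypothetical helper, not part of Puppeteer. `page` is a Puppeteer Page and
// `item` is an ElementHandle for one grid item.
async function clickAndWait(page, item) {
  const link = await item.$("a"); // resolves to null if the item has no <a>
  if (!link) {
    return false;
  }
  // Start waiting before clicking so a fast navigation isn't missed.
  await Promise.all([page.waitForNavigation(), link.click()]);
  return true;
}
```

You'd call it as await clickAndWait(page, els[0]) from inside an async function.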

It's easy to assume that you need to click like a human would, but since you have the URLs, you can always use a plain old page.goto(href). It's often more reliable than clicking.
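
One catch: the hrefs logged in the question are site-relative paths (that's what getAttribute('href') returns), and page.goto() needs an absolute URL with a scheme. A minimal sketch of resolving them with the standard URL constructor, assuming the search page URL from the question as the base; toAbsolute is a hypothetical helper name:

```javascript
// Resolve a site-relative href against the page it was found on.
// `toAbsolute` is a hypothetical helper, not part of Puppeteer.
function toAbsolute(href, baseUrl) {
  return new URL(href, baseUrl).href;
}

const base = "https://example.net/search/images?search=real+estate+postcards";
const relative = "/gallery/69359021/Postcard-Design/modules/492523237";

console.log(toAbsolute(relative, base));
// https://example.net/gallery/69359021/Postcard-Design/modules/492523237
```

Inside the browser, reading the a.href property instead of the attribute gives you the already-resolved absolute URL, so this step is only needed when you start from the raw attribute value.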

el.$eval("a", a => a.click()); is another useful tool, avoiding visibility issues that sometimes arise with Puppeteer's trusted click. But since you're triggering navigations, you'll want to be careful not to wind up with stale handles from a previous page. I'd pull down all of the hrefs up front, then run a navigation on each one. See Crawling multiple URLs in a loop using Puppeteer.

Also worth reading: Using async/await with a forEach loop. Your page is going to try to click all of those links at the same time, and there'll be no good way to wait for the promises to resolve. You probably want to use a sequential for ... of loop.
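
To make the difference concrete, here's a small sketch with no Puppeteer involved. visit is a hypothetical stand-in for a navigation (it just sleeps for 10 ms), and the log arrays record the order in which things happen:

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Simulated async task standing in for a navigation; `visit` is hypothetical.
async function visit(href, log) {
  log.push(`start ${href}`);
  await sleep(10);
  log.push(`end ${href}`);
}

// forEach ignores the promises returned by its callback, so every task
// starts before any of them finishes, and there is nothing to await.
function withForEach(hrefs, log) {
  hrefs.forEach(async href => {
    await visit(href, log);
  });
}

// for...of inside an async function awaits each task before starting the next.
async function withForOf(hrefs, log) {
  for (const href of hrefs) {
    await visit(href, log);
  }
}

(async () => {
  const parallelLog = [];
  withForEach(["/a", "/b"], parallelLog);
  await sleep(50); // crude wait, needed only because forEach gives us no promise
  console.log(parallelLog); // [ 'start /a', 'start /b', 'end /a', 'end /b' ]

  const sequentialLog = [];
  await withForOf(["/a", "/b"], sequentialLog);
  console.log(sequentialLog); // [ 'start /a', 'end /a', 'start /b', 'end /b' ]
})();
```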

I'd also try to avoid element handle arrays. They're typically harder to work with than an $$eval.

const hrefs = await page.$$eval(
  ".GridItem-item-tVq a",
  els => els.map(el => el.href)
);

for (const href of hrefs) {
  await page.goto(href);
  // do something on the page
}

As an aside, if you're just pulling hrefs from a page and you don't care about the rest of the content loading, try navigating with waitUntil: "domcontentloaded" and blocking unneeded resources, which should shave a couple of seconds off your navigation.
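
A rough sketch of the resource blocking, using Puppeteer's request interception. blockHeavyResources is a hypothetical helper name, and the list of blocked resource types is my assumption; adjust it for the site you're scraping, since some pages need their stylesheets or scripts to render the links you want.

```javascript
// Resource types to skip; this particular list is an assumption, tune it per site.
const BLOCKED_TYPES = new Set(["image", "stylesheet", "font", "media"]);

// Hypothetical helper, not part of Puppeteer. `page` is a Puppeteer Page.
async function blockHeavyResources(page) {
  await page.setRequestInterception(true);
  page.on("request", request => {
    if (BLOCKED_TYPES.has(request.resourceType())) {
      request.abort();
    } else {
      request.continue();
    }
  });
}

// Usage inside an async function, after creating `page`:
//   await blockHeavyResources(page);
//   await page.goto(url, {waitUntil: "domcontentloaded"});
```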

ggorlen