Questions tagged [pyppeteer]

Unofficial Python port of Puppeteer, the JavaScript library for automating (headless) Chrome/Chromium browsers.


Pyppeteer is mostly used to:

  1. Generate screenshots and PDFs of pages.
  2. Crawl an SPA and generate pre-rendered content (i.e. "SSR").
  3. Scrape content from websites.
  4. Automate form submission, UI testing, keyboard input, etc.
  5. Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  6. Capture a timeline trace of your site to help diagnose performance issues.
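As a rough illustration of the first use case, here is a minimal sketch that saves a screenshot and a PDF of a page. The function name `capture` and the output file names are placeholders invented for this example, and it assumes pyppeteer is installed and a Chromium build is available to it.

```python
import asyncio


async def capture(url: str, png_path: str = "page.png", pdf_path: str = "page.pdf") -> None:
    """Save a full-page screenshot and a PDF of `url` (paths are illustrative)."""
    # Imported lazily so this module still loads where pyppeteer is not installed.
    from pyppeteer import launch

    # PDF generation requires headless mode.
    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        # Wait until network activity has mostly settled before capturing.
        await page.goto(url, {"waitUntil": "networkidle2"})
        await page.screenshot({"path": png_path, "fullPage": True})
        await page.pdf({"path": pdf_path, "format": "A4"})
    finally:
        # Always close the browser so no Chromium process is left behind.
        await browser.close()
```

One way to run it is `asyncio.get_event_loop().run_until_complete(capture("https://example.com"))`; the URL is only an example.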

Resources:

Differences from puppeteer

185 questions
19
votes
7 answers

pyppeteer.errors.BrowserError: Browser closed unexpectedly

Today I learned about the library called pyppeteer. When I run my code: import asyncio from pyppeteer import launch async def main(): browser = await launch(options={'devtools': True, 'headless': False}) page = await browser.newPage() await…
Sugan Zhang
14
votes
4 answers

nbconvert failed: Pyppeteer is not installed to support Web PDF conversion. Please install `nbconvert[webpdf]` to enable

I get the error "nbconvert failed: Pyppeteer is not installed to support Web PDF conversion. Please install nbconvert[webpdf] to enable." while trying to download a Jupyter notebook file (.ipynb) as a PDF file.
Girish Shenoy
10
votes
5 answers

Pyppeteer: Browser closed unexpectedly in AWS Lambda

I'm running into this error in AWS Lambda. It appears that the devtools websocket is not up. Not sure how to fix it. Any ideas? Thanks for your time. Exception originated from get_ws_endpoint() due to websocket response timeout…
Sudhakar
8
votes
1 answer

Running pyppeteer in Flask gives ValueError: signal only works in main thread

I am trying to integrate pyppeteer into a Flask app. I have a Python script that runs pyppeteer and takes a screenshot of a page. This works fine if I run the script on its own. The problem is that the same script does not work when I run it in a Flask…
7
votes
3 answers

Connect with pyppeteer to existing chrome

I want to connect to an existing (already opened, by the user, without any extra flags) Chrome browser using pyppeteer so I would be able to control it. I can do almost every manual action before (for example, enabling remote debugging mode in…
Noam
6
votes
2 answers

How to disable Images/CSS in Pyppeteer?

How to disable images/CSS in Puppeteer? I've seen this tutorial https://www.scrapehero.com/how-to-increase-web-scraping-speed-using-puppeteer/ but I don't know how to translate it to Python
Leo
6
votes
1 answer

Python: keep open browser in pyppeteer and create CDPSession

I've got two issues that I can't solve at the moment. 1. I would like to keep the browser running so I could just re-connect using the pyppeteer.launcher.connect() function, but it seems to be closed immediately even if I don't call…
HTF
5
votes
1 answer

Python program hangs after asyncio exception due to Pyppeteer unexpectedly closing

My pyppeteer connection unexpectedly closed, and it left my Python program hanging instead of shutting down and properly logging the error. Does anyone know how to properly catch this exception and properly exit from Python program? Here is part of…
MasayoMusic
5
votes
1 answer

Downloading pdf files using playwright-python

I'm trying to download PDF files that are rendered in a browser (not shown as a popup or downloaded) using playwright (Python). No URL is exposed, so you can't simply scrape a link and download it using requests.get("file_url"). I've tried: async…
FarNorth
5
votes
1 answer

pyppeteer wait until all elements of page is loaded

I am using pyppeteer to trigger headless chrome and perform some actions. But first I want all the elements of the web page to load completely. The official documentation of pyppeteer suggests a waitUntil parameter which comes with more than 1…
Mahesh
5
votes
1 answer

Scraping data with pyppeteer

I'm trying to scrape data from this site https://quickfs.net/company/BABA:US using pyppeteer, without the website knowing that I'm scraping. So my first question is: Is it correct that when using pyppeteer for scraping I won't be noticed (by the…
TaL
5
votes
1 answer

Python pyppeteer proxy usage

I want to run chromium browser using auth proxy. I have this code, but chromium does not connect via the proxy. Any suggestions please? import asyncio from pyppeteer import launch async def main(): browser = await launch({'http_proxy':…
Jan Šuman
5
votes
0 answers

Pyppeteer / Puppeteer NetworkError: Execution context was destroyed, most likely because of a navigation

I am using puppeteer to do some light crawling of ~2K pages, but I keep seeing this error recurring: File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 106, in evaluateHandle 'userGesture':…
24x7
5
votes
2 answers

Python: Pyppeteer clicking on pop up window

I'm trying to accept the cookies consent on a pop up window that is generated on this page. I tried to use waitForSelector but none of the selectors that I used seems to be visible to the headless browser. I would like to actually switch to "YES"…
HTF
4
votes
0 answers

Why am I getting: "Future exception was never retrieved" error

I am using asyncio and pyppeteer to test scraping sites. Currently I have: browser = await launch( args=[f'--proxy-server={proxyUrl}'], headless=True, autoClose=False ) to launch the browser. I am using autoClose=False…
ThySpecter