
I am using pyppeteer (the Python port of Puppeteer) to do some light crawling of about 2K pages, but I keep seeing this error recur:

  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 106, in evaluateHandle
    'userGesture': True,
pyppeteer.errors.NetworkError: Protocol error (Runtime.callFunctionOn): Cannot find context with specified id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...
  File "/user_code/main.py", line 434, in main_program
    crawl_data = asyncio.get_event_loop().run_until_complete(crawl(browser, url))
  File "/opt/python3.7/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
    return future.result()
  File "/user_code/main.py", line 394, in crawl
    title = await page.title()
  File "/env/local/lib/python3.7/site-packages/pyppeteer/page.py", line 1437, in title
    return await frame.title()
  File "/env/local/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 752, in title
    return await self.evaluate('() => document.title')
  File "/env/local/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 295, in evaluate
    pageFunction, *args, force_expr=force_expr)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 55, in evaluate
    pageFunction, *args, force_expr=force_expr)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 109, in evaluateHandle
    _rewriteError(e)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 238, in _rewriteError
    raise type(error)(msg)
pyppeteer.errors.NetworkError: Execution context was destroyed, most likely because of a navigation.

I don't understand why the error involves frame.title(): my code only asks for the page's own title and never looks for a title inside its frames.

Also, the code calls page.title() BEFORE it touches any frame content at all:

    try:
        # navigation timeout of 12 seconds (the value is in milliseconds)
        response = await page.goto(
            url,
            {'timeout': 12000}
        )
        # give up on anything other than a 200 response
        if response.status != 200:
            await page.close()
            return False
    except TimeoutError:
        return False
    except Exception as e:
        print(e)
        return False

    # I had this in before, but it caused too many timeouts; the error persists without it
    # await page.waitForNavigation()

    try:
        source_code = await page.content()
    except Exception:
        return False

    # title
    title = await page.title()
    title = title[:1000]

    # get all the frames
    frames = page.frames
    content = ""
    for frame in frames:
        content_new = await frame.content()
        content += content_new

    await page.close()

What is the likely cause of this recurring error?
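
Since the message itself points to a navigation, one workaround I could try is retrying the call when this particular error shows up. Below is a minimal sketch of that idea, not a confirmed fix. The names `retry_on_context_loss` and `CONTEXT_LOST_MARKERS` and the `attempts` and `delay` parameters are hypothetical (they are not part of pyppeteer), and matching on the message text copied from the traceback above is an assumption.

```python
import asyncio

# Message fragments copied from the traceback above.
# Matching on message text is an assumption, not a pyppeteer API.
CONTEXT_LOST_MARKERS = (
    "Execution context was destroyed",
    "Cannot find context with specified id",
)


async def retry_on_context_loss(make_coro, attempts=3, delay=0.5):
    """Await make_coro(), retrying only if the error looks like a lost context.

    make_coro is a zero-argument callable returning a fresh awaitable,
    for example the bound method page.title (hypothetical usage).
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_coro()
        except Exception as e:
            lost = any(marker in str(e) for marker in CONTEXT_LOST_MARKERS)
            # Re-raise unrelated errors, and the last failed attempt.
            if not lost or attempt == attempts:
                raise
            # Give the page a moment to settle before trying again.
            await asyncio.sleep(delay)
```

In the code above this would be used as `title = await retry_on_context_loss(page.title)`. Whether a retry is the right fix depends on why the page navigates again after `page.goto` returns, which I have not confirmed.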
