I am trying to run Playwright on Google Colab, but I get an error.

I installed Playwright and its browsers:

!pip install playwright 
!playwright install

To run async code in a notebook:

import nest_asyncio
nest_asyncio.apply()
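
For context: a notebook cell already runs inside an event loop, and asyncio.run() refuses to start from within a running loop, which is what nest_asyncio works around. A minimal stdlib-only sketch of that failure, meant to be run as a plain script rather than in a notebook (the names inner and outer are only for illustration):

```python
import asyncio

async def inner():
    return 42

async def outer():
    # We are now inside a running loop, like code in a notebook cell.
    coro = inner()
    try:
        asyncio.run(coro)  # a nested asyncio.run() is rejected
    except RuntimeError as exc:
        coro.close()  # the coroutine never started; close it to avoid a warning
        return str(exc)
    return "no error"

message = asyncio.run(outer())
print(message)
```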

My code:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        # auth.json holds previously saved cookies and local storage
        page = await browser.new_page(storage_state='auth.json')
        await page.goto('https://www.google.com')
        # asyncio.sleep pauses without blocking the event loop (time.sleep would block it)
        await asyncio.sleep(6)
        html = await page.content()

        await asyncio.sleep(5)

        # await browser.close()


asyncio.run(main())

which gives me the following error:

/usr/lib/python3.7/asyncio/futures.py in result(self)
    179         self.__log_traceback = False
    180         if self._exception is not None:
--> 181             raise self._exception
    182         return self._result
    183 

Error: Browser closed.
==================== Browser output: ====================
<launching> /root/.cache/ms-playwright/chromium-1015/chrome-linux/chrome --disable-field-trial-config --disable-background-networking --enable-features=NetworkService,NetworkServiceInProcess --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=ImprovedCookieControls,LazyFrameLoading,GlobalMediaControls,DestroyProfileOnBrowserClose,MediaRouter,DialMediaRouteProvider,AcceptCHFrame,AutoExpandDetailsElement,CertificateTransparencyComponentUpdater,AvoidUnnecessaryBeforeUnloadCheckSync --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --disable-sync --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --no-sandbox --user-data-dir=/tmp/playwright_chromiumdev_profile-IAbW15 --remote-debugging-pipe --no-startup-window
<launched> pid=656
[pid=656][err] src/tcmalloc.cc:283] Attempt to free invalid pointer 0x29000020c5a0 
=========================== logs ===========================
<launching> /root/.cache/ms-playwright/chromium-1015/chrome-linux/chrome --disable-field-trial-config --disable-background-networking --enable-features=NetworkService,NetworkServiceInProcess --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=ImprovedCookieControls,LazyFrameLoading,GlobalMediaControls,DestroyProfileOnBrowserClose,MediaRouter,DialMediaRouteProvider,AcceptCHFrame,AutoExpandDetailsElement,CertificateTransparencyComponentUpdater,AvoidUnnecessaryBeforeUnloadCheckSync --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --disable-sync --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --no-sandbox --user-data-dir=/tmp/playwright_chromiumdev_profile-IAbW15 --remote-debugging-pipe --no-startup-window
<launched> pid=656
[pid=656][err] src/tcmalloc.cc:283] Attempt to free invalid pointer 0x29000020c5a0 
Himanshu Poddar
  • `headless=False` - does Google Colab run with a GUI? Change that to True and try again – Barry the Platipus Jul 22 '22 at 19:17
  • I tried with `headless=True` but it did not work, same error `[pid=258][err] src/tcmalloc.cc:283] Attempt to free invalid pointer 0x19180020c5a0 ` – Himanshu Poddar Jul 23 '22 at 15:21
  • To run headed on Google Colab: `!apt install xvfb; pip install pyvirtualdisplay` and `import pyvirtualdisplay; display = pyvirtualdisplay.Display(); display.start()`; credit goes to [this colab](https://colab.research.google.com/drive/13bQO6G_hzE1teX35a3NZ4T5K-ICFFdB5?usp=sharing). – omegastripes Nov 07 '22 at 23:03

1 Answer

When I try to run the Chromium browser that Playwright downloaded directly, using this command:

!/root/.cache/ms-playwright/chromium-1033/chrome-linux/chrome

it gives this error:

src/tcmalloc.cc:283] Attempt to free invalid pointer 0x18400020c5a0

This suggests the problem is related to the browser build downloaded by Playwright rather than to the script itself, so we can use a different browser.
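
The manual check above can also be scripted. This is only a sketch: probe_binary is a hypothetical helper, and passing --version to the browser is an assumption (the check above ran the binary with no arguments). It is demonstrated on the Python interpreter so that it runs anywhere; on Colab you would pass the chrome path from the error log instead.

```python
import subprocess
import sys

def probe_binary(path, args=("--version",), timeout=30):
    """Run a binary once and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(
        [path, *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the tcmalloc message in the log is on stderr
        universal_newlines=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout

# Stand-in target: the current Python interpreter.
code, output = probe_binary(sys.executable)
print(code, output.strip())
```

A non-zero exit code together with the tcmalloc line in the output would point at the binary itself rather than at the Playwright script.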

First, install Chromium from apt:

!apt install chromium-chromedriver

Then pass executable_path and user_data_dir to launch_persistent_context in your code:

import nest_asyncio
nest_asyncio.apply()

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch_persistent_context(
            executable_path="/usr/bin/chromium-browser",
            user_data_dir="/content/random-user"
        )
        page = await browser.new_page()
        await page.goto("https://google.com")
        title = await page.title()
        print(f"Title: {title}")
        await browser.close()

asyncio.run(main())

I know this is not the right solution, but it works.
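
Rather than hard-coding /usr/bin/chromium-browser, the executable can be looked up on PATH first. A minimal sketch, assuming the browser is installed under one of a few common names (find_browser and the candidate list are hypothetical, not part of Playwright):

```python
import shutil

def find_browser(candidates=("chromium-browser", "chromium", "google-chrome")):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

# Prints a path if one of the candidates is installed, otherwise None.
print(find_browser())
```

The returned path, if any, can then be passed as executable_path.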

Update

Since Google Colab updated to Ubuntu 20.04, installing chromium-browser requires snap, and I couldn't install Chromium with snap.

But in this GitHub comment, someone had already found the real problem (it is about root privileges).

If we follow the same steps and, for testing, save the code below into a file named test_playwright.py on Google Colab:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto("https://google.com")
        title = await page.title()
        print(f"Title: {title}")
        await browser.close()

asyncio.run(main())

Run it with sudo:

!sudo python test_playwright.py

It works.
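
The save-and-run step can also be done from a single notebook cell. This is a sketch under assumptions: run_script is a hypothetical helper, it writes the script to a temporary directory, and it uses sys.executable (the caller's interpreter) where the command above uses plain python. The demo call runs a trivial script without sudo so that it works anywhere.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_script(source, use_sudo=False):
    """Write source to test_playwright.py in a temp dir and run it as a child process."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "test_playwright.py"
        script.write_text(source)
        cmd = [sys.executable, str(script)]
        if use_sudo:
            cmd = ["sudo"] + cmd  # same idea as `!sudo python test_playwright.py`
        return subprocess.run(cmd, capture_output=True, text=True)

# Stand-in script; on Colab, pass the Playwright code and use_sudo=True.
result = run_script("print('Title: ok')")
print(result.returncode, result.stdout.strip())
```

Running the code in a separate process also means the script owns its event loop, so asyncio.run(main()) does not depend on nest_asyncio.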