Web scraping Content Overflow

Question

I'm trying to scrape a local site using BeautifulSoup in JupyterLab, but the site has a single page with a very large amount of content. When I run this code:

import requests
from bs4 import BeautifulSoup
import re
import string

login_url = 'http://192.168.1.18/index.php?go=login'
login_success = 'http://192.168.1.18/cashier'

payload = {
    'is_submitted': 1,
    'username': 'admin',
    'password': 'admin',
    'submit': 'Submit',
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.64',
}

# Log in with a session so the cookies persist for the next request.
s = requests.session()
r = s.post(login_url, data=payload)
soup = BeautifulSoup(r.content, 'html.parser')

# Fetch the page behind the login and print all of its HTML.
req = s.get(login_success, headers=headers)
soups = BeautifulSoup(req.content, 'html.parser')
print(soups.prettify())

it throws this error:

IOPub data rate exceeded. The Jupyter server will temporarily stop sending output to the client in order to avoid crashing it. To change this limit, set the config variable --ServerApp.iopub_data_rate_limit. Current values: ServerApp.iopub_data_rate_limit=1000000.0 (bytes/sec) ServerApp.rate_limit_window=3.0 (secs)

I have already tried the fix from "IOPub data rate exceeded in Jupyter notebook (when viewing image)"; see that question for more details.

Does this answer your question? [IOPub data rate exceeded in Jupyter notebook (when viewing image)](https://stackoverflow.com/questions/43288550/iopub-data-rate-exceeded-in-jupyter-notebook-when-viewing-image) — αԋɱҽԃ αмєяιcαη, Jul 19 '21 at 07:46
@αԋɱҽԃαмєяιcαη see my answer for why it is not a duplicate; it might become a dupe once Notebook migrates to Jupyter Server in v7.0 (see https://stackoverflow.com/a/67804732/6646912) or so, but it requires a different solution as for now. — krassowski, Jul 19 '21 at 08:29

score 1 · Accepted Answer · answered Jul 19 '21 at 08:26

Please note that this is not an error: your code runs fine. Jupyter is protecting your browser from crashing by refusing to display too much content at once. The computation still runs underneath; only the printed output is suppressed. Try printing just the first 1000 characters or so.
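A minimal sketch of that suggestion, assuming the page text is already available as a string. The helper name `preview` and its `limit` parameter are made up for illustration and are not part of requests or BeautifulSoup:

```python
def preview(text, limit=1000):
    """Return at most `limit` characters of `text`, noting how much was cut."""
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    return f"{text[:limit]}\n... [{omitted} more characters not shown]"


# Stand-in for soups.prettify() from the question (assumed to be a long string).
html = "<p>row</p>\n" * 50_000
print(preview(html))
```

With the question's code this would be `print(preview(soups.prettify()))`. If you need the whole page, writing it to a file instead of printing it also avoids the limit.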

As for the question suggested in the comments as a duplicate: its solution does need adjusting for JupyterLab 3.0+. Note that the option now belongs to ServerApp rather than NotebookApp:

jupyter lab --ServerApp.iopub_data_rate_limit=1.0e10

Also, if you want to store the setting in a file, it should be jupyter_server_config.py, not jupyter_notebook_config.py. You can generate it with:

jupyter server --generate-config

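and then changing the ServerApp.iopub_data_rate_limit traitlet (1000000 is the current value reported in the message above, so set a larger number to actually raise the limit), like: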

c.ServerApp.iopub_data_rate_limit = 1000000

There are other traitlets that might be of interest too:

c.ServerApp.iopub_msg_rate_limit = 1000
c.ServerApp.rate_limit_window = 3
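To get a feel for how these numbers interact, here is a rough back-of-the-envelope check. It assumes the server averages the bytes sent over one rate_limit_window and compares that average with iopub_data_rate_limit; this is a simplification, and the function name `would_exceed_data_rate` is hypothetical, not a Jupyter API:

```python
def would_exceed_data_rate(n_bytes, rate_limit=1_000_000.0, window=3.0):
    """Rough check: average bytes/sec over one window versus the limit.

    Assumption: output bytes are averaged over `window` seconds, so with the
    values from the message above roughly 3 MB fit in one window.
    """
    return n_bytes / window > rate_limit


# Stand-in for a large printed string such as soups.prettify().
output = "x" * 5_000_000
size = len(output.encode("utf-8"))
print(size, would_exceed_data_rate(size))
```

Under this assumption, raising either the data rate limit or the window makes room for larger outputs, though very large outputs can still slow the browser down.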
