
I'm using Selenium and BeautifulSoup 4 for scraping. The problem is that in my script, 'result' is sometimes empty and sometimes not. I don't understand why it only works some of the time. Is it a security measure on the website, or a RAM problem? I have no idea.

page_source = BeautifulSoup(driver.page_source, "html.parser")

result = page_source.find_all('div', {'class': 'pv-profile-section-pager ember-view'})

Lydia

2 Answers


I would suggest adding some delay, since there's no error as per the OP.

You could put in a time.sleep(5).

If you want to do it properly with Selenium, I would suggest you have a look at explicit waits in the Selenium Python bindings.

Python - Selenium - ExplicitWait

Sample code:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

try:
    # wait up to 10 seconds for the element to appear in the DOM
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()
cruisepandey
  • A few days ago, running the soup.find_all line correctly gave me the information I needed. However, running this now gives me an empty array [] – Lydia Jun 10 '21 at 08:16

Your class name may be wrong, or only partially present on the element. You can try matching it with a filter function instead:

result = page_source.find_all('div', {'class': lambda x: x and 'pv-profile-section-pager' in x})
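A minimal sketch of why the `x and` guard matters: BeautifulSoup can call the filter with None for tags that have no class attribute at all, and the guard avoids a TypeError on `'...' in None`. The class strings below are just illustrative values, not taken from the actual page.

```python
# Same predicate as the lambda above, written as a named function for clarity.
def has_pager_class(x):
    # x may be None when a tag has no class attribute; the guard handles that
    return bool(x and 'pv-profile-section-pager' in x)

print(has_pager_class('pv-profile-section-pager ember-view'))  # True
print(has_pager_class('ember-view'))                           # False
print(has_pager_class(None))                                   # False
```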

Or an iframe HTML tag can also be the problem here: Select iframe using Python + Selenium
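If the content does sit inside an iframe, driver.page_source only contains the outer document, so find_all() comes back empty. A hedged sketch of switching into the frame first (this assumes the page has exactly one relevant iframe, which may not match the real site):

```python
from selenium.webdriver.common.by import By

# switch into the iframe before reading page_source, otherwise only the
# outer document is parsed and find_all() returns []
iframe = driver.find_element(By.TAG_NAME, "iframe")
driver.switch_to.frame(iframe)

# ... scrape here, e.g. BeautifulSoup(driver.page_source, "html.parser") ...

# switch back to the top-level document when done
driver.switch_to.default_content()
```

(No runnable test here, since it needs a live WebDriver session.)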

lam vu Nguyen