
I am trying to scrape house price data using Selenium and BeautifulSoup. Here is the code I am using:

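import pickle
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.edge.service import Service

driver_path = r"C:\Users\berid\python\webdriver\msedgedriver.exe"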
service = Service(driver_path)
driver = webdriver.Edge(service=service)

articles_list=[]
for city_url in city_urls:
    # defining maximum number of pages for each city_url
    driver.get(f'https://www.housing_website.com/city/{city_url}/page-1')
    time.sleep(5)
    main_html=driver.page_source
    
    main_soup=BeautifulSoup(main_html,'html.parser')
    max_pages_tag=main_soup.select_one('div[class="descriptionAndModeContainer"] div[class="homes summary"]')
    max_pages=int(int(max_pages_tag.text.split('of')[-1].split('home')[0].strip())/40) if max_pages_tag else None
    
    for page in range(1,500):
        try:
            page_url = f'https://www.housing_website.com/city/{city_url}/page-{page}'
            driver.get(page_url)
            time.sleep(5)
            html = driver.page_source
            soup = BeautifulSoup(html, 'html.parser')
        
            articles=soup.select('div[class="map collapsedList"] div[class="HomeCardContainer defaultSplitMapListView"] div[class="bottomV2"]')
            # Collect the text of every listing card on this page
            articles_list.extend(article.text for article in articles)
            print(city_url.split('/')[-1],f'{page} out of {max_pages}')

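        except Exception as exc:
            # Report the failure instead of silently swallowing it, then stop this city
            print(f'Stopped at page {page}: {exc!r}')
            break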
        if page%10==0:
            # Save the last 10 pages (page-9 .. page) and close the file handle promptly
            with open(f'csv_files/home_prices/{city_url.split("/")[-1]}_{page-9}-{page}.pickle','wb') as f:
                pickle.dump(articles_list,f)
            articles_list=[]
        if max_pages is not None and page > max_pages:
            break
        elif max_pages is None and page==101:
            break
driver.quit()
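
The page-count calculation in the code above can be isolated into a small helper, which makes it easier to check on its own. This is only a sketch: `pages_from_summary` is a hypothetical name, and it assumes the summary text looks something like "1-40 of 1,234 homes" with 40 listings per page, as the division by 40 above implies.

```python
import math
import re

LISTINGS_PER_PAGE = 40  # assumption: taken from the division by 40 in the code above


def pages_from_summary(text, per_page=LISTINGS_PER_PAGE):
    """Return the number of result pages implied by a summary string, or None."""
    # Assumed format: "... of 1,234 homes"; commas in the count are tolerated
    match = re.search(r'of\s+(\d[\d,]*)\s+home', text)
    if match is None:
        return None
    total = int(match.group(1).replace(',', ''))
    # Round up so a partially filled last page is still counted
    return math.ceil(total / per_page)
```

Because this rounds up rather than truncating, the loop would stop with `page >= max_pages` instead of `page > max_pages`.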

In Jupyter Notebook, the script freezes on a certain page, and when I run it from the CMD terminal I get this error:

[1500:11068:0715/140721.394:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
[1500:11068:0715/140723.018:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
[1500:11068:0715/140723.160:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
[1500:11068:0715/140723.343:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
[1500:11068:0715/140723.604:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
[1500:11068:0715/140723.822:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
[1500:11068:0715/140724.910:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
[1500:13988:0715/140729.803:ERROR:cert_issuer_source_aia.cc(34)] Error parsing cert retrieved from AIA (as DER):
ERROR: Couldn't read tbsCertificate as SEQUENCE
ERROR: Failed parsing Certificate

How can I modify the code and avoid the error?
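
One direction that might help with the freeze is to bound each page load and retry a limited number of times instead of waiting indefinitely. The sketch below is not Selenium-specific: `load_with_retry` is a hypothetical helper that accepts any callable, so with the code above it would presumably be given `driver.get` after calling `driver.set_page_load_timeout(...)`, which makes a stalled load raise an exception. The retry count and delay are arbitrary assumptions.

```python
import time


def load_with_retry(load, url, retries=3, delay=2.0):
    """Call load(url), retrying on any exception; return True on success."""
    for attempt in range(1, retries + 1):
        try:
            load(url)
            return True
        except Exception as exc:
            # Show what went wrong instead of hiding it
            print(f'attempt {attempt}/{retries} failed for {url}: {exc!r}')
            if attempt < retries:
                # Pause briefly before the next attempt
                time.sleep(delay)
    return False
```

In the loop above, `if not load_with_retry(driver.get, page_url): break` could stand in for the bare `driver.get(page_url)` call.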

beridzeg45