
I used the code below to scrape the rent price from a web page. Most of the information was correct, except for the rent price, and I am puzzled.

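import requests
from bs4 import BeautifulSoup

url = "https://www.homegate.ch/rent/3003465820"

# Fetch the raw HTML exactly as the server returns it (no JavaScript is executed)
html_data = requests.get(url).content
soup = BeautifulSoup(html_data, 'html.parser')

# Select the element(s) marked as the costs block
hh = soup.find_all(attrs={"data-v-8dfba50a": "", "data-test": "costs"})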

The price displayed on the website is much lower than the one scraped with BeautifulSoup, and I get the same wrong price every time: the website shows CHF 6,650, while the price scraped from the soup is 8,910.

I have also tried scraping a different block:

hh = soup.find_all('script',attrs={"type":"application/ld+json"})
print(hh[0])

But I always get the same wrong price, 8910, instead of CHF 6,650.
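Since only `hh[0]` is printed above, one way to check whether the JSON-LD data holds more than one price-like field is to walk every parsed block and list all of them. The following is a minimal sketch, not a confirmed fix: `collect_prices` is a hypothetical helper, and the sample JSON is made up for illustration, so the structure of the real page may well differ.

```python
import json

def collect_prices(node, path="$"):
    """Recursively yield (path, value) for every scalar whose key contains 'price'."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if "price" in key.lower() and not isinstance(value, (dict, list)):
                yield child, value
            # Keep descending so nested offers are inspected as well
            yield from collect_prices(value, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from collect_prices(item, f"{path}[{i}]")

# Made-up JSON-LD for illustration only (assumption: not taken from the real page)
sample = '''
{"@type": "Product",
 "offers": [{"@type": "Offer", "price": 1500, "priceCurrency": "CHF"},
            {"@type": "Offer", "price": 1200, "priceCurrency": "CHF"}]}
'''

found = list(collect_prices(json.loads(sample)))
for path, value in found:
    print(path, value)
```

With the `hh` list from the snippet above, calling `collect_prices(json.loads(tag.string))` for each `tag` in `hh` would list every candidate value, which may show whether the displayed price appears anywhere in the static HTML at all.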

Aqib Chattha
This page has dynamically generated content, meaning that JavaScript code adds content to the page after the initial request is made. You can't get the rendered page using the requests module; you will need something else, such as Selenium or Playwright. Please see the answer here for details: https://stackoverflow.com/questions/8049520/how-can-i-scrape-a-page-with-dynamic-content-created-by-javascript-in-python?rq=3 – João Areias Aug 26 '23 at 10:12

1 Answer


You should use Selenium together with BeautifulSoup.

The code will look like this:

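from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

url = "https://www.homegate.ch/rent/3003465820"

driver = webdriver.Chrome()
driver.get(url)
# Wait (up to 10 s, an arbitrary choice) for the JavaScript-rendered costs block
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, '[data-test="costs"]')))
soup = BeautifulSoup(driver.page_source, 'html.parser')  # named soup, not bs4, to avoid confusion with the package
driver.quit()  # close the browser once the rendered HTML has been captured
hh = soup.find_all(attrs={"data-v-8dfba50a": "", "data-test": "costs"})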
1rone
You cannot avoid opening the website, because it uses dynamically generated content, as @João Areias wrote in the comments. PS: Thanks for editing. – 1rone Aug 26 '23 at 10:57