Scraping the JavaScript-rendered part of booking.com

Asked Jun 25 '20 at 18:06

Active Jun 25 '20 at 18:06

Viewed 125 times

I'm trying to access the data inside the "hp_location_block__container" container, but so far, even when using PhantomJS, I don't get anything back:

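from selenium import webdriver
from bs4 import BeautifulSoup

if __name__ == '__main__':
    url = "https://www.booking.com/hotel/tz/zuri-zanzibar.html"
    # Render the page (including JavaScript) in a headless PhantomJS browser
    driver = webdriver.PhantomJS()
    driver.get(url)
    html = driver.page_source
    # Name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(html, "html.parser")

    # Look for the location block by its class name
    soup.find_all("div", {"class": "hp_location_block__container "})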

I get:

Out[1]: []
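
One thing that may be worth ruling out independently of PhantomJS: the class string in the lookup above ends with a space. BeautifulSoup splits the class attribute on whitespace and compares the search string verbatim, so a trailing space can make the lookup come back empty even when the div is in the HTML. The sketch below shows this on a small hand-written snippet; SAMPLE_HTML is a hypothetical stand-in, not the real booking.com markup, so this only demonstrates the matching behaviour and does not prove the block is present in page_source.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the rendered page; the real markup may differ.
SAMPLE_HTML = (
    '<div class="hp_location_block__container ">'
    "<span>Zanzibar</span>"
    "</div>"
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Class string copied with its trailing space: it is compared verbatim
# against the whitespace-split class values, so nothing matches.
with_space = soup.find_all("div", {"class": "hp_location_block__container "})

# Same lookup with the trailing space removed.
without_space = soup.find_all("div", {"class": "hp_location_block__container"})

print(len(with_space), len(without_space))
```

If the lookup without the space still returns nothing on the real page, the element is probably not in page_source at the moment it is read.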

asked Jun 25 '20 at 18:06

JohnHoopHoop

Most probably what you are trying to parse is dynamic content. Does your HTML contain a div with that class name? – mursalin Jun 25 '20 at 18:26
When I inspect the webpage using Google Chrome, that class is there, yes. – JohnHoopHoop Jun 25 '20 at 18:29
Your browser has JavaScript enabled. BeautifulSoup can only parse the content if it is present in html = driver.page_source. Have a look at this: https://stackoverflow.com/questions/15866426/beautifulsoup-not-grabbing-dynamic-content – mursalin Jun 25 '20 at 18:32
I thought PhantomJS was supposed to do the job? It doesn't seem to work... – JohnHoopHoop Jun 25 '20 at 18:38

0 Answers