I'm trying to open the hotel website www.booking.com and extract the name, price, location, and link of the top 50 search results, sorted cheapest first. I'm using Selenium with Python to automate the process. However, some HTML elements are targetable while others are not. After inspecting the website, I found that all hotel names have the class name: fcab3ed991 a23c043802

I tried to target all of them and collect them into a list, as shown in my code below, but I can't seem to target the elements correctly. What am I doing wrong?

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options

PATH= "C:\Program Files (x86)\chromedriver.exe"
driver=webdriver.Chrome(PATH)
driver.get("https://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggI46AdIM1gEaAKIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4AvqR75YGwAIB0gIkZDQ4MTdjZDctYzIyNC00N2RlLWJhYjItZDU1YTAwMGU2M2Q12AIF4AIB&sid=8005d0cc6b75af8d0d2e74451b73cb8b&aid=304142&sb=1&sb_lp=1&src_elem=sb&error_url=https%3A%2F%2Fwww.booking.com%2Findex.html%3Flabel%3Dgen173nr-1FCAEoggI46AdIM1gEaAKIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4AvqR75YGwAIB0gIkZDQ4MTdjZDctYzIyNC00N2RlLWJhYjItZDU1YTAwMGU2M2Q12AIF4AIB%26sid%3D8005d0cc6b75af8d0d2e74451b73cb8b%26sb_price_type%3Dtotal%26%26&ss=Jumeirah%2C+Dubai%2C+Dubai+Emirate%2C+United+Arab+Emirates&is_ski_area=&checkin_year=2022&checkin_month=8&checkin_monthday=1&checkout_year=2022&checkout_month=8&checkout_monthday=3&group_adults=2&group_children=0&no_rooms=1&map=1&b_h4u_keep_filters=&from_sf=1&ss_raw=jum&ac_position=1&ac_langcode=en&ac_click_type=b&dest_id=941&dest_type=district&place_id_lat=25.205553&place_id_lon=55.239216&search_pageview_id=c0ac477da63f02c2&search_pageview_id=c0ac477da63f02c2&search_selected=true&ac_suggestion_list_length=5&ac_suggestion_theme_list_length=0&order=price#map_closed")


try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "d4924c9e74"))
    )

    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "fcab3ed991 a23c043802"))
    )
    names=element.find_elements_by_class_name("fcab3ed991 a23c043802")
except:
    driver.quit()
undetected Selenium

1 Answer

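To extract the text of the name and price fields, wait until the elements are visible and collect their text with a list comprehension. The `data-testid` attributes tend to be more stable locators than the generated class names, so you can use the following locator strategies: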

  • Code block:

    driver.execute("get", {'url': 'https://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggI46AdIM1gEaAKIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4AvqR75YGwAIB0gIkZDQ4MTdjZDctYzIyNC00N2RlLWJhYjItZDU1YTAwMGU2M2Q12AIF4AIB&sid=8005d0cc6b75af8d0d2e74451b73cb8b&aid=304142&sb=1&sb_lp=1&src_elem=sb&error_url=https%3A%2F%2Fwww.booking.com%2Findex.html%3Flabel%3Dgen173nr-1FCAEoggI46AdIM1gEaAKIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4AvqR75YGwAIB0gIkZDQ4MTdjZDctYzIyNC00N2RlLWJhYjItZDU1YTAwMGU2M2Q12AIF4AIB%26sid%3D8005d0cc6b75af8d0d2e74451b73cb8b%26sb_price_type%3Dtotal%26%26&ss=Jumeirah%2C+Dubai%2C+Dubai+Emirate%2C+United+Arab+Emirates&is_ski_area=&checkin_year=2022&checkin_month=8&checkin_monthday=1&checkout_year=2022&checkout_month=8&checkout_monthday=3&group_adults=2&group_children=0&no_rooms=1&map=1&b_h4u_keep_filters=&from_sf=1&ss_raw=jum&ac_position=1&ac_langcode=en&ac_click_type=b&dest_id=941&dest_type=district&place_id_lat=25.205553&place_id_lon=55.239216&search_pageview_id=c0ac477da63f02c2&search_pageview_id=c0ac477da63f02c2&search_selected=true&ac_suggestion_list_length=5&ac_suggestion_theme_list_length=0&order=price#map_closed'})
    names = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div[data-testid='title']")))]
    prices = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div[data-testid='price-and-discounted-price'] > span")))]
    for name, price in zip(names, prices):
        print(f"{name} hotel price is {price}")
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console Output:

    Royal Prestige Hotel hotel price is ₹ 10,871
    Rove La Mer Beach hotel price is ₹ 10,328
    Dubai Marine Beach Resort & Spa hotel price is ₹ 12,133
    Roda Beach Resort hotel price is ₹ 16,525
    Bespoke Residences - 3 Bedroom Waikiki Townhouses hotel price is ₹ 20,395
    Walking distance to Burj al Arab - 1BR Lamtara 2 hotel price is ₹ 16,724
    Mandarin Oriental Jumeira, Dubai hotel price is ₹ 18,108
    Four Seasons Resort Dubai at Jumeirah Beach hotel price is ₹ 20,003
    Bulgari Resort, Dubai hotel price is ₹ 78,274
    Spacious Villa! hotel price is ₹ 62,619
    Palm Beach Hotel hotel price is ₹ 64,794
    York International Hotel hotel price is ₹ 86,971
    Moon , Backpackers , Partition for Couples and for singles hotel price is ₹ 208,731
    Hafez Hotel Apartments Al Ras Metro Station hotel price is ₹ 2,022
    Grand Pearl Hostel For Boys hotel price is ₹ 2,131
    Time Palace Hotel Branch hotel price is ₹ 3,131
    Hostel Youth hotel price is ₹ 3,157
    Grand Mayfair Hotel hotel price is ₹ 3,601
    Explore Old Dubai, Souks, Tastings, Museums hotel price is ₹ 4,592
    Panorama Hotel Bur Dubai hotel price is ₹ 3,674
    Zain International Hotel hotel price is ₹ 3,827
    Panorama Hotel Deira hotel price is ₹ 3,870
    Decent Boys Hostel in center of Bur Dubai next to Burjuman metro Station with all FREE Facilities hotel price is ₹ 3,875
    Brand New Boys Hostel 1 min walk from Burjuman Metro Station EXIT-4 with all Brand New Furnishings & Free Facilities hotel price is ₹ 3,914
    OYO 338 Transworld Hotel hotel price is ₹ 3,914
    
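As for why the locator in the question never matches: `By.CLASS_NAME` expects a single class name, so a space-separated value such as `fcab3ed991 a23c043802` is not treated as "both classes". If you still want to target those classes, one option is to turn them into a compound CSS selector. The following is a minimal sketch; `classes_to_css` is a hypothetical helper written for illustration, not part of Selenium.

```python
def classes_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a compound CSS selector.

    Hypothetical helper for illustration; it only does string formatting.
    """
    return "".join(f".{name}" for name in class_attr.split())


selector = classes_to_css("fcab3ed991 a23c043802")
print(selector)  # .fcab3ed991.a23c043802

# With a live driver, the selector would be passed to a CSS lookup, e.g.:
#   names = driver.find_elements(By.CSS_SELECTOR, selector)
```

Keep in mind that these class names are generated and may change between page versions, which is why the `data-testid` locators are usually the safer choice.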

PS: Following the same approach, you can extract the location and link of each result as well and dump everything in JSON format.
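A rough sketch of that idea is shown here. It reads the fields card by card, so a name is never paired with another hotel's price, and then serializes the records with `json.dumps`. The `property-card`, `address`, and `title-link` selectors are assumptions about the page markup and should be checked in the browser inspector; only the title and price selectors come from the code above. The helper names `card_to_record` and `cards_to_json` are likewise made up for this example.

```python
import json

CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR

# Assumed selectors: verify them against the live page before relying on them.
CARD = "div[data-testid='property-card']"
FIELDS = {
    "name": "div[data-testid='title']",
    "price": "div[data-testid='price-and-discounted-price'] > span",
    "location": "span[data-testid='address']",
}
LINK = "a[data-testid='title-link']"


def card_to_record(card):
    """Read the text fields and the link from one result card element."""
    record = {key: card.find_element(CSS, sel).text for key, sel in FIELDS.items()}
    record["link"] = card.find_element(CSS, LINK).get_attribute("href")
    return record


def cards_to_json(cards, limit=50):
    """Serialize at most `limit` cards as a JSON array of objects."""
    records = [card_to_record(card) for card in list(cards)[:limit]]
    # ensure_ascii=False keeps currency symbols such as ₹ readable.
    return json.dumps(records, ensure_ascii=False, indent=2)


# With a live driver (after waiting for the results to load):
#   cards = driver.find_elements(By.CSS_SELECTOR, CARD)
#   print(cards_to_json(cards))
```

If a card lacks one of the fields, `find_element` raises `NoSuchElementException`, so you may want to catch that and store `None` instead.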

undetected Selenium