Here's the link of the website : website
I would like to have all the links of th hotels in this location.
Here's my script :
import pandas as pd
import numpy as np
from selenium import webdriver
import time
PATH = "driver\chromedriver.exe"
options = webdriver.ChromeOptions()
options.add_argument("--disable-gpu")
options.add_argument("--window-size=1200,900")
options.add_argument('enable-logging')
driver = webdriver.Chrome(options=options, executable_path=PATH)
driver.get('https://fr.hotels.com/search.do?destination-id=10398359&q-check-in=2021-06-24&q-check-out=2021-06-25&q-rooms=1&q-room-0-adults=2&q-room-0-children=0&sort-order=BEST_SELLER')
cookie = driver.find_element_by_xpath('//button[@class="uolsaJ"]')
try:
cookie.click()
except:
pass
for i in range(30):
driver.execute_script("window.scrollBy(0, 1000)")
time.sleep(5)
time.sleep(5)
my_elems = driver.find_elements_by_xpath('//a[@class="_61P-R0"]')
links = [my_elem.get_attribute("href") for my_elem in my_elems]
X = np.array(links)
print(X.shape)
#driver.close()
But I cannot find a way to tell the script : scroll down until there is nothing more to scroll.
I tried to change this parameters :
for i in range(30):
driver.execute_script("window.scrollBy(0, 1000)")
time.sleep(30)
I changed the time.sleep()
, the number 1000
and so on but my output keep changing and not in the right way.
As you can see, I have scraped a lot of numbers differents. How to make my script scraping a same amout each time ? Not necessarily each links but at last a stable number.
Here it scroll and at one point it seems blocked and scrape all the links it has at the moment. That's not appropriate.