I have a list of Kayak URLs and I'd like to grab the price and the "View Deal" link for the "Best" and "Cheapest" HTML cards. These are essentially the first two results, since I've already sorted the results in the URLs (here's an example of a URL).
I haven't been able to grab these bits of data using BeautifulSoup and could use some help! Here's what I've tried for pulling the price info, but I end up with an empty prices_list variable. Below is a screenshot of exactly what I'd like to pull from the website.
from random import randint
from time import sleep

from selenium import webdriver

url = "https://www.kayak.com/flights/AMS-WMI,nearby/2023-02-15/WMI-SOF,nearby/2023-02-18/SOF-BEG,nearby/2023-02-20/BEG-MIL,nearby/2023-02-23/MIL-AMS,nearby/2023-02-25/?sort=bestflight_a"
requests = 0
chrome_options = webdriver.ChromeOptions()
agents = ["Firefox/66.0.3","Chrome/73.0.3683.68","Edge/16.16299"]
print("User agent: " + agents[(requests%len(agents))])
chrome_options.add_argument('--user-agent=' + agents[(requests%len(agents))])
chrome_options.add_experimental_option('useAutomationExtension', False)
# pass the options so the user agent set above is actually applied
driver = webdriver.Chrome('/Users/etc./etc.', options=chrome_options)
driver.implicitly_wait(10)
driver.get(url)
# getting the prices
sleep(randint(8,10))
xp_prices = '//a[@class="booking-link"]/span[@class="price option-text"]'
prices = driver.find_elements_by_xpath(xp_prices)
prices_list = [price.text.replace('$','') for price in prices if price.text != '']
prices_list = list(map(int, prices_list))
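To rule out the cleaning step as the cause of the empty list, here is that logic in isolation as a minimal sketch. `parse_prices` is a hypothetical helper name, and the sample strings are made up rather than real Kayak output; it assumes whole-dollar prices such as "$1,234".

```python
import re


def parse_prices(texts):
    """Convert raw price strings such as '$1,234' to ints, skipping blanks."""
    prices = []
    for text in texts:
        # Keep digits only: drops '$', thousands separators and whitespace.
        # Assumes whole-dollar amounts (no cents).
        digits = re.sub(r"[^\d]", "", text)
        if digits:
            prices.append(int(digits))
    return prices


# Made-up sample strings, not real Kayak output
print(parse_prices(["$1,234", "", " $567 "]))
```

If this behaves as expected on sample strings, the empty result most likely comes from the XPath not matching any elements rather than from the parsing.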