Unable to scrape the "View Details" button links as a list for the page "https://www.bmstores.co.uk/stores?location=KA8+9BF"

Question

I am unable to scrape the "View Details" button links as a list from the page "https://www.bmstores.co.uk/stores?location=KA8+9BF". I have tried both BeautifulSoup and Selenium in several ways. With Selenium I used the find_element methods with XPath, CSS selector and class name, but nothing worked. Selenium also ran into a pop-up on the site, which I resolved with a pop-up blocker.

I searched various sites but only found the same BeautifulSoup Python code, and I still could not complete the task. When I run my code I repeatedly get these two errors:

1. ElementNotInteractableException: element not interactable
2. NoSuchElementException: Message: no such element: Unable to locate element

My code is here:

from bs4 import BeautifulSoup
import requests
import pandas as pd
from selenium import webdriver as wd
import time
from selenium.common.exceptions import WebDriverException

local_path_of_chrome_driver = "E:\\chromedriver.exe"
driver = wd.Chrome(executable_path=local_path_of_chrome_driver)
driver.maximize_window()

data_links=[]

xpaths = [
    "/html/body/div[9]/div/div/div/div/ul/li[1]/div/div[2]/a[1]",
    "/html/body/div[9]/div/div/div/div/ul/li[2]/div/div[2]/a[1]",
    "/html/body/div[9]/div/div/div/div/ul/li[4]/div/div[2]/a[1]",
    "/html/body/div[9]/div/div/div/div/ul/li[5]/div/div[2]/a[1]",
]
for j in xpaths:
        try:
            
            driver.find_element_by_xpath(j).click()
            
            time.sleep(3)
        
            driver.switch_to_window(driver.window_handles[-1])
            data_links.append(driver.current_url)
            
            time.sleep(3)
            
            driver.back()
        except:
            pass
            
driver.close()

Can someone help me out?

score 0 · Answer 1 · answered Oct 28 '21 at 19:26

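To scrape the View Details button links as a list from the page https://www.bmstores.co.uk/stores?location=KA8+9BF, wait for the links to become visible with WebDriverWait and then read each element's href attribute instead of clicking it. With the page already loaded in driver, you can use the following locator strategy: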

Code Block:

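from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
view_details = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.LINK_TEXT, "View Details")))
for i in view_details:
    print(i.get_attribute("href"))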

Console output:

https://www.bmstores.co.uk/stores/ayr-heathfield-retail-park-90
https://www.bmstores.co.uk/stores/prestwick-113
https://www.bmstores.co.uk/stores/irvine-307
https://www.bmstores.co.uk/stores/kilmarnock-310
https://www.bmstores.co.uk/stores/stevenston-319
https://www.bmstores.co.uk/stores/darnley-414
https://www.bmstores.co.uk/stores/east-kilbride-304
https://www.bmstores.co.uk/stores/paisley-linwood-423
https://www.bmstores.co.uk/stores/linwood-hart-street-33
https://www.bmstores.co.uk/stores/paisley-renfrew-road-428
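
For anyone who wants the links as a Python list rather than printed output, here is a minimal end-to-end sketch of the same approach. The function names collect_hrefs and scrape_view_details are hypothetical, and the sketch assumes chromedriver can be found by Selenium (on PATH or through Selenium Manager). Because it only reads href and never clicks, it should sidestep the ElementNotInteractableException from the question.

```python
def collect_hrefs(elements):
    """Return the unique, non-empty href values of the given elements, in order."""
    seen = set()
    links = []
    for element in elements:
        href = element.get_attribute("href")
        if href and href not in seen:
            seen.add(href)
            links.append(href)
    return links


def scrape_view_details(url, timeout=20):
    """Open url in Chrome and return the hrefs of all visible "View Details" links."""
    # Selenium is imported lazily so collect_hrefs stays usable without it installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: chromedriver is discoverable (on PATH or via Selenium Manager).
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        anchors = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located((By.LINK_TEXT, "View Details"))
        )
        return collect_hrefs(anchors)
    finally:
        # Always release the browser, even if the wait times out.
        driver.quit()
```

Calling scrape_view_details("https://www.bmstores.co.uk/stores?location=KA8+9BF") would then return the list directly, provided the page still renders links with that exact text.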

score -1 · Accepted Answer · answered Oct 29 '21 at 12:01

You can fetch all the store names and their corresponding View Details links using the requests module alone, by querying the Algolia search endpoint directly. There are 24 stores in total.

import requests
from urllib.parse import urljoin

base = 'https://www.bmstores.co.uk'
link = 'https://mv7e2a3yql-dsn.algolia.net/1/indexes/*/queries'

params = {
    'x-algolia-agent': 'Algolia for JavaScript (3.35.0); Browser; instantsearch.js (3.6.0); JS Helper (2.28.0)',
    'x-algolia-application-id': 'MV7E2A3YQL',
    'x-algolia-api-key': 'Mzg2ZjM2ZmVmNzhiMmVhZjhhNjQ5ZDAzNGQ5NjE2MTQ1MDQ2ZDAwODBlMjY2YjFkNWFkOTUyOTZkNTRhY2M4MmZpbHRlcnM9JTI4c3RhdHVzJTNBYXBwcm92ZWQlMjkrQU5EK3B1Ymxpc2hkYXRlKyUzQysxNjM1NTAzMzI5K0FORCslMjhleHBpcnlkYXRlKyUzRSsxNjM1NTAzMzI5K09SK2V4cGlyeWRhdGUrJTNEKy0xJTI5',
}

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    s.headers['Referer'] = 'https://www.bmstores.co.uk/stores?location=KA8+9BF'
    
    page = 0
    
    while page<=3:
        payload = {"requests":[{"indexName":"prod_bmstores_stores","params":f"query=&hitsPerPage=10&page={page}&attributesToRetrieve=*&highlightPreTag=__ais-highlight__&highlightPostTag=__%2Fais-highlight__&getRankingInfo=true&aroundLatLng=55.47888%2C-4.59464&aroundRadius=50000&clickAnalytics=false&facets=%5B%22ranges%22%5D&tagFilters="}]}
        res = s.post(link,params=params,json=payload)
        for item in res.json()['results']:
            for container in item['hits']:
                store_name = container['storename']
                detail_link = urljoin(base,container['url'])
                print(store_name,detail_link)

        page+=1
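
The loop above hard-codes the page range. As a possible variation, the paging could instead follow the nbPages field that Algolia search responses normally include next to hits. The sketch below assumes that field is present; iter_hits, make_fetcher and build_payload are hypothetical names, and max_pages is only a safety cap.

```python
def iter_hits(fetch_page, max_pages=50):
    """Yield every hit, calling fetch_page(page) for one result dict per page.

    Stops once the page count reported in 'nbPages' is reached, or after
    max_pages requests, whichever comes first.
    """
    page = 0
    while page < max_pages:
        result = fetch_page(page)
        yield from result.get("hits", [])
        page += 1
        # A missing 'nbPages' is treated as 0, so the loop ends after one page.
        if page >= result.get("nbPages", 0):
            break


def make_fetcher(session, link, params, build_payload):
    """Build a fetch_page function around a requests-style session.

    build_payload(page) is a hypothetical helper that returns the JSON body
    for the given page, e.g. the payload dict from the answer above.
    """
    def fetch_page(page):
        res = session.post(link, params=params, json=build_payload(page))
        res.raise_for_status()
        # The payload carries a single request, so only results[0] is read.
        return res.json()["results"][0]
    return fetch_page
```

With the session, link and params from the answer, iterating over iter_hits(make_fetcher(s, link, params, build_payload)) gives each store record, from which storename and url can be read as before.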
