1

I am working with this page from Grainger. I want to click on each table row so that it expands, then scrape the Mfr. model, price, and description. My program already scrolls through the page and scrapes the price, but I am stuck on getting it to click on each table row so that the row expands and I can scrape its data. My program and the website link are below for reference. Any help is much appreciated!

https://www.grainger.com/category/adhesives-sealants-and-tape/patching-repairing-compounds/wall-repair-patching?categoryIndex=2

import undetected_chromedriver as uc

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
import time


def scrape_page_data():
    WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'pl')))
    container = driver.find_element(By.CLASS_NAME, 'pl')

    # scroll down to load all content on the page
    for _ in range(4):
        driver.execute_script("window.scrollBy(0, 2000);")
        time.sleep(2)

    item_click = container.find_elements(By.TAG_NAME, 'tr')
    item_click.click()
    skus = container.find_elements(By.CSS_SELECTOR, 'sku-value__value')
    prices = container.find_elements(By.CSS_SELECTOR, '.rbqU0E.lVwVq5')
    description = container.find_elements(By.CSS_SELECTOR, '.fUWsxn._3ihaM4')

    return skus, prices, description

def pagination(url, pages=1):
    prod_num = []
    prod_price = []
    prod_desc = []

    page_num = 0
    # iterate over the pages
    for i in range(1, pages + 1):
        # print(f"this is page {i}")
        driver.get(f"{url}?offset={page_num}")

        skus, prices, description = scrape_page_data()

        for sku in skus:
            prod_num.append(sku.text)

        for price in prices:
            prod_price.append(price.text)
        for desc in description:
            prod_desc.append(desc.text)

        print(f"prod_num: {prod_num}")
        print(f"prod_price: {prod_price}")
        print(f"prod_desc: {prod_desc}")
        print(f"prod_num: {len(prod_num)}")
        print(f"prod_price: {len(prod_price)}")
        print(f"prod_desc: {len(prod_desc)}")

        # increment it by 24 since each page has 24 data
        page_num += 24
        time.sleep(1)

    return prod_num, prod_price, prod_desc


# set the website URL and initialize the Chrome driver
website = 'https://www.grainger.com/category/adhesives-sealants-and-tape/patching-repairing-compounds/wall-repair-patching?categoryIndex=2'
options = Options()
# options.add_argument("--geolocation=47.8410,-122.2947")  # set geolocation to Lynnwood, WA
driver = uc.Chrome()

# call the pagination function to scrape data and store it in three separate lists
prod_num, prod_price, prod_desc = pagination(website, pages=1)

# convert the three lists to a pandas dataframe and save it as a CSV file
df = pd.DataFrame({'code': prod_num, 'price': prod_price, 'brand': prod_desc})
df.to_csv('graintest1.csv', index=False)
print(df)

# quit the Chrome driver
driver.quit()

2 Answers

2

I believe you can use the following code as a guideline. In this code, all rows are expanded, and the product link is accessed to retrieve the corresponding description. This process is repeated for all rows in the tables. The requested information is printed in a simple format; however, you can use libraries like pandas or others to give a more structured format to the extracted data.

from selenium import webdriver
from selenium.webdriver.common.by import By

import time


options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(options=options)

website = 'https://www.grainger.com/category/adhesives-sealants-and-tape/patching-repairing-compounds/wall-repair-patching?categoryIndex=2'
driver.get(website)

tables = driver.find_elements(By.XPATH, '//tbody[@class="cjpIYY"]')
for i in range(len(tables)):
    table = driver.find_elements(By.XPATH, '//tbody[@class="cjpIYY"]')[i]
    rows = table.find_elements(By.CLASS_NAME, '-dql2z')
    for j in range(len(rows)):
        # re-locate the table and row on every pass: the old references
        # go stale once the driver navigates away and comes back
        table = driver.find_elements(By.XPATH, '//tbody[@class="cjpIYY"]')[i]
        row = table.find_elements(By.CLASS_NAME, '-dql2z')[j]
        row.click()
        # wait for the expanded information of the row to load
        time.sleep(1)
        # follow the product page link
        product_page = driver.find_element(By.XPATH, '//a[@class="JEyT-B _3ihaM4"]')
        product_page.click()
        time.sleep(2)

        # extract data from the product page
        print("Mfr. model: %s" % driver.find_element(By.XPATH, '//div[@class="vDgTDH"][2]/dd').text)
        print("Price: %s" % driver.find_element(By.XPATH, '//span[@class="rbqU0E lVwVq5"]').text)
        print("Description: %s" % driver.find_element(By.XPATH, '//dd[@class="W7BBCC"]/p').text)
        # return to the category page for the next row
        driver.back()
        print()
    print()

And the output is this:

Mfr. model: 52084
Price: $9.83
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 10100
Price: $9.22
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 10308
Price: $8.86
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 58550
Price: $11.80
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 10310
Price: $16.68
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 58555
Price: $26.91
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 10114
Price: $17.94
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 60590
Price: $40.49
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.

Mfr. model: 10120
Price: $27.43
Description: Joint compounds and plaster spread along drywall panel seams, nail heads, and indentions to create a smooth finish to the wall surface. Also called mud, joint compounds are typically used in conjunction with mesh drywall patches and tape. Once dried, these compounds are sandable in preparation for texturing, painting and other finishing applications.


Mfr. model: 18746
Price: $14.09
Description: Spackling paste, also called painter's putty and spackle, is used to fill minor imperfections in wall surfaces such as nail holes, dents, and divots. It is fast drying, does not require sanding, and is primarily used in preparation for painting.

Mfr. model: 12330
Price: $12.80
Description: Spackling paste, also called painter's putty and spackle, is used to fill minor imperfections in wall surfaces such as nail holes, dents, and divots. It is fast drying, does not require sanding, and is primarily used in preparation for painting.

Mfr. model: 12288
Price: $18.94
Description: Spackling paste, also called painter's putty and spackle, is used to fill minor imperfections in wall surfaces such as nail holes, dents, and divots. It is fast drying, does not require sanding, and is primarily used in preparation for painting.

Mfr. model: 12278
Price: $18.96
Description: Spackling paste, also called painter's putty and spackle, is used to fill minor imperfections in wall surfaces such as nail holes, dents, and divots. It is fast drying, does not require sanding, and is primarily used in preparation for painting.

Mfr. model: 12142
Price: $12.91
Description: Spackling paste, also called painter's putty and spackle, is used to fill minor imperfections in wall surfaces such as nail holes, dents, and divots. It is fast drying, does not require sanding, and is primarily used in preparation for painting.

Mfr. model: 12132
Price: $10.53
Description: Spackling paste, also called painter's putty and spackle, is used to fill minor imperfections in wall surfaces such as nail holes, dents, and divots. It is fast drying, does not require sanding, and is primarily used in preparation for painting.


Mfr. model: 09903
Price: $4.76
Description: Drywall patches and tape adhere to drywall to cover and bond panel joints and cover larger holes or cracks. They are typically mesh and are used in conjunction with joint compounds to create a smooth finish to the wall surface.

Mfr. model: 09904
Price: $5.66
Description: Drywall patches and tape adhere to drywall to cover and bond panel joints and cover larger holes or cracks. They are typically mesh and are used in conjunction with joint compounds to create a smooth finish to the wall surface.

Mfr. model: 2297
Price: $8.43
Description: Drywall patches and tape adhere to drywall to cover and bond panel joints and cover larger holes or cracks. They are typically mesh and are used in conjunction with joint compounds to create a smooth finish to the wall surface.

Mfr. model: 09007
Price: $8.05
Description: Drywall patches and tape adhere to drywall to cover and bond panel joints and cover larger holes or cracks. They are typically mesh and are used in conjunction with joint compounds to create a smooth finish to the wall surface.

Mfr. model: MJ 100
Price: $10.36
Description: Drywall patches and tape adhere to drywall to cover and bond panel joints and cover larger holes or cracks. They are typically mesh and are used in conjunction with joint compounds to create a smooth finish to the wall surface.

Mfr. model: 13A758
Price: $6.01
Description: Drywall patches and tape adhere to drywall to cover and bond panel joints and cover larger holes or cracks. They are typically mesh and are used in conjunction with joint compounds to create a smooth finish to the wall surface.

Mfr. model: 13A757
Price: $12.50
Description: Drywall patches and tape adhere to drywall to cover and bond panel joints and cover larger holes or cracks. They are typically mesh and are used in conjunction with joint compounds to create a smooth finish to the wall surface.
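As a rough sketch of the more structured format mentioned earlier, the printed fields could be collected into records and written out as CSV. The example below uses only the standard library `csv` module; `parse_price` and `records_to_csv` are hypothetical helper names, and the three sample records are copied from the output above rather than scraped live.

```python
import csv
import io


def parse_price(text):
    """Convert a price string such as '$9.83' to a float (assumes US-dollar formatting)."""
    return float(text.replace("$", "").replace(",", ""))


def records_to_csv(records):
    """Serialize a list of dicts to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["mfr_model", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Sample values copied from the output above; in the scraper these would
# come from the .text of the located elements.
scraped = [("52084", "$9.83"), ("10100", "$9.22"), ("09903", "$4.76")]
records = [{"mfr_model": model, "price": parse_price(price)} for model, price in scraped]
csv_text = records_to_csv(records)
print(csv_text)
```

Keeping the model number as a string preserves leading zeros such as the one in 09903, which would be lost if the column were converted to an integer.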

In my opinion, a tool like Scrapy would be a better fit for this particular case. I use it often in my work, and it has proven helpful. Keep in mind that, depending on the problem you face, one tool may be more suitable than another.

I also recommend learning more about the selectors used for data extraction in web scraping. The example above uses XPath because it is the one I am most comfortable with, but you can use whichever selector type you prefer and best suits your needs.
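To make the selector choice concrete, here is a small sketch of how the same class string can be turned into either a CSS selector or an XPath expression. The two helper functions are hypothetical and only build strings; the class names are the ones that appear in the code in this thread.

```python
def css_for_classes(tag, class_attr):
    """Build a CSS selector matching `tag` elements that carry every listed class."""
    return tag + "".join("." + name for name in class_attr.split())


def xpath_for_exact_class(tag, class_attr):
    """Build an XPath matching `tag` elements whose class attribute equals the string exactly."""
    return '//{}[@class="{}"]'.format(tag, class_attr)


price_css = css_for_classes("span", "rbqU0E lVwVq5")
price_xpath = xpath_for_exact_class("span", "rbqU0E lVwVq5")
print(price_css)    # span.rbqU0E.lVwVq5
print(price_xpath)  # //span[@class="rbqU0E lVwVq5"]
```

Either string can then be passed to `driver.find_elements` with `By.CSS_SELECTOR` or `By.XPATH` respectively. Note the difference in strictness: the CSS form matches any element that has both classes, while the XPath form requires the class attribute to be exactly that string. Also note that a CSS class selector needs the leading dot; a bare `'sku-value__value'` is read as a tag name, not a class.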

Martijn Pieters
erick
1

To extract the Mfr. Model, Web Price, and Title for each product in the first table, you need to induce WebDriverWait for visibility_of_all_elements_located() on the rows and for visibility_of_element_located() on each field. You can use the following locator strategies:

  • Code block:

    driver.get("https://www.grainger.com/category/adhesives-sealants-and-tape/patching-repairing-compounds/wall-repair-patching")
    products = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[role='treegrid'] tbody tr")))
    for product in products:
        product.click()
        print("Mfr. Model: %s" % WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//label[@class='sku-value__label' and contains(., 'Model')]//following::span[1]"))).text)
        print("Web Price: %s" % WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[starts-with(@data-testid, 'pricing-component')]"))).text)
        print("Title: %s" % WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[@data-testid='product-detail-title']"))).get_attribute("title"))
        product.click()
    driver.quit()
    
  • Console Output:

    Mfr. Model: 52084
    Web Price: $9.83
    Title: DAP Patching Compound: Patching Plaster, 32 oz Container Size, Pail, White
    Mfr. Model: 10100
    Web Price: $9.22
    Title: DAP Joint Compound: Wallboard, 48 oz Container Size, Pail, White
    Mfr. Model: 10308
    Web Price: $8.86
    Title: DAP Patching Compound: Plaster of Paris, 64 oz Container Size, Bag, White
    Mfr. Model: 58550
    Web Price: $11.80
    Title: DAP Patching Compound: FastPatch 30, 64 oz Container Size, Pail, White
    Mfr. Model: 10310
    Web Price: $16.68
    Title: DAP Patching Compound: Plaster of Paris, 128 oz Container Size, Bag, White
    Mfr. Model: 58555
    Web Price: $26.91
    Title: DAP Patching Compound: Presto Patch, 128 oz Container Size, Pail, Off-White
    Mfr. Model: 10114
    Web Price: $17.94
    Title: DAP Joint Compound: Lightweight Wallboard, 128 oz Container Size, Pail, White
    Mfr. Model: 60590
    Web Price: $40.49
    Title: DAP Patching Compound: All-Purpose, 128 oz Container Size, Pail, White
    Mfr. Model: 10120
    Web Price: $27.43
    Title: DAP Joint Compound: Premium Lightweight, 128 oz Container Size, Pail, White
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
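The idea behind an explicit wait can be sketched without a browser: poll a condition until it returns something truthy or a timeout expires. The `wait_until` helper below is a hypothetical stand-in written for illustration, not Selenium's implementation; it only mirrors the general behavior of `WebDriverWait(driver, timeout).until(condition)`, and `fake_element_ready` is a made-up condition that simulates an element appearing late.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition` repeatedly until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} seconds".format(timeout))
        time.sleep(poll)


# Stand-in for a page element that only becomes available on the third check.
attempts = {"count": 0}


def fake_element_ready():
    attempts["count"] += 1
    return "ready" if attempts["count"] >= 3 else None


value = wait_until(fake_element_ready, timeout=2.0, poll=0.01)
print(value, attempts["count"])
```

Compared with a fixed `time.sleep`, this pattern returns as soon as the condition holds and fails loudly when it never does.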
    

You can find a relevant discussion in Python Selenium - get href value

undetected Selenium