
I am trying to get the size and stock information for a product, i.e. whether each size is in stock or out of stock: https://limitededt.com/collections/footwear/products/adidas-originals-jonah-hill-superstar-fw7577

It looks like I have to click on each size manually, and the button will then show either "SOLD OUT" or "Add to CART".

I am able to retrieve basic information from the HTML page, but the button state looks like it is set by a JS event.

When I click on a size, the URL changes to

https://limitededt.com/collections/footwear/products/adidas-originals-jonah-hill-superstar-fw7577?variant=32432939466823

The URL gains an additional query parameter, "variant=32432939466823".

I was thinking I could manually figure out the variant ID for each size, then use requests to load the page again for each one and read the button text to determine whether that size is in stock or out of stock.

Is there an alternative where I request the page once and can still check the stock for every size?

import requests
from bs4 import BeautifulSoup

url = "https://limitededt.com/collections/footwear/products/adidas-originals-jonah-hill-superstar-fw7577"

source = requests.get(url).text

soup = BeautifulSoup(source, 'lxml')
bawagoc25
  • de facto, use [`selenium`](https://selenium-python.readthedocs.io/) – sushanth Jul 13 '20 at 04:35
  • Does this answer your question? [Web-scraping JavaScript page with Python](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python) – metatoaster Jul 13 '20 at 05:42
  • I'm somewhat annoyed that people treat everything like a nail just because they have a hammer (selenium) and don't even bother looking at your problem (i.e. opening the console and finding that the developers actually put a nice message there when you change sizes). It's just not helping beginners to actually improve their knowledge. Simply add `.json` to the URL, and feel free to comment if that doesn't solve your problem (e.g. https://limitededt.com/collections/footwear/products/adidas-originals-jonah-hill-superstar-fw7577.json) – Gregor Jul 13 '20 at 13:25
  • @Gregor I strongly agree with what you have explained, but there is nothing in the JSON that gives details about the stock availability of a particular size. – Vin Jul 26 '20 at 08:09

1 Answer


By looking at the Network tab in your browser's developer tools, you can find a JSON document that contains all the product variants; you can load it simply by appending .json to the product URL. It does not contain stock information, though. That data comes from Shopify (a Canadian multinational e-commerce company headquartered in Ottawa, according to Wikipedia) and is written into a script tag in the DOM. You can easily extract that tag using BeautifulSoup (though I barely make use of it here; regular expressions alone would do).
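As a rough illustration of the regex-only route, here is a minimal sketch that runs on a made-up page excerpt instead of the live site. The sample markup, the `extract_stock` helper and in particular the `stk` field name are assumptions for illustration only; check the real script contents in your browser before relying on any of them.

```python
import re

# Hypothetical excerpt of the page source; the real script body may differ.
SAMPLE_HTML = """
<script id="swym-snippet">
  SwymProductVariants[1001] = {empi: 55, epi: 1001, stk: 3};
  SwymProductVariants[1002] = {empi: 55, epi: 1002, stk: 0};
</script>
"""

# One match per "SwymProductVariants[<id>] = {...}" assignment.
ASSIGNMENT = re.compile(r'SwymProductVariants\[(\d+)\]\s*=\s*\{([^;]*)\}')
# Assumed name of the stock field inside each object.
STOCK = re.compile(r'\bstk\s*:\s*(-?\d+)')


def extract_stock(html):
    """Map variant id -> stock count for every assignment found in html."""
    stock = {}
    for variant_id, body in ASSIGNMENT.findall(html):
        match = STOCK.search(body)
        if match:
            stock[int(variant_id)] = int(match.group(1))
    return stock


print(extract_stock(SAMPLE_HTML))  # {1001: 3, 1002: 0}
```

Looking the value up by key name rather than by position makes the parsing less sensitive to fields being added or reordered, at the cost of depending on the assumed key.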

from bs4 import BeautifulSoup
import requests
import re

# define both URLs: for the JSON and the actual website for the shopify stock
url = 'https://limitededt.com/collections/footwear/products/adidas-originals-jonah-hill-superstar-fw7577'
product_info = url + '.json'

# fetch both data sources
with requests.Session() as session:
    soup = BeautifulSoup(session.get(url).text, 'html.parser')
    swym = soup.find('script', {'id': 'swym-snippet'})
    info = session.get(product_info).json()

# this will contain all the products and their stock
products = dict()

# get the data for each product from the json
for variant in info['product']['variants']:
    products[variant['id']] = variant

# find the shopify data using a regular expression
regex = re.compile(r'SwymProductVariants\[[0-9]+\] = ({[^(;)]+)')
inventory = re.findall(regex, swym.text)

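# add the inventory information to the previously constructed product dictionary
for inv in inventory:
    # split the object into its comma-separated "key: value" fields and keep
    # the 2nd and 6th ones (variant id and stock count); this relies on the
    # field order currently used in the script
    fields = inv.split(',')
    variant_id, stock = (fields[i].strip().split(':')[1] for i in (1, 5))
    products[int(variant_id)]['stock'] = int(stock)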
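To get from that dictionary to the "SOLD OUT" / "Add to CART" answer the question asks for, something along these lines might work. It is only a sketch: `stock_report` is a hypothetical helper, the sample data is invented, and it assumes each variant carries the size in its `title` field plus the `stock` key added above.

```python
def stock_report(products):
    """Map each variant's size label to 'in stock', 'sold out' or 'unknown'."""
    report = {}
    for variant in products.values():
        # Fall back to the variant id if no title is present.
        size = variant.get('title', str(variant.get('id')))
        if 'stock' not in variant:
            # No inventory entry was found for this variant.
            report[size] = 'unknown'
        elif variant['stock'] > 0:
            report[size] = 'in stock'
        else:
            report[size] = 'sold out'
    return report


# Invented data shaped like the products dictionary built above.
sample = {
    1001: {'id': 1001, 'title': 'US 8', 'stock': 3},
    1002: {'id': 1002, 'title': 'US 9', 'stock': 0},
    1003: {'id': 1003, 'title': 'US 10'},
}

for size, status in stock_report(sample).items():
    print(size, status)
```

With the real data you would call `stock_report(products)` after the loop above has filled in the `stock` values.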
Gregor