EDIT 3: Adding a TL;DR right at the beginning:

  1. Go to this URL: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389
  2. Click on the "Revisa disponibilidad en tiendas aquí" link a couple of lines below the add to cart button.
  3. A popup appears with inventory information of the product for different stores. In the Network tab of developer tools, I can see this inventory info is received after making a request to https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView. More details on how it's called can be found below.
  4. I need to build a Python script that makes this request without getting the generic error I'm getting right now, which I think is related to sessions, cookies or authorization.

Original Question:

I'm working on an assignment to get the inventory information of a specific product from a website, using Python.

The product url is: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389

I'm quite new to making POST/GET requests, but from studying the Network tab of the browser, I found that I can get the information I need by clicking on the "Revisa disponibilidad en tiendas aquí" link a couple of lines below the add to cart button. In the Network tab I can see this link calls this request:

https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView

If I concatenate the url with the parameters it uses, I get this:

https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003&productId=378219&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=378219&sibstore=12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003&requesttype=ajax&authToken=-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D

And if I run it directly in my browser, I get the information I need:

/* {"InventoryAvailability": [ {"physicalStoreName": "8860", "availableQuantity": "0"}, {"physicalStoreName": "8762", "availableQuantity": "7"}, {"physicalStoreName": "8661", "availableQuantity": "17"}, {"physicalStoreName": "8798", "availableQuantity": "2"}, {"physicalStoreName": "8744", "availableQuantity": "0"}, {"physicalStoreName": "8763", "availableQuantity": "0"}, {"physicalStoreName": "1165", "availableQuantity": "13"}, {"physicalStoreName": "8691", "availableQuantity": "18"}, {"physicalStoreName": "8692", "availableQuantity": "15"}, {"physicalStoreName": "8648", "availableQuantity": "13"}, {"physicalStoreName": "8747", "availableQuantity": "0"}, {"physicalStoreName": "8748", "availableQuantity": "0"}, {"physicalStoreName": "8702", "availableQuantity": "14"}, {"physicalStoreName": "8746", "availableQuantity": "0"}] }*/
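
The response above is JSON wrapped in a /* ... */ comment, so it cannot be passed to a JSON parser as-is. A minimal sketch of unwrapping it, using only the standard library; `parse_wrapped_json` is a hypothetical helper name, and the sample is shortened to the first two stores from the response above:

```python
import json


def parse_wrapped_json(text):
    """Strip the /* ... */ wrapper around the JSON body, then parse it."""
    body = text.strip()
    if body.startswith("/*"):
        body = body[2:]
    if body.endswith("*/"):
        body = body[:-2]
    return json.loads(body)


# Shortened copy of the browser response shown in the question.
sample = (
    '/* {"InventoryAvailability": [ '
    '{"physicalStoreName": "8860", "availableQuantity": "0"}, '
    '{"physicalStoreName": "8762", "availableQuantity": "7"}] }*/'
)

# Map each store name to its quantity as an integer.
inventory = {
    row["physicalStoreName"]: int(row["availableQuantity"])
    for row in parse_wrapped_json(sample)["InventoryAvailability"]
}
print(inventory)  # {'8860': 0, '8762': 7}
```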

Now, I've tried building a Python script to replicate this, but when I run it I get this error response:

{"errorCode": "2540",
        "errorMessage": "CMN3101E El sistema no está disponible debido a \"ErrorCode=2540\n\".",
        "errorMessageKey": "_ERR_GENERIC",
        "errorMessageParam": [{"ErrorCode": "2540"}],
        "correctiveActionMessage": "",
        "correlationIdentifier": "3bff941d:1814799a223:-4d74",
        "exceptionData": {"ErrorCode": "2540"},
        "exceptionType": "1",
        "originatingCommand": "",
        "systemMessage": "El error siguiente se ha producido durante el proceso: \"ErrorCode=2540\n\"."}*/

As I said, this is the first time I try to use requests in a python script, so maybe I'm doing something wrong. I'm thinking it might have something to do with the authtoken parameter, but I'm not sure how to deal with it. Is there a way to pass it from the browser to the script? This is my code. Any suggestions?

import requests

url = 'https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView'
payload = {
    "storeId":"10351",
    "catalogId":"10101",
    "langId":"-5",
    "physicalStoreListIds":"12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003",
    "productId":"378219",
    "fulfilment_type":"Store",
    "productavailable":"true",
    "type":"ItemBean",
    "sibstore":"12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003",
    "requesttype":"ajax",
    "authToken":"-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D"
}

x = requests.post(url, data=payload)

print(x.text)
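
One thing worth checking in the script above: `requests` percent-encodes dictionary values passed through `data=` or `params=`, so values copied from the browser URL in their already-encoded form get encoded one more time. The sketch below shows the effect with the standard library's `urlencode`, which applies the same kind of form encoding. `decode_once` is a hypothetical helper, the store list is shortened, and whether this difference is what triggers error 2540 is not confirmed.

```python
from urllib.parse import unquote, urlencode


def decode_once(copied_params):
    """Undo one level of percent-encoding on every value copied from the browser."""
    return {name: unquote(value) for name, value in copied_params.items()}


# Values as they appear in the working browser URL (store list shortened).
copied = {
    "physicalStoreListIds": "12511%252C12524%252C12539",
    "authToken": "-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D",
}

# Encoding the copied strings as-is escapes every "%" again,
# so %252C goes out on the wire as %25252C.
as_copied = urlencode(copied)
print(as_copied)

# Decoding once first lets the encoder reproduce the browser's strings.
decoded_first = urlencode(decode_once(copied))
print(decoded_first)
```

With the decoded values, the encoded output matches the query string that works in the browser; with the copied values it does not.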

EDIT: It seems my question is not clear enough. Sorry about that. So here goes a summary with more detail:

  • I need to build a Python script to somehow get the inventory of the product in this URL: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389

  • In the product page, if I click on the "Revisa disponibilidad en tiendas aquí" link almost under the add to cart button, a popup appears with the inventory of the product for all the physical stores.

  • From the Network tab in Chrome's developer tools, I can see this popup is filled with the output of this POST request: https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView

  • If I run the same request from my browser by concatenating all the parameters in the payload, I get a response with the inventory information I need. This only works in my browser, and only as long as I don't close it. If I try it in another browser, in incognito mode or in a Python script, I get the error described in the original question. This is a sample payload: storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&productId=208282&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=208282&sibstore=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&displayPopupSiblingstores=&requesttype=ajax&authToken=-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D

  • I made a test script (the code above) that runs the request with the parameters of the payload, but I also get the same error.

  • As I said, I'm quite new to running requests (GET or POST) from scripts. My theory is that this error is due to the authToken parameter in the payload: I need to somehow get a valid authToken (from a browser or a script) and use it to execute my request. Is that correct? If yes, how can I do it? If not, what else can I try?
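
The observation that the URL only works in the browser that loaded the product page suggests the call may depend on session state such as cookies. A minimal sketch of testing that theory: load the product page and call the endpoint through one session object, so cookies from the first response are sent with the second request. Everything here is an assumption and untested against the live site: `fetch_inventory` and `parse_wrapped_json` are hypothetical helpers, `session` is assumed to behave like `requests.Session`, and the mechanize attempt in EDIT 2 also keeps cookies and still failed, so cookies alone may not be enough.

```python
import json

INVENTORY_URL = "https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView"


def parse_wrapped_json(text):
    """Parse a response body that wraps its JSON in /* ... */."""
    body = text.strip()
    if body.startswith("/*") and body.endswith("*/"):
        body = body[2:-2]
    return json.loads(body)


def fetch_inventory(session, product_url, payload):
    """Load the product page, then POST to the inventory endpoint on the same session.

    `session` is assumed to work like requests.Session: cookies set by the
    first response are sent automatically with the second request.
    """
    session.get(product_url, timeout=30)  # picks up whatever cookies the site sets
    response = session.post(INVENTORY_URL, data=payload, timeout=30)
    result = parse_wrapped_json(response.text)
    if "errorCode" in result:
        # Same shape as the error response shown in the question.
        raise RuntimeError("endpoint returned error {}".format(result["errorCode"]))
    return result["InventoryAvailability"]
```

It would be called as fetch_inventory(requests.Session(), product_url, payload), with payload holding the parameters from the Network tab and an authToken taken from the page loaded in that same session.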

EDIT 2:

This is the test code I used to try mechanize. I still got the same error when checking resp2.read():

import time
import mechanize
from bs4 import BeautifulSoup

br = mechanize.Browser()
resp = br.open("https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389")

html_string = (resp.read()).decode("utf-8")

# f = open("resp.html", "a")
# f.write(html_string)
# f.close()
time.sleep(5)
soup = BeautifulSoup(html_string, "html.parser")
tmp = soup.find("div", {"id": "physicalSelectedStoreList"})
refresh_url = tmp["refreshurl"]
print("refresh_url: {}".format(refresh_url))
print("")
authtoken_full = refresh_url.split("authToken=")[1]
print("authtoken_full: {}".format(authtoken_full))
authtoken = authtoken_full.split("&storeId")[0]
print("authtoken: {}".format(authtoken))

req_url = "https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&productId=208282&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=208282&sibstore=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&displayPopupSiblingstores=&requesttype=ajax&authToken="
req_url = req_url + authtoken
print("")
print("Full request: {}".format(req_url))
time.sleep(5)
resp2 = br.open(req_url)
time.sleep(5)
print("")
print(">>>>>>>>>>>>>>> info")
print(resp2.info())
print(">>>>>>>>>>>>>>> read")
print(resp2.read())
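
The two split() calls above only work if authToken is immediately followed by &storeId in the refresh URL. A sketch of an order-independent alternative using the standard library's query-string parser; `extract_auth_token` is a hypothetical helper, the example URL is made up to resemble the ones in the question, and it assumes the refreshurl attribute holds a URL with a "?" query part:

```python
from urllib.parse import parse_qs, urlsplit


def extract_auth_token(refresh_url):
    """Return the percent-decoded authToken query parameter, or None if absent."""
    query = urlsplit(refresh_url).query
    values = parse_qs(query).get("authToken")
    return values[0] if values else None


# Made-up refresh URL shaped like the ones in the question.
example = (
    "https://example.invalid/view"
    "?catalogId=10101&authToken=-1002%2Cabc123%3D&storeId=10351"
)
print(extract_auth_token(example))  # -1002,abc123=
```

Note that parse_qs returns the fully decoded token, while the working browser URL carries it percent-encoded twice (%252C, %253D). How many times it must be re-encoded before being sent is something to verify against the Network tab.
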
Alain

2 Answers

If the URL works as expected in one of your browsers but differently when used in Python, a solution might be to mimic a browser request with your original string by providing a User-Agent header (https://en.wikipedia.org/wiki/User_agent):

import requests
url = 'https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003&productId=378219&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=378219&sibstore=12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003&requesttype=ajax&authToken=-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

response = requests.get(url, headers=headers)
print(response.content)

See also: "How to use Python requests to fake a browser visit a.k.a and generate User Agent?"

  • Didn't work... still getting the same error – Alain Jun 13 '22 at 14:43
  • How about the `mechanize` module? It mimics a browser: https://stackoverflow.com/questions/2567738/browser-simulation-python – Steinn Hauser Magnússon Jun 13 '22 at 14:44
  • Hmm. I ran your code and got 401 - Unauthorized. Did you try x = requests.get(url, params=payload) ? – stahh Jun 14 '22 at 11:50
  • @stahh following the link provided by the question in my browser, I also get 401 - Unauthorized. I therefore don't believe it's a problem produced by the python code. I think it has something to do with the permissions of the user. – Steinn Hauser Magnússon Jun 15 '22 at 12:19
  • @Alain could you please clarify or close the question? For me, even just inserting `https://www.homedepot.com.mx/` into my browser returns a `401 - Unauthorized`. This makes the larger payload request quite hard to debug – Steinn Hauser Magnússon Jun 15 '22 at 12:28
  • @SteinnHauserMagnusson you shouldn't be getting a 401 error when opening the homepage of the store (https://www.homedepot.com.mx/) in your browser... it's a big webstore and it should be open to the whole world without any kind of permissions... Where you should be getting an error is when calling https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView with all the parameters. As I said in my question, I think the problem is with the authToken parameter, which I need to get somehow and pass to the request. At least that's my theory... – Alain Jun 15 '22 at 16:36
  • Yeah, not sure why I'm getting 401 issues there. It's even happening for (homedepot.com) alone. Either way, did the `mechanize` module help? – Steinn Hauser Magnússon Jun 15 '22 at 17:05
  • With mechanize I was able to open the product page, and from the code get the authtoken, then I tried to run the request using the code I posted in my question, but updating the authtoken parameter from what I got from mechanize, but it didn't work... still got the same error. Not sure why... I edited my question with the code I used to test this. – Alain Jun 15 '22 at 19:52
  • Strange. Not sure I can help you more with this unfortunately since I'm still getting 401-Unauthorized for the base URL. Must be a regional thing. Good luck! – Steinn Hauser Magnússon Jun 16 '22 at 08:21

I can give you a hacky way of doing this: install Postman on your local machine and hit the endpoint you want to get the data from. Once you confirm that you get the right response back, use Postman's option to convert the call to Python, Node, cURL, etc. It's the easiest way to make sure your call works, and you can then switch to any language!

Parachute
  • Not sure how to do this in Postman... If I directly paste the POST request, with its payload parameters, into Postman and execute it, I get the same error as a response... – Alain Jun 16 '22 at 23:02