EDIT 3: Adding a TL;DR right at the beginning:
- Go to this URL: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389
- Click on the "Revisa disponibilidad en tiendas aquí" link a couple of lines below the add to cart button.
- A popup appears with inventory information for the product at different stores. In the Network tab of the developer tools, I can see this inventory info is received after making a request to https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView. More details on how it's called can be found below.
- I need to build a Python script that makes this request without getting the generic error I'm getting right now, which I think is related to sessions, cookies or authorization.
Original Question:
I'm working on an assignment to get the inventory information of a specific product from a website, using Python.
The product url is: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389
I'm quite new to making POST/GET requests, but from studying the information in the Network tab of the browser, I found that I can get the information I need by clicking on the "Revisa disponibilidad en tiendas aquí" link a couple of lines below the add to cart button. In the Network tab I can see this link triggers this request:
https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView
If I concatenate the url with the parameters it uses, I get this:
https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003&productId=378219&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=378219&sibstore=12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003&requesttype=ajax&authToken=-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D
And if I run it directly in my browser, I get the information I need:
/* {"InventoryAvailability": [ {"physicalStoreName": "8860", "availableQuantity": "0"}, {"physicalStoreName": "8762", "availableQuantity": "7"}, {"physicalStoreName": "8661", "availableQuantity": "17"}, {"physicalStoreName": "8798", "availableQuantity": "2"}, {"physicalStoreName": "8744", "availableQuantity": "0"}, {"physicalStoreName": "8763", "availableQuantity": "0"}, {"physicalStoreName": "1165", "availableQuantity": "13"}, {"physicalStoreName": "8691", "availableQuantity": "18"}, {"physicalStoreName": "8692", "availableQuantity": "15"}, {"physicalStoreName": "8648", "availableQuantity": "13"}, {"physicalStoreName": "8747", "availableQuantity": "0"}, {"physicalStoreName": "8748", "availableQuantity": "0"}, {"physicalStoreName": "8702", "availableQuantity": "14"}, {"physicalStoreName": "8746", "availableQuantity": "0"}] }*/
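Note that the body above is not bare JSON: it is wrapped in a `/* ... */` comment, so it cannot be passed to a JSON parser as-is. A minimal sketch of unwrapping it, assuming the wrapper always has this exact shape (the helper name `parse_inventory` is made up for illustration, and the sample is shortened from the response above):

```python
import json


def parse_inventory(body: str) -> dict:
    """Strip an optional /* ... */ wrapper and decode the JSON inside it."""
    text = body.strip()
    if text.startswith("/*"):
        text = text[2:]
    if text.endswith("*/"):
        text = text[:-2]
    return json.loads(text)


# Shortened copy of the response shown above
sample = (
    '/* {"InventoryAvailability": [ '
    '{"physicalStoreName": "8860", "availableQuantity": "0"}, '
    '{"physicalStoreName": "8762", "availableQuantity": "7"}] }*/'
)

data = parse_inventory(sample)

# Map store name -> quantity as an integer
stock = {
    row["physicalStoreName"]: int(row["availableQuantity"])
    for row in data["InventoryAvailability"]
}
print(stock)
```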
Now, I've tried building a python script to replicate this, but when I run it I get this error response:
{"errorCode": "2540",
"errorMessage": "CMN3101E El sistema no está disponible debido a \"ErrorCode=2540\n\".",
"errorMessageKey": "_ERR_GENERIC",
"errorMessageParam": [{"ErrorCode": "2540"}],
"correctiveActionMessage": "",
"correlationIdentifier": "3bff941d:1814799a223:-4d74",
"exceptionData": {"ErrorCode": "2540"},
"exceptionType": "1",
"originatingCommand": "",
"systemMessage": "El error siguiente se ha producido durante el proceso: \"ErrorCode=2540\n\"."}*/
As I said, this is the first time I've used requests in a Python script, so maybe I'm doing something wrong. I'm thinking it might have something to do with the authToken parameter, but I'm not sure how to deal with it. Is there a way to pass it from the browser to the script? This is my code. Any suggestions?
import requests

url = 'https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView'

# Form parameters copied from the request shown in the browser's Network tab
payload = {
    "storeId": "10351",
    "catalogId": "10101",
    "langId": "-5",
    "physicalStoreListIds": "12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003",
    "productId": "378219",
    "fulfilment_type": "Store",
    "productavailable": "true",
    "type": "ItemBean",
    "sibstore": "12511%252C12524%252C12539%252C12542%252C12552%252C12583%252C12591%252C12592%252C12598%252C12605%252C12613%252C12614%252C12616%252C14003",
    "requesttype": "ajax",
    "authToken": "-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D"
}

x = requests.post(url, data=payload)
print(x.text)
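One thing worth checking in the script above: the values were copied in their already URL-encoded form (`%252C`), and as far as I understand, requests form-encodes a dict passed as `data` again, much like `urllib.parse.urlencode` does. The sketch below uses only the standard library to compare what would go on the wire in each case. The two-store value is shortened for illustration, and whether this difference is what triggers error 2540 is only an assumption.

```python
from urllib.parse import unquote, urlencode

# Shortened from the captured payload, exactly as it appears in the Network tab
captured = "12511%252C12524"

# Passing the captured text unchanged: every "%" is escaped once more
sent_as_is = urlencode({"physicalStoreListIds": captured})
print(sent_as_is)

# Decoding one level first reproduces the string the browser sent
sent_decoded = urlencode({"physicalStoreListIds": unquote(captured)})
print(sent_decoded)
```

If this is the cause, the dict values would need to hold `12511%2C12524...` rather than `12511%252C12524...`, and the same would apply to `sibstore` and `authToken`.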
EDIT: It seems my question is not clear enough. Sorry about that. So here goes a summary with more detail:
I need to build a Python script to somehow get the inventory of the product in this URL: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389
On the product page, if I click on the "Revisa disponibilidad en tiendas aquí" link almost under the add to cart button, a popup appears with the inventory of the product for all the physical stores.
From the Network tab in Chrome's developer tools, I can see this popup is filled with the output of this POST request: https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView
If I run the same request from my browser by concatenating all the parameters in the payload, I get a response with the inventory information I need. This only works in my browser, as long as I don't close it. If I try in another browser, in incognito mode or in a Python script, I get the error described in the original question. This is a sample payload:
storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&productId=208282&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=208282&sibstore=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&displayPopupSiblingstores=&requesttype=ajax&authToken=-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D
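For reference, a captured payload string like this can be turned into a dict of decoded values with `urllib.parse.parse_qsl`, rather than retyping each parameter by hand. This is a minimal sketch: the `raw` string is shortened from the sample above (fewer store IDs and parameters), and it assumes the captured text is the form-encoded body exactly as sent.

```python
from urllib.parse import parse_qsl

# Shortened copy of the captured form body
raw = (
    "storeId=10351&catalogId=10101&langId=-5"
    "&physicalStoreListIds=12505%252C12521"
    "&productId=208282&displayPopupSiblingstores=&requesttype=ajax"
    "&authToken=-1002%252Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%253D"
)

# keep_blank_values preserves empty parameters such as displayPopupSiblingstores
payload = dict(parse_qsl(raw, keep_blank_values=True))

for name, value in payload.items():
    print(f"{name} = {value!r}")
```

After one level of decoding, the store list still contains a literal `%2C`, which suggests the page encodes those values once itself before the form is submitted.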
I made a test script (code above) that tries to run the request, passing the parameters of the payload, but I get the same error.
As I said, I'm quite new to running requests (GET or POST) from scripts. My theory is that this error is due to the authToken parameter in the payload: I need to somehow get a valid authToken (from a browser, or a script) and use it to execute my request. But I'm not sure. Is that correct? If yes, how can I do it? If not, what else can I try?
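Since the request only works in the browser session that loaded the product page, one thing to try is keeping cookies between the page load and the POST with `requests.Session`. The sketch below is untested against the site and rests on several assumptions: that the session cookies (rather than only the token) are what the server checks, that the token should be passed decoded by one level, and that the `Referer` and `X-Requested-With` headers help. The helper names `build_payload` and `fetch_inventory` are made up for illustration.

```python
PRODUCT_URL = (
    "https://www.homedepot.com.mx/banos/accesorios-para-bano/"
    "juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389"
)
INVENTORY_URL = "https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView"


def build_payload(product_id, store_ids, auth_token):
    """Build the form fields seen in the Network tab, decoded by one level."""
    # The captured value keeps a literal "%2C" between store IDs after decoding
    stores = "%2C".join(store_ids)
    return {
        "storeId": "10351",
        "catalogId": "10101",
        "langId": "-5",
        "physicalStoreListIds": stores,
        "productId": product_id,
        "fulfilment_type": "Store",
        "productavailable": "true",
        "type": "ItemBean",
        "catalogEntryIdToUse": product_id,
        "sibstore": stores,
        "requesttype": "ajax",
        "authToken": auth_token,
    }


def fetch_inventory(payload):
    """Load the product page first so the POST reuses the same cookies."""
    import requests  # third-party; imported here so build_payload works without it

    with requests.Session() as session:
        session.headers["User-Agent"] = "Mozilla/5.0"  # assumption: any browser-like UA
        session.get(PRODUCT_URL, timeout=30)  # collects the session cookies
        resp = session.post(
            INVENTORY_URL,
            data=payload,
            headers={"Referer": PRODUCT_URL, "X-Requested-With": "XMLHttpRequest"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.text
```

A call would look like `fetch_inventory(build_payload("378219", ["12511", "12524"], token))`, where `token` is a value taken from the same session's product page rather than one copied from a different browser.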
EDIT 2:
Test code I used to try mechanize. I still got the same error when checking resp2.read():
import time
import mechanize
from bs4 import BeautifulSoup
br = mechanize.Browser()
resp = br.open("https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389")
html_string = (resp.read()).decode("utf-8")
# f = open("resp.html", "a")
# f.write(html_string)
# f.close()
time.sleep(5)
soup = BeautifulSoup(html_string, "html.parser")
tmp = soup.find("div", {"id": "physicalSelectedStoreList"})
refresh_url = tmp["refreshurl"]
print("refresh_url: {}".format(refresh_url))
print("")
authtoken_full = refresh_url.split("authToken=")[1]
print("authtoken_full: {}".format(authtoken_full))
authtoken = authtoken_full.split("&storeId")[0]
print("authtoken: {}".format(authtoken))
req_url = "https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&productId=208282&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=208282&sibstore=12505%252C12521%252C12554%252C12555%252C12565%252C12567%252C12578%252C12585%252C12609%252C14503&displayPopupSiblingstores=&requesttype=ajax&authToken="
req_url = req_url + authtoken
print("")
print("Full request: {}".format(req_url))
time.sleep(5)
resp2 = br.open(req_url)
time.sleep(5)
print("")
print(">>>>>>>>>>>>>>> info")
print(resp2.info())
print(">>>>>>>>>>>>>>> read")
print(resp2.read())
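The two `split` calls in the script above depend on `storeId` being the parameter that follows `authToken` in the refresh URL. A slightly more robust sketch uses `urllib.parse` to pull the parameter out by name. The sample URL and token here are hypothetical, and note that `parse_qs` decodes the value by one level, so the result may need re-encoding depending on how it is sent afterwards.

```python
from urllib.parse import parse_qs, urlsplit


def extract_auth_token(refresh_url: str) -> str:
    """Return the authToken query parameter, decoded by one level."""
    query = urlsplit(refresh_url).query
    values = parse_qs(query).get("authToken")
    if not values:
        raise ValueError("authToken not found in refresh URL")
    return values[0]


# Hypothetical refresh URL with a made-up token, for illustration only
sample_url = "https://www.example.com/SomeView?authToken=-1002%2Cabc123%3D&storeId=10351"
token = extract_auth_token(sample_url)
print(token)
```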