For some reason, when I use the Python library 'requests' to send a GET request for a website's HTML, it doesn't return the full HTML.
What is happening?
import re
import requests
url = 'https://www.aliexpress.com/item/Dragon-Ball-Z-Mug-SON-Goku-Mug-Hot-Changing-Color-Cups-Heat-Reactive-Mugs-and-Cups/32649664569.html'
mess = requests.get(url)
print(mess.text, '\n', '_'*20)
approved = []
images = re.findall(r'(?<=src=")[a-zA-Z0-9 \/\\,._-]+(?=")', mess.text)
for image in images:
    print(image)
    base, ext = image.rsplit('.', 1)
    if ext == 'png' or ext == 'jpg' or ext == 'JPG':
        approved.append(image)
Output:
//u.alicdn.com/js/aplus_ae.js
//i.alicdn.com/ae-header/20170208145626/buyer/front/ae-header.js
This picture shows that the page contains an 'img' tag whose 'src' attribute points to a jpg. But for some reason, that tag is not captured in the output.
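
To narrow this down, here is a minimal diagnostic sketch rather than a confirmed explanation. It compares the regex above with the standard library's html.parser on a small HTML string. The sample_html value is a hypothetical stand-in for mess.text, not the real AliExpress markup, and ImgSrcCollector and img_sources are names made up for illustration. One thing it does show is that the character class in the regex has no ':', so any src that starts with 'https://' can never match.

```python
import re
from html.parser import HTMLParser

# The pattern from the question. Its character class has no ':',
# so a src value such as "https://..." cannot match.
SRC_PATTERN = r'(?<=src=")[a-zA-Z0-9 \/\\,._-]+(?=")'


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in the fed HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == 'img':
            for name, value in attrs:
                if name == 'src' and value:
                    self.sources.append(value)


def img_sources(html):
    """Return the src values of all <img> tags found in html."""
    parser = ImgSrcCollector()
    parser.feed(html)
    return parser.sources


# Hypothetical stand-in for mess.text (assumption, not the real page).
sample_html = (
    '<script src="//u.alicdn.com/js/aplus_ae.js"></script>'
    '<img src="https://example.com/img/mug.jpg">'
    '<img src="//example.com/img/cup.png">'
)

regex_hits = re.findall(SRC_PATTERN, sample_html)
parser_hits = img_sources(sample_html)

print(regex_hits)   # the https:// image is missing here
print(parser_hits)  # both <img> sources, and no <script> source
```

Running img_sources(mess.text) on the real response would show whether the 'img' tag is in the downloaded HTML at all. If it is not, one possibility is that the browser adds it with JavaScript after the page loads, since requests only downloads the raw response and does not execute scripts.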