I'm writing my first web scraper in Python, and I'm trying to get the product title and price from an AliExpress product page. I'm new to this, so sorry if the question is obvious, but the solutions I've tried from other posts on similar problems haven't worked in my case. I'm using XPath to target the HTML elements, and I copied the XPath expressions from Chrome with the Inspect -> Copy XPath tool. They don't behave the way they did on other websites: the tree.xpath calls keep returning empty lists. Through trial and error I got the title to work, because the text query seems to return a list of all the text on the entire page, and the title is at index 3 of that list. I can't find the index of the price, though, and I would also like to know the right way to do this. Here is my code:
import requests
from lxml import html
url = 'https://www.aliexpress.com/item/4000203338045.html?spm=a2g0o.detail.1000060.1.77ce75e1YttKZb&gps-id=pcDetailBottomMoreThisSeller&scm=1007.13339.146401.0&scm_id=1007.13339.146401.0&scm-url=1007.13339.146401.0&pvid=662e2a50-e8d2-4ce3-b66e-70afff126070'
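page = requests.get(url)
tree = html.fromstring(page.content)

# Title: this XPath matches a div; the text query below was found by trial and error
title = tree.xpath('//*[@id="root"]/div/div[1]/div/div[2]/div[1]')[0]
# This seems to return all the text on the page, not just the title's; the title is at index 3
title_text = title.xpath('///text()')[3]
print('Title:', title)
print('Title text:', title_text)

# Price: XPath copied from Chrome; this returns an empty list
price = tree.xpath('//*[@id="root"]/div/div[2]/div/div[2]/div[4]/div[1]/span')
print('Price:', price)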
And here is the output:
Title: <Element div at 0x3f113f0>
Title text: Bluedio T elf 2 Bluetooth earphone TWS wireless earbuds waterproof Sports Headset Wireless Earphone in ear with charging box-in Phone Earphones & Headphones from Consumer Electronics on AliExpress
Price: []
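One thing I have not ruled out: Chrome copies the XPath from the live DOM, which can include elements built by JavaScript, while requests only receives the HTML the server sends and does not run any scripts. Below is a minimal sketch of a check I could run to see how much markup actually sits under the `root` element in the raw response. It uses only the standard library. `SAMPLE_HTML`, `RootInspector`, and `inspect_html` are hypothetical names of mine, and the sample markup is a made-up extreme case with an empty `root`, not the real AliExpress response (in my case the title XPath does match a div, so the real page evidently has at least some elements there).

```python
from html.parser import HTMLParser

# Made-up stand-in for the HTML that requests.get() receives (an assumption,
# not copied from AliExpress): the root container is empty until scripts run.
SAMPLE_HTML = """
<html>
  <head><title>Example product on AliExpress</title></head>
  <body>
    <div id="root"></div>
    <script>/* JavaScript that would build the page in a browser */</script>
  </body>
</html>
"""

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class RootInspector(HTMLParser):
    """Counts elements nested inside the element with a given id
    and collects the text of the <title> tag."""

    def __init__(self, target_id="root"):
        super().__init__()
        self.target_id = target_id
        self.depth = 0          # 0 means we are outside the target element
        self.descendants = 0    # elements found inside the target element
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        if self.depth > 0:
            self.descendants += 1
            if tag not in VOID_TAGS:
                self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def inspect_html(html_text, target_id="root"):
    """Return (number of elements inside the target, page title text)."""
    parser = RootInspector(target_id)
    parser.feed(html_text)
    parser.close()
    return parser.descendants, parser.title.strip()


count, page_title = inspect_html(SAMPLE_HTML)
print("Elements inside #root:", count)
print("Page title:", page_title)
```

With my real code I would pass `page.text` instead of `SAMPLE_HTML`. If the count is much smaller than what Chrome's inspector shows under `root`, that would suggest the copied XPath points at elements that only exist after JavaScript has run. The sketch assumes reasonably well-nested HTML and does not try to recover from unclosed tags.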
I appreciate your help!