XPath getting empty list

Question

I'm trying to get this number (circled in red) from this website, https://www.banxico.org.mx/:

I have this code to get it, but I get an empty list:

import requests
from lxml import html

linktc = 'https://www.banxico.org.mx/'
pagetc = requests.get(linktc)
tree = html.fromstring(pagetc.content)
tipocambio = tree.xpath('//div[@id="vFIX"]//span[@class="valor"]//text()')
print("TC: ", tipocambio)

Does anyone know what the problem is?

This is the full XPath for me; I suggest you use it: `/html/body/div[2]/div[1]/div[3]/div/div[1]/div[1]/div/div/div[5]/div/span[2]` — Celius Stingher, Jun 24 '20 at 23:40

Gilles Quénot · Answer 1 · 2020-06-25T00:15:18.657

The issue here is that you need a JavaScript-capable library. The value you want is generated with JS, so it is not in the HTML that requests downloads.
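A quick way to check this kind of problem is to query for the element and for its text separately. The sketch below runs against a hypothetical static snippet (the `STATIC_HTML` string and the `diagnose` helper are illustrative assumptions, not the site's real markup), so it only shows the idea:

```python
from lxml import html

# Hypothetical static markup: what requests might receive before any
# JavaScript runs. The real page's markup is an assumption here.
STATIC_HTML = """
<html><body>
  <div id="vFIX"><span class="valor"></span></div>
  <script>/* a script would fill the span in a real browser */</script>
</body></html>
"""

def diagnose(source: str) -> str:
    """Report why the text() query comes back empty for this source."""
    tree = html.fromstring(source)
    spans = tree.xpath('//div[@id="vFIX"]//span[@class="valor"]')
    texts = tree.xpath('//div[@id="vFIX"]//span[@class="valor"]//text()')
    if texts:
        return "value present in static HTML"
    if spans:
        return "element present but empty (likely filled by JavaScript)"
    return "element missing from static HTML"

print(diagnose(STATIC_HTML))
```

If the element is there but empty, or missing altogether, no XPath tweak will help; the value has to come from a rendered page or from whatever request the page itself makes.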

You can instead use Puppeteer via Node.js:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        headless: true,
    });

    const page = await browser.newPage();

    // UA
    await page.setUserAgent('Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0')

    // open main URL
    await page.goto('https://www.banxico.org.mx/', { waitUntil: 'networkidle2' });

    // wait for wanted selector to pop up
    await page.waitForXPath('//div[@id="vFIX"]//span[@class="valor"]');

    // retrieve text content
    var element = await page.$x('//div[@id="vFIX"]//span[@class="valor"]/text()');
    let text = await page.evaluate(element => element.textContent, element[0]);

    console.log(text);

    await browser.close();
})();

Output

22.6662

Or see "Web-scraping JavaScript page with Python" too.

[Accepted answer](https://stackoverflow.com/a/62565881/290085) is a fine work-around (+1 too), but this technique is more generally applicable to sites with JavaScript-generated output. — kjhughes, Jun 25 '20 at 00:17
Yes, I agree on both points; you don't always have JSON (or other data) accessible like this. This solution is more generic, usable in all situations I know of. — Gilles Quénot, Jun 25 '20 at 00:20

score 2 · Accepted Answer · answered Jun 25 '20 at 00:04

JavaScript is needed to display the value. You could use Selenium to get it, or retrieve the data directly from the JSON loaded in the background:

import json
import urllib.request

# The page loads this JSON in the background; read the value from it directly.
with urllib.request.urlopen("https://www.banxico.org.mx/canales/singleFix.json") as response:
    data = json.loads(response.read().decode())
    print(data['valor'])

Output: 22.6662
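For repeated use, the same request could be wrapped with a timeout and a separate parsing step. This is only a sketch: `parse_fix`, `fetch_fix` and the sample payload are hypothetical names and data, and it assumes the `valor` field converts cleanly to a float.

```python
import json
import urllib.request

# URL taken from the snippet above.
FIX_URL = "https://www.banxico.org.mx/canales/singleFix.json"

def parse_fix(payload: str) -> float:
    """Extract the 'valor' field from the JSON text and convert it to float."""
    data = json.loads(payload)
    return float(data["valor"])

def fetch_fix(url: str = FIX_URL, timeout: float = 10.0) -> float:
    """Download the JSON document and return the rate as a float."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_fix(response.read().decode("utf-8"))

# Offline check with a hypothetical payload shaped like the output above.
sample = '{"valor": "22.6662"}'
print(parse_fix(sample))
```

Keeping the parsing apart from the download makes it possible to test against a saved response without hitting the site; call `fetch_fix()` for the live value.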

Alternative: get the value from elsewhere.

from lxml import html
import requests

url = 'https://www.banxico.org.mx/SieInternet/consultarDirectorioInternetAction.do?sector=6&accion=consultarCuadro&idCuadro=CF102&locale=es'
r = requests.get(url)
tree = html.fromstring(r.content)
value = tree.xpath('//tr[@id="nodo_0_0_0"]/td[7]//td[last()]')[0].text
print(value.strip())

Output: 22.6662
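Note that indexing with `[0]` raises an IndexError when the XPath matches nothing, for example if the table layout changes. A small guard, sketched here with a hypothetical `first_text` helper and made-up sample markup (the real table is more deeply nested), returns None instead:

```python
from lxml import html

def first_text(tree, xpath):
    """Return the stripped text of the first element matched, or None.

    Assumes the XPath selects elements, not text() nodes.
    """
    matches = tree.xpath(xpath)
    if not matches:
        return None
    text = matches[0].text
    return text.strip() if text else None

# Hypothetical markup, used only to exercise the helper.
SAMPLE = """
<html><body><table>
  <tr id="nodo_0_0_0"><td>FIX</td><td> 22.6662 </td></tr>
</table></body></html>
"""

tree = html.fromstring(SAMPLE)
print(first_text(tree, '//tr[@id="nodo_0_0_0"]/td[last()]'))  # matching row
print(first_text(tree, '//tr[@id="missing"]/td[last()]'))     # no match
```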

Good catch, but not reusable for every JS-rendered website ;) +1 from here — Gilles Quénot, Jun 25 '20 at 00:06
