Can't get text values using XPATH in python

Question

I'm trying to parse currency rates from this bank website. My code:

import requests
import time
import logging
from retrying import retry
from lxml import html

logging.basicConfig(filename='info.log', format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

@retry(wait_fixed=5000)
def fetch_data_from_nb_ved_ru():
    try:
        page = requests.get('http://www.nbu.com/exchange_rates')
        #print page.text
        tree = (html.fromstring(page.text))
        #fetched_ved_usd_buy = tree.xpath('//div[@class="exchangeRates"]/table/tbody/tr[5]/td[5]')
        fetched_ved_usd_buy = tree.xpath('/html/body/div[1]/div//div[7]/div/div/div[1]//text()')
        print fetched_ved_usd_buy
        fetched_ved_usd_sell = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[6]/td[6]/text()')).strip()
        fetched_ved_eur_buy = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[5]/text()')).strip()
        fetched_ved_eur_sell = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[6]/text()')).strip()
        fetched_cb_eur = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[4]/text()')).strip()
        fetched_cb_rub = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[18]/td[4]/text()')).strip()
        fetched_cb_usd = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[6]/td[4]/text()')).strip()
    except:
        logging.warning("NB VED UZ fetch failed")
        raise IOError("NB VED UZ  fetch failed")
    return fetched_ved_usd_buy, fetched_ved_usd_sell, fetched_cb_usd, fetched_ved_eur_buy, fetched_ved_eur_sell,\
        fetched_cb_eur, fetched_cb_rub

while True:
    f = open('values_uzb.txt', 'w')
    ved_usd_buy, ved_usd_sell, cb_usd, ved_eur_buy, ed_eur_sell, cb_eur, cb_rub = fetch_data_from_nb_ved_ru()
    f.write(str(ved_usd_buy)+'\n'+str(ved_usd_sell)+'\n'+str(cb_usd)+'\n'+str(ved_eur_buy)+'\n'+str(ed_eur_sell)+'\n'
            + str(cb_eur)+'\n'+str(cb_rub))

    f.close()
    time.sleep(120)

But it always returns an empty result. However, if I do print page.text, I can see that the values are in their places. I got that XPath from Firebug, and Chrome gives the same XPath. I also tried to construct my own XPath, //div[@class="exchangeRates"]/table/tbody/tr[5]/td[5], but it turns out not to work either.

Any suggestions? Thanks.

Try without the tbody in the xpath. – Anand S Kumar Aug 27 '15 at 13:30
Looks like the nbu site is down – heinst Aug 27 '15 at 13:36
@AnandSKumar, removed tbody, result is the same :( – cre8eve Aug 27 '15 at 13:47

score 4 · Accepted Answer · edited May 23 '17 at 10:26

I am not certain what you are looking for exactly, but this works:

tree.xpath("/html/body/div[1]/div[7]/div/div/div[1]//text()")

As for starting with the class exchangeRates, I found by using tree.xpath("//div[@class='exchangeRates']/table")[0].getchildren() that the table has no tbody child, even though browsers say there is one. Browsers insert tbody when they build the DOM, while lxml parses the HTML as it was served. See this SO question for an explanation. Removing tbody from your original xpath does work. However, the cell you chose (td[5]) is empty, so the query returns []. Try

tree.xpath("//div[@class='exchangeRates']/table/tr[5]/td[4]//text()")
# ['706.65']

or

tree.xpath("//div[@class='exchangeRates']/table/tr[6]/td[5]//text()")
# ['2638.00']
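
To make the tbody point concrete, here is a minimal sketch that runs against a small inline snippet rather than the live site. The markup in SAMPLE is a hypothetical, simplified stand-in for the real page, so the row and column positions are assumptions, not the bank's actual layout.

```python
from lxml import html

# Hypothetical, simplified markup; the real page layout may differ.
SAMPLE = """
<html><body>
  <div class="exchangeRates">
    <table>
      <tr><td>USD</td><td>2638.00</td></tr>
    </table>
  </div>
</body></html>
"""

tree = html.fromstring(SAMPLE)
table = tree.xpath("//div[@class='exchangeRates']/table")[0]

# lxml keeps the markup as served, so the table's direct children are tr elements.
child_tags = [child.tag for child in table]

# The browser-style path (with tbody) finds nothing in the lxml tree.
with_tbody = tree.xpath("//div[@class='exchangeRates']/table/tbody/tr/td[2]/text()")

# Dropping tbody matches the cell.
without_tbody = tree.xpath("//div[@class='exchangeRates']/table/tr/td[2]/text()")

print(child_tags)     # ['tr']
print(with_tbody)     # []
print(without_tbody)  # ['2638.00']
```

If the served HTML does contain an explicit tbody, the first path would match instead, so checking the table's children as above is a quick way to tell which form a given page needs.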

answered Aug 27 '15 at 14:28

James Pringle

Thanks a lot, it worked great! But how did you find the path `"/html/body/div[1]/div[7]/div/div/div[1]//text()"` ? – cre8eve Aug 27 '15 at 15:13
Just trial and error. Started with "/html" and added a child one at a time until I found where your first guess stopped working. – James Pringle Aug 28 '15 at 14:53
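
The trial-and-error approach described in this comment can be automated roughly as follows. This is only a sketch: the helper name longest_matching_prefix and the sample markup are made up for illustration, and the naive split on "/" assumes a plain absolute path made of child steps (no "//" and no "/" inside predicates).

```python
from lxml import html


def longest_matching_prefix(tree, absolute_xpath):
    """Extend the path one step at a time.

    Returns (longest prefix that matches, first prefix that fails).
    The second item is None when the whole path matches.
    """
    steps = [step for step in absolute_xpath.split("/") if step]
    matched = None
    for i in range(1, len(steps) + 1):
        candidate = "/" + "/".join(steps[:i])
        if tree.xpath(candidate):
            matched = candidate
        else:
            return matched, candidate
    return matched, None


# Hypothetical markup, used only to exercise the helper.
SAMPLE = (
    "<html><body><div>"
    "<div>a</div>"
    "<div><table><tr><td>1</td></tr></table></div>"
    "</div></body></html>"
)
tree = html.fromstring(SAMPLE)

ok, failed = longest_matching_prefix(
    tree, "/html/body/div[1]/div[2]/table/tbody/tr/td"
)
print(ok)      # /html/body/div[1]/div[2]/table
print(failed)  # /html/body/div[1]/div[2]/table/tbody
```

The first failing prefix points at the step where the browser-generated path and the lxml tree diverge.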

Pentux · Answer 2 · 2015-08-27T14:42:00.980

Try this xpath, replacing NUMBER OF TR with the index of the row you need:

tree.xpath('//div[@class="exchangeRates"]//tr[NUMBER OF TR]/td[5]/text()')

Another thing: I think your code would be more robust if you looked rows up by currency code instead of by position:

trs = tree.xpath('//div[@class="exchangeRates"]//tr')
for tr in trs:
    # xpath() returns a list of text nodes, so join them before stripping
    currency_code = ''.join(tr.xpath('./td[7]/text()')).strip()

    if currency_code == 'USD':
        usd_buy = ''.join(tr.xpath('./td[5]/text()')).strip()
        usd_sell = ''.join(tr.xpath('./td[6]/text()')).strip()
        usd_cb = ''.join(tr.xpath('./td[4]/text()')).strip()

Continue in the same way for the other currencies you need.

This is quick code; if you need more details, please reply.
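
The same idea can be generalized so that every wanted currency is collected in one pass. The sketch below is an illustration under assumptions: the function name rates_by_currency, the sample markup, and the numbers in it are invented, and the column positions (4 = central bank rate, 5 = buy, 6 = sell, 7 = currency code) are taken from this answer rather than verified against the site.

```python
from lxml import html


def rates_by_currency(tree, wanted=("USD", "EUR")):
    """Map currency code -> {'cb', 'buy', 'sell'} for the wanted codes."""
    rates = {}
    for tr in tree.xpath('//div[@class="exchangeRates"]//tr'):
        cells = [td.text_content().strip() for td in tr.xpath('./td')]
        if len(cells) < 7:
            continue  # header rows or rows with an unexpected shape
        code = cells[6]  # td[7] in 1-based XPath terms
        if code in wanted:
            rates[code] = {"cb": cells[3], "buy": cells[4], "sell": cells[5]}
    return rates


# Hypothetical markup and values, used only to exercise the function.
SAMPLE = """
<html><body><div class="exchangeRates"><table>
<tr><th>No</th><th>Name</th><th>Units</th><th>CB</th><th>Buy</th><th>Sell</th><th>Code</th></tr>
<tr><td>1</td><td>US dollar</td><td>1</td><td>2580.00</td><td>2600.00</td><td>2638.00</td><td>USD</td></tr>
<tr><td>2</td><td>Euro</td><td>1</td><td>2900.00</td><td>2950.00</td><td>3000.00</td><td>EUR</td></tr>
<tr><td>3</td><td>Yen</td><td>10</td><td>210.00</td><td>212.00</td><td>215.00</td><td>JPY</td></tr>
</table></div></body></html>
"""

rates = rates_by_currency(html.fromstring(SAMPLE))
print(rates["USD"]["buy"])  # 2600.00
print(sorted(rates))        # ['EUR', 'USD']
```

Looking rows up by code means the script keeps working if the bank adds or reorders rows, which a fixed tr[6] index would not survive.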

score 0 · Answer 3 · edited Jul 08 '21 at 05:46

I use the following statement, which runs perfectly fine for me.

ActualValue = driver.find_element_by_xpath("//div/div[2]/div").text

answered May 31 '20 at 17:57

ARB
