3

I'm trying to parse currencies from this bank website. In code:

import requests
import time
import logging
from retrying import retry
from lxml import html

logging.basicConfig(filename='info.log', format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

@retry(wait_fixed=5000)
def fetch_data_from_nb_ved_ru():
try:
    page = requests.get('http://www.nbu.com/exchange_rates')
    #print page.text
    tree = (html.fromstring(page.text))
    #fetched_ved_usd_buy = tree.xpath('//div[@class="exchangeRates"]/table/tbody/tr[5]/td[5]')
    fetched_ved_usd_buy = tree.xpath('/html/body/div[1]/div//div[7]/div/div/div[1]//text()')
    print fetched_ved_usd_buy
    fetched_ved_usd_sell = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[6]/td[6]/text()')).strip()
    fetched_ved_eur_buy = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[5]/text()')).strip()
    fetched_ved_eur_sell = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[6]/text()')).strip()
    fetched_cb_eur = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[4]/text()')).strip()
    fetched_cb_rub = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[18]/td[4]/text()')).strip()
    fetched_cb_usd = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[6]/td[4]/text()')).strip()
except:
    logging.warning("NB VED UZ fetch failed")
    raise IOError("NB VED UZ  fetch failed")
return fetched_ved_usd_buy, fetched_ved_usd_sell, fetched_cb_usd, fetched_ved_eur_buy, fetched_ved_eur_sell,\
    fetched_cb_eur, fetched_cb_rub

while True:
    f = open('values_uzb.txt', 'w')
    ved_usd_buy, ved_usd_sell, cb_usd, ved_eur_buy, ed_eur_sell, cb_eur, cb_rub = fetch_data_from_nb_ved_ru()
               f.write(str(ved_usd_buy)+'\n'+str(ved_usd_sell)+'\n'+str(cb_usd)+'\n'+str(ved_eur_buy)+'\n'+str(ed_eur_sell)+'\n'
        + str(cb_eur)+'\n'+str(cb_rub))

    f.close()
    time.sleep(120)

But it always returns empty string, however if I do print page.text, i can see that the values are on their's places. I got that xpath from firebug. Chrome gives the same xpath. Tried to construct own xpath //div[@class="exchangeRates"]/table/tbody/tr[5]/td[5] but it happens to be not valid to.

Any suggestions? Thanks.

cre8eve
  • 371
  • 1
  • 6
  • 11

3 Answers3

4

I am not certain what you are looking for exactly, but this works:

tree.xpath("/html/body/div[1]/div[7]/div/div/div[1]//text()")

As for starting with the class exchangeRates, I found by using tree.xpath("//div[@class='exchangeRates']/table")[0].getchildren() that there is no tbody child of table, even though browsers say there is. See this SO question for an explanation. Removing tbody from your original xpath does work. However, the one you chose (td[5]) is empty, thus returning []. Try

tree.xpath("//div[@class='exchangeRates']/table/tr[5]/td[4]//text()")
# ['706.65']

or

tree.xpath("//div[@class='exchangeRates']/table/tr[6]/td[5]//text()")
# ['2638.00']
Community
  • 1
  • 1
James Pringle
  • 1,079
  • 6
  • 15
  • Thanks a lot, it worked great! But how did you find the path `"/html/body/div[1]/div[7]/div/div/div[1]//text()"` ? – cre8eve Aug 27 '15 at 15:13
  • 1
    Just trial and error. Started with "/html" and added a child one at a time until I found where your first guess stopped working. – James Pringle Aug 28 '15 at 14:53
1

Try with this xpath:

tree.xpath('//div[@class="exchangeRates"]//tr[NUMBER OF TR]/td[5]/text()')

Another thing... I thing if you put this code you will improve your code:

trs = tree.xpath('//div[@class="exchangeRates"]//tr')
    for tr in trs:
        currency_code = tr.xpath('./td[7]/text()').strip()

        if currency_code=='USD':
            usd_buy = tr.xpath('./td[5]/text()').strip()
            usd_sell = tr.xpath('./td[6]/text()').strip()
            usd_cb = tr.xpath('./td[4]/text()').strip()

And continue with other currency that you need.

It is a quickly code, if you need more details reply please.

Pentux
  • 406
  • 4
  • 13
0

I use the following statement, which runs perfectly fine for me.

ActualValue = driver.find_element_by_xpath("//div/div[2]/div").text
Cody Gray - on strike
  • 239,200
  • 50
  • 490
  • 574
ARB
  • 319
  • 3
  • 6