Extract value in Web scraping with Python Beautiful Soup

Question

How can I extract the value '1.00 TK = 779.8' from the HTML code below?

I tried the code below, but it didn't work:

import requests
from bs4 import BeautifulSoup

page = requests.get(<url>).text

# The page content contains this HTML:
# <span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>

# Parse the downloaded text (the original passed an undefined name, html)
soup = BeautifulSoup(page, 'html.parser')
print(soup.find(id='driveValue').find_next(text=True).strip())

ERROR:

 AttributeError: 'NoneType' object has no attribute 'find_next'

I tried extracting the value using id="driveValue" and it returns None. – itgeek, Nov 11 '20 at 16:03

MendelG · Answer 1 · 2020-11-11T23:31:57.613

0

Use find_next(), which returns the first match:

from bs4 import BeautifulSoup

html = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

soup = BeautifulSoup(html, 'html.parser')
print(soup.find(id='driveValue').find_next(text=True).strip())

Output:

1.00 TK = 779.8
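
The AttributeError in the question means that soup.find(id='driveValue') returned None, so the element was not present in the HTML that was actually parsed. A minimal sketch of a guard against that case follows; the helper name first_text_by_id is hypothetical, and it uses stripped_strings instead of find_next(text=True) to pick the first non-empty text inside the element.

```python
from bs4 import BeautifulSoup


def first_text_by_id(html, element_id):
    """Return the first non-empty text inside the element with this id, or None."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(id=element_id)
    if tag is None:
        # The id is not in this HTML (for example, the page fills it in with JavaScript)
        return None
    # stripped_strings yields each text node with surrounding whitespace removed
    return next(tag.stripped_strings, None)


snippet = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

print(first_text_by_id(snippet, "driveValue"))
print(first_text_by_id("<div id='app'></div>", "driveValue"))
```

If the helper returns None for the text downloaded with requests, the element is most likely added by JavaScript after the page loads.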

Edit: The page is loaded dynamically, so use Selenium to render it before reading the value:

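from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep

URL = "https://www.westernunion.com/us/en/web/send-money/start?SrcCode=12345&ReceiveCountry=IN&SendAmount=100&ISOCurrency=CNY&FundsOut=BA&FundsIn=CreditCard"

# Selenium 4 style: the chromedriver path is passed through a Service object
driver = webdriver.Chrome(service=Service(r"C:\path\to\chromedriver.exe"))
driver.get(URL)
sleep(10)  # crude wait for the JavaScript-rendered content to appear

# Rendered HTML, available for further BeautifulSoup queries
soup = BeautifulSoup(driver.page_source, "html.parser")

# find_element_by_css_selector was removed in Selenium 4; use By.CSS_SELECTOR
price = driver.find_element(By.CSS_SELECTOR, "span.ng-binding.ng-scope").text
print(price)

driver.quit()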

Output:

1.00 USD = 73.9375 Indian Rupee (INR)
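
In the comments, the asker says that only the number (73.9375) is needed. A minimal sketch of pulling the rate out of the printed text with a regular expression follows; the helper name parse_rate is hypothetical, and it assumes the rate is a plain decimal number right after the "=" sign, with no thousands separators.

```python
import re


def parse_rate(text):
    """Return the number that follows '=' as a float, or None if there is none."""
    match = re.search(r"=\s*([0-9]+(?:\.[0-9]+)?)", text)
    return float(match.group(1)) if match else None


print(parse_rate("1.00 USD = 73.9375 Indian Rupee (INR)"))
print(parse_rate("1.00 TK = 779.8"))
```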

edited Nov 11 '20 at 23:31

answered Nov 11 '20 at 16:04

MendelG


I'm getting an error: AttributeError: 'NoneType' object has no attribute 'find_next' – itgeek Nov 11 '20 at 16:13
@itgeek The page is probably loaded dynamically. See [my answer](https://stackoverflow.com/a/64143412/12349734) using `selenium` to scrape a dynamic page. – MendelG Nov 11 '20 at 16:18
I tried this as well: print(soup.find('span',id="driveValue")), and it prints "None" – itgeek Nov 11 '20 at 16:23
Your solution works when I want to extract a value from a string such as "html = ''' 1.00 TK = 779.8Disk Drive Value(DDV) ''' " – itgeek Nov 11 '20 at 16:41
Here I don't have a string. I posted an HTML snippet from the page content. I get the error below when I run your solution for my use case: AttributeError: 'NoneType' object has no attribute 'find_next' – itgeek Nov 11 '20 at 16:42
@itgeek please share the URL – MendelG Nov 11 '20 at 19:14
URL: https://www.westernunion.com/us/en/web/send-money/start?SrcCode=12345&ReceiveCountry=IN&SendAmount=100&ISOCurrency=CNY&FundsOut=BA&FundsIn=CreditCard – itgeek Nov 11 '20 at 20:32
The id is "smoExchangeRate", and the value that needs to be extracted is "73.9375" – itgeek Nov 11 '20 at 20:32
I'm new to web scraping and have just started learning. :-) – itgeek Nov 11 '20 at 20:35

Thanks, I was aware of the possibilities with Selenium. I was wondering whether I can do it without Selenium. – itgeek Nov 12 '20 at 02:17

score -2 · Answer 2 · answered Nov 11 '20 at 16:07


Hope this helps.

from lxml import etree

txt = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

# etree.fromstring parses the snippet as XML, which works here because it is well-formed
root = etree.fromstring(txt)
for span in root.xpath('//span[contains(@class, "ng-binding ng-scope")]'):
    # .text is the text before the first child element
    print(span.text.strip())

Output:

1.00 TK = 779.8


Samsul Islam


Thanks, Samsul. Is there any way to extract it using the id? – itgeek Nov 11 '20 at 16:09
Why use XML parsing when the OP has the BeautifulSoup tag? – MendelG Nov 11 '20 at 16:09
Isn't it possible using BeautifulSoup? – itgeek Nov 11 '20 at 16:09
Yes, it is possible to extract using the id. I used XML parsing because lxml is an easy-to-use library for processing XML and HTML in Python. – Samsul Islam Nov 11 '20 at 16:16
It works fine when I just use a string, but here I need to read the content of the page and extract the value. – itgeek Nov 11 '20 at 17:50
page = requests.get() soup = BeautifulSoup(page.content, 'html.parser') root = etree.fromstring(soup) print(root) for td in root.xpath('//span[contains(@class, "ng-binding ng-scope")]'): print(td.text) – itgeek Nov 11 '20 at 17:56
error: root = etree.fromstring(soup) File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring File "src/lxml/parser.pxi", line 1895, in lxml.etree._parseMemoryDocument ValueError: can only parse strings – itgeek Nov 11 '20 at 17:57
This may help: https://stackoverflow.com/questions/36449369/python-xpath-lxml-etree-xpathevalerror-invalid-predicate – Samsul Islam Nov 12 '20 at 18:32
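
On the two follow-ups in the comments above: the ValueError arises because etree.fromstring was given a BeautifulSoup object rather than a string, and selecting by id is possible with an XPath attribute test. A minimal sketch, assuming the markup is passed as a string (for example page.text from requests) and parsed with lxml.html, which tolerates real-world HTML; the helper name text_by_id is hypothetical.

```python
from lxml import html as lxml_html


def text_by_id(markup, element_id):
    """Return the leading text of the element with this id, or None if absent."""
    # Pass the raw markup string, not a BeautifulSoup object
    root = lxml_html.fromstring(markup)
    # $eid is an XPath variable, filled in from the keyword argument
    matches = root.xpath("//*[@id=$eid]", eid=element_id)
    if not matches:
        return None
    # .text is the text before the first child element
    return (matches[0].text or "").strip()


markup = (
    "<html><body>"
    '<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope">'
    ' 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'
    "</body></html>"
)

print(text_by_id(markup, "driveValue"))
```

As with BeautifulSoup, this only finds the element if it is in the downloaded HTML; on a dynamically rendered page it returns None.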
