1

I am trying to get from the following

<span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span></div>

the value of data-nodeid. I did the following:

price_nodes = soup.find('span', attrs={'id': 'SkuNumber'})
datanode = price_nodes.select_one('span[data-nodeid]')

But I get None. How can I fix this? Thank you.

petezurich
Maria Georgali

2 Answers

2

If price_nodes is correctly populated,

i.e. price_nodes =

<span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span>

You just have to do this:

datanode = price_nodes.get('data-nodeid')

Full code should be:

from bs4 import BeautifulSoup

html = '<div><span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span></div>'
page = BeautifulSoup(html, 'html.parser')
# find() returns the matching span itself
price_nodes = page.find('span', {'id': 'SkuNumber'})
# get() reads an attribute of that tag
datanode = price_nodes.get('data-nodeid')
print(datanode)
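As for why the original attempt returned None: calling select_one() on a tag searches only that tag's descendants, and the span here contains no nested span. A minimal sketch of the difference, assuming the same markup as above (trimmed to the relevant attributes; the variable names are only illustrative):

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (assumption: other attributes don't matter here)
html = '<div><span id="SkuNumber" data-nodeid="176579">ΚΩΔ. 473768</span></div>'
page = BeautifulSoup(html, 'html.parser')
span = page.find('span', attrs={'id': 'SkuNumber'})

# select_one() on a tag looks only at its descendants, so nothing matches
nested = span.select_one('span[data-nodeid]')

# Running a selector from the document root matches the span itself
matched = page.select_one('span#SkuNumber[data-nodeid]')
node_id = matched['data-nodeid']

print(nested, node_id)
```

So either read the attribute directly from the tag that find() returned, or run the CSS selector against the whole document rather than against the span.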
Maaz
  • Is price_nodes['data-nodeid'] faster than price_nodes.get('data-nodeid')? – Maria Georgali Feb 07 '20 at 13:30
  • As noted here: https://stackoverflow.com/a/36566435/2437141, `get()` seems to be slightly slower (though not by much), but it lets you supply a default value if the attribute is not present, e.g. `get('data-nodeid', 'N/A')` – Maaz Feb 07 '20 at 13:37
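
To illustrate the practical difference between the two lookups discussed in the comments above, here is a small sketch. The attribute name data-missing is hypothetical and chosen only because it does not exist on the element:

```python
from bs4 import BeautifulSoup

html = '<span id="SkuNumber" data-nodeid="176579">ΚΩΔ. 473768</span>'
tag = BeautifulSoup(html, 'html.parser').find('span', id='SkuNumber')

present = tag.get('data-nodeid')            # attribute exists
missing = tag.get('data-missing')           # hypothetical attribute, not on the tag
fallback = tag.get('data-missing', 'N/A')   # same lookup with an explicit default

# Dictionary-style access raises KeyError when the attribute is absent
try:
    tag['data-missing']
    raised = False
except KeyError:
    raised = True
```

In other words, tag['...'] is a reasonable choice when the attribute must be there and a missing one should fail loudly, while get() suits scraping pages where the attribute may be absent.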
1
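from bs4 import BeautifulSoup

html = '<span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span>'
# Name the parser explicitly so bs4 does not warn about guessing one
soup = BeautifulSoup(html, 'html.parser')

# find() returns the span itself; index it like a dict to read an attribute
price_nodes = soup.find('span', attrs={'id': 'SkuNumber'})
print(price_nodes['data-nodeid'])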
Daan Klijn