Python & BeautifulSoup 4 - Unable to get Newegg prices?

Question

I'm attempting to scrape Newegg product pages for prices and I always seem to be running into the same problem - the result is always 'None'.

Here are a few very basic lines of code that work for similar sites such as Amazon:

 import requests
 from bs4 import BeautifulSoup
 data = requests.get('https://www.newegg.com/Product/Product.aspx?Item=N82E16824475015&cm_sp=Homepage_Dailydeal-_-P1_24-475-015-_-03042019')
 soup = BeautifulSoup(data.text, 'html.parser')
 price = soup.find('li', class_='price-current').text.strip()

I'm expecting to get $419.99 as the output, but instead I get None.

When I try to get the product title, I get the desired result. It's only the prices that are giving me this issue. Has anyone had the same issue and how can this be fixed? Thanks in advance.

The web page you're parsing appears to have some dynamically generated content. Try [selenium](https://stackoverflow.com/questions/17540971/how-to-use-selenium-with-python) — Wondercricket, Mar 04 '19 at 21:32
I have tried Selenium and some other things which I was unable to get to work, that's why I had to ask for help. I prefer figuring things out on my own but I was just stuck on this one. I appreciate the quick replies and all the help I got. — kamen1111, Mar 04 '19 at 22:56

score 2 · Accepted Answer · answered Mar 04 '19 at 21:20

You can use an attribute selector to target an element containing that price in its content attribute.

import requests
from bs4 import BeautifulSoup

data = requests.get('https://www.newegg.com/Product/Product.aspx?Item=N82E16824475015&cm_sp=Homepage_Dailydeal-_-P1_24-475-015-_-03042019')
soup = BeautifulSoup(data.content, 'lxml')
price = soup.select_one('[itemprop=price]')['content']
print(price)
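
As a rough offline illustration of the same idea, the sketch below runs the attribute selector against a small inline snippet instead of the live page. The `SAMPLE_HTML` markup and the `extract_price` helper are hypothetical and only assume what this thread suggests: the visible price element is empty in the raw HTML, while an element with `itemprop=price` carries the value in its `content` attribute. It also guards against `select_one` returning `None`, which would otherwise raise an error on the subscript.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup (an assumption, not Newegg's real page):
# the visible price slot is empty in the raw HTML, while the price is
# stored in the content attribute of an itemprop=price element.
SAMPLE_HTML = """
<div id="product">
  <meta itemprop="price" content="419.99">
  <ul><li class="price-current"></li></ul>
</div>
"""


def extract_price(html):
    """Return the content attribute of the itemprop=price element, or None."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one("[itemprop=price]")
    # select_one returns None when nothing matches, so check before indexing.
    if tag is None or not tag.has_attr("content"):
        return None
    return tag["content"]


price = extract_price(SAMPLE_HTML)
missing = extract_price("<p>no price here</p>")
print(price, missing)
```

Returning `None` for a missing element keeps the caller in control of what to do when the page layout changes, rather than failing inside the scraping helper.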

QHarr

FWIW, while this still displays the correct price, it searches for a different tag than the one the OP is looking for, mostly because the web page the OP is scraping is dynamically generated. Still +1 though – Wondercricket Mar 04 '19 at 21:32

@Wondercricket. Thanks. Yup. That is why I went for that. – QHarr Mar 04 '19 at 21:36

Simple and gets the job done; just what I needed for this task. Thank you very much. – kamen1111 Mar 04 '19 at 22:54

score 1 · Answer 2 · answered Mar 04 '19 at 21:22

I like to use the lxml library, as shown below. It lets you query the document with XPath, which is great.

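# Python 3: urllib.request replaces Python 2's urllib2
import urllib.request
from lxml import etree

url = "URL HERE"
response = urllib.request.urlopen(url)
htmlparser = etree.HTMLParser()
tree = etree.parse(response, htmlparser)
# xpath() returns a list of matching elements, so print the text of each one
for match in tree.xpath('//*[@id="newproductversion"]/span/strong'):
    print(match.text)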

I get the expected output 419.99

Julian Silvestri

This will come in very handy with something else I'm planning to do, I appreciate it very much. – kamen1111 Mar 04 '19 at 22:53
