I am getting familiar with Python and am struggling to do the following with BeautifulSoup.
What is expected:
* If the output of the script below contains the string 5378, it should email me the line in which the string appears.
#! /usr/bin/env python
from bs4 import BeautifulSoup
from lxml import html
import urllib2,re
import codecs
import sys
streamWriter = codecs.lookup('utf-8')[-1]
sys.stdout = streamWriter(sys.stdout)
BASE_URL = "http://outlet.us.dell.com/ARBOnlineSales/Online/InventorySearch.aspx?c=us&cs=22&l=en&s=dfh&brandid=2201&fid=111162"
webpage = urllib2.urlopen(BASE_URL)
soup = BeautifulSoup(webpage.read(), "lxml")
findcolumn = soup.find("div", {"id": "itemheader-FN"})
name = findcolumn.text.strip()
print name
I tried using findall(5378, name), but it returns empty brackets, like this: [].
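For reference, here is a minimal sketch of the behavior described above, not a tested solution for this page. It assumes the scraped text is already a Unicode string (as `name` is in the script) and that a plain substring match is enough. The helper names `lines_containing` and `build_alert`, the sample text, and the `me@example.com` addresses are all hypothetical. Note that the search term is passed as a string, not as the integer 5378.

```python
# -*- coding: utf-8 -*-
from email.mime.text import MIMEText


def lines_containing(text, needle):
    """Return the stripped lines of `text` that contain the substring `needle`."""
    return [line.strip() for line in text.splitlines() if needle in line]


def build_alert(matches, sender, recipient):
    """Build (but do not send) a UTF-8 email whose body lists the matching lines."""
    msg = MIMEText(u"\n".join(matches), "plain", "utf-8")
    msg["Subject"] = "Found %d matching line(s)" % len(matches)
    msg["From"] = sender
    msg["To"] = recipient
    return msg


# Hypothetical sample standing in for `name` from the script.
sample = u"Inspiron 15 5378 2-in-1\nLatitude 7480\nInspiron 13 5378 \u201cTouch\u201d"
matches = lines_containing(sample, u"5378")

# Hypothetical addresses; replace with real ones.
alert = build_alert(matches, "me@example.com", "me@example.com")
```

Actually sending the message could then be done with something like `smtplib.SMTP("localhost").sendmail(sender, [recipient], alert.as_string())`, which assumes a mail server is reachable on localhost.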
- I am also running into Unicode issues when I try to pipe the output to grep.
$ python dell.py | grep 5378
Traceback (most recent call last):
  File "dell.py", line 18, in <module>
    print name
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201d' in position 817: ordinal not in range(128)
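One common cause of this kind of error in Python 2 is that, when stdout is a pipe rather than a terminal, Unicode text gets encoded with the ASCII codec by default. A possible workaround, sketched below under that assumption, is to encode the text to UTF-8 bytes explicitly before writing it. The helper name `write_utf8` and the sample string are hypothetical, and `io.BytesIO` only stands in for the pipe.

```python
# -*- coding: utf-8 -*-
import io


def write_utf8(text, byte_stream):
    """Encode `text` as UTF-8 and write it, plus a newline, to a binary stream."""
    byte_stream.write(text.encode("utf-8"))
    byte_stream.write(b"\n")


# io.BytesIO stands in for a pipe such as the one feeding grep.
buf = io.BytesIO()
write_utf8(u"15.6\u201d display, 5378", buf)
output = buf.getvalue()
```

In the script itself, the equivalent would be roughly `print name.encode("utf-8")`. Setting the `PYTHONIOENCODING=utf-8` environment variable before running the script is another option that may avoid the error without code changes.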
Can someone tell me what I am doing wrong in both cases?