How to add 'features="html.parser"' to a BeautifulSoup call in Python

Question

I have some code that produces the following warning: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 14 of the file scrape7.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.

My code is:

import numbers
import httplib2
import requests
import re
from bs4 import BeautifulSoup, SoupStrainer

http = httplib2.Http()
status, response = http.request('https://www.macys.com/social/the-edit/')

editurlslist = []

for link in BeautifulSoup(response, parse_only=SoupStrainer('a')):
    if link.has_attr('href'):
        if '/the-edit' in link['href']:
            editurlslist.append(str("https://www.macys.com"+link['href']))

products = []

for i in editurlslist:
    products.append(re.findall(r'(data-thisProduct=\"[0-9]{7}\")',
                               requests.get(i).text))

for i in editurlslist:
    products.append(re.findall(r'(productid=\"[0-9]{7}\")',
                               requests.get(i).text))

products2 = [x for x in products if type(x) in numberic_types]

print(products2)

`BeautifulSoup(response, 'html.parser', ...)`; the second argument is called `features`. — Martijn Pieters, Feb 14 '19 at 21:22

Alan Kavanagh · Accepted Answer · 2019-02-14T21:28:44.403


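Pass "html.parser" as the second argument to the BeautifulSoup constructor. That argument is named features, so the keyword form is features="html.parser", not parser="html.parser":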

for link in BeautifulSoup(response, "html.parser", parse_only=SoupStrainer('a')):
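
For anyone who wants to try this without hitting the network, here is a minimal sketch of the same link filter with the parser named explicitly. SAMPLE_HTML and the edit_links helper are made up for illustration and are not part of the original script; the sketch also uses find_all with href=True instead of iterating over the soup directly, which is a swapped-in approach rather than the asker's loop.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
  <a href="/social/the-edit/spring-looks/">Spring looks</a>
  <a href="/shop/womens-clothing">Shop</a>
  <a name="top">No href here</a>
</body></html>
"""


def edit_links(html, base="https://www.macys.com"):
    """Return absolute URLs for every <a> whose href contains '/the-edit'."""
    only_anchors = SoupStrainer("a")  # parse nothing but <a> tags
    # Naming the parser explicitly is what silences the UserWarning.
    # features="html.parser" is the keyword form of the second positional argument.
    soup = BeautifulSoup(html, features="html.parser", parse_only=only_anchors)
    return [
        base + a["href"]
        for a in soup.find_all("a", href=True)  # skips anchors without an href
        if "/the-edit" in a["href"]
    ]


print(edit_links(SAMPLE_HTML))
```

The positional form BeautifulSoup(html, "html.parser", parse_only=only_anchors) should behave the same way; the keyword form just makes it harder to mistype the argument name.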

edited Feb 14 '19 at 21:28

answered Feb 14 '19 at 21:20


Thanks for the feedback. Unfortunately, the following error is generated:
Traceback (most recent call last):
  File "scrape7.py", line 14, in <module>
    for link in BeautifulSoup(response, parser="html.parser", parse_only=SoupStrainer('a')):
  File "C:\Users\A089955\AppData\Local\Programs\Python\Python37-32\lib\site-packages\bs4\__init__.py", line 183, in __init__
    "__init__() got an unexpected keyword argument '%s'" % arg)
TypeError: __init__() got an unexpected keyword argument 'parser'
– Scott Davis Feb 14 '19 at 21:27
@ScottDavis check the update – Alan Kavanagh Feb 15 '19 at 09:54
Thanks again AK47. It worked! Here's the final code excerpt: for link in BeautifulSoup(response, "html.parser", parse_only=SoupStrainer('a')): – Scott Davis Feb 16 '19 at 21:18
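
If you want to confirm that the warning is really gone rather than just scrolled past, one option is to record warnings with the standard warnings module. This is only a sketch: the parser_warnings helper and the PARSER_HINT constant are hypothetical names, and matching on the message text assumes the wording quoted in the question.

```python
import warnings

from bs4 import BeautifulSoup

# Assumption: the warning text starts the way it is quoted in the question.
PARSER_HINT = "No parser was explicitly specified"


def parser_warnings(**kwargs):
    """Build a soup from a tiny document and return any 'no parser' warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        BeautifulSoup("<a href='/the-edit/x'>x</a>", **kwargs)
    return [str(w.message) for w in caught if PARSER_HINT in str(w.message)]


# Without an explicit parser the warning should be recorded;
# with features="html.parser" the list should come back empty.
print(len(parser_warnings()))
print(len(parser_warnings(features="html.parser")))
```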
