67

After updating my packages, I get this new error:

class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: 'module' object has no attribute '_base'

I tried updating beautifulsoup, but that did not help. How can I fix this?

Martin Thoma
Ehvince

8 Answers

117

I upgraded beautifulsoup4 and html5lib and it resolved the issue.

pip install --upgrade beautifulsoup4
pip install --upgrade html5lib
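
To confirm that the upgrade took effect in the environment you actually run, a minimal sketch like the following prints the installed versions. It assumes Python 3.8 or later (for `importlib.metadata`); `installed_version` is a hypothetical helper name, not part of either library.

```python
from importlib import metadata  # standard library on Python 3.8+


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as used in the pip commands above.
    for name in ("beautifulsoup4", "html5lib"):
        print(name, installed_version(name) or "not installed")
```

If either line prints "not installed", pip probably installed into a different interpreter than the one running the script.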
MattTriano
27

This is an issue with the upstream package html5lib: https://bugs.launchpad.net/beautifulsoup/+bug/1603299. To fix it, force a downgrade to an older version:

pip install --upgrade html5lib==1.0b8

Bhavuk
20

Edit (Nov 2017): it seems this no longer works.

I finally found it. A search engine turned up nothing, but the problem is referenced on beautifulsoup's issue tracker: https://bugs.launchpad.net/beautifulsoup/+bug/1603299

It works again with html5lib v0.9999999 (7 nines):

"html5lib<=0.9999999"
Ehvince
  • This fixes a similar bug in kaggle-cli too – Jim May 21 '17 at 08:54
  • 2
    (On Windows 7.) Unfortunately, I tried both downgrading and upgrading. I also tried setting up a virtual env using Python 2.7. Nothing has worked so far; basically I am stuck trying to use the beautifulsoup library – Carmine Tambascia Sep 28 '17 at 16:57
  • I just got past this error: it turned out PyCharm was using the wrong interpreter in my virtual env. In Python IDLE and PowerShell I did not get this error – Carmine Tambascia Sep 28 '17 at 20:10
  • 4
    `html5lib<=0.9999999` has a security vulnerability and should not be used any longer. Source: https://www.sourceclear.com/registry/security/cross-site-scripting-xss-/python/sid-3068 – bzmw Nov 20 '17 at 19:36
  • This is the command to fix it: sudo pip install html5lib==0.9999999 – DataYoda Jul 17 '18 at 07:53
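
The interpreter mix-up described in the comments above can be checked with a short script. This is a minimal sketch, not a definitive diagnostic: `module_origin` is a hypothetical helper name, and the script only reports what the interpreter running it can see. Run it once from the IDE and once from the shell and compare the output.

```python
import importlib.util
import sys


def module_origin(module_name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # The interpreter actually executing this script.
    print("interpreter:", sys.executable)
    for name in ("bs4", "html5lib"):
        print(name, "->", module_origin(name) or "not found by this interpreter")
```

If the two runs show different interpreter paths or different package locations, the upgrade or downgrade was probably applied to an environment other than the one raising the error.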
8

The downgrade to html5lib 1.0b8 in @Bhavuk's answer works, but it causes a version conflict with bleach.

The solution for me was to also change the bleach version so that it is compatible with the pinned version of html5lib:

pip install --upgrade bs4
pip install --upgrade bleach==1.4.2
pip install --upgrade html5lib==1.0b8

Python version 3.5

recurseuntilfor
  • 1
    For anaconda, I did `conda install html5lib==0.9999999`, which downgraded bleach to 1.5.0 but it worked – prusswan Sep 17 '18 at 11:22
3

I had the same problem. I don't know what you were trying to do, but it happened to me when I tried to read an XML file in pandas using pd.read_html().

The problem was fixed by upgrading all of beautifulsoup4, html5lib, and lxml:

pip install --upgrade beautifulsoup4
pip install --upgrade html5lib
pip install --upgrade lxml

Then restart your Python environment; after that it worked.

Blaszard
0

This command solved the problem for me:

 sudo pip install html5lib==0.9999999
kavya
0

Install html5lib with pip3 as shown below. If you install it with plain pip, it may be installed for Python 2, and you would then have to run your spider with Python 2.

sudo pip3 install html5lib==0.9999999
Jeremy Caney
0

I found trying to switch versions did not work for me. In the end, based on this issue I edited the relevant file at ~/.local/lib/python3.7/site-packages/bs4/builder/_html5lib.py for my purposes.
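
The answer does not say what was changed in that file. One plausible approach, sketched here under the assumption that newer html5lib releases expose the module as `html5lib.treebuilders.base` instead of `html5lib.treebuilders._base` (check your installed copy), is to try both names and use whichever imports. `first_importable` and `treebuilder_base` are hypothetical names used for illustration only.

```python
import importlib


def first_importable(candidates):
    """Import and return the first module in `candidates` that exists, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


# Assumption: newer html5lib uses treebuilders.base, older uses treebuilders._base.
treebuilder_base = first_importable(
    ["html5lib.treebuilders.base", "html5lib.treebuilders._base"]
)

if __name__ == "__main__":
    if treebuilder_base is None:
        print("html5lib is not installed for this interpreter")
    else:
        print("using", treebuilder_base.__name__)
```

Code that previously referred to `html5lib.treebuilders._base.TreeBuilder` could then refer to `treebuilder_base.TreeBuilder`. Editing files under site-packages is overwritten by the next reinstall, so this is a stopgap, not a replacement for a compatible pair of versions.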

Alex Moore-Niemi