I have been struggling to get `from lxml import etree` to work (`import lxml` works fine, by the way). The error is:

ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml/etree.so, 2): Symbol not found: _htmlParseChunk
Referenced from: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml/etree.so
Expected in: flat namespace
in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml/etree.so

I used pip to install lxml, and Homebrew to reinstall libxml2 with the right architecture (or so I think). Does anyone have ideas on how to fix or diagnose this? I'm on 64-bit Python.

Pat B
  • Try using `otool -L` on `etree.so` to see what library paths it is searching for. – Ned Deily Nov 01 '11 at 01:32
  • the output of that is `etree.so: /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.1)` ...not sure what i am supposed to do with this info – Pat B Nov 01 '11 at 01:39
  • although i notice there is no file at the path it outputs there – Pat B Nov 01 '11 at 01:49

1 Answer

lxml is a bit fussy about which third-party libraries it uses, and it often needs newer versions than the ones Apple supplies. I suggest you read and follow the instructions here for building lxml from source on Mac OS X, including building its own statically linked libs. That should work. (I'm a little surprised that Homebrew doesn't already have an lxml recipe.)

UPDATE: Based on the limited information in your comments, it is difficult to be sure exactly what is happening. I suspect you are not using the version of Python you think you are. There are any number of ways to install lxml successfully; that's part of the problem: there are too many options. Rather than trying to debug your setup, here's probably the simplest way to get a working lxml on 10.7 using the Apple-supplied system Python 2.7.

$ sudo STATIC_DEPS=true /usr/bin/easy_install-2.7 lxml

You should then be able to use lxml.etree this way:

$ /usr/bin/python2.7
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> etree.__file__
'/Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so'
>>> 
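
The same check can be scripted, which helps when several Pythons are installed. This is only a minimal sketch: `check_import` is a hypothetical helper name, not part of lxml, and it uses nothing beyond the standard library's `importlib.import_module`. It reports either the file the module was loaded from or the text of the `ImportError`.

```python
import importlib
import sys


def check_import(name):
    """Try to import `name`; return (True, file path) or (False, error text).

    Hypothetical helper for illustration only, not part of lxml.
    """
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        # For the failure discussed here, the text would include
        # "Symbol not found: _htmlParseChunk".
        return (False, str(exc))
    return (True, getattr(module, "__file__", None))


if __name__ == "__main__":
    ok, detail = check_import("lxml.etree")
    print("interpreter: %s" % sys.executable)
    print("lxml.etree import %s: %s" % ("succeeded" if ok else "failed", detail))
```

Run it with the exact interpreter you care about, for example `/usr/bin/python2.7`, so the result reflects that Python's site-packages and not another installation's.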

I notice though that the lxml static build process does not produce a working universal build. You'll probably see messages like this during the lxml install:

ld: warning: ignoring file /private/tmp/easy_install-83mJsV/lxml-2.3.1/build/tmp/libxml2/lib/libxslt.a, file was built for archive which is not the architecture being linked (i386)

Assuming the default architecture on your machine is 64-bit, here is what happens if you try to run in 32-bit mode:

$ arch -i386 /usr/bin/python2.7
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:06) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dlopen(/Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so, 2): Symbol not found: _htmlParseChunk
  Referenced from: /Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so
  Expected in: flat namespace
 in /Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so
>>> ^D

And there is the error message you originally reported! So the root cause appears to be that the static libraries (libxml2, etc.) that lxml builds are not universal. As long as you have no need to use lxml in a 32-bit process (unlikely for most uses), this should not be a problem. Chances are that the Python you were originally using was a 32-bit-only one; that is consistent with some of the other messages you reported.
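
To confirm whether a given Python is running as a 32-bit or 64-bit process, a quick check along these lines may help. It is a sketch using only the standard library; `interpreter_info` is a name made up for this example. The pointer size comes from `struct.calcsize("P")`, which reflects the running process, so launching the interpreter under `arch -i386` should report 32.

```python
import struct
import sys


def interpreter_info():
    """Return the running interpreter's path and pointer width in bits.

    Hypothetical helper for illustration only.
    """
    bits = struct.calcsize("P") * 8  # size of a C pointer in this process
    return {"executable": sys.executable, "bits": bits}


if __name__ == "__main__":
    info = interpreter_info()
    print("%s is running as a %d-bit process" % (info["executable"], info["bits"]))
```

If the path printed is not the Python you expected, that alone may explain why a package installed for one interpreter fails to load in another.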

Ned Deily
  • ok- i just did that and noticed this proviso at the end of the installation: This formula is keg-only, so it was not symlinked into /usr/local. Mac OS X already provides this program and installing another version in parallel can cause all kinds of trouble. Generally there are no consequences of this for you. If you build your own software and it requires this formula, you'll need to add its lib & include paths to your build variables: LDFLAGS -L/usr/local/Cellar/libxslt/1.1.26/lib CPPFLAGS -I/usr/local/Cellar/libxslt/1.1.26/include – Pat B Nov 01 '11 at 01:52
  • does that mean i need to do something when i pip install lxml? – Pat B Nov 01 '11 at 01:52
  • also, after installing libxslt with homebrew, i still get the same error. I'm thinking that lxml is using the wrong libxml2 and libxslt to build with, is that possible? – Pat B Nov 01 '11 at 01:55
  • I forgot about how fussy lxml is. See updated answer. `lxml` can be installed on 10.7; it's no problem with MacPorts. – Ned Deily Nov 01 '11 at 01:59
  • well, i tried installing with --static-deps... it fails with this: -I/Users/patrickbrooks/Downloads/lxml-lxml-6d1e124/build/tmp/libxml2/include/libxslt -I/Users/patrickbrooks/Downloads/lxml-lxml-6d1e124/build/tmp/libxml2/include/libexslt -I/opt/local/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.macosx-10.5-x86_64-2.6/src/lxml/lxml.etree.o -w -flat_namespace unable to execute /usr/bin/gcc-4.0: No such file or directory error: command '/usr/bin/gcc-4.0' failed with exit status 1 – Pat B Nov 01 '11 at 02:08
  • It looks like you have an older MacPorts Python still installed in /opt/local. You could remove /opt/local/bin/ from your $PATH. Or if you have recently upgraded to 10.7 and still want to use your MacPorts packages you should follow the upgrade steps on the MacPorts website. Then you could just use a MacPorts Python and the MacPorts py26-lxml (or py27-lxml). Don't mix packages! Pick Homebrew or MacPorts or do it all yourself but you'll get into problems if you mix components from different sources. – Ned Deily Nov 01 '11 at 02:17
  • ok, i removed the macports paths from my ~/.bash_profile ...however, python setup.py build --static-deps still fails with the same error. Thanks for the help thus far btw :) – Pat B Nov 01 '11 at 02:31
  • sudo STATIC_DEPS=true /usr/bin/easy_install-2.7 lxml worked like a charm. thank you so much for the effort and advice! i learned something :) – Pat B Nov 01 '11 at 16:42
  • I've been using Inkscape to create icons for my iOS apps. When I try to render a gear, I get a message "The fantastic lxml wrapper for libxml2 is required by inkex.py and therefore this extension. Please download and install the latest version from http://cheeseshop.python.org/pypi/lxml/, or install it through your package manager by a command like: sudo apt-get install python-lxml". Searching for a solution, I found this thread, but I'm not sure if it's relevant. Any suggestions for someone who doesn't know Python? – Victor Engel Jun 29 '13 at 00:01
  • for guys who are having problem with scrapy on anaconda, this works for me – yukclam9 Jul 16 '15 at 16:07