
I am trying to use pyarrow with pyspark. However, when I try to execute

import pyarrow

I receive the following error

    In [1]: import pyarrow
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-f1048abcb32d> in <module>
----> 1 import pyarrow

~/opt/anaconda3/lib/python3.7/site-packages/pyarrow/__init__.py in <module>
     47 import pyarrow.compat as compat
     48
---> 49 from pyarrow.lib import cpu_count, set_cpu_count
     50 from pyarrow.lib import (null, bool_,
     51                          int8, int16, int32, int64,

ImportError: dlopen(/Users/user/opt/anaconda3/lib/python3.7/site-packages/pyarrow/lib.cpython-37m-darwin.so, 2): Library not loaded: @rpath/libboost_filesystem.dylib
  Referenced from: /Users/user/opt/anaconda3/lib/libarrow.15.1.0.dylib
  Reason: image not found

I have tried installing pyarrow in a conda environment and downgrading to Python 3.6, but without success.

Does anyone have a suggestion to solve the problem?

Galuoises

2 Answers


It looks like PyArrow was not installed properly. Please try cleaning out the older packages and then reinstall pyarrow using the command below:

    conda install -c conda-forge pyarrow
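
After reinstalling, a quick sanity check (a minimal sketch, assuming the reinstall went into the same Anaconda environment as in the question) is to import pyarrow and touch the C extension that was failing to load:

    # Check that the reinstalled pyarrow can load its native library.
    # cpu_count is one of the symbols the failing import was pulling in from
    # pyarrow.lib, so if this runs without an ImportError the dylib issue is gone.
    import pyarrow

    print(pyarrow.__version__)   # e.g. 0.15.1, matching the libarrow.15.1.0 build above
    print(pyarrow.cpu_count())   # exercises pyarrow.lib, the extension that failed to load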
Thomas Martin
  • Thank you, it seems to have worked. I have also installed apache-arrow with brew install apache-arrow and brew install apache-arrow-glib – Galuoises Feb 24 '20 at 09:24

The accepted answer didn't work for me since I'm on macOS. After some research, the following is what helped me. For those who have the same problem, but on macOS:

    brew update && brew upgrade
    brew switch openssl 1.0.2s

This worked for me on Catalina 10.15.4.
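
If the import still fails after this, it can help to check whether the Boost library that dlopen complained about is actually present in the Anaconda environment. A minimal diagnostic sketch, assuming the ~/opt/anaconda3 path from the traceback in the question (adjust it for your machine):

    # Look for the libboost_filesystem dylib that dlopen reported as missing.
    # The lib directory below is taken from the question's traceback.
    from pathlib import Path

    lib_dir = Path.home() / "opt/anaconda3/lib"
    matches = sorted(lib_dir.glob("libboost_filesystem*.dylib"))

    if matches:
        print("Found:", *matches, sep="\n  ")
    else:
        print(f"No libboost_filesystem*.dylib under {lib_dir}; "
              "reinstalling pyarrow (or its boost dependency) should restore it.")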

Gonzalo Garcia