
I'm trying to use pyarrow and I keep getting the following error:

ImportError: Can not find the shared library: libhdfs3.so

So I read some Stack Overflow answers, and they say I need to set the ARROW_LIBHDFS_DIR environment variable. The path to libhdfs.so is /usr/local/hadoop/native/. I tried to set it in .bashrc, but it didn't work.
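
For reference, the line I added to .bashrc was along these lines:

export ARROW_LIBHDFS_DIR=/usr/local/hadoop/native/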
The conda installation doesn't seem to work either, i.e. none of the following helped:

conda install libhdfs3
pip install libhdfs3
conda install -c clinicalgraphics libgcrypt11
conda install libprotobuf=2.5
conda update libhdfs3 

It would be a great help if I could get this working. Thanks in advance.

Kush Singh

1 Answer


Ensure libhdfs.so is in $HADOOP_HOME/lib/native as well as in $ARROW_LIBHDFS_DIR.
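
For instance, assuming $HADOOP_HOME is set, a quick way to confirm the Hadoop copy exists:

ls $HADOOP_HOME/lib/native/libhdfs.so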

Use this to check whether the variable is set in your bash environment: ls $ARROW_LIBHDFS_DIR

If not, locate the file using locate -l 1 libhdfs.so

Assign the directory path you located to the ARROW_LIBHDFS_DIR variable, and export it so that child processes such as Python inherit it: export ARROW_LIBHDFS_DIR=<directory location to libhdfs.so>
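
Putting it together, a minimal sketch using the path from the question (substitute whatever directory locate reports; the verification step assumes the legacy pyarrow.hdfs.connect API from older pyarrow releases, plus a reachable HDFS and a working Hadoop/Java setup):

export ARROW_LIBHDFS_DIR=/usr/local/hadoop/native
echo 'export ARROW_LIBHDFS_DIR=/usr/local/hadoop/native' >> ~/.bashrc  # persist for future shells
python -c "import pyarrow as pa; fs = pa.hdfs.connect(); print(fs.ls('/'))"  # should no longer raise the ImportError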

Referenced here on SO: https://stackoverflow.com/a/62749351/6263217

Chess