26

I'm trying to install pyarrow on the master instance of my EMR cluster, but I always get this error:

[hadoop@ip-XXX-XXX-XXX-XXX ~]$ sudo /usr/bin/pip-3.4 install pyarrow
Collecting pyarrow
Downloading https://files.pythonhosted.org/packages/c0/a0/f7e9dfd8988d94f4952f9b50eb04e14a80fbe39218520725aab53daab57c/pyarrow-0.10.0.tar.gz (2.1MB)
100% |████████████████████████████████| 2.2MB 643kB/s 
Requirement already satisfied: numpy>=1.10 in /usr/local/lib64/python3.4/site-packages (from pyarrow)
Requirement already satisfied: six>=1.0.0 in /usr/local/lib/python3.4/site-packages (from pyarrow)
Installing collected packages: pyarrow
Running setup.py install for pyarrow ... error
Complete output from command /usr/bin/python3.4 -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-pr3y5_mu/pyarrow/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-vmywdpeg-record/install-record.txt --single-version-externally-managed --compile:
/usr/lib64/python3.4/distutils/dist.py:260: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
/mnt/tmp/pip-build-pr3y5_mu/pyarrow/.eggs/setuptools_scm-3.1.0-py3.4.egg/setuptools_scm/utils.py:118: UserWarning: 'git' was not found
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.4
creating build/lib.linux-x86_64-3.4/pyarrow
copying pyarrow/pandas_compat.py -> build/lib.linux-x86_64-3.4/pyarrow
copying pyarrow/serialization.py -> build/lib.linux-x86_64-3.4/pyarrow
......
creating build/lib.linux-x86_64-3.4/pyarrow/tests/data
copying pyarrow/tests/data/v0.7.1.all-named-index.parquet -> build/lib.linux-x86_64-3.4/pyarrow/tests/data
copying pyarrow/tests/data/v0.7.1.column-metadata-handling.parquet -> build/lib.linux-x86_64-3.4/pyarrow/tests/data
copying pyarrow/tests/data/v0.7.1.parquet -> build/lib.linux-x86_64-3.4/pyarrow/tests/data
copying pyarrow/tests/data/v0.7.1.some-named-index.parquet -> build/lib.linux-x86_64-3.4/pyarrow/tests/data
running build_ext
creating build/temp.linux-x86_64-3.4
-- Runnning cmake for pyarrow
cmake -DPYTHON_EXECUTABLE=/usr/bin/python3.4  -DPYARROW_BOOST_USE_SHARED=on -DCMAKE_BUILD_TYPE=release /mnt/tmp/pip-build-pr3y5_mu/pyarrow
unable to execute 'cmake': No such file or directory
error: command 'cmake' failed with exit status 1

----------------------------------------
Command "/usr/bin/python3.4 -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-pr3y5_mu/pyarrow/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-vmywdpeg-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /mnt/tmp/pip-build-pr3y5_mu/pyarrow/

I don't know why it says "command 'cmake' failed with exit status 1". To be sure, I preinstalled cmake, but I still get this error. Furthermore, sudo pip install pyarrow works with no problem, but sudo pip-3.4 install pyarrow fails. Am I missing something, or does this error have nothing to do with cmake? I'd appreciate any help.

Yiming Wu
  • Is `cmake` on your `$PATH`? Your use of `sudo` might reset `$PATH`. Check with `sudo env` (or don't use sudo to install simple modules) – Botje Sep 05 '18 at 11:12
  • @Botje cmake is in the $PATH. Without sudo I get permission denied on EMR. The weird thing is that `sudo pip install pyarrow` works with no problem, but `sudo pip-3.4 install` gives an error. – Yiming Wu Sep 05 '18 at 13:31
  • I was getting this error with `sudo pip-3.4 install pyarrow`: `package 'arrow' not found`. Installing version 0.9.0 did work for me, but I wouldn't call that an ideal solution. The AWS AMI doesn't seem to have a package for arrow, and I'm reluctant to download and build arrow myself. Maybe I'll just use python2. – Tim Ludwinski Oct 26 '18 at 15:29
  • If anyone is looking for this in 2021, what worked for me was to run `export ARROW_HOME=/usr/local`; after that, `pip3 install pyarrow` worked flawlessly. – pedrostrusso Oct 01 '21 at 18:07
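
The `$PATH` question raised in the comments can be checked directly. The following is only a minimal sketch, not a definitive diagnosis: `has_on_path` is a hypothetical helper name, and the commented-out usage assumes that `sudo` may be configured to reset `PATH` (for example through the `secure_path` sudoers option), which may or may not be the case on EMR.

```shell
# Hypothetical helper: succeed if program $1 can be found using PATH value $2.
# It runs in a subshell, so the caller's own PATH is left untouched.
# Note: `command -v` also succeeds for shell builtins and functions,
# so this is only meaningful for external programs such as cmake.
has_on_path() {
    (
        PATH="$2"
        command -v "$1" >/dev/null 2>&1
    )
}

# Compare what your own shell sees with what a root shell started by sudo sees.
# (Left commented out because it needs sudo rights.)
# has_on_path cmake "$PATH" && echo "cmake visible to me"
# has_on_path cmake "$(sudo sh -c 'echo "$PATH"')" && echo "cmake visible under sudo"
```

If the first check succeeds and the second does not, the build started by `sudo pip-3.4` would not be able to find `cmake` even though it is installed.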

5 Answers

25

For me (on Linux), the problem was that my version of pip was too old:

pip --version
> pip 18.1

which, according to arrow.apache.org, is too old:

On Linux, you will need pip >= 19.0 to detect the prebuilt binary packages.

To upgrade pip to the latest version, this worked for me:

pip install --upgrade pip

It might be different for you; see this thread for other ways to upgrade pip.
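
The version check can also be scripted. This is a minimal sketch, assuming the usual `pip --version` output format (`pip X.Y from ... (python Z)`); `pip_is_new_enough` is a hypothetical helper name, and the threshold of 19 comes from the Arrow documentation quoted above.

```shell
# Hypothetical helper: succeed if a `pip --version` line reports pip >= 19.0.
pip_is_new_enough() {
    # $1 looks like: "pip 18.1 from /usr/lib/python3.4/site-packages (python 3.4)"
    version=$(printf '%s\n' "$1" | awk '{print $2}')
    major=${version%%.*}
    # Reject empty or non-numeric input (for example when pip is not installed).
    case "$major" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$major" -ge 19 ]
}

if pip_is_new_enough "$(pip --version 2>/dev/null)"; then
    echo "pip is new enough to detect prebuilt wheels"
else
    echo "pip is older than 19.0 (or missing): try 'pip install --upgrade pip'"
fi
```

Substitute the pip executable you actually install with (for example `pip-3.4`), since each one can report a different version.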

vvvvv
Anton
17

I finally found a way around this by installing an earlier version of pyarrow. I was trying to install pyarrow-0.10.0, which failed, but installing pyarrow-0.9.0 works. So I think there might be some compatibility issues between cmake and pyarrow-0.10.0.

Yiming Wu
2

(macOS) I was installing pyarrow and snowflake-connector-python in a Python 3.11 virtual environment, and the error message was:

Python pip install pyarrow error, unable to execute 'cmake'

I solved it by using a Python 3.9 environment instead, like this:

python3.9 -m venv myvenv
Roshin Raphel
1

Using the `--no-use-pep517` switch with pip did the trick for me (Debian 11, Python 3.11, pip 22.3.1).

As I understand it, pip tries to follow PEP 517 and build pyarrow from source.

When I tried, this initially failed because cmake was missing. Once I had installed it via `apt-get install cmake`, I got another error saying that Arrow was not installed. Then I found out about `--no-use-pep517` from some GitHub issues.

Some tickets mentioning this workaround:

Vajk Hermecz
-5

There seems to be a problem with the combination of pyarrow, cmake, and pip.

You can use conda instead of pip.

conda install -c conda-forge pyarrow