
I am building a Docker image for the armv7 architecture with the Python packages numpy, scipy, pandas and google-cloud-bigquery, using wheels from piwheels. The base image is python:3.7-buster.

When I run a container from this image, it restarts continuously and the log shows the error "ValueError: This method requires pyarrow to be installed":

Traceback (most recent call last):
  File "main_prog.py", line 3, in <module>
    upload_data()
  File "/usr/src/app/bigquery.py", line 39, in upload_data
    job = client.load_table_from_dataframe(dataframe, table_id, job_config=job_config)  # Make an API request.
  File "/usr/local/lib/python3.7/site-packages/google/cloud/bigquery/client.py", line 2574, in load_table_from_dataframe
    raise ValueError("This method requires pyarrow to be installed")
ValueError: This method requires pyarrow to be installed
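
The traceback shows that `load_table_from_dataframe` raises as soon as it finds that pyarrow is unavailable. As a minimal sketch (the helper name `has_module` is hypothetical, not part of any library), the program could check for the dependency at startup and report it clearly instead of failing inside the upload call:

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    # find_spec returns None when no importable module with that name exists.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if not has_module("pyarrow"):
        # Assumption: the message text is illustrative only.
        print("pyarrow is not installed; load_table_from_dataframe will fail")
```

This only detects the missing package; it does not make pyarrow installable on armv7.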

So I tried to install pyarrow directly in my Dockerfile with:

RUN pip3 install pyarrow

This gives me the error "ERROR: Could not build wheels for pyarrow which use PEP 517 and cannot be installed directly" during the image build:

> [10/11] RUN pip3 install pyarrow:
#14 164.9   copying pyarrow/tests/parquet/test_parquet_writer.py -> build/lib.linux-armv7l-3.7/pyarrow/tests/parquet
#14 164.9   running build_ext
#14 164.9   creating /tmp/pip-install-jiim0m92/pyarrow_07d2ad5142d7405fa1b4bb2fe83e0428/build/temp.linux-armv7l-3.7
#14 164.9   -- Running cmake for pyarrow
#14 164.9   cmake -DPYTHON_EXECUTABLE=/usr/local/bin/python -DPython3_EXECUTABLE=/usr/local/bin/python  -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=off -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on -DCMAKE_BUILD_TYPE=release /tmp/pip-install-jiim0m92/pyarrow_07d2ad5142d7405fa1b4bb2fe83e0428
#14 164.9   error: command 'cmake' failed with exit status 1
#14 164.9   ----------------------------------------
#14 164.9   ERROR: Failed building wheel for pyarrow
#14 164.9 Failed to build pyarrow
#14 164.9 ERROR: Could not build wheels for pyarrow which use PEP 517 and cannot be installed directly

Then, as recommended here, I tried:

RUN pip3 install pandas-gbq==0.14.0 

and

RUN pip install --upgrade 'google-cloud-bigquery[bqstorage,pandas]'

but neither worked; each time I get the same error as above. I couldn't find a pyarrow wheel for armv7 on either piwheels or PyPI.
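
To see why pip falls back to a source build, it can help to compare the platform the interpreter reports with the platform tag in a wheel's filename. The sketch below is a simplification under stated assumptions: `wheel_platform_tag` and `looks_compatible` are hypothetical helpers, the wheel names are made-up examples, and real pip applies more detailed tag-matching rules than this.

```python
import platform
import sysconfig


def wheel_platform_tag(filename: str) -> str:
    """Return the last dash-separated field of a wheel filename (its platform tag)."""
    stem = filename[: -len(".whl")]
    return stem.split("-")[-1]


def looks_compatible(tag: str, machine: str) -> bool:
    """Rough check: pure-Python wheels ('any') or tags ending in the machine name."""
    return tag == "any" or tag.endswith("_" + machine)


if __name__ == "__main__":
    # Inside the armv7 container this should print values such as
    # 'armv7l' and 'linux-armv7l', matching the build directory in the log.
    print(platform.machine(), sysconfig.get_platform())
    # Hypothetical wheel filename, for illustration only.
    tag = wheel_platform_tag("example_pkg-1.0-cp37-cp37m-manylinux2014_aarch64.whl")
    print(tag, looks_compatible(tag, platform.machine()))
```

If no published wheel carries a tag for the running machine, pip downloads the source distribution and tries to compile it, which is where the cmake step in the log fails.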

Does anyone know a solution? Thank you for your help!

Ohaiomundo
  • pyarrow doesn't publish wheels for armv7 as far as I know (it does for arm64). You might try installing with conda instead of pip (I'm not sure it is a supported version there either). If neither of these works, you will probably have to build pyarrow from source. – Micah Kornfield Sep 26 '21 at 04:18
  • Yes, I already tried this [here](https://stackoverflow.com/questions/69265038/docker-image-build-how-to-install-python-packages-google-cloud-bigquery-and-num). But that didn't work, which is why I tried piwheels. – Ohaiomundo Sep 30 '21 at 09:53

1 Answer


I solved this problem by using a separate container image with Node-RED

FROM nodered/node-red:latest

RUN npm install node-red-contrib-google-cloud

on which I can use the google-cloud packages. This container now handles the upload to Google Cloud. To use Node-RED with Docker I followed this site, and this is the google-cloud package I installed.
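
For readers who would rather stay in Python, one possible alternative (not what I did, and untested on armv7) is to avoid `load_table_from_dataframe` altogether and upload the data as CSV with `load_table_from_file`, which sends a file object and so should not need pyarrow. This is a sketch under assumptions: `rows_to_csv_bytes` and `upload_csv` are hypothetical helper names, the schema is left to autodetection, and credentials are assumed to be configured in the environment.

```python
import csv
import io


def rows_to_csv_bytes(rows, fieldnames):
    """Serialize a list of dicts to an in-memory CSV file with a header row."""
    text = io.StringIO(newline="")
    writer = csv.DictWriter(text, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return io.BytesIO(text.getvalue().encode("utf-8"))


def upload_csv(rows, fieldnames, table_id):
    """Load the rows into the BigQuery table `table_id` as a CSV load job."""
    # Imported lazily so the CSV helper works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # skip the header row written above
        autodetect=True,      # assumption: let BigQuery infer the schema
    )
    job = client.load_table_from_file(
        rows_to_csv_bytes(rows, fieldnames), table_id, job_config=job_config
    )
    return job.result()  # wait for the load job to finish
```

With a pandas DataFrame, the rows could come from `dataframe.to_dict("records")` and the field names from `list(dataframe.columns)`.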

Ohaiomundo