
I have a Dockerfile in which one of the steps compiles OpenCV. The process fails when executed as part of a docker build, but it works if I execute each step by hand interactively.

Do you have any suggestions?

This is the Dockerfile:

FROM nvcr.io/nvidia/l4t-base:r32.4.4

RUN apt-get update; apt-get -y upgrade

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install \
        build-essential pkg-config libtbb2 libtbb-dev \
        libavcodec-dev libavformat-dev libswscale-dev libxvidcore-dev libavresample-dev \
        libtiff-dev libjpeg-dev libpng-dev python-tk libgtk-3-dev \
        libcanberra-gtk-module libcanberra-gtk3-module libv4l-dev libdc1394-22-dev \
        cmake python3-dev python-dev python-numpy python3-numpy ccache

RUN mkdir /src; \
    wget -O /src/opencv-3.4.0.tar.gz https://github.com/opencv/opencv/archive/3.4.0.tar.gz; \
    wget -O /src/opencv_contrib-3.4.0.tar.gz https://github.com/opencv/opencv_contrib/archive/3.4.0.tar.gz; \
    cd /src/; tar xzf opencv-3.4.0.tar.gz; tar xzf opencv_contrib-3.4.0.tar.gz; rm -f /src/*gz

RUN cd /src/opencv-3.4.0; mkdir build; cd build; \
    LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-10.2/targets/aarch64-linux/lib; \
    PATH=/usr/local/cuda-10.2/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    cmake -D CMAKE_BUILD_TYPE=RELEASE \
          -D CMAKE_INSTALL_PREFIX:PATH=/opt/opencv/3.4.0 \
          -D WITH_CUDA=ON -D CUDA_ARCH_PTX="" -D CUDA_ARCH_BIN="5.3,6.2,7.2" \
          -D WITH_CUBLAS=ON -D WITH_LIBV4L=ON \
          -D BUILD_opencv_python3=ON -D BUILD_opencv_python2=ON -D BUILD_opencv_java=OFF \
          -D WITH_GSTREAMER=ON -D WITH_GTK=ON \
          -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_EXAMPLES=OFF \
          -D OPENCV_ENABLE_NONFREE=ON \
          -DCUDA_CUDA_LIBRARY=/usr/local/cuda-10.2/targets/aarch64-linux/lib/stubs/libcuda.so \
          -D OPENCV_EXTRA_MODULES_PATH=/src/opencv_contrib-3.4.0/modules ..

RUN cd /src/opencv-3.4.0/build; make -j4

And this is the error:

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_cublas_LIBRARY (ADVANCED)
    linked by target "opencv_cudev" in directory /src/opencv-3.4.0/modules/cudev
    linked by target "opencv_core" in directory /src/opencv-3.4.0/modules/core
    linked by target "opencv_cudaarithm" in directory /src/opencv-3.4.0/modules/cudaarithm

(the full output is too long to paste here)

Once again, the same steps run interactively inside the container work fine.

Thanks

    You might want to study [this](https://ngc.nvidia.com/catalog/containers/nvidia:l4t-base) carefully. Note this statement: "Similarly, CUDA, TensorRT and VisionWorks are ready to use within the l4t-base container as **they are made available from the host by the NVIDIA container runtime.**" The NVIDIA container runtime is not active when you are building a dockerfile. – Robert Crovella Nov 06 '20 at 16:31
  • Yes, the problem is that the OpenCV shipped with the image is v4 and I need v3, so I have no option other than compiling it. However, the issue is that the same steps executed one after the other from within the running container succeed. I guessed some environment variables were missing, but that is not the case: I tried passing those variables as ENV options in the Dockerfile, and also directly on the command line, both unsuccessfully. – Marco Nov 06 '20 at 19:31
  • Yes, they are successful within the **running** container because the container runtime makes the CUDA that is installed in the base machine visible inside the **running** container. For purposes of a docker build, that is not happening. This has nothing to do with environment variables. To make this work in a container build process, you would need to install CUDA in the container (at least). – Robert Crovella Nov 06 '20 at 20:01
  • And so to achieve what @RobertCrovella is advising you to do, run those steps in an `ENTRYPOINT` script rather than in a `RUN` (see the sketch below). – β.εηοιτ.βε Nov 06 '20 at 21:51
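
For illustration, here is a minimal sketch of that ENTRYPOINT approach; the script name build-opencv.sh is hypothetical and the cmake options are abbreviated, so treat it as an outline rather than a drop-in replacement. The configure/build steps move out of RUN and execute when the container starts, i.e. when the NVIDIA container runtime has mounted the host's CUDA libraries:

#!/bin/bash
# build-opencv.sh (hypothetical): the same configure/build steps as in the question,
# but executed at container start, when the host CUDA libraries are visible.
set -e
cd /src/opencv-3.4.0 && mkdir -p build && cd build
cmake -D WITH_CUDA=ON -D OPENCV_EXTRA_MODULES_PATH=/src/opencv_contrib-3.4.0/modules ..
make -j4

# Dockerfile excerpt: bake the script into the image and run it as the entrypoint
COPY build-opencv.sh /usr/local/bin/build-opencv.sh
RUN chmod +x /usr/local/bin/build-opencv.sh
ENTRYPOINT ["/usr/local/bin/build-opencv.sh"]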

1 Answer


So I found the answer, thanks to the hint in the comments on my original question.

This is what my /etc/docker/daemon.json looks like now:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
        "default-runtime": "nvidia" 
}

The important entry is "default-runtime", which was missing before: with the NVIDIA runtime as the default, docker build also runs under it, so the CUDA libraries provided by the host are visible during the build and cmake can find cuBLAS.
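
After editing the file, the Docker daemon has to be restarted for the setting to take effect. Assuming a systemd-managed Docker installation (not stated in the post), something like:

sudo systemctl restart docker
docker info | grep -i "default runtime"    # should now report: Default Runtime: nvidia

After that, rebuilding the image with docker build should get past the CUDA_cublas_LIBRARY error.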
