How to cross-compile tensorflow-serving on docker build image with bazel when copts contain spaces

Question

Background info

I would like to run tensorflow-serving on some older machine (target system) which doesn't support modern cpu instructions used in the standard tensorflow build. I used these instructions for installing tf-serving via docker. However I ran into the error Tensorflow Serving Illegal Instruction core dumpedsimilar to this one on github. The suggested solution was to use a docker build-image to compile the binary on my target system which is described here.

Since this part is relevant for the reproduction of my issue I will copy the relevant commands here:

git clone https://github.com/tensorflow/serving
cd serving
docker build --pull -t $USER/tensorflow-serving-devel -f tensorflow_serving/tools/docker/Dockerfile.devel .

This will compile the binary with the flag -march=native in my docker container on my slow target machine and works.

Target system info

However on my old machine the compilation takes forever and I would like to use my other more powerful pc to cross-compile the binary. I used the commands provided in this answer to find out the needed compilation flags of my target system to replicate the build flag -march=native which is the default flag implicitly used in the above process.

gcc -### -E - -march=native 2>&1 | sed -r '/cc1/!d;s/(")|(^.* - )//g'

gave me the following flags:

-march=core2 -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mno-movbe -mno-aes -mno-sha -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mno-xsave -mno-xsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-clwb -mno-mwaitx -mno-clzero -mno-pku --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=core2

Note especially the follwing flags at the end which contain spaces:

 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=2048

I can provide these flags in the docker build process via the build argument TF_SERVING_BUILD_OPTIONS as described in the docs here

This string is then used to run bazel build which can be seen in the Dockerfile.devel

Thus I take all the flags from above and put --copt= in front and put the resulting string in the variable TF_SERVING_BUILD_OPTIONS. This is my total command including the copts at the end with spaces:

docker build --pull \
    --build-arg TF_SERVING_BUILD_OPTIONS="--copt=-mmmx --copt=-mno-3dnow --copt=-msse --copt=-msse2 --copt=-msse3 --copt=-mssse3 --copt=-mno-sse4a --copt=-mcx16 --copt=-msahf --copt=-mno-movbe --copt=-mno-aes --copt=-mno-sha --copt=-mno-pclmul --copt=-mno-popcnt --copt=-mno-abm --copt=-mno-lwp --copt=-mno-fma --copt=-mno-fma4 --copt=-mno-xop --copt=-mno-bmi --copt=-mno-bmi2 --copt=-mno-tbm --copt=-mno-avx --copt=-mno-avx2 --copt=-mno-sse4.2 --copt=-mno-sse4.1 --copt=-mno-lzcnt --copt=-mno-rtm --copt=-mno-hle --copt=-mno-rdrnd --copt=-mno-f16c --copt=-mno-fsgsbase --copt=-mno-rdseed --copt=-mno-prfchw --copt=-mno-adx --copt=-mfxsr --copt=-mno-xsave --copt=-mno-xsaveopt --copt=-mno-avx512f --copt=-mno-avx512er --copt=-mno-avx512cd --copt=-mno-avx512pf --copt=-mno-prefetchwt1 --copt=-mno-clflushopt --copt=-mno-xsavec --copt=-mno-xsaves --copt=-mno-avx512dq --copt=-mno-avx512bw --copt=-mno-avx512vl --copt=-mno-avx512ifma --copt=-mno-avx512vbmi --copt=-mno-clwb --copt=-mno-mwaitx --copt=-mno-clzero --copt=--param l1-cache-size=32 --copt=--param l1-cache-line-size=64 --copt=--param l2-cache-size=2048 --copt=-mtune=core2" \
    -t $USER/tensorflow/serving-devel \
    -f tensorflow_serving/tools/docker/Dockerfile.devel .

Problem

However bazel complains as follows, which is probably due to the space inbetween --param and l1-cache-size=32 which is a option for the C-compiler provided to a bazel build call.

ERROR: Skipping 'l1-cache-line-size=64': couldn't determine target from filename 'l1-cache-line-size=64'
ERROR: couldn't determine target from filename 'l1-cache-line-size=64'
INFO: Elapsed time: 20.233s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
The command '/bin/sh -c bazel build --color=yes --curses=yes     ${TF_SERVING_BAZEL_OPTIONS}     --verbose_failures     --output_filter=DONT_MATCH_ANYTHING     ${TF_SERVING_BUILD_OPTIONS}     tensorflow_serving/model_servers:tensorflow_model_server &&     cp bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server     /usr/local/bin/' returned a non-zero code: 1

What I tried

I tried escaping the space char in the last flags:

TF_SERVING_BUILD_OPTIONS="--copt=-mmmx ... --copt=--param\ l1-cache-size=32 --copt=--param\ l1-cache-line-size=64 --copt=--param\ l2-cache-size=2048 --copt=-mtune=core2 "

But bazel still complains with the same error message as above.

I tried enclosing the commands in double or single quotes:

TF_SERVING_BUILD_OPTIONS="--copt=-mmmx ... --copt=\"--param l1-cache-size=32\" --copt=\"--param l1-cache-line-size=64\" --copt=\"--param l2-cache-size=2048\" --copt=-mtune=core2 "

Also the same error as before appears.

I tried using inner double quotes for copts and wrap the TF_SERVING_BUILD_OPTIONS with outer single-quotes but same error.
I tried escaping the double-quotes from copts with \x22. A similar error as before apears. This time indicating that the target is malformed ERROR: Skipping 'l1-cache-size=32\x22': Bad target pattern...
I tried escaping the space char with \40:

TF_SERVING_BUILD_OPTIONS="--copt=-mmmx ... --copt=--param\40l1-cache-size=32 --copt=--param\40l1-cache-line-size=64 --copt=--param\40l2-cache-size=2048 --copt=-mtune=core2 "

This time bazel didnt complain, since the argument of copt was one string without normal spaces. However the arguments are passed incorrectly to gcc, since I get the following error:

ERROR: /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/grpc/BUILD:692:1: C++ compilation of rule '@grpc//:grpc_base_c' failed (Exit 1): gcc failed: error executing command 
  (cd /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/usr/bin/python \
  /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -MD -MF bazel-out/k8-opt/bin/external/grpc/_objs/grpc_base_c/endpoint_pair_uv.d '-frandom-seed=bazel-out/k8-opt/bin/external/grpc/_objs/grpc_base_c/endpoint_pair_uv.o' '-DGRPC_ARES=0' -iquote external/grpc -iquote bazel-out/k8-opt/genfiles/external/grpc -iquote bazel-out/k8-opt/bin/external/grpc -iquote external/zlib_archive -iquote bazel-out/k8-opt/genfiles/external/zlib_archive -iquote bazel-out/k8-opt/bin/external/zlib_archive -isystem external/grpc/include -isystem bazel-out/k8-opt/genfiles/external/grpc/include -isystem bazel-out/k8-opt/bin/external/grpc/include -isystem external/zlib_archive -isystem bazel-out/k8-opt/genfiles/external/zlib_archive -isystem bazel-out/k8-opt/bin/external/zlib_archive -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mno-movbe -mno-aes -mno-sha -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mno-xsave -mno-xsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-clwb -mno-mwaitx -mno-clzero '--param\40l1-cache-size=32' '--param\40l1-cache-line-size=64' '--param\40l2-cache-size=2048' '-mtune=core2' '-std=c++14' '-D_GLIBCXX_USE_CXX11_ABI=0' -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/grpc/src/core/lib/iomgr/endpoint_pair_uv.cc -o bazel-out/k8-opt/bin/external/grpc/_objs/grpc_base_c/endpoint_pair_uv.o)
Execution platform: @bazel_tools//platforms:host_platform
gcc: error: unrecognized command line option '--param\40l1-cache-size=32'
gcc: error: unrecognized command line option '--param\40l1-cache-line-size=64'
gcc: error: unrecognized command line option '--param\40l2-cache-size=2048'
Target //tensorflow_serving/model_servers:tensorflow_model_server failed to build

It seems that this is related to the following issue on github.

I tried the compilation withou the flags which contain spaces and this finished fine which strengthens the assumption that the error is due to the space which is incorrectly handled from bazel.

How can I fix that problem?

jww · Accepted Answer · 2019-11-29T05:15:17.430

I would like to run tensorflow-serving on some older machine (target system) which doesn't support modern cpu instructions used in the standard tensorflow build. I used these instructions for installing tf-serving via docker. However I ran into the error Tensorflow Serving Illegal Instruction core dumpedsimilar to this one on github...

Bazel and TensorFlow use -march=native by default in its build flags, if I recall properly.

You should omit the flag, or specify a more appropriate flags like -march=sse4.2.

-march=core2 -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf
-mno-movbe -mno-aes -mno-sha -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma
-mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2
-mno-sse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase
-mno-rdseed -mno-prfchw -mno-adx -mfxsr -mno-xsave -mno-xsaveopt -mno-avx512f
-mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec
-mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi
-mno-clwb -mno-mwaitx -mno-clzero -mno-pku --param l1-cache-size=32
--param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=core2

Your dump shows -mno-sse4.1. I believe that means you can use the following and be done with it.

-msse2 -msse3 -mssse3

x86_64 has SSE2 as part of the core instruction set, so it implies MMX and SSE.

I think you should not use -march=core2 and -mtune=core2 because Core2 means you have SSE4.1 (early iCore cpus) or SSE4.2 (later iCore cpus).

And from the GCC man page on x86_64 options this looks suspicious/wrong to me:

core2

Intel Core2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support.

I'm fairly certain Core2's are more capable than SSSE3. I keep several Core2 machines around for testing and they have SSE4.1 and SSE4.2. (I believe one has CRC instructions, and that is SSE4.2 ISA).

I may be wrong about the GCC option page, but it looks suspicious to me.

Tensorflow Serving Illegal Instruction core dumped

What was the illegal instruction?

gcc -### -E - -march=native 2>&1 | sed -r '/cc1/!d;s/(")|(^.* - )//g'

Just another point of view, but I find something like this more useful. From a Skylake machine:

$ gcc -march=native -dM -E - </dev/null | grep -E 'SSE|CRC|AES|PCL|RDRND|RDSEED|AVX' | sort
#define __AES__ 1
#define __AVX__ 1
#define __AVX2__ 1
#define __PCLMUL__ 1
#define __RDRND__ 1
#define __RDSEED__ 1
#define __SSE__ 1
#define __SSE2__ 1
#define __SSE2_MATH__ 1
#define __SSE3__ 1
#define __SSE4_1__ 1
#define __SSE4_2__ 1
#define __SSE_MATH__ 1
#define __SSSE3__ 1

From the preprocessor dump I know I can use -msse2, -msse3, -mssse3, -msse4.1, -msse4.2, -mavx and -mavx2.

And from a Core2 machine:

$ gcc -march=native -dM -E - </dev/null | grep -E 'SSE|CRC|AES|PCL|RDRND|RDSEED|AVX' | sort
#define __SSE__ 1
#define __SSE2__ 1
#define __SSE2_MATH__ 1
#define __SSE3__ 1
#define __SSE4_1__ 1
#define __SSE_MATH__ 1
#define __SSSE3__ 1

From the preprocessor dump I know I can use -msse2, -msse3, -mssse3 and -msse4.1.

And from another Core2 machine:

$ gcc -march=native -dM -E - </dev/null | grep -E 'SSE|CRC|AES|PCL|RDRND|RDSEED|AVX' | sort
#define __SSE2_MATH__ 1
#define __SSE2__ 1
#define __SSE3__ 1
#define __SSE4_1__ 1
#define __SSE_MATH__ 1
#define __SSE__ 1
#define __SSSE3__ 1