
I'm building tensorflow from source with bazel, as described here: https://www.tensorflow.org/install/install_sources

Following the installation doc, I successfully compile with the following:

bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both \
--cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --config=cuda \
-k //tensorflow/tools/pip_package:build_pip_package

This command is a combination of the accepted answer here and a note in the installation documentation "to add --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" to the build command for gcc 5 and later".

However, `import tensorflow as tf` results in the error

illegal instruction (core dumped), exiting python.

I have also tried `conda update libgcc`, to no avail.

How can I build tensorflow from source with gcc 5.0?

GPhilo
anon01
  • Probably a dumb question, but just to be sure: your CPU has support for all the build flags you're specifying, right? – GPhilo Aug 25 '17 at 08:22
  • I did not examine the flags closely - but think it would fail to compile if they weren't supported. – anon01 Aug 25 '17 at 08:26
  • No, it will compile just fine and produce a binary for a different processor. Run `gcc -march=native -Q --help=target | grep enabled` in bash and double-check that all the flags you specify are actually in the list (especially -mavx2 and -mfma) – GPhilo Aug 25 '17 at 08:28
  • Woops, I forgot that -mfpmath doesn't show as enabled or disabled. Have a look at the full output of the gcc command for the possible values you can specify (although `both` should be fine) – GPhilo Aug 25 '17 at 08:30
  • @GPhilo looks like I should have checked more closely! – anon01 Aug 25 '17 at 08:31
  • I'll just recap in an answer then, since it's the kind of problem other people might run into ;) – GPhilo Aug 25 '17 at 08:33
  • @GPhilo, may I know how can I check whether my CPU has support for all the build flags. I use linux aarch64 and I install tensorflow from ```conda install tensorflow```. Installation does not give any issue. But when I predict the model , it gives ```Illegal instruction (core dump)```. My tensorflow version is 2.5.0 and numpy is 1.20.3. Should I install tensorflow from bazel? – Susan Aug 18 '22 at 03:36
  • @anon01, do you have any document of how to build tensorflow from bazel source file? – Susan Aug 18 '22 at 03:37
  • @Susan the only info I have is in the link at the top: https://www.tensorflow.org/install/install_sources – anon01 Aug 18 '22 at 04:05
  • @anon01, what linux architecture are you using? – Susan Aug 18 '22 at 04:16
  • I mean, the question was from five years ago. I don't remember asking it tbh :) – anon01 Aug 18 '22 at 04:55
  • @Susan try installing via pip instead of conda. IIRC at a certain point Tensorflow started building with advanced CPU instructions enabled by default, but I can't remember when that was, check the release info on GitHub for each version of TF 2.x, it should be in there somewhere. If that's the case, your only option is building from source (in which case, there's instructions for it on Tensorflow's website) – GPhilo Aug 18 '22 at 06:53
  • @GPhilo, I read TF 2.5.0's git in this website https://github.com/tensorflow/tensorflow/releases?page=4 ```oneAPI Deep Neural Network Library (oneDNN) CPU performance optimizations from Intel-optimized TensorFlow are now available in the official x86-64 Linux and Windows builds. They are off by default. Enable them by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1.``` Does it mean Tensorflow 2.5.0 is available for only x86-64 architecture? – Susan Aug 18 '22 at 08:14
  • @GPhilo, But at the end of TF2.5.0, it mentions ```Other Added show_debug_info to mlir.convert_graph_def and mlir.convert_function. Added Arm Compute Library (ACL) support to --config=mkl_aarch64 build.``` under Bug Fixes and Other Changes. and the support link is https://github.com/ARM-software/ComputeLibrary. May I know is TF 2.5.0 support aarch64 architecture or not? Should I need to install tensorflow from the bazel? – Susan Aug 18 '22 at 08:18
  • _Does it mean Tensorflow 2.5.0 is available for only x86-64 architecture?_ No, it means there are some specific optimizations enabled if your CPU supports them. _May I know is TF 2.5.0 support aarch64 architecture or not?_ It does, but depending on what you're trying to install TF on, you might have to build from source (which is most likely done via `bazel`). What machine are you trying to install TF on? – GPhilo Aug 18 '22 at 08:28
  • @GPhilo, I am using aarch64 architecture. I also searched how to install tensorflow with bazel. http://zhiyisun.github.io/2017/02/15/Running-Google-Machine-Learning-Library-Tensorflow-On-ARM-64-bit-Platform.html I follow according to this. But in the website it said ```Then modify bazel source file to compatible with aarch64.``` I don't know how to modify the bazel source with the code mentioned below in the website. So I skip that step and try to compile ./compile.sh. But I got ```ERROR: JAVA_HOME (/usr/lib/jvm/java-11-openjdk-arm64/bin/java) is not a path to a working JDK.``` – Susan Aug 18 '22 at 08:48
  • ``` ls -l /etc/alternatives/java``` gives ```/etc/alternatives/java -> /usr/lib/jvm/java-11-openjdk-arm64/bin/java```. ```java --version ``` ```openjdk 11.0.16 2022-07-19 OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu120.04) OpenJDK 64-Bit Server VM (build 11.0.16+8-post-Ubuntu-0ubuntu120.04, mixed mode)```. ```echo $JAVA_HOME``` gives ```/usr/lib/jvm/java-11-openjdk-arm64/bin/java``` – Susan Aug 18 '22 at 08:49
  • May I know how can I solve Error: Java_home is not a path to a working JDK? – Susan Aug 18 '22 at 08:51
  • @susan building on aarch64 is not trivial (and out of scope for this question). Check out [this guide](https://qengineering.eu/install-tensorflow-2.4.0-on-jetson-nano.html) that shows how to build TF on a Jetson Nano (the links at the bottom of the page also cover other options, like the Raspberry Pi). The java error is probably a configuration issue: if you don't need java bindings, you probably need to disable the build for them. If you do, I can't help, as I never had to build them and have no clue how that's done. – GPhilo Aug 18 '22 at 08:52
  • @GPhilo, When I install Tensorflow from bazel, Bazel is written in Java according to the guide you shared and to compile Bazel, we must first install Java. – Susan Aug 18 '22 at 08:57
  • @susan look up bazelisk and see if that applies to your use-case, it should take care of setting up the correct bazel version for the project you're building. – GPhilo Aug 18 '22 at 10:43
  • @GPhilo, Thank you for the guide. May I know whether bazelisk is an alternative to bazel? In any case, I have solved the ```Error: Java_home is not a path to a working JDK``` issue and can now compile Bazel by following the steps from the guide that you shared. – Susan Aug 18 '22 at 16:30
  • @GPhilo, In the guide, after compiling the bazel, I need to copy the binary bazel file from ```output/bazel``` to ```/usr/local/bin/bazel```. May I know when I use the virtual environment using miniconda3 instead of virtual environment from the local, where should I copy the ```output/bazel```? When I check the miniconda3 folder, I found the following subfolders. ```LICENSE.txt compiler_compat etc python-package aarch64-conda-linux-gnu conda-meta include share aarch64-conda_cos7-linux-gnu condabin lib shell bin envs pkgs ssl ``` – Susan Aug 18 '22 at 16:34
  • Hello, anon01, how do you know what command you should run to build Tensorflow using bazel? ```-c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both \ --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --config=cuda \ -k ``` I ask because I want to build the CPU version of Tensorflow on the aarch64 linux architecture, and I don't know what command I should use to build Tensorflow with bazel. – Susan Aug 19 '22 at 01:26

1 Answer


The error message tells me your program was compiled with instructions your processor doesn't have, and a look at your build string makes me suspect -mavx2 and -mfma, which AFAIK are implemented only in rather recent (and high-end) CPUs. Please note that gcc will compile just fine with flags for instructions your CPU doesn't support, but the resulting program won't run on your machine.

To make sure your CPU supports those flags, run `gcc -march=native -Q --help=target | grep enabled` in bash and check that the output contains all the build flags you want to use. The one exception is -mfpmath, which doesn't show as enabled or disabled because it takes a value from a list of options; for that one you'll need to check the full gcc -march=... command output.
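As a complement to the gcc check above, the same question can be asked of the kernel directly. The following is a minimal sketch, assuming an x86 Linux machine where `/proc/cpuinfo` has a `flags` line; the `GCC_TO_CPUINFO` mapping and the function names are illustrative, not part of gcc or TensorFlow.

```python
import re

# Assumed mapping from gcc -m options to the feature names that x86 Linux
# lists in /proc/cpuinfo. Options without an entry (e.g. -mfpmath=both)
# are not checked by this sketch.
GCC_TO_CPUINFO = {"-mavx": "avx", "-mavx2": "avx2", "-mfma": "fma"}


def cpu_flags(cpuinfo_text):
    """Return the feature names from the first 'flags' line, or an empty set."""
    match = re.search(r"^flags\s*:\s*(.*)$", cpuinfo_text, re.MULTILINE)
    return set(match.group(1).split()) if match else set()


def unsupported(copts, cpuinfo_text):
    """Return the options in copts whose CPU feature is not listed."""
    available = cpu_flags(cpuinfo_text)
    return [opt for opt in copts
            if opt in GCC_TO_CPUINFO and GCC_TO_CPUINFO[opt] not in available]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        print("/proc/cpuinfo is not readable on this system")
    else:
        print("unsupported:", unsupported(["-mavx", "-mavx2", "-mfma"], text))
```

Any option this reports as unsupported is a likely cause of the illegal-instruction crash if it was passed via `--copt`.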

To answer your final comment: there is no way to "enable" these instructions. They are implemented in hardware, and either they are available on your CPU or they aren't.
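Since the instructions can't be switched on, the practical fix is to leave the unsupported options out of the build command. Here is a rough sketch of that idea, using the command from the question as the template; `bazel_command` and `FEATURE_FOR_COPT` are hypothetical names, and the set of available features is assumed to come from a check like the gcc one described above.

```python
import shlex

# Options from the question's build command that need a specific CPU feature.
# The feature names on the right are assumed /proc/cpuinfo spellings.
FEATURE_FOR_COPT = {"-mavx": "avx", "-mavx2": "avx2", "-mfma": "fma"}


def bazel_command(copts, available_features):
    """Return the bazel argument list, dropping copts the CPU cannot run."""
    kept = [opt for opt in copts
            if opt not in FEATURE_FOR_COPT
            or FEATURE_FOR_COPT[opt] in available_features]
    args = ["bazel", "build", "-c", "opt"]
    args += ["--copt=" + opt for opt in kept]
    args += ["--cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0", "--config=cuda", "-k",
             "//tensorflow/tools/pip_package:build_pip_package"]
    return args


if __name__ == "__main__":
    # Example: a CPU that has AVX but neither AVX2 nor FMA.
    wanted = ["-mavx", "-mavx2", "-mfma", "-mfpmath=both"]
    print(shlex.join(bazel_command(wanted, {"avx"})))
```

Options with no entry in the mapping, such as `-mfpmath=both`, are passed through unchanged, so they still need to be checked by hand.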

GPhilo
  • noobish question - is it possible to compile tensorflow with these features turned off, for sake of development on old CPUs? – quester Mar 23 '18 at 20:58
  • Of course! In the `configure` script there's a stage that asks you what compiler optimization flags to pass, specifying that native is the default. Just replace that with an empty string and that should do it. Otherwise, I think not adding the `-c=opt` option to your Bazel build command should also be equivalent (since those compiler flags are applied when you add that flag to the build command) – GPhilo Mar 23 '18 at 21:07