
Just to say it upfront: I'm aware of all the answers that require bazel, and they didn't work for me. I'm using virtualenv, as the TensorFlow website recommends.

(tensorflow27)name@computersname:~$ bazel build --linkopt='-lrt' -c opt --copt=-mavx --copt=-msse4.2 --copt=-msse4.1 --copt=-msse3 -k //tensorflow/tools/pip_package:build_pip_package

will output

ERROR: The 'build' command is only supported from within a workspace.

Basically I followed all the steps from here, but when I run this validation I get:

2017-09-02 11:46:52.613368: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

2017-09-02 11:46:52.613396: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.

2017-09-02 11:46:52.613416: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.

I DO NOT want to suppress the warnings; I actually want to use SSE 4.2 and AVX (my processor supports both). However, I wasn't able to find any instructions anywhere on how to compile TensorFlow inside a virtual environment such that support for SSE and AVX is enabled from scratch. It's not even listed in their common installation problems section.
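For reference, a quick way to check which of these instruction sets the CPU actually reports on Linux (just a sanity check, not part of the build itself):

# should list avx, sse4_1 and sse4_2 if the processor supports them
grep -o -w 'sse4_1\|sse4_2\|avx' /proc/cpuinfo | sort -u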

Btw, the system I use for this is Ubuntu 14.04 and I don't have an NVIDIA graphics card (so no CUDA for now), but I plan to get one in the future.

I'm a little bit disappointed that tensorflow doesn't detect CPU capabilities before compilation.

Edit: Untrue; I later figured out that it actually does.

PS: I have set up two virtual environments, one for Python 2.7 and the other for Python 3. Ideally I would hope that the solution works for both, as I haven't decided yet which Python version I will end up using.

  • I'd recommend building with `-march=native` to specifically tune for the system you're building on, rather than just trying to pass `-mavx` and so on. (`-mavx` enables all `-msse*` options.) IDK how TensorFlow's build system works, but that's what you should compile with. – Peter Cordes Sep 02 '17 at 22:48
  • Possible duplicate of [How to compile Tensorflow with SSE4.2 and AVX instructions?](https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions) – Peter Cordes Sep 02 '17 at 22:54
  • Not really a duplicate as I want to get it to work specifically with virtualenv, but I realize now that my question may be a little bit misguided since pip is used in both cases. The error message was a simple directory mismatch and I just had to switch directories. But I still had to jump through multiple hoops after that. Now I finally have generated the whl file and will try the installation tomorrow. – evolution Sep 03 '17 at 00:34
  • re: disappointed it doesn't detect the CPU before compilation: https://www.tensorflow.org/versions/r1.2/install/install_sources says that `./configure` defaults to `-march=native`, which uses everything your CPU supports. IDK why there's such a fuss at https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions about manually enabling `-mavx2` and `-mfma`, and why someone would recommend that over `-march=native` (unless you're building on an old CPU for deployment to a new one). I made an edit on the top answer there, hopefully I didn't break it. – Peter Cordes Sep 03 '17 at 00:43
  • I don't use tensorflow, I'm just here for the SSE/AVX tag, and I know gcc / CPU optimization in general. So I don't have anything to add about `virtualenv`. I think you're right, this might not be a duplicate, except for the how to enable AVX part of the question. On https://www.tensorflow.org/versions/r1.2/install/install_linux#installing_with_virtualenv, it looks like they just want you to install pre-compiled binaries, and AFAIK there aren't separate packages built with `-march=haswell` or whatever. – Peter Cordes Sep 03 '17 at 00:47
  • I believe you must install from sources to get optimal performance at present. The error from `bazel` means that you are not in the correct directory when you are trying to build. You need to be in the checked-out TensorFlow source tree to run `bazel build`. The instructions for how to build from sources are here: https://www.tensorflow.org/install/install_sources . Hope that helps! – Peter Hawkins Sep 07 '17 at 14:24

1 Answer


OK, so it turned out my problems were pretty much independent of which virtual environment I chose. The bazel build failed simply because I was in the wrong directory: I needed to pull TensorFlow from git and then cd into the checkout. From there I could build using the default build option -march=native, which already detects my CPU capabilities.

Setting up two different virtual environments for the two Python versions in advance was useful. Compiling directly inside an activated environment results in automatic Python version detection, so building in the 2.7 environment produces a build for Python 2.7, and so on.

To be somewhat more detailed:

First I installed bazel. Then I installed the dependencies mentioned on the tensorflow page like this:

sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel  
sudo apt-get install python-numpy python-dev python-pip python-wheel

In order to use OpenCL in the build later, I had to download (and compile, if I remember correctly) ComputeCpp-CE-0.3.1-Ubuntu.14.04-64bit.tar.gz (for Ubuntu 16.04 it would be ComputeCpp-CE-0.3.1-Ubuntu.16.04-64bit.tar.gz). After compilation I had to move the build result to /usr/local/computecpp:

sudo mkdir /usr/local/computecpp
sudo cp -r ./Downloads/ComputeCpp*/* /usr/local/computecpp

If I remember correctly, at this point I activated the virtual environment with the Python version I wanted to compile against, so that the desired Python version would be recognized by ./configure.
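Roughly something like this (the exact environment paths depend on where virtualenv created them; tensorflow27 is the environment name from the prompt above, the path itself is just a guess):

# activate the environment whose Python version the build should target
source ~/tensorflow27/bin/activate
# confirm which interpreter ./configure will pick up
which python
python --version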

Then I pulled tensorflow from git and configured it:

git clone https://github.com/tensorflow/tensorflow 
cd tensorflow
./configure

During the configuration routine I only answered yes to jemalloc and OpenCL, as I don't have a CUDA card right now. When saying yes to OpenCL I was prompted for the ComputeCpp path, which is the location created above: /usr/local/computecpp.

Then, while staying in the tensorflow directory I did

bazel build --config=opt --config=mkl //tensorflow/tools/pip_package:build_pip_package

People who activated CUDA during ./configure should also add --config=cuda to this bazel build command. If you have gcc > 5 you will also need to add --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0".
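For illustration, a sketch of how the build line might look with both optional additions in place (CUDA enabled during ./configure and gcc > 5; check your compiler version first):

# if the major version printed here is greater than 5, keep the ABI flag below
gcc --version
bazel build --config=opt --config=mkl --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package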

The --config=mkl part activates some additional libraries provided by Intel specifically to speed up certain calculations on their processors, so TensorFlow can make use of them. If you don't have an Intel processor, it's probably wise to remove that option.

Btw, at first I also compiled MKL by hand, but it turns out that's not necessary: the bazel build will automatically pull MKL from an online source if it's missing.

Then I created the final package at /tmp/tensorflow_pkg:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Depending on the python version in your environment you will see a file name reflecting it:

/tmp/tensorflow_pkg/tensorflow-1.3.0-cp27-cp27mu-linux_x86_64.whl
/tmp/tensorflow_pkg/tensorflow-1.3.0-cp36-cp36m-linux_x86_64.whl

Now I could go to the appropriate environment and install the wheel using pip. If you are using conda, the TensorFlow website still recommends using "pip install" rather than "conda install". So if I were in a virtual environment with Python 2.7 (conda or not), I would type:
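If you are not sure which wheel matches the environment you are currently in, a quick check like this helps:

# show the Python version of the active environment
python --version
# list the generated wheels and pick the one matching that version (cp27 vs cp36)
ls /tmp/tensorflow_pkg/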

pip install --ignore-installed --upgrade /tmp/tensorflow_pkg/tensorflow-1.3.0-cp27-cp27mu-linux_x86_64.whl

And if I were under python 3.6 I would type:

pip install --ignore-installed --upgrade /tmp/tensorflow_pkg/tensorflow-1.3.0-cp36-cp36m-linux_x86_64.whl

There was one last hoop I needed to jump through: if you stay in the tensorflow directory (the one pulled from git) and launch Python from there, you won't be able to import tensorflow. I think it's some kind of a bug. So it's important to leave the tensorflow directory before starting Python:

cd ..

Now you can start Python inside your virtual environment and import tensorflow, hopefully without further errors or any warnings about not using the SSE or AVX capabilities of your processor. At least in my case it worked.
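A minimal way to verify (run from outside the source tree, as noted above) is the usual hello-world snippet; if the build picked up your CPU features, the SSE/AVX warnings should no longer appear:

# should print the greeting without any SSE/AVX warnings
python -c 'import tensorflow as tf; sess = tf.Session(); print(sess.run(tf.constant("Hello, TensorFlow!")))'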

  • If you want to try the SYCL implementation of TensorFlow with ComputeCpp, you need to add --config=sycl to the bazel build step. – Rod Burns Sep 29 '17 at 09:05