
I have a Java demo working that uses Tensorflow for image classification. It runs okay on Windows, but now I want to run it as a web service from the Java Tomcat web server.

I have added all the Tensorflow jars to Tomcat's lib, but Tensorflow has a jni dependency. I'm not sure how to install and link this so Tensorflow can run on the CentOS Linux server.

I have read this, but I do not need to run python on the server, just access Tensorflow from Java.

**Update:** Okay, to get this to work on Tomcat on Windows, I do the following:

download libtensorflow.jar from, https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.6.0.jar

and then the dll from, https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.6.0.zip (extract zip to get dll)

See, https://www.tensorflow.org/install/install_java

put the jar in my tomcat lib, and create a tomcat dll dir and put the dll in it

edit my setenv.bat and add the line,

SET CATALINA_OPTS=-Xmx4g -XX:PermSize=128m -XX:MaxPermSize=512m -Djava.library.path=D:\Engineering\apache-tomcat-7.0.50\dll

This works on Windows.

For Linux, CentOS 6, I do the same, but instead of the dll download the so files from, https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-linux-x86_64-1.6.0.tar.gz

and edit my setenv.sh and add the lines,

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/tomcat8/so"
export JAVA_OPTS="-server -Xmx38g -Djava.library.path=/usr/local/tomcat8/so"
export CATALINA_OPTS="-Djava.library.path=/usr/local/tomcat8/so"

But none of these seem to work, I always get the error,

Cannot find TensorFlow native library for OS: linux, architecture: x86_64. See https://github.com/tensorflow/tensorflow/tree/master/tensorflow/java/README.md for possible solutions (such as building the library from source). Additional information on attempts to find the native library can be obtained by adding org.tensorflow.NativeLibrary.DEBUG=1 to the system properties of the JVM.
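One way to narrow this error down is to check, from inside the JVM, whether any directory on `java.library.path` actually contains the native library file. The following is a minimal diagnostic sketch, not part of TensorFlow; the class name `LibraryPathCheck` and the helper `dirsContaining` are hypothetical, and the library base name `tensorflow_jni` is taken from the file names in the downloaded archives.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class LibraryPathCheck {
    /** Returns the directories on the given path that contain a regular file with the given name. */
    static List<String> dirsContaining(String libraryPath, String fileName) {
        List<String> hits = new ArrayList<>();
        if (libraryPath == null || libraryPath.isEmpty()) {
            return hits;
        }
        // Path entries are separated by ':' on Linux and ';' on Windows.
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        // mapLibraryName turns "tensorflow_jni" into the platform file name,
        // for example libtensorflow_jni.so on Linux.
        String fileName = System.mapLibraryName("tensorflow_jni");
        System.out.println("java.library.path = " + path);
        System.out.println(fileName + " found in: " + dirsContaining(path, fileName));
    }
}
```

If the list printed at the end is empty when this runs inside Tomcat, the path setting is probably not reaching the JVM; if the file is found but loading still fails, the problem is more likely in the library's own dependencies.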

I found there is another deployment option: instead of the above, just add the libtensorflow_jni jar,

https://mvnrepository.com/artifact/org.tensorflow/libtensorflow_jni

to lib, and it will magically find the correct so files.

When I try this option it seems to find the so files, but I get this error,

/usr/local/tomcat8/temp/tensorflow_native_libraries-1522357321965-0/libtensorflow_jni.so: /lib64/libc.so.6: version `GLIBC_2.16' not found (required by /usr/local/tomcat8/temp/tensorflow_native_libraries-1522357321965-0/libtensorflow_jni.so)

Seems like Tensorflow only supports a very specific OS and version??

I found this, Error while importing Tensorflow in python2.7 in Ubuntu 12.04. 'GLIBC_2.17 not found'

But have not tried any of the options yet. Does not look promising for a production system.

Looking at what GLIBC is, it is for GPU, but I don't have or need to use a GPU, just want to use the CPU, why is this library required??

**Update:** So... I tried to build glibc 2.16 on CentOS 6 so that I could use it, by following,

https://unix.stackexchange.com/questions/176489/how-to-update-glibc-to-2-14-in-centos-6-5

The steps worked, but it led to this error when trying to run Tensorflow; it seems to have a dependency on another lib...

error while loading shared libraries: __vdso_time: invalid mode for dlopen()

At this point I'm ready to give up, and try installing Centos7, but this route will require we upgrade 12 production servers...

James
  • You have to provide the jni library as you did on Windows. The library must be placed on path you specify with `-Djava.library.path=...`. Single difference: The library is named `lib.so`, not `lib.dll`. Neither your question nor the link tell us which library. So we can't tell you which package to install. – blafasel Oct 19 '17 at 19:00
  • @James is your project a maven one? – hovanessyan Mar 28 '18 at 18:42
  • no, no maven, running tomcat the app is deployed to tomcat webapps, all jars are in tomcat/lib so files are in tomcat/so -- issue is tomcat does not seem to be picking up so path, or tensorflow does not like the so files – James Mar 28 '18 at 18:44
  • @James I've updated the answer to include also the setup with dedicated tomcat web server hosted on linux – hovanessyan Mar 29 '18 at 11:47
  • updated post with new error, `GLIBC_2.16' not found – James Mar 30 '18 at 01:06
  • Can you please run "ldd --version" on your CentOS server - this comes with the glibc package and will tell you the glibc version. The glibc is required by tensorflow, no matter if it's CPU or GPU configured. Looking at the error message it's very likely that your glibc version is older than 2.16. Let me know the present glibc version and I will think of options. – hovanessyan Mar 30 '18 at 07:52
  • Centos 6 uses 2.12 – James Mar 30 '18 at 13:44
  • @James I've updated my answer now with instructions on how to update Glibc on CentOS6. Although I would recommend setting up new server for this tensorflow demo app - something that is more recent. In tensorflow documentation is mentioned they support Ubuntu, so I would go with the latest LTS Ubuntu server. If this is not an option you can try the Glibc upgrading steps suggested in my answer. – hovanessyan Mar 30 '18 at 18:29
  • I tried to update glibc, but leads to another error, (see above) – James Apr 01 '18 at 15:42
  • I also tried using Tensorflow 1.1.0 instead of 1.6.0, but get the same error, this time looking for glibc 2.14 (CentOS 6 has 2.12) – James Apr 01 '18 at 15:50
  • Seems like upgrade to Centos7 is the only way – James Apr 01 '18 at 15:51
  • @James if that's the only option left, be warned that the upgrade path is not that trivial. There's an "upgrade tool" which has been reported to be in a broken state and results in broken servers. There have been a bunch of changes between CentOS 6 and 7, and some of them break compatibility. Upgrading in place is very risky. You should definitely research the option to back up all the data, install a fresh CentOS 7, if that's the distro choice, and restore/apply the config and data on the new server. – hovanessyan Apr 02 '18 at 21:01

4 Answers


Well, I have very little knowledge of tensorflow in Java. However, I did do a little research and believe I came to a conclusion to solving your problem.

Of course, the solution you posted in Error while importing Tensorflow in python2.7 in Ubuntu 12.04. 'GLIBC_2.17 not found' actually mitigates the problem if you read the solution by @Igor.

To understand the issue better: the TensorFlow Java package does not go through Python. It calls the native TensorFlow core, which is written in C/C++, through JNI (the `libtensorflow_jni` shared library). So you can think of the Java package as a thin wrapper around a native library, just as the Python package is.

Recall that Linux distributions almost always ship with glibc preinstalled as a core system library, as said here in the first few lines. The issue you face is that the prebuilt TensorFlow binary requires a newer glibc version than the one your operating system provides.
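The mismatch can be made concrete with a small sketch that pulls the required version out of the loader message from the question and compares it with the installed one. This is only an illustration: the class `GlibcVersionCheck` and its methods are hypothetical names, the message text is shortened from the question, and the installed version 2.12 is the value reported for CentOS 6 in the comments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GlibcVersionCheck {
    // Matches symbol versions such as GLIBC_2.16 in a dynamic loader message.
    private static final Pattern REQUIRED = Pattern.compile("GLIBC_(\\d+(?:\\.\\d+)*)");

    /** Extracts "2.16" from a message containing "GLIBC_2.16", or returns null if none is present. */
    static String requiredVersion(String loaderMessage) {
        Matcher m = REQUIRED.matcher(loaderMessage);
        return m.find() ? m.group(1) : null;
    }

    /** Compares dotted versions numerically: negative if a < b, zero if equal, positive if a > b. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String message = "/lib64/libc.so.6: version `GLIBC_2.16' not found";
        String required = requiredVersion(message);
        String installed = "2.12"; // CentOS 6, as reported in the comments
        System.out.println("required=" + required + ", installed=" + installed
                + ", sufficient=" + (compare(installed, required) >= 0));
    }
}
```

The comparison is numeric per component rather than a plain string comparison, so that 2.9 sorts below 2.12.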

If you read the issue here installation problem (version 'GLIBC' 2.14 not found), you'll see there is a similar issue where the operating system running is Cent OS which is the same as yours. The only difference in that specific issue is that the person in question is using python instead of java, but the issue is the same.

Thus, you have several possible ways to tackle it.

  1. Run your code on another Linux-based operating system whose glibc is compatible with tensorflow (or can be easily updated)

  2. Upgrade your system GLIBC globally. This is extremely painful if you are running this on a server and is well documented here. (I didn't read the article completely, so I can't vouch for every step.)

  3. Add a second GLIBC to your system (Risky)

  4. Compile glibc and Bazel from source. (Sounds like the most plausible option to me after the first one.)

  5. Compile tensorflow from source to work on your current glibc as suggested by @Igor in this post. I have no idea if this would work, since I am not sure which C functionalities tensorflow library may be calling.

I hope this response was at least of little help. Cheers!

Haris Nadeem
  • I don't think there is any good option other than upgrading the OS. Centos 7 was released in 2014, moving from 6.x line is a good idea and will have to be done sooner or later. To make software work with old GLIBC it is pretty much required to use old GCC but it is likely it won't be able to compile modern/recent Tensorflow code. – user158037 Mar 30 '18 at 08:35
  • I agree, within a few months tensorflow might need more and more updates. As of right now, Centos 7.4 uses glibc 2.17 and would be the quickest fix. You would still need to do the steps that you've already worked on with Centos 6, but is better than re-engineering everything. – Haris Nadeem Mar 30 '18 at 15:23

This problem is not directly related to Java; it's all about good old C and native library linkage. What happened, in short:

  1. Tensorflow's Java library makes runtime calls to native library through JNI (Java Native Interface)
  2. This native library (the .so file inside the .jar) was compiled on a newer Linux distro than CentOS 6, probably Ubuntu LTS, which is why it is linked against a newer version of the glibc library.

There is no simple way to manually update glibc and keep the system stable, so the best option is to upgrade to CentOS 7, which ships the required version of glibc: https://rpmfind.net/linux/rpm2html/search.php?query=libc.so.6%28GLIBC_2.16%29%2864bit%29&submit=Search+...&system=centos&arch=

Alex Chernyshev

I just had a closer look.

Simply add a dependency on org.tensorflow:tensorflow:1.4.0-rc0 (or whatever version you prefer) in your favorite build tool.

This will introduce a dependency to org.tensorflow:libtensorflow_jni:1.4.0-rc0. This will include the following:

blafasel@localhost:~$ unzip -t .m2/repository/org/tensorflow/libtensorflow_jni/1.4.0-rc0/libtensorflow_jni-1.4.0-rc0.jar
Archive:  .m2/repository/org/tensorflow/libtensorflow_jni/1.4.0-rc0/libtensorflow_jni-1.4.0-rc0.jar
    testing: META-INF/                OK
    testing: META-INF/MANIFEST.MF     OK
    testing: org/                     OK
    testing: org/tensorflow/          OK
    testing: org/tensorflow/native/   OK
    testing: org/tensorflow/native/darwin-x86_64/   OK
    testing: org/tensorflow/native/linux-x86_64/   OK
    testing: org/tensorflow/native/windows-x86_64/   OK
    testing: org/tensorflow/native/darwin-x86_64/libtensorflow_framework.so   OK
    testing: org/tensorflow/native/darwin-x86_64/LICENSE   OK
    testing: org/tensorflow/native/darwin-x86_64/libtensorflow_jni.dylib   OK
    testing: org/tensorflow/native/linux-x86_64/libtensorflow_framework.so   OK
    testing: org/tensorflow/native/linux-x86_64/libtensorflow_jni.so   OK
    testing: org/tensorflow/native/linux-x86_64/LICENSE   OK
    testing: org/tensorflow/native/windows-x86_64/tensorflow_jni.dll   OK
    testing: org/tensorflow/native/windows-x86_64/LICENSE   OK
    testing: META-INF/maven/          OK
    testing: META-INF/maven/org.tensorflow/   OK
    testing: META-INF/maven/org.tensorflow/libtensorflow_jni/   OK
    testing: META-INF/maven/org.tensorflow/libtensorflow_jni/pom.xml   OK
    testing: META-INF/maven/org.tensorflow/libtensorflow_jni/pom.properties   OK
No errors detected in compressed data of .m2/repository/org/tensorflow/libtensorflow_jni/1.4.0-rc0/libtensorflow_jni-1.4.0-rc0.jar.

As you can see, this already contains all the binaries needed to get JNI working on the officially supported platforms, including Linux on x86_64.

As long as you don't try to use it on a Raspberry Pi or on 32-bit CentOS, and as long as you use a suitable build tool, you should be safe.
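The directory names in the listing suggest that the native library is selected by operating system and CPU architecture. The sketch below shows how such a directory name could be derived from the JVM's `os.name` and `os.arch` properties. It is an assumption modelled on the jar layout above, not TensorFlow's actual lookup code, and the class `NativeResourcePath` and its methods are hypothetical.

```java
import java.util.Locale;

final class NativeResourcePath {
    /** Maps an os.name value such as "Linux" or "Mac OS X" to the directory prefix used in the jar. */
    static String os(String osName) {
        String p = osName.toLowerCase(Locale.ROOT);
        if (p.contains("linux")) return "linux";
        if (p.contains("mac") || p.contains("darwin")) return "darwin";
        if (p.contains("windows")) return "windows";
        return p.replaceAll("\\s", "");
    }

    /** Maps an os.arch value to the suffix used in the jar; JVMs commonly report x86_64 as "amd64". */
    static String arch(String osArch) {
        String a = osArch.toLowerCase(Locale.ROOT);
        return a.equals("amd64") ? "x86_64" : a;
    }

    /** Builds the assumed resource directory, for example org/tensorflow/native/linux-x86_64/. */
    static String resourceDir(String osName, String osArch) {
        return "org/tensorflow/native/" + os(osName) + "-" + arch(osArch) + "/";
    }

    public static void main(String[] args) {
        String dir = resourceDir(System.getProperty("os.name"), System.getProperty("os.arch"));
        System.out.println("Expected native directory inside the jar: " + dir);
    }
}
```

A platform whose derived directory is not in the listing, such as a 32-bit or ARM system, would have no bundled binary to load.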

The only risk lies in dependencies of these libraries on other system libs. A call to ldd on libtensorflow_framework.so shows:

blafasel@localhost:~$ ldd org/tensorflow/native/linux-x86_64/libtensorflow_framework.so
    linux-vdso.so.1 =>  (0x00007ffffaa62000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f07c6494000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f07c6290000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f07c6073000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f07c5cf0000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f07c5ada000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f07c5710000)
    /lib64/ld-linux-x86-64.so.2 (0x000056525c661000)

If you don't find these transitive dependencies on your system, you should probably try an older version of TensorFlow or a newer version of CentOS.
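When `ldd` cannot resolve something, it marks the affected line with `not found`, both for missing libraries and for missing symbol versions. A small sketch for filtering captured `ldd` output follows; the class `LddProblems` is a hypothetical name, and the sample text in `main` is illustrative input assembled from the messages in this thread, not real output from any particular server.

```java
import java.util.ArrayList;
import java.util.List;

final class LddProblems {
    /** Returns the trimmed lines of ldd output that report something as "not found". */
    static List<String> problemLines(String lddOutput) {
        List<String> problems = new ArrayList<>();
        for (String line : lddOutput.split("\\R")) {
            String t = line.trim();
            if (t.contains("not found")) {
                problems.add(t);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Illustrative input only.
        String sample = String.join("\n",
                "./libtensorflow_jni.so: /lib64/libc.so.6: version `GLIBC_2.16' not found (required by ./libtensorflow_jni.so)",
                "\tlibm.so.6 => /lib64/libm.so.6 (0x00007f07c6494000)",
                "\tlibtensorflow_framework.so => not found");
        for (String p : problemLines(sample)) {
            System.out.println(p);
        }
    }
}
```

An empty result means `ldd` resolved everything it checked, which is a quick way to compare a working machine with a failing one.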

blafasel

DISCLAIMER

Please note that this answer is lengthy because it addresses the initially posted question as well as the further problems that came up as more information was supplied through comments and discussion.

UPDATE 2:

New information suggests that the glibc version on the CentOS 6 server is older than the glibc version the tensorflow binary was compiled against. To update glibc on the CentOS 6 server to a newer version, you can try the steps described in this upgrade script (credit to origin).

I would recommend upgrading the whole server, instead of glibc only.

A lot of other commands on your current server are compiled against your current glibc version. If you upgrade that library, you might run into compatibility issues, and this might result in the server being broken altogether.

#! /bin/sh

# update glibc to 2.17 for CentOS 6

wget http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-2.17-55.el6.x86_64.rpm
wget http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-common-2.17-55.el6.x86_64.rpm
wget http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-devel-2.17-55.el6.x86_64.rpm
wget http://copr-be.cloud.fedoraproject.org/results/mosquito/myrepo-el6/epel-6-x86_64/glibc-2.17-55.fc20/glibc-headers-2.17-55.el6.x86_64.rpm

sudo rpm -Uvh glibc-2.17-55.el6.x86_64.rpm \
glibc-common-2.17-55.el6.x86_64.rpm \
glibc-devel-2.17-55.el6.x86_64.rpm \
glibc-headers-2.17-55.el6.x86_64.rpm

Original Answer:

There's a jar file containing the JNI distribution of tensorflow.

You can simply download the libtensorflow_jni jar in the version that matches the libtensorflow jar (in your case 1.6.0) and package it alongside your application. The JNI jar will be on the class path and will be picked up automatically.

You can also just copy the libtensorflow_jni jar into Tomcat's lib folder.

The libtensorflow_jni jar is built for CPU usage; if you want to use the GPU, you can download the libtensorflow_jni_gpu jar instead.

Demo:

I've made a demo application that is deployed as a war package to a dedicated Tomcat 8.5.29, with a single REST endpoint that prints the tensorflow version, and I can confirm that providing both the libtensorflow and libtensorflow_jni jars works without any additional configuration or tweaking.
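A version check of this kind can be sketched as follows. This is not the code of the demo application; it is a standalone probe that calls `org.tensorflow.TensorFlow.version()` through reflection so that it compiles and runs even when the TensorFlow jars are absent. The class name `TensorFlowProbe` is hypothetical.

```java
import java.lang.reflect.InvocationTargetException;

final class TensorFlowProbe {
    /** Tries to load TensorFlow and read its version; returns a description of the outcome. */
    static String probe() {
        try {
            // Loading the class is expected to trigger loading of the native library.
            Class<?> tf = Class.forName("org.tensorflow.TensorFlow");
            Object version = tf.getMethod("version").invoke(null);
            return "TensorFlow version: " + version;
        } catch (ClassNotFoundException e) {
            return "TensorFlow classes are not on the classpath";
        } catch (InvocationTargetException e) {
            return "TensorFlow call failed: " + e.getCause();
        } catch (ReflectiveOperationException | LinkageError e) {
            // UnsatisfiedLinkError (a LinkageError) is what a missing or incompatible .so produces.
            return "TensorFlow could not be loaded: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

Running the same probe from a servlet inside Tomcat shows whether the web server's classpath and native library setup match what works on the command line.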

I've uploaded the test application in my github account. You can check it out, package it as a war file (mvn package or whatever you are using to do that) and deploy it in Tomcat.

To package it as described, it will require maven, but the main purpose for maven in this case is to download the necessary dependencies declared in the pom file.

If you don't want to use maven, you can download the dependencies by hand, from the provided links above and incorporate them in your application setup.

UPDATE - configuring native libs in dedicated Tomcat

Here's how I did the setup with dedicated Tomcat8, where all the tensorflow dependencies are configured in the web server and not coming with the deployed application.

1) Here's what my war dependencies look like; the war has 0 tensorflow dependencies:

war dependencies

In order to produce that with the linked project, you just have to mark the tensorflow dependency as provided in pom.xml:

<dependency>
    <groupId>org.tensorflow</groupId>
    <artifactId>tensorflow</artifactId>
    <version>1.6.0</version>
    <scope>provided</scope> <!-- add this line -->
</dependency>

And pick up the demo-tensorflow-0.0.1-SNAPSHOT.war.original war from target directory (remove .original before deploying to tomcat).

2) Here is the path to the SO files on the file system, reflecting the path you have specified:

so location

3) Tomcat's lib folder:

tomcat libs

4) If I deploy the war package in tomcat and try to access the rest endpoint I will get the same error you're getting:

error jni

5) I've created setenv.sh in CATALINA_BASE (I've added the library path only in CATALINA_OPTS for clarity).

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/tomcat8/so"
export JAVA_OPTS="-server -Xmx38g"
export CATALINA_OPTS="-Djava.library.path=/usr/local/tomcat8/so"

and then

chmod u+x setenv.sh

6) Running tomcat I can see in the log messages the configuration is picked up:

[Screenshot: Tomcat startup log]

7) Accessing the application this time is successful:

[Screenshot: successful response from the REST endpoint]
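Besides reading the startup log, the setting from step 5 can be checked from inside the running JVM by inspecting its input arguments. The sketch below uses the standard `java.lang.management` API; the class `LibraryPathArgs` and its helper method are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class LibraryPathArgs {
    private static final String PREFIX = "-Djava.library.path=";

    /** Returns the value of the last -Djava.library.path=... argument, or null if there is none. */
    static String libraryPathFromArgs(List<String> jvmArgs) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(PREFIX)) {
                value = arg.substring(PREFIX.length()); // keep the last occurrence
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Arguments the JVM was started with, e.g. those contributed by JAVA_OPTS and CATALINA_OPTS.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("java.library.path from arguments: " + libraryPathFromArgs(jvmArgs));
        System.out.println("java.library.path in effect: " + System.getProperty("java.library.path"));
    }
}
```

If the argument is missing here, `setenv.sh` was most likely not read, for example because it is not in the `bin` directory of the Tomcat instance that was actually started.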

hovanessyan
  • tensorflow has different lib dependencies/so file depending if you want gpu or cpu support, which one does this lib jar use? – James Mar 28 '18 at 21:00
  • I've updated my answer - the tensorflow_jni.jar is CPU configured, the tensorflow_jni_gpu.jar is GPU configured. – hovanessyan Mar 28 '18 at 21:08
  • thanks for all the help, but it is still not working for me, I updated the post above, no matter what I do I get the same so not found error, even though its shows the so on the path the same as yours – James Mar 29 '18 at 21:22
  • if I try the single jar that finds the correct so automatically it seems to try to load the so but gives the linking error, /usr/local/tomcat8/temp/tensorflow_native_libraries-1522357321965-0/libtensorflow_jni.so: /lib64/libc.so.6: version `GLIBC_2.16' not found (required by /usr/local/tomcat8/temp/tensorflow_native_libraries-1522357321965-0/libtensorflow_jni.so) – James Mar 29 '18 at 21:23
  • using centos 6, does tensorflow only work with a very specific OS and version? – James Mar 29 '18 at 21:23
  • also am using tomcat 7 – James Mar 29 '18 at 21:36
  • Sorry I assumed you were running tomcat 8, as all the paths in your configuration contain "tomcat8" – hovanessyan Mar 30 '18 at 07:53
  • When we use the pre-built binary, it's built against a set of other libraries with specific versions. So when you run the pre-built tensorflow jni binary, it expects specific versions of the libraries it was built against. You either have to supply those library versions (or compatible ones) or build tensorflow from source against the libraries in your own environment (hence using the versions you have on the CentOS server). The fastest way to resolve this, I think, would be setting up a new server with a more recent glibc library. – hovanessyan Mar 30 '18 at 08:10