
I am using the TensorFlow Java API (1.8.0) to load multiple models (in different sessions). The models are loaded from .pb files using the SavedModelBundle.load(...) method; those .pb files were obtained by saving Keras models.

Let's say that I want to load 3 models A, B, C. To do that, I implemented a Java Model class:

public class Model implements Closeable {

    private String inputName;
    private String outputName;
    private Session session;
    private int inputSize;

    public Model(String modelDir, String inputName, String outputName, int inputSize) {
        SavedModelBundle bundle = SavedModelBundle.load(modelDir, "serve");
        this.inputName = inputName;
        this.outputName = outputName;
        this.inputSize = inputSize;
        this.session = bundle.session();
    }

    @Override
    public void close() {
        session.close();
    }

    // Runs the graph from the named input tensor to the named output tensor.
    public Tensor predict(Tensor t) {
        return session.runner().feed(inputName, t).fetch(outputName).run().get(0);
    }
}

With this class I can easily instantiate 3 Model objects corresponding to my A, B and C models and make predictions with all three in the same Java program. I also noticed that if I have a GPU, all 3 models are loaded onto it.
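
For illustration, here is a minimal usage sketch (the directories, tensor names, sizes and output shape below are placeholders, not my real values):

import org.tensorflow.Tensor;

// Hypothetical usage of the Model class above; all paths, names and shapes are made up.
try (Model a = new Model("/models/A", "input_a", "output_a", 4);
     Model b = new Model("/models/B", "input_b", "output_b", 4);
     Model c = new Model("/models/C", "input_c", "output_c", 4);
     Tensor input = Tensor.create(new float[][] {{0.1f, 0.2f, 0.3f, 0.4f}})) {
    try (Tensor result = a.predict(input)) {
        float[][] out = new float[1][2]; // adjust to the model's actual output shape
        result.copyTo(out);              // tensors must be closed, hence the try blocks
    }
}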

However, I would like only model A to run on the GPU and to force the other two to run on the CPU.

By reading the documentation and diving into the source code, I didn't find a way to do so. I tried defining a new ConfigProto with an empty visible device list and instantiating a new Session with the graph, but it didn't work (see the code below).

public Model(String modelDir, String inputName, String outputName, int inputSize) {
    SavedModelBundle bundle = SavedModelBundle.load(modelDir, "serve");
    this.inputName = inputName;
    this.outputName = outputName;
    this.inputSize = inputSize;
    // Attempt to hide all GPUs from this session via an empty visible device list.
    ConfigProto configProto = ConfigProto.newBuilder()
            .setAllowSoftPlacement(false)
            .setGpuOptions(GPUOptions.newBuilder().setVisibleDeviceList("").build())
            .build();
    this.session = new Session(bundle.graph(), configProto.toByteArray());
}

Even with this configuration, the model still uses the available GPU when it is loaded. Do you have any solution to this problem?

Thank you for your answer.

Alex
  • I am not familiar with the Java API of TensorFlow, but I know that with its Python API you can do something like `with tf.device('/cpu:0'): # your graph` or `with tf.device('/gpu:0'): # your graph`. I think there must be something similar in Java as well. So I searched and found [this answer](https://stackoverflow.com/a/47915987/2099607) on SO. I think it (especially the last piece of code) is the solution you are looking for, but I am not sure. Please confirm this. – today Jul 13 '18 at 19:21

3 Answers


You can set the device configuration of your TensorFlow graph. Here is some relevant code [source].

...
// Session configuration: log where ops are placed, and allow fallback
// when an op has no kernel for the requested device.
byte[] config = ConfigProto.newBuilder()
                           .setLogDevicePlacement(true)
                           .setAllowSoftPlacement(true)
                           .build()
                           .toByteArray();

Session sessions[] = new Session[numModels];

// Rewrites a serialized GraphDef so that every node is pinned to the given device.
public static byte[] modifyGraphDef(byte[] graphDef, String device) throws Exception {
  GraphDef.Builder builder = GraphDef.parseFrom(graphDef).toBuilder();
  for (int i = 0; i < builder.getNodeCount(); ++i) {
    builder.getNodeBuilder(i).setDevice(device);
  }
  return builder.build().toByteArray();
}

graphA.importGraphDef(modifyGraphDef(graphDef, "/gpu:0"));
graphB.importGraphDef(modifyGraphDef(graphDef, "/cpu:0"));
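
To tie the pieces together, here is a sketch of how the fragment above could be used with a SavedModelBundle as in the question (the model path is a placeholder, and handling of modifyGraphDef's checked exception is omitted):

import org.tensorflow.Graph;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;

// Sketch only: load the bundle once, then build a GPU-pinned and a CPU-pinned
// copy of its graph, each with its own session using the config bytes built above.
SavedModelBundle bundle = SavedModelBundle.load("/models/A", "serve");
byte[] graphDef = bundle.graph().toGraphDef();

Graph graphGpu = new Graph();
graphGpu.importGraphDef(modifyGraphDef(graphDef, "/gpu:0"));
Session sessionGpu = new Session(graphGpu, config);

Graph graphCpu = new Graph();
graphCpu.importGraphDef(modifyGraphDef(graphDef, "/cpu:0"));
Session sessionCpu = new Session(graphCpu, config);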

This would probably be cleaner than the more obvious approach of setting the CUDA_VISIBLE_DEVICES environment variable to "" after loading the first model.

modesitt
  • @alex as some debugging: does setting CUDA_VISIBLE_DEVICES to the empty string prevent all the models from going on the GPU (as it should)? – modesitt Jul 17 '18 at 15:43
  • Yes, it does, but I cannot use this solution because it requires modifying an environment variable. – Alex Jul 18 '18 at 12:30

According to this issue, the problem has been fixed in newer source code. Unfortunately, you will have to build from source following these instructions.

Then you can test:

ConfigProto configProto = ConfigProto.newBuilder()
        .setAllowSoftPlacement(true) // fall back to CPU when an op cannot be placed on the GPU
        .setGpuOptions(GPUOptions.newBuilder().setPerProcessGpuMemoryFraction(0.01).build())
        .build();
SavedModelBundle bundle = SavedModelBundle.loader(modelDir)
        .withTags("serve")
        .withConfigProto(configProto.toByteArray())
        .load();
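
Building on that, a per-model CPU restriction could presumably be expressed through the same loader by exposing zero GPU devices to models B and C. An untested sketch (modelDirB is a placeholder; putDeviceCount("GPU", 0) is the device_count option from ConfigProto):

// Sketch: load one model with a config that exposes no GPU devices,
// so its ops are placed on the CPU, while other models keep the default config.
ConfigProto cpuOnly = ConfigProto.newBuilder()
        .setAllowSoftPlacement(true)
        .putDeviceCount("GPU", 0)
        .build();
SavedModelBundle bundleB = SavedModelBundle.loader(modelDirB)
        .withTags("serve")
        .withConfigProto(cpuOnly.toByteArray())
        .load();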
Remzouz

The answers given above did not work for me. Using putDeviceCount("GPU", 0) makes TF use the CPU; this works in version 1.15.0. You can load the same model onto both CPU and GPU, and if the GPU throws Resource exhausted: OOM when allocating tensor, use the CPU model to do the prediction.

// Config that exposes zero GPU devices, forcing ops onto the CPU.
ConfigProto configProtoCpu = ConfigProto.newBuilder()
        .setAllowSoftPlacement(true)
        .putDeviceCount("GPU", 0)
        .build();
SavedModelBundle modelCpu = SavedModelBundle.loader(modelPath)
        .withTags("serve")
        .withConfigProto(configProtoCpu.toByteArray())
        .load();

// Config that uses the GPU, growing its memory allocation as needed.
ConfigProto configProtoGpu = ConfigProto.newBuilder()
        .setAllowSoftPlacement(true)
        .setGpuOptions(GPUOptions.newBuilder().setAllowGrowth(true).build())
        .build();
SavedModelBundle modelGpu = SavedModelBundle.loader(modelPath)
        .withTags("serve")
        .withConfigProto(configProtoGpu.toByteArray())
        .load();
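
A sketch of that fallback (inputName, outputName and input are placeholders; catching RuntimeException and matching the message is my assumption about how the Java API surfaces the OOM error, not a documented contract):

// Sketch: try the GPU model first; on an out-of-memory error, retry on the CPU model.
Tensor result;
try {
    result = modelGpu.session().runner().feed(inputName, input).fetch(outputName).run().get(0);
} catch (RuntimeException e) {
    // Message matching is an assumption; adjust to however your TF version reports OOM.
    if (e.getMessage() != null && e.getMessage().contains("Resource exhausted")) {
        result = modelCpu.session().runner().feed(inputName, input).fetch(outputName).run().get(0);
    } else {
        throw e;
    }
}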
subbu