
I am trying to run my Keras code with model parallelism. I have been searching the net about it and found guidance from TensorFlow, and also here, but neither worked. I get this error every time:

2022-08-02 09:43:51.638045: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-02 09:43:51.861293: F tensorflow/core/platform/statusor.cc:33] Attempting to fetch value instead of handling error INTERNAL: failed initializing StreamExecutor for CUDA device ordinal 0: INTERNAL: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported: 42505273344
Aborted (core dumped)
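
In case it is relevant: the crash seems to happen while TensorFlow is initializing CUDA device 0, before my model is even built. Below is the memory-growth setup I understand should run before any other GPU work (a minimal sketch using the tf.config API; I am not sure whether it applies to this error):

import tensorflow as tf

# Ask TensorFlow to allocate GPU memory on demand instead of reserving
# (almost) all of it per device up front. This must run before the GPUs
# are initialized, i.e. before any op touches them.
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
print(len(gpus), "GPUs visible")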

Could somebody explain what the problem is and how to solve it?

Thank you

More explanation: I want to split my model so that each GPU runs a part of it, to avoid the memory problem.

The code:

import tensorflow as tf

# seg(), expanding_layer() and refinement() are my own helper functions,
# defined elsewhere in my code.
def model():
  input_low = tf.keras.layers.Input((None, None, 3))
  input_med = tf.keras.layers.Input((None, None, 3))
  input_high = tf.keras.layers.Input((None, None, 3))

  # segmentation stage
  seg_low = seg(input_low, 'seg_low')
  expand_seg_low = expanding_layer(seg_low)

  seg_med = seg(input_med, 'seg_med')
  expand_seg_med = expanding_layer(seg_med)

  seg_high = seg(input_high, 'seg_high')
  expand_seg_high = expanding_layer(seg_high)

  inps = [input_low, input_med, input_high]
  segs = [expand_seg_low, expand_seg_med, expand_seg_high]

  # refinement stage combines the inputs with the expanded segmentations
  final_out = refinement(inps, segs)

  model = tf.keras.Model(inputs=[input_low, input_med, input_high],
                         outputs=[seg_low, seg_high, final_out])
  model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                loss=['binary_crossentropy',
                      'binary_crossentropy',
                      'mse'],
                metrics=['accuracy'])
  # tf.keras.utils.plot_model(model, to_file="model.png", show_shapes=True)
  return model
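
For clarity, this is the kind of per-branch device placement I am trying to achieve. It is only a minimal sketch: the seg/expanding_layer/refinement bodies below are hypothetical stand-ins so the snippet runs on its own, and I am assuming that wrapping the layer calls in tf.device pins each branch's weights to that GPU:

import tensorflow as tf

# Hypothetical stand-ins for my real sub-networks, just so this runs.
def seg(x, name):
    return tf.keras.layers.Conv2D(8, 3, padding='same',
                                  activation='relu', name=name)(x)

def expanding_layer(x):
    return tf.keras.layers.Conv2D(16, 1)(x)

def refinement(inps, segs):
    x = tf.keras.layers.Concatenate()(inps + segs)
    return tf.keras.layers.Conv2D(1, 1, activation='sigmoid')(x)

def parallel_model():
    input_low = tf.keras.layers.Input((None, None, 3))
    input_med = tf.keras.layers.Input((None, None, 3))
    input_high = tf.keras.layers.Input((None, None, 3))

    # Pin each segmentation branch to its own GPU so its weights and
    # activations live on that device.
    with tf.device('/GPU:0'):
        seg_low = seg(input_low, 'seg_low')
        expand_seg_low = expanding_layer(seg_low)
    with tf.device('/GPU:1'):
        seg_med = seg(input_med, 'seg_med')
        expand_seg_med = expanding_layer(seg_med)
    with tf.device('/GPU:2'):
        seg_high = seg(input_high, 'seg_high')
        expand_seg_high = expanding_layer(seg_high)

    inps = [input_low, input_med, input_high]
    segs = [expand_seg_low, expand_seg_med, expand_seg_high]

    # The refinement stage has to gather tensors from all three GPUs.
    with tf.device('/GPU:0'):
        final_out = refinement(inps, segs)

    return tf.keras.Model(inputs=inps,
                          outputs=[seg_low, seg_high, final_out])

I am not sure whether this placement survives graph compilation, which is part of what I am asking.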
  • It clearly says your GPU is out of memory. Reduce the batch size. – Zabir Al Nazi Aug 02 '22 at 09:55
  • @ZabirAlNazi Thank you, but I am actually using three GPUs and a batch size of 2; the model is just deep. That is why I am trying to use model parallelism. – Ali Reza Omrani Aug 02 '22 at 10:04
  • Can you add a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example)? Without any code it is not easy to help you. – ClaudiaR Aug 02 '22 at 10:14
  • @ClaudiaR I just added a brief explanation and the code of my model; I don't know if this is what you meant. – Ali Reza Omrani Aug 02 '22 at 10:36

0 Answers