
I've done a fresh install of the Jupyter Notebook kernel and Python packages, including TensorFlow 2.4.1 (in a miniconda env).

When I train and test a model, my CPU usage saturates. In my old install that didn't happen (CPU usage stayed low), and the time to accomplish the tasks was nearly the same.

Is there a relevant Jupyter and/or TensorFlow config? I've tested in both Jupyter Notebook and VSCode, and the same problem occurs. Ubuntu 20.04, 16 GB RAM, Intel® Core™ i5-8300H CPU @ 2.30GHz × 8.

[Screenshot: CPU usage when training a simple network model - htop view]

Edit: problem solved.

I did some digging on the Intel website and found this link about threading configuration for TensorFlow and OpenMP. I ran some quick tests varying the TensorFlow 2.x parameters below, which gave no improvement.

import tensorflow as tf

# Cap the number of threads used to run independent ops in parallel
tf.config.threading.set_inter_op_parallelism_threads(2)
# Cap the number of threads used within a single op (e.g. a matmul)
tf.config.threading.set_intra_op_parallelism_threads(2)
# Fall back to CPU when an op has no implementation for the requested device
tf.config.set_soft_device_placement(True)

Then I tested the OpenMP settings, varying OMP_NUM_THREADS from 0 to 8, as shown in the graph below:

[Graph: training time vs OMP_NUM_THREADS]

import os
os.environ["OMP_NUM_THREADS"] = "16"

CPU usage was reduced, along with the training time.
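One detail worth flagging (a known OpenMP behavior, not stated in the original post): OMP_NUM_THREADS is read when the OpenMP runtime initializes, so setting it only takes effect if it happens before TensorFlow is imported. A minimal sketch:

```python
import os

# Set this before any OpenMP-backed library (such as TensorFlow) is imported,
# otherwise the runtime may already have created its thread pool.
os.environ["OMP_NUM_THREADS"] = "1"

# import tensorflow as tf  # import only after the variable is set
```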

[Screenshot: CPU usage for OMP_NUM_THREADS equal to 0]

Note: I am not an expert in ML benchmarks; I just kept the training parameters and topology fixed for a keras.Sequential() model. I don't know why my CPU was running the maximum of OMP_NUM_THREADS=16 threads by default.

  • Sounds like an excellent, efficient use of your CPU cores. – 9769953 Aug 27 '21 at 13:48
  • Do you want it to be less? While I don't know TensorFlow, I seem to recall there are straightforward options to set the number of cores/threads to be used. – 9769953 Aug 27 '21 at 13:49
  • "and the time to accomplish the tasks was barely the same.": are the results the same? Perhaps one has better results than the other. – 9769953 Aug 27 '21 at 13:50
  • The problem is that my laptop lags, because I need to work on other tasks while training is running. The results are the same, and there is no reduction in training time either. – Mateus Hufnagel Aug 27 '21 at 14:38
  • I think I solved it. You're right, the CPU cores were being used very efficiently, but that wasn't improving processing time for my problem (a simple Keras sequential model). I changed `os.environ["OMP_NUM_THREADS"] = "1"` and CPU usage dropped drastically, reducing the training time. – Mateus Hufnagel Aug 27 '21 at 23:10

1 Answer


In our multi-user environment we need to keep some of the CPUs free for higher-priority jobs (we don't have the rights to nice processes), so reining in TensorFlow's (version 2.8.2) CPU greed is essential. The solution above works in our CPU-only environment (Linux, 40 cores) with the inter/intra-op threads restricted to 1.

import os
# Must be set before importing tensorflow
os.environ["OMP_NUM_THREADS"] = "8"

import tensorflow as tf
tf.config.threading.set_inter_op_parallelism_threads(1)
tf.config.threading.set_intra_op_parallelism_threads(1)

This restricts CPU usage to 800%.
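A quick back-of-the-envelope check (my arithmetic, not from the answer): htop reports each fully busy core as 100%, so with the OpenMP pool capped at 8 threads the ceiling is roughly 8 × 100%:

```python
# Each OpenMP worker can saturate one core, which htop reports as 100%,
# so the ceiling scales linearly with the thread count.
omp_threads = 8
peak_cpu_percent = omp_threads * 100
print(peak_cpu_percent)  # 800
```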

Rriskit