My understanding is that TensorFlow creates two thread pools for each device: one for intra-op parallelism and one for inter-op parallelism.
Suppose there are 3 independent ops A, B, and C placed on `/gpu:0`, and `intra_op_parallelism_threads=5`. If A and B each have a single-threaded GPU kernel implementation while C has a multi-threaded one, does that mean all three can run in parallel on the same device, with A and B each using a single GPU thread and C using up to the remaining 3?
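Concretely, I'm assuming the limit is set per session via `tf.ConfigProto` (a minimal sketch using the TF 1.x API):

```python
import tensorflow as tf  # TF 1.x API

# Minimal sketch of the setup described above: cap the intra-op
# thread pool at 5 threads for this session.
config = tf.ConfigProto(intra_op_parallelism_threads=5)
sess = tf.Session(config=config)
```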
Now suppose `inter_op_parallelism_threads=2`. Does that mean only 2 of the 3 ops can be evaluated simultaneously on `/gpu:0`, so in the example above it might be A+B, B+C, or A+C, depending on which ops get scheduled first?
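To make the scenario concrete, here is a sketch of the graph I have in mind; the three `tf.matmul` ops are hypothetical stand-ins for A, B, and C, with no data dependencies among them:

```python
import tensorflow as tf  # TF 1.x API

# Hypothetical stand-ins for the independent ops A, B, and C,
# all placed on the same GPU.
with tf.device('/gpu:0'):
    a = tf.matmul(tf.random_normal([512, 512]), tf.random_normal([512, 512]))  # op A
    b = tf.matmul(tf.random_normal([512, 512]), tf.random_normal([512, 512]))  # op B
    c = tf.matmul(tf.random_normal([512, 512]), tf.random_normal([512, 512]))  # op C

config = tf.ConfigProto(intra_op_parallelism_threads=5,
                        inter_op_parallelism_threads=2)
with tf.Session(config=config) as sess:
    # On my reading, at most 2 of {a, b, c} would be dispatched
    # concurrently here. Is that correct?
    sess.run([a, b, c])
```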
Note: I'm trying to make sense of @mrry's answer to this question: "Tensorflow: executing an ops with a specific core of a CPU".