
Context:

I wanted to test the speedup that can be achieved by using the GPU of a 2019 MacBook Pro for TensorFlow operations.

As advised, the following snippet uses TensorFlow's built-in function tf.multiply() to multiply a tensor by a constant:

import tensorflow as tf

tensor = tf.constant([[1, 2],
                      [3, 4]])

def cpu():
    with tf.device('/CPU:0'):
        tensor_1 = tf.multiply(tensor, 2)
        return tensor_1

def gpu():
    with tf.device('/device:GPU:0'):
        tensor_1 = tf.multiply(tensor, 2)
        return tensor_1


# We run each op once to warm up; see: https://stackoverflow.com/a/45067900
cpu()
gpu()

import time
n = 10000

start = time.time()
for i in range(n):
    cpu()
end = time.time()

cpu_time = end - start


start = time.time()
for i in range(n):
    gpu()
end = time.time()
gpu_time = end - start

print('GPU speedup over CPU: {}x'.format(cpu_time / gpu_time))
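
For reference, a 2x2 element-wise multiply involves almost no arithmetic, so each iteration is dominated by Python and op-dispatch overhead. A rough per-call measurement (a sketch using Python's timeit; absolute numbers will vary by machine):

import timeit
import tensorflow as tf

tensor = tf.constant([[1, 2],
                      [3, 4]])

# Only four multiplications per call, so the measured time is almost
# entirely call and dispatch overhead rather than compute.
per_call = timeit.timeit(lambda: tf.multiply(tensor, 2), number=10000) / 10000
print('~{:.1f} microseconds per call'.format(per_call * 1e6))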

Results:

The maximum speedup I could achieve was about 1x, i.e., the GPU run was no faster than the CPU run.

My Question:

Ideally, if tf.multiply() is optimised to run on the GPU, why am I not getting a better speedup of, let's say, 2x to 10x?

System & environment:

  1. macOS Ventura 13.3.1
  2. Processor - 2.6 GHz 6-Core Intel Core i7
  3. Graphics - AMD Radeon Pro 5300M 4 GB, Intel UHD Graphics 630 1536 MB
  4. Memory - 16 GB 2667 MHz DDR4
  5. tensorflow-macos python package version = 2.9.0
  6. tensorflow-metal python package version = 0.6.0
  7. Python version - 3.8.16
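
A quick way to confirm that TensorFlow can actually see the GPU (a minimal check; the exact device strings vary by setup):

import tensorflow as tf

# With tensorflow-metal installed, the AMD GPU should appear in the
# GPU device list alongside the CPU device.
print(tf.config.list_physical_devices('CPU'))
print(tf.config.list_physical_devices('GPU'))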

1 Answer

Special thanks to @JonSG and Dr. Matias Valdenegro-Toro (@Dr. Snoopy) for suggesting in the comments that I try operations on tensors of higher order (larger shapes). With a 2x2 tensor, per-call dispatch overhead dominates the runtime; a large matrix multiplication gives the GPU enough work to amortise that overhead.

I implemented a larger matrix multiplication as follows and could achieve a roughly 8.9x speedup:

import tensorflow as tf

tensor_1 = tf.random.uniform(shape=(400, 2700))
tensor_2 = tf.random.uniform(shape=(2700, 800))

def cpu():
    with tf.device('/CPU:0'):
        tensor_3 = tf.matmul(tensor_1, tensor_2)
        return tensor_3

def gpu():
    with tf.device('/device:GPU:0'):
        tensor_3 = tf.matmul(tensor_1, tensor_2)
        return tensor_3


# We run each op once to warm up; see: https://stackoverflow.com/a/45067900
cpu()
gpu()

import time
n = 1000

start = time.time()
for i in range(n):
    cpu()
end = time.time()

cpu_time = end - start


start = time.time()
for i in range(n):
    gpu()
end = time.time()
gpu_time = end - start

print('GPU speedup over CPU: {}x'.format(cpu_time / gpu_time))

Output:

GPU speedup over CPU: 8.871644178654558x
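
One caveat about the timing methodology (an assumption about eager GPU execution in general, not something verified on this exact setup): GPU kernels can be dispatched asynchronously, so stopping the clock without reading the result back may under-count the GPU time. A minimal sketch that forces a host read before taking the end timestamp:

import time
import tensorflow as tf

tensor_1 = tf.random.uniform(shape=(400, 2700))
tensor_2 = tf.random.uniform(shape=(2700, 800))

def gpu():
    with tf.device('/device:GPU:0'):
        return tf.matmul(tensor_1, tensor_2)

gpu()  # warm-up

start = time.time()
for _ in range(1000):
    result = gpu()
# .numpy() copies the result to host memory, which waits for the
# queued GPU work to finish before the end timestamp is taken.
_ = result.numpy()
gpu_time = time.time() - start
print('GPU time with sync: {:.3f}s'.format(gpu_time))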