
Sorry, I know this question has been asked many times, but none of the answers I found solved my problem. I have already checked several times that my versions match the compatibility table on the official website. Version info:
System: Windows 10
IDE: PyCharm Professional
Tensorflow version: 2.3.0
CUDA version: 10.1
CUDNN version: cudnn-10.1-windows10-x64-v7.6.5.32_3
Python version: 3.7
environment variable added to PATH

I think TensorFlow can already detect and use my GPU, because I tried running:

import tensorflow as tf
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
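# creating a Session with log_device_placement=True makes TF print device/CUDA library info in the console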
sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))

which prints the expected device and CUDA library logging in the console after running.

In addition, running print(tf.config.experimental.list_physical_devices('GPU')) outputs [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
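As an extra sanity check, something along these lines (a minimal TF 2.x sketch, not taken from my original script) should also log the device each op is placed on:

import tensorflow as tf

# ask TF 2.x to print the device every op runs on
tf.debugging.set_log_device_placement(True)

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(tf.matmul(a, b))  # the placement log should mention device:GPU:0 if the GPU is used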

but when I actually try to run a simple CNN:

import os
# os.environ['TF_CPP_MIN_LOG_LEVEL'] = "1"
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras


class MyCallback(keras.callbacks.Callback):
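    # stop training early once the reported loss drops below 0.01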
    def on_epoch_end(self, epoch, logs={}):
        if logs.get("loss") < .01:
            print("loss below .01, ending training")
            self.model.stop_training = True


if __name__ == '__main__':
    # print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
    callbacks = MyCallback()

    mnist = keras.datasets.fashion_mnist
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # plt.imshow(trainX[0], cmap="gray")
    # plt.show()
    trainX = trainX / 255.0
    testX = testX / 255.0
    trainX = trainX.reshape(60000, 28, 28, 1)
    testX = testX.reshape(10000, 28, 28, 1)
    model = keras.models.Sequential([keras.layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu,
                                                         input_shape=(28, 28, 1), padding="same"),
                                     keras.layers.MaxPooling2D(2, 2),
                                     keras.layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu),
                                     keras.layers.MaxPooling2D(3, 3),
                                     keras.layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu),
                                     keras.layers.Flatten(),
                                     keras.layers.Dense(256, activation=tf.nn.leaky_relu),
                                     # keras.layers.Dense(128, activation=tf.nn.leaky_relu),
                                     keras.layers.Dense(10, activation=tf.nn.softmax)])
    model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.0012),
                  loss=tf.losses.sparse_categorical_crossentropy)
    model.fit(trainX, trainY, epochs=4000, verbose=2, callbacks=[callbacks])
    model.evaluate(testX, testY)

it shows:

2020-12-25 01:05:12.889545: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location. This message will be only logged once.

The first few lines of the output show the same warning (the full log is included below).

Another reason I believe it is not using the GPU: Task Manager shows GPU usage at <=3% while CPU usage is >=45%. Can somebody help? Thanks!

full warning:

2020-12-25 01:12:13.322490: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-12-25 01:12:16.212006: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-12-25 01:12:16.253006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5
coreClock: 1.59GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-12-25 01:12:16.254288: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-12-25 01:12:16.259076: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-12-25 01:12:16.262699: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-12-25 01:12:16.264358: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-12-25 01:12:16.268891: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-12-25 01:12:16.272625: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-12-25 01:12:16.282877: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-25 01:12:16.283119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-12-25 01:12:16.283488: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-12-25 01:12:16.291040: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x223ba066800 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-25 01:12:16.291333: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-25 01:12:16.293402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5
coreClock: 1.59GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-12-25 01:12:16.294634: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-12-25 01:12:16.294798: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-12-25 01:12:16.294981: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-12-25 01:12:16.295133: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-12-25 01:12:16.295285: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-12-25 01:12:16.295438: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-12-25 01:12:16.295620: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-25 01:12:16.296033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-12-25 01:12:16.939243: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-25 01:12:16.939417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-12-25 01:12:16.939519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-12-25 01:12:16.939768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4615 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-12-25 01:12:16.942524: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x223fb58b970 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-12-25 01:12:16.942780: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1660 Ti, Compute Capability 7.5
Epoch 1/4000
2020-12-25 01:12:17.682064: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-12-25 01:12:17.988403: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-25 01:12:19.257223: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location. This message will be only logged once.

  • Everything here points to the GPU being used; you cannot use GPU utilization alone to infer whether the GPU is being used or not. I do not see a problem here. – Dr. Snoopy Dec 24 '20 at 17:22
  • Thanks for answering. So does that mean I can just ignore the warning "Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location. This message will be only logged once."? – seermer Dec 24 '20 at 17:27
  • If you are in doubt, you could try to force device placement using with tf.device('/GPU:0'): (in case your device is not available, it will fail). It is definitely suspicious that your CPU utilization is high and GPU low. – CrazyBrazilian Dec 25 '20 at 04:44
  • Thanks! I think using with tf.device('/GPU:0'): works! Now GPU usage is getting higher. – seermer Dec 25 '20 at 05:21
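For reference, a minimal sketch of the fix suggested in the comments, using the same Fashion-MNIST data as above (the layer stack here is just an illustrative subset, not the exact model from the question):

import tensorflow as tf
from tensorflow import keras

(trainX, trainY), _ = keras.datasets.fashion_mnist.load_data()
trainX = (trainX / 255.0).reshape(-1, 28, 28, 1)

# pin op creation and execution to the first GPU;
# as noted in the comments, this should fail if '/GPU:0' is not actually available
with tf.device('/GPU:0'):
    model = keras.models.Sequential([
        keras.layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu,
                            input_shape=(28, 28, 1), padding="same"),
        keras.layers.MaxPooling2D(2, 2),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation=tf.nn.softmax),
    ])
    model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.0012),
                  loss=tf.losses.sparse_categorical_crossentropy)
    model.fit(trainX, trainY, epochs=1, verbose=2)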
