I know this question has been asked many times, but none of the answers I found solved my problem. I have checked several times that my versions match the compatible versions listed on the official website. Version info:
System: Windows 10
IDE: PyCharm Professional
Tensorflow version: 2.3.0
CUDA version: 10.1
CUDNN version: cudnn-10.1-windows10-x64-v7.6.5.32_3
Python version: 3.7
Environment variables: added to PATH
I think tf can already recognize and run with my gpu, since I tried running:
import tensorflow as tf
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))
In addition, running print(tf.config.experimental.list_physical_devices('GPU'))
outputs
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
but when I actually try to run a simple CNN:
import os
# os.environ['TF_CPP_MIN_LOG_LEVEL'] = "1"
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras
class MyCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get("loss") < .01:
            print("loss below .01, ending training")
            self.model.stop_training = True
if __name__ == '__main__':
    # print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
    callbacks = MyCallback()
    mnist = keras.datasets.fashion_mnist
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # plt.imshow(trainX[0], cmap="gray")
    # plt.show()
    trainX = trainX / 255.0
    testX = testX / 255.0
    trainX = trainX.reshape(60000, 28, 28, 1)
    testX = testX.reshape(10000, 28, 28, 1)
    model = keras.models.Sequential([
        keras.layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu,
                            input_shape=(28, 28, 1), padding="same"),
        keras.layers.MaxPooling2D(2, 2),
        keras.layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu),
        keras.layers.MaxPooling2D(3, 3),
        keras.layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu),
        keras.layers.Flatten(),
        keras.layers.Dense(256, activation=tf.nn.leaky_relu),
        # keras.layers.Dense(128, activation=tf.nn.leaky_relu),
        keras.layers.Dense(10, activation=tf.nn.softmax)])
    model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.0012),
                  loss=tf.losses.sparse_categorical_crossentropy)
    model.fit(trainX, trainY, epochs=4000, verbose=2, callbacks=[callbacks])
    model.evaluate(testX, testY)
it shows:
2020-12-25 01:05:12.889545: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location. This message will be only logged once.
the first few lines of outputs:
Another reason I believe it is not using the GPU is that Task Manager shows GPU usage at <=3% while CPU usage is >=45%. Can somebody help? Thanks!
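A more direct check than Task Manager might be to ask TensorFlow where an op actually ran. The following is only a minimal sketch, assuming TensorFlow 2.x with eager execution (the default in 2.3); `matmul_device` is a hypothetical helper name, not part of TensorFlow, and it returns None when TensorFlow is not installed.

```python
def matmul_device():
    """Return the device string on which a small matmul ran.

    Hypothetical helper for illustration. Returns None if TensorFlow
    cannot be imported. Assumes TF 2.x eager execution, where the result
    tensor exposes the device it was placed on via `.device`.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Same small matrices as in the earlier snippet.
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)  # executed immediately in eager mode
    return c.device


if __name__ == "__main__":
    # A string ending in "GPU:0" would suggest the op was placed on the GPU.
    print(matmul_device())
```

If the printed device ends in `CPU:0` even though a GPU is listed, that would point to a placement problem rather than a monitoring one.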
full warning:
2020-12-25 01:12:13.322490: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-12-25 01:12:16.212006: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-12-25 01:12:16.253006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5 coreClock: 1.59GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-12-25 01:12:16.254288: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-12-25 01:12:16.259076: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-12-25 01:12:16.262699: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-12-25 01:12:16.264358: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-12-25 01:12:16.268891: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-12-25 01:12:16.272625: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-12-25 01:12:16.282877: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-25 01:12:16.283119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-12-25 01:12:16.283488: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-12-25 01:12:16.291040: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x223ba066800 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-25 01:12:16.291333: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-25 01:12:16.293402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5 coreClock: 1.59GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-12-25 01:12:16.294634: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-12-25 01:12:16.294798: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-12-25 01:12:16.294981: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-12-25 01:12:16.295133: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-12-25 01:12:16.295285: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-12-25 01:12:16.295438: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-12-25 01:12:16.295620: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-25 01:12:16.296033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-12-25 01:12:16.939243: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-25 01:12:16.939417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-12-25 01:12:16.939519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-12-25 01:12:16.939768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4615 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-12-25 01:12:16.942524: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x223fb58b970 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-12-25 01:12:16.942780: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1660 Ti, Compute Capability 7.5
Epoch 1/4000
2020-12-25 01:12:17.682064: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-12-25 01:12:17.988403: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-25 01:12:19.257223: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location. This message will be only logged once.
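To make a log like this easier to read, it may help to tally entries by severity letter (`I` for info, `W` for warning) and pull out the "Created TensorFlow device" lines. The sketch below uses only the Python standard library; `summarize_log` and `LOG_LINE` are hypothetical names introduced here, the regular expression is an assumption based on the line layout visible above, and the sample lines are copied from this log.

```python
import re
from collections import Counter

# Assumed layout: "<date> <time>: <LEVEL> <source file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<src>\S+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def summarize_log(text):
    """Count entries per severity and collect device-creation messages."""
    levels = Counter()
    created = []
    for raw in text.splitlines():
        m = LOG_LINE.match(raw.strip())
        if not m:
            continue  # skip lines such as "Epoch 1/4000"
        levels[m.group("level")] += 1
        if m.group("msg").startswith("Created TensorFlow device"):
            created.append(m.group("msg"))
    return levels, created


SAMPLE = """\
2020-12-25 01:12:16.282877: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-12-25 01:12:16.939768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4615 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
Epoch 1/4000
2020-12-25 01:12:19.257223: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location. This message will be only logged once.
"""

levels, created = summarize_log(SAMPLE)
print(dict(levels))
for msg in created:
    print(msg)
```

On the sample, the only non-info entry is the single `W` line from `redzone_allocator.cc`, and one GPU device is reported as created.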