I believe my neural network model is only running on the CPU: whenever it's training I check Task Manager and the GPU utilization is 0%. I also compared the elapsed training time before and after installing the necessary GPU packages, and it's always the same. What am I doing wrong?
I have an RTX 3060 GPU, I installed CUDA v11.6 and cuDNN v11.5, I added all the necessary paths to the environment variables, and I also have tf-gpu and keras-gpu installed.
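In case it's relevant, this is the check I can run to confirm the installed binary itself was built against CUDA (a minimal sketch, assuming a TF 2.x wheel; I'm using .get() since the exact build-info keys can vary by build):

import tensorflow as tf

# True only if this TensorFlow build links against CUDA
print(tf.test.is_built_with_cuda())

# CUDA/cuDNN versions the wheel was compiled for
info = tf.sysconfig.get_build_info()
print(info.get("cuda_version"), info.get("cudnn_version"))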
Running this code:
print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tensorflow.keras.__version__}")
print()
print(f"Python {sys.version}")
print(f"Pandas {pd.__version__}")
gpu = len(tf.config.list_physical_devices('GPU'))>0
print("GPU is", "available" if gpu else "NOT AVAILABLE")
print(device_lib.list_local_devices())
prints:
TensorFlow Version: 2.6.0
Keras Version: 2.6.0
Python 3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)]
Pandas 1.4.1
GPU is available
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13579942905822528038
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3665166336
locality {
bus_id: 1
links {
}
}
incarnation: 16782751195021072368
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6"
]
2022-03-28 16:03:59.493492: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-28 16:04:00.265191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3495 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
2022-03-28 16:04:00.450856: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2022-03-28 16:04:01.910615: I tensorflow/stream_executor/cuda/cuda_blas.cc:1760] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2022-03-28 16:19:08.490212: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /device:GPU:0 with 3495 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
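For reference, turning on device placement logging should show which device each op actually executes on (a minimal sketch; the matmul shapes are arbitrary):

import tensorflow as tf

# Log the device every op runs on; GPU kernels show up as .../device:GPU:0
tf.debugging.set_log_device_placement(True)

a = tf.random.uniform((1000, 1000))
b = tf.random.uniform((1000, 1000))
c = tf.matmul(a, b)
print(c.device)  # e.g. /job:localhost/replica:0/task:0/device:GPU:0 if the GPU is used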
And this is my neural network model:
import glob
import time

import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers, metrics

model = Sequential()
model.add(layers.Dense(20, input_shape=(3,)))
model.add(layers.Dense(30))
model.add(layers.Dense(1))

model.compile(loss='mean_absolute_error',
              optimizer='adam',
              metrics=[metrics.MeanAbsoluteError()])

t1 = time.time()
# x_train and y_train are loaded elsewhere from my data files
model.fit(x_train, y_train, epochs=10, batch_size=None)
t2 = time.time()
print(t2 - t1)
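If it matters, this is the kind of self-contained variant I could time to rule out the model simply being too small for Task Manager to register (the dummy data shapes and sizes are assumptions so the snippet runs on its own):

import time
import numpy as np
import tensorflow as tf

# Synthetic data with the same feature shape as my model (sizes assumed)
x_train = np.random.rand(100_000, 3).astype("float32")
y_train = np.random.rand(100_000, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, input_shape=(3,)),
    tf.keras.layers.Dense(30),
    tf.keras.layers.Dense(1),
])
model.compile(loss='mean_absolute_error', optimizer='adam')

t1 = time.time()
model.fit(x_train, y_train, epochs=10, batch_size=1024, verbose=0)
print(time.time() - t1)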