I am trying to train a Keras autoencoder on preprocessed data stored in Parquet format. To stream the data from disk on the fly instead of loading the whole dataset into memory, I am using petastorm.
The trouble is, GPU utilization in my Kaggle notebook barely touches 5-6%, whereas I want close to full GPU utilization.
Below is the code I am using:
```
from petastorm import make_batch_reader
from petastorm.tf_utils import make_petastorm_dataset
from tensorflow.keras.callbacks import EarlyStopping
import tensorflow as tf

early_stop = EarlyStopping(monitor='val_loss', patience=1)
with make_batch_reader('file:///kaggle/working/scaled.parquet', num_epochs=1, shuffle_row_groups=False) as train_reader:
    # Load a main batch of 6400 rows at a time
    train_ds = make_petastorm_dataset(train_reader).unbatch().map(lambda x: tf.convert_to_tensor(x)).batch(6400, drop_remainder=True)
    print(type(train_ds))
    for ele in train_ds:
        tensor = tf.reshape(ele, (6400, 1, 15))
        # Obtain sub-batches of 32 rows from the main batch and train on them
        model.fit(tensor, tensor, batch_size=32, epochs=1, validation_split=0.1, callbacks=[early_stop], verbose=1)
```
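For clarity, the variant I was considering instead is streaming 32-row batches straight into a single `fit()` call, so tf.data can prefetch the next batch while the GPU computes. This is only a rough, untested sketch (it reuses `model` and `early_stop` from above): the `prefetch()` call, the `to_pair` helper that pairs each batch with itself as the autoencoder target, and dropping `validation_split` (which Keras does not support with dataset inputs) are my own assumptions, not code I have verified.

```
import tensorflow as tf
from petastorm import make_batch_reader
from petastorm.tf_utils import make_petastorm_dataset

def to_pair(t):
    # Reshape a 32-row batch to the model's input shape and pair it
    # with itself as the autoencoder target. (My assumption, untested.)
    t = tf.reshape(t, (32, 1, 15))
    return t, t

with make_batch_reader('file:///kaggle/working/scaled.parquet',
                       num_epochs=1, shuffle_row_groups=False) as train_reader:
    train_ds = (make_petastorm_dataset(train_reader)
                .unbatch()
                .map(lambda x: tf.convert_to_tensor(x))  # same conversion as above
                .batch(32, drop_remainder=True)          # GPU batch size directly
                .map(to_pair)
                .prefetch(tf.data.AUTOTUNE))             # overlap I/O with compute
    model.fit(train_ds, epochs=1, callbacks=[early_stop], verbose=1)
```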
What I have tried so far:
I implemented the following based on suggestions I found online, but it made no difference:
```
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only use the first GPU
    try:
        tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
        tf.config.experimental.set_memory_growth(gpus[0], True)
    except RuntimeError as e:
        print(e)
```
I also ran a test suggested by a Stack Overflow post, which confirms that the GPU is active.
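For reference, the kind of check I mean is along these lines (a generic device-placement sanity test; I am not certain it is the exact snippet from that post):

```
import tensorflow as tf

print(tf.config.list_physical_devices('GPU'))  # should list at least one GPU

tf.debugging.set_log_device_placement(True)    # log where each op executes
with tf.device('/GPU:0'):
    a = tf.random.uniform((1000, 1000))
    b = tf.random.uniform((1000, 1000))
    print(tf.reduce_sum(tf.matmul(a, b)))      # placement log should show GPU:0
```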
How do I increase GPU utilization and accelerate training?