
I'm building a text summarizer with TensorFlow using the Transformer architecture, for learning purposes, with the following parameters:

encoder vocab size: 100000
decoder vocab size: 10000
encoder maxlen: 1000
decoder maxlen: 80
num layers: 4
d model: 128
dff: 512
num heads: 4
batch size: 32
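For reference, this is roughly how I define them in code (plain Python; the variable names are my own shorthand, not from any library):

```python
# Transformer hyperparameters for the summarizer (names are my own shorthand)
hparams = {
    "encoder_vocab_size": 100_000,
    "decoder_vocab_size": 10_000,
    "encoder_maxlen": 1000,   # max input (article) length in tokens
    "decoder_maxlen": 80,     # max output (summary) length in tokens
    "num_layers": 4,
    "d_model": 128,
    "dff": 512,               # feed-forward inner dimension
    "num_heads": 4,
    "batch_size": 32,
}

# sanity check: d_model must split evenly across attention heads
assert hparams["d_model"] % hparams["num_heads"] == 0
depth_per_head = hparams["d_model"] // hparams["num_heads"]
print(depth_per_head)  # 32
```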

It's working fine, and I'm fairly happy with the initial results, as it's my first model.

My question is: my VRAM usage is always 6.7 GB out of 8 GB, no matter how small I make the hyperparameters. And when I try to make them larger, usage climbs to 6.7 GB and then the well-known OOM error is thrown, even though 1.3 GB is apparently still free. Any thoughts?

All I want is full utilization of my GPU, not just ~80% of it.
