While running a DQN, the memory of my program increases at every model.fit() call. Using memory_profiler on the train() function of my DQN I get this:
Line # Mem usage Increment Occurrences Line Contents
=============================================================
124 315.7 MiB 315.7 MiB 1 @profile
125 def train(self):
126
127 315.7 MiB 0.0 MiB 1 if len(self.replaybuffer) < MIN_REPLAY_BUFFER:
128 return
129
130                                                       #get indexes from replay buffer
131 315.7 MiB 0.0 MiB 1 rpbIdexes = np.random.choice(range(len(self.replaybuffer)), size=BATCHSIZE, replace=False)
132
133 #predict in bulk for speed up
134 315.7 MiB 0.0 MiB 35 nextObservations = np.array([self.replaybuffer[rpbId][4] for rpbId in rpbIdexes])
135 315.7 MiB 0.0 MiB 1 if self.useTarget:
136 315.7 MiB 0.0 MiB 1 nextPredictions = self.targetmodel(nextObservations.reshape(-1, 4))
137 else:
138 nextPredictions = self.model(nextObservations.reshape(-1, 4))
139
140 # x and y arrays for fitting
141 315.7 MiB 0.0 MiB 1 x, y = [], []
142 315.7 MiB 0.0 MiB 33 for idx, rpbId in enumerate(rpbIdexes):
143 315.7 MiB 0.0 MiB 32 observation = self.replaybuffer[rpbId][0]
144 315.7 MiB 0.0 MiB 32 x.append(observation)
145
146 315.7 MiB 0.0 MiB 32 nextPrediction = np.max(nextPredictions[idx])
147
148 315.7 MiB 0.0 MiB 32 done = self.replaybuffer[rpbId][5]
149 315.7 MiB 0.0 MiB 32 reward = self.replaybuffer[rpbId][3]
150 315.7 MiB 0.0 MiB 32 if done:
151 315.7 MiB 0.0 MiB 2 target = reward
152 else:
153 315.7 MiB 0.0 MiB 30 target = reward + self.gamma * nextPrediction
154                                                       # change the prediction
155 315.7 MiB 0.0 MiB 32 takenAction = self.replaybuffer[rpbId][2]
156 315.7 MiB 0.0 MiB 32 prediction = self.replaybuffer[rpbId][1]
157
158 315.7 MiB 0.0 MiB 32 prediction = prediction.numpy()
159 315.7 MiB 0.0 MiB 32 prediction[takenAction] = target
160 315.7 MiB 0.0 MiB 32 y.append(prediction)
161 315.7 MiB 0.0 MiB 1 x = tf.convert_to_tensor(x)
162 315.7 MiB 0.0 MiB 1 y= tf.convert_to_tensor(y)
163 316.1 MiB 0.4 MiB 1 history = self.model.fit(x=x, y=y, batch_size=BATCHSIZE, verbose=0, shuffle=False)
164 316.1 MiB 0.0 MiB 1 return history
Every time model.fit() gets called, a small amount of memory is "leaked". When the program runs for a long time, the memory fills up.
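To illustrate what I mean by repeated fit() calls, here is roughly the shape of the loop (the layer sizes, action count, and the random data are simplified placeholders, not my actual agent code; only the observation size 4 and BATCHSIZE match my setup):

import numpy as np
import tensorflow as tf

# Simplified stand-in for my Q-network: 4 observation values in, 2 Q-values out.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(24, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(2, activation="linear"),
])
model.compile(optimizer="adam", loss="mse")

BATCHSIZE = 32

# Each iteration mimics one call to train(): fresh x/y tensors, one fit() call.
for step in range(10_000):
    x = tf.convert_to_tensor(np.random.rand(BATCHSIZE, 4).astype(np.float32))
    y = tf.convert_to_tensor(np.random.rand(BATCHSIZE, 2).astype(np.float32))
    model.fit(x=x, y=y, batch_size=BATCHSIZE, verbose=0, shuffle=False)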
I'm using TensorFlow 2.12.0 and Python 3.10, and it's running on my CPU.
I have tried converting the x and y inputs to tensors, but this did not help. I've also tried forcing garbage collection and clearing the Keras session (keras.backend.clear_session()).
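For reference, the cleanup I tried looks roughly like this, run after each training step (assuming the standard gc module and the Keras backend API):

import gc
import tensorflow as tf

# Attempted cleanup after each fit() call: force a garbage collection pass
# and reset the Keras backend state.
gc.collect()
tf.keras.backend.clear_session()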