I'm building a goalkeeper bot for the game HaxBall. It played well after a short amount of training, but it got worse the longer I trained it.
Latest reinforcement learning state: 5160 episodes, 4171281 steps, epsilon 0.05.
Latest fit result: acc: 0.9905, loss: 293408940460887.5000
(Earlier fit results looked like: acc: 0.7, loss: 0.012)
Last game result image: https://i.ibb.co/NmLL5b7/hax2.gif
This is my Keras model:
NBACTIONS = 3
WINDOW_WIDTH = 630
WINDOW_HEIGHT = 400
IMGHEIGHT = int(WINDOW_HEIGHT/5)
IMGWIDTH = int(WINDOW_WIDTH/5)
IMGHISTORY = 4
from keras.models import Sequential
from keras.layers import Activation, BatchNormalization, Conv2D, Dense, Flatten, MaxPool2D
from keras.optimizers import Adam

model = Sequential()
# Input: IMGHISTORY stacked greyscale frames of size IMGHEIGHT x IMGWIDTH.
model.add(Conv2D(32, kernel_size=4, strides=(2, 2), padding="same",
                 input_shape=(DCQL_Global.IMGHEIGHT, DCQL_Global.IMGWIDTH, DCQL_Global.IMGHISTORY)))
model.add(Activation("relu"))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=4, strides=(2, 2), padding="same"))
model.add(Activation("relu"))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=3, strides=(1, 1), padding="same"))
model.add(Activation("relu"))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation("relu"))
# One linear output per action (Q-values).
model.add(Dense(units=DCQL_Global.NBACTIONS, activation="linear"))
model.compile(loss="mse", optimizer=Adam(lr=0.0001), metrics=['accuracy'])
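As a sanity check on the architecture, the feature-map sizes can be traced with plain arithmetic. This is only a sketch: the helper names (`conv_same`, `pool_valid`, `trace_shapes`) are made up for illustration, and it assumes Keras' usual rules, namely that a "same"-padded convolution outputs ceil(size / stride) and that MaxPool2D with its default "valid" padding outputs floor(size / pool).

```python
import math

WINDOW_WIDTH = 630
WINDOW_HEIGHT = 400
IMGHEIGHT = int(WINDOW_HEIGHT / 5)  # 80
IMGWIDTH = int(WINDOW_WIDTH / 5)    # 126


def conv_same(size, stride):
    # "same" padding: output size depends only on the stride.
    return math.ceil(size / stride)


def pool_valid(size, pool=2):
    # Default MaxPool2D: stride equals pool size, no padding.
    return size // pool


def trace_shapes(height=IMGHEIGHT, width=IMGWIDTH, last_filters=64):
    """Return (height, width) after each conv/pool stage and the Flatten size."""
    h, w = height, width
    shapes = []
    for stride in (2, 2, 1):  # strides of the three Conv2D layers
        h, w = conv_same(h, stride), conv_same(w, stride)
        shapes.append(("conv", h, w))
        h, w = pool_valid(h), pool_valid(w)
        shapes.append(("pool", h, w))
    return shapes, h * w * last_filters


shapes, flat_size = trace_shapes()
for name, h, w in shapes:
    print(name, h, w)
print("flatten:", flat_size)
```

Under these assumptions the 80 x 126 input shrinks to a 2 x 4 map before Flatten, so the Dense(512) layer receives 512 features.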
Reward rules: conceding a goal gives -1; a save (not conceding a goal) gives +1; after the ball is kicked, the goalkeeper gets +0.01 if it follows the ball correctly and -0.01 if it follows the ball incorrectly.
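In code, the reward rules roughly correspond to the sketch below. It is not my actual implementation: the function name, the boolean arguments, and the priority order of the checks are hypothetical, and returning 0 when none of the rules applies is an assumption.

```python
GOAL_CONCEDED_REWARD = -1.0
SAVE_REWARD = 1.0
FOLLOW_REWARD = 0.01


def compute_reward(goal_conceded, saved, ball_kicked, follows_ball):
    """Hypothetical reward function mirroring the rules described above."""
    if goal_conceded:
        return GOAL_CONCEDED_REWARD
    if saved:
        return SAVE_REWARD
    if ball_kicked:
        # Small shaping reward while the ball is in flight.
        return FOLLOW_REWARD if follows_ball else -FOLLOW_REWARD
    # Assumption: no reward when none of the rules applies.
    return 0.0
```

The terminal rewards are 100 times larger than the shaping rewards, so every per-step reward stays within [-1, 1].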
I preprocess each game image as shown in the following link:
https://i.ibb.co/yPdrJKJ/hax.gif
Preprocessing code:
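import numpy as np
import skimage.color
import skimage.exposure
import skimage.transform

def ProcessGameImage(I):
    GreyImage = skimage.color.rgb2gray(I)  # RGB frame to greyscale
    CroppedImage = GreyImage  # no cropping is applied at the moment
    ReducedImage = skimage.transform.resize(CroppedImage, (DCQL_Global.IMGHEIGHT, DCQL_Global.IMGWIDTH))
    # Stretch intensities to 0..255, then normalise to 0..1.
    ReducedImage = skimage.exposure.rescale_intensity(ReducedImage, out_range=(0, 255))
    ReducedImage = ReducedImage / 255.0
    return ReducedImage.astype(np.float64)  # np.float was removed from recent NumPy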
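The model input has IMGHISTORY = 4 channels, so consecutive processed frames have to be combined into one state. A minimal sketch of one way to keep that history, assuming a deque-based buffer (the `FrameHistory` class is hypothetical and not taken from my code):

```python
from collections import deque

IMGHISTORY = 4  # same value as in the constants above


class FrameHistory:
    """Hypothetical helper that keeps the last IMGHISTORY processed frames."""

    def __init__(self, size=IMGHISTORY):
        self.frames = deque(maxlen=size)

    def reset(self, first_frame):
        # At the start of an episode, fill the history with the first frame.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return tuple(self.frames)

    def push(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return tuple(self.frames)
```

When the frames are NumPy arrays returned by ProcessGameImage, `np.stack(state, axis=-1)` on the returned tuple would give an array of shape (IMGHEIGHT, IMGWIDTH, IMGHISTORY), which matches the input_shape of the first Conv2D layer.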
What am I doing wrong?