I'm using reinforcement learning to train a drone for autonomous navigation. The current setup handles camera-based navigation in a simulated environment with a CNN (AlexNet). To improve the navigation, I want to add two more inputs: the position/orientation of the drone, and the volume units it has already traveled through, represented as a 3D matrix. The purpose of the neural network is to predict the drone's next action.
I want to feed in those two new inputs after the convolutional layers, but I don't know how to integrate them. I intend to flatten the 3D matrix and concatenate it, along with the position/orientation, with the CNN features just before the fully connected layers, but I don't know if that's the right way to do it.
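To make this concrete, here is a minimal sketch of the fusion I have in mind, written with NumPy rather than a real deep learning framework. All sizes (a pose of 6 values, an 8x8x8 grid, 128 hidden units, 5 actions) and the helper names `init_dense`, `fuse`, and `forward` are illustrative assumptions, and random arrays stand in for the AlexNet convolutional output.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from my actual setup.
CNN_SHAPE = (256, 6, 6)   # AlexNet's last conv/pool output for a ~224x224 image
POSE_DIM = 6              # assumed: x, y, z, roll, pitch, yaw
GRID_SHAPE = (8, 8, 8)    # assumed size of the traveled-volume 3D matrix
HIDDEN = 128              # assumed hidden layer width
N_ACTIONS = 5             # assumed number of discrete actions


def init_dense(n_in, n_out):
    """Return (weights, bias) for one fully connected layer."""
    w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
    return w, np.zeros(n_out)


def fuse(cnn_features, pose, grid):
    """Flatten the CNN features and the 3D grid, then concatenate per sample."""
    batch = cnn_features.shape[0]
    return np.concatenate(
        [cnn_features.reshape(batch, -1), pose, grid.reshape(batch, -1)],
        axis=1,
    )


def forward(fused, params):
    """ReLU hidden layer followed by a linear layer with one score per action."""
    (w1, b1), (w2, b2) = params
    hidden = np.maximum(fused @ w1 + b1, 0.0)
    return hidden @ w2 + b2


fused_dim = int(np.prod(CNN_SHAPE)) + POSE_DIM + int(np.prod(GRID_SHAPE))
params = [init_dense(fused_dim, HIDDEN), init_dense(HIDDEN, N_ACTIONS)]

# Random stand-ins for a batch of 2 samples.
cnn_out = rng.normal(size=(2, *CNN_SHAPE))
pose = rng.normal(size=(2, POSE_DIM))
grid = rng.integers(0, 2, size=(2, *GRID_SHAPE)).astype(float)

fused = fuse(cnn_out, pose, grid)
scores = forward(fused, params)
next_action = scores.argmax(axis=1)  # index of the highest-scoring action
```

With these assumed sizes the fused vector has 9216 image features, 6 pose values, and 512 grid values, so part of my doubt is whether the few pose values carry enough weight next to the much larger image feature block.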
What kind of network architecture is best suited to integrating these two new inputs?