I have frames from a surveillance camera observing a large public hall from an angle of roughly 45° relative to the ceiling. Every person is annotated, in each frame they appear in, with screen coordinates (1920 × 1080 px), so I have (x, y) pixel coordinates for each person in each frame.
Now I need to transform these coordinates into real-world coordinates, since there is naturally a distortion between sections close to the camera and sections further away from it. In the end I want coordinates in which constant real-world movement produces a constant rate of coordinate change, no matter where in the hall it occurs. (I measure the velocity of each person.)
I don't know the height of the camera or its exact angle, but it is sufficient to estimate both.
Thanks for any help.
EDIT: Basically this is a mathematical question. I don't have any code that deals with this part of the problem yet. The sole purpose is to improve the accuracy of my program by replacing the current, perspective-distorted coordinates with more accurate ones.
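EDIT 2: To make the question concrete, here is a sketch of the direction I'm considering: since everyone moves on the (planar) hall floor, the mapping from image pixels to floor coordinates should be a planar homography, which can be estimated from four floor points with known real-world positions instead of from the camera height/angle. All point values below are made-up placeholders, not real measurements.

```python
import numpy as np

# Hypothetical calibration data: four marks on the hall floor whose
# real-world positions (in metres) were measured, plus their pixel
# coordinates in the 1920x1080 frame. Values are placeholders.
pixel_pts = np.array([[ 410.0,  980.0],   # near-left floor mark
                      [1510.0,  975.0],   # near-right
                      [ 720.0,  310.0],   # far-left
                      [1190.0,  305.0]])  # far-right
world_pts = np.array([[0.0,  0.0],
                      [8.0,  0.0],
                      [0.0, 20.0],
                      [8.0, 20.0]])

def fit_homography(src, dst):
    """Estimate the 3x3 homography H with dst ~ H @ src (DLT, >= 4 points)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the
        # nine entries of H (up to scale).
        A.append([-x, -y, -1,  0,  0,  0, u * x, u * y, u])
        A.append([ 0,  0,  0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector for the smallest
    # singular value, i.e. the (approximate) null space of A.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def to_world(H, px):
    """Map a pixel coordinate to ground-plane coordinates (metres)."""
    q = H @ np.array([px[0], px[1], 1.0])
    return q[:2] / q[2]      # divide out the projective scale

H = fit_homography(pixel_pts, world_pts)
```

With the transformed coordinates, velocity would then be the frame-to-frame distance in metres rather than in (distorted) pixels. Does this approach make sense, or is a full camera calibration needed?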