I have a quadrotor that flies around and knows its x, y, z position and its angular displacement about the x, y, and z axes. It captures a constant stream of images, which are converted into depth maps (i.e. we can estimate the distance from the camera to each pixel).
How can one program an algorithm which converts this information into a 3D model of the environment? That is, how can we generate a virtual 3D map from this information?
Example: below is a picture that illustrates what the quadrotor captures (top) and what that image is converted into before being fed into a 3D mapping algorithm (bottom).
Let's suppose this image was taken from a camera with x, y, z coordinates (10, 5, 1) in some units and an angular displacement of 90, 0, 0 degrees about the x, y, z axes. What I want to do is take a bunch of these (depth image, pose) tuples and convert them into a single 3D map of the area.
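For concreteness, here is a minimal sketch (Python/NumPy) of how a single (depth image, pose) tuple might be back-projected into world-frame 3D points. The pinhole intrinsics fx, fy, cx, cy are placeholder values that would have to come from calibrating the real camera, the Euler-angle convention is an assumption, and I'm ignoring any fixed offset between the camera and the quadrotor body:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def depth_to_world_points(depth, pose_xyz, pose_rpy_deg,
                          fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Back-project a depth map (H x W, in metres) into world-frame points.

    fx, fy, cx, cy are hypothetical pinhole intrinsics; they must come from
    calibrating the actual camera. pose_xyz is the camera position and
    pose_rpy_deg its rotation about the x, y, z axes in degrees (the exact
    Euler convention depends on how the quadrotor reports its attitude).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates

    # Pinhole model: a ray through each pixel, scaled by that pixel's depth.
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts_cam = np.stack([x, y, z], axis=-1).reshape(-1, 3)

    # Drop pixels with no valid depth reading.
    pts_cam = pts_cam[pts_cam[:, 2] > 0]

    # Rotate by the quadrotor's attitude, then translate by its position.
    R = Rotation.from_euler('xyz', pose_rpy_deg, degrees=True).as_matrix()
    return pts_cam @ R.T + np.asarray(pose_xyz)
```

For the example above, `depth_to_world_points(depth, (10, 5, 1), (90, 0, 0))` would give that frame's obstruction points in world coordinates.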
Edit 1 on 7/30: One obvious solution is to use the angle of the quadrotor with respect to the x, y, and z axes, together with the distance map, to figure out the Cartesian coordinates of any obstructions via trigonometry. I figure I could combine this approach with a probabilistic method to build a crude 3D map, possibly vectorizing it to make it faster.
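A rough sketch of the probabilistic accumulation I have in mind: each world-frame point votes for the voxel it falls into via a log-odds update. The voxel size and hit probability here are made-up numbers, and this ignores free-space updates along the camera rays:

```python
import numpy as np

def update_occupancy(log_odds, points, origin, voxel_size=0.25, p_hit=0.85):
    """Accumulate world-frame points into a coarse voxel occupancy grid.

    log_odds : 3-D array of per-voxel log-odds, modified in place.
    origin   : world coordinates of the grid's (0, 0, 0) corner.
    voxel_size and p_hit are illustrative guesses, not tuned values.
    """
    # Map each point to the index of the voxel it falls into.
    idx = np.floor((points - origin) / voxel_size).astype(int)

    # Keep only points that land inside the grid.
    inside = np.all((idx >= 0) & (idx < log_odds.shape), axis=1)
    idx = idx[inside]

    # Log-odds update for an "occupied" observation; np.add.at handles
    # multiple hits landing in the same voxel.
    np.add.at(log_odds, (idx[:, 0], idx[:, 1], idx[:, 2]),
              np.log(p_hit / (1.0 - p_hit)))
    return log_odds
```

After processing every image, voxels with log-odds greater than zero would be treated as occupied, giving the crude 3D map.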
However, I would like to know whether there is any fundamentally different, and hopefully faster, approach to solving this.