3d model construction using multiple images from multiple points (kinect)

Question

Is it possible to construct a 3D model of a still object if images and depth data are gathered from various angles? What I was thinking of is a sort of circular conveyor belt with a Kinect mounted on it, while the real object to be reconstructed in 3D space sits in the middle. The conveyor belt then rotates around the object in a circle and lots of images are captured (perhaps 10 images per second), which would allow the Kinect to capture an image from every angle, including the depth data. Theoretically this is possible. The model would also have to be recreated with its textures.

What I would like to know is: whether there are any similar projects/software already available (any links would be appreciated); whether this is possible within perhaps 6 months; and how I would proceed to do this, such as any similar algorithms you could point me to.

Thanks, MilindaD

Go for broke, use the Kinect's video camera to create the textures! — Coeffect, Jul 05 '11 at 13:54

score 5 · Answer 1 · edited Apr 09 '13 at 13:24

It is definitely possible, and there are a lot of working 3D scanners out there that rely on more or less the same principle of stereoscopy.

You probably know this, but just for context: the idea is to get two images of the same point from different viewpoints and to use triangulation to compute the 3D coordinates of that point in your scene. Although the triangulation itself is quite easy, the big issue is finding the correspondence between the points in your two images, and this is where you need good software to extract and recognize similar points.
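To make the triangulation step concrete, here is a minimal sketch for the simplest case: a rectified stereo pair, where a matched point lies on the same image row in both views. The function name, the focal length, the baseline and the principal point are all illustrative assumptions, not values from any real scanner.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Return (X, Y, Z) in metres for a point matched in a rectified stereo pair.

    x_left / x_right are the pixel columns of the same point in each image,
    y is the shared pixel row. All camera parameters are assumed values.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    # Depth is inversely proportional to disparity: Z = f * B / d
    z = focal_px * baseline_m / disparity
    # Back-project the left-image pixel to 3D using the pinhole model
    x = (x_left - cx) * z / focal_px
    y3 = (y - cy) * z / focal_px
    return (x, y3, z)


# Hypothetical match: 20 px of disparity with a 10 cm baseline
point = triangulate_rectified(
    x_left=340.0, x_right=320.0, y=240.0,
    focal_px=500.0, baseline_m=0.1, cx=320.0, cy=240.0,
)
print(point)
```

Note that the arithmetic is trivial once `x_left` and `x_right` are known; finding that pair of matching pixels is the hard part mentioned above.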

There is an open-source project called MeshLab for 3D vision, which includes 3D reconstruction* algorithms. I don't know the details of the algorithms, but the software is definitely a good entry point if you want to play with 3D.

I used to know some other ones; I will try to find them and add them here:

Insight3d

(*Wiki page has no content, redirects to login for editing)

edited Apr 09 '13 at 13:24

handle

answered Jul 04 '11 at 10:45

drolex

Because MilindaD is using the Kinect, the triangulation issue is already solved, and depth is already available for each pixel. The trick is to match these 3D images from different angles with each other to get a more complete picture of the 3D shape of the object. Nevertheless, this is doable. – jilles de wit Jul 04 '11 at 12:13
OK, I am not very familiar with the Kinect system. If the issue is merging the different 3D parts, then again, take a look at MeshLab, where this is already implemented (look for merging). The merging of textures is always a bit difficult because of the changes in the lighting conditions, but if you manage to control them, the result should be pretty good. Otherwise a lot of professional software will do this part perfectly (Rapidform, Geomagic, PolyWorks), but they cost a lot. Really. MeshLab is in my opinion a good free alternative. – drolex Jul 04 '11 at 12:25
Yes, MeshLab should be able to do the merging of textures. I was just commenting that the first (difficult) step is already done by the Kinect. – jilles de wit Jul 04 '11 at 14:03
In 2D, when we need to stitch images, keywords like grafting or matching pop up. Check those out. Also, Microsoft's Photosynth, for example, is a good one to look at! – brainydexter Jul 05 '11 at 08:14
@brainydexter Didn't know about the term "grafting", interesting. Would you have a precise definition for it, somewhere? I have been working quite a while in image processing and I still fail to understand how the terminology can be so fuzzy... – drolex Jul 05 '11 at 11:12
@drolex: I took a course in computational photography and I happen to remember the term from there. IIRC, "grafting" refers to the notion of finding the "matching points" between two images and "registering" the images. I'll try to find a relevant link and post it here later. – brainydexter Jul 05 '11 at 18:39
Because the Kinect uses an IR light/sensor, lighting changes shouldn't cause problems when using the depth data, which is what I assume would be used for the merging. – Coeffect Jul 06 '11 at 19:53
Yeah, the texture is usually not used for the merging anyway (if you have enough 3D information it's useless), but the lighting will affect the final model with the textures merged. – drolex Jul 07 '11 at 08:28
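For the merging step discussed in these comments, a rough sketch of the idea under strong assumptions: the camera orbits the object at a known radius and known angle (as in the turntable setup from the question), always looking straight at the rotation axis. Each depth pixel is back-projected with a pinhole model and then rotated into one shared frame. The intrinsics (`fx`, `fy`, `cx`, `cy`), the radius and the function names are placeholders, not real Kinect calibration values. In practice the angles are never known exactly, and a registration algorithm such as ICP (iterative closest point) is commonly used to refine the alignment.

```python
import math


def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of one depth pixel to camera coordinates (metres)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def to_object_frame(point_cam, angle_rad, radius_m):
    """Map a camera-frame point into a fixed frame centred on the rotation axis.

    Assumes the camera is radius_m away from the axis, looks straight at it
    along its +z direction, and has orbited by angle_rad about the vertical
    (y) axis.
    """
    x, y, z = point_cam
    zc = z - radius_m  # move the origin from the camera to the rotation axis
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    # Rotation about the y axis by angle_rad
    return (c * x + s * zc, y, -s * x + c * zc)


# Assumed intrinsics and orbit radius, for illustration only
FX = FY = 500.0
CX, CY = 320.0, 240.0
RADIUS_M = 1.0

# The same surface point seen from 0 degrees and from 90 degrees
seen_front = backproject(370, 240, 1.0, FX, FY, CX, CY)
seen_side = backproject(320, 240, 1.1, FX, FY, CX, CY)

merged = [
    to_object_frame(seen_front, 0.0, RADIUS_M),
    to_object_frame(seen_side, math.pi / 2, RADIUS_M),
]
print(merged)
```

Both captures land on (approximately) the same 3D point in the shared frame, which is what lets partial scans from different angles add up to a complete shape.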

score 3 · Answer 2 · answered Jul 13 '11 at 04:30

Check out https://bitbucket.org/tobin/kinect-point-cloud-demo/overview which is a code sample for the Kinect for Windows SDK that does specifically this. Currently it uses the bitmaps captured by the depth sensor and iterates through the byte array to create a point cloud in PLY format that can be read by MeshLab. The next stage for us is to apply/refine a Delaunay triangulation algorithm to form a mesh instead of points, to which a texture can be applied. A third stage would then be a mesh-merging formula to combine multiple captures from the Kinect to form a full 3D object mesh.

This is based on some work I did in June using the Kinect for 3D printing capture.

The .NET code in this source code repository will, however, get you started with what you want to achieve.
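As an illustration of the depth-to-PLY step described in this answer, here is a small Python sketch (not the repository's .NET code). It assumes the depth frame is a 2D grid of millimetre values where 0 means "no reading", and it uses made-up pinhole intrinsics; both are assumptions for the example. The output is the plain ASCII flavour of PLY with only x, y, z vertex properties.

```python
def depth_to_ply(depth_mm, fx, fy, cx, cy):
    """Convert a 2D grid of depth values to ASCII PLY text.

    depth_mm: list of rows of depth values in millimetres (0 = no reading,
    an assumed convention). fx, fy, cx, cy are assumed pinhole intrinsics.
    """
    vertices = []
    for v, row in enumerate(depth_mm):
        for u, d in enumerate(row):
            if d <= 0:
                continue  # skip pixels without a depth reading
            z = d / 1000.0
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            vertices.append((x, y, z))

    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(vertices)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = [f"{x:.4f} {y:.4f} {z:.4f}" for x, y, z in vertices]
    return "\n".join(header + body) + "\n"


# Tiny 2x2 "depth frame" with one missing reading
frame = [[0, 800], [810, 805]]
ply_text = depth_to_ply(frame, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
print(ply_text)
```

Writing `ply_text` to a file with a `.ply` extension should give something a mesh viewer can open as a point cloud.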

score 2 · Answer 3 · answered Sep 22 '11 at 19:26

Autodesk has a piece of software that will do what you are asking for; it is called "Photofly". It is currently in the Labs section. Using a series of images taken from multiple angles, the 3D geometry is created and then photo-mapped with your images to create the scene.

score 2 · Answer 4 · answered Nov 07 '11 at 12:24

If you are more interested in the theoretical part of this problem (I mean, if you want to know how it works), here is a document from Microsoft Research about a moving depth camera and 3D reconstruction.

answered Nov 07 '11 at 12:24

Zakus

score 0 · Answer 5 · answered Nov 10 '13 at 17:59

Try out VisualSfM (http://ccwu.me/vsfm/) by Changchang Wu (http://ccwu.me/)

It takes multiple images from different angles of the scene and outputs a 3D point cloud.

The algorithm is called "Structure from Motion". The brief idea: it involves extracting feature points in each image, finding correspondences between them across images, building feature tracks, and estimating the camera matrices and thereby the 3D coordinates of the feature points.
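To give a feel for just the "finding correspondences" step, here is a toy sketch of nearest-neighbour descriptor matching with a ratio test, a common way to reject ambiguous matches. This is not taken from VisualSfM; the function name, the 0.75 threshold and the 2-D descriptors are illustrative assumptions (real feature descriptors have many more dimensions).

```python
import math


def match_features(desc_a, desc_b, ratio=0.75):
    """Return (i, j) pairs where desc_a[i] matches desc_b[j].

    A match is kept only if the nearest descriptor in desc_b is clearly
    closer than the second nearest (nearest-neighbour ratio test).
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every descriptor in the other image
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Toy descriptors for two images
image_a = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
image_b = [(1.0, 0.1), (0.1, 1.0), (5.0, 5.0)]

matches = match_features(image_a, image_b)
print(matches)
```

The third descriptor in `image_a` is equally close to two candidates in `image_b`, so the ratio test drops it instead of guessing. The later steps (feature tracks, camera estimation, triangulation) build on the surviving matches.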
