Image recognition and 3d rendering

Question

How hard would it be to take an image of an object (in this case a predefined object) and develop an algorithm to cut just that object out of a photo with a background of varying complexity?

Further to this, a photo's object (say a house, car, dog - but always of one type) would need to be transformed into a 3d render. I know there are 3d rendering engines available (at a cost, free, or with some clause), but for this to work the object (subject) would need to be measured in all sorts of ways - e.g. if this is a person, we need to measure height, the curvature of the shoulder, radius of the face, length of each finger, etc.

How feasible would it be to solve this problem? Does anyone know any good links specializing in this research area? I've seen open source solutions to this problem, which leaves me wondering how easy it is to measure the object while tracing around it to crop it out.

Thanks

Essentially I want to take a 2d image (typical image:which is easier than a complex photo containing multiple objects, etc.)

But effectively I want to turn that into a 3d image, so wouldn't what I want to do involve building a 3d rendering/modelling engine?

Furthermore, that link I have provided goes into 3ds max, with a few properties set, and a render is made.

It sounds like you may have additional information/constraints that you are not sharing. First of all, to do reconstruction you need at least two images. Second, the reconstruction itself tells you nothing about scale. To determine the size of something in an image you have to have a reference. – Carlos Rendon, Jan 09 '09 at 23:54
Not at all. I just don't know anything about this field (being a general web developer first and foremost). This is why I'm getting opinions here and researching the problem before I write a single line of code. – GurdeepS, Jan 10 '09 at 00:27
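
To make the point about needing a reference concrete, here is a minimal sketch of scaling by a known object. It assumes the reference and the subject are at roughly the same distance from the camera and that lens distortion is negligible. The function name and all numbers are made up for illustration.

```python
def real_size_from_reference(object_px, reference_px, reference_real):
    """Convert a pixel measurement to real units using a reference of known size.

    Only valid if the object and the reference are at about the same distance
    from the camera (an assumption of this sketch).
    """
    if reference_px <= 0:
        raise ValueError("reference must span at least one pixel")
    return object_px * reference_real / reference_px

# Hypothetical numbers: a 30 cm ruler spans 150 px, the person spans 900 px.
height_cm = real_size_from_reference(object_px=900, reference_px=150, reference_real=30.0)
```

Without the ruler (or some other known length) the 900 px measurement alone says nothing about the real height.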

score 4 · Answer 1 · answered Jan 09 '09 at 21:21

It sounds like you want to do several things, all in the domain of computer vision.

1. Object Recognition (i.e. find the predefined object)
2. 3D Reconstruction (make the 3d model from the image)
3. Image Segmentation (cut out just the object you are worried about from the background)

I've ranked them in order from easiest to hardest (according to my limited understanding). Altogether I would say it is a very complicated problem. I would look at the following Wikipedia links for more information:

Computer Vision Overview (Wikipedia)

The Eight Point Algorithm (for 3d reconstruction)

Image Segmentation
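
As a rough illustration of what the eight-point algorithm works with: it estimates a 3x3 matrix from at least eight matching point pairs in two images, and that matrix must satisfy the epipolar constraint x2^T E x1 = 0 for every true match. The sketch below does not implement the estimation itself. It only checks the constraint for an assumed camera setup (identity intrinsics, no rotation, second camera frame shifted by t = (1, 0, 0)). The helper names and the scene point are hypothetical.

```python
def project(point):
    """Pinhole projection with identity intrinsics: (x, y, z) -> (x/z, y/z, 1)."""
    x, y, z = point
    return (x / z, y / z, 1.0)

def epipolar_residual(E, x1, x2):
    """Evaluate x2^T E x1; this is zero (up to rounding) for a true correspondence."""
    Ex1 = [sum(E[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Ex1[i] for i in range(3))

# Assumed setup: coordinates in camera 2 are X2 = X1 + t with t = (1, 0, 0)
# and no rotation, so E is the cross-product matrix of t.
E = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

scene_point = (0.5, 2.0, 4.0)            # hypothetical 3D point seen by camera 1
x1 = project(scene_point)                # its image in camera 1
x2 = project((scene_point[0] + 1.0, scene_point[1], scene_point[2]))  # in camera 2

good = epipolar_residual(E, x1, x2)              # correct match
bad = epipolar_residual(E, x1, (0.375, 0.9, 1.0))  # deliberately wrong match
```

A correct match gives a residual of zero, a wrong one does not. The eight-point algorithm runs this logic in reverse: given the matches, it solves for the matrix.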

Carlos Rendon


Image segmentation is not difficult, if you can assume what you're segmenting obeys certain rules. The example I remember from school was grains of rice being photographed as they proceeded along a conveyor belt. Through segmentation, their size and quality could be measured. – Charlie Salts Jan 11 '09 at 18:36

I agree that segmentation can be simple in a highly constrained environment such as a factory with very homogeneous objects. But in general the problem of segmenting objects is difficult. – Carlos Rendon Jan 11 '09 at 22:14
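
For the constrained case described in these comments (bright grains on a dark belt), a minimal sketch might be a fixed threshold followed by connected-component labelling. This is only an illustration of why the constrained case is tractable, not a general segmentation method. The frame, the threshold and the function name are made up.

```python
from collections import deque

def segment_bright_objects(image, threshold):
    """Label 4-connected regions of pixels brighter than threshold.

    Returns a list of regions, each a list of (row, col) pixel coordinates.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

# Hypothetical 5x7 grayscale frame: two bright "grains" on a dark belt.
frame = [
    [10, 10, 10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10, 10, 10],
    [10, 205, 10, 10, 190, 195, 10],
    [10, 10, 10, 10, 198, 192, 10],
    [10, 10, 10, 10, 10, 10, 10],
]
grains = segment_bright_objects(frame, threshold=100)
areas = [len(g) for g in grains]  # pixel area per grain, a crude size measure
```

This works only because the background is uniform and darker than every object. With a cluttered background no single threshold separates the subject, which is where the general problem gets hard.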

score 1 · Answer 2 · answered Jan 09 '09 at 22:08

You're right, this is an extremely hard set of problems, particularly that of inferring 3D information from a 2D image. Only a very limited understanding exists of how our visual system extrapolates 3D information from 2D images. One such approach is known as "Shape from Shading", and the linked google search shows how much (and consequently how little) we know.

Rob

score 1 · Answer 3 · answered Jan 11 '09 at 14:15

This is a very difficult task. The hardest part is not recognising or segmenting the object from the image, but rather inferring the 3-D geometry of the object from the 2-D image. You will have more success if you can use a stereoscopic camera (or a laser scanner, if you have access to one ;).
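
To show why a stereoscopic camera helps, here is a minimal sketch of the usual depth-from-disparity relation for a rectified stereo pair, Z = f * B / d. It assumes two identical, parallel cameras and a match that has already been found. The focal length, baseline and disparity are made-up values.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a point from a rectified stereo pair: Z = f * B / d.

    focal_length_px: focal length in pixels
    baseline_m:      distance between the two camera centres in metres
    disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, 12 cm baseline, 42 px disparity.
depth_m = depth_from_disparity(focal_length_px=700.0, baseline_m=0.12, disparity_px=42.0)
# about 2 metres for these made-up numbers
```

The known baseline is what supplies the scale that a single photo lacks.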

For the case of 2-D images, try googling for "shape-from-shading". This is a method for inferring 3-D shape from a 2-D image. It does make assumptions about illumination conditions and surface properties (BRDF and geometry) that may fail in many cases, but if you are using it for only a predefined class of objects (e.g. human faces) it can work reasonably well.
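
As a sketch of the kind of assumption involved, the simplest illumination model used in this area is the Lambertian one, where brightness is albedo * max(0, n . l) for unit surface normal n and unit light direction l. The code below is only that forward model, not a shape-from-shading solver. The light direction and the normals are made up for illustration.

```python
import math

def lambertian_intensity(normal, light_dir, albedo=1.0):
    """Brightness of a Lambertian surface patch: albedo * max(0, n . l)."""
    def unit(v):
        length = math.sqrt(sum(x * x for x in v))
        return tuple(x / length for x in v)
    n, l = unit(normal), unit(light_dir)
    return albedo * max(0.0, sum(a * b for a, b in zip(n, l)))

light = (0.0, 0.0, 1.0)  # assumed: light shining along the viewing axis

facing = lambertian_intensity((0.0, 0.0, 1.0), light)        # patch facing the light
tilted_left = lambertian_intensity((-1.0, 0.0, 1.0), light)  # tilted 45 degrees one way
tilted_right = lambertian_intensity((1.0, 0.0, 1.0), light)  # tilted 45 degrees the other way
```

The two tilted patches come out equally bright, so one pixel value cannot say which way the surface leans. That ambiguity is why the method needs the extra assumptions, and why restricting it to a known class of objects helps.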

score 0 · Answer 4 · answered Jan 09 '09 at 21:26

Assuming it's possible, that would be extremely difficult, especially with only one image of the object. The reconstruction algorithm has to guess at the depth and distances of objects.

What you describe sounds very similar to Microsoft PhotoSynth.

tsilb

