
A project I've been working on for the past few months involves calculating the top area of an object captured with a 3D depth camera from a top view.

The workflow of my project:

  1. Capture an image of a group of objects (RGB and depth data) from the top view.

  2. Run instance segmentation on the RGB image.

  3. Calculate the real-world area of each segmented mask using the depth data (see the sketch below).
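
For reference, step 3 can be approximated like this. This is a minimal sketch assuming a pinhole camera model; the intrinsics in the usage line are hypothetical placeholders, not values from my setup:

```python
import numpy as np

def mask_top_area(depth_m, mask, fx, fy):
    """Approximate real-world area (in square metres) covered by a mask.

    depth_m : (H, W) float array, per-pixel depth in metres
    mask    : (H, W) bool array, instance segmentation mask
    fx, fy  : focal lengths in pixels (pinhole model; hypothetical values)
    """
    z = depth_m[mask]
    # A pixel at depth z covers roughly (z / fx) x (z / fy) metres on a surface
    # facing the camera, so sum that footprint over every masked pixel.
    return float(np.sum((z / fx) * (z / fy)))

# Hypothetical usage with made-up intrinsics:
# area_m2 = mask_top_area(depth, instance_mask, fx=910.0, fy=910.0)
```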

Some problems with the project:

  • All of the objects have different shapes.
  • As an object moves toward the edge of the image, its side (not just its top) starts to become visible.
  • Because of this, the segmented mask area gradually increases.
  • As a result, the computed area of an object near the edge of the image comes out larger than that of an object located in the center.

In the example image, object 1 is near the center of the field of view, so only its top is visible, but object 2 is near the edge of the field of view, so part of its top is hidden and its side becomes visible.

Because of this, the segmented mask area is larger for objects on the periphery than for objects in the center.

I only want to find the area of the top of each object.

Example of what I want: fig 2

Is there a way to geometrically correct the area of an object located near the edge of the image?

I tried to calibrate by multiplying the calculated area by a correction factor based on the angle between vector 1 (from the center of the camera lens to the center of the floor) and vector 2 (from the center of the lens to the center of gravity of the target object). However, I gave up because I couldn't logically justify how much correction was needed.

fig 3

윤도현

3 Answers


What I would do is convert your RGB and depth images into a 3D mesh (a surface with bumps) using your camera settings (FOV, focal length), something like this:
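
A minimal NumPy sketch of that depth-to-3D step, assuming a standard pinhole model; the intrinsics fx, fy, cx, cy are placeholders you would take from your camera:

```python
import numpy as np

def depth_to_points(depth_m, fx, fy, cx, cy):
    """Back-project a depth image into 3D points in the camera frame.

    depth_m        : (H, W) depth in metres
    fx, fy, cx, cy : pinhole intrinsics in pixels (hypothetical values)
    Returns an (N, 3) array of XYZ points; zero-depth pixels are dropped.
    """
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    pts = np.stack([x, y, depth_m], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]
```

A true mesh would additionally connect neighbouring pixels into triangles, but for the projection described next the bare point cloud is already enough.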

and then project it onto the ground plane (perpendicular to the camera view direction at the middle of the screen). To obtain the ground plane, simply take three 3D positions on the ground, p0,p1,p2 (forming a triangle), and use the cross product to compute the ground normal:

n = normalize(cross(p1-p0,p2-p1))

Now your plane is defined by p0,n, so convert each 3D coordinate like this:


by moving each 3D point along the plane normal by its signed distance to the plane; if I see it right, something like this:

p' = p - n * dot(p-p0,n)

That should eliminate the problem with visible sides at the edges of the FOV. However, you should also take into account that when the side is shown, part of the top is also hidden. To remedy that, you might find the axis of symmetry, use just the half of the top that is not partially hidden, and multiply the measured half area by 2.
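
For illustration only (not code from the answer), here is a NumPy sketch of the plane construction and projection above, plus a simple grid-based footprint area estimate; the 5 mm cell size is an arbitrary assumption:

```python
import numpy as np

def ground_plane(p0, p1, p2):
    """Plane from three ground points: returns (point on plane, unit normal)."""
    n = np.cross(p1 - p0, p2 - p1)
    return p0, n / np.linalg.norm(n)

def project_to_plane(points, p0, n):
    """Orthogonal projection of (N, 3) points onto the plane (p0, n)."""
    d = (points - p0) @ n              # signed distance of each point to the plane
    return points - d[:, None] * n

def footprint_area(points_on_plane, n, cell=0.005):
    """Approximate footprint area (m^2) by rasterising the projected points
    onto a grid of `cell`-sized squares inside the plane and counting cells."""
    seed = np.array([1.0, 0.0, 0.0])
    if abs(n @ seed) > 0.9:            # pick a seed vector not parallel to n
        seed = np.array([0.0, 1.0, 0.0])
    u = np.cross(n, seed); u /= np.linalg.norm(u)
    v = np.cross(n, u)
    uv = np.stack([points_on_plane @ u, points_on_plane @ v], axis=1)
    occupied = np.unique(np.floor(uv / cell).astype(int), axis=0)
    return len(occupied) * cell * cell
```

Feeding in only the 3D points that fall inside one object's segmentation mask gives that object's top-view footprint area on the ground plane.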

Spektre
  • First of all, thank you so much for your nice answer. However, I have a question about it. I acquired the image after installing the camera on the ceiling, 2800 mm above the ground, in the top-view direction (perfectly perpendicular to the ground). So, is it correct that the plane representing the ground is simply z = 2800? If so, if I project the chicken onto an imaginary z = 2500 plane parallel to the ground, like fig 3 that I just added to the question, I would expect the side to be projected onto the plane as it is. **How do you define the plane that represents the ground?** – 윤도현 Jan 27 '23 at 02:51
  • @윤도현 You take 3 3D positions of the ground `p0,p1,p2` (forming a triangle) and use the cross product to compute the normal `n = normalize(cross(p1-p0,p2-p1))`; now your plane is defined by `p0,n` – Spektre Jan 27 '23 at 08:27
  • Since chickens don't have flat tops, the camera might not be able to see the "far corner". This method fails with spherical chickens, for example. – Matt Timmermans Jan 27 '23 at 13:44
  • @Spektre As you said, after creating a 3D mesh and projecting it onto z = 0 (the ground), the desired shape was obtained. Thank you very much!! – 윤도현 Jan 29 '23 at 07:35

Accurate computation is virtually hopeless, because you don't see all sides.

Assuming your depth information is available as a range image, you can take the points inside the segmentation mask of a single chicken, estimate the vertical direction, then rotate and project the points to obtain the silhouette.

But as a part of the surface is occluded, you may have to reconstruct it using symmetry.
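
A minimal sketch of that rotate-and-project idea (my own illustration, not the answer's code), assuming you already have the 3D points of one chicken plus some surrounding floor points:

```python
import numpy as np

def estimate_vertical(ground_points):
    """Least-squares plane normal of nearby floor points, used as the vertical."""
    centred = ground_points - ground_points.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[-1]

def silhouette_xy(object_points, vertical):
    """Rotate points so `vertical` becomes +Z, then drop Z to get the silhouette."""
    z = vertical / np.linalg.norm(vertical)
    seed = np.array([1.0, 0.0, 0.0]) if abs(z[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    x = np.cross(seed, z); x /= np.linalg.norm(x)
    y = np.cross(z, x)
    R = np.stack([x, y, z])            # rows form an orthonormal basis
    return (object_points @ R.T)[:, :2]
```

The area of the resulting 2D silhouette can then be estimated with a convex hull or an occupancy grid, subject to the occlusion caveat above.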


There is no way to do this accurately for arbitrary objects, since there can be parts of the object that contribute to the "top area", but which the camera cannot see. Since the camera cannot see these parts, you can't tell how big they are.

Since all your objects are known to be chickens, though, you could get a pretty accurate estimate like this:

  1. Use Principal Component Analysis to determine the orientation of each chicken.
  2. Using many objects in many images, find a best-fit polynomial that estimates apparent chicken size as a function of distance from the image center and orientation relative to the distance vector.
  3. For any given chicken, then, you can divide its apparent size by the estimated average apparent size for its distance and orientation, to get a normalized chicken size measurement.
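
A hedged NumPy sketch of what those three steps could look like; the polynomial degree and the cos/sin angle features are my own modelling assumptions, not part of the answer:

```python
import numpy as np

def chicken_orientation(mask_xy):
    """Step 1: principal axis of a chicken's mask pixels via PCA (angle in radians)."""
    centred = mask_xy - mask_xy.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centred.T))
    major = eigvecs[:, np.argmax(eigvals)]     # direction of largest variance
    return np.arctan2(major[1], major[0])

def fit_apparent_size(dist, rel_angle, area, deg=3):
    """Step 2: least-squares fit of apparent area against distance from the image
    centre and orientation relative to the distance vector. Returns a predictor."""
    def features(d, a):
        # cos/sin of 2a because the body axis is symmetric under a 180-degree flip
        return np.stack([d ** k for k in range(deg + 1)]
                        + [np.cos(2 * a), np.sin(2 * a)], axis=-1)
    coeffs, *_ = np.linalg.lstsq(features(dist, rel_angle), area, rcond=None)
    return lambda d, a: features(d, a) @ coeffs

# Step 3 (hypothetical usage): normalise a measurement by the model's prediction.
# predict = fit_apparent_size(all_dists, all_angles, all_areas)
# normalised_size = measured_area / predict(this_dist, this_angle)
```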
Matt Timmermans