Perspective Projection in Android in an augmented reality application

Question

I'm currently writing an augmented reality app and I'm having trouble getting the objects onto my screen. It's very frustrating that I'm not able to transform GPS points into the corresponding screen points on my Android device. I've read many articles and many other posts on Stack Overflow (I've already asked similar questions), but I still need your help.

I implemented the perspective projection as explained on Wikipedia.

What do I have to do with the result of the perspective projection to get the resulting screen point?

score 16 · Accepted Answer · edited Feb 08 '17 at 14:41

The Wikipedia article also confused me when I read it some time ago. Here is my attempt to explain it differently:

The Situation

Let's simplify the situation. We have:

- The point to project, D(x,y,z) - what you call relativePositionX|Y|Z
- An image plane of size w * h
- A half-angle of view α

... and we want:

- The coordinates of B, the projection of D onto the image plane (let's call them X and Y)

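A schema for the X screen coordinate:

[Figure: view from above showing the eye E, the point D, its projection B on the image plane, and the points M and C on the viewing axis, at distances f and z from E.]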

E is the position of our "eye" in this configuration, which I chose as origin to simplify.

The focal length f can be estimated knowing that:

tan(α) = (w/2) / f (1)

A bit of Geometry

You can see on the picture that the triangles ECD and EBM are similar, so using the Side-Splitter Theorem, we get:

MB / CD = EM / EC <=> X / x = f / z (2)

With both (1) and (2), we now have:

X = (x / z) * ( (w / 2) / tan(α) )

If we go back to the notation used in the Wikipedia article, our equation is equivalent to:

b_x = (d_x / d_z) * r_z

You can notice we are missing the multiplication by s_x / r_x. This is because in our case, the "display size" and the "recording surface" are the same, so s_x / r_x = 1.

Note: Same reasoning for Y.
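
Equations (1) and (2) can be sketched in a few lines of Java. This is only a minimal illustration: the class and method names are made up for this example, α is assumed to be given in degrees, and the point is assumed to lie in front of the eye (z > 0).

```java
/** Hypothetical helper; the names are illustrative and not from the original post. */
final class CenteredProjection {

    /** Focal length in pixels, from equation (1): tan(α) = (size / 2) / f. */
    static double focalLength(double sizePx, double halfAngleDeg) {
        return (sizePx / 2.0) / Math.tan(Math.toRadians(halfAngleDeg));
    }

    /**
     * Projects D(x, y, z) onto the image plane using equation (2).
     * Returns {X, Y}, measured from the center of the image plane.
     */
    static double[] project(double x, double y, double z,
                            double w, double h, double halfAngleDeg) {
        if (z <= 0) {
            throw new IllegalArgumentException("point must be in front of the eye (z > 0)");
        }
        double fx = focalLength(w, halfAngleDeg); // focal length used for X
        double fy = focalLength(h, halfAngleDeg); // same reasoning for Y
        return new double[] { (x / z) * fx, (y / z) * fy };
    }
}
```

With w = 800, h = 600 and α = 45deg, the point (1, 2, 4) ends up at roughly X = 100, Y = 150, relative to the center of the image plane.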

Practical Use

Some remarks:

- Usually, α = 45deg is used, which means tan(α) = 1. That's why this term doesn't appear in many implementations.
- If you want to preserve the aspect ratio of the elements you display, keep f the same for both X and Y, i.e. instead of calculating:
  X = (x / z) * ( (w / 2) / tan(α) ) and Y = (y / z) * ( (h / 2) / tan(α) )
  ... do:
  X = (x / z) * ( (min(w,h) / 2) / tan(α) ) and Y = (y / z) * ( (min(w,h) / 2) / tan(α) )
  Note: when I said that "the "display size" and the "recording surface" are the same", that wasn't quite true. The min operation is here to compensate for this approximation, adapting the square surface r to the potentially rectangular surface s.
  Note 2: Instead of using min(w,h) / 2, Appunta uses screenRatio= (getWidth()+getHeight())/2 as you noticed. Both solutions preserve the aspect ratio of the elements. The focal length, and thus the angle of view, will simply be a bit different, depending on the screen's own ratio. You can actually use any function you want to define f.
- As you may have noticed on the picture above, the screen coordinates are defined here between [-w/2 ; w/2] for X and [-h/2 ; h/2] for Y, but you probably want [0 ; w] and [0 ; h] instead. X += w/2 and Y += h/2 - problem solved.
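
The remarks above can be combined into one small routine. The following is a sketch under stated assumptions rather than a reference implementation: the class and method names are hypothetical, α is given in degrees, f is derived from min(w,h), and points at or behind the eye are simply skipped.

```java
/** Hypothetical helper; the names are illustrative and not from the original post. */
final class ScreenProjection {

    /**
     * Maps D(x, y, z) to screen coordinates in [0 ; w] x [0 ; h].
     * Returns null when the point is not in front of the eye.
     */
    static double[] toScreen(double x, double y, double z,
                             double w, double h, double halfAngleDeg) {
        if (z <= 0) {
            return null; // at or behind the eye: nothing sensible to draw
        }
        // A single focal length for both axes keeps the aspect ratio of the elements.
        double f = (Math.min(w, h) / 2.0) / Math.tan(Math.toRadians(halfAngleDeg));
        double screenX = (x / z) * f + w / 2.0; // shift from [-w/2 ; w/2] to [0 ; w]
        double screenY = (y / z) * f + h / 2.0; // shift from [-h/2 ; h/2] to [0 ; h]
        return new double[] { screenX, screenY };
    }
}
```

A point on the viewing axis, such as (0, 0, 5), lands at the screen center (w/2, h/2). Note that Android view coordinates grow downward in Y, so if your y axis points up you would probably use h/2 - Y instead of Y + h/2.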

Conclusion

I hope this answers your questions. I'll stay around in case it needs edits.

Bye!

< Self-promotion Alert > I actually wrote an article about 3D projection and rendering some time ago. The implementation is in JavaScript, but it should be quite easy to translate.

answered May 23 '13 at 09:13 by benjaminplanche · edited Feb 08 '17 at 14:41 by Community

Hey! Thanks for this really informative reply!! I'll give it a try :) And thank you for your article about 3D projection! It's really useful for me ;) – Frame91 May 23 '13 at 09:56
Can you explain which angle of view you mean? I have the horizontal and vertical view angles (I didn't check whether they are equal), and the Wikipedia article also mentions a diagonal view angle (: – Frame91 May 23 '13 at 10:00
In the method I present, 2α is both the horizontal and vertical angle of view (since I use a smaller effective square image plane of dimensions min(h,w) x min(h,w)). But you can tweak it to use your 2 values and the whole effective screen w x h instead, if you want. – benjaminplanche May 23 '13 at 12:08
If I got it right, I have to use the horizontal view angle in X = (x / z) * ( (min(w,h) / 2) / tan(α) ) and the vertical view angle in Y = (y / z) * ( (min(w,h) / 2) / tan(α) ). The dimensions would be w*h instead of min(h,w)*min(h,w)? So I get: X = (x / z) * ( (w / 2) / tan(horizontalviewangle) ) and Y = (y / z) * ( (h / 2) / tan(verticalviewangle) )? Thanks a lot! – Frame91 May 23 '13 at 12:20
Yes, just watch out for your ratio then. Depending on your angles and dimensions, the results may be slightly distorted (but I guess your angles are already computed to prevent that, so it should be ok) :) – benjaminplanche May 23 '13 at 12:54
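
The variant with separate horizontal and vertical view angles could look like the sketch below. It assumes that both angles are full angles of view (the 2α of the answer) given in degrees, so each one is halved before taking the tangent. The class and method names are hypothetical.

```java
/** Hypothetical helper; the names are illustrative and not from the original post. */
final class TwoAngleProjection {

    /** Maps D(x, y, z) to [0 ; w] x [0 ; h] using one full view angle per axis. */
    static double[] toScreen(double x, double y, double z, double w, double h,
                             double horizontalFovDeg, double verticalFovDeg) {
        if (z <= 0) {
            throw new IllegalArgumentException("point must be in front of the eye (z > 0)");
        }
        // The half-angle is α = fov / 2, then f = (size / 2) / tan(α) for each axis.
        double fx = (w / 2.0) / Math.tan(Math.toRadians(horizontalFovDeg / 2.0));
        double fy = (h / 2.0) / Math.tan(Math.toRadians(verticalFovDeg / 2.0));
        return new double[] { (x / z) * fx + w / 2.0, (y / z) * fy + h / 2.0 };
    }
}
```

If fx and fy come out different, shapes are stretched along one axis, which is the ratio issue mentioned in the comment above.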
Thank you very much! I don't get any "good" values, but I think this comes from my transformation from lat/lon/alt to a Cartesian coordinate system. My coordinate data is in meters (x,y,z); I compute it with the haversine formula and the bearing from the camera to the point I want to project. With both values, I can use trigonometry to calculate x and y. My z value is simply the difference between the camera height and the point height. Is this coordinate system usable? Do I have to convert the meters into some other unit (e.g. pixels)? Thanks a lot! – Frame91 May 23 '13 at 18:34
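
The conversion described in this comment might be sketched as follows. Everything here is an assumption made for illustration: the class and method names, the spherical Earth with a mean radius of 6371000 m, and the east/north/up axis convention. The resulting offsets are expressed in a ground-aligned frame, so they would still have to be rotated into the camera frame (where z is the depth along the viewing axis) using the device orientation, which is not covered here.

```java
/** Hypothetical helper; names, axis convention and Earth radius are assumptions. */
final class LocalOffsets {
    static final double EARTH_RADIUS_M = 6371000.0; // mean radius, spherical model

    /** Great-circle distance in meters (haversine formula); inputs in degrees. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = phi2 - phi1;
        double dLambda = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                 + Math.cos(phi1) * Math.cos(phi2)
                 * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        return 2.0 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /** Initial bearing from point 1 to point 2, in radians clockwise from north. */
    static double initialBearingRad(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dLambda = Math.toRadians(lon2 - lon1);
        double y = Math.sin(dLambda) * Math.cos(phi2);
        double x = Math.cos(phi1) * Math.sin(phi2)
                 - Math.sin(phi1) * Math.cos(phi2) * Math.cos(dLambda);
        return Math.atan2(y, x);
    }

    /** Offsets {east, north, up} in meters from the camera to the point. */
    static double[] eastNorthUp(double camLat, double camLon, double camAlt,
                                double ptLat, double ptLon, double ptAlt) {
        double d = haversineMeters(camLat, camLon, ptLat, ptLon);
        double b = initialBearingRad(camLat, camLon, ptLat, ptLon);
        return new double[] { d * Math.sin(b), d * Math.cos(b), ptAlt - camAlt };
    }
}
```

For nearby points this flat approximation is usually adequate; over long distances the curvature of the Earth makes the east/north split less meaningful.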
Hey, I'm not really familiar with lat/lon/alt coordinates, sorry. But you should maybe first check where the error comes from - your coordinate system conversion or your projection. Try for instance to use your projection method with a simple input (8 points forming a cube for instance) so you can check the output, knowing what to expect (a cube from the chosen point of view). – benjaminplanche May 25 '13 at 08:11
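
Such a sanity check could be sketched like this. The cube size and position, the 800 x 800 screen, α = 45deg and all names are arbitrary choices made for the example, not values from the thread.

```java
/** Hypothetical sanity check; all names and values are illustrative. */
final class CubeCheck {

    /** Same projection as in the answer, with f based on min(w,h) and the origin shifted. */
    static double[] project(double[] p, double w, double h, double halfAngleDeg) {
        double f = (Math.min(w, h) / 2.0) / Math.tan(Math.toRadians(halfAngleDeg));
        return new double[] { (p[0] / p[2]) * f + w / 2.0, (p[1] / p[2]) * f + h / 2.0 };
    }

    /** The 8 corners of a cube centered on the viewing axis at depth centerZ. */
    static double[][] cubeCorners(double centerZ, double half) {
        double[][] corners = new double[8][];
        int i = 0;
        for (int sx = -1; sx <= 1; sx += 2) {
            for (int sy = -1; sy <= 1; sy += 2) {
                for (int sz = -1; sz <= 1; sz += 2) {
                    corners[i++] = new double[] { sx * half, sy * half, centerZ + sz * half };
                }
            }
        }
        return corners;
    }

    public static void main(String[] args) {
        for (double[] c : cubeCorners(5.0, 1.0)) {
            double[] s = project(c, 800, 800, 45);
            System.out.printf("(%.0f, %.0f, %.0f) -> (%.1f, %.1f)%n",
                    c[0], c[1], c[2], s[0], s[1]);
        }
    }
}
```

The four near corners (z = 4) should land farther from the screen center than the four far corners (z = 6), giving two nested squares. If the output looks different, the projection itself is the likely culprit rather than the coordinate conversion.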
okay..anyway... thanks a lot for your really informative replies!! – Frame91 May 25 '13 at 11:39
Okay... I have one last question ;) What I don't get: how does my projection "know" where to project the point on the screen? My coordinate system is in meters, while the width/height of my screen is in pixels. For example, I could change my coordinate system to ECEF or another Cartesian coordinate system, and I'd get other dx,dy,dz values... which is really weird. You mentioned that you aren't "augmenting" things, so I think you are simply developing 3D applications with a "virtual" camera. Is your camera height/width in the same unit as your "world"? – Frame91 May 25 '13 at 23:16
During the projection, when you do "b_x = (d_x / d_z) * r_z", d_x and d_z are in the real-world unit you chose, while b_x and r_z are in your screen's unit (pixels probably). "(d_x / d_z)" is thus unitless since you calculate the relative proportion, and by multiplying by r_z, you apply this proportion to the screen size. – benjaminplanche May 26 '13 at 03:24
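
This unit argument can be checked numerically. In the small sketch below (names and values are made up for illustration), the same point is expressed once in meters and once in millimeters, while f stays in pixels.

```java
/** Hypothetical demo; names and values are illustrative. */
final class UnitInvariance {

    /** X = (x / z) * f, with x and z in any common world unit and f in pixels. */
    static double screenX(double x, double z, double fPx) {
        return (x / z) * fPx;
    }

    public static void main(String[] args) {
        double fPx = 540.0;
        double inMeters = screenX(1.1, 20.0, fPx);            // point given in meters
        double inMillimeters = screenX(1100.0, 20000.0, fPx); // same point in millimeters
        System.out.println(inMeters + " vs " + inMillimeters);
    }
}
```

Both calls give the same screen coordinate as long as x and z are converted together. Scaling only one of them (or mixing units between them) changes the result.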
Hey... me again. I've tested a few things and I still don't get how the projection "knows" what is far and what is near: when I change my coordinate system to "millimeters", I get values 1000 times higher than before. My relativePositionX is for example 1100 in this case. This results in screen point coordinates like (5000,10000), which cannot be shown on my display. Currently, my coordinate system is in meters and I get some "valid" values... but my projection can't know about this. Where do I have to manage this? Thanks – Frame91 May 26 '13 at 18:06
I tried to answer more explicitly in your other thread (http://stackoverflow.com/a/16792965/624547). Hope it will help. – benjaminplanche May 28 '13 at 13:11
Hi, sorry about the necropost, but can you explain the "Usually, α = 90deg is used, which means tan(α) = 1"? Isn't tan(90deg) supposed to be "undefined"? – kikito Nov 09 '16 at 01:54
Hi @kikito, thanks for spotting the typo! It was supposed to be 2α = 90deg (or α = 45deg). I corrected my replies accordingly. Thanks again! – benjaminplanche Nov 12 '16 at 16:00
