
I couldn't find a correct, understandable explanation of picking in 3D with the ray-casting method. Has anyone implemented this algorithm in any language? Please share directly working code, because pseudocode cannot be compiled and is generally written with parts missing.

genpfault
ibrahim demir

2 Answers


What you have is a position in 2D on the screen. The first thing to do is convert that point from pixels to normalized device coordinates, i.e. the range -1 to 1. Then you need to find the line in 3D space that the point represents. For this, you need the transformation matrices that your 3D app uses to create the projection and camera.

Typically you have 3 matrices: projection, view and model. When you specify vertices for an object, they're in "object space". Multiplying by the model matrix gives the vertices in "world space". Multiplying again by the view matrix gives "eye/camera space". Multiplying again by the projection matrix gives "clip space". Clip space has non-linear depth. Adding a Z component to your mouse coordinates puts them in clip space. You can perform the line/object intersection tests in any linear space, so you must at least move the mouse coordinates back to eye space, but it's more convenient to perform the intersection tests in world space (or object space, depending on your scene graph).

To move the mouse coordinates from clip space to world space, add a Z component and multiply by the inverse projection matrix and then the inverse camera/view matrix. To create a line, compute two points along Z: from and to.

[Image: the 2D mouse position unprojected to two points, from and to, forming a picking ray through the scene]

In the following example, I have a list of objects, each with a position and bounding radius. The intersections of course never match perfectly but it works well enough for now. This isn't pseudocode, but it uses my own vector/matrix library. You'll have to substitute your own in places.

vec2f mouse = (vec2f(mousePosition) / vec2f(windowSize)) * 2.0f - 1.0f;
mouse.y = -mouse.y; //origin is top-left and +y mouse is down

mat44 toWorld = (camera.projection * camera.transform).inverse();
//equivalent to camera.transform.inverse() * camera.projection.inverse() but faster

vec4f from = toWorld * vec4f(mouse, -1.0f, 1.0f);
vec4f to = toWorld * vec4f(mouse, 1.0f, 1.0f);

from /= from.w; //perspective divide ("normalize" homogeneous coordinates)
to /= to.w;

int clickedObject = -1;
float minDist = 99999.0f;

vec3f rayStart = from.xyz();
vec3f direction = to.xyz() - rayStart; //same for every object, so computed once outside the loop

for (size_t i = 0; i < objects.size(); ++i)
{
    float t1, t2;
    //intersectSphere() below assumes a sphere at the origin, so offset the ray start by the object's position
    if (intersectSphere(rayStart - objects[i].position, direction, objects[i].radius, t1, t2))
    {
        //object i has been clicked. keep the minimum t1 (the front-most object)
        if (t1 < minDist)
        {
            minDist = t1;
            clickedObject = (int)i;
        }
    }
}

//clicked object is objects[clickedObject]

Instead of intersectSphere, you could use a bounding box or other implicit geometry, or intersect a mesh's triangles (this may require building a kd-tree for performance reasons).

[EDIT]
Here's an implementation of the line/sphere intersection (based on the link in the code comment below). It assumes the sphere is at the origin, so pass from.xyz() - objects[i].position as p rather than from.xyz().

//ray at position p with direction d intersects sphere at (0,0,0) with radius r. returns intersection times along ray t1 and t2
bool intersectSphere(const vec3f& p, const vec3f& d, float r, float& t1, float& t2)
{
    //http://wiki.cgsociety.org/index.php/Ray_Sphere_Intersection
    float A = d.dot(d);
    float B = 2.0f * d.dot(p);
    float C = p.dot(p) - r * r;

    float dis = B * B - 4.0f * A * C;

    if (dis < 0.0f)
        return false;

    float S = sqrt(dis);    

    t1 = (-B - S) / (2.0f * A);
    t2 = (-B + S) / (2.0f * A);
    return true;
}
jozxyqk
  • It is a great explanation, thank you for explaining it at such length. I will try this method. Can you please also share the implementation of the function `intersectSphere()`? – ibrahim demir Nov 22 '13 at 13:02
  • `mat44 toWorld = (camera.projection * camera.transform).inverse();` I didn't understand what the camera.projection and camera.transform matrices are. Can you explain them or write the equivalent matrices in the NinevehGL framework? – ibrahim demir Nov 23 '13 at 14:46
  • Both matrices you will need to build yourself or get them from your framework/library. A quick google brought me to [`NGLCamera`](http://nineveh.gl/docs/Classes/NGLCamera.html). It looks like you can use `matrixViewProjection` which is `projection * view` (same as above) and I'm assuming the class has an `inverse()` function. – jozxyqk Nov 23 '13 at 15:32
  • [My Code Here](http://pastebin.com/19FTiuqv) This is my code, but the result is something like this: from: 0.056349 -0.000000 0.000000 -0.000000 to: 0.056349 0.000000 0.000000 -0.000000 You said that dividing by w will normalize, but w = 0 here. What's wrong? – ibrahim demir Nov 23 '13 at 16:27
  • If `w` is zero, the matrices are probably incorrect. Also both vectors are the same. I'd print out the matrices and make sure they aren't all zero. There might be some more initialization or updating the camera class needs. – jozxyqk Nov 23 '13 at 16:44
  • [Here is the Vector output](http://pastebin.com/qn7AQ1Y5) I don't know which part to update; I am a newbie in this subject. I would appreciate any help. – ibrahim demir Nov 23 '13 at 16:58
  • @ibrahimdemir Double check your near/far values for the camera aren't too far apart, although from the looks of it they're nearly the same :S. Eg, use a range of 0.01 to 100.0. Much further and you run into precision issues. Can you show me the separate projection and view matrices? – jozxyqk Nov 23 '13 at 17:25
  • I have found the problem; it was something strange with pointer usage in Objective-C. I edited my code. But the `to` vector's `z` parameter is too high and the `w` parameter is always nearly 1, so dividing by w is not a solution in this case. [My updated code](http://pastebin.com/C5a8YCXF) [Here is the Log](http://pastebin.com/41X2L9NB) – ibrahim demir Nov 24 '13 at 14:21
  • And here are the contents of the [Projection matrix](http://pastebin.com/DEhvYnVN) and [view matrix](http://pastebin.com/EANs7Hsh) – ibrahim demir Nov 24 '13 at 14:24
  • At first glance it looks like the matrix multiplies might be transposed (see [row major vs column major](http://www.opengl.org/archives/resources/faq/technical/transformations.htm)). And I'd assume the library stores its matrices in the same order. So for example instead of `from.x=m[0]*x+m[1]*y+m[2]*z+m[3]*w`, use `from.x=m[0]*x+m[4]*y+m[8]*z+m[12]*w`. – jozxyqk Nov 24 '13 at 18:09
  • [the result, when I update as 0-4-8-12](http://pastebin.com/HKisWA2M) The from & to variables settle to these numbers for the touch point (Z and W are the same every time; X and Y change with the touch coordinates). – ibrahim demir Nov 24 '13 at 19:47
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/41833/discussion-between-jozxyqk-and-ibrahim-demir) – jozxyqk Nov 25 '13 at 04:02
vec4f from = toWorld * vec4f(mouse, -1.0f, 1.0f);
vec4f to = toWorld * vec4f(mouse, 1.0f, 1.0f);

I'm assuming that 'from' is the position of the mouse cursor? If so, why is its z negative one, assuming OpenGL coordinates?

Also, does this assume that the depth at this point is -1 to +1, rather than the depth of our frustum?

user3589502