34

I have a renderer using DirectX and OpenGL, and a 3D scene. The viewport and the window are of the same dimensions.

How do I implement picking given mouse coordinates x and y in a platform independent way?

Tom J Nowell
  • Take a look at [this guide](http://antongerdelan.net/opengl/raycasting.html), it might be helpful – Megidd Jun 26 '18 at 10:24

6 Answers

30

If you can, do the picking on the CPU by calculating a ray from the eye through the mouse pointer and intersecting it with your models.

If this isn't an option I would go with some type of ID rendering. Assign each object you want to pick a unique color, render the objects with these colors and finally read out the color from the framebuffer under the mouse pointer.
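
For illustration, a rough sketch of that ID-rendering idea on the OpenGL side (the fixed-function calls and the Object/drawObject names are placeholders of mine, not something the answer prescribes):

// Sketch only: encode each object's index as an RGB colour, render, then read
// back the pixel under the mouse. Object and drawObject are hypothetical.
#include <GL/gl.h>
#include <vector>

struct Object { /* your mesh/vertex data */ };   // placeholder scene object
void drawObject(const Object& obj);              // your existing draw call (assumed)

unsigned pickObject(int mouseX, int mouseY, int viewportHeight,
                    const std::vector<Object>& objects)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glDisable(GL_LIGHTING);        // colours must reach the framebuffer unmodified
    glDisable(GL_TEXTURE_2D);

    for (unsigned i = 0; i < objects.size(); ++i) {
        unsigned id = i + 1;       // reserve 0 for "nothing hit"
        glColor3ub(id & 0xFF, (id >> 8) & 0xFF, (id >> 16) & 0xFF);
        drawObject(objects[i]);
    }

    // Window y usually grows downwards, GL's framebuffer origin is bottom-left.
    unsigned char rgb[3];
    glReadPixels(mouseX, viewportHeight - mouseY - 1, 1, 1,
                 GL_RGB, GL_UNSIGNED_BYTE, rgb);
    return rgb[0] | (rgb[1] << 8) | (rgb[2] << 16);   // 0 == background
}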

EDIT: If the question is how to construct the ray from the mouse coordinates, you need the following: a projection matrix P and the camera transform C. If the mouse pointer coordinates are (x, y) and the viewport size is (width, height), one position in clip space along the ray is:

mouse_clip = [
  float(x) * 2 / float(width) - 1,
  1 - float(y) * 2 / float(height),
  0,
  1]

(Notice that I flipped the y-axis, since the origin of the mouse coordinates is often in the upper-left corner.)

The following is also true:

mouse_clip = P * C * mouse_worldspace

Which gives:

mouse_worldspace = inverse(C) * inverse(P) * mouse_clip

We now have:

p = C.position(); //origin of camera in worldspace
n = normalize(mouse_worldspace - p); //unit vector from p through mouse pos in worldspace
Andreas Brinck
  • @Tom That wasn't totally clear from the question. Anyways, I've edited my answer, hope it's some help. – Andreas Brinck Jan 19 '10 at 13:22
  • It's worth noting that if you use DirectX-style matrices then the multiplication order is reversed. – Goz Jan 19 '10 at 21:49
  • Thanks :) I should've been clearer at first, but your edited answer is what I wanted! – Tom J Nowell Jan 21 '10 at 19:06
  • I'm not sure if the error is in your work or mine, but I could only get this algorithm to work when I used the negative value of the near clipping plane for the z coordinate of mouse_clip, i.e. `mouse_clip = [float(x) * 2 / float(width) - 1, 1 - float(y) * 2 / float(height), -1 * near_clipping_plane, 1]` – Alex W Apr 22 '12 at 19:06
  • Just to save others the trouble: this method only works if the third coordinate in mouse_clip is not 0 but -near_depth. Additionally, for an orthographic projection matrix, p has to be computed differently. – Danvil Dec 07 '13 at 22:45
  • Should `mouseWorldSpace` be divided by its own `w` coordinate, since it has a perspective transformation applied to it? – bwroga Mar 18 '18 at 19:05
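
Pulling the answer and the comments together, a minimal sketch of the construction in C++ with GLM (the library choice is an assumption of mine); it puts the clip-space point on the near plane and applies the w-divide the last comment asks about:

#include <glm/glm.hpp>

struct Ray { glm::vec3 origin, dir; };

// P = projection matrix, C = camera (view) matrix, so mouse_clip = P * C * world.
Ray mouseRay(float x, float y, float width, float height,
             const glm::mat4& P, const glm::mat4& C)
{
    // Clip-space point under the mouse; z = -1 puts it on the near plane
    // (OpenGL NDC convention; Direct3D uses z = 0 for the near plane).
    glm::vec4 mouse_clip(2.0f * x / width - 1.0f,
                         1.0f - 2.0f * y / height,
                         -1.0f,
                         1.0f);

    // mouse_worldspace = inverse(C) * inverse(P) * mouse_clip, then the w-divide.
    glm::vec4 mouse_world = glm::inverse(C) * glm::inverse(P) * mouse_clip;
    mouse_world /= mouse_world.w;

    // Camera origin in world space = translation column of inverse(C).
    glm::vec3 p = glm::vec3(glm::inverse(C)[3]);
    glm::vec3 n = glm::normalize(glm::vec3(mouse_world) - p);
    return { p, n };
}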
26

Here's the viewing frustum:

[diagram: the viewing frustum]

First you need to determine where on the nearplane the mouse click happened:

  1. rescale the window coordinates (0..640,0..480) to [-1,1], with (-1,-1) at the bottom-left corner and (1,1) at the top-right.
  2. 'undo' the projection by multiplying the scaled coordinates by what I call the 'unview' matrix: unview = (P * M).inverse() = M.inverse() * P.inverse(), where M is the ModelView matrix and P is the projection matrix.

Then determine where the camera is in worldspace, and draw a ray starting at the camera and passing through the point you found on the nearplane.

The camera is at M.inverse().col(4), i.e. the final column of the inverse ModelView matrix.

Final pseudocode:

normalised_x = 2 * mouse_x / win_width - 1
normalised_y = 1 - 2 * mouse_y / win_height
// note the y pos is inverted, so +y is at the top of the screen

unviewMat = (projectionMat * modelViewMat).inverse()

near_point = unviewMat * Vec(normalised_x, normalised_y, 0, 1)
camera_pos = ray_origin = modelViewMat.inverse().col(4)
ray_dir = near_point - camera_pos
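
For concreteness, the pseudocode above written out in C++ with GLM (an assumed math library; any column-major one behaves the same). A homogeneous divide by w is added so the unprojected point is a proper 3D position:

#include <glm/glm.hpp>

void pickRay(float mouse_x, float mouse_y, float win_width, float win_height,
             const glm::mat4& projectionMat, const glm::mat4& modelViewMat,
             glm::vec3& ray_origin, glm::vec3& ray_dir)
{
    float nx = 2.0f * mouse_x / win_width - 1.0f;
    float ny = 1.0f - 2.0f * mouse_y / win_height;   // flip y: +y is up in NDC

    glm::mat4 unview = glm::inverse(projectionMat * modelViewMat);

    // z = 0 as in the pseudocode; after the divide below, any NDC z gives a
    // point on the same ray through the camera.
    glm::vec4 near_point = unview * glm::vec4(nx, ny, 0.0f, 1.0f);
    near_point /= near_point.w;

    // "col(4)" from the pseudocode; GLM indexes columns from 0.
    ray_origin = glm::vec3(glm::inverse(modelViewMat)[3]);
    ray_dir    = glm::normalize(glm::vec3(near_point) - ray_origin);
}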
nornagon
  • what is that "modelView" matrix you refer to? Is that the combination of the modelToWorld matrix of the model we're trying to hit and the camera's viewMatrix? – Jubei Nov 06 '12 at 06:40
  • It's been a while since I wrote this, but I think it's the matrix that transforms world coordinates into camera coordinates. If there's a line in your vertex shader like `gl_Position = projection * modelView * vertexPos;`, it's the bit in the middle, where the `projection` matrix is the translation from camera to viewport coordinates. HTH :/ – nornagon Nov 06 '12 at 12:31
2

Well, it's pretty simple; the theory behind this is always the same:

1) Unproject your 2D coordinate twice into 3D space (each API has its own function, but you can implement your own if you want): once at min Z, once at max Z.

2) With these two points, calculate the vector that goes from the min-Z point towards the max-Z point.

3) With that vector and a point, build the ray that goes from min Z to max Z.

4) Now you have a ray; with this you can do a ray-triangle/ray-plane/ray-something intersection and get your result (a sketch of steps 1–3 follows below)...
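
A minimal sketch of steps 1–3 using glm::unProject as the per-API unprojection function (on the DirectX side, XMVector3Unproject from DirectXMath plays the same role); the viewport is assumed to be (x, y, width, height):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>   // glm::unProject

struct Ray { glm::vec3 origin, dir; };

Ray buildPickRay(float mouseX, float mouseY,
                 const glm::mat4& view, const glm::mat4& proj,
                 const glm::vec4& viewport)   // (x, y, width, height)
{
    // Window-space y usually grows downwards; unProject expects it upwards.
    float winY = viewport.w - mouseY;

    // 1) Unproject at min Z (near plane) and max Z (far plane).
    glm::vec3 nearPt = glm::unProject(glm::vec3(mouseX, winY, 0.0f), view, proj, viewport);
    glm::vec3 farPt  = glm::unProject(glm::vec3(mouseX, winY, 1.0f), view, proj, viewport);

    // 2) + 3) The ray starts at the near point and points towards the far point.
    return { nearPt, glm::normalize(farPt - nearPt) };
}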

feal87
1

I have little DirectX experience, but I'm sure it's similar to OpenGL. What you want is the gluUnProject call.

Assuming you have a valid Z buffer you can query the contents of the Z buffer at a mouse position with:

// obtain the viewport, modelview matrix and projection matrix
// you may keep the viewport and projection matrices throughout the program if you don't change them
GLint viewport[4];
GLdouble modelview[16];
GLdouble projection[16];
glGetIntegerv(GL_VIEWPORT, viewport);
glGetDoublev(GL_MODELVIEW_MATRIX, modelview);
glGetDoublev(GL_PROJECTION_MATRIX, projection);

// obtain the Z position (not world coordinates but in range 0 - 1)
GLfloat z_cursor;
glReadPixels(x_cursor, y_cursor, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &z_cursor);

// obtain the world coordinates
GLdouble x, y, z;
gluUnProject(x_cursor, y_cursor, z_cursor, modelview, projection, viewport, &x, &y, &z);

If you don't want to use GLU you can also implement gluUnProject yourself; its functionality is relatively simple and is described at opengl.org.
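
As a sketch of what such a replacement might look like (I'm using GLM here for the matrix inverse, which is an assumption; the GLdouble arrays from the snippet above would first need converting, e.g. with glm::make_mat4):

// Rough equivalent of gluUnProject (a sketch, not the exact GLU source): map
// window coordinates plus a depth value back to world coordinates.
#include <glm/glm.hpp>

bool myUnProject(double winX, double winY, double winZ,
                 const glm::dmat4& modelview, const glm::dmat4& projection,
                 const int viewport[4],
                 double& objX, double& objY, double& objZ)
{
    // Window coords -> normalised device coords in [-1, 1].
    glm::dvec4 ndc(2.0 * (winX - viewport[0]) / viewport[2] - 1.0,
                   2.0 * (winY - viewport[1]) / viewport[3] - 1.0,
                   2.0 * winZ - 1.0,
                   1.0);

    // Undo projection and modelview, then the perspective divide.
    glm::dvec4 obj = glm::inverse(projection * modelview) * ndc;
    if (obj.w == 0.0)
        return false;
    obj /= obj.w;

    objX = obj.x; objY = obj.y; objZ = obj.z;
    return true;
}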

wich
  • @Tom As I said, if you don't want to use the GLU function you can just implement its functionality yourself; all you would then need is to get the modelview and projection matrices for each and get the z window position for each. – wich Jan 22 '10 at 08:28
  • Which would require me to figure out the same xyz; I could then post that xyz and mark it as the answer instead of this one, you see my reasoning? Someone else posted the math anyway. – Tom J Nowell Jan 22 '10 at 19:34
0

OK, this topic is old but it was the best I found on the subject, and it helped me a bit, so I'll post here for those who are following ;-)

This is the way I got it to work without having to compute the inverse of the projection matrix:

void Application::leftButtonPress(u32 x, u32 y){
    GL::Viewport vp = GL::getViewport(); // just a call to glGet GL_VIEWPORT
    vec3f p = vec3f::from(
        ((float)(vp.width - x) / (float)vp.width),
        ((float)y / (float)vp.height),
        1.);
    // alternatively vec3f p = vec3f::from(
    //     ((float)x / (float)vp.width),
    //     ((float)(vp.height - y) / (float)vp.height),
    //     1.);

    p *= vec3f::from(APP_FRUSTUM_WIDTH, APP_FRUSTUM_HEIGHT, 1.);
    p += vec3f::from(APP_FRUSTUM_LEFT, APP_FRUSTUM_BOTTOM, 0.);

    // now p elements are in (-1, 1)
    vec3f near = p * vec3f::from(APP_FRUSTUM_NEAR);
    vec3f far = p * vec3f::from(APP_FRUSTUM_FAR);

    // ray in world coordinates
    Ray ray = { _camera->getPos(), -(_camera->getBasis() * (far - near).normalize()) };

    _ray->set(ray.origin, ray.dir, 10000.); // this is a debugging vertex array to see the Ray on screen

    Node* node = _scene->collide(ray, Transform());
    cout << "node is : " << node << endl;
}

This assumes a perspective projection; for an orthographic one the question never really arises in the first place.

nulleight
0

I've got the same situation with ordinary ray picking, but something is wrong. I've performed the unproject operation the proper way, but it just doesn't work. I think I've made some mistake, but can't figure out where. My matrix multiplication, inverse and vector-by-matrix multiplication all seem to work fine, I've tested them. In my code I'm reacting on WM_LBUTTONDOWN, so lParam returns the [Y][X] coordinates as 2 words in a dword. I extract them, then convert to normalized space, and I've checked that this part also works fine: when I click the lower-left corner I get values close to (-1, -1) and good values for all 3 other corners. I'm then using the line_points.vtx array for debugging and it's not even close to reality.

unsigned int x_coord=lParam&0x0000ffff; //X RAW COORD
unsigned int y_coord=client_area.bottom-(lParam>>16); //Y RAW COORD

double xn=((double)x_coord/client_area.right)*2-1; //X [-1 +1]
double yn=1-((double)y_coord/client_area.bottom)*2;//Y [-1 +1]

_declspec(align(16))gl_vec4 pt_eye(xn,yn,0.0,1.0); 
gl_mat4 view_matrix_inversed;
gl_mat4 projection_matrix_inversed;
cam.matrixProjection.inverse(&projection_matrix_inversed);
cam.matrixView.inverse(&view_matrix_inversed);

gl_mat4::vec4_multiply_by_matrix4(&pt_eye,&projection_matrix_inversed);
gl_mat4::vec4_multiply_by_matrix4(&pt_eye,&view_matrix_inversed);

line_points.vtx[line_points.count*4]=pt_eye.x-cam.pos.x;
line_points.vtx[line_points.count*4+1]=pt_eye.y-cam.pos.y;
line_points.vtx[line_points.count*4+2]=pt_eye.z-cam.pos.z;
line_points.vtx[line_points.count*4+3]=1.0;
Antiusninja