
I have what I believed to be a basic need: from the 2D position of the mouse on the screen, I need to get the closest 3D point in the 3D world. This looks like a common ray-tracing problem (even if ray tracing is not what I am doing).

I googled and read a lot: the topic looks messy and things quickly get intricate. My actual need involves lots of 3D points that I do not know in advance (meshes or point clouds from the internet), so it is impossible to tell what result to expect. I therefore decided to create simple shapes (triangle, quadrangle, cube) with points that I know (each coordinate of each point is 0.f or ±0.5f in the local frame) and to check whether I can "recover" the 3D point positions from the mouse cursor as I move it on the screen.

Note: every coordinate of every point of every shape is a known value such as 0.f or ±0.5f. For example, with the triangle:

float vertices[] = {
    -0.5f, -0.5f, 0.0f,
     0.5f, -0.5f, 0.0f,
     0.0f,  0.5f, 0.0f
};

What I do

I have a 3D OpenGL renderer to which I added a GUI to control the rendered scene:

  • Transformations: tx, ty, tz, rx, ry, rz are controls that change the model matrix. In code:

    // create transformations: model represents local to world transformation
    model = glm::mat4(1.0f); // initialize matrix to identity matrix first
    model = glm::translate(model, glm::vec3(tx, ty, tz));
    model = glm::rotate(model, glm::radians(rx), glm::vec3(1.0f, 0.0f, 0.0f));
    model = glm::rotate(model, glm::radians(ry), glm::vec3(0.0f, 1.0f, 0.0f));
    model = glm::rotate(model, glm::radians(rz), glm::vec3(0.0f, 0.0f, 1.0f));
    ourShader.setMat4("model", model);
    

    model changes only the position of the shape in the world and has no connection with the position of the camera (that's what I understand from tutorials).

  • Camera: from here, I ended up with a camera class that holds the view and proj matrices. In code:

    // get view and projection from camera
    view = cam.getViewMatrix();
    ourShader.setMat4("view", view);
    proj = cam.getProjMatrix((float)SCR_WIDTH, (float)SCR_HEIGHT, near, 100.f);
    ourShader.setMat4("proj", proj);
    

    The camera is a fly-like camera that can be moved when moving the mouse or using keyboard arrows and that does not act on model, but only on view and proj (that's what I understand from tutorials).

    The shader then uses model, view and proj this way:

    uniform mat4 model;
    uniform mat4 view;
    uniform mat4 proj;
    void main()
    {
       // note that we read the multiplication from right to left
       gl_Position = proj * view * model * vec4(aPos.x, aPos.y, aPos.z, 1.0);
    }

  • Screen to world: as glm::unProject didn't always return the results I expected, I added a control to bypass it (back-projecting by hand). In code, I first get the mouse cursor position frame3DPos following this:

    // glfw: whenever the mouse moves, this callback is called
    // -------------------------------------------------------
    void mouseCursorCallback(GLFWwindow* window, double xposIn, double yposIn)
    {
        // screen to world transformation
    
        xposScreen = xposIn;
        yposScreen = yposIn;
    
        int windowWidth = 0, windowHeight = 0; // size in screen coordinates.
        glfwGetWindowSize(window, &windowWidth, &windowHeight);
        int frameWidth = 0, frameHeight = 0; // size in pixel.
        glfwGetFramebufferSize(window, &frameWidth, &frameHeight);
        glm::vec2 frameWinRatio = glm::vec2(frameWidth, frameHeight) /
                                  glm::vec2(windowWidth, windowHeight);
        glm::vec2 screen2DPos = glm::vec2(xposScreen, yposScreen);
        glm::vec2 frame2DPos = screen2DPos * frameWinRatio; // window / frame sizes may be different.
        frame2DPos = frame2DPos + glm::vec2(0.5f, 0.5f); // shift to GL's center convention.
        glm::vec3 frame3DPos = glm::vec3(0.0f, 0.0f, 0.0f);
        frame3DPos.x = frame2DPos.x;
        frame3DPos.y = frameHeight - 1.0f - frame2DPos.y; // GL's window origin is at the bottom left
        frame3DPos.z = 0.f;
        glReadPixels((GLint) frame3DPos.x, (GLint) frame3DPos.y, // CAUTION: cast to GLint.
                     1, 1, GL_DEPTH_COMPONENT,
                     GL_FLOAT, &zbufScreen); // CAUTION: GL_DOUBLE is NOT supported.
        frame3DPos.z = zbufScreen; // z-buffer.
    

    Then I either call glm::unProject or back-project by hand, according to a control in the GUI:

    glm::vec3 world3DPos = glm::vec3(0.0f, 0.0f, 0.0f);
    if (screen2WorldUsingGLM) {
        glm::vec4 viewport(0.0f, 0.0f, (float) frameWidth, (float) frameHeight);
        world3DPos = glm::unProject(frame3DPos, view * model, proj, viewport);
    } else {
        glm::mat4 trans = proj * view * model;
        glm::vec4 frame4DPos(frame3DPos, 1.f);
        frame4DPos = glm::inverse(trans) * frame4DPos;
        world3DPos.x = frame4DPos.x / frame4DPos.w;
        world3DPos.y = frame4DPos.y / frame4DPos.w;
        world3DPos.z = frame4DPos.z / frame4DPos.w;
    }
    

    Question: the glm::unProject doc says "Map the specified window coordinates (win.x, win.y, win.z) into object coordinates", but I am not sure I understand what object coordinates are. Do object coordinates refer to the local, world, view or clip space described here?

  • Z-buffering is always enabled, whether the shape is 2D (triangle, quadrangle) or 3D (cube). In code:

    glEnable(GL_DEPTH_TEST); // Enable z-buffer.
    while (!glfwWindowShouldClose(window)) {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); // also clear the z-buffer
    

In picture I get

enter image description here

The camera is positioned at (0., 0., 0.) and looks "ahead" (front = -z, as the z-axis is positive from the screen towards me). The shape is positioned (using tx, ty, tz, rx, ry, rz) "in front of the camera" with tz = -5 (5 units along the front vector of the camera).

What I get

Triangle in initial setting

enter image description here

I get correct xpos and ypos in the world frame but an incorrect zpos = 0. (z-buffering is enabled). I expected zpos = -5 (as tz = -5).

Question: why is zpos incorrect?

If I do not use glm::unProject, I get outer-space results:

enter image description here

Question: why doesn't back-projecting by hand return results consistent with glm::unProject? Is this logical? Are they different operations? (I believed they should be equivalent, but they are obviously not.)

Triangle moved with translation

enter image description here

After a translation of about tx = 0.5, I still get the same coordinates (local frame), where I expected the previous coordinates translated along the x-axis. Not using glm::unProject returns outer-space results here too...

Question: why is the translation (applied by model, not by view or proj) ignored?

Cube in initial setting

enter image description here

I get correct xpos, ypos and zpos?!... So why does this not work the same way with the "2D" triangle (which is a "3D" shape to me, so both should behave the same)?

Cube moved with translation

enter image description here

Translating along ty this time seems to have no effect (I still get the same coordinates, in the local frame).

Question: as with the triangle, why is the translation ignored?

What I'd like to get

The main question is: why is the model transformation ignored? If this is to be expected, I'd like to understand why. If there is a way to recover the "true" position of the shape in the world (including the model transformation) from the position of the mouse cursor, I'd like to understand how.

fghoussen
  • 393
  • 3
  • 16

1 Answer


Question: the glm::unProject doc says "Map the specified window coordinates (win.x, win.y, win.z) into object coordinates", but I am not sure I understand what object coordinates are. Do object coordinates refer to the local, world, view or clip space described here?

As I am new to OpenGL, I didn't realize that "object coordinates" in the glm::unProject doc is another name for local space. Solution: either pass view * model to glm::unProject, which returns a point in local space, and then apply model to that point to get world coordinates; or pass only view to glm::unProject, which returns world coordinates directly, as explained here: Screen Coordinates to World Coordinates.
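
To make the two options concrete, here is my understanding of what glm::unProject computes, written as a short derivation. The symbols are my own notation, not from the GLM doc: $P$ = proj, $V$ = view, $M$ = model, $MV$ = the matrix passed as second argument, $(v_x, v_y, v_w, v_h)$ = the viewport, and I assume the default depth range $[0, 1]$ with NDC in $[-1, 1]$.

```latex
\begin{aligned}
x_{\text{ndc}} &= 2\,\frac{x_{\text{win}} - v_x}{v_w} - 1, \qquad
y_{\text{ndc}} = 2\,\frac{y_{\text{win}} - v_y}{v_h} - 1, \qquad
z_{\text{ndc}} = 2\,z_{\text{win}} - 1 \\[4pt]
\tilde{p} &= (P \cdot MV)^{-1}
  \begin{pmatrix} x_{\text{ndc}} \\ y_{\text{ndc}} \\ z_{\text{ndc}} \\ 1 \end{pmatrix},
\qquad
p_{\text{obj}} = \frac{(\tilde{p}_x,\ \tilde{p}_y,\ \tilde{p}_z)}{\tilde{p}_w} \\[4pt]
MV = V \cdot M &\;\Longrightarrow\; p_{\text{obj}} = p_{\text{local}},
\qquad p_{\text{world}} = M \, p_{\text{local}} \\
MV = V &\;\Longrightarrow\; p_{\text{obj}} = p_{\text{world}}
\end{aligned}
```

The first line also seems to explain the by-hand results: multiplying raw window coordinates by the inverse matrix skips the window-to-NDC mapping, so the two code paths in the question are not equivalent.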

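This fixes all the weird behaviors I observed: the coordinates I was reading were local-space coordinates, which is why the triangle reported zpos = 0 and why the model translation appeared to be ignored.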
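
For readers who want to check the local versus world distinction without a GL context, below is a minimal sketch in plain C++ with no GLM. It is not the code from my renderer and it is simplified on purpose. The assumptions are: the camera sits at the origin looking down -z (so view is the identity), the projection is a symmetric perspective matrix laid out like the one glm::perspective produces, the model matrix is a pure translation, and depth is in [0, 1]. The names Vec3, Camera, localToWindow, windowToWorld and worldToLocal are hypothetical helpers introduced for this illustration.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Hypothetical camera description: symmetric perspective frustum, camera at
// the origin looking down -z (assumption: the view matrix is the identity).
struct Camera {
    double fovyRadians;
    double aspect;  // framebuffer width / height
    double zNear;
    double zFar;
};

// Forward path: local -> world (pure translation t) -> clip -> NDC -> window.
Vec3 localToWindow(const Vec3& local, const Vec3& t,
                   double fbWidth, double fbHeight, const Camera& cam) {
    const Vec3 world{local.x + t.x, local.y + t.y, local.z + t.z};
    const double f = 1.0 / std::tan(cam.fovyRadians / 2.0);
    const double a = (cam.zFar + cam.zNear) / (cam.zNear - cam.zFar);
    const double b = 2.0 * cam.zFar * cam.zNear / (cam.zNear - cam.zFar);
    const double clipW = -world.z;  // perspective divide term
    const double ndcX = (f / cam.aspect) * world.x / clipW;
    const double ndcY = f * world.y / clipW;
    const double ndcZ = (a * world.z + b) / clipW;
    return {(ndcX + 1.0) * 0.5 * fbWidth,
            (ndcY + 1.0) * 0.5 * fbHeight,
            (ndcZ + 1.0) * 0.5};  // depth as read from the z-buffer
}

// Backward path: window (pixels, depth in [0, 1]) -> world.
Vec3 windowToWorld(const Vec3& win, double fbWidth, double fbHeight,
                   const Camera& cam) {
    // 1. Window -> NDC in [-1, 1]; skipping this step gives wrong results.
    const double ndcX = 2.0 * win.x / fbWidth - 1.0;
    const double ndcY = 2.0 * win.y / fbHeight - 1.0;
    const double ndcZ = 2.0 * win.z - 1.0;
    // 2. NDC -> view space, inverting the perspective projection analytically.
    const double f = 1.0 / std::tan(cam.fovyRadians / 2.0);
    const double a = (cam.zFar + cam.zNear) / (cam.zNear - cam.zFar);
    const double b = 2.0 * cam.zFar * cam.zNear / (cam.zNear - cam.zFar);
    const double viewZ = -b / (a + ndcZ);
    const double viewX = ndcX * (-viewZ) * cam.aspect / f;
    const double viewY = ndcY * (-viewZ) / f;
    // 3. View -> world: nothing to do because the view matrix is the identity.
    return {viewX, viewY, viewZ};
}

// World -> local for a model matrix that is a pure translation t.
Vec3 worldToLocal(const Vec3& world, const Vec3& t) {
    return {world.x - t.x, world.y - t.y, world.z - t.z};
}
```

With a triangle vertex at local (0.5, -0.5, 0) and a model translation of (0.5, 0, -5), windowToWorld should give back roughly (1, -0.5, -5), while worldToLocal gives (0.5, -0.5, 0). The second value is what I was reading in the screenshots when I passed view * model to glm::unProject.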

fghoussen