Get 3D coordinates of a mouse click in WebGL

Question

Since there is surprisingly little information on WebGL (or I just don't know how to search for it), I have a question about how to transform mouse coordinates to 3D coordinates, so I can see where exactly in the scene I am clicking.

So my case is that I have a very simple skybox, the camera is positioned at [0, 0, 0] and I can look around it by clicking and dragging. What I want to do is be able to click somewhere on that skybox and know where I have clicked as I need to put an annotation (some text, or html element) on that position. And that html element must move and go out of view with me turning to another side. So what I need is a way to get a mouse click and find out which side of the cube I am clicking on and at what coordinates, so I can place the annotations correctly.

I am using plain WebGL; I don't use THREE.js or anything like that. Since it's just one cube, I assume finding the intersection won't be that hard and won't require extra libraries.

this is not a webgl problem but a math one. take a look at [ray plane intersection testing](https://www.scratchapixel.com/lessons/3d-basic-rendering/minimal-ray-tracer-rendering-simple-shapes/ray-plane-and-ray-disk-intersection). — LJᛃ, Feb 09 '20 at 16:43
well I need to transform my 2d mouse coordinates into a 3d coordinates for me to be able to do that, so you didn't read my question — Poyr23, Feb 09 '20 at 16:47
I feel like you don't realize that a 2D point represents an infinite ray in 3D space but maybe I didn't read your question, good luck figuring it out! — LJᛃ, Feb 09 '20 at 17:01
alright, we are getting somewhere, infinite ray with what coordinates :) — Poyr23, Feb 09 '20 at 17:26

gman · Accepted Answer · 2020-02-10T06:14:38.313

Well, you're certainly right that it's hard to find an example.

A common WebGL shader projects in 3D using code like one of these:

gl_Position = matrix * position;

or

gl_Position = projection * modelView * position;

or

gl_Position = projection * view * world * position;

These are all basically the same thing: they take position and multiply it by a matrix to convert it to clip space. To go the other way you do the opposite: take a position in clip space and convert it back to position space, which is

inverse (projection * view * world) * clipSpacePosition
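
Written out as math, the idea above looks roughly like this. The symbols are my own shorthand, not anything from a library: P, V, W are the projection, view and world matrices, p is the original position with w = 1, c is the clip space position and n is the normalized clip space point (x, y, z each in -1 to +1) that you build from the mouse. The last line is the divide by w that the inverse needs under a perspective projection; as far as I know, twgl's m4.transformPoint does that divide for you.

```latex
\begin{aligned}
c &= P\,V\,W\,p \\
n &= c_{xyz} / c_w \\
\tilde{p} &= (P\,V\,W)^{-1} \begin{bmatrix} n \\ 1 \end{bmatrix}
           = W^{-1}\,V^{-1}\,P^{-1} \begin{bmatrix} n \\ 1 \end{bmatrix} \\
p_{xyz} &= \tilde{p}_{xyz} / \tilde{p}_w
\end{aligned}
```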

So, take your 3D math library and compute the inverse of the matrix you're passing to WebGL. For example, here is some code that computes matrices to draw something using twgl's math library:

  const fov = 30 * Math.PI / 180;
  const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
  const zNear = 0.5;
  const zFar = 10;
  const projection = m4.perspective(fov, aspect, zNear, zFar);

  const eye = [1, 4, -6];
  const target = [0, 0, 0];
  const up = [0, 1, 0];
  const camera = m4.lookAt(eye, target, up);

  const view = m4.inverse(camera);
  const viewProjection = m4.multiply(projection, view);
  const world = m4.rotationY(time);

For a shader that is effectively doing this

  gl_Position = viewProjection * world * position;

So we need the inverse

  const invMat = m4.inverse(m4.multiply(viewProjection, world));

Then we need a clip space ray. We're going from 2D to 3D, so we'll make a ray that cuts through the frustum, starting at zNear and ending at zFar, by using -1 and +1 as our Z values.

  canvas.addEventListener('mousemove', (e) => {
     const rect = canvas.getBoundingClientRect();
     const x = e.clientX - rect.left;
     const y = e.clientY - rect.top;

     const clipX = x / rect.width  *  2 - 1;
     const clipY = y / rect.height * -2 + 1;

     const start = m4.transformPoint(invMat, [clipX, clipY, -1]);
     const end   = m4.transformPoint(invMat, [clipX, clipY,  1]);

     // ... do something with start/end
  });
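
If it helps to test the pixel to clip space step on its own, it can be pulled out into a small pure function. This is just a sketch: `mouseToClip` is a name I made up, and `rect` is assumed to be anything shaped like the result of `getBoundingClientRect()`.

```javascript
// Hypothetical helper (not part of twgl or WebGL): converts a mouse position
// in client coordinates to clip space X/Y in the range -1 to +1.
// `rect` is assumed to look like the result of getBoundingClientRect().
function mouseToClip(clientX, clientY, rect) {
  // position relative to the top left corner of the canvas, in CSS pixels
  const x = clientX - rect.left;
  const y = clientY - rect.top;
  return {
    // 0..width maps to -1..+1
    clipX: x / rect.width * 2 - 1,
    // 0..height maps to +1..-1 because clip space Y points up
    clipY: y / rect.height * -2 + 1,
  };
}
```

The center of the canvas maps to (0, 0), the top left corner to (-1, +1) and the bottom right corner to (+1, -1).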

start and end are now relative to position (the data in your geometry), so you now have to use some ray-to-triangle code in JavaScript to walk through all your triangles and see if the ray from start to end intersects one or more of them.
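
For the ray to triangle part, one common choice is the Möller–Trumbore test. The version below is a minimal sketch of that technique in plain JavaScript, not code from twgl; the function and helper names are made up for illustration, and points are assumed to be arrays of 3 numbers.

```javascript
// Small vector helpers for 3 element arrays (hypothetical names).
const v3sub = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const v3dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const v3cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];

// Möller–Trumbore intersection, limited to the segment start..end.
// Returns t in 0..1 such that start + t * (end - start) lies on the
// triangle (v0, v1, v2), or null if the segment misses it.
function segmentTriangle(start, end, v0, v1, v2) {
  const EPSILON = 1e-7;
  const dir = v3sub(end, start);
  const edge1 = v3sub(v1, v0);
  const edge2 = v3sub(v2, v0);
  const p = v3cross(dir, edge2);
  const det = v3dot(edge1, p);
  if (Math.abs(det) < EPSILON) return null;  // segment is parallel to the triangle
  const invDet = 1 / det;
  const s = v3sub(start, v0);
  const u = v3dot(s, p) * invDet;            // first barycentric coordinate
  if (u < 0 || u > 1) return null;
  const q = v3cross(s, edge1);
  const v = v3dot(dir, q) * invDet;          // second barycentric coordinate
  if (v < 0 || u + v > 1) return null;
  const t = v3dot(edge2, q) * invDet;        // distance along the segment
  return t >= 0 && t <= 1 ? t : null;
}
```

You'd call this once per triangle with the same start and end and keep the hit with the smallest t, since that is the one closest to the camera.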

Note: if all you want is a ray in world space, not position space, then you'd use

  const invMat = m4.inverse(viewProjection);
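
For the skybox case in the question you may not need per-triangle tests at all. Assuming the cube is axis aligned in world space, centered on the origin, and the ray starts inside it (the camera is at [0, 0, 0]), the face that gets hit is the first of the three planes the ray is heading toward. Here is a rough sketch; `hitCubeFromInside` and `halfSize` are names I invented, and `start` and `end` are the world space points from above.

```javascript
// Hypothetical helper: intersects a world space ray with an axis aligned cube
// centered on the origin. Assumes `start` is inside the cube.
// Returns the face ('+x', '-x', '+y', ...), the hit point, and t, where
// point = start + t * (end - start); returns null if the ray has no direction.
function hitCubeFromInside(start, end, halfSize) {
  const dir = [end[0] - start[0], end[1] - start[1], end[2] - start[2]];
  let bestT = Infinity;
  let bestAxis = -1;
  for (let axis = 0; axis < 3; ++axis) {
    if (dir[axis] === 0) continue;  // never reaches either plane on this axis
    // from inside the cube we can only hit the plane we are moving toward
    const plane = dir[axis] > 0 ? halfSize : -halfSize;
    const t = (plane - start[axis]) / dir[axis];
    if (t >= 0 && t < bestT) {
      bestT = t;
      bestAxis = axis;
    }
  }
  if (bestAxis < 0) return null;
  const point = [0, 1, 2].map((i) => start[i] + dir[i] * bestT);
  const sign = dir[bestAxis] > 0 ? '+' : '-';
  return { face: sign + 'xyz'[bestAxis], point, t: bestT };
}
```

The returned point is in world space, so it can be stored as the annotation's position and projected back to the screen every frame with viewProjection.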

"use strict";

const vs = `
uniform mat4 u_world;
uniform mat4 u_viewProjection;

attribute vec4 position;
attribute vec2 texcoord;
attribute vec4 color;

varying vec4 v_position;
varying vec2 v_texcoord;
varying vec4 v_color;

void main() {
  v_texcoord = texcoord;
  v_color = color;
  gl_Position = u_viewProjection * u_world * position;
}
`;

const fs = `
precision mediump float;

varying vec2 v_texcoord;
varying vec4 v_color;

uniform sampler2D tex;

void main() {
  gl_FragColor = texture2D(tex, v_texcoord) * v_color;
}
`;

const m4 = twgl.m4;
const gl = document.querySelector("#c").getContext("webgl");

// compiles shaders, links, looks up locations
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);

const cubeArrays = twgl.primitives.createCubeVertices(1);
cubeArrays.color = {value: [0.2, 0.3, 1, 1]};
// calls gl.createBuffer, gl.bindBuffer, gl.bufferData
// for each array
const cubeBufferInfo = twgl.createBufferInfoFromArrays(gl, cubeArrays);

const numLines = 50;
const positions = new Float32Array(numLines * 3 * 2);
const colors = new Float32Array(numLines * 4 * 2);
// calls gl.createBuffer, gl.bindBuffer, gl.bufferData
// for each array
const linesBufferInfo = twgl.createBufferInfoFromArrays(gl, {
  position: positions,
  color: colors,
  texcoord: { value: [0, 0], },
});

const tex = twgl.createTexture(gl, {
  minMag: gl.NEAREST,
  format: gl.LUMINANCE,
  src: [
    255, 192,
    192, 255,
  ],
});

let clipX = 0;
let clipY = 0;
let lineNdx = 0;

function render(time) {
  time *= 0.001;
  twgl.resizeCanvasToDisplaySize(gl.canvas);
  gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);

  gl.enable(gl.DEPTH_TEST);
  gl.enable(gl.CULL_FACE);
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

  const fov = 30 * Math.PI / 180;
  const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
  const zNear = 1;
  const zFar = 10;
  const projection = m4.perspective(fov, aspect, zNear, zFar);

  const eye = [Math.cos(time), Math.sin(time), 6];
  const target = [0, 0, 0];
  const up = [0, 1, 0];
  const camera = m4.lookAt(eye, target, up);
  
  const view = m4.inverse(camera);
  const viewProjection = m4.multiply(projection, view);
  const world = m4.rotateX(m4.rotationY(1), 1);

  gl.useProgram(programInfo.program);
  // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
  twgl.setBuffersAndAttributes(gl, programInfo, cubeBufferInfo);
  twgl.setUniformsAndBindTextures(programInfo, {
    tex,
    u_world: world,
    u_viewProjection: viewProjection,
    color: [0.2, 0.3, 1, 1],
  });
  // calls gl.drawArrays or gl.drawElements
  twgl.drawBufferInfo(gl, cubeBufferInfo);

  // add a line in world space
  const invMat = m4.inverse(viewProjection);
  const start = m4.transformPoint(invMat, [clipX, clipY, -1]);
  const end   = m4.transformPoint(invMat, [clipX, clipY,  1]);
  const poffset = lineNdx * 3 * 2;
  const coffset = lineNdx * 4 * 2;
  const color = [Math.random(), Math.random(), Math.random(), 1];
  positions.set(start, poffset);
  positions.set(end, poffset + 3);
  colors.set(color, coffset);
  colors.set(color, coffset + 4);

  gl.bindBuffer(gl.ARRAY_BUFFER, linesBufferInfo.attribs.position.buffer);
  gl.bufferSubData(gl.ARRAY_BUFFER, 0, positions);
  gl.bindBuffer(gl.ARRAY_BUFFER, linesBufferInfo.attribs.color.buffer);
  gl.bufferSubData(gl.ARRAY_BUFFER, 0, colors);

  lineNdx = (lineNdx + 1) % numLines;  

  // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
  twgl.setBuffersAndAttributes(gl, programInfo, linesBufferInfo);
  twgl.setUniformsAndBindTextures(programInfo, {
    tex,
    u_world: m4.identity(),
    u_viewProjection: viewProjection,
    color: [1, 0, 0, 1],
  });
  // calls gl.drawArrays or gl.drawElements
  twgl.drawBufferInfo(gl, linesBufferInfo, gl.LINES);

  requestAnimationFrame(render);
}
requestAnimationFrame(render);


gl.canvas.addEventListener('mousemove', (e) => {
   const canvas = gl.canvas;
   const rect = canvas.getBoundingClientRect();
   const x = e.clientX - rect.left;
   const y = e.clientY - rect.top;

   clipX = x / rect.width  *  2 - 1;
   clipY = y / rect.height * -2 + 1;
});

body { margin: 0; }
canvas { width: 100vw; height: 100vh; display: block; }

<canvas id="c"></canvas>
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>

As for WebGL info there is some here

Thank you for the detailed answer. Since I didn't have much time for the project, I decided to go for something a bit less accurate and a bit heavier, but easier to implement. What I did was divide the cube into small triangles that have different colors (but are not drawn), and from there, when a mouse click happens, I can read the color of the pixel and determine where I clicked. Since I only have this one cube it is doing OK, but it is certainly not the best solution. If I have some time left before the deadline I can try your approach, which I assume would be far better. — Poyr23, Feb 11 '20 at 02:09
So that is a different question. Your question was "Get 3D coordinates of a mouse click in WebGL"; it was not "how do I pick things in WebGL". For that, [here's one solution](https://stackoverflow.com/questions/51747996/on-the-browser-how-to-plot-100k-series-with-64-128-points-each/51757743#51757743) — gman, Feb 11 '20 at 03:12
I never asked a different question, I was simply stating that I decided to go for a different approach, the question is still the same. And I thanked you for the answer. — Poyr23, Feb 11 '20 at 03:27
