Transpose z-position from perspective to orthographic camera in three.js

Question

I have a scene where I want to combine perspective objects (i.e. objects that appear smaller when they are far away) with orthographic objects (i.e. objects that appear the same size irrespective of distance). The perspective objects are part of the rendered "world", while the orthographic objects are adornments, like labels or icons. Unlike a HUD, I want the orthographic objects to be rendered "within" the world, which means that they can be covered by world objects (imagine a plane passing in front of a label).

My solution is to use one renderer, but two scenes, one with a PerspectiveCamera and one with an OrthographicCamera. I render them in sequence without clearing the z buffer (the renderer's autoClear property is set to false). The problem that I am facing is that I need to synchronize the placement of the objects in each scene so that an object in one scene is assigned a z-position that is behind objects in the other scene that are in front of it, but in front of objects that are behind it.

To do that, I am designating my perspective scene as the "leading" scene, i.e. the coordinates of all objects (perspective and orthographic) are assigned based on this scene. The perspective objects use these coordinates directly and are rendered within that scene with the perspective camera. The coordinates of the orthographic objects are transformed to coordinates in the orthographic scene and then rendered in that scene with the orthographic camera. I do the transformation by projecting the coordinates in the perspective scene to the perspective camera's view plane and then back to the orthographic scene with the orthographic camera:

position.project(perspectiveCamera).unproject(orthographicCamera);

Alas, this is not working as intended. The orthographic objects are always rendered in front of the perspective objects even if they should be between them. Consider this example, in which the blue circle should be displayed behind the red square, but in front of the green square (which it isn't):

var pScene = new THREE.Scene();
var oScene = new THREE.Scene();

var pCam = new THREE.PerspectiveCamera(40, window.innerWidth / window.innerHeight, 1, 1000);
pCam.position.set(0, 40, 50);
pCam.lookAt(new THREE.Vector3(0, 0, -50));

var oCam = new THREE.OrthographicCamera(window.innerWidth / -2, window.innerWidth / 2, window.innerHeight / 2, window.innerHeight / -2, 1, 500);
oCam.Position = pCam.position.clone();

pScene.add(pCam);
pScene.add(new THREE.AmbientLight(0xFFFFFF));

oScene.add(oCam);
oScene.add(new THREE.AmbientLight(0xFFFFFF));

var frontPlane = new THREE.Mesh(new THREE.PlaneGeometry(20, 20), new THREE.MeshBasicMaterial( { color: 0x990000 }));
frontPlane.position.z = -50;
pScene.add(frontPlane);

var backPlane = new THREE.Mesh(new THREE.PlaneGeometry(20, 20), new THREE.MeshBasicMaterial( { color: 0x009900 }));
backPlane.position.z = -100;
pScene.add(backPlane);

var circle = new THREE.Mesh(new THREE.CircleGeometry(60, 20), new THREE.MeshBasicMaterial( { color: 0x000099 }));
circle.position.z = -75;

//Transform position from perspective camera to orthogonal camera -> doesn't work, the circle is displayed in front
circle.position.project(pCam).unproject(oCam);

oScene.add(circle);

var renderer = new THREE.WebGLRenderer();
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

renderer.autoClear = false;
renderer.render(oScene, oCam);
renderer.render(pScene, pCam);

You can try out the code here.

In the perspective world the (world) z-position of the circle is -75, which is between the squares (-50 and -100). But it is actually displayed in front of both squares. If you manually set the circle's z-position (in the orthographic scene) to -500 it is displayed between the squares, so with the right positioning, what I'm trying to do should be possible in principle.

I know that I cannot render a scene identically with orthographic and perspective cameras. My intention is to reposition all orthographic objects before each rendering so that they appear to be at the right position.

What do I have to do to calculate the orthographic coordinates from the perspective coordinates so that the objects are rendered with the right depth values?

UPDATE:

I have added an answer with my current solution to the problem in case someone has a similar problem. However, this solution does not provide the same quality as the orthographic camera, so I would still be happy if someone could explain why the orthographic camera does not work as expected and/or provide a solution to the problem.

Would you be willing to use just one scene and one perspective camera and achieve the same effect by adjusting the scale of the labels/sprites on the fly? https://stackoverflow.com/a/28875549/1461008 — WestLangley, Oct 19 '17 at 15:35
Thanks for the suggestion. I came across this option (and this actual Q&A) during my research. The reason I didn't go for it is that I don't only need the adornments to be of the same size, I need them to be of an *exact* size, which is easy with an orthographic camera. But I don't really care about the proposed solution, so if there is a way to determine the *exact* scaling factor from the expected screen size, rendering in the same scene and scaling up would be fine, too. — Sefe, Oct 19 '17 at 18:49
In your app, are the adornments sprites or meshes? Exact size in what units -- world units or pixels? — WestLangley, Oct 19 '17 at 20:14
I prefer to use sprites, but I can use a planar mesh, too. My base is pixels, since I can be sure that texts, icons etc are rendered in their original sizes. With an orthogonal camera, I can set the scale of the sprite (or mesh) to size in pixels and the projection will be always right (that's the part that is working). But if I could get a scaling factor that would give me the same result in a single scene, that would also work for me. Seems also like the much simpler solution. — Sefe, Oct 19 '17 at 20:46
As a first attempt, use one scene/camera and this: `var vector = new THREE.Vector3(); // create once and reuse; var dist = vector.setFromMatrixPosition( sprite.matrixWorld ).distanceTo( camera.position ); sprite.scale.x = sprite.scale.y = dist / scaleFactor;` — WestLangley, Oct 19 '17 at 21:33
I will try that. What is `scaleFactor`? Is it the scale of the sprite in pixels (like with the orthogonal camera) or something else? If it is something else, how do I calculate it? — Sefe, Oct 20 '17 at 05:12
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/157125/discussion-between-sefe-and-westlangley). — Sefe, Oct 20 '17 at 08:00

Rabbid76 · Accepted Answer · 2019-03-22T21:39:37.660

You are very close to the result you expected. You forgot to update the camera matrices, which have to be calculated so that the operations project and unproject can work properly:

pCam.updateMatrixWorld ( false );
oCam.updateMatrixWorld ( false );
circle.position.project(pCam).unproject(oCam);

Explanation:

In a rendering, each mesh of the scene usually is transformed by the model matrix, the view matrix and the projection matrix.

Projection matrix:
The projection matrix describes the mapping from 3D points of a scene, to 2D points of the viewport. The projection matrix transforms from view space to the clip space, and the coordinates in the clip space are transformed to the normalized device coordinates (NDC) in the range (-1, -1, -1) to (1, 1, 1) by dividing with the w component of the clip coordinates.
View matrix:
The view matrix describes the direction and position from which the scene is looked at. The view matrix transforms from world space to view (eye) space. In the view space coordinate system, the X-axis points to the right, the Y-axis up and the Z-axis out of the view (note that in a right-handed system the Z-axis is the cross product of the X-axis and the Y-axis).
Model matrix:
The model matrix defines the location, orientation and relative size of a mesh in the scene. The model matrix transforms the vertex positions of the mesh to world space.

Whether a fragment is drawn "behind" or "in front of" another fragment depends on the depth value of the fragment. While for orthographic projection the Z coordinate of the view space is mapped linearly to the depth value, in perspective projection the mapping is not linear.

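In general, the depth value is calculated as follows, where nearZ and farZ are the bounds of the depth range (0 and 1 by default), not the clipping planes of the camera: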

float ndc_depth = clip_space_pos.z / clip_space_pos.w;
float depth = (((farZ-nearZ) * ndc_depth) + nearZ + farZ) / 2.0;
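
As a minimal sketch, the two lines above can be written as a plain JavaScript function. The function name `depthFromClip` and the default depth range of 0 to 1 are assumptions made for illustration; they are not part of three.js.

```javascript
// Window-space depth from clip-space z and w.
// depthNear/depthFar are the depth-range bounds (assumed defaults: 0 and 1),
// not the near/far clipping planes of the camera.
function depthFromClip(clipZ, clipW, depthNear = 0.0, depthFar = 1.0) {
  const ndcDepth = clipZ / clipW; // perspective divide, result in [-1, 1]
  return ((depthFar - depthNear) * ndcDepth + depthNear + depthFar) / 2.0;
}

console.log(depthFromClip(-1, 1)); // point on the near plane
console.log(depthFromClip(0, 1));  // NDC z of 0
console.log(depthFromClip(1, 1));  // point on the far plane
```

With the default depth range this reduces to depth = z_ndc * 0.5 + 0.5, so the near plane maps to 0 and the far plane to 1.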

The projection matrix describes the mapping from 3D points of a scene, to 2D points of the viewport. It transforms from eye space to the clip space, and the coordinates in the clip space are transformed to the normalized device coordinates (NDC) by dividing with the w component of the clip coordinates.

Orthographic Projection

At Orthographic Projection the coordinates in the eye space are linearly mapped to normalized device coordinates.

Orthographic Projection Matrix:

r = right, l = left, b = bottom, t = top, n = near, f = far 

2/(r-l)         0               0               0
0               2/(t-b)         0               0
0               0               -2/(f-n)        0
-(r+l)/(r-l)    -(t+b)/(t-b)    -(f+n)/(f-n)    1

At Orthographic Projection, the Z component is calculated by the linear function:

z_ndc = z_eye * -2/(f-n) - (f+n)/(f-n)
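
The linear formula can be tried out with a few lines of JavaScript. This is only a sketch: `orthoNdcZ` is a hypothetical helper name, and the near/far values 1 and 500 are taken from the orthographic camera in the question.

```javascript
// NDC z of an orthographic projection for a view-space z (negative in front of the camera).
function orthoNdcZ(zEye, n, f) {
  return zEye * -2 / (f - n) - (f + n) / (f - n);
}

const n = 1;
const f = 500;
// Near plane, the midpoint between the planes, and the far plane.
for (const zEye of [-n, -(n + f) / 2, -f]) {
  console.log(zEye, orthoNdcZ(zEye, n, f));
}
```

The near plane ends up at -1, the far plane at 1, and the midpoint between them at 0: equal steps in view-space z give equal steps in NDC z.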

Perspective Projection

At Perspective Projection the projection matrix describes the mapping from 3D points in the world, as they are seen from a pinhole camera, to 2D points of the viewport.
The eye space coordinates in the camera frustum (a truncated pyramid) are mapped to a cube (the normalized device coordinates).

Perspective Projection Matrix:

r = right, l = left, b = bottom, t = top, n = near, f = far

2*n/(r-l)      0              0                0
0              2*n/(t-b)      0                0
(r+l)/(r-l)    (t+b)/(t-b)    -(f+n)/(f-n)    -1    
0              0              -2*f*n/(f-n)     0

At Perspective Projection, the Z component is calculated by the rational function:

z_ndc = ( -z_eye * (f+n)/(f-n) - 2*f*n/(f-n) ) / -z_eye
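
To see how non-linear this mapping is, the formula can be evaluated for a few view-space depths. The sketch below uses a hypothetical helper `perspNdcZ` and the near/far values 1 and 1000 of the perspective camera in the question; the sample depths are illustrative view-space values, not the world positions of the planes.

```javascript
// NDC z of a perspective projection for a view-space z (negative in front of the camera).
function perspNdcZ(zEye, n, f) {
  return (-zEye * (f + n) / (f - n) - 2 * f * n / (f - n)) / -zEye;
}

const n = 1;
const f = 1000;
for (const zEye of [-1, -2, -50, -100, -1000]) {
  console.log(zEye, perspNdcZ(zEye, n, f).toFixed(3));
}
```

With these planes, NDC z already crosses 0 at about two units from the camera, and view-space depths of 50 and 100 land at roughly 0.962 and 0.982. Most of the depth range is spent close to the near plane, which is why a z value copied unchanged into an orthographic scene produces a very different depth.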

See a detailed description at the answer to the Stack Overflow question How to render depth linearly in modern OpenGL with gl_FragCoord.z in fragment shader?

In your case this means that you have to choose the Z coordinate of the circle in the orthographic projection in such a way that its depth value lies in between the depths of the objects in the perspective projection.
Since the depth value is nothing other than depth = z_ndc * 0.5 + 0.5 in both cases, it is also possible to do the calculations with normalized device coordinates instead of depth values.

The normalized device coordinates can easily be calculated with the project function (a method of THREE.Vector3), called with the THREE.PerspectiveCamera. project converts from world space to view space and from view space to normalized device coordinates.

To find a Z coordinate that lies in between in the orthographic projection, the in-between normalized device Z coordinate has to be transformed to a view space Z coordinate. This can be done with the unproject function (a method of THREE.Vector3), called with the THREE.OrthographicCamera. unproject converts from normalized device coordinates to view space and from view space to world space.
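
For the Z component alone, the project/unproject round trip can be sketched without three.js by combining the two formulas above. This is a simplified illustration, not the library's implementation: it ignores X, Y and the camera transforms, the helper names are hypothetical, and the near/far values and sample depths are assumptions based on the cameras in the question.

```javascript
// NDC z for a perspective camera (rational function from above).
function perspNdcZ(zEye, n, f) {
  return (-zEye * (f + n) / (f - n) - 2 * f * n / (f - n)) / -zEye;
}

// NDC z for an orthographic camera (linear function from above).
function orthoNdcZ(zEye, n, f) {
  return zEye * -2 / (f - n) - (f + n) / (f - n);
}

// Inverse of orthoNdcZ: the view-space z that yields a given NDC z.
function orthoEyeZFromNdc(zNdc, n, f) {
  return -(zNdc * (f - n) + (f + n)) / 2;
}

// View-space z for the orthographic camera that produces the same depth
// as zEyePersp does under the perspective camera.
function matchOrthoEyeZ(zEyePersp, persp, ortho) {
  const zNdc = perspNdcZ(zEyePersp, persp.near, persp.far);
  return orthoEyeZFromNdc(zNdc, ortho.near, ortho.far);
}

const persp = { near: 1, far: 1000 };
const ortho = { near: 1, far: 500 };

for (const zEye of [-50, -75, -100]) {
  console.log(zEye, matchOrthoEyeZ(zEye, persp, ortho).toFixed(2));
}
```

Under these assumptions the three depths map to roughly -490.5, -493.8 and -495.5 in the orthographic view space. The front-to-back order is preserved, but all three values sit close to the orthographic far plane.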

See further OpenGL - Mouse coordinates to Space coordinates.

See the example:

var renderer, pScene, oScene, pCam, oCam, frontPlane, backPlane, circle;

  var init = function () {
    pScene = new THREE.Scene();
    oScene = new THREE.Scene();
    
    pCam = new THREE.PerspectiveCamera(40, window.innerWidth / window.innerHeight, 1, 1000);
    pCam.position.set(0, 40, 50);
    pCam.lookAt(new THREE.Vector3(0, 0, -50));
    
    oCam = new THREE.OrthographicCamera(window.innerWidth / -2, window.innerWidth / 2, window.innerHeight / 2, window.innerHeight / -2, 1, 500);
    oCam.position.copy(pCam.position); // "oCam.Position = ..." only created an unused property
    
    pScene.add(pCam);
    pScene.add(new THREE.AmbientLight(0xFFFFFF));
    
    oScene.add(oCam);
    oScene.add(new THREE.AmbientLight(0xFFFFFF));
    
    
    frontPlane = new THREE.Mesh(new THREE.PlaneGeometry(20, 20), new THREE.MeshBasicMaterial( { color: 0x990000 }));
    frontPlane.position.z = -50;
    pScene.add(frontPlane);
    
    backPlane = new THREE.Mesh(new THREE.PlaneGeometry(20, 20), new THREE.MeshBasicMaterial( { color: 0x009900 }));
    backPlane.position.z = -100;
    pScene.add(backPlane);

    circle = new THREE.Mesh(new THREE.CircleGeometry(20, 20), new THREE.MeshBasicMaterial( { color: 0x000099 }));
    circle.position.z = -75;

    
    // Transform position from perspective camera to orthographic camera; the camera matrices have to be updated first
    pCam.updateMatrixWorld ( false );
    oCam.updateMatrixWorld ( false );
    circle.position.project(pCam).unproject(oCam);
    
    oScene.add(circle);
    
    renderer = new THREE.WebGLRenderer();
    renderer.setSize(window.innerWidth, window.innerHeight);
    document.body.appendChild(renderer.domElement);
  };
  
  var render = function () {
  
    renderer.autoClear = false;
    renderer.render(oScene, oCam);
    renderer.render(pScene, pCam);
  };
  
  var animate = function () {
      requestAnimationFrame(animate);
      //controls.update();
      render();
  };
  
  
  init();
  animate();

html,body {
    height: 100%;
    width: 100%;
    margin: 0;
    overflow: hidden;
}

<script src="https://threejs.org/build/three.min.js"></script>

Thanks! Very nice answer. It doesn't only solve my problem, it explains the issue comprehensively. I understand now why it didn't work and what I have to do in similar cases in the future. — Sefe, Nov 01 '17 at 12:31

Sefe · Answer 2 · 2019-01-23T08:03:23.763

I have found a solution that involves only the perspective camera and scales the adornments according to their distance to the camera. It is similar to the answer posted to a similar question, but not quite the same. My specific issue is that I don't only need the adornments to be the same size independent of their distance to the camera, I also need to control their exact size on screen.

To scale them to the exact size, rather than to just any constant size, I use the on-screen position function found in this answer: I calculate the screen positions of both ends of a vector of known length and measure the length of its projection on the screen. From the ratio of the two lengths I can calculate the exact scaling factor:

var widthVector = new THREE.Vector3( 100, 0, 0 );
widthVector.applyEuler(pCam.rotation);

var baseX = getScreenPosition(circle, pCam).x;
circle.position.add(widthVector);
var referenceX = getScreenPosition(circle, pCam).x;
circle.position.sub(widthVector);

var scale = 100 / (referenceX - baseX);
circle.scale.set(scale, scale, scale);

The problem with this solution is that, although in most cases the calculation is precise enough to provide an exact size, every now and then a rounding error makes the adornment render incorrectly.
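
As a possible alternative to measuring the projection, the scale can also be derived analytically from the frustum geometry. The following is a sketch under stated assumptions, not the method used above: the helper names are hypothetical, `depth` is assumed to be the distance of the object along the camera's viewing direction (not the straight-line distance to the camera), and `fovDeg` is assumed to be the vertical field of view in degrees, as in THREE.PerspectiveCamera.

```javascript
// World units covered by one screen pixel at a given view-space depth.
function worldUnitsPerPixel(depth, fovDeg, viewportHeightPx) {
  const halfFov = (fovDeg * Math.PI / 180) / 2;
  const visibleHeight = 2 * depth * Math.tan(halfFov); // frustum height at that depth
  return visibleHeight / viewportHeightPx;
}

// Scale for an object that is 1 world unit tall so that it covers sizePx pixels.
function scaleForPixelSize(sizePx, depth, fovDeg, viewportHeightPx) {
  return sizePx * worldUnitsPerPixel(depth, fovDeg, viewportHeightPx);
}

// Example: a 100 px adornment at depth 75, with a 40 degree camera and an 800 px high viewport.
console.log(scaleForPixelSize(100, 75, 40, 800));
```

Because the result depends only on the depth, the field of view and the viewport height, it avoids moving the object back and forth to measure it, but it has to be recomputed whenever the camera or the viewport changes.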
