New Answer
After clarifications in the comments, the question being asked can be summed up as:
How do I effectively transform a quad in terms of pixels for use in a GUI?
As mentioned in the original question, the simplest approach to this will be using an Orthographic Projection. What is an Orthographic Projection?
a method of projection in which an object is depicted or a surface mapped using parallel lines to project its shape onto a plane.
In practice, you may think of this as a 2D projection. Distance plays no role, and the OpenGL coordinates map to pixel coordinates. See this answer for a bit more information.
By using an Orthographic Projection instead of a Perspective Projection you can start thinking of all of your transformations in terms of pixels.
Instead of defining a quad as (25 x 25) world units in dimension, it is (25 x 25) pixels in dimension.
Or instead of translating by 50 world units along the world x-axis, you translate by 50 pixels along the screen x-axis (to the right).
So how do you create an Orthographic Projection?
First, they are usually defined using the following parameters:
- left: X coordinate of the left vertical clipping plane
- right: X coordinate of the right vertical clipping plane
- bottom: Y coordinate of the bottom horizontal clipping plane
- top: Y coordinate of the top horizontal clipping plane
- near: Near depth clipping plane
- far: Far depth clipping plane
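Remember, all units are in pixels. A typical Orthographic Projection would be defined as follows. Passing windowHeight as bottom and 0.0 as top places the origin at the top-left corner of the window, with y increasing downward:
glOrtho(0.0, windowWidth, windowHeight, 0.0, 0.0, 1.0);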
Assuming you do not (or cannot) make use of glOrtho (you have your own Matrix class or another reason), then you must calculate the Orthographic Projection matrix yourself.
The Orthographic Matrix is defined as follows, where l, r, b, t, n and f stand for left, right, bottom, top, near and far:
2/(r-l) 0 0 -(r+l)/(r-l)
0 2/(t-b) 0 -(t+b)/(t-b)
0 0 -2/(f-n) -(f+n)/(f-n)
0 0 0 1
Source A, Source B
At this point I recommend using a pre-made mathematics library unless you are determined to use your own. One of the most common sources of bugs I see in practice is matrix math, and the less time you spend debugging matrices, the more time you have to focus on other, more fun endeavors.
GLM is a widely used and respected library that is built to model GLSL functionality. The GLM implementation of glOrtho can be seen here at line 100.
How to use an Orthographic Projection?
Orthographic projections are commonly used to render a GUI on top of your 3D scene. This can be done easily enough by using the following pattern:
- Clear Buffers
- Apply your Perspective Projection Matrix
- Render your 3D objects
- Apply your Orthographic Projection Matrix
- Render your 2D/GUI objects
- Swap Buffers
Old Answer
Note that this answered the wrong question. It assumed the question boiled down to "How do I convert from Screen Space to NDC Space?". It is left in case someone searching comes upon this question looking for that answer.
The goal is to convert from Screen Space to NDC Space. So let's first define what those spaces are, and then we can create a conversion.
Normalized Device Coordinates
NDC space is simply the result of performing perspective division on our vertices in clip space.
clip.xyz /= clip.w
Where clip is the coordinate in clip space.
What this does is place all of our un-clipped vertices into a cube spanning the range [-1, 1] on all axes, with the screen center at (0, 0, 0). Any vertices that are clipped (lie outside the view frustum) are not within this cube and are tossed away by the GPU.
In OpenGL this step is done automatically as part of Primitive Assembly (D3D11 does this in the Rasterizer Stage).
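The GPU performs this step for you, so the following is only a sketch to make the arithmetic concrete. ClipPos, NdcPos, perspectiveDivide and insideClipVolume are hypothetical names, and the code assumes clip.w is positive.

```cpp
#include <cmath>

// Illustrative types only; a real project would use its math library's vectors.
struct ClipPos { float x, y, z, w; };
struct NdcPos  { float x, y, z; };

// Perspective division: clip space -> normalized device coordinates.
// Assumes clip.w is non-zero.
NdcPos perspectiveDivide(const ClipPos& clip) {
    return { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w };
}

// True when -w <= x, y, z <= w, which for positive w is the same as
// every NDC component lying in [-1, 1].
bool insideClipVolume(const ClipPos& clip) {
    return std::fabs(clip.x) <= clip.w &&
           std::fabs(clip.y) <= clip.w &&
           std::fabs(clip.z) <= clip.w;
}
```

For example, the clip-space position (2, -1, 0.5) with w = 2 divides down to (1, -0.5, 0.25), which sits on the right edge of the NDC cube.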
Screen Coordinates
Screen coordinates are simply calculated by expanding the normalized coordinates to the confines of your viewport.
screen.x = ((view.w * 0.5) * ndc.x) + ((view.w * 0.5) + view.x)
screen.y = ((view.h * 0.5) * ndc.y) + ((view.h * 0.5) + view.y)
screen.z = (((view.f - view.n) * 0.5) * ndc.z) + ((view.f + view.n) * 0.5)
Where,
- screen is the coordinate in screen-space
- ndc is the coordinate in normalized-space
- view.x is the viewport x origin
- view.y is the viewport y origin
- view.w is the viewport width
- view.h is the viewport height
- view.f is the viewport far
- view.n is the viewport near
Converting from Screen to NDC
As we have the conversion from NDC to Screen above, it is easy to calculate the reverse.
ndc.x = (((2.0 * screen.x) - (2.0 * view.x)) / view.w) - 1.0
ndc.y = (((2.0 * screen.y) - (2.0 * view.y)) / view.h) - 1.0
ndc.z = ((2.0 * screen.z) - view.f - view.n) / (view.f - view.n)
Example:
viewport (x, y, w, h, n, f) = (0, 0, 800, 600, 1, 1000)
screen.xyz = (400, 300, 200)
ndc.xyz = (0.0, 0.0, -0.602)
screen.xyz = (575, 100, 1)
ndc.xyz = (0.4375, -0.667, -1.0)
Further Reading
For more information on all of the transform spaces, read OpenGL Transformation.
Edit for Comment
In the comment on the original question, Bo specifies screen-space origin as top-left.
For OpenGL, the viewport origin (and thus screen-space origin) lies at the bottom-left. See glViewport.
If your pixel coordinates are truly top-left origin then that needs to be taken into account when transforming screen.y to ndc.y.
ndc.y = 1.0 - (((2.0 * screen.y) - (2.0 * view.y)) / view.h)
This is needed if you are transforming, say, a coordinate of a mouse-click on screen/gui into NDC space (as part of a full transform to world space).