How exactly does OpenGL do perspectively correct linear interpolation?

Question

If linear interpolation happens during the rasterization stage in the OpenGL pipeline, and the vertices have already been transformed to screen-space, where does the depth information used for perspectively correct interpolation come from?

Can anybody give a detailed description of how OpenGL goes from screen-space primitives to fragments with correctly interpolated values?

You might find [this example](https://webglfundamentals.org/webgl/lessons/webgl-3d-perspective-correct-texturemapping.html) informative — gman, Feb 21 '18 at 09:38

Yakov Galka · Accepted Answer · 2022-12-01T01:28:15.280

The output of a vertex shader is a four component vector, vec4 gl_Position. From Section 13.6 Coordinate Transformations of core GL 4.4 spec:

Clip coordinates for a vertex result from shader execution, which yields a vertex coordinate gl_Position.

Perspective division on clip coordinates yields normalized device coordinates, followed by a viewport transformation (see section 13.6.1) to convert these coordinates into window coordinates.

OpenGL does the perspective divide as

device.xyz = gl_Position.xyz / gl_Position.w

But then keeps the 1 / gl_Position.w as the last component of gl_FragCoord:

gl_FragCoord.xyz = device.xyz scaled to viewport
gl_FragCoord.w = 1 / gl_Position.w

This transform is bijective, so no depth information is lost. In fact as we see below, the 1 / gl_Position.w is crucial for perspective correct interpolation.

Short introduction to barycentric coordinates

Given a triangle (P0, P1, P2) one can parametrize all the points inside the triangle by the linear combinations of the vertices:

P(b0,b1,b2) = P0*b0 + P1*b1 + P2*b2

where b0 + b1 + b2 = 1 and b0 ≥ 0, b1 ≥ 0, b2 ≥ 0.

Given a point P inside the triangle, the coefficients (b0, b1, b2) that satisfy the equation above are called the barycentric coordinates of that point. For non-degenerate triangles they are unique, and can be calculated as quotients of the areas of the following triangles:

b0(P) = area(P, P1, P2) / area(P0, P1, P2)
b1(P) = area(P0, P, P2) / area(P0, P1, P2)
b2(P) = area(P0, P1, P) / area(P0, P1, P2)

Each bi can be thought of as 'how much of Pi has to be mixed in'. So b = (1,0,0), (0,1,0) and (0,0,1) are the vertices of the triangle, (1/3, 1/3, 1/3) is the barycenter, and so on.

Given an attribute (f0, f1, f2) on the vertices of the triangle, we can now interpolate it over the interior:

f(P) = f0*b0(P) + f1*b1(P) + f2*b2(P)

This is a linear function of P, therefore it is the unique linear interpolant over the given triangle. The math also works in either 2D or 3D.

Perspective correct interpolation

Let's say we fill a projected 2D triangle on the screen. For every fragment we have its window coordinates. First we calculate its barycentric coordinates by inverting the P(b0,b1,b2) function, which is a linear function in window coordinates. This gives us the barycentric coordinates of the fragment on the 2D triangle projection.

Perspective correct interpolation of an attribute would vary linearly in the clip coordinates (and by extension, world coordinates). For that we need to get the barycentric coordinates of the fragment in clip space.

As it happens (see [1] and [2]), the depth of the fragment is not linear in window coordinates, but the depth inverse (1/gl_Position.w) is. Accordingly the attributes and the clip-space barycentric coordinates, when weighted by the depth inverse, vary linearly in window coordinates.

Therefore, we compute the perspective corrected barycentric by:

     ( b0 / gl_Position[0].w, b1 / gl_Position[1].w, b2 / gl_Position[2].w )
B = -------------------------------------------------------------------------
      b0 / gl_Position[0].w + b1 / gl_Position[1].w + b2 / gl_Position[2].w

and then use it to interpolate the attributes from the vertices.

Note: GL_NV_fragment_shader_barycentric exposes the device-linear barycentric coordinates through gl_BaryCoordNoPerspNV and the perspective corrected through gl_BaryCoordNV.

Implementation

Here is a C++ code that rasterizes and shades a triangle on the CPU, in a manner similar to OpenGL. I encourage you to compare it with the shaders listed below:

struct Renderbuffer { int w, h, ys; void *data; };
struct Vert { vec4 position, texcoord, color; };
struct Varying { vec4 texcoord, color; };

void vertex_shader(const Vert &in, vec4 &gl_Position, Varying &OUT) {
    OUT.texcoord = in.texcoord;
    OUT.color = in.color;
    gl_Position = vec4(in.position.x, in.position.y, -2*in.position.z - 2*in.position.w, -in.position.z);
}

void fragment_shader(vec4 &gl_FragCoord, const Varying &IN, vec4 &OUT) {
    OUT = IN.color;
    vec2 wrapped = IN.texcoord.xy - floor(IN.texcoord.xy);
    bool brighter = (wrapped[0] < 0.5) != (wrapped[1] < 0.5);
    if(!brighter)
        OUT.rgb *= 0.5f;
}

// render output unit/render operations pipeline
void rop(Renderbuffer &buf, int x, int y, const vec4 &c) {
    uint8_t *p = (uint8_t*)buf.data + buf.ys*(buf.h - y - 1) + 4*x;
    p[0] = linear_to_srgb8(c[0]);
    p[1] = linear_to_srgb8(c[1]);
    p[2] = linear_to_srgb8(c[2]);
    p[3] = lround(c[3]*255);
}

void draw_triangle(Renderbuffer &color_attachment, const box2 &viewport, const Vert *verts) {
    auto area = [](const vec2 &p0, const vec2 &p1, const vec2 &p2) { return cross(p1 - p0, p2 - p0); };
    auto interpolate = [](const auto a[3], auto p, const vec3 &coord) { return coord.x*a[0].*p + coord.y*a[1].*p + coord.z*a[2].*p; };

    Varying perVertex[3];
    vec4 gl_Position[3];

    box2 aabb = { viewport.hi, viewport.lo };
    for(int i = 0; i < 3; ++i) {
        vertex_shader(verts[i], gl_Position[i], perVertex[i]);

        // convert to normalized device coordinates
        gl_Position[i].w = 1/gl_Position[i].w;
        gl_Position[i].xyz *= gl_Position[i].w;

        // convert to window coordinates
        gl_Position[i].xy = mix(viewport.lo, viewport.hi, 0.5f*(gl_Position[i].xy + 1.0f));
        aabb = join(aabb, gl_Position[i].xy);
    }

    const float denom = 1/area(gl_Position[0].xy, gl_Position[1].xy, gl_Position[2].xy);

    // loop over all pixels in the rectangle bounding the triangle
    const ibox2 iaabb = lround(aabb);
    for(int y = iaabb.lo.y; y < iaabb.hi.y; ++y)
    for(int x = iaabb.lo.x; x < iaabb.hi.x; ++x)
    {
        vec4 gl_FragCoord;
        gl_FragCoord.xy = vec2(x, y) + 0.5f;

        // fragment barycentric coordinates in window coordinates
        const vec3 barycentric = denom*vec3(
            area(gl_FragCoord.xy, gl_Position[1].xy, gl_Position[2].xy),
            area(gl_Position[0].xy, gl_FragCoord.xy, gl_Position[2].xy),
            area(gl_Position[0].xy, gl_Position[1].xy, gl_FragCoord.xy)
        );

        // discard fragment outside the triangle. this doesn't handle edges correctly.
        if(barycentric.x < 0 || barycentric.y < 0 || barycentric.z < 0)
            continue;

        // interpolate inverse depth linearly
        gl_FragCoord.z = interpolate(gl_Position, &vec4::z, barycentric);
        gl_FragCoord.w = interpolate(gl_Position, &vec4::w, barycentric);

        // clip fragments to the near/far planes (as if by GL_ZERO_TO_ONE)
        if(gl_FragCoord.z < 0 || gl_FragCoord.z > 1)
            continue;

        // convert to perspective correct (clip-space) barycentric
        const vec3 perspective = 1/gl_FragCoord.w*barycentric*vec3(gl_Position[0].w, gl_Position[1].w, gl_Position[2].w);

        // interpolate attributes
        Varying varying = {
            interpolate(perVertex, &Varying::texcoord, perspective),
            interpolate(perVertex, &Varying::color, perspective),
        };

        vec4 color;
        fragment_shader(gl_FragCoord, varying, color);
        rop(color_attachment, x, y, color);
    }
}

int main(int argc, char *argv[]) {
    Renderbuffer buffer = { 512, 512, 512*4 };
    buffer.data = calloc(buffer.ys, buffer.h);

    // VAO interleaved attributes buffer
    Vert verts[] = {
        { { -1, -1, -2, 1 }, { 0, 0, 0, 1 }, { 0, 0, 1, 1 } },
        { { 1, -1, -1, 1 }, { 10, 0, 0, 1 }, { 1, 0, 0, 1 } },
        { { 0, 1, -1, 1 }, { 0, 10, 0, 1 }, { 0, 1, 0, 1 } },
    };

    box2 viewport = { 0, 0, buffer.w, buffer.h };
    draw_triangle(buffer, viewport, verts);

    stbi_write_png("out.png", buffer.w, buffer.h, 4, buffer.data, buffer.ys);
}

OpenGL shaders

Here are the OpenGL shaders used to generate the reference image.

Vertex shader:

#version 450 core
layout(location = 0) in vec4 position;
layout(location = 1) in vec4 texcoord;
layout(location = 2) in vec4 color;
out gl_PerVertex { vec4 gl_Position; };
layout(location = 0) out Varying { vec4 texcoord; vec4 color; } OUT;
void main() {
    OUT.texcoord = texcoord;
    OUT.color = color;
    gl_Position = vec4(position.x, position.y, -2*position.z - 2*position.w, -position.z);
}

Fragment shader:

#version 450 core
layout(location = 0) in Varying { vec4 texcoord; vec4 color; } IN;
layout(location = 0) out vec4 OUT;
void main() {
    OUT = IN.color;
    vec2 wrapped = fract(IN.texcoord.xy);
    bool brighter = (wrapped.x < 0.5) != (wrapped.y < 0.5);
    if(!brighter)
        OUT.rgb *= 0.5;
}

Results

Here are the almost identical images generated by the C++ (left) and OpenGL (right) code:

The differences are caused by different precision and rounding modes.

For comparison, here is one that is not perspective correct (uses barycentric instead of perspective for the interpolation in the code above):

Thank you! This is exactly the kind of answer I was hoping for! But I'm still having some trouble. Is one of the following points incorrect? 1. Proper interpolation of fragment attributes requires that the perspective division not be done yet, since meaningful w values are necessary for this. 2. Fragments (which correspond directly to pixels) cannot be generated until after the viewport transformation. 3. The viewport transformation is applied to Normalized Device Coordinates 4. Normalized Device Coordinates are acquired by performing the perspective division on clip coordinates. — AIGuy110, Jun 28 '14 at 02:53
Ah, so the clip-space coordinates of the vertices are saved and then retrieved after the perspective division? That makes sense. Thank you :). — AIGuy110, Jun 28 '14 at 16:24
@user1003620: What GL does here: The whole clip space coords are not stored., but the clip space `w` coordiante is. Actually, `gl_FragCoord.w` will contain the (per fragment linearily interpolated) `1/w` coordinate, which is kind of a by-produced from the perspective correction, and can be quite useful to habe at hands in the shader, also. — derhass, Jun 29 '14 at 12:52
Under the heading **How correct perspective interpolation is computed?**, should `w` be equal to `−z` and not `−1/z`? It seems to make more sense when `w = −z` since a point in clip space `(x, y, *, −z)` would be, post homogenization, `(u = x / −z, v = y / −z)`, which agrees with your `(u, v)` in the sentence before the one on `w`. — legends2k, Dec 07 '15 at 12:15
@legends2k: this is a different `w`. `w` in clip space is `-z`, but then the third coordinate after the homogeneous divide is `-1/z`. The 'inverse z' is frequently called w, e.g. as in W-buffer versus Z-buffer. — Yakov Galka, Dec 08 '15 at 00:06
This answer can do correct depth calculation but it will not do perspective-correct interpolation of vertex attributes. Following the math of this [post](https://www.scratchapixel.com/lessons/3d-basic-rendering/rasterization-practical-implementation/perspective-correct-interpolation-vertex-attributes) will result in the equation in @derhass answer below. — Petrakeas, Jan 02 '17 at 18:05
@ybungalobill you're right. At first glance, I thought you were using barycentric coordinates in screen space which didn't account for the perspective division. — Petrakeas, Jan 02 '17 at 18:34
@Petrakeas: even though my answer was technically correct, I rewrote it to use barycentric coordinates so it would be less confusing. Also an implementation is now included. — Yakov Galka, Apr 04 '19 at 19:32
When using this algorithm, the objects behind the camera are still visible (but strangely deformed). What could be the issue? — neoexpert, Jan 10 '23 at 10:13
@neoexpert it doesn't implement primitive clipping, only fragment clipping. — Yakov Galka, Jan 10 '23 at 14:50

derhass · Answer 2 · 2015-05-05T18:25:35.237

The formula that you will find in the GL specification (look on page 427; the link is the current 4.4 spec, but it has always been that way) for perspective-corrected interpolation of the attribute value in a triangle is:

   a * f_a / w_a   +   b * f_b / w_b   +  c * f_c / w_c
f=-----------------------------------------------------
      a / w_a      +      b / w_b      +     c / w_c

where a,b,c denote the barycentric coordinates of the point in the triangle we are interpolating for (a,b,c >=0, a+b+c = 1), f_i the attribute value at vertex i, and w_i the clip space w coordinate of vertex i. Note that the barycentric coordinates are calculated only for the 2D projection of the window space coords of the triangle (so z is ignored).

This is what the formulas that ybungalowbill gave in his fine answer boils down to, in the general case, with an arbitrary projection axis. Actually, the last row of the projection matrix defines just the projection axis the image plane will be orthogonal to, and the clip space w component is just the dot product between the vertex coords and that axis.

In the typical case, the projection matrix has (0,0,-1,0) as the last row, so it transfroms so that w_clip = -z_eye, and this is what ybungalowbill used. However, since w is what we actually will do the division by (that is the only nonlinear step in the whole transformation chain), this will work for any projection axis. It will also work in the trivial case of orthogonal projections where w is always 1 (or at least constant).

Note a few things for an efficient implementation of this. The inversion 1/w_i can be pre-calculated per vertex (let's call them q_i in the following), it does not have to be re-evaluated per fragment. And it is totally free since we divide by w anyway, when going into NDC space, so we can save that value. The GL spec does never describe how a certain feature is to be implemented internally, but the fact that the screen space coordinates will be accessible in glFragCoord.xyz, and gl_FragCoord.w is guaranteed to give the (lineariliy interpolated) 1/w clip space coordinate is quite revealing here. That per-fragment 1_w value is actually the denominator of the formula given above.
The factors a/w_a, b/w_b and c/w_c are each used two times in the formula. And these are also constant for any attribute value, now matter how many attributes there are to be interpolated. So, per fragment, you can calculate a'=q_a * a, b'=q_b * b and c'=q_c and get
```
  a' * f_a + b' * f_b + c' * f_c
f=------------------------------
           a' + b' + c'
```

So the perspective interpolation boils down to

3 additional multiplications,
2 additional additions, and
1 additional division

per fragment.

This answer was super helpful, way easier to implement than the accepted one. In the first formula you have w*c when you mean w_c though. Also the formula can be found on page 427 of that spec you linked for anyone else looking for it. There is a formula for the barycentric coords on http://en.wikipedia.org/wiki/Barycentric_coordinate_system and you only need to use the x,y values to calculate those. — Chris Pushbullet, Apr 08 '15 at 16:12
@christopherhesse: Thanks for your feedback. I updated the answer slightly. The formula is correct now, and I also mentioned that the barycentric coords have to be calculated based on the 2D projection of the triangle. I also fixed lots of typos and made the language more clear. — derhass, Apr 08 '15 at 17:17
You can get a better understanding of how this equation is formed by reading this great post: https://www.scratchapixel.com/lessons/3d-basic-rendering/rasterization-practical-implementation/visibility-problem-depth-buffer-depth-interpolation — Petrakeas, Jan 02 '17 at 17:27
If you want to do this in a traditional vertex and fragment shader for whatever reason, you can use the existing interpolation. It is sufficient to simply multiply the attribute in the vertex shader with `1/w`. Send `1/w` with the vertex attributes to be interpolated. In the fragment shader divide the attributes by the interpolated `1/w`. Make sure to use the `noperspective` keyword for the attributes you want to manually correct and the `1/w` attribute. — Selmar, Nov 06 '18 at 20:42

How exactly does OpenGL do perspectively correct linear interpolation?

2 Answers2

Short introduction to barycentric coordinates

Perspective correct interpolation

Implementation

OpenGL shaders

Results

Linked

Related