Some computer graphics techniques require comparing pixel depths. Since the hardware already calculates and stores pixel depths, it would be ideal to read the depths from the hardware depth buffer instead of calculating them ourselves and using extra storage.
When using perspective projection, the pixel depths stored in the hardware depth buffer are non-linear. That means that the depth interval between 0.1 and 0.2 does not cover the same view space distance as the interval between 0.8 and 0.9.
This post explains how to convert a non-linear pixel depth (the result of a symmetrical perspective projection) to the view space z, which is linear. The formula is easy; the explanation is more complicated. If you just want the formula, look at the GLSL code at the end of this post.
The strategy is first to explain the process that converts a view space position to a (non-linear) pixel depth, and then to derive the inverse process.
View space position to pixel depth
- Project view space position. Using the perspective projection matrix, the view space position is converted to a clip space position.
- Convert to Normalized Device Coordinates. The clip space position is divided by its w component. In OpenGL the resulting NDC are in the range [-1, 1] (Direct3D normalizes z to the range [0, 1] instead).
- Convert normalized device z to pixel depth. The depth buffer stores values in the range [0, 1], so the normalized device z is scaled from [-1, 1] to [0, 1] (depth = 0.5 * ndc z + 0.5 with the default depth range).
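The three steps above can be sketched in GLSL. This is only a minimal sketch, assuming OpenGL conventions and the default depth range of [0, 1]; the uniform `projection` and the function name `viewPositionToDepth` are hypothetical names chosen for illustration.

```glsl
// Hypothetical uniform holding the symmetrical perspective projection matrix.
uniform mat4 projection;

// Sketch: view space position -> value stored in the depth buffer.
// Assumes OpenGL NDC (z in [-1, 1]) and the default depth range [0, 1].
float viewPositionToDepth(vec3 viewPosition)
{
    // 1. Project: view space -> clip space.
    vec4 clipPosition = projection * vec4(viewPosition, 1.0);

    // 2. Perspective divide: clip space -> normalized device z in [-1, 1].
    float ndcZ = clipPosition.z / clipPosition.w;

    // 3. Scale and bias: [-1, 1] -> [0, 1].
    return 0.5 * ndcZ + 0.5;
}
```

The hardware performs these steps itself during rasterization; the function only makes them explicit so that they can be inverted one by one.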
Pixel depth to view space z
- Convert pixel depth to normalized device z. The pixel depth is read from the depth buffer and converted to NDC, that is, scaled from [0, 1] to [-1, 1].
- Convert normalized device z to clip space z. The normalized device z is multiplied by the clip space w. Note that at this stage we don’t know the value of the clip space w; we are just expressing the formula algebraically, and we’ll manipulate it later.
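- Convert clip space z to view space z. Assuming OpenGL conventions, a symmetrical perspective projection matrix and its inverse have the following form (the -1 corresponds to a right-handed view space, where the camera looks down the negative z axis. For a left-handed view space it would be +1):

$$
P = \begin{pmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & A & B \\ 0 & 0 & -1 & 0 \end{pmatrix},
\qquad
P^{-1} = \begin{pmatrix} \frac{1}{s_x} & 0 & 0 & 0 \\ 0 & \frac{1}{s_y} & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & \frac{1}{B} & \frac{A}{B} \end{pmatrix}
$$

Here $s_x$ and $s_y$ are the horizontal and vertical scale factors, and for a near plane $n$ and a far plane $f$ the depth terms are $A = -\frac{f + n}{f - n}$ and $B = -\frac{2fn}{f - n}$.

The view space position is obtained by multiplying the inverse projection matrix by the clip space position. The equation is:

$$
\begin{pmatrix} x_{view} \\ y_{view} \\ z_{view} \\ w_{view} \end{pmatrix}
= P^{-1} \begin{pmatrix} x_{clip} \\ y_{clip} \\ z_{clip} \\ w_{clip} \end{pmatrix}
$$

We are only interested in finding view space z. From the third row we get:

$$
z_{view} = -w_{clip}
$$

Remember that we don’t know the clip space w. From the fourth row, using $z_{clip} = z_{ndc} \, w_{clip}$ from the previous step and the fact that $w_{view} = 1$ for a position, we get:

$$
w_{view} = \frac{z_{clip}}{B} + \frac{A}{B} \, w_{clip} = \frac{w_{clip} \, (z_{ndc} + A)}{B} = 1
\quad \Rightarrow \quad
w_{clip} = \frac{B}{z_{ndc} + A}
$$

Finally, we isolate the view space z:

$$
z_{view} = -\frac{B}{z_{ndc} + A}
$$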
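As a sanity check, the result can be evaluated for a few depths. The worked example below assumes the standard OpenGL perspective projection with near plane $n$ and far plane $f$, for which the view space z can be written directly in terms of $n$, $f$ and the normalized device z. The values $n = 1$ and $f = 100$ are chosen only for illustration.

```latex
% Assumed example values: n = 1, f = 100 (OpenGL conventions, camera looks down -z)
\begin{aligned}
z_{ndc}  &= 2 \cdot depth - 1 \\
z_{view} &= \frac{2fn}{z_{ndc}\,(f - n) - (f + n)} \\[6pt]
depth = 0   &: \quad z_{ndc} = -1,  \quad z_{view} = \frac{200}{-99 - 101}  = -1 \\
depth = 0.5 &: \quad z_{ndc} = 0,   \quad z_{view} = \frac{200}{-101}       \approx -1.98 \\
depth = 0.9 &: \quad z_{ndc} = 0.8, \quad z_{view} = \frac{200}{79.2 - 101} \approx -9.17 \\
depth = 1   &: \quad z_{ndc} = 1,   \quad z_{view} = \frac{200}{99 - 101}   = -100
\end{aligned}
```

The end points land exactly on the near and far planes, and half of the depth range is spent on less than one unit of view space distance, which is the non-linearity described at the start of the post.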
Here is the GLSL code:
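A minimal sketch of the conversion, assuming OpenGL conventions (NDC z in [-1, 1], default depth range, camera looking down the negative view space z axis). The uniform `projection` and the function name `depthToViewZ` are hypothetical names chosen for illustration. `A` and `B` stand for the two depth-related entries in the third row of the projection matrix, so that view space z = -B / (ndc z + A).

```glsl
// Hypothetical uniform holding the symmetrical perspective projection matrix.
uniform mat4 projection;

// Sketch: non-linear depth buffer value in [0, 1] -> linear view space z.
// With these conventions the result is negative in front of the camera.
float depthToViewZ(float depth)
{
    // Depth buffer value [0, 1] -> normalized device z [-1, 1].
    float ndcZ = 2.0 * depth - 1.0;

    // GLSL matrices are indexed as matrix[column][row].
    float A = projection[2][2]; // third row, third column
    float B = projection[3][2]; // third row, fourth column

    // Result of inverting the projection and the perspective divide.
    return -B / (ndcZ + A);
}
```

A typical use would be to sample the depth texture in a fragment shader and pass the red channel to `depthToViewZ`. Reading `A` and `B` from the matrix avoids passing the near and far planes as separate uniforms.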