# Mathematics of the depth metric when generating shadow maps and rendering with shadows

This article keeps track of how the depth metric is computed when generating shadow maps and rendering shadows in Babylon.js. Understanding how we ended up with the formulas described below can be hard at times, especially once you add support for the reverse depth buffer and for the reduced NDC Z range that WebGPU uses.

## Generating the shadow map

Shadow maps are generated by the `ShadowGenerator` class for standard shadows, and by the `CascadedShadowGenerator` class for cascaded shadow maps.

The idea is to generate a texture that contains the depth of the geometry closest to the light when rendering the scene from the light's point of view. Basically, this depth is the z coordinate of the 3D point once transformed into the view space of the light (in this space, the Z axis points forward). So, given two points A and B, if `zA < zB` in this space, A is closer to the light than B and it is `zA` which will be written to the shadow map.

Babylon.js is not doing anything fancy here and is simply using the **transformation** matrix (**view** x **projection**) of the light to render the shadow casters and generate the shadow map. However, there are several cases to consider.

### PCF and PCSS filtering

When using PCF (*Percentage Closer Filtering*) and PCSS (*Percentage Closer Soft Shadows*) to render actual shadows, we are using as our shadow map the depth texture generated by the GPU when rendering the shadow casters. So, this texture is automatically generated as part of the rendering and we have nothing specific to do, except applying the bias value defined in the shadow generator. The shader code looks like this (in the *shadowMapVertexMetric.fx* file):

```glsl
#if SM_DEPTHTEXTURE == 1
    #ifdef IS_NDC_HALF_ZRANGE
        #define BIASFACTOR 0.5
    #else
        #define BIASFACTOR 1.0
    #endif

    #if SM_USE_REVERSE_DEPTHBUFFER == 1
        gl_Position.z -= biasAndScaleSM.x * gl_Position.w * BIASFACTOR;
    #else
        gl_Position.z += biasAndScaleSM.x * gl_Position.w * BIASFACTOR;
    #endif
#endif
```

`SM_DEPTHTEXTURE` is set to 1 only when using PCF/PCSS filtering. `biasAndScaleSM.x` is the bias value (note that the *normal bias* is applied earlier and modifies the world position of the 3D point).

We are multiplying by `gl_Position.w` because the GPU, as part of its computations, will do `gl_Position.z / gl_Position.w` before writing the value to the depth texture: by pre-multiplying by `gl_Position.w`, we make sure the final result is simply biased by a constant `biasAndScaleSM.x * BIASFACTOR` value.
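
To spell out that step for the non-reverse branch, write $b$ for `biasAndScaleSM.x` and $k$ for `BIASFACTOR` (both symbols are introduced here for brevity). After the perspective divide, the NDC z value is:

$$
\frac{z_{clip} + b\,k\,w_{clip}}{w_{clip}} = \frac{z_{clip}}{w_{clip}} + b\,k
$$

The offset $b\,k$ no longer depends on the position of the vertex. In the reverse depth buffer branch, the same reasoning applies with the sign of the offset flipped.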

When the NDC space has a `0..1` Z range (meaning `IS_NDC_HALF_ZRANGE` is defined), we use a bias factor of 0.5 so that the final bias applied to the position has the same scale as when the range is `-1..1`.

Note that in the standard case (when not using the reverse depth buffer), we **add** the bias to the position, which pushes the depth value (and so the geometry) slightly farther from the light. Another strategy would be to apply the bias at the shadow rendering stage instead of in the shadow map. In that case, we would **subtract** the bias from the current depth of the pixel (as seen from the light) to achieve the same result.

Finally, when using the reverse depth buffer, we simply negate the bias offset because larger z values now mean nearer geometry.

### Generating a depth metric

When not using PCF / PCSS modes (actually, we also need the depth metric described here in PCSS mode), we need to generate a depth metric, which is the depth value we will use when doing depth comparisons to compute the shadow level of a given pixel.

The computation we are doing to generate this value is (in the *shadowMapVertexMetric.fx* file):

```glsl
#if SM_USE_REVERSE_DEPTHBUFFER == 1
    vDepthMetricSM = (-gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
#else
    vDepthMetricSM = (gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
#endif
```

The aim is to generate a normalized value between `0..1` from the `gl_Position.z` value, which is the z component of the 3D vertex after the **transformation** matrix (**view** x **projection**) has been applied.

In the next sections, we are going to explain which values to set in `depthValuesSM.x` and `depthValuesSM.y` to achieve this goal.

As a preamble, we will only focus on the **projection** matrix because we are only interested in how the projection remaps the z values to the NDC space and the **view** matrix does not come into play in this computation.

**Notes**:

- we could have used the depth texture described in the previous section in all cases to retrieve the depth values we need and avoid having to deal with this depth metric, but for historical reasons and because WebGL1 does not support depth textures, we need this depth metric.
- this section assumes the NDC Z range is `-1..1`. We will handle the `0..1` range later.
- the reverse depth buffer case is handled simply by swapping the near and far planes of the light in the projection matrix
- the projection matrices we are dealing with are for a left handed coordinate system but the results are the same for a right handed system
- the depth renderer is also using the same computation to generate the depth texture, so what we are describing below for the spotlight (perspective projection) is applicable to the depth renderer (the light being replaced by the camera).

#### Directional light

Directional lights are using an orthographic projection to transform points to NDC space (clip space to be precise). This projection is:
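
As a sketch of that matrix, assuming the row-vector convention (position × matrix), a left-handed system and a `-1..1` NDC Z range. The names $c$ and $d$ for the two z entries are introduced here for illustration:

$$
\begin{pmatrix}
a & 0 & 0 & 0 \\
0 & b & 0 & 0 \\
0 & 0 & c & 0 \\
i_0 & i_1 & d & 1
\end{pmatrix},
\qquad c = \frac{2}{f-n}, \quad d = -\frac{f+n}{f-n}
$$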

`n` and `f` are the near and far planes of the light (`light.shadowMinZ` / `light.shadowMaxZ` if defined, `camera.minZ` / `camera.maxZ` if not) respectively. Note that we are only interested in the transformation of the z coordinate, so we don't need the `a`, `b`, `i0` and `i1` values:
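
Assuming the standard left-handed orthographic projection with a `-1..1` NDC Z range, the z coordinate after projection would be:

$$
z_{ortho} = \frac{2}{f-n}\,z - \frac{f+n}{f-n}
$$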

It's a linear function of z (which is something we want), but the range is not `0..1` when z takes values between `n` and `f`:
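
Substituting the two bounds into the standard form $z_{ortho} = \frac{2}{f-n}\,z - \frac{f+n}{f-n}$ (assumed here) gives:

$$
z = n \;\Rightarrow\; z_{ortho} = \frac{2n - (f+n)}{f-n} = \frac{n-f}{f-n} = -1,
\qquad
z = f \;\Rightarrow\; z_{ortho} = \frac{2f - (f+n)}{f-n} = \frac{f-n}{f-n} = 1
$$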

So the range is `-1..1`. To remap this range to `0..1` we can simply add 1 to z and divide everything by 2.

Looking back at how `vDepthMetricSM` is defined:

```glsl
vDepthMetricSM = (gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
```

(don't forget that `z_ortho = gl_Position.z`)

We simply need to have `depthValuesSM.x = 1` and `depthValuesSM.y = 2`.

In the JavaScript code, the `depthValuesSM` shader variable is set like this:

```javascript
effect.setFloat2("depthValuesSM", this.getLight().getDepthMinZ(scene.activeCamera), this.getLight().getDepthMinZ(scene.activeCamera) + this.getLight().getDepthMaxZ(scene.activeCamera));
```

So:

```javascript
depthValuesSM.x = this.getLight().getDepthMinZ(scene.activeCamera);
depthValuesSM.y = this.getLight().getDepthMinZ(scene.activeCamera) + this.getLight().getDepthMaxZ(scene.activeCamera);
```

Which means that for directional lights, `getDepthMinZ` must return `1` and `getDepthMaxZ` must also return `1`.

In the reverse depth buffer case:
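
Swapping `n` and `f` in the standard orthographic form (assumed here, since the projection is built with the two planes exchanged) gives:

$$
z_{ortho} = \frac{2}{n-f}\,z - \frac{n+f}{n-f},
\qquad
z = n \;\Rightarrow\; z_{ortho} = 1,
\qquad
z = f \;\Rightarrow\; z_{ortho} = -1
$$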

This time the range is `1..-1`. However, in the shader, for the reverse depth buffer case we have:

```glsl
vDepthMetricSM = (-gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
```

which means `z_ortho` is multiplied by `-1` before the addition with `depthValuesSM.x`. So, `1..-1` becomes `-1..1` and we are back to the same case as previously: we need the same values in `depthValuesSM.x` and `depthValuesSM.y` (that is, 1 and 2 respectively).

#### Spot light

Spot lights are using a perspective projection to transform points to NDC space (clip space to be precise). This projection is:
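
As a sketch, assuming the standard left-handed perspective projection with the row-vector convention and a `-1..1` NDC Z range ($a$ and $b$ are the x and y scale factors, which do not affect z):

$$
\begin{pmatrix}
a & 0 & 0 & 0 \\
0 & b & 0 & 0 \\
0 & 0 & \frac{f+n}{f-n} & 1 \\
0 & 0 & -\frac{2fn}{f-n} & 0
\end{pmatrix}
\qquad\Rightarrow\qquad
z_{persp} = \frac{f+n}{f-n}\,z - \frac{2fn}{f-n}, \quad w_{persp} = z
$$

Here $z_{persp}$ stands for `gl_Position.z`, the clip-space value before the divide by w, so it is still a linear function of z.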

Regarding the range when z takes values between `n` and `f`:
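
Substituting the two bounds into the standard form $z_{persp} = \frac{f+n}{f-n}\,z - \frac{2fn}{f-n}$ (assumed here) gives:

$$
z = n \;\Rightarrow\; z_{persp} = \frac{n(f+n) - 2fn}{f-n} = \frac{n(n-f)}{f-n} = -n,
\qquad
z = f \;\Rightarrow\; z_{persp} = \frac{f(f+n) - 2fn}{f-n} = \frac{f(f-n)}{f-n} = f
$$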

The range is `-n..f`, which means that for spot lights we need `getDepthMinZ` to return `n` and `getDepthMaxZ` to return `f` to remap this range to `0..1` once we apply the computation (recall that `depthValuesSM.x = light.getDepthMinZ()` and `depthValuesSM.y = light.getDepthMinZ() + light.getDepthMaxZ()`):

```glsl
vDepthMetricSM = (gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
```

As for the directional light case, the reverse depth buffer range is negated compared to the normal case, but because of the minus sign in front of `gl_Position.z` in the `vDepthMetricSM` formula, `getDepthMinZ` and `getDepthMaxZ` must return the same values.
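
The directional and spot light results can be checked numerically. The following is a minimal sketch, assuming `n = 1`, `f = 100` and the standard left-handed projection formulas with a `-1..1` NDC Z range. The helpers `orthoClipZ`, `perspClipZ` and `depthMetric` are illustrative names, not Babylon.js APIs, and the bias term is left out:

```typescript
/** Orthographic projection (assumed form): z_ortho = 2/(f-n) * z - (f+n)/(f-n). */
function orthoClipZ(z: number, n: number, f: number): number {
    return (2 * z) / (f - n) - (f + n) / (f - n);
}

/** Perspective projection (assumed form), before the divide by w. */
function perspClipZ(z: number, n: number, f: number): number {
    return ((f + n) * z) / (f - n) - (2 * f * n) / (f - n);
}

/** Mirrors the shader formula without the bias: (+/-z + minZ) / (minZ + maxZ). */
function depthMetric(clipZ: number, minZ: number, maxZ: number, reverse: boolean): number {
    const z = reverse ? -clipZ : clipZ;
    return (z + minZ) / (minZ + maxZ);
}

const n = 1;
const f = 100;

// Directional light: getDepthMinZ = 1 and getDepthMaxZ = 1.
const dirNear = depthMetric(orthoClipZ(n, n, f), 1, 1, false);
const dirFar = depthMetric(orthoClipZ(f, n, f), 1, 1, false);

// Spot light: getDepthMinZ = n and getDepthMaxZ = f.
const spotNear = depthMetric(perspClipZ(n, n, f), n, f, false);
const spotFar = depthMetric(perspClipZ(f, n, f), n, f, false);

// Reverse depth buffer: near and far are swapped when building the projection,
// and z is negated in the metric, so the same minZ / maxZ values still work.
const spotNearReversed = depthMetric(perspClipZ(n, f, n), n, f, true);
const spotFarReversed = depthMetric(perspClipZ(f, f, n), n, f, true);
```

Up to floating point rounding, every "near" value evaluates to 0 and every "far" value to 1, which is the `0..1` remapping we are after.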

#### Point light

Point lights are using shadow maps that store the distance of the geometry to the light. This distance is computed as `length(position - lightPosition)`, which is then remapped to the `0..1` range (in the `shadowMapFragment.fx` file):

```glsl
depthSM = (length(vPositionWSM - lightDataSM) + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
```

It's the same computation as previously described, except that we are using the distance to the light instead of the depth. `vPositionWSM` is the world position of the point and `lightDataSM` the world position of the light. There's no specific case for the reverse depth buffer mode as it is irrelevant: we are computing a distance, not a depth.

Note that in reality we are not remapping to `0..1` with this formula because `length(vPositionWSM - lightDataSM)` has no maximum bound and can go to +infinity: `vPositionWSM` is constrained to be in the view frustum but the light can be positioned anywhere in the world. So, to simplify things, we set up the `getDepthMinZ` and `getDepthMaxZ` functions to return the same values as in the spot light case, meaning `n` and `f` respectively. It's not really important that we are not remapping strictly to `0..1` as long as we use the same computation when rendering shadows, so that both values can be compared.

**Notes**:

- Even if `length(position - lightPosition)` can go to +infinity in theory, the point light is generally not too far from the geometry currently in the view frustum: lights that are too far away would contribute very little (or nothing) and would not cast shadows anyway (every point light has a maximum distance after which its intensity falls to 0)
- we could remove the remapping altogether and simply use `length(position - lightPosition)` as the depth metric, but that would require using a float texture in all cases. In WebGL1 mode, if the float texture extension is not supported, we are using a UNORM 8 bits texture, so we need a `0..1` remapping
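
To make the point light metric concrete, here is a minimal sketch of the same computation on the CPU, without the bias term. The `Vec3` type, the helper names and the sample positions are assumptions made for the example, not Babylon.js code:

```typescript
type Vec3 = [number, number, number];

/** Euclidean distance between two points. */
function distanceBetween(a: Vec3, b: Vec3): number {
    const dx = a[0] - b[0];
    const dy = a[1] - b[1];
    const dz = a[2] - b[2];
    return Math.sqrt(dx * dx + dy * dy + dz * dz);
}

/** Mirrors the point light formula: (distance + minZ) / (minZ + maxZ), with minZ = n and maxZ = f. */
function pointLightDepthMetric(position: Vec3, lightPosition: Vec3, n: number, f: number): number {
    return (distanceBetween(position, lightPosition) + n) / (n + f);
}

const nearPlane = 1;
const farPlane = 100;
const lightPos: Vec3 = [0, 10, 0];
const occluder: Vec3 = [0, 7, 4]; // 5 units away from the light
const receiver: Vec3 = [0, 4, 8]; // 10 units away, on the same ray

const occluderMetric = pointLightDepthMetric(occluder, lightPos, nearPlane, farPlane);
const receiverMetric = pointLightDepthMetric(receiver, lightPos, nearPlane, farPlane);

// The receiver is farther from the light than what the shadow map would store
// for that direction, so it would be considered in shadow.
const receiverInShadow = receiverMetric > occluderMetric;

// A point farther away than f yields a metric above 1: the remapping is not
// strictly 0..1, but both passes use the same formula so values stay comparable.
const beyondFarMetric = pointLightDepthMetric([0, 10, 200], lightPos, nearPlane, farPlane);
```

Note that with these values a distance of 0 maps to `n / (n + f)` rather than 0, which again does not matter as long as the comparison uses the same formula on both sides.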

### Generating a depth metric (NDC `0..1` Z range)

When using an NDC space where the z coordinate is in the `0..1` range, the orthographic and perspective projection matrices change. Let's see how this affects the results from the previous section.

#### Directional light
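
For reference, assuming the standard left-handed orthographic projection with a `0..1` NDC Z range, the z coordinate becomes (first line: normal case, second line: reverse depth buffer, with `n` and `f` swapped):

$$
\begin{aligned}
z_{ortho} &= \frac{z-n}{f-n}, & z = n &\;\Rightarrow\; 0, & z = f &\;\Rightarrow\; 1 \\
z_{ortho}^{rev} &= \frac{z-f}{n-f}, & z = n &\;\Rightarrow\; 1, & z = f &\;\Rightarrow\; 0
\end{aligned}
$$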

We can see that in the non reverse depth buffer case the remapping is already `0..1`, so `getDepthMinZ` should return 0 and `getDepthMaxZ` should return 1.

In the reverse depth buffer case, as we have a negation of z in the `vDepthMetricSM` formula, the range of `-z_ortho` is `-1..0`. We need to add 1 to remap to `0..1`. To do that, we can simply have `getDepthMinZ` return 1 and `getDepthMaxZ` return 0.

#### Spot light
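
For reference, assuming the standard left-handed perspective projection with a `0..1` NDC Z range, the clip-space z coordinate becomes (first line: normal case, second line: reverse depth buffer, with `n` and `f` swapped):

$$
\begin{aligned}
z_{persp} &= \frac{f}{f-n}\,z - \frac{fn}{f-n}, & z = n &\;\Rightarrow\; 0, & z = f &\;\Rightarrow\; f \\
z_{persp}^{rev} &= \frac{n}{n-f}\,z - \frac{fn}{n-f}, & z = n &\;\Rightarrow\; n, & z = f &\;\Rightarrow\; 0
\end{aligned}
$$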

In the non reverse depth buffer case, we need to remap `0..f` to `0..1`: we need to divide by `f`. To do that, `getDepthMinZ` should return 0 and `getDepthMaxZ` should return `f`.

In the reverse depth buffer case, we need to remap `-n..0` (don't forget that when using the reverse depth buffer we have `-gl_Position.z` in the `vDepthMetricSM` formula, not `gl_Position.z`) to `0..1`: we need to add `n` and divide by `n`. To do that, `getDepthMinZ` should return `n` and `getDepthMaxZ` should return 0.

#### Point light

Point lights are no different from the NDC `-1..1` range case: because we are exclusively dealing with distances, the NDC z range is irrelevant.

## Shadow rendering

There's not much to say regarding the shadow rendering part: we simply have to make sure we use exactly the same formula to compute the depth metric of the current pixel as the one used to generate the shadow maps. The shader code used to compute the `vDepthMetric` value is in this case (in the `shadowsVertex.fx` file):

```glsl
#if USE_REVERSE_DEPTHBUFFER
    vDepthMetric{X} = (-vPositionFromLight{X}.z + light{X}.depthValues.x) / light{X}.depthValues.y;
#else
    vDepthMetric{X} = (vPositionFromLight{X}.z + light{X}.depthValues.x) / light{X}.depthValues.y;
#endif
```

So, we must pass in `light{X}.depthValues.x` and `light{X}.depthValues.y` the same values that we passed in the `depthValuesSM.x` and `depthValuesSM.y` parameters when generating the shadow maps.

**To sum up**

First recall that:

```javascript
depthValues.x = light.getDepthMinZ(camera);
depthValues.y = light.getDepthMinZ(camera) + light.getDepthMaxZ(camera);
```

and that `n` is the near plane distance and `f` the far plane distance (`light.shadowMinZ` / `light.shadowMaxZ` if defined, `camera.minZ` / `camera.maxZ` else):

| | Directional light | Spot light | Point light |
|---|---|---|---|
| NDC `-1..1` | minZ=1, maxZ=1 | minZ=n, maxZ=f | minZ=n, maxZ=f |
| NDC `-1..1` + reverse depth buffer | minZ=1, maxZ=1 | minZ=n, maxZ=f | minZ=n, maxZ=f |
| NDC `0..1` | minZ=0, maxZ=1 | minZ=0, maxZ=f | minZ=n, maxZ=f |
| NDC `0..1` + reverse depth buffer | minZ=1, maxZ=0 | minZ=n, maxZ=0 | minZ=n, maxZ=f |

In this table, `minZ` is for `getDepthMinZ` and `maxZ` is for `getDepthMaxZ`.
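
The table can also be read as a small lookup function. The sketch below is only an illustration of the table: `depthRange`, `depthValues`, `LightKind` and `DepthRange` are hypothetical names, and the actual `getDepthMinZ` / `getDepthMaxZ` implementations in Babylon.js may be organized differently:

```typescript
type LightKind = "directional" | "spot" | "point";

interface DepthRange {
    minZ: number; // value getDepthMinZ would return
    maxZ: number; // value getDepthMaxZ would return
}

/** Encodes the summary table (hypothetical helper, not a Babylon.js function). */
function depthRange(kind: LightKind, halfZRange: boolean, reverse: boolean, n: number, f: number): DepthRange {
    if (kind === "point") {
        // Distances only: the NDC range and the depth direction are irrelevant.
        return { minZ: n, maxZ: f };
    }
    if (!halfZRange) {
        // NDC -1..1: the negation in the shader absorbs the reverse depth buffer case.
        return kind === "directional" ? { minZ: 1, maxZ: 1 } : { minZ: n, maxZ: f };
    }
    if (kind === "directional") {
        return reverse ? { minZ: 1, maxZ: 0 } : { minZ: 0, maxZ: 1 };
    }
    return reverse ? { minZ: n, maxZ: 0 } : { minZ: 0, maxZ: f };
}

/** depthValues.x and depthValues.y as they are passed to the shaders. */
function depthValues(range: DepthRange): [number, number] {
    return [range.minZ, range.minZ + range.maxZ];
}

// Spot light, NDC -1..1, n = 1, f = 100: x = 1 and y = 101.
const spotDepthValues = depthValues(depthRange("spot", false, false, 1, 100));
```

Using a single function for both the shadow map generation and the shadow rendering passes is one way to guarantee that the two passes receive identical values.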