Vulkan Logo

26. Cluster Culling Shading

This new shader type have execution environments similar to that of compute shaders, where a collection of shader invocations form a workgroup and cooperate to perform coarse level geometry culling and level-of-detail selection, a shader invocation can emit a set of built-in output variables via a new built-in function, Cluster Culling Shader organizes these emitted variables into a drawing command used by subsequent rendering pipeline.

26.1. Cluster Culling Shader Input

The only inputs available to the cluster culling shader are variables identifying the specific workgroup and invocation.

26.2. Cluster Culling Shader Output

If a cluster is survived after culling in a cluster culling shader invocation, a drawing command used to draw this cluster should be emitted by this shader invocation for further rendering processing. There are two types of drawing command, indexed mode and non-indexed mode , both type of drawing command are consists of a set of built-in output variables which have a similar definition to VkDrawIndexedIndirectCommand and VkDrawIndirectCommand members.

cluster culling shader have the following built-in output variables.

  • built-in variable IndexCountHUAWEI is the number of vertices to draw.

  • built-in variable VertexCountHUAWEI is the number of vertices to draw.

  • built-in variable InstanceCountHUAWEI is the number of instances to draw.

  • built-in variable FirstIndexHUAWEI is the base index within the index buffer.

  • built-in variable FirstVertexHUAWEI is the index of the first vertex to draw

  • built-in variable VertexOffsetHUAWEI is the value added to the vertex index before indexing into the vertex buffer.

  • built-in variable FirstInstanceHUAWEI is the instance ID of the first instance to draw.

  • built-in variable ClusterIdHUAWEI is the index of cluster being rendered by this drawing command. When cluster culling shader enable, ClusterIdHUAWEI will replace gl_DrawID pass to vertex shader.

26.3. Cluster Culling Shader Cluster Ordering

  • When a cluster culling shader is used, all output clusters generated by DispatchClusterHUAWEI() in a given workgroup are pass to subsequent pipeline stage before any cluster generated from subsequent workgroup.

  • In a workgroup, the order of output clusters generated by DispatchClusterHUAWEI() is specified by the local invocation id, from a lower value to higher value.

  • If any cluster culling invocation in the workgroup does not call DispatchClusterHUAWEI(), no cluster will be sent to the subsequent rendering pipeline.

  • Any cluster culling shader invocation may also call DispatchClusterHUAWEI() many times as shown below:

// Cluster Culling Shader sample code:
        ......
    DispatchClusterHUAWEI();  // dispatch 0
        ......
    DispatchClusterHUAWEI();  // dispatch 1
        ......
    DispatchClusterHUAWEI();  // dispatch 2
 	......

In this case, the output sequence of clusters in a workgroup are specified as shown below ( in case of 32 shader invocations in a workgroup):

1. shader invocation0.dispatch0
2. shader invocation1.dispatch0,
            ..........
32. shader invocation31.dispatch0
33. shader invocation0.dispatch1
34. shader invocation1.dispatch1
            ..........
64. shader invocation31.dispatch1
65. shader invocation0.dispatch2
66. shader invocation1.dispatch2
            ..........
96. shader Invocation31.dispatch2

26.4. Cluster Culling Shader Primitive Ordering

Following guarantees are provided for the relative ordering of primitives produced by a cluster culling shader, as they pertain to primitive order.

  • Limited guarantees are provided for the relative ordering of primitives produced by a cluster culling shader, as they pertain to primitive order.

  • The order of primitives in a given cluster is specified by the content of

    • DistaptchClusterHUAWEI() with indexed output built-in variables, vertices sourced from a lower index buffer addresses to higher addresses.

    • DispatchClusterHUAWEI() with non-indexed output built-in variables, from vertices with a lower numbered vertexIndex to a higher numbered vertexIndex.