The order in which image or buffer memory is read or written by shaders is largely undefined. For some shader types (vertex, tessellation evaluation, and in some cases, fragment), even the number of shader invocations that may perform loads and stores is undefined.
In particular, the following rules apply:
| Note |
|---|
| The above limitations on shader invocation order make some forms of synchronization between shader invocations within a single set of primitives unimplementable. For example, having one invocation poll memory written by another invocation assumes that the other invocation has been launched and will complete its writes in finite time. |
Stores issued to different memory locations within a single shader invocation may not be visible to other invocations, or may not become visible in the order they were performed.
The `OpMemoryBarrier` instruction can be used to provide stronger ordering of reads and writes performed by a single invocation. `OpMemoryBarrier` guarantees that any memory transactions issued by the shader invocation prior to the instruction complete prior to the memory transactions issued after the instruction. Memory barriers are needed for algorithms that require multiple invocations to access the same memory and require the operations to be performed in a partially-defined relative order. For example, if one shader invocation does a series of writes, followed by an `OpMemoryBarrier` instruction, followed by another write, then the results of the series of writes before the barrier become visible to other shader invocations at a time earlier or equal to when the results of the final write become visible to those invocations. In practice, this means that another invocation that sees the results of the final write would also see the previous writes. Without the memory barrier, the final write may be visible before the previous writes.
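The write, barrier, write pattern described above can be illustrated with a CPU-side analogy. The sketch below is not shader code: it uses C++ threads, and `std::atomic_thread_fence` merely stands in for the role `OpMemoryBarrier` plays in a shader. The C++ memory model is an assumption here and differs in detail from the shader memory model. The names `Shared`, `writer`, `reader`, and `run_once` are hypothetical. The reader checks the flag exactly once rather than polling, since polling another invocation is not something a shader can rely on.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical shared state standing in for memory visible to two invocations.
struct Shared {
    int payload[4] = {0, 0, 0, 0};  // the "series of writes" (plain data)
    std::atomic<int> flag{0};       // the "final write"
};

// Writer: a series of writes, then a barrier, then the final write.
void writer(Shared& s) {
    for (int i = 0; i < 4; ++i) s.payload[i] = i + 1;
    // Stand-in for the barrier: earlier writes are ordered before the flag.
    std::atomic_thread_fence(std::memory_order_release);
    s.flag.store(1, std::memory_order_relaxed);
}

// Reader: looks at the final write once. Returns the payload sum if the
// final write was observed, or -1 if it was not visible yet.
int reader(const Shared& s) {
    if (s.flag.load(std::memory_order_relaxed) != 1) return -1;
    std::atomic_thread_fence(std::memory_order_acquire);
    int sum = 0;
    for (int i = 0; i < 4; ++i) sum += s.payload[i];
    return sum;
}

// Runs one writer and one reader concurrently; returns what the reader saw.
int run_once() {
    Shared s;
    int seen = -1;
    std::thread w(writer, std::ref(s));
    std::thread r([&] { seen = reader(s); });
    w.join();
    r.join();
    assert(reader(s) == 10);  // once the writer has finished, all writes are visible
    return seen;
}
```

In this analogy, `run_once()` returns either -1 (the final write was not yet seen) or 10 (the final write and every earlier write were seen). A partial sum would correspond to the final write becoming visible before the earlier ones, which is the outcome the barrier rules out.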
Writes that are the result of shader stores through a variable decorated with `Coherent` automatically have available writes to the same buffer, buffer view, or image view made visible to them, and are themselves automatically made available to access by the same buffer, buffer view, or image view.
Reads that are the result of shader loads through a variable decorated with `Coherent` automatically have available writes to the same buffer, buffer view, or image view made visible to them.
The order in which coherent writes to different locations become available is undefined, unless enforced by a memory barrier instruction or other memory dependency.
| Note |
|---|
| Explicit memory dependencies must still be used to guarantee availability and visibility for access via other buffers, buffer views, or image views. |
The built-in atomic memory transaction instructions can be used to read and write a given memory address atomically. While built-in atomic functions issued by multiple shader invocations are executed in undefined order relative to each other, these functions perform both a read and a write of a memory address and guarantee that no other memory transaction will write to the underlying memory between the read and write.
Atomic operations ensure automatic availability and visibility for writes and reads in the same way as those to `Coherent` variables.
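The read-modify-write guarantee described above can be sketched with a CPU-side analogy. This is not shader code: `std::atomic<int>::fetch_add` on C++ threads is assumed here as a stand-in for a shader atomic add, and the function name `count_with_atomics` is hypothetical. The workers run in an undefined order relative to each other, yet no increment is lost and no two workers receive the same value.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each worker performs `per_worker` atomic increments. Every increment
// returns the previous counter value, which the worker uses as a slot index.
// Returns the final counter value; `claimed[i]` counts how often slot i was handed out.
int count_with_atomics(int workers, int per_worker, std::vector<int>& claimed) {
    std::atomic<int> counter{0};
    claimed.assign(static_cast<size_t>(workers) * per_worker, 0);

    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_worker; ++i) {
                // Read and write happen as one indivisible operation.
                int slot = counter.fetch_add(1, std::memory_order_relaxed);
                assert(slot >= 0 && slot < static_cast<int>(claimed.size()));
                claimed[slot] += 1;  // safe: no other worker receives this slot
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter.load();
}
```

With 4 workers and 1000 increments each, the final counter value is 4000 and every entry of `claimed` is exactly 1, regardless of how the workers interleave. Replacing `fetch_add` with a separate load and store would lose that guarantee.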
Example 8.1. Note
Memory accesses performed on different resource descriptors with the same memory backing may not be well-defined even with the `Coherent` decoration or via atomics, due to things such as image layouts or ownership of the resource, as described in the Synchronization and Cache Control chapter.
| Note |
|---|
| Atomics allow shaders to use shared global addresses for mutual exclusion or as counters, among other uses. |