Copyright 2014-2018 The Khronos Group Inc.
This Specification is protected by copyright laws and contains material proprietary to Khronos. Except as described by these terms, it or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast or otherwise exploited in any manner without the express prior written permission of Khronos. Khronos grants a conditional copyright license to use and reproduce the unmodified Specification for any purpose, without fee or royalty, EXCEPT no licenses to any patent, trademark or other intellectual property rights are granted under these terms.
Khronos makes no, and expressly disclaims any, representations or warranties, express or implied, regarding this Specification, including, without limitation: merchantability, fitness for a particular purpose, non-infringement of any intellectual property, correctness, accuracy, completeness, timeliness, and reliability. Under no circumstances will Khronos, or any of its Promoters, Contributors or Members, or their respective partners, officers, directors, employees, agents or representatives be liable for any damages, whether direct, indirect, special or consequential damages for lost revenues, lost profits, or otherwise, arising from or in connection with these materials.
This Specification has been created under the Khronos Intellectual Property Rights Policy, which is Attachment A of the Khronos Group Membership Agreement available at www.khronos.org/files/member_agreement.pdf, and which defines the terms 'Scope', 'Compliant Portion', and 'Necessary Patent Claims'. Parties desiring to implement the Specification and make use of Khronos trademarks in relation to that implementation, and receive reciprocal patent license protection under the Khronos Intellectual Property Rights Policy must become Adopters and confirm the implementation as conformant under the process defined by Khronos for this Specification; see https://www.khronos.org/adopters.
This Specification contains substantially unmodified functionality from, and is a successor to, Khronos specifications including OpenGL, OpenGL ES and OpenCL.
Some parts of this Specification are purely informative and so are EXCLUDED from the Scope of this Specification. The Document Conventions section of the Introduction defines how these parts of the Specification are identified.
Where this Specification uses technical terminology, defined in the Glossary or otherwise, that refer to enabling technologies that are not expressly set forth in this Specification, those enabling technologies are EXCLUDED from the Scope of this Specification. For clarity, enabling technologies not disclosed with particularity in this Specification (e.g. semiconductor manufacturing technology, hardware architecture, processor architecture or microarchitecture, memory architecture, compiler technology, object oriented technology, basic operating system technology, compression technology, algorithms, and so on) are NOT to be considered expressly set forth; only those application program interfaces and data structures disclosed with particularity are included in the Scope of this Specification.
For purposes of the Khronos Intellectual Property Rights Policy as it relates to the definition of Necessary Patent Claims, all recommended or optional features, behaviors and functionality set forth in this Specification, if implemented, are considered to be included as Compliant Portions.
Where this Specification includes normative references to external documents, only the specifically identified sections of those external documents are INCLUDED in the Scope of this Specification. If not created by Khronos, those external documents may contain contributions from non-members of Khronos not covered by the Khronos Intellectual Property Rights Policy.
This document contains extensions which are not ratified by Khronos, and as such is not a ratified Specification, though it contains text from (and is a superset of) the ratified Vulkan Specification. The ratified versions of the Vulkan Specification can be found at https://www.khronos.org/registry/vulkan/specs/1.1/html/vkspec.html (core only) and https://www.khronos.org/registry/vulkan/specs/1.1-khr_extensions/html/vkspec.html (core with KHR extensions).
Vulkan and Khronos are registered trademarks of The Khronos Group Inc. ASTC is a trademark of ARM Holdings PLC; OpenCL is a trademark of Apple Inc.; and OpenGL is a registered trademark of Silicon Graphics International, all used under license by Khronos. All other product names, trademarks, and/or company names are used solely for identification and belong to their respective owners.
1. Introduction
This document, referred to as the “Vulkan Specification” or just the “Specification” hereafter, describes the Vulkan Application Programming Interface (API). Vulkan is a C99 API designed for explicit control of low-level graphics and compute functionality.
The canonical version of the Specification is available in the official Vulkan Registry (http://www.khronos.org/registry/vulkan/). The source files used to generate the Vulkan specification are stored in the Vulkan Documentation Repository (https://github.com/KhronosGroup/Vulkan-Docs). The source repository additionally has a public issue tracker and allows the submission of pull requests that improve the specification.
1.1. Document Conventions
The Vulkan specification is intended for use by both implementors of the API and application developers seeking to make use of the API, forming a contract between these parties. Specification text may address either party; typically the intended audience can be inferred from context, though some sections are defined to address only one of these parties. (For example, Valid Usage sections only address application developers). Any requirements, prohibitions, recommendations or options defined by normative terminology are imposed only on the audience of that text.
Note
Structure and enumerated types defined in extensions that were promoted to core in Vulkan 1.1 are now defined in terms of the equivalent Vulkan 1.1 interfaces. This affects the Vulkan Specification, the Vulkan header files, and the corresponding XML Registry.
1.1.1. Normative Terminology
Within this specification, the key words must, required, should, recommended, may, and optional are to be interpreted as described in RFC 2119 - Key words for use in RFCs to Indicate Requirement Levels (http://www.ietf.org/rfc/rfc2119.txt). These key words are highlighted in the specification for clarity. In text addressing application developers, their use expresses requirements that apply to application behavior. In text addressing implementors, their use expresses requirements that apply to implementations.
In text addressing application developers, the additional key words can and cannot are to be interpreted as describing the capabilities of an application, as follows:
- can: This word means that the application is able to perform the action described.
- cannot: This word means that the API and/or the execution environment provide no mechanism through which the application can express or accomplish the action described.
These key words are never used in text addressing implementors.
Note
There is an important distinction between cannot and must not, as used in this Specification. Cannot means something the application literally is unable to express or accomplish through the API, while must not means something that the application is capable of expressing through the API, but that the consequences of doing so are undefined and potentially unrecoverable for the implementation.
Unless otherwise noted in the section heading, all sections and appendices in this document are normative.
1.1.2. Technical Terminology
The Vulkan Specification makes use of common engineering and graphics terms such as Pipeline, Shader, and Host to identify and describe Vulkan API constructs and their attributes, states, and behaviors. The Glossary defines the basic meanings of these terms in the context of the Specification. The Specification text provides fuller definitions of the terms and may elaborate, extend, or clarify the Glossary definitions. When a term defined in the Glossary is used in normative language within the Specification, the definitions within the Specification govern and supersede any meanings the terms may have in other technical contexts (i.e. outside the Specification).
1.1.3. Normative References
References to external documents are considered normative references if the Specification uses any of the normative terms defined in Normative Terminology to refer to them or their requirements, either as a whole or in part.
The following documents are referenced by normative sections of the specification:
IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2008, http://dx.doi.org/10.1109/IEEESTD.2008.4610935, August, 2008.
A. Garrard, Khronos Data Format Specification, version 1.2, https://www.khronos.org/registry/DataFormat/specs/1.2/dataformat.1.2.html, September, 2017.
J. Kessenich, SPIR-V Extended Instructions for GLSL, Version 1.00, https://www.khronos.org/registry/spir-v/, February 10, 2016.
J. Kessenich, B. Ouriel, and R. Krisch, SPIR-V Specification, Version 1.3, Revision 2, Unified, https://www.khronos.org/registry/spir-v/, May 11, 2018.
J. Leech and T. Hector, Vulkan Documentation and Extensions: Procedures and Conventions, https://www.khronos.org/registry/vulkan/specs/1.1/styleguide.html
Vulkan Loader Specification and Architecture Overview, https://github.com/KhronosGroup/Vulkan-Loader/blob/master/loader/LoaderAndLayerInterface.md, August, 2016.
2. Fundamentals
This chapter introduces fundamental concepts including the Vulkan architecture and execution model, API syntax, queues, pipeline configurations, numeric representation, state and state queries, and the different types of objects and shaders. It provides a framework for interpreting more specific descriptions of commands and behavior in the remainder of the Specification.
2.1. Host and Device Environment
The Vulkan Specification assumes and requires the following properties of the host environment with respect to Vulkan implementations:
- The host must have runtime support for 8-, 16-, 32- and 64-bit signed and unsigned two's complement integers, all addressable at the granularity of their size in bytes.
- The host must have runtime support for 32- and 64-bit floating-point types satisfying the range and precision constraints in the Floating Point Computation section.
- The representation and endianness of these types on the host must match the representation and endianness of the same types on every physical device supported.
Note
Since a variety of data types and structures in Vulkan may be accessible by both host and physical device operations, the implementation should be able to access such data efficiently in both paths in order to facilitate writing portable and performant applications.
2.2. Execution Model
This section outlines the execution model of a Vulkan system.
Vulkan exposes one or more devices, each of which exposes one or more queues which may process work asynchronously to one another. The set of queues supported by a device is partitioned into families. Each family supports one or more types of functionality and may contain multiple queues with similar characteristics. Queues within a single family are considered compatible with one another, and work produced for a family of queues can be executed on any queue within that family. This Specification defines four types of functionality that queues may support: graphics, compute, transfer, and sparse memory management.
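As a non-normative illustration of how an application might discover these families, the sketch below uses vkGetPhysicalDeviceQueueFamilyProperties to look for a family supporting graphics work; the physical device handle and the function name find_graphics_queue_family are assumptions for this example only.

#include <stdint.h>
#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Sketch: find the index of a queue family that supports graphics.
 * Assumes physicalDevice was previously obtained from vkEnumeratePhysicalDevices. */
uint32_t find_graphics_queue_family(VkPhysicalDevice physicalDevice)
{
    uint32_t count = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &count, NULL);

    VkQueueFamilyProperties *props = malloc(count * sizeof(VkQueueFamilyProperties));
    vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &count, props);

    uint32_t family = UINT32_MAX;
    for (uint32_t i = 0; i < count; ++i) {
        if (props[i].queueFlags & VK_QUEUE_GRAPHICS_BIT) {
            family = i;            /* first family advertising graphics support */
            break;
        }
    }
    free(props);
    return family;                 /* UINT32_MAX if no graphics family exists */
}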
Note
A single device may report multiple similar queue families rather than, or as well as, reporting multiple members of one or more of those families. This indicates that while members of those families have similar capabilities, they are not directly compatible with one another.
Device memory is explicitly managed by the application. Each device may advertise one or more heaps, representing different areas of memory. Memory heaps are either device local or host local, but are always visible to the device. Further detail about memory heaps is exposed via memory types available on that heap. Examples of memory areas that may be available on an implementation include:
- device local is memory that is physically connected to the device.
- device local, host visible is device local memory that is visible to the host.
- host local, host visible is memory that is local to the host and visible to the device and host.
On other architectures, there may only be a single heap that can be used for any purpose.
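The heaps and types an implementation exposes can be inspected with vkGetPhysicalDeviceMemoryProperties. The following sketch (not normative; the helper name and parameters are assumptions) shows the common pattern of picking a memory type that is both allowed by a resource's requirements and has the desired property flags.

#include <stdint.h>
#include <vulkan/vulkan.h>

/* Sketch: select a memory type index that is allowed by typeBits
 * (from VkMemoryRequirements::memoryTypeBits) and has the requested
 * property flags, e.g. VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT. */
uint32_t choose_memory_type(VkPhysicalDevice physicalDevice,
                            uint32_t typeBits,
                            VkMemoryPropertyFlags required)
{
    VkPhysicalDeviceMemoryProperties mem;
    vkGetPhysicalDeviceMemoryProperties(physicalDevice, &mem);

    for (uint32_t i = 0; i < mem.memoryTypeCount; ++i) {
        int allowed   = (typeBits & (1u << i)) != 0;
        int has_props = (mem.memoryTypes[i].propertyFlags & required) == required;
        if (allowed && has_props)
            return i;
    }
    return UINT32_MAX;   /* no suitable memory type found */
}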
A Vulkan application controls a set of devices through the submission of command buffers which have recorded device commands issued via Vulkan library calls. The content of command buffers is specific to the underlying implementation and is opaque to the application. Once constructed, a command buffer can be submitted once or many times to a queue for execution. Multiple command buffers can be built in parallel by employing multiple threads within the application.
Command buffers submitted to different queues may execute in parallel or even out of order with respect to one another. Command buffers submitted to a single queue respect submission order, as described further in the synchronization chapter. Command buffer execution by the device is also asynchronous to host execution. Once a command buffer is submitted to a queue, control may return to the application immediately. Synchronization between the device and host, and between different queues, is the responsibility of the application.
2.2.1. Queue Operation
Vulkan queues provide an interface to the execution engines of a device. Commands for these execution engines are recorded into command buffers ahead of execution time. These command buffers are then submitted to queues with a queue submission command for execution in a number of batches. Once submitted to a queue, these commands will begin and complete execution without further application intervention, though the order of this execution is dependent on a number of implicit and explicit ordering constraints.
Work is submitted to queues using queue submission commands that typically take the form vkQueue* (e.g. vkQueueSubmit, vkQueueBindSparse), and optionally take a list of semaphores upon which to wait before work begins and a list of semaphores to signal once work has completed. The work itself, as well as signaling and waiting on the semaphores, are all queue operations.
Queue operations on different queues have no implicit ordering constraints, and may execute in any order. Explicit ordering constraints between queues can be expressed with semaphores and fences.
Command buffer submissions to a single queue respect submission order and other implicit ordering guarantees, but otherwise may overlap or execute out of order. Other types of batches and queue submissions against a single queue (e.g. sparse memory binding) have no implicit ordering constraints with any other queue submission or batch. Additional explicit ordering constraints between queue submissions and individual batches can be expressed with semaphores and fences.
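As a non-normative sketch of one such batch, a VkSubmitInfo names the semaphores to wait on (and the pipeline stages at which to wait), the command buffers to execute, and the semaphores to signal when the batch completes; all handles and the helper name are assumed for this example.

#include <vulkan/vulkan.h>

/* Sketch: submit one batch that waits on waitSem, executes cmdBuf,
 * and signals signalSem. All handles are assumed to be valid. */
VkResult submit_one_batch(VkQueue queue,
                          VkCommandBuffer cmdBuf,
                          VkSemaphore waitSem,
                          VkSemaphore signalSem,
                          VkFence fence)    /* may be VK_NULL_HANDLE */
{
    VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;

    VkSubmitInfo submit = {
        .sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO,
        .waitSemaphoreCount   = 1,
        .pWaitSemaphores      = &waitSem,
        .pWaitDstStageMask    = &waitStage,
        .commandBufferCount   = 1,
        .pCommandBuffers      = &cmdBuf,
        .signalSemaphoreCount = 1,
        .pSignalSemaphores    = &signalSem,
    };

    /* The fence (if any) is signaled once all batches in this submission
     * have completed execution. */
    return vkQueueSubmit(queue, 1, &submit, fence);
}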
Before a fence or semaphore is signaled, it is guaranteed that any previously submitted queue operations have completed execution, and that memory writes from those queue operations are available to future queue operations. Waiting on a signaled semaphore or fence guarantees that previous writes that are available are also visible to subsequent commands.
Command buffer boundaries, both between primary command buffers of the same or different batches or submissions as well as between primary and secondary command buffers, do not introduce any additional ordering constraints. In other words, submitting the set of command buffers (which can include executing secondary command buffers) between any semaphore or fence operations executes the recorded commands as if they had all been recorded into a single primary command buffer, except that the current state is reset on each boundary. Explicit ordering constraints can be expressed with explicit synchronization primitives.
There are a few implicit ordering guarantees between commands within a command buffer, but only covering a subset of execution. Additional explicit ordering constraints can be expressed with the various explicit synchronization primitives.
Note
Implementations have significant freedom to overlap execution of work submitted to a queue, and this is common due to deep pipelining and parallelism in Vulkan devices.
Commands recorded in command buffers either perform actions (draw, dispatch, clear, copy, query/timestamp operations, begin/end subpass operations), set state (bind pipelines, descriptor sets, and buffers, set dynamic state, push constants, set render pass/subpass state), or perform synchronization (set/wait events, pipeline barrier, render pass/subpass dependencies). Some commands perform more than one of these tasks. State setting commands update the current state of the command buffer. Some commands that perform actions (e.g. draw/dispatch) do so based on the current state set cumulatively since the start of the command buffer. The work involved in performing action commands is often allowed to overlap or to be reordered, but doing so must not alter the state to be used by each action command. In general, action commands are those commands that alter framebuffer attachments, read/write buffer or image memory, or write to query pools.
Synchronization commands introduce explicit execution and memory dependencies between two sets of action commands, where the second set of commands depends on the first set of commands. These dependencies enforce that both the execution of certain pipeline stages in the later set occur after the execution of certain stages in the source set, and that the effects of memory accesses performed by certain pipeline stages occur in order and are visible to each other. When not enforced by an explicit dependency or implicit ordering guarantees, action commands may overlap execution or execute out of order, and may not see the side effects of each other’s memory accesses.
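For illustration only, the sketch below records a buffer memory barrier between a transfer write and a subsequent fragment-shader read of the same buffer; the handles, the chosen stages and access masks, and the helper name are assumptions of this example rather than part of the API definition.

#include <vulkan/vulkan.h>

/* Sketch: make a transfer write to `buffer` available and visible to
 * subsequent fragment-shader reads recorded in the same command buffer. */
void barrier_transfer_to_shader_read(VkCommandBuffer cmdBuf, VkBuffer buffer)
{
    VkBufferMemoryBarrier barrier = {
        .sType               = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
        .srcAccessMask       = VK_ACCESS_TRANSFER_WRITE_BIT,
        .dstAccessMask       = VK_ACCESS_SHADER_READ_BIT,
        .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .buffer              = buffer,
        .offset              = 0,
        .size                = VK_WHOLE_SIZE,
    };

    vkCmdPipelineBarrier(cmdBuf,
                         VK_PIPELINE_STAGE_TRANSFER_BIT,        /* first set: transfer */
                         VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, /* second set: fragment shading */
                         0,                                     /* no dependency flags */
                         0, NULL,                               /* no global memory barriers */
                         1, &barrier,                           /* one buffer barrier */
                         0, NULL);                              /* no image barriers */
}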
The device executes queue operations asynchronously with respect to the host. Control is returned to an application immediately following command buffer submission to a queue. The application must synchronize work between the host and device as needed.
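A minimal, non-normative sketch of such host-device synchronization: the host either waits on a fence that was passed to a prior queue submission, or drains the queue entirely; handles and the helper name are assumed.

#include <vulkan/vulkan.h>

/* Sketch: block the host until previously submitted work completes. */
void wait_for_device_work(VkDevice device, VkQueue queue, VkFence fence)
{
    /* Wait (up to one second here) for the fence signaled by a prior submission. */
    vkWaitForFences(device, 1, &fence, VK_TRUE, 1000000000ull);

    /* Alternatively, a coarser option: wait until the queue is idle. */
    vkQueueWaitIdle(queue);
}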
2.3. Object Model
The devices, queues, and other entities in Vulkan are represented by Vulkan objects. At the API level, all objects are referred to by handles. There are two classes of handles, dispatchable and non-dispatchable. Dispatchable handle types are a pointer to an opaque type. This pointer may be used by layers as part of intercepting API commands, and thus each API command takes a dispatchable type as its first parameter. Each object of a dispatchable type must have a unique handle value during its lifetime.
Non-dispatchable handle types are a 64-bit integer type whose meaning is implementation-dependent, and may encode object information directly in the handle rather than acting as a reference to an underlying object. Objects of a non-dispatchable type may not have unique handle values within a type or across types. If handle values are not unique, then destroying one such handle must not cause identical handles of other types to become invalid, and must not cause identical handles of the same type to become invalid if that handle value has been created more times than it has been destroyed.
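The Vulkan headers express this distinction with handle-definition macros roughly like the following; this is a simplified sketch of what vulkan_core.h does, and the real header's 64-bit platform detection covers more cases than shown here.

#include <stdint.h>

/* Dispatchable handles are pointers to opaque, per-object types. */
#define VK_DEFINE_HANDLE(object) typedef struct object##_T* object;

/* Non-dispatchable handles are opaque 64-bit values; on 64-bit platforms
 * they are declared as distinct pointer types to improve type safety. */
#if defined(__LP64__) || defined(_WIN64)
  #define VK_DEFINE_NON_DISPATCHABLE_HANDLE(object) typedef struct object##_T* object;
#else
  #define VK_DEFINE_NON_DISPATCHABLE_HANDLE(object) typedef uint64_t object;
#endif

VK_DEFINE_HANDLE(VkDevice)                    /* dispatchable */
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkBuffer)   /* non-dispatchable */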
All objects created or allocated from a VkDevice (i.e. with a VkDevice as the first parameter) are private to that device, and must not be used on other devices.
2.3.1. Object Lifetime
Objects are created or allocated by vkCreate* and vkAllocate* commands, respectively. Once an object is created or allocated, its “structure” is considered to be immutable, though the contents of certain object types are still free to change. Objects are destroyed or freed by vkDestroy* and vkFree* commands, respectively.
Objects that are allocated (rather than created) take resources from an existing pool object or memory heap, and when freed return resources to that pool or heap. While object creation and destruction are generally expected to be low-frequency occurrences during runtime, allocating and freeing objects can occur at high frequency. Pool objects help accommodate improved performance of the allocations and frees.
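As a non-normative sketch of this pool pattern, command buffers are allocated from a VkCommandPool and implicitly return their resources to it when freed or when the pool is reset or destroyed; the device handle, queue family index, and helper name are assumptions of this example.

#include <vulkan/vulkan.h>

/* Sketch: create a command pool and allocate a primary command buffer from it. */
VkResult allocate_one_command_buffer(VkDevice device,
                                     uint32_t queueFamilyIndex,
                                     VkCommandPool *outPool,
                                     VkCommandBuffer *outCmdBuf)
{
    VkCommandPoolCreateInfo poolInfo = {
        .sType            = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO,
        .flags            = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT,
        .queueFamilyIndex = queueFamilyIndex,
    };
    VkResult result = vkCreateCommandPool(device, &poolInfo, NULL, outPool);
    if (result != VK_SUCCESS)
        return result;

    VkCommandBufferAllocateInfo allocInfo = {
        .sType              = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
        .commandPool        = *outPool,
        .level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
        .commandBufferCount = 1,
    };
    /* Allocation draws resources from the pool; vkFreeCommandBuffers (or
     * destroying the pool) returns them to it. */
    return vkAllocateCommandBuffers(device, &allocInfo, outCmdBuf);
}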
It is an application’s responsibility to track the lifetime of Vulkan objects, and not to destroy them while they are still in use.
The ownership of application-owned memory is immediately acquired by any Vulkan command it is passed into. Ownership of such memory must be released back to the application at the end of the duration of the command, so that the application can alter or free this memory as soon as all the commands that acquired it have returned.
The following object types are consumed when they are passed into a Vulkan command and not further accessed by the objects they are used to create. They must not be destroyed in the duration of any API command they are passed into:
- VkShaderModule
- VkPipelineCache
- VkValidationCacheEXT
A VkRenderPass object passed as a parameter to create another object is not further accessed by that object after the duration of the command it is passed into. A VkRenderPass used in a command buffer follows the rules described below.
A VkPipelineLayout object must not be destroyed while any command buffer that uses it is in the recording state.
VkDescriptorSetLayout objects may be accessed by commands that operate on descriptor sets allocated using that layout, and those descriptor sets must not be updated with vkUpdateDescriptorSets after the descriptor set layout has been destroyed. Otherwise, a VkDescriptorSetLayout object passed as a parameter to create another object is not further accessed by that object after the duration of the command it is passed into.
The application must not destroy any other type of Vulkan object until all uses of that object by the device (such as via command buffer execution) have completed.
The following Vulkan objects must not be destroyed while any command buffers using the object are in the pending state:
- VkEvent
- VkQueryPool
- VkBuffer
- VkBufferView
- VkImage
- VkImageView
- VkPipeline
- VkSampler
- VkDescriptorPool
- VkFramebuffer
- VkRenderPass
- VkCommandBuffer
- VkCommandPool
- VkDeviceMemory
- VkDescriptorSet
- VkObjectTableNVX
- VkIndirectCommandsLayoutNVX
Destroying these objects will move any command buffers that are in the recording or executable state, and are using those objects, to the invalid state.
The following Vulkan objects must not be destroyed while any queue is executing commands that use the object:
- VkFence
- VkSemaphore
- VkCommandBuffer
- VkCommandPool
In general, objects can be destroyed or freed in any order, even if the object being freed is involved in the use of another object (e.g. use of a resource in a view, use of a view in a descriptor set, use of an object in a command buffer, binding of a memory allocation to a resource), as long as any object that uses the freed object is not further used in any way except to be destroyed or to be reset in such a way that it no longer uses the other object (such as resetting a command buffer). If the object has been reset, then it can be used as if it never used the freed object. An exception to this is when there is a parent/child relationship between objects. In this case, the application must not destroy a parent object before its children, except when the parent is explicitly defined to free its children when it is destroyed (e.g. for pool objects, as defined below).
VkCommandPool objects are parents of VkCommandBuffer objects. VkDescriptorPool objects are parents of VkDescriptorSet objects. VkDevice objects are parents of many object types (all that take a VkDevice as a parameter to their creation).
The following Vulkan objects have specific restrictions for when they can be destroyed:
- VkQueue objects cannot be explicitly destroyed. Instead, they are implicitly destroyed when the VkDevice object they are retrieved from is destroyed.
- Destroying a pool object implicitly frees all objects allocated from that pool. Specifically, destroying VkCommandPool frees all VkCommandBuffer objects that were allocated from it, and destroying VkDescriptorPool frees all VkDescriptorSet objects that were allocated from it.
- VkDevice objects can be destroyed when all VkQueue objects retrieved from them are idle, and all objects created from them have been destroyed. This includes the following objects:
  - VkFence
  - VkSemaphore
  - VkEvent
  - VkQueryPool
  - VkBuffer
  - VkBufferView
  - VkImage
  - VkImageView
  - VkShaderModule
  - VkPipelineCache
  - VkPipeline
  - VkPipelineLayout
  - VkSampler
  - VkDescriptorSetLayout
  - VkDescriptorPool
  - VkFramebuffer
  - VkRenderPass
  - VkCommandPool
  - VkCommandBuffer
  - VkDeviceMemory
  - VkValidationCacheEXT
- VkPhysicalDevice objects cannot be explicitly destroyed. Instead, they are implicitly destroyed when the VkInstance object they are retrieved from is destroyed.
- VkInstance objects can be destroyed once all VkDevice objects created from any of its VkPhysicalDevice objects have been destroyed.
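Put together, a typical shutdown sequence waits for the device to become idle, destroys child objects and pools, then destroys the device and finally the instance. The following sketch is illustrative only; the particular handles and the function name are assumed.

#include <vulkan/vulkan.h>

/* Sketch: tear down a few representative objects in a legal order.
 * All handles are assumed to have been created from `device`/`instance`. */
void shutdown(VkInstance instance, VkDevice device,
              VkCommandPool cmdPool, VkBuffer buffer, VkDeviceMemory memory)
{
    /* Ensure no queue retrieved from the device is still executing work. */
    vkDeviceWaitIdle(device);

    /* Destroying the pool implicitly frees command buffers allocated from it. */
    vkDestroyCommandPool(device, cmdPool, NULL);

    /* Resources and their backing memory can then be released. */
    vkDestroyBuffer(device, buffer, NULL);
    vkFreeMemory(device, memory, NULL);

    /* The device may be destroyed once all its children are gone... */
    vkDestroyDevice(device, NULL);

    /* ...and the instance once all devices created from it are destroyed. */
    vkDestroyInstance(instance, NULL);
}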
2.3.2. External Object Handles
As defined above, the scope of object handles created or allocated from a VkDevice is limited to that logical device. Objects which are not in scope are said to be external. To bring an external object into scope, an external handle must be exported from the object in the source scope and imported into the destination scope.
Note
The scope of external handles and their associated resources may vary according to their type, but they can generally be shared across process and API boundaries.
2.4. Application Binary Interface
The mechanism by which Vulkan is made available to applications is platform- or implementation-defined. On many platforms the C interface described in this Specification is provided by a shared library. Since shared libraries can be changed independently of the applications that use them, they present particular compatibility challenges, and this Specification places some requirements on them.
Shared library implementations must use the default Application Binary Interface (ABI) of the standard C compiler for the platform, or provide customized API headers that cause application code to use the implementation’s non-default ABI. An ABI in this context means the size, alignment, and layout of C data types; the procedure calling convention; and the naming convention for shared library symbols corresponding to C functions.
Customizing the calling convention for a platform is usually accomplished by defining calling convention macros appropriately in vk_platform.h.
On platforms where Vulkan is provided as a shared library, library symbols beginning with “vk” and followed by a digit or uppercase letter are reserved for use by the implementation. Applications which use Vulkan must not provide definitions of these symbols. This allows the Vulkan shared library to be updated with additional symbols for new API versions or extensions without causing symbol conflicts with existing applications.
Shared library implementations should provide library symbols for commands in the highest version of this Specification they support, and for Window System Integration extensions relevant to the platform. They may also provide library symbols for commands defined by additional extensions.
Note
These requirements and recommendations are intended to allow implementors to take advantage of platform-specific conventions for SDKs, ABIs, library versioning mechanisms, etc. while still minimizing the code changes necessary to port applications or libraries between platforms. Platform vendors, or providers of the de facto standard Vulkan shared library for a platform, are encouraged to document what symbols the shared library provides and how it will be versioned when new symbols are added. Applications should only rely on shared library symbols for commands in the minimum core version required by the application. vkGetInstanceProcAddr and vkGetDeviceProcAddr should be used to obtain function pointers for commands in core versions beyond the application’s minimum required version.
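For example, an application whose minimum requirement is Vulkan 1.0 could rely on the 1.0 symbols exported by the loader library and fetch newer commands dynamically. The sketch below is illustrative; the helper name is an assumption, and the command being queried (vkGetPhysicalDeviceFeatures2, a Vulkan 1.1 command) is only available if the instance supports it.

#include <vulkan/vulkan.h>

/* Sketch: obtain a Vulkan 1.1 command at runtime instead of relying on
 * the shared library exporting its symbol. */
PFN_vkGetPhysicalDeviceFeatures2 load_get_features2(VkInstance instance)
{
    /* vkGetInstanceProcAddr returns NULL if the command is not supported. */
    return (PFN_vkGetPhysicalDeviceFeatures2)
        vkGetInstanceProcAddr(instance, "vkGetPhysicalDeviceFeatures2");
}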
2.5. Command Syntax and Duration
The Specification describes Vulkan commands as functions or procedures using C99 syntax. Language bindings for other languages such as C++ and JavaScript may allow for stricter parameter passing, or object-oriented interfaces.
Vulkan uses the standard C types for the base type of scalar parameters (e.g. types from <stdint.h>), with exceptions described below, or elsewhere in the text when appropriate:
VkBool32 represents boolean True and False values, since C does not have a sufficiently portable built-in boolean type:
typedef uint32_t VkBool32;
VK_TRUE represents a boolean True (integer 1) value, and VK_FALSE a boolean False (integer 0) value. All values returned from a Vulkan implementation in a VkBool32 will be either VK_TRUE or VK_FALSE. Applications must not pass any other values than VK_TRUE or VK_FALSE into a Vulkan implementation where a VkBool32 is expected.
VkDeviceSize represents device memory size and offset values:
typedef uint64_t VkDeviceSize;
Commands that create Vulkan objects are of the form vkCreate* and take Vk*CreateInfo structures with the parameters needed to create the object. These Vulkan objects are destroyed with commands of the form vkDestroy*. The last in-parameter to each command that creates or destroys a Vulkan object is pAllocator. The pAllocator parameter can be set to a non-NULL value such that allocations for the given object are delegated to an application provided callback; refer to the Memory Allocation chapter for further details.
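A minimal, non-normative sketch of this create/destroy pattern with the default allocator (pAllocator set to NULL); the device handle and helper name are assumed.

#include <vulkan/vulkan.h>

/* Sketch: create and destroy a fence, letting the implementation use its
 * own host memory allocator by passing NULL for pAllocator. */
VkResult create_and_destroy_fence(VkDevice device)
{
    VkFenceCreateInfo createInfo = {
        .sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO,
        .flags = VK_FENCE_CREATE_SIGNALED_BIT,   /* start in the signaled state */
    };

    VkFence fence = VK_NULL_HANDLE;
    VkResult result = vkCreateFence(device, &createInfo, NULL, &fence);
    if (result != VK_SUCCESS)
        return result;

    vkDestroyFence(device, fence, NULL);
    return VK_SUCCESS;
}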
Commands that allocate Vulkan objects owned by pool objects are of the form vkAllocate*, and take Vk*AllocateInfo structures. These Vulkan objects are freed with commands of the form vkFree*. These objects do not take allocators; if host memory is needed, they will use the allocator that was specified when their parent pool was created.
Commands are recorded into a command buffer by calling API commands of the form vkCmd*. Each such command may have different restrictions on where it can be used: in a primary and/or secondary command buffer, inside and/or outside a render pass, and in one or more of the supported queue types. These restrictions are documented together with the definition of each such command.
The duration of a Vulkan command refers to the interval between calling the command and its return to the caller.
2.5.1. Lifetime of Retrieved Results
Information is retrieved from the implementation with commands of the form vkGet* and vkEnumerate*.
Unless otherwise specified for an individual command, the results are invariant; that is, they will remain unchanged when retrieved again by calling the same command with the same parameters, so long as those parameters themselves all remain valid.
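The vkEnumerate* commands follow a two-call idiom: the first call reports the number of results, and the second fills an application-provided array. The sketch below is illustrative only; the helper name and the decision to allocate with malloc are assumptions.

#include <stdint.h>
#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Sketch: enumerate physical devices with the usual two-call pattern.
 * The caller owns and must free the returned array. */
VkPhysicalDevice *enumerate_devices(VkInstance instance, uint32_t *outCount)
{
    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, NULL);     /* query the count */

    VkPhysicalDevice *devices = malloc(count * sizeof(VkPhysicalDevice));
    vkEnumeratePhysicalDevices(instance, &count, devices);  /* fill the array */

    *outCount = count;
    return devices;
}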
2.6. Threading Behavior
Vulkan is intended to provide scalable performance when used on multiple host threads. All commands support being called concurrently from multiple threads, but certain parameters, or components of parameters are defined to be externally synchronized. This means that the caller must guarantee that no more than one thread is using such a parameter at a given time.
More precisely, Vulkan commands use simple stores to update the state of Vulkan objects. A parameter declared as externally synchronized may have its contents updated at any time during the host execution of the command. If two commands operate on the same object and at least one of the commands declares the object to be externally synchronized, then the caller must guarantee not only that the commands do not execute simultaneously, but also that the two commands are separated by an appropriate memory barrier (if needed).
Note
Memory barriers are particularly relevant for hosts based on the ARM CPU architecture, which is more weakly ordered than many developers are accustomed to from x86/x64 programming. Fortunately, most higher-level synchronization primitives (like the pthread library) perform memory barriers as a part of mutual exclusion, so mutexing Vulkan objects via these primitives will have the desired effect.
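One way an application might guard an externally synchronized parameter is sketched below, wrapping command buffer recording (and, implicitly, use of its parent pool) in a pthread mutex; the mutex, handles, and helper name are assumptions of this example.

#include <pthread.h>
#include <vulkan/vulkan.h>

/* Sketch: command buffers (and, implicitly, the pool they came from) are
 * externally synchronized, so recording is wrapped in a host mutex. */
VkResult record_empty_commands(pthread_mutex_t *lock, VkCommandBuffer cmdBuf)
{
    VkCommandBufferBeginInfo beginInfo = {
        .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
        .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
    };

    pthread_mutex_lock(lock);          /* also provides the needed memory barrier */
    VkResult result = vkBeginCommandBuffer(cmdBuf, &beginInfo);
    if (result == VK_SUCCESS)
        result = vkEndCommandBuffer(cmdBuf);
    pthread_mutex_unlock(lock);

    return result;
}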
Similarly, the application must avoid any potential data hazard of application-owned memory that has its ownership temporarily acquired by a Vulkan command. While the ownership of application-owned memory remains acquired by a command, the implementation may read the memory at any point, and it may write non-const qualified memory at any point. Parameters referring to non-const qualified application-owned memory are not marked explicitly as externally synchronized in the Specification.
Many object types are immutable, meaning the objects cannot change once they have been created. These types of objects never need external synchronization, except that they must not be destroyed while they are in use on another thread.
In certain special cases mutable object parameters are internally synchronized, making external synchronization unnecessary. One example of this is the use of a VkPipelineCache in vkCreateGraphicsPipelines and vkCreateComputePipelines, where external synchronization around such a heavyweight command would be impractical. The implementation must internally synchronize the cache in this example, and may be able to do so in the form of a much finer-grained mutex around the command.
Any command parameters that are not labeled as externally synchronized are either not mutated by the command or are internally synchronized. Additionally, certain objects related to a command’s parameters (e.g. command pools and descriptor pools) may be affected by a command, and must also be externally synchronized. These implicit parameters are documented as described below.
Parameters of commands that are externally synchronized are listed below.
There are also a few instances where a command can take in a user allocated list whose contents are externally synchronized parameters. In these cases, the caller must guarantee that at most one thread is using a given element within the list at a given time. These parameters are listed below.
In addition, there are some implicit parameters that need to be externally synchronized. For example, all commandBuffer parameters that need to be externally synchronized imply that the commandPool that was passed in when creating that command buffer also needs to be externally synchronized. The implicit parameters and their associated object are listed below.
2.7. Errors
Vulkan is a layered API. The lowest layer is the core Vulkan layer, as defined by this Specification. The application can use additional layers above the core for debugging, validation, and other purposes.
One of the core principles of Vulkan is that building and submitting command buffers should be highly efficient. Thus error checking and validation of state in the core layer is minimal, although more rigorous validation can be enabled through the use of layers.
The core layer assumes applications are using the API correctly. Except as documented elsewhere in the Specification, the behavior of the core layer to an application using the API incorrectly is undefined, and may include program termination. However, implementations must ensure that incorrect usage by an application does not affect the integrity of the operating system, the Vulkan implementation, or other Vulkan client applications in the system. In particular, any guarantees made by an operating system about whether memory from one process can be visible to another process or not must not be violated by a Vulkan implementation for any memory allocation. Vulkan implementations are not required to make additional security or integrity guarantees beyond those provided by the OS unless explicitly directed by the application’s use of a particular feature or extension (e.g. via robust buffer access).
Note
For instance, if an operating system guarantees that data in all its memory allocations are set to zero when newly allocated, the Vulkan implementation must make the same guarantees for any allocations it controls (e.g. VkDeviceMemory).
Applications can request stronger robustness guarantees by enabling the robustBufferAccess feature as described in Features, Limits, and Formats.
Validation of correct API usage is left to validation layers. Applications should be developed with validation layers enabled, to help catch and eliminate errors. Once validated, released applications should not enable validation layers by default.
2.7.1. Valid Usage
Valid usage defines a set of conditions which must be met in order to achieve well-defined run-time behavior in an application. These conditions depend only on Vulkan state, and the parameters or objects whose usage is constrained by the condition.
Some valid usage conditions have dependencies on run-time limits or feature availability. It is possible to validate these conditions against Vulkan’s minimum supported values for these limits and features, or some subset of other known values.
Valid usage conditions do not cover conditions where well-defined behavior (including returning an error code) exists.
Valid usage conditions should apply to the command or structure where complete information about the condition would be known during execution of an application. This is so that a validation layer or linter can be written directly against these statements at the point they are specified.
Note
This does lead to some non-obvious places for valid usage statements. For instance, the valid values for a structure might depend on a separate value in the calling command. In this case, the structure itself will not reference this valid usage, as validity cannot be determined from the structure alone - instead, this valid usage is attached to the calling command. Another example is draw state - the state setters are independent, and can cause a legitimately invalid state configuration between draw calls; so the valid usage statements are attached to the place where all state needs to be valid - at the draw command.
Valid usage conditions are described in a block labelled “Valid Usage” following each command or structure they apply to.
2.7.2. Implicit Valid Usage
Some valid usage conditions apply to all commands and structures in the API, unless explicitly denoted otherwise for a specific command or structure. These conditions are considered implicit, and are described in a block labelled “Valid Usage (Implicit)” following each command or structure they apply to. Implicit valid usage conditions are described in detail below.
Valid Usage for Object Handles
Any input parameter to a command that is an object handle must be a valid object handle, unless otherwise specified. An object handle is valid if:
- It has been created or allocated by a previous, successful call to the API. Such calls are noted in the Specification.
- It has not been deleted or freed by a previous call to the API. Such calls are noted in the Specification.
- Any objects used by that object, either as part of creation or execution, must also be valid.
The reserved values VK_NULL_HANDLE and NULL can be used in place of valid non-dispatchable handles and dispatchable handles, respectively, when explicitly called out in the Specification. Any command that creates an object successfully must not return these values. It is valid to pass these values to vkDestroy* or vkFree* commands, which will silently ignore these values.
Valid Usage for Pointers
Any parameter that is a pointer must be a valid pointer only if it is explicitly called out by a Valid Usage statement.
A pointer is “valid” if it points at memory containing values of the number and type(s) expected by the command, and all fundamental types accessed through the pointer (e.g. as elements of an array or as members of a structure) satisfy the alignment requirements of the host processor.
Valid Usage for Strings
Any parameter that is a pointer to char must be a finite sequence of values terminated by a null character, or if explicitly called out in the Specification, can be NULL.
Valid Usage for Enumerated Types
Any parameter of an enumerated type must be a valid enumerant for that type. An enumerant is valid if:
- The enumerant is defined as part of the enumerated type.
- The enumerant is not one of the special values defined for the enumerated type, which are suffixed with _BEGIN_RANGE, _END_RANGE, _RANGE_SIZE or _MAX_ENUM [1].
[1] The meaning of these special tokens is not exposed in the Vulkan Specification. They are not part of the API, and they should not be used by applications. Their original intended use was for internal consumption by Vulkan implementations. Even that use will no longer be supported in the future, but they will be retained for backwards compatibility reasons.
Any enumerated type returned from a query command or otherwise output from Vulkan to the application must not have a reserved value. Reserved values are values not defined by any extension for that enumerated type.
Note
This language is intended to accommodate cases such as “hidden” extensions known only to driver internals, or layers enabling extensions without knowledge of the application, without allowing return of values not defined by any extension.
Valid Usage for Flags
A collection of flags is represented by a bitmask using the type VkFlags:
typedef uint32_t VkFlags;
Bitmasks are passed to many commands and structures to compactly represent options, but VkFlags is not used directly in the API. Instead, a Vk*Flags type which is an alias of VkFlags, and whose name matches the corresponding Vk*FlagBits that are valid for that type, is used.
Any Vk*Flags member or parameter used in the API as an input must be a valid combination of bit flags. A valid combination is either zero or the bitwise OR of valid bit flags. A bit flag is valid if:
- The bit flag is defined as part of the Vk*FlagBits type, where the bits type is obtained by taking the flag type and replacing the trailing Flags with FlagBits. For example, a flag value of type VkColorComponentFlags must contain only bit flags defined by VkColorComponentFlagBits.
- The flag is allowed in the context in which it is being used. For example, in some cases, certain bit flags or combinations of bit flags are mutually exclusive.
Any Vk*Flags member or parameter returned from a query command or otherwise output from Vulkan to the application may contain bit flags undefined in its corresponding Vk*FlagBits type. An application cannot rely on the state of these unspecified bits.
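For instance, a VkBufferUsageFlags value is built by OR-ing bits defined in VkBufferUsageFlagBits; the constants below are a small illustrative sketch, and their names are assumptions of this example.

#include <vulkan/vulkan.h>

/* Sketch: a valid combination of buffer usage flags is zero or the bitwise
 * OR of bits defined in VkBufferUsageFlagBits. */
static const VkBufferUsageFlags kStagingSourceUsage =
    VK_BUFFER_USAGE_TRANSFER_SRC_BIT;

static const VkBufferUsageFlags kVertexBufferUsage =
    VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;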
Valid Usage for Structure Types
Any parameter that is a structure containing a sType member must have a value of sType which is a valid VkStructureType value matching the type of the structure.
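For example, the sType member of a VkApplicationInfo must be set to VK_STRUCTURE_TYPE_APPLICATION_INFO, as in this sketch (the application and engine names and versions are placeholders):

#include <vulkan/vulkan.h>

/* Sketch: every structure with an sType member must set it to the
 * VkStructureType value that names that structure. */
static const VkApplicationInfo kAppInfo = {
    .sType              = VK_STRUCTURE_TYPE_APPLICATION_INFO,
    .pNext              = NULL,
    .pApplicationName   = "example",
    .applicationVersion = 1,
    .pEngineName        = "example-engine",
    .engineVersion      = 1,
    .apiVersion         = VK_API_VERSION_1_1,
};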
Structure types supported by the Vulkan API include:
typedef enum VkStructureType {
VK_STRUCTURE_TYPE_APPLICATION_INFO = 0,
VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO = 1,
VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO = 2,
VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO = 3,
VK_STRUCTURE_TYPE_SUBMIT_INFO = 4,
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO = 5,
VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE = 6,
VK_STRUCTURE_TYPE_BIND_SPARSE_INFO = 7,
VK_STRUCTURE_TYPE_FENCE_CREATE_INFO = 8,
VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO = 9,
VK_STRUCTURE_TYPE_EVENT_CREATE_INFO = 10,
VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO = 11,
VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO = 12,
VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO = 13,
VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO = 14,
VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO = 15,
VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO = 16,
VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO = 17,
VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO = 18,
VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO = 19,
VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO = 20,
VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO = 21,
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO = 22,
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO = 23,
VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO = 24,
VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO = 25,
VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO = 26,
VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO = 27,
VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO = 28,
VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO = 29,
VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO = 30,
VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO = 31,
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO = 32,
VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO = 33,
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO = 34,
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET = 35,
VK_STRUCTURE_TYPE_COPY_DESCRIPTOR_SET = 36,
VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO = 37,
VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO = 38,
VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO = 39,
VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO = 40,
VK_STRUCTURE_TYPE_COMMAND_BUFFER_INHERITANCE_INFO = 41,
VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO = 42,
VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO = 43,
VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER = 44,
VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER = 45,
VK_STRUCTURE_TYPE_MEMORY_BARRIER = 46,
VK_STRUCTURE_TYPE_LOADER_INSTANCE_CREATE_INFO = 47,
VK_STRUCTURE_TYPE_LOADER_DEVICE_CREATE_INFO = 48,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES = 1000094000,
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO = 1000157000,
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO = 1000157001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES = 1000083000,
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS = 1000127000,
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO = 1000127001,
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_FLAGS_INFO = 1000060000,
VK_STRUCTURE_TYPE_DEVICE_GROUP_RENDER_PASS_BEGIN_INFO = 1000060003,
VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO = 1000060004,
VK_STRUCTURE_TYPE_DEVICE_GROUP_SUBMIT_INFO = 1000060005,
VK_STRUCTURE_TYPE_DEVICE_GROUP_BIND_SPARSE_INFO = 1000060006,
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_DEVICE_GROUP_INFO = 1000060013,
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_DEVICE_GROUP_INFO = 1000060014,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES = 1000070000,
VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO = 1000070001,
VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2 = 1000146000,
VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2 = 1000146001,
VK_STRUCTURE_TYPE_IMAGE_SPARSE_MEMORY_REQUIREMENTS_INFO_2 = 1000146002,
VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2 = 1000146003,
VK_STRUCTURE_TYPE_SPARSE_IMAGE_MEMORY_REQUIREMENTS_2 = 1000146004,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2 = 1000059000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2 = 1000059001,
VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2 = 1000059002,
VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2 = 1000059003,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2 = 1000059004,
VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2 = 1000059005,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2 = 1000059006,
VK_STRUCTURE_TYPE_SPARSE_IMAGE_FORMAT_PROPERTIES_2 = 1000059007,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SPARSE_IMAGE_FORMAT_INFO_2 = 1000059008,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_POINT_CLIPPING_PROPERTIES = 1000117000,
VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO = 1000117001,
VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO = 1000117002,
VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_DOMAIN_ORIGIN_STATE_CREATE_INFO = 1000117003,
VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO = 1000053000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_FEATURES = 1000053001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PROPERTIES = 1000053002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES = 1000120000,
VK_STRUCTURE_TYPE_PROTECTED_SUBMIT_INFO = 1000145000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_FEATURES = 1000145001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_PROPERTIES = 1000145002,
VK_STRUCTURE_TYPE_DEVICE_QUEUE_INFO_2 = 1000145003,
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO = 1000156000,
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO = 1000156001,
VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO = 1000156002,
VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO = 1000156003,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_YCBCR_CONVERSION_FEATURES = 1000156004,
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_IMAGE_FORMAT_PROPERTIES = 1000156005,
VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO = 1000085000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO = 1000071000,
VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES = 1000071001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_BUFFER_INFO = 1000071002,
VK_STRUCTURE_TYPE_EXTERNAL_BUFFER_PROPERTIES = 1000071003,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES = 1000071004,
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_BUFFER_CREATE_INFO = 1000072000,
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO = 1000072001,
VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO = 1000072002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_FENCE_INFO = 1000112000,
VK_STRUCTURE_TYPE_EXTERNAL_FENCE_PROPERTIES = 1000112001,
VK_STRUCTURE_TYPE_EXPORT_FENCE_CREATE_INFO = 1000113000,
VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO = 1000077000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_SEMAPHORE_INFO = 1000076000,
VK_STRUCTURE_TYPE_EXTERNAL_SEMAPHORE_PROPERTIES = 1000076001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MAINTENANCE_3_PROPERTIES = 1000168000,
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT = 1000168001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_DRAW_PARAMETER_FEATURES = 1000063000,
VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR = 1000001000,
VK_STRUCTURE_TYPE_PRESENT_INFO_KHR = 1000001001,
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_CAPABILITIES_KHR = 1000060007,
VK_STRUCTURE_TYPE_IMAGE_SWAPCHAIN_CREATE_INFO_KHR = 1000060008,
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_SWAPCHAIN_INFO_KHR = 1000060009,
VK_STRUCTURE_TYPE_ACQUIRE_NEXT_IMAGE_INFO_KHR = 1000060010,
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_INFO_KHR = 1000060011,
VK_STRUCTURE_TYPE_DEVICE_GROUP_SWAPCHAIN_CREATE_INFO_KHR = 1000060012,
VK_STRUCTURE_TYPE_DISPLAY_MODE_CREATE_INFO_KHR = 1000002000,
VK_STRUCTURE_TYPE_DISPLAY_SURFACE_CREATE_INFO_KHR = 1000002001,
VK_STRUCTURE_TYPE_DISPLAY_PRESENT_INFO_KHR = 1000003000,
VK_STRUCTURE_TYPE_XLIB_SURFACE_CREATE_INFO_KHR = 1000004000,
VK_STRUCTURE_TYPE_XCB_SURFACE_CREATE_INFO_KHR = 1000005000,
VK_STRUCTURE_TYPE_WAYLAND_SURFACE_CREATE_INFO_KHR = 1000006000,
VK_STRUCTURE_TYPE_ANDROID_SURFACE_CREATE_INFO_KHR = 1000008000,
VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR = 1000009000,
VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT = 1000011000,
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_RASTERIZATION_ORDER_AMD = 1000018000,
VK_STRUCTURE_TYPE_DEBUG_MARKER_OBJECT_NAME_INFO_EXT = 1000022000,
VK_STRUCTURE_TYPE_DEBUG_MARKER_OBJECT_TAG_INFO_EXT = 1000022001,
VK_STRUCTURE_TYPE_DEBUG_MARKER_MARKER_INFO_EXT = 1000022002,
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_IMAGE_CREATE_INFO_NV = 1000026000,
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_BUFFER_CREATE_INFO_NV = 1000026001,
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_MEMORY_ALLOCATE_INFO_NV = 1000026002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_FEATURES_EXT = 1000028000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_PROPERTIES_EXT = 1000028001,
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_STREAM_CREATE_INFO_EXT = 1000028002,
VK_STRUCTURE_TYPE_TEXTURE_LOD_GATHER_FORMAT_PROPERTIES_AMD = 1000041000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CORNER_SAMPLED_IMAGE_FEATURES_NV = 1000050000,
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO_NV = 1000056000,
VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_NV = 1000056001,
VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_NV = 1000057000,
VK_STRUCTURE_TYPE_EXPORT_MEMORY_WIN32_HANDLE_INFO_NV = 1000057001,
VK_STRUCTURE_TYPE_WIN32_KEYED_MUTEX_ACQUIRE_RELEASE_INFO_NV = 1000058000,
VK_STRUCTURE_TYPE_VALIDATION_FLAGS_EXT = 1000061000,
VK_STRUCTURE_TYPE_VI_SURFACE_CREATE_INFO_NN = 1000062000,
VK_STRUCTURE_TYPE_IMAGE_VIEW_ASTC_DECODE_MODE_EXT = 1000067000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ASTC_DECODE_FEATURES_EXT = 1000067001,
VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_KHR = 1000073000,
VK_STRUCTURE_TYPE_EXPORT_MEMORY_WIN32_HANDLE_INFO_KHR = 1000073001,
VK_STRUCTURE_TYPE_MEMORY_WIN32_HANDLE_PROPERTIES_KHR = 1000073002,
VK_STRUCTURE_TYPE_MEMORY_GET_WIN32_HANDLE_INFO_KHR = 1000073003,
VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR = 1000074000,
VK_STRUCTURE_TYPE_MEMORY_FD_PROPERTIES_KHR = 1000074001,
VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR = 1000074002,
VK_STRUCTURE_TYPE_WIN32_KEYED_MUTEX_ACQUIRE_RELEASE_INFO_KHR = 1000075000,
VK_STRUCTURE_TYPE_IMPORT_SEMAPHORE_WIN32_HANDLE_INFO_KHR = 1000078000,
VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_WIN32_HANDLE_INFO_KHR = 1000078001,
VK_STRUCTURE_TYPE_D3D12_FENCE_SUBMIT_INFO_KHR = 1000078002,
VK_STRUCTURE_TYPE_SEMAPHORE_GET_WIN32_HANDLE_INFO_KHR = 1000078003,
VK_STRUCTURE_TYPE_IMPORT_SEMAPHORE_FD_INFO_KHR = 1000079000,
VK_STRUCTURE_TYPE_SEMAPHORE_GET_FD_INFO_KHR = 1000079001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PUSH_DESCRIPTOR_PROPERTIES_KHR = 1000080000,
VK_STRUCTURE_TYPE_COMMAND_BUFFER_INHERITANCE_CONDITIONAL_RENDERING_INFO_EXT = 1000081000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CONDITIONAL_RENDERING_FEATURES_EXT = 1000081001,
VK_STRUCTURE_TYPE_CONDITIONAL_RENDERING_BEGIN_INFO_EXT = 1000081002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT16_INT8_FEATURES_KHR = 1000082000,
VK_STRUCTURE_TYPE_PRESENT_REGIONS_KHR = 1000084000,
VK_STRUCTURE_TYPE_OBJECT_TABLE_CREATE_INFO_NVX = 1000086000,
VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_CREATE_INFO_NVX = 1000086001,
VK_STRUCTURE_TYPE_CMD_PROCESS_COMMANDS_INFO_NVX = 1000086002,
VK_STRUCTURE_TYPE_CMD_RESERVE_SPACE_FOR_COMMANDS_INFO_NVX = 1000086003,
VK_STRUCTURE_TYPE_DEVICE_GENERATED_COMMANDS_LIMITS_NVX = 1000086004,
VK_STRUCTURE_TYPE_DEVICE_GENERATED_COMMANDS_FEATURES_NVX = 1000086005,
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_W_SCALING_STATE_CREATE_INFO_NV = 1000087000,
VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_EXT = 1000090000,
VK_STRUCTURE_TYPE_DISPLAY_POWER_INFO_EXT = 1000091000,
VK_STRUCTURE_TYPE_DEVICE_EVENT_INFO_EXT = 1000091001,
VK_STRUCTURE_TYPE_DISPLAY_EVENT_INFO_EXT = 1000091002,
VK_STRUCTURE_TYPE_SWAPCHAIN_COUNTER_CREATE_INFO_EXT = 1000091003,
VK_STRUCTURE_TYPE_PRESENT_TIMES_INFO_GOOGLE = 1000092000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PER_VIEW_ATTRIBUTES_PROPERTIES_NVX = 1000097000,
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_SWIZZLE_STATE_CREATE_INFO_NV = 1000098000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DISCARD_RECTANGLE_PROPERTIES_EXT = 1000099000,
VK_STRUCTURE_TYPE_PIPELINE_DISCARD_RECTANGLE_STATE_CREATE_INFO_EXT = 1000099001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CONSERVATIVE_RASTERIZATION_PROPERTIES_EXT = 1000101000,
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_CONSERVATIVE_STATE_CREATE_INFO_EXT = 1000101001,
VK_STRUCTURE_TYPE_HDR_METADATA_EXT = 1000105000,
VK_STRUCTURE_TYPE_ATTACHMENT_DESCRIPTION_2_KHR = 1000109000,
VK_STRUCTURE_TYPE_ATTACHMENT_REFERENCE_2_KHR = 1000109001,
VK_STRUCTURE_TYPE_SUBPASS_DESCRIPTION_2_KHR = 1000109002,
VK_STRUCTURE_TYPE_SUBPASS_DEPENDENCY_2_KHR = 1000109003,
VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO_2_KHR = 1000109004,
VK_STRUCTURE_TYPE_SUBPASS_BEGIN_INFO_KHR = 1000109005,
VK_STRUCTURE_TYPE_SUBPASS_END_INFO_KHR = 1000109006,
VK_STRUCTURE_TYPE_SHARED_PRESENT_SURFACE_CAPABILITIES_KHR = 1000111000,
VK_STRUCTURE_TYPE_IMPORT_FENCE_WIN32_HANDLE_INFO_KHR = 1000114000,
VK_STRUCTURE_TYPE_EXPORT_FENCE_WIN32_HANDLE_INFO_KHR = 1000114001,
VK_STRUCTURE_TYPE_FENCE_GET_WIN32_HANDLE_INFO_KHR = 1000114002,
VK_STRUCTURE_TYPE_IMPORT_FENCE_FD_INFO_KHR = 1000115000,
VK_STRUCTURE_TYPE_FENCE_GET_FD_INFO_KHR = 1000115001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SURFACE_INFO_2_KHR = 1000119000,
VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_KHR = 1000119001,
VK_STRUCTURE_TYPE_SURFACE_FORMAT_2_KHR = 1000119002,
VK_STRUCTURE_TYPE_DISPLAY_PROPERTIES_2_KHR = 1000121000,
VK_STRUCTURE_TYPE_DISPLAY_PLANE_PROPERTIES_2_KHR = 1000121001,
VK_STRUCTURE_TYPE_DISPLAY_MODE_PROPERTIES_2_KHR = 1000121002,
VK_STRUCTURE_TYPE_DISPLAY_PLANE_INFO_2_KHR = 1000121003,
VK_STRUCTURE_TYPE_DISPLAY_PLANE_CAPABILITIES_2_KHR = 1000121004,
VK_STRUCTURE_TYPE_IOS_SURFACE_CREATE_INFO_MVK = 1000122000,
VK_STRUCTURE_TYPE_MACOS_SURFACE_CREATE_INFO_MVK = 1000123000,
VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_NAME_INFO_EXT = 1000128000,
VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_TAG_INFO_EXT = 1000128001,
VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT = 1000128002,
VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CALLBACK_DATA_EXT = 1000128003,
VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT = 1000128004,
VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_USAGE_ANDROID = 1000129000,
VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_PROPERTIES_ANDROID = 1000129001,
VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_FORMAT_PROPERTIES_ANDROID = 1000129002,
VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID = 1000129003,
VK_STRUCTURE_TYPE_MEMORY_GET_ANDROID_HARDWARE_BUFFER_INFO_ANDROID = 1000129004,
VK_STRUCTURE_TYPE_EXTERNAL_FORMAT_ANDROID = 1000129005,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_FILTER_MINMAX_PROPERTIES_EXT = 1000130000,
VK_STRUCTURE_TYPE_SAMPLER_REDUCTION_MODE_CREATE_INFO_EXT = 1000130001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INLINE_UNIFORM_BLOCK_FEATURES_EXT = 1000138000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INLINE_UNIFORM_BLOCK_PROPERTIES_EXT = 1000138001,
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_INLINE_UNIFORM_BLOCK_EXT = 1000138002,
VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_INLINE_UNIFORM_BLOCK_CREATE_INFO_EXT = 1000138003,
VK_STRUCTURE_TYPE_SAMPLE_LOCATIONS_INFO_EXT = 1000143000,
VK_STRUCTURE_TYPE_RENDER_PASS_SAMPLE_LOCATIONS_BEGIN_INFO_EXT = 1000143001,
VK_STRUCTURE_TYPE_PIPELINE_SAMPLE_LOCATIONS_STATE_CREATE_INFO_EXT = 1000143002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT = 1000143003,
VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT = 1000143004,
VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO_KHR = 1000147000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_BLEND_OPERATION_ADVANCED_FEATURES_EXT = 1000148000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_BLEND_OPERATION_ADVANCED_PROPERTIES_EXT = 1000148001,
VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_ADVANCED_STATE_CREATE_INFO_EXT = 1000148002,
VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_TO_COLOR_STATE_CREATE_INFO_NV = 1000149000,
VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_MODULATION_STATE_CREATE_INFO_NV = 1000152000,
VK_STRUCTURE_TYPE_DRM_FORMAT_MODIFIER_PROPERTIES_LIST_EXT = 1000158000,
VK_STRUCTURE_TYPE_DRM_FORMAT_MODIFIER_PROPERTIES_EXT = 1000158001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_MODIFIER_INFO_EXT = 1000158002,
VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_LIST_CREATE_INFO_EXT = 1000158003,
VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_EXPLICIT_CREATE_INFO_EXT = 1000158004,
VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_PROPERTIES_EXT = 1000158005,
VK_STRUCTURE_TYPE_VALIDATION_CACHE_CREATE_INFO_EXT = 1000160000,
VK_STRUCTURE_TYPE_SHADER_MODULE_VALIDATION_CACHE_CREATE_INFO_EXT = 1000160001,
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT = 1000161000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_FEATURES_EXT = 1000161001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_PROPERTIES_EXT = 1000161002,
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_VARIABLE_DESCRIPTOR_COUNT_ALLOCATE_INFO_EXT = 1000161003,
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_VARIABLE_DESCRIPTOR_COUNT_LAYOUT_SUPPORT_EXT = 1000161004,
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_SHADING_RATE_IMAGE_STATE_CREATE_INFO_NV = 1000164000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADING_RATE_IMAGE_FEATURES_NV = 1000164001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADING_RATE_IMAGE_PROPERTIES_NV = 1000164002,
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_COARSE_SAMPLE_ORDER_STATE_CREATE_INFO_NV = 1000164005,
VK_STRUCTURE_TYPE_RAY_TRACING_PIPELINE_CREATE_INFO_NV = 1000165000,
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_CREATE_INFO_NV = 1000165001,
VK_STRUCTURE_TYPE_GEOMETRY_NV = 1000165003,
VK_STRUCTURE_TYPE_GEOMETRY_TRIANGLES_NV = 1000165004,
VK_STRUCTURE_TYPE_GEOMETRY_AABB_NV = 1000165005,
VK_STRUCTURE_TYPE_BIND_ACCELERATION_STRUCTURE_MEMORY_INFO_NV = 1000165006,
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_ACCELERATION_STRUCTURE_NV = 1000165007,
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_INFO_NV = 1000165008,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_RAY_TRACING_PROPERTIES_NV = 1000165009,
VK_STRUCTURE_TYPE_RAY_TRACING_SHADER_GROUP_CREATE_INFO_NV = 1000165011,
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_INFO_NV = 1000165012,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_REPRESENTATIVE_FRAGMENT_TEST_FEATURES_NV = 1000166000,
VK_STRUCTURE_TYPE_PIPELINE_REPRESENTATIVE_FRAGMENT_TEST_STATE_CREATE_INFO_NV = 1000166001,
VK_STRUCTURE_TYPE_DEVICE_QUEUE_GLOBAL_PRIORITY_CREATE_INFO_EXT = 1000174000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES_KHR = 1000177000,
VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT = 1000178000,
VK_STRUCTURE_TYPE_MEMORY_HOST_POINTER_PROPERTIES_EXT = 1000178001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_MEMORY_HOST_PROPERTIES_EXT = 1000178002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_ATOMIC_INT64_FEATURES_KHR = 1000180000,
VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT = 1000184000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD = 1000185000,
VK_STRUCTURE_TYPE_DEVICE_MEMORY_OVERALLOCATION_CREATE_INFO_AMD = 1000189000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_PROPERTIES_EXT = 1000190000,
VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_DIVISOR_STATE_CREATE_INFO_EXT = 1000190001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_FEATURES_EXT = 1000190002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES_KHR = 1000196000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR = 1000197000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_COMPUTE_SHADER_DERIVATIVES_FEATURES_NV = 1000201000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_FEATURES_NV = 1000202000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_PROPERTIES_NV = 1000202001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_SHADER_BARYCENTRIC_FEATURES_NV = 1000203000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_IMAGE_FOOTPRINT_FEATURES_NV = 1000204000,
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_EXCLUSIVE_SCISSOR_STATE_CREATE_INFO_NV = 1000205000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXCLUSIVE_SCISSOR_FEATURES_NV = 1000205002,
VK_STRUCTURE_TYPE_CHECKPOINT_DATA_NV = 1000206000,
VK_STRUCTURE_TYPE_QUEUE_FAMILY_CHECKPOINT_PROPERTIES_NV = 1000206001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_MEMORY_MODEL_FEATURES_KHR = 1000211000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PCI_BUS_INFO_PROPERTIES_EXT = 1000212000,
VK_STRUCTURE_TYPE_IMAGEPIPE_SURFACE_CREATE_INFO_FUCHSIA = 1000214000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_DENSITY_MAP_FEATURES_EXT = 1000218000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_DENSITY_MAP_PROPERTIES_EXT = 1000218001,
VK_STRUCTURE_TYPE_RENDER_PASS_FRAGMENT_DENSITY_MAP_CREATE_INFO_EXT = 1000218002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SCALAR_BLOCK_LAYOUT_FEATURES_EXT = 1000221000,
VK_STRUCTURE_TYPE_IMAGE_STENCIL_USAGE_CREATE_INFO_EXT = 1000246000,
VK_STRUCTURE_TYPE_DEBUG_REPORT_CREATE_INFO_EXT = VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT,
VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_FEATURES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_FEATURES,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PROPERTIES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PROPERTIES,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2,
VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2_KHR = VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2,
VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2,
VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2_KHR = VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2,
VK_STRUCTURE_TYPE_SPARSE_IMAGE_FORMAT_PROPERTIES_2_KHR = VK_STRUCTURE_TYPE_SPARSE_IMAGE_FORMAT_PROPERTIES_2,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SPARSE_IMAGE_FORMAT_INFO_2_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SPARSE_IMAGE_FORMAT_INFO_2,
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_FLAGS_INFO_KHR = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_FLAGS_INFO,
VK_STRUCTURE_TYPE_DEVICE_GROUP_RENDER_PASS_BEGIN_INFO_KHR = VK_STRUCTURE_TYPE_DEVICE_GROUP_RENDER_PASS_BEGIN_INFO,
VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO_KHR = VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO,
VK_STRUCTURE_TYPE_DEVICE_GROUP_SUBMIT_INFO_KHR = VK_STRUCTURE_TYPE_DEVICE_GROUP_SUBMIT_INFO,
VK_STRUCTURE_TYPE_DEVICE_GROUP_BIND_SPARSE_INFO_KHR = VK_STRUCTURE_TYPE_DEVICE_GROUP_BIND_SPARSE_INFO,
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_DEVICE_GROUP_INFO_KHR = VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_DEVICE_GROUP_INFO,
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_DEVICE_GROUP_INFO_KHR = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_DEVICE_GROUP_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES,
VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO,
VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES_KHR = VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_BUFFER_INFO_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_BUFFER_INFO,
VK_STRUCTURE_TYPE_EXTERNAL_BUFFER_PROPERTIES_KHR = VK_STRUCTURE_TYPE_EXTERNAL_BUFFER_PROPERTIES,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES,
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_BUFFER_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_BUFFER_CREATE_INFO,
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO,
VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_SEMAPHORE_INFO_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_SEMAPHORE_INFO,
VK_STRUCTURE_TYPE_EXTERNAL_SEMAPHORE_PROPERTIES_KHR = VK_STRUCTURE_TYPE_EXTERNAL_SEMAPHORE_PROPERTIES,
VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES,
VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO,
VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES2_EXT = VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_EXT,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_FENCE_INFO_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_FENCE_INFO,
VK_STRUCTURE_TYPE_EXTERNAL_FENCE_PROPERTIES_KHR = VK_STRUCTURE_TYPE_EXTERNAL_FENCE_PROPERTIES,
VK_STRUCTURE_TYPE_EXPORT_FENCE_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_EXPORT_FENCE_CREATE_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_POINT_CLIPPING_PROPERTIES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_POINT_CLIPPING_PROPERTIES,
VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO,
VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_DOMAIN_ORIGIN_STATE_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_DOMAIN_ORIGIN_STATE_CREATE_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES,
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS_KHR = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS,
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO_KHR = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO,
VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2_KHR = VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2,
VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2_KHR = VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2,
VK_STRUCTURE_TYPE_IMAGE_SPARSE_MEMORY_REQUIREMENTS_INFO_2_KHR = VK_STRUCTURE_TYPE_IMAGE_SPARSE_MEMORY_REQUIREMENTS_INFO_2,
VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2_KHR = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2,
VK_STRUCTURE_TYPE_SPARSE_IMAGE_MEMORY_REQUIREMENTS_2_KHR = VK_STRUCTURE_TYPE_SPARSE_IMAGE_MEMORY_REQUIREMENTS_2,
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO_KHR = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO,
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO_KHR = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO,
VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO_KHR = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO,
VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO_KHR = VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_YCBCR_CONVERSION_FEATURES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_YCBCR_CONVERSION_FEATURES,
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_IMAGE_FORMAT_PROPERTIES_KHR = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_IMAGE_FORMAT_PROPERTIES,
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO_KHR = VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO,
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO_KHR = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MAINTENANCE_3_PROPERTIES_KHR = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MAINTENANCE_3_PROPERTIES,
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT_KHR = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT,
} VkStructureType;
Each value corresponds to a particular structure with a sType member with a matching name. As a general rule, the name of each VkStructureType value is obtained by taking the name of the structure, stripping the leading Vk, prefixing each capital letter with _, converting the entire resulting string to upper case, and prefixing it with VK_STRUCTURE_TYPE_. For example, structures of type VkImageCreateInfo correspond to a VkStructureType of VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, and thus the sType member of a VkImageCreateInfo must equal VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO when it is passed to the API.
The values VK_STRUCTURE_TYPE_LOADER_INSTANCE_CREATE_INFO and VK_STRUCTURE_TYPE_LOADER_DEVICE_CREATE_INFO are reserved for internal use by the loader, and do not have corresponding Vulkan structures in this Specification.
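As a brief illustration of this naming rule (a minimal sketch; the remaining members are omitted):
VkImageCreateInfo imageInfo = {0};
/* The sType member must match the structure's VkStructureType value. */
imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageInfo.pNext = NULL;
/* ... the remaining members describe the image being created ... */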
Valid Usage for Structure Pointer Chains
Any parameter that is a structure containing a void* pNext member must have a value of pNext that is either NULL, or points to a valid structure defined by an extension, containing sType and pNext members as described in the Vulkan Documentation and Extensions document in the section “Extension Interactions”.
The set of structures connected by pNext pointers is referred to as a pNext chain.
If that extension is supported by the implementation, then it must be enabled.
Each type of valid structure must not appear more than once in a pNext chain.
Any component of the implementation (the loader, any enabled layers, and drivers) must skip over, without processing (other than reading the sType and pNext members) any structures in the chain with sType values not defined by extensions supported by that component.
Extension structures are not described in the base Vulkan Specification, but either in layered Specifications incorporating those extensions, or in separate vendor-provided documents.
As a convenience to implementations and layers needing to iterate through a structure pointer chain, the Vulkan API provides two base structures. These structures allow for some type safety, and can be used by Vulkan API functions that operate on generic inputs and outputs.
The VkBaseInStructure
structure is defined as:
typedef struct VkBaseInStructure {
VkStructureType sType;
const struct VkBaseInStructure* pNext;
} VkBaseInStructure;
- sType is the structure type of the structure being iterated through.
- pNext is NULL or a pointer to the next structure in a structure chain.
VkBaseInStructure
can be used to facilitate iterating through a
read-only structure pointer chain.
The VkBaseOutStructure
structure is defined as:
typedef struct VkBaseOutStructure {
VkStructureType sType;
struct VkBaseOutStructure* pNext;
} VkBaseOutStructure;
- sType is the structure type of the structure being iterated through.
- pNext is NULL or a pointer to the next structure in a structure chain.
VkBaseOutStructure
can be used to facilitate iterating through a
structure pointer chain that returns data back to the application.
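As an illustration of such iteration, the following minimal sketch walks an output chain looking for a particular sType, reading only the sType and pNext members as required above (the helper name is illustrative, not part of the API):
static VkBaseOutStructure *find_in_chain(void *chain, VkStructureType wanted)
{
    /* Walk the pNext chain, reading only sType and pNext. */
    for (VkBaseOutStructure *s = (VkBaseOutStructure *)chain; s != NULL; s = s->pNext) {
        if (s->sType == wanted)
            return s; /* the caller casts this to the concrete structure type */
    }
    return NULL;
}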
Valid Usage for Nested Structures
The above conditions also apply recursively to members of structures provided as input to a command, either as a direct argument to the command, or themselves a member of another structure.
Specifics on valid usage of each command are covered in their individual sections.
Valid Usage for Extensions
Instance-level functionality or behavior added by an instance extension to the API must not be used unless that extension is supported by the instance as determined by vkEnumerateInstanceExtensionProperties, and that extension is enabled in VkInstanceCreateInfo.
Physical-device-level functionality or behavior added by an instance extension to the API must not be used unless that extension is supported by the instance as determined by vkEnumerateInstanceExtensionProperties, and that extension is enabled in VkInstanceCreateInfo.
Physical-device-level functionality or behavior added by a device extension to the API must not be used unless the conditions described in Extending Physical Device Core Functionality are met.
Device functionality or behavior added by a device extension to the API must not be used unless that extension is supported by the device as determined by vkEnumerateDeviceExtensionProperties, and that extension is enabled in VkDeviceCreateInfo.
Valid Usage for Newer Core Versions
Instance-level functionality or behavior added by a new core version of the API must not be used unless it is supported by the instance as determined by vkEnumerateInstanceVersion.
Physical-device-level functionality or behavior added by a new core version of the API must not be used unless it is supported by the physical device as determined by vkGetPhysicalDeviceProperties.
Device-level functionality or behavior added by a new core version of the API must not be used unless it is supported by the device as determined by vkGetPhysicalDeviceProperties.
2.7.3. Return Codes
While the core Vulkan API is not designed to capture incorrect usage, some circumstances still require return codes. Commands in Vulkan return their status via return codes that are in one of two categories:
-
Successful completion codes are returned when a command needs to communicate success or status information. All successful completion codes are non-negative values.
-
Run time error codes are returned when a command needs to communicate a failure that could only be detected at run time. All run time error codes are negative values.
All return codes in Vulkan are reported via VkResult return values. The possible codes are:
typedef enum VkResult {
VK_SUCCESS = 0,
VK_NOT_READY = 1,
VK_TIMEOUT = 2,
VK_EVENT_SET = 3,
VK_EVENT_RESET = 4,
VK_INCOMPLETE = 5,
VK_ERROR_OUT_OF_HOST_MEMORY = -1,
VK_ERROR_OUT_OF_DEVICE_MEMORY = -2,
VK_ERROR_INITIALIZATION_FAILED = -3,
VK_ERROR_DEVICE_LOST = -4,
VK_ERROR_MEMORY_MAP_FAILED = -5,
VK_ERROR_LAYER_NOT_PRESENT = -6,
VK_ERROR_EXTENSION_NOT_PRESENT = -7,
VK_ERROR_FEATURE_NOT_PRESENT = -8,
VK_ERROR_INCOMPATIBLE_DRIVER = -9,
VK_ERROR_TOO_MANY_OBJECTS = -10,
VK_ERROR_FORMAT_NOT_SUPPORTED = -11,
VK_ERROR_FRAGMENTED_POOL = -12,
VK_ERROR_OUT_OF_POOL_MEMORY = -1000069000,
VK_ERROR_INVALID_EXTERNAL_HANDLE = -1000072003,
VK_ERROR_SURFACE_LOST_KHR = -1000000000,
VK_ERROR_NATIVE_WINDOW_IN_USE_KHR = -1000000001,
VK_SUBOPTIMAL_KHR = 1000001003,
VK_ERROR_OUT_OF_DATE_KHR = -1000001004,
VK_ERROR_INCOMPATIBLE_DISPLAY_KHR = -1000003001,
VK_ERROR_VALIDATION_FAILED_EXT = -1000011001,
VK_ERROR_INVALID_SHADER_NV = -1000012000,
VK_ERROR_INVALID_DRM_FORMAT_MODIFIER_PLANE_LAYOUT_EXT = -1000158000,
VK_ERROR_FRAGMENTATION_EXT = -1000161000,
VK_ERROR_NOT_PERMITTED_EXT = -1000174001,
VK_ERROR_OUT_OF_POOL_MEMORY_KHR = VK_ERROR_OUT_OF_POOL_MEMORY,
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR = VK_ERROR_INVALID_EXTERNAL_HANDLE,
} VkResult;
- VK_SUCCESS: Command successfully completed.
- VK_NOT_READY: A fence or query has not yet completed.
- VK_TIMEOUT: A wait operation has not completed in the specified time.
- VK_EVENT_SET: An event is signaled.
- VK_EVENT_RESET: An event is unsignaled.
- VK_INCOMPLETE: A return array was too small for the result.
- VK_SUBOPTIMAL_KHR: A swapchain no longer matches the surface properties exactly, but can still be used to present to the surface successfully.
- VK_ERROR_OUT_OF_HOST_MEMORY: A host memory allocation has failed.
- VK_ERROR_OUT_OF_DEVICE_MEMORY: A device memory allocation has failed.
- VK_ERROR_INITIALIZATION_FAILED: Initialization of an object could not be completed for implementation-specific reasons.
- VK_ERROR_DEVICE_LOST: The logical or physical device has been lost. See Lost Device.
- VK_ERROR_MEMORY_MAP_FAILED: Mapping of a memory object has failed.
- VK_ERROR_LAYER_NOT_PRESENT: A requested layer is not present or could not be loaded.
- VK_ERROR_EXTENSION_NOT_PRESENT: A requested extension is not supported.
- VK_ERROR_FEATURE_NOT_PRESENT: A requested feature is not supported.
- VK_ERROR_INCOMPATIBLE_DRIVER: The requested version of Vulkan is not supported by the driver or is otherwise incompatible for implementation-specific reasons.
- VK_ERROR_TOO_MANY_OBJECTS: Too many objects of the type have already been created.
- VK_ERROR_FORMAT_NOT_SUPPORTED: A requested format is not supported on this device.
- VK_ERROR_FRAGMENTED_POOL: A pool allocation has failed due to fragmentation of the pool’s memory. This must only be returned if no attempt to allocate host or device memory was made to accommodate the new allocation. This should be returned in preference to VK_ERROR_OUT_OF_POOL_MEMORY, but only if the implementation is certain that the pool allocation failure was due to fragmentation.
- VK_ERROR_SURFACE_LOST_KHR: A surface is no longer available.
- VK_ERROR_NATIVE_WINDOW_IN_USE_KHR: The requested window is already in use by Vulkan or another API in a manner which prevents it from being used again.
- VK_ERROR_OUT_OF_DATE_KHR: A surface has changed in such a way that it is no longer compatible with the swapchain, and further presentation requests using the swapchain will fail. Applications must query the new surface properties and recreate their swapchain if they wish to continue presenting to the surface.
- VK_ERROR_INCOMPATIBLE_DISPLAY_KHR: The display used by a swapchain does not use the same presentable image layout, or is incompatible in a way that prevents sharing an image.
- VK_ERROR_INVALID_SHADER_NV: One or more shaders failed to compile or link. More details are reported back to the application via VK_EXT_debug_report if enabled.
- VK_ERROR_OUT_OF_POOL_MEMORY: A pool memory allocation has failed. This must only be returned if no attempt to allocate host or device memory was made to accommodate the new allocation. If the failure was definitely due to fragmentation of the pool, VK_ERROR_FRAGMENTED_POOL should be returned instead.
- VK_ERROR_INVALID_EXTERNAL_HANDLE: An external handle is not a valid handle of the specified type.
- VK_ERROR_FRAGMENTATION_EXT: A descriptor pool creation has failed due to fragmentation.
If a command returns a run time error, unless otherwise specified any output parameters will have undefined contents, except that if the output parameter is a structure with sType and pNext fields, those fields will be unmodified. Any structures chained from pNext will also have undefined contents, except that sType and pNext will be unmodified.
Out of memory errors do not damage any currently existing Vulkan objects. Objects that have already been successfully created can still be used by the application.
Performance-critical commands generally do not have return codes. If a run time error occurs in such commands, the implementation will defer reporting the error until a specified point. For commands that record into command buffers (vkCmd*), run time errors are reported by vkEndCommandBuffer.
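For example, a minimal sketch of checking such a deferred error (assuming commandBuffer was recorded elsewhere; the two return-code categories are distinguished by sign):
VkResult result = vkEndCommandBuffer(commandBuffer);
if (result < 0) {
    /* A run time error such as VK_ERROR_OUT_OF_HOST_MEMORY or
     * VK_ERROR_OUT_OF_DEVICE_MEMORY was deferred to this point. */
} else {
    /* VK_SUCCESS or another non-negative status code. */
}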
2.8. Numeric Representation and Computation
Implementations normally perform computations in floating-point, and must meet the range and precision requirements defined under “Floating-Point Computation” below.
These requirements only apply to computations performed in Vulkan operations outside of shader execution, such as texture image specification and sampling, and per-fragment operations. Range and precision requirements during shader execution differ and are specified by the Precision and Operation of SPIR-V Instructions section.
In some cases, the representation and/or precision of operations is implicitly limited by the specified format of vertex or texel data consumed by Vulkan. Specific floating-point formats are described later in this section.
2.8.1. Floating-Point Computation
Most floating-point computation is performed in SPIR-V shader modules. The properties of computation within shaders are constrained as defined by the Precision and Operation of SPIR-V Instructions section.
Some floating-point computation is performed outside of shaders, such as viewport and depth range calculations. For these computations, we do not specify how floating-point numbers are to be represented, or the details of how operations on them are performed, but only place minimal requirements on representation and precision as described in the remainder of this section.
editing-note
(Jon, Bug 14966) This is a rat’s nest of complexity, both in terms of describing/enumerating places such computation may take place (other than “not shader code”) and in how implementations may do it. We have consciously deferred the resolution of this issue to post-1.0, and in the meantime, the following language inherited from the OpenGL Specification is inserted as a placeholder. Hopefully it can be tightened up considerably. |
We require simply that numbers’ floating-point parts contain enough bits and that their exponent fields are large enough so that individual results of floating-point operations are accurate to about 1 part in 10^5. The maximum representable magnitude for all floating-point values must be at least 2^32.
- x × 0 = 0 × x = 0 for any non-infinite and non-NaN x.
- 1 × x = x × 1 = x.
- x + 0 = 0 + x = x.
- 0^0 = 1.
Occasionally, further requirements will be specified. Most single-precision floating-point formats meet these requirements.
The special values Inf and -Inf encode values with magnitudes too large to be represented; the special value NaN encodes “Not A Number” values resulting from undefined arithmetic operations such as 0 / 0. Implementations may support Inf and NaN in their floating-point computations.
2.8.2. Floating-Point Format Conversions
When a value is converted to a defined floating-point representation, finite values falling between two representable finite values are rounded to one or the other. The rounding mode is not defined. Finite values whose magnitude is larger than that of any representable finite value may be rounded either to the closest representable finite value or to the appropriately signed infinity. For unsigned destination formats any negative values are converted to zero. Positive infinity is converted to positive infinity; negative infinity is converted to negative infinity in signed formats and to zero in unsigned formats; and any NaN is converted to a NaN.
2.8.3. 16-Bit Floating-Point Numbers
16-bit floating point numbers are defined in the “16-bit floating point numbers” section of the Khronos Data Format Specification.
2.8.4. Unsigned 11-Bit Floating-Point Numbers
Unsigned 11-bit floating point numbers are defined in the “Unsigned 11-bit floating point numbers” section of the Khronos Data Format Specification.
2.8.5. Unsigned 10-Bit Floating-Point Numbers
Unsigned 10-bit floating point numbers are defined in the “Unsigned 10-bit floating point numbers” section of the Khronos Data Format Specification.
2.8.6. General Requirements
Any representable floating-point value in the appropriate format is legal as input to a Vulkan command that requires floating-point data. The result of providing a value that is not a floating-point number to such a command is unspecified, but must not lead to Vulkan interruption or termination. For example, providing a negative zero (where applicable) or a denormalized number to a Vulkan command must yield deterministic results, while providing a NaN or Inf yields unspecified results.
Some calculations require division. In such cases (including implied divisions performed by vector normalization), division by zero produces an unspecified result but must not lead to Vulkan interruption or termination.
2.9. Fixed-Point Data Conversions
When generic vertex attributes and pixel color or depth components are represented as integers, they are often (but not always) considered to be normalized. Normalized integer values are treated specially when being converted to and from floating-point values, and are usually referred to as normalized fixed-point.
In the remainder of this section, b denotes the bit width of the fixed-point integer representation. When the integer is one of the types defined by the API, b is the bit width of that type. When the integer comes from an image containing color or depth component texels, b is the number of bits allocated to that component in its specified image format.
The signed and unsigned fixed-point representations are assumed to be b-bit binary two’s-complement integers and binary unsigned integers, respectively.
2.9.1. Conversion from Normalized Fixed-Point to Floating-Point
Unsigned normalized fixed-point integers represent numbers in the range [0,1]. The conversion from an unsigned normalized fixed-point value c to the corresponding floating-point value f is defined as
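That is, with b the bit width defined above, the standard unsigned normalization:

f = c / (2^b - 1)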
Signed normalized fixed-point integers represent numbers in the range [-1,1]. The conversion from a signed normalized fixed-point value c to the corresponding floating-point value f is performed using
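That is, with b the bit width defined above, the value is scaled and then clamped so that -1.0 remains representable:

f = max( c / (2^(b-1) - 1), -1.0 )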
Only the range [-2^(b-1) + 1, 2^(b-1) - 1] is used to represent signed fixed-point values in the range [-1,1]. For example, if b = 8, then the integer value -127 corresponds to -1.0 and the value 127 corresponds to 1.0. Note that while zero is exactly expressible in this representation, one value (-128 in the example) is outside the representable range, and must be clamped before use. This equation is used everywhere that signed normalized fixed-point values are converted to floating-point.
2.9.2. Conversion from Floating-Point to Normalized Fixed-Point
The conversion from a floating-point value f to the corresponding unsigned normalized fixed-point value c is defined by first clamping f to the range [0,1], then computing
- c = convertFloatToUint(f × (2^b - 1), b)

where convertFloatToUint(r,b) returns one of the two unsigned binary integer values with exactly b bits which are closest to the floating-point value r. Implementations should round to nearest. If r is equal to an integer, then that integer value must be returned. In particular, if f is equal to 0.0 or 1.0, then c must be assigned 0 or 2^b - 1, respectively.
The conversion from a floating-point value f to the corresponding signed normalized fixed-point value c is performed by clamping f to the range [-1,1], then computing
- c = convertFloatToInt(f × (2^(b-1) - 1), b)

where convertFloatToInt(r,b) returns one of the two signed two’s-complement binary integer values with exactly b bits which are closest to the floating-point value r. Implementations should round to nearest. If r is equal to an integer, then that integer value must be returned. In particular, if f is equal to -1.0, 0.0, or 1.0, then c must be assigned -(2^(b-1) - 1), 0, or 2^(b-1) - 1, respectively.
This equation is used everywhere that floating-point values are converted to signed normalized fixed-point.
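A minimal C sketch of these two conversions, assuming round-to-nearest rounding and bit widths below 32 (the helper names are illustrative, not part of the API):
#include <math.h>
#include <stdint.h>

/* Float to b-bit unsigned normalized: clamp to [0,1], scale by 2^b - 1, round. */
static uint32_t float_to_unorm(float f, uint32_t b)
{
    const float scale = (float)((1u << b) - 1u);   /* assumes b < 32 */
    if (f < 0.0f) f = 0.0f;
    if (f > 1.0f) f = 1.0f;
    return (uint32_t)roundf(f * scale);
}

/* Float to b-bit signed normalized: clamp to [-1,1], scale by 2^(b-1) - 1, round. */
static int32_t float_to_snorm(float f, uint32_t b)
{
    const float scale = (float)((1u << (b - 1u)) - 1u);
    if (f < -1.0f) f = -1.0f;
    if (f >  1.0f) f =  1.0f;
    return (int32_t)roundf(f * scale);
}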
2.10. API Version Numbers and Semantics
The Vulkan version number is used in several places in the API. In each such use, the API major version number, minor version number, and patch version number are packed into a 32-bit integer as follows:
-
The major version number is a 10-bit integer packed into bits 31-22.
-
The minor version number is a 10-bit integer packed into bits 21-12.
-
The patch version number is a 12-bit integer packed into bits 11-0.
Differences in any of the Vulkan version numbers indicates a change to the API in some way, with each part of the version number indicating a different scope of changes.
A difference in patch version numbers indicates that some usually small part of the Specification or header has been modified, typically to fix a bug, and may have an impact on the behavior of existing functionality. Differences in this version number should not affect either full compatibility or backwards compatibility between two versions, or add additional interfaces to the API.
A difference in minor version numbers indicates that some amount of new functionality has been added. This will usually include new interfaces in the header, and may also include behavior changes and bug fixes. Functionality may be deprecated in a minor revision, but will not be removed. The patch version will continue to increment through minor version number changes since all minor versions are generated from the same source files, and changes to the source files may affect all minor versions within a major version. Differences in the minor version should not affect backwards compatibility, but will affect full compatibility. The patch version of the Specification is taken from VK_HEADER_VERSION.
A difference in major version numbers indicates a large set of changes to the API, potentially including new functionality and header interfaces, behavioral changes, removal of deprecated features, modification or outright replacement of any feature, and is thus very likely to break any and all compatibility. Differences in this version will typically require significant modification to an application in order for it to function.
C language macros for manipulating version numbers are defined in the Version Number Macros appendix.
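As a sketch of the packing described above (the helper names here are illustrative; the VK_MAKE_VERSION and VK_VERSION_MAJOR/MINOR/PATCH macros in the Vulkan headers implement the same layout):
#include <stdint.h>

static uint32_t make_api_version(uint32_t major, uint32_t minor, uint32_t patch)
{
    /* 10 bits of major, 10 bits of minor, 12 bits of patch. */
    return (major << 22) | (minor << 12) | patch;
}

static uint32_t version_major(uint32_t version) { return version >> 22; }
static uint32_t version_minor(uint32_t version) { return (version >> 12) & 0x3FFu; }
static uint32_t version_patch(uint32_t version) { return version & 0xFFFu; }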
2.11. Common Object Types
Some types of Vulkan objects are used in many different structures and command parameters, and are described here. These types include offsets, extents, and rectangles.
2.11.1. Offsets
Offsets are used to describe a pixel location within an image or framebuffer, as an (x,y) location for two-dimensional images, or an (x,y,z) location for three-dimensional images.
A two-dimensional offset is defined by the structure:
typedef struct VkOffset2D {
int32_t x;
int32_t y;
} VkOffset2D;
- x is the x offset.
- y is the y offset.
A three-dimensional offset is defined by the structure:
typedef struct VkOffset3D {
int32_t x;
int32_t y;
int32_t z;
} VkOffset3D;
- x is the x offset.
- y is the y offset.
- z is the z offset.
2.11.2. Extents
Extents are used to describe the size of a rectangular region of pixels within an image or framebuffer, as (width,height) for two-dimensional images, or as (width,height,depth) for three-dimensional images.
A two-dimensional extent is defined by the structure:
typedef struct VkExtent2D {
uint32_t width;
uint32_t height;
} VkExtent2D;
- width is the width of the extent.
- height is the height of the extent.
A three-dimensional extent is defined by the structure:
typedef struct VkExtent3D {
uint32_t width;
uint32_t height;
uint32_t depth;
} VkExtent3D;
- width is the width of the extent.
- height is the height of the extent.
- depth is the depth of the extent.
2.11.3. Rectangles
Rectangles are used to describe a specified rectangular region of pixels within an image or framebuffer. Rectangles include both an offset and an extent of the same dimensionality, as described above. Two-dimensional rectangles are defined by the structure
typedef struct VkRect2D {
VkOffset2D offset;
VkExtent2D extent;
} VkRect2D;
- offset is a VkOffset2D specifying the rectangle offset.
- extent is a VkExtent2D specifying the rectangle extent.
3. Initialization
Before using Vulkan, an application must initialize it by loading the
Vulkan commands, and creating a VkInstance
object.
3.1. Command Function Pointers
Vulkan commands are not necessarily exposed by static linking on a platform. Commands to query function pointers for Vulkan commands are described below.
Note
When extensions are promoted or otherwise incorporated into another extension or Vulkan core version, commands that have the same definition and behavior are referred to as “aliases”, and are documented as such. Whilst the behavior of each command alias is identical, the behavior of retrieving each alias’s function pointer is not. A function pointer for a given alias can only be retrieved if the extension or version that introduced that alias is supported and enabled, irrespective of whether any other alias is available. |
Function pointers for all Vulkan commands can be obtained with the command:
PFN_vkVoidFunction vkGetInstanceProcAddr(
VkInstance instance,
const char* pName);
- instance is the instance that the function pointer will be compatible with, or NULL for commands not dependent on any instance.
- pName is the name of the command to obtain.
vkGetInstanceProcAddr itself is obtained in a platform- and loader-specific manner. Typically, the loader library will export this command as a function symbol, so applications can link against the loader library, or load it dynamically and look up the symbol using platform-specific APIs.
The table below defines the various use cases for vkGetInstanceProcAddr and expected return value (“fp” is “function pointer”) for each case.
The returned function pointer is of type PFN_vkVoidFunction, and must be cast to the type of the command being queried.
instance | pName | return value
---|---|---
* | NULL | undefined
invalid instance | * | undefined
NULL | vkEnumerateInstanceVersion | fp
NULL | vkEnumerateInstanceExtensionProperties | fp
NULL | vkEnumerateInstanceLayerProperties | fp
NULL | vkCreateInstance | fp
NULL | * (any pName not covered above) | NULL
instance | core Vulkan command | fp¹
instance | enabled instance extension commands for instance | fp¹
instance | available device extension² commands for instance | fp¹
instance | * (any pName not covered above) | NULL
¹ The returned function pointer must only be called with a dispatchable object (the first parameter) that is instance or a child of instance, e.g. VkInstance, VkPhysicalDevice, VkDevice, VkQueue, or VkCommandBuffer.
² An “available device extension” is a device extension supported by any physical device enumerated by instance.
In order to support systems with multiple Vulkan implementations, the function pointers returned by vkGetInstanceProcAddr may point to dispatch code that calls a different real implementation for different VkDevice objects or their child objects. The overhead of the internal dispatch for VkDevice objects can be avoided by obtaining device-specific function pointers for any commands that use a device or device-child object as their dispatchable object.
Such function pointers can be obtained with the command:
Such function pointers can be obtained with the command:
PFN_vkVoidFunction vkGetDeviceProcAddr(
VkDevice device,
const char* pName);
The table below defines the various use cases for vkGetDeviceProcAddr and expected return value for each case. The returned function pointer is of type PFN_vkVoidFunction, and must be cast to the type of the command being queried. The function pointer must only be called with a dispatchable object (the first parameter) that is device or a child of device.
device | pName | return value
---|---|---
NULL | * | undefined
invalid device | * | undefined
device | NULL | undefined
device | core device-level Vulkan command | fp
device | enabled device extension commands | fp
device | * (any pName not covered above) | NULL
The definition of PFN_vkVoidFunction is:
typedef void (VKAPI_PTR *PFN_vkVoidFunction)(void);
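A minimal sketch of loading function pointers through both commands (assuming instance and device have already been created, and that vkGetInstanceProcAddr was obtained from the loader as described above):
/* Instance-level command: dispatched through the loader. */
PFN_vkEnumeratePhysicalDevices pfnEnumeratePhysicalDevices =
    (PFN_vkEnumeratePhysicalDevices)vkGetInstanceProcAddr(instance, "vkEnumeratePhysicalDevices");

/* vkGetDeviceProcAddr itself is obtained through vkGetInstanceProcAddr. */
PFN_vkGetDeviceProcAddr pfnGetDeviceProcAddr =
    (PFN_vkGetDeviceProcAddr)vkGetInstanceProcAddr(instance, "vkGetDeviceProcAddr");

/* Device-level command: querying it per device avoids the internal dispatch. */
PFN_vkQueueSubmit pfnQueueSubmit =
    (PFN_vkQueueSubmit)pfnGetDeviceProcAddr(device, "vkQueueSubmit");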
3.1.1. Extending Physical Device Core Functionality
New core physical-device-level functionality can be used when the physical-device version is greater than or equal to the version of Vulkan that added the new functionality. The Vulkan version supported by a physical device can be obtained by calling vkGetPhysicalDeviceProperties.
3.1.2. Extending Physical Device From Device Extensions
When the VK_KHR_get_physical_device_properties2
extension is enabled,
or when both the instance and the physical-device versions are at least 1.1,
physical-device-level functionality of a device extension can be used with
a physical device if the corresponding extension is enumerated by
vkEnumerateDeviceExtensionProperties for that physical device, even
before a logical device has been created.
To obtain a function pointer for a physical-device-level command from a
device extension, an application can use vkGetInstanceProcAddr.
This function pointer may point to dispatch code, which calls a different
real implementation for different VkPhysicalDevice
objects.
Behavior is undefined if an extension physical-device command is called on
a physical device that does not support the extension.
Device extensions may define structures that can be added to the
pNext
chain of physical-device-level commands.
Behavior is undefined if such an extension structure is passed to a
physical-device-level command for a physical device that does not support
the extension.
3.2. Instances
There is no global state in Vulkan and all per-application state is stored
in a VkInstance
object.
Creating a VkInstance
object initializes the Vulkan library and allows
the application to pass information about itself to the implementation.
Instances are represented by VkInstance
handles:
VK_DEFINE_HANDLE(VkInstance)
The version of Vulkan that is supported by an instance may be different than the version of Vulkan supported by a device or physical device. To query properties that can be used in creating an instance, call:
VkResult vkEnumerateInstanceVersion(
uint32_t* pApiVersion);
- pApiVersion points to a uint32_t, which is the version of Vulkan supported by instance-level functionality, encoded as described in the API Version Numbers and Semantics section.
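A minimal sketch of querying this version in a way that also works on Vulkan 1.0 loaders, where vkEnumerateInstanceVersion does not exist and must therefore be looked up dynamically:
uint32_t apiVersion = VK_MAKE_VERSION(1, 0, 0);
PFN_vkEnumerateInstanceVersion pfnEnumerateInstanceVersion =
    (PFN_vkEnumerateInstanceVersion)vkGetInstanceProcAddr(NULL, "vkEnumerateInstanceVersion");
if (pfnEnumerateInstanceVersion != NULL) {
    /* Vulkan 1.1 or later instance-level functionality is available. */
    pfnEnumerateInstanceVersion(&apiVersion);
}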
To create an instance object, call:
VkResult vkCreateInstance(
const VkInstanceCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkInstance* pInstance);
- pCreateInfo points to an instance of VkInstanceCreateInfo controlling creation of the instance.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pInstance points to a VkInstance handle in which the resulting instance is returned.
vkCreateInstance verifies that the requested layers exist. If not, vkCreateInstance will return VK_ERROR_LAYER_NOT_PRESENT. Next vkCreateInstance verifies that the requested extensions are supported (e.g. in the implementation or in any enabled instance layer) and if any requested extension is not supported, vkCreateInstance must return VK_ERROR_EXTENSION_NOT_PRESENT. After verifying and enabling the instance layers and extensions the VkInstance object is created and returned to the application. If a requested extension is only supported by a layer, both the layer and the extension need to be specified at vkCreateInstance time for the creation to succeed.
The VkInstanceCreateInfo
structure is defined as:
typedef struct VkInstanceCreateInfo {
VkStructureType sType;
const void* pNext;
VkInstanceCreateFlags flags;
const VkApplicationInfo* pApplicationInfo;
uint32_t enabledLayerCount;
const char* const* ppEnabledLayerNames;
uint32_t enabledExtensionCount;
const char* const* ppEnabledExtensionNames;
} VkInstanceCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- pApplicationInfo is NULL or a pointer to an instance of VkApplicationInfo. If not NULL, this information helps implementations recognize behavior inherent to classes of applications. VkApplicationInfo is defined in detail below.
- enabledLayerCount is the number of global layers to enable.
- ppEnabledLayerNames is a pointer to an array of enabledLayerCount null-terminated UTF-8 strings containing the names of layers to enable for the created instance. See the Layers section for further details.
- enabledExtensionCount is the number of global extensions to enable.
- ppEnabledExtensionNames is a pointer to an array of enabledExtensionCount null-terminated UTF-8 strings containing the names of extensions to enable.
typedef VkFlags VkInstanceCreateFlags;
VkInstanceCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use.
When creating a Vulkan instance for which you wish to disable validation checks, add a VkValidationFlagsEXT structure to the pNext chain of the VkInstanceCreateInfo structure, specifying the checks to be disabled.
typedef struct VkValidationFlagsEXT {
VkStructureType sType;
const void* pNext;
uint32_t disabledValidationCheckCount;
const VkValidationCheckEXT* pDisabledValidationChecks;
} VkValidationFlagsEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- disabledValidationCheckCount is the number of checks to disable.
- pDisabledValidationChecks is a pointer to an array of VkValidationCheckEXT values specifying the validation checks to be disabled.
Possible values of elements of the VkValidationFlagsEXT::pDisabledValidationChecks array, specifying validation checks to be disabled, are:
typedef enum VkValidationCheckEXT {
VK_VALIDATION_CHECK_ALL_EXT = 0,
VK_VALIDATION_CHECK_SHADERS_EXT = 1,
} VkValidationCheckEXT;
- VK_VALIDATION_CHECK_ALL_EXT specifies that all validation checks are disabled.
- VK_VALIDATION_CHECK_SHADERS_EXT specifies that shader validation is disabled.
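A minimal sketch of chaining this structure into instance creation, assuming the VK_EXT_validation_flags extension is enabled (the remaining VkInstanceCreateInfo members are omitted):
VkValidationCheckEXT disabledChecks[] = { VK_VALIDATION_CHECK_SHADERS_EXT };

VkValidationFlagsEXT validationFlags = {0};
validationFlags.sType = VK_STRUCTURE_TYPE_VALIDATION_FLAGS_EXT;
validationFlags.disabledValidationCheckCount = 1;
validationFlags.pDisabledValidationChecks = disabledChecks;

VkInstanceCreateInfo createInfo = {0};
createInfo.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
createInfo.pNext = &validationFlags;   /* chain the EXT structure */
/* ... remaining members as described above ... */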
The VkApplicationInfo
structure is defined as:
typedef struct VkApplicationInfo {
VkStructureType sType;
const void* pNext;
const char* pApplicationName;
uint32_t applicationVersion;
const char* pEngineName;
uint32_t engineVersion;
uint32_t apiVersion;
} VkApplicationInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- pApplicationName is NULL or is a pointer to a null-terminated UTF-8 string containing the name of the application.
- applicationVersion is an unsigned integer variable containing the developer-supplied version number of the application.
- pEngineName is NULL or is a pointer to a null-terminated UTF-8 string containing the name of the engine (if any) used to create the application.
- engineVersion is an unsigned integer variable containing the developer-supplied version number of the engine used to create the application.
- apiVersion must be the highest version of Vulkan that the application is designed to use, encoded as described in the API Version Numbers and Semantics section. The patch version number specified in apiVersion is ignored when creating an instance object. Only the major and minor versions of the instance must match those requested in apiVersion.
Vulkan 1.0 implementations were required to return VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0. Implementations that support Vulkan 1.1 or later must not return VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion.
Note
Because Vulkan 1.0 implementations may fail with
|
Implicit layers must be disabled if they do not support a version at least as high as apiVersion. See the Vulkan Loader Specification and Architecture Overview document for additional information.
Note
Providing a |
To destroy an instance, call:
void vkDestroyInstance(
VkInstance instance,
const VkAllocationCallbacks* pAllocator);
- instance is the handle of the instance to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
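Putting the pieces of this section together, a minimal sketch of creating and then destroying an instance (no layers or extensions are enabled, and the application name is illustrative):
VkApplicationInfo appInfo = {0};
appInfo.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
appInfo.pApplicationName = "Example Application";   /* illustrative */
appInfo.applicationVersion = 1;
appInfo.apiVersion = VK_MAKE_VERSION(1, 1, 0);

VkInstanceCreateInfo createInfo = {0};
createInfo.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
createInfo.pApplicationInfo = &appInfo;

VkInstance instance = VK_NULL_HANDLE;
VkResult result = vkCreateInstance(&createInfo, NULL, &instance);
if (result != VK_SUCCESS) {
    /* e.g. VK_ERROR_LAYER_NOT_PRESENT or VK_ERROR_EXTENSION_NOT_PRESENT */
}

/* ... use the instance ... */

vkDestroyInstance(instance, NULL);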
4. Devices and Queues
Once Vulkan is initialized, devices and queues are the primary objects used to interact with a Vulkan implementation.
Vulkan separates the concept of physical and logical devices. A physical device usually represents a single complete implementation of Vulkan (excluding instance-level functionality) available to the host, of which there are a finite number. A logical device represents an instance of that implementation with its own state and resources independent of other logical devices.
Physical devices are represented by VkPhysicalDevice
handles:
VK_DEFINE_HANDLE(VkPhysicalDevice)
4.1. Physical Devices
To retrieve a list of physical device objects representing the physical devices installed in the system, call:
VkResult vkEnumeratePhysicalDevices(
VkInstance instance,
uint32_t* pPhysicalDeviceCount,
VkPhysicalDevice* pPhysicalDevices);
- instance is a handle to a Vulkan instance previously created with vkCreateInstance.
- pPhysicalDeviceCount is a pointer to an integer related to the number of physical devices available or queried, as described below.
- pPhysicalDevices is either NULL or a pointer to an array of VkPhysicalDevice handles.
If pPhysicalDevices is NULL, then the number of physical devices available is returned in pPhysicalDeviceCount. Otherwise, pPhysicalDeviceCount must point to a variable set by the user to the number of elements in the pPhysicalDevices array, and on return the variable is overwritten with the number of handles actually written to pPhysicalDevices. If pPhysicalDeviceCount is less than the number of physical devices available, at most pPhysicalDeviceCount structures will be written. If pPhysicalDeviceCount is smaller than the number of physical devices available, VK_INCOMPLETE will be returned instead of VK_SUCCESS, to indicate that not all the available physical devices were returned.
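A minimal sketch of the resulting two-call pattern, using a fixed-size array for illustration (a real application would usually allocate pPhysicalDeviceCount handles):
uint32_t count = 0;
vkEnumeratePhysicalDevices(instance, &count, NULL);      /* first call: query the count */

VkPhysicalDevice devices[8];                              /* illustrative fixed capacity */
if (count > 8)
    count = 8;                                            /* may lead to VK_INCOMPLETE below */
VkResult result = vkEnumeratePhysicalDevices(instance, &count, devices);
/* result is VK_INCOMPLETE if fewer handles were written than are available */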
To query general properties of physical devices once enumerated, call:
void vkGetPhysicalDeviceProperties(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceProperties* pProperties);
- physicalDevice is the handle to the physical device whose properties will be queried.
- pProperties points to an instance of the VkPhysicalDeviceProperties structure, that will be filled with returned information.
The VkPhysicalDeviceProperties
structure is defined as:
typedef struct VkPhysicalDeviceProperties {
uint32_t apiVersion;
uint32_t driverVersion;
uint32_t vendorID;
uint32_t deviceID;
VkPhysicalDeviceType deviceType;
char deviceName[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE];
uint8_t pipelineCacheUUID[VK_UUID_SIZE];
VkPhysicalDeviceLimits limits;
VkPhysicalDeviceSparseProperties sparseProperties;
} VkPhysicalDeviceProperties;
- apiVersion is the version of Vulkan supported by the device, encoded as described in the API Version Numbers and Semantics section.
- driverVersion is the vendor-specified version of the driver.
- vendorID is a unique identifier for the vendor (see below) of the physical device.
- deviceID is a unique identifier for the physical device among devices available from the vendor.
- deviceType is a VkPhysicalDeviceType specifying the type of device.
- deviceName is a null-terminated UTF-8 string containing the name of the device.
- pipelineCacheUUID is an array of size VK_UUID_SIZE, containing 8-bit values that represent a universally unique identifier for the device.
- limits is the VkPhysicalDeviceLimits structure which specifies device-specific limits of the physical device. See Limits for details.
- sparseProperties is the VkPhysicalDeviceSparseProperties structure which specifies various sparse related properties of the physical device. See Sparse Properties for details.
Note
The value of |
The vendorID and deviceID fields are provided to allow applications to adapt to device characteristics that are not adequately exposed by other Vulkan queries.
Note
These may include performance profiles, hardware errata, or other characteristics. |
The vendor identified by vendorID is the entity responsible for the most salient characteristics of the underlying implementation of the VkPhysicalDevice being queried.
Note
For example, in the case of a discrete GPU implementation, this should be the GPU chipset vendor. In the case of a hardware accelerator integrated into a system-on-chip (SoC), this should be the supplier of the silicon IP used to create the accelerator. |
If the vendor has a PCI vendor ID, the low 16 bits of vendorID must contain that PCI vendor ID, and the remaining bits must be set to zero. Otherwise, the value returned must be a valid Khronos vendor ID, obtained as described in the Vulkan Documentation and Extensions: Procedures and Conventions document in the section “Registering a Vendor ID with Khronos”. Khronos vendor IDs are allocated starting at 0x10000, to distinguish them from the PCI vendor ID namespace. Khronos vendor IDs are symbolically defined in the VkVendorId type.
The vendor is also responsible for the value returned in deviceID. If the implementation is driven primarily by a PCI device with a PCI device ID, the low 16 bits of deviceID must contain that PCI device ID, and the remaining bits must be set to zero. Otherwise, the choice of what values to return may be dictated by operating system or platform policies, but should uniquely identify both the device version and any major configuration options (for example, core count in the case of multicore devices).
Note
The same device ID should be used for all physical implementations of that device version and configuration. For example, all uses of a specific silicon IP GPU version and configuration should use the same device ID, even if those uses occur in different SoCs. |
Khronos vendor IDs which may be returned in VkPhysicalDeviceProperties::vendorID are:
typedef enum VkVendorId {
VK_VENDOR_ID_VIV = 0x10001,
VK_VENDOR_ID_VSI = 0x10002,
VK_VENDOR_ID_KAZAN = 0x10003,
} VkVendorId;
Note
Khronos vendor IDs may be allocated by vendors at any time. Only the latest canonical versions of this Specification, of the corresponding vk.xml API Registry, and of the corresponding vulkan_core.h header file must contain all reserved Khronos vendor IDs.
Only Khronos vendor IDs are given symbolic names at present. PCI vendor IDs returned by the implementation can be looked up in the PCI-SIG database.
The physical device types which may be returned in VkPhysicalDeviceProperties::deviceType are:
typedef enum VkPhysicalDeviceType {
VK_PHYSICAL_DEVICE_TYPE_OTHER = 0,
VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU = 1,
VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU = 2,
VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU = 3,
VK_PHYSICAL_DEVICE_TYPE_CPU = 4,
} VkPhysicalDeviceType;
- VK_PHYSICAL_DEVICE_TYPE_OTHER - the device does not match any other available types.
- VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU - the device is typically one embedded in or tightly coupled with the host.
- VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU - the device is typically a separate processor connected to the host via an interlink.
- VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU - the device is typically a virtual node in a virtualization environment.
- VK_PHYSICAL_DEVICE_TYPE_CPU - the device is typically running on the same processors as the host.
The physical device type is advertised for informational purposes only, and does not directly affect the operation of the system. However, the device type may correlate with other advertised properties or capabilities of the system, such as how many memory heaps there are.
To query general properties of physical devices once enumerated, call:
void vkGetPhysicalDeviceProperties2(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceProperties2* pProperties);
or the equivalent command
void vkGetPhysicalDeviceProperties2KHR(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceProperties2* pProperties);
- physicalDevice is the handle to the physical device whose properties will be queried.
- pProperties points to an instance of the VkPhysicalDeviceProperties2 structure, that will be filled with returned information.
Each structure in pProperties and its pNext chain contains members corresponding to properties or implementation-dependent limits. vkGetPhysicalDeviceProperties2 writes each member to a value indicating the value of that property or limit.
The VkPhysicalDeviceProperties2
structure is defined as:
typedef struct VkPhysicalDeviceProperties2 {
VkStructureType sType;
void* pNext;
VkPhysicalDeviceProperties properties;
} VkPhysicalDeviceProperties2;
or the equivalent
typedef VkPhysicalDeviceProperties2 VkPhysicalDeviceProperties2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- properties is a structure of type VkPhysicalDeviceProperties describing the properties of the physical device. This structure is written with the same values as if it were written by vkGetPhysicalDeviceProperties.
The pNext chain of this structure is used to extend the structure with properties defined by extensions.
To query the UUID and LUID of a device, add VkPhysicalDeviceIDProperties to the pNext chain of the VkPhysicalDeviceProperties2 structure, as shown in the sketch after the member descriptions below.
The VkPhysicalDeviceIDProperties
structure is defined as:
typedef struct VkPhysicalDeviceIDProperties {
VkStructureType sType;
void* pNext;
uint8_t deviceUUID[VK_UUID_SIZE];
uint8_t driverUUID[VK_UUID_SIZE];
uint8_t deviceLUID[VK_LUID_SIZE];
uint32_t deviceNodeMask;
VkBool32 deviceLUIDValid;
} VkPhysicalDeviceIDProperties;
or the equivalent
typedef VkPhysicalDeviceIDProperties VkPhysicalDeviceIDPropertiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
deviceUUID
is an array of size VK_UUID_SIZE, containing 8-bit values that represent a universally unique identifier for the device. -
driverUUID
is an array of size VK_UUID_SIZE, containing 8-bit values that represent a universally unique identifier for the driver build in use by the device. -
deviceLUID
is an array of size VK_LUID_SIZE, containing 8-bit values that represent a locally unique identifier for the device. -
deviceNodeMask
is a bitfield identifying the node within a linked device adapter corresponding to the device. -
deviceLUIDValid
is a boolean value that will be VK_TRUE if deviceLUID contains a valid LUID and deviceNodeMask contains a valid node mask, and VK_FALSE
if they do not.
deviceUUID
must be immutable for a given device across instances,
processes, driver APIs, driver versions, and system reboots.
Applications can compare the driverUUID
value across instance and
process boundaries, and can make similar queries in external APIs to
determine whether they are capable of sharing memory objects and resources
using them with the device.
deviceUUID
and/or driverUUID
must be used to determine whether
a particular external object can be shared between driver components, where
such a restriction exists as defined in the compatibility table for the
particular object type.
If deviceLUIDValid
is VK_FALSE
, the values of deviceLUID
and deviceNodeMask
are undefined.
If deviceLUIDValid
is VK_TRUE
and Vulkan is running on the
Windows operating system, the contents of deviceLUID
can be cast to
an LUID
object and must be equal to the locally unique identifier of an
IDXGIAdapter1
object that corresponds to physicalDevice
.
If deviceLUIDValid
is VK_TRUE
, deviceNodeMask
must
contain exactly one bit.
If Vulkan is running on an operating system that supports the Direct3D 12
API and physicalDevice
corresponds to an individual device in a linked
device adapter, deviceNodeMask
identifies the Direct3D 12 node
corresponding to physicalDevice
.
Otherwise, deviceNodeMask
must be 1
.
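As an informal example (not part of the normative text), VkPhysicalDeviceIDProperties can be chained into the query as follows; physicalDevice is assumed to be a valid Vulkan 1.1 physical device handle:
VkPhysicalDeviceIDProperties idProperties;
idProperties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES;
idProperties.pNext = NULL;

VkPhysicalDeviceProperties2 properties2;
properties2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
properties2.pNext = &idProperties;               /* chain the ID properties query */

vkGetPhysicalDeviceProperties2(physicalDevice, &properties2);

if (idProperties.deviceLUIDValid == VK_TRUE) {
    /* On Windows, idProperties.deviceLUID can be compared against the LUID of an
     * IDXGIAdapter1 object, and idProperties.deviceNodeMask identifies the node
     * within a linked device adapter. */
}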
Note
Although they have identical descriptions,
VkPhysicalDeviceIDProperties::deviceUUID may differ from VkPhysicalDeviceProperties::pipelineCacheUUID. The former is intended to identify and correlate devices across API and driver boundaries, while the latter identifies a compatible device and driver combination to use when serializing and de-serializing pipeline state.
Note
While VkPhysicalDeviceIDProperties::deviceUUID is specified to remain consistent across driver versions and system reboots, it is not intended to be usable as a serializable persistent identifier for a device. For example, it may change when a device is physically added to, removed from, or moved to a different connector in a system while that system is powered down.
To query the properties of the driver corresponding to a physical device,
add VkPhysicalDeviceDriverPropertiesKHR to the pNext
chain of
the VkPhysicalDeviceProperties2 structure.
The VkPhysicalDeviceDriverPropertiesKHR
structure is defined as:
typedef struct VkPhysicalDeviceDriverPropertiesKHR {
VkStructureType sType;
void* pNext;
VkDriverIdKHR driverID;
char driverName[VK_MAX_DRIVER_NAME_SIZE_KHR];
char driverInfo[VK_MAX_DRIVER_INFO_SIZE_KHR];
VkConformanceVersionKHR conformanceVersion;
} VkPhysicalDeviceDriverPropertiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
driverID
is a unique identifier for the driver of the physical device. -
driverName
is a null-terminated UTF-8 string containing the name of the driver. -
driverInfo
is a null-terminated UTF-8 string containing additional information about the driver. -
conformanceVersion
is the version of the Vulkan conformance test this driver is conformant against (see VkConformanceVersionKHR).
driverID
must be immutable for a given driver across instances,
processes, driver versions, and system reboots.
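As an informal example (not part of the normative text), and assuming the physical device supports the VK_KHR_driver_properties extension, the driver properties can be retrieved by chaining the structure into the general properties query:
VkPhysicalDeviceDriverPropertiesKHR driverProperties;
driverProperties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES_KHR;
driverProperties.pNext = NULL;

VkPhysicalDeviceProperties2 properties2;
properties2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
properties2.pNext = &driverProperties;

vkGetPhysicalDeviceProperties2(physicalDevice, &properties2);
/* driverProperties.driverName and driverProperties.driverInfo are null-terminated UTF-8 strings */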
Khronos driver IDs which may be returned in
VkPhysicalDeviceDriverPropertiesKHR::driverID
are:
typedef enum VkDriverIdKHR {
VK_DRIVER_ID_AMD_PROPRIETARY_KHR = 1,
VK_DRIVER_ID_AMD_OPEN_SOURCE_KHR = 2,
VK_DRIVER_ID_MESA_RADV_KHR = 3,
VK_DRIVER_ID_NVIDIA_PROPRIETARY_KHR = 4,
VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS_KHR = 5,
VK_DRIVER_ID_INTEL_OPEN_SOURCE_MESA_KHR = 6,
VK_DRIVER_ID_IMAGINATION_PROPRIETARY_KHR = 7,
VK_DRIVER_ID_QUALCOMM_PROPRIETARY_KHR = 8,
VK_DRIVER_ID_ARM_PROPRIETARY_KHR = 9,
VK_DRIVER_ID_GOOGLE_PASTEL_KHR = 10,
} VkDriverIdKHR;
Note
Khronos driver IDs may be allocated by vendors at any time.
There may be multiple driver IDs for the same vendor, representing different
drivers (e.g. for different platforms, proprietary or open source, etc.).
Only driver IDs registered with Khronos are given symbolic names. There may be unregistered driver IDs returned.
The conformance test suite version an implementation is compliant with is
described with an instance of the VkConformanceVersionKHR
structure.
The VkConformanceVersionKHR
structure is defined as:
typedef struct VkConformanceVersionKHR {
uint8_t major;
uint8_t minor;
uint8_t subminor;
uint8_t patch;
} VkConformanceVersionKHR;
-
major
is the major version number of the conformance test suite. -
minor
is the minor version number of the conformance test suite. -
subminor
is the subminor version number of the conformance test suite. -
patch
is the patch version number of the conformance test suite.
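A conformance version can be compared component-wise, from major to patch. The helper below is an illustrative sketch only, not part of the API:
VkBool32 conformanceVersionAtLeast(VkConformanceVersionKHR a, VkConformanceVersionKHR b)
{
    /* Returns VK_TRUE if version a is greater than or equal to version b. */
    if (a.major != b.major)       return a.major > b.major ? VK_TRUE : VK_FALSE;
    if (a.minor != b.minor)       return a.minor > b.minor ? VK_TRUE : VK_FALSE;
    if (a.subminor != b.subminor) return a.subminor > b.subminor ? VK_TRUE : VK_FALSE;
    return a.patch >= b.patch ? VK_TRUE : VK_FALSE;
}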
To query the PCI bus information of a physical device, add
VkPhysicalDevicePCIBusInfoPropertiesEXT to the pNext
chain of
the VkPhysicalDeviceProperties2 structure.
The VkPhysicalDevicePCIBusInfoPropertiesEXT
structure is defined as:
typedef struct VkPhysicalDevicePCIBusInfoPropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t pciDomain;
uint32_t pciBus;
uint32_t pciDevice;
uint32_t pciFunction;
} VkPhysicalDevicePCIBusInfoPropertiesEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
pciDomain
is the PCI bus domain. -
pciBus
is the PCI bus identifier. -
pciDevice
is the PCI device identifier. -
pciFunction
is the PCI device function identifier.
To query properties of queues available on a physical device, call:
void vkGetPhysicalDeviceQueueFamilyProperties(
VkPhysicalDevice physicalDevice,
uint32_t* pQueueFamilyPropertyCount,
VkQueueFamilyProperties* pQueueFamilyProperties);
-
physicalDevice
is the handle to the physical device whose properties will be queried. -
pQueueFamilyPropertyCount
is a pointer to an integer related to the number of queue families available or queried, as described below. -
pQueueFamilyProperties
is either NULL
or a pointer to an array of VkQueueFamilyProperties structures.
If pQueueFamilyProperties
is NULL
, then the number of queue families
available is returned in pQueueFamilyPropertyCount
.
Implementations must support at least one queue family.
Otherwise, pQueueFamilyPropertyCount
must point to a variable set by
the user to the number of elements in the pQueueFamilyProperties
array, and on return the variable is overwritten with the number of
structures actually written to pQueueFamilyProperties
.
If pQueueFamilyPropertyCount
is less than the number of queue families
available, at most pQueueFamilyPropertyCount
structures will be
written.
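As an informal example (not part of the normative text) of the usual two-call pattern, assuming <stdlib.h> is available for malloc:
uint32_t queueFamilyCount = 0;
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &queueFamilyCount, NULL);

VkQueueFamilyProperties* queueFamilies =
    malloc(queueFamilyCount * sizeof(VkQueueFamilyProperties));
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &queueFamilyCount, queueFamilies);
/* queueFamilies[i] now describes queue family i; free(queueFamilies) when no longer needed */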
The VkQueueFamilyProperties
structure is defined as:
typedef struct VkQueueFamilyProperties {
VkQueueFlags queueFlags;
uint32_t queueCount;
uint32_t timestampValidBits;
VkExtent3D minImageTransferGranularity;
} VkQueueFamilyProperties;
-
queueFlags
is a bitmask of VkQueueFlagBits indicating capabilities of the queues in this queue family. -
queueCount
is the unsigned integer count of queues in this queue family. Each queue family must support at least one queue. -
timestampValidBits
is the unsigned integer count of meaningful bits in the timestamps written via vkCmdWriteTimestamp. The valid range for the count is 36..64 bits, or a value of 0, indicating no support for timestamps. Bits outside the valid range are guaranteed to be zeros. -
minImageTransferGranularity
is the minimum granularity supported for image transfer operations on the queues in this queue family.
The value returned in minImageTransferGranularity
has a unit of
compressed texel blocks for images having a block-compressed format, and a
unit of texels otherwise.
Possible values of minImageTransferGranularity
are:
-
(0,0,0) which indicates that only whole mip levels must be transferred using the image transfer operations on the corresponding queues. In this case, the following restrictions apply to all offset and extent parameters of image transfer operations:
-
The x, y, and z members of a VkOffset3D parameter must always be zero. -
The width, height, and depth members of a VkExtent3D parameter must always match the width, height, and depth of the image subresource corresponding to the parameter, respectively.
-
-
(Ax, Ay, Az) where Ax, Ay, and Az are all integer powers of two. In this case the following restrictions apply to all image transfer operations:
-
x, y, and z of a VkOffset3D parameter must be integer multiples of Ax, Ay, and Az, respectively. -
width of a VkExtent3D parameter must be an integer multiple of Ax, or else x + width must equal the width of the image subresource corresponding to the parameter. -
height of a VkExtent3D parameter must be an integer multiple of Ay, or else y + height must equal the height of the image subresource corresponding to the parameter. -
depth of a VkExtent3D parameter must be an integer multiple of Az, or else z + depth must equal the depth of the image subresource corresponding to the parameter. -
If the format of the image corresponding to the parameters is one of the block-compressed formats then for the purposes of the above calculations the granularity must be scaled up by the compressed texel block dimensions.
-
Queues supporting graphics and/or compute operations must report
(1,1,1) in minImageTransferGranularity
, meaning that there are
no additional restrictions on the granularity of image transfer operations
for these queues.
Other queues supporting image transfer operations are only required to
support whole mip level transfers, thus minImageTransferGranularity
for queues belonging to such queue families may be (0,0,0).
The Device Memory section describes memory properties queried from the physical device.
For physical device feature queries see the Features chapter.
Bits which may be set in VkQueueFamilyProperties::queueFlags
indicating capabilities of queues in a queue family are:
typedef enum VkQueueFlagBits {
VK_QUEUE_GRAPHICS_BIT = 0x00000001,
VK_QUEUE_COMPUTE_BIT = 0x00000002,
VK_QUEUE_TRANSFER_BIT = 0x00000004,
VK_QUEUE_SPARSE_BINDING_BIT = 0x00000008,
VK_QUEUE_PROTECTED_BIT = 0x00000010,
} VkQueueFlagBits;
-
VK_QUEUE_GRAPHICS_BIT
specifies that queues in this queue family support graphics operations. -
VK_QUEUE_COMPUTE_BIT
specifies that queues in this queue family support compute operations. -
VK_QUEUE_TRANSFER_BIT
specifies that queues in this queue family support transfer operations. -
VK_QUEUE_SPARSE_BINDING_BIT
specifies that queues in this queue family support sparse memory management operations (see Sparse Resources). If any of the sparse resource features are enabled, then at least one queue family must support this bit. -
VK_QUEUE_PROTECTED_BIT
specifies that queues in this queue family support the VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT bit (see Protected Memory). If the protected memory physical device feature is supported, then at least one queue family of at least one physical device exposed by the implementation must support this bit.
If an implementation exposes any queue family that supports graphics operations, at least one queue family of at least one physical device exposed by the implementation must support both graphics and compute operations.
Furthermore, if the protected memory physical device feature is supported, then at least one queue family of at least one physical device exposed by the implementation must support graphics operations, compute operations, and protected memory operations.
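As an informal illustration (not part of the normative text), an application will typically scan the returned queue family properties for a family whose queueFlags contain the capabilities it needs. The helper below is a hypothetical sketch, with queueFamilies and queueFamilyCount assumed to come from vkGetPhysicalDeviceQueueFamilyProperties:
uint32_t findQueueFamily(const VkQueueFamilyProperties* queueFamilies,
                         uint32_t queueFamilyCount,
                         VkQueueFlags requiredFlags)
{
    for (uint32_t i = 0; i < queueFamilyCount; ++i) {
        if ((queueFamilies[i].queueFlags & requiredFlags) == requiredFlags)
            return i;                 /* first family supporting all requested capabilities */
    }
    return UINT32_MAX;                /* no such family (UINT32_MAX is from <stdint.h>) */
}
For example, findQueueFamily(queueFamilies, queueFamilyCount, VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT) locates a family supporting both graphics and compute operations, which, per the requirement above, at least one queue family of at least one physical device must expose if any queue family supports graphics operations.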
Note
All commands that are allowed on a queue that supports transfer operations
are also allowed on a queue that supports either graphics or compute
operations.
Thus, if the capabilities of a queue family include
VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT, then reporting the VK_QUEUE_TRANSFER_BIT capability separately for that queue family is optional.
For further details see Queues.
typedef VkFlags VkQueueFlags;
VkQueueFlags
is a bitmask type for setting a mask of zero or more
VkQueueFlagBits.
To query properties of queues available on a physical device, call:
void vkGetPhysicalDeviceQueueFamilyProperties2(
VkPhysicalDevice physicalDevice,
uint32_t* pQueueFamilyPropertyCount,
VkQueueFamilyProperties2* pQueueFamilyProperties);
or the equivalent command
void vkGetPhysicalDeviceQueueFamilyProperties2KHR(
VkPhysicalDevice physicalDevice,
uint32_t* pQueueFamilyPropertyCount,
VkQueueFamilyProperties2* pQueueFamilyProperties);
-
physicalDevice
is the handle to the physical device whose properties will be queried. -
pQueueFamilyPropertyCount
is a pointer to an integer related to the number of queue families available or queried, as described in vkGetPhysicalDeviceQueueFamilyProperties. -
pQueueFamilyProperties
is either NULL
or a pointer to an array of VkQueueFamilyProperties2 structures.
vkGetPhysicalDeviceQueueFamilyProperties2
behaves similarly to
vkGetPhysicalDeviceQueueFamilyProperties, with the ability to return
extended information in a pNext
chain of output structures.
The VkQueueFamilyProperties2
structure is defined as:
typedef struct VkQueueFamilyProperties2 {
VkStructureType sType;
void* pNext;
VkQueueFamilyProperties queueFamilyProperties;
} VkQueueFamilyProperties2;
or the equivalent
typedef VkQueueFamilyProperties2 VkQueueFamilyProperties2KHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
queueFamilyProperties
is a structure of type VkQueueFamilyProperties which is populated with the same values as in vkGetPhysicalDeviceQueueFamilyProperties.
Additional queue family information can be queried by setting
VkQueueFamilyProperties2::pNext
to point to an instance of the
VkQueueFamilyCheckpointPropertiesNV structure.
The VkQueueFamilyCheckpointPropertiesNV structure is defined as:
typedef struct VkQueueFamilyCheckpointPropertiesNV {
VkStructureType sType;
void* pNext;
VkPipelineStageFlags checkpointExecutionStageMask;
} VkQueueFamilyCheckpointPropertiesNV;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
checkpointExecutionStageMask
is a mask indicating which pipeline stages the implementation can execute checkpoint markers in.
4.2. Devices
Device objects represent logical connections to physical devices. Each device exposes a number of queue families each having one or more queues. All queues in a queue family support the same operations.
As described in Physical Devices, a Vulkan application will first query for all physical devices in a system. Each physical device can then be queried for its capabilities, including its queue and queue family properties. Once an acceptable physical device is identified, an application will create a corresponding logical device. An application must create a separate logical device for each physical device it will use. The created logical device is then the primary interface to the physical device.
How to enumerate the physical devices in a system and query those physical devices for their queue family properties is described in the Physical Device Enumeration section above.
A single logical device can also be created from multiple physical devices, if those physical devices belong to the same device group. A device group is a set of physical devices that support accessing each other’s memory and recording a single command buffer that can be executed on all the physical devices. Device groups are enumerated by calling vkEnumeratePhysicalDeviceGroups, and a logical device is created from a subset of the physical devices in a device group by passing the physical devices through VkDeviceGroupDeviceCreateInfo.
To retrieve a list of the device groups present in the system, call:
VkResult vkEnumeratePhysicalDeviceGroups(
VkInstance instance,
uint32_t* pPhysicalDeviceGroupCount,
VkPhysicalDeviceGroupProperties* pPhysicalDeviceGroupProperties);
or the equivalent command
VkResult vkEnumeratePhysicalDeviceGroupsKHR(
VkInstance instance,
uint32_t* pPhysicalDeviceGroupCount,
VkPhysicalDeviceGroupProperties* pPhysicalDeviceGroupProperties);
-
instance
is a handle to a Vulkan instance previously created with vkCreateInstance. -
pPhysicalDeviceGroupCount
is a pointer to an integer related to the number of device groups available or queried, as described below. -
pPhysicalDeviceGroupProperties
is either NULL
or a pointer to an array of VkPhysicalDeviceGroupProperties structures.
If pPhysicalDeviceGroupProperties
is NULL
, then the number of device
groups available is returned in pPhysicalDeviceGroupCount
.
Otherwise, pPhysicalDeviceGroupCount
must point to a variable set by
the user to the number of elements in the
pPhysicalDeviceGroupProperties
array, and on return the variable is
overwritten with the number of structures actually written to
pPhysicalDeviceGroupProperties
.
If pPhysicalDeviceGroupCount
is less than the number of device groups
available, at most pPhysicalDeviceGroupCount
structures will be
written.
If pPhysicalDeviceGroupCount
is smaller than the number of device
groups available, VK_INCOMPLETE
will be returned instead of
VK_SUCCESS
, to indicate that not all the available device groups were
returned.
Every physical device must be in exactly one device group.
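As an informal example (not part of the normative text), device groups are typically enumerated with the same two-call pattern used elsewhere; note that the application fills in sType and pNext of each output structure (malloc is assumed from <stdlib.h>):
uint32_t groupCount = 0;
vkEnumeratePhysicalDeviceGroups(instance, &groupCount, NULL);

VkPhysicalDeviceGroupProperties* groups =
    malloc(groupCount * sizeof(VkPhysicalDeviceGroupProperties));
for (uint32_t i = 0; i < groupCount; ++i) {
    groups[i].sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES;
    groups[i].pNext = NULL;
}
vkEnumeratePhysicalDeviceGroups(instance, &groupCount, groups);
/* groups[i].physicalDevices[0..physicalDeviceCount-1] are the members of group i */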
The VkPhysicalDeviceGroupProperties
structure is defined as:
typedef struct VkPhysicalDeviceGroupProperties {
VkStructureType sType;
void* pNext;
uint32_t physicalDeviceCount;
VkPhysicalDevice physicalDevices[VK_MAX_DEVICE_GROUP_SIZE];
VkBool32 subsetAllocation;
} VkPhysicalDeviceGroupProperties;
or the equivalent
typedef VkPhysicalDeviceGroupProperties VkPhysicalDeviceGroupPropertiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
physicalDeviceCount
is the number of physical devices in the group. -
physicalDevices
is an array of physical device handles representing all physical devices in the group. The first physicalDeviceCount elements of the array will be valid. -
subsetAllocation
specifies whether logical devices created from the group support allocating device memory on a subset of devices, via the deviceMask member of the VkMemoryAllocateFlagsInfo. If this is VK_FALSE, then all device memory allocations are made across all physical devices in the group. If physicalDeviceCount is 1, then subsetAllocation must be VK_FALSE.
4.2.1. Device Creation
Logical devices are represented by VkDevice
handles:
VK_DEFINE_HANDLE(VkDevice)
A logical device is created as a connection to a physical device. To create a logical device, call:
VkResult vkCreateDevice(
VkPhysicalDevice physicalDevice,
const VkDeviceCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDevice* pDevice);
-
physicalDevice
must be one of the device handles returned from a call to vkEnumeratePhysicalDevices (see Physical Device Enumeration). -
pCreateInfo
is a pointer to a VkDeviceCreateInfo structure containing information about how to create the device. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pDevice
points to a handle in which the created VkDevice is returned.
vkCreateDevice
verifies that extensions and features requested in the
ppEnabledExtensionNames
and pEnabledFeatures
members of
pCreateInfo
, respectively, are supported by the implementation.
If any requested extension is not supported, vkCreateDevice
must
return VK_ERROR_EXTENSION_NOT_PRESENT
.
If any requested feature is not supported, vkCreateDevice
must return
VK_ERROR_FEATURE_NOT_PRESENT
.
Support for extensions can be checked before creating a device by querying
vkEnumerateDeviceExtensionProperties.
Support for features can similarly be checked by querying
vkGetPhysicalDeviceFeatures.
After verifying and enabling the extensions, the VkDevice
object is
created and returned to the application.
If a requested extension is only supported by a layer, both the layer and
the extension need to be specified at vkCreateInstance
time for the
creation to succeed.
Multiple logical devices can be created from the same physical device.
Logical device creation may fail due to lack of device-specific resources
(in addition to the other errors).
If that occurs, vkCreateDevice
will return
VK_ERROR_TOO_MANY_OBJECTS
.
The VkDeviceCreateInfo
structure is defined as:
typedef struct VkDeviceCreateInfo {
VkStructureType sType;
const void* pNext;
VkDeviceCreateFlags flags;
uint32_t queueCreateInfoCount;
const VkDeviceQueueCreateInfo* pQueueCreateInfos;
uint32_t enabledLayerCount;
const char* const* ppEnabledLayerNames;
uint32_t enabledExtensionCount;
const char* const* ppEnabledExtensionNames;
const VkPhysicalDeviceFeatures* pEnabledFeatures;
} VkDeviceCreateInfo;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
queueCreateInfoCount
is the unsigned integer size of the pQueueCreateInfos array. Refer to the Queue Creation section below for further details. -
pQueueCreateInfos
is a pointer to an array of VkDeviceQueueCreateInfo structures describing the queues that are requested to be created along with the logical device. Refer to the Queue Creation section below for further details. -
enabledLayerCount
is deprecated and ignored. -
ppEnabledLayerNames
is deprecated and ignored. See Device Layer Deprecation. -
enabledExtensionCount
is the number of device extensions to enable. -
ppEnabledExtensionNames
is a pointer to an array of enabledExtensionCount null-terminated UTF-8 strings containing the names of extensions to enable for the created device. See the Extensions section for further details. -
pEnabledFeatures
is NULL
or a pointer to a VkPhysicalDeviceFeatures structure that contains boolean indicators of all the features to be enabled. Refer to the Features section for further details.
typedef VkFlags VkDeviceCreateFlags;
VkDeviceCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
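As an informal example (not part of the normative text), a minimal device creation requesting a single queue might look as follows; graphicsQueueFamilyIndex is an assumed value obtained from the queue family query, and no layers, extensions, or features are enabled:
float queuePriority = 1.0f;

VkDeviceQueueCreateInfo queueCreateInfo;
queueCreateInfo.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
queueCreateInfo.pNext = NULL;
queueCreateInfo.flags = 0;
queueCreateInfo.queueFamilyIndex = graphicsQueueFamilyIndex;
queueCreateInfo.queueCount = 1;
queueCreateInfo.pQueuePriorities = &queuePriority;

VkDeviceCreateInfo createInfo;
createInfo.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
createInfo.pNext = NULL;
createInfo.flags = 0;
createInfo.queueCreateInfoCount = 1;
createInfo.pQueueCreateInfos = &queueCreateInfo;
createInfo.enabledLayerCount = 0;           /* deprecated and ignored */
createInfo.ppEnabledLayerNames = NULL;
createInfo.enabledExtensionCount = 0;
createInfo.ppEnabledExtensionNames = NULL;
createInfo.pEnabledFeatures = NULL;

VkDevice device;
VkResult result = vkCreateDevice(physicalDevice, &createInfo, NULL, &device);
/* result may be VK_ERROR_EXTENSION_NOT_PRESENT, VK_ERROR_FEATURE_NOT_PRESENT,
 * VK_ERROR_TOO_MANY_OBJECTS, or another error code on failure */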
A logical device can be created that connects to one or more physical
devices by including a VkDeviceGroupDeviceCreateInfo
structure in the
pNext
chain of VkDeviceCreateInfo.
The VkDeviceGroupDeviceCreateInfo
structure is defined as:
typedef struct VkDeviceGroupDeviceCreateInfo {
VkStructureType sType;
const void* pNext;
uint32_t physicalDeviceCount;
const VkPhysicalDevice* pPhysicalDevices;
} VkDeviceGroupDeviceCreateInfo;
or the equivalent
typedef VkDeviceGroupDeviceCreateInfo VkDeviceGroupDeviceCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
physicalDeviceCount
is the number of elements in the pPhysicalDevices array. -
pPhysicalDevices
is an array of physical device handles belonging to the same device group.
The elements of the pPhysicalDevices
array are an ordered list of the
physical devices that the logical device represents.
These must be a subset of a single device group, and need not be in the
same order as they were enumerated.
The order of the physical devices in the pPhysicalDevices
array
determines the device index of each physical device, with element i
being assigned a device index of i.
Certain commands and structures refer to one or more physical devices by
using device indices or device masks formed using device indices.
A logical device created without using VkDeviceGroupDeviceCreateInfo
,
or with physicalDeviceCount
equal to zero, is equivalent to a
physicalDeviceCount
of one and pPhysicalDevices
pointing to the
physicalDevice
parameter to vkCreateDevice.
In particular, the device index of that physical device is zero.
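As an informal example (not part of the normative text), all physical devices of an enumerated device group can be combined into one logical device by chaining the structure as shown below; groups[0] is assumed to come from vkEnumeratePhysicalDeviceGroups, and createInfo is an otherwise initialized VkDeviceCreateInfo:
VkDeviceGroupDeviceCreateInfo groupCreateInfo;
groupCreateInfo.sType = VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO;
groupCreateInfo.pNext = NULL;
groupCreateInfo.physicalDeviceCount = groups[0].physicalDeviceCount;
groupCreateInfo.pPhysicalDevices = groups[0].physicalDevices;

createInfo.pNext = &groupCreateInfo;   /* chain into VkDeviceCreateInfo before vkCreateDevice */

VkDevice device;
VkResult result = vkCreateDevice(groups[0].physicalDevices[0], &createInfo, NULL, &device);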
To specify whether device memory allocation is allowed beyond the size
reported by VkPhysicalDeviceMemoryProperties, add a
VkDeviceMemoryOverallocationCreateInfoAMD structure to the pNext
chain of the VkDeviceCreateInfo structure.
If this structure is not specified, it is as if the
VK_MEMORY_OVERALLOCATION_BEHAVIOR_DEFAULT_AMD value is used.
typedef struct VkDeviceMemoryOverallocationCreateInfoAMD {
VkStructureType sType;
const void* pNext;
VkMemoryOverallocationBehaviorAMD overallocationBehavior;
} VkDeviceMemoryOverallocationCreateInfoAMD;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
overallocationBehavior
is the desired overallocation behavior.
Possible values for VkDeviceMemoryOverallocationCreateInfoAMD::overallocationBehavior include:
typedef enum VkMemoryOverallocationBehaviorAMD {
VK_MEMORY_OVERALLOCATION_BEHAVIOR_DEFAULT_AMD = 0,
VK_MEMORY_OVERALLOCATION_BEHAVIOR_ALLOWED_AMD = 1,
VK_MEMORY_OVERALLOCATION_BEHAVIOR_DISALLOWED_AMD = 2,
} VkMemoryOverallocationBehaviorAMD;
-
VK_MEMORY_OVERALLOCATION_BEHAVIOR_DEFAULT_AMD
lets the implementation decide if overallocation should be allowed. -
VK_MEMORY_OVERALLOCATION_BEHAVIOR_ALLOWED_AMD
specifies overallocation is allowed if platform permits. -
VK_MEMORY_OVERALLOCATION_BEHAVIOR_DISALLOWED_AMD
specifies the application is not allowed to allocate device memory beyond the heap sizes reported by VkPhysicalDeviceMemoryProperties. Allocations that are not explicitly made by the application within the scope of the Vulkan instance are not accounted for.
4.2.2. Device Use
The following is a high-level list of VkDevice
uses along with
references on where to find more information:
-
Creation of queues. See the Queues section below for further details.
-
Creation and tracking of various synchronization constructs. See Synchronization and Cache Control for further details.
-
Allocating, freeing, and managing memory. See Memory Allocation and Resource Creation for further details.
-
Creation and destruction of command buffers and command buffer pools. See Command Buffers for further details.
-
Creation, destruction, and management of graphics state. See Pipelines and Resource Descriptors, among others, for further details.
4.2.3. Lost Device
A logical device may become lost for a number of implementation-specific reasons, indicating that pending and future command execution may fail and cause resources and backing memory to become undefined.
Note
Typical reasons for device loss will include things like execution timing out (to prevent denial of service), power management events, platform resource management, or implementation errors.
When this happens, certain commands will return VK_ERROR_DEVICE_LOST
(see Error Codes for a list of such commands).
After any such event, the logical device is considered lost.
It is not possible to reset the logical device to a non-lost state, however
the lost state is specific to a logical device (VkDevice
), and the
corresponding physical device (VkPhysicalDevice
) may be otherwise
unaffected.
In some cases, the physical device may also be lost, and attempting to
create a new logical device will fail, returning VK_ERROR_DEVICE_LOST
.
This is usually indicative of a problem with the underlying implementation,
or its connection to the host.
If the physical device has not been lost, and a new logical device is
successfully created from that physical device, it must be in the non-lost
state.
Note
Whilst logical device loss may be recoverable, in the case of physical device loss, it is unlikely that an application will be able to recover unless additional, unaffected physical devices exist on the system. The error is largely informational and intended only to inform the user that a platform issue has occurred, and should be investigated further. For example, underlying hardware may have developed a fault or become physically disconnected from the rest of the system. In many cases, physical device loss may cause other more serious issues such as the operating system crashing; in which case it may not be reported via the Vulkan API.
Note
Undefined behavior caused by an application error may cause a device to
become lost.
However, such undefined behavior may also cause unrecoverable damage to
the process, and it is then not guaranteed that the API objects, including
the VkPhysicalDevice or the VkInstance, are usable.
When a device is lost, its child objects are not implicitly destroyed and their handles are still valid. Those objects must still be destroyed before their parents or the device can be destroyed (see the Object Lifetime section). The host address space corresponding to device memory mapped using vkMapMemory is still valid, and host memory accesses to these mapped regions are still valid, but the contents are undefined. It is still legal to call any API command on the device and child objects.
Once a device is lost, command execution may fail, and commands that return
a VkResult may return VK_ERROR_DEVICE_LOST
.
Commands that do not allow run-time errors must still operate correctly for
valid usage and, if applicable, return valid data.
Commands that wait indefinitely for device execution (namely
vkDeviceWaitIdle, vkQueueWaitIdle, vkWaitForFences
or vkAcquireNextImageKHR
with a maximum timeout
, and vkGetQueryPoolResults with the
VK_QUERY_RESULT_WAIT_BIT
bit set in flags
) must return in
finite time even in the case of a lost device, and return either
VK_SUCCESS
or VK_ERROR_DEVICE_LOST
.
For any command that may return VK_ERROR_DEVICE_LOST
, for the purpose
of determining whether a command buffer is in the
pending state, or whether resources are
considered in-use by the device, a return value of
VK_ERROR_DEVICE_LOST
is equivalent to VK_SUCCESS
.
The content of any external memory objects that have been exported from or
imported to a lost device become undefined.
Objects on other logical devices or in other APIs which are associated with
the same underlying memory resource as the external memory objects on the
lost device are unaffected other than their content becoming undefined.
The layout of subresources of images on other logical devices that are bound
to VkDeviceMemory
objects associated with the same underlying memory
resources as external memory objects on the lost device becomes
VK_IMAGE_LAYOUT_UNDEFINED
.
The state of VkSemaphore
objects on other logical devices created by
importing a semaphore payload with
temporary permanence which was exported from the lost device is undefined.
The state of VkSemaphore
objects on other logical devices that
permanently share a semaphore payload with a VkSemaphore
object on the
lost device is undefined, and remains undefined following any subsequent
signal operations.
Implementations must ensure pending and subsequently submitted wait
operations on such semaphores behave as defined in
Semaphore State Requirements For
Wait Operations for external semaphores not in a valid state for a wait
operation.
4.2.4. Device Destruction
To destroy a device, call:
void vkDestroyDevice(
VkDevice device,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
To ensure that no work is active on the device, vkDeviceWaitIdle can
be used to gate the destruction of the device.
Prior to destroying a device, an application is responsible for
destroying/freeing any Vulkan objects that were created using that device as
the first parameter of the corresponding vkCreate*
or
vkAllocate*
command.
Note
The lifetime of each of these objects is bound by the lifetime of the
VkDevice object. Therefore, to avoid resource leaks, an application should destroy or free all such objects before destroying the VkDevice.
4.3. Queues
4.3.1. Queue Family Properties
As discussed in the Physical Device Enumeration section above, the vkGetPhysicalDeviceQueueFamilyProperties command is used to retrieve details about the queue families and queues supported by a device.
Each index in the pQueueFamilyProperties
array returned by
vkGetPhysicalDeviceQueueFamilyProperties describes a unique queue
family on that physical device.
These indices are used when creating queues, and they correspond directly
with the queueFamilyIndex
that is passed to the vkCreateDevice
command via the VkDeviceQueueCreateInfo structure as described in the
Queue Creation section below.
Grouping of queue families within a physical device is implementation-dependent.
Note
The general expectation is that a physical device groups all queues of matching capabilities into a single family. However, while implementations should do this, it is possible that a physical device may return two separate queue families with the same capabilities.
Once an application has identified a physical device with the queue(s) that it desires to use, it will create those queues in conjunction with a logical device. This is described in the following section.
4.3.2. Queue Creation
Creating a logical device also creates the queues associated with that
device.
The queues to create are described by a set of VkDeviceQueueCreateInfo
structures that are passed to vkCreateDevice in
pQueueCreateInfos
.
Queues are represented by VkQueue
handles:
VK_DEFINE_HANDLE(VkQueue)
The VkDeviceQueueCreateInfo
structure is defined as:
typedef struct VkDeviceQueueCreateInfo {
VkStructureType sType;
const void* pNext;
VkDeviceQueueCreateFlags flags;
uint32_t queueFamilyIndex;
uint32_t queueCount;
const float* pQueuePriorities;
} VkDeviceQueueCreateInfo;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
flags
is a bitmask indicating behavior of the queue. -
queueFamilyIndex
is an unsigned integer indicating the index of the queue family to create on this device. This index corresponds to the index of an element of the pQueueFamilyProperties array that was returned by vkGetPhysicalDeviceQueueFamilyProperties. -
queueCount
is an unsigned integer specifying the number of queues to create in the queue family indicated by queueFamilyIndex. -
pQueuePriorities
is an array of queueCount normalized floating point values, specifying priorities of work that will be submitted to each created queue. See Queue Priority for more information.
Bits which can be set in VkDeviceQueueCreateInfo::flags
to
specify usage behavior of the queue are:
typedef enum VkDeviceQueueCreateFlagBits {
VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT = 0x00000001,
} VkDeviceQueueCreateFlagBits;
-
VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT
specifies that the device queue is a protected-capable queue. If the protected memory feature is not enabled, the VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT bit of flags must not be set.
typedef VkFlags VkDeviceQueueCreateFlags;
VkDeviceQueueCreateFlags
is a bitmask type for setting a mask of zero
or more VkDeviceQueueCreateFlagBits.
A queue can be created with a system-wide priority by including a
VkDeviceQueueGlobalPriorityCreateInfoEXT
structure in the pNext
chain of VkDeviceQueueCreateInfo.
The VkDeviceQueueGlobalPriorityCreateInfoEXT
structure is defined as:
typedef struct VkDeviceQueueGlobalPriorityCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkQueueGlobalPriorityEXT globalPriority;
} VkDeviceQueueGlobalPriorityCreateInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
globalPriority
is the system-wide priority associated with this queue, as specified by VkQueueGlobalPriorityEXT.
A queue created without specifying
VkDeviceQueueGlobalPriorityCreateInfoEXT
will default to
VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT
.
Possible values of
VkDeviceQueueGlobalPriorityCreateInfoEXT::globalPriority
,
specifying a system-wide priority level are:
typedef enum VkQueueGlobalPriorityEXT {
VK_QUEUE_GLOBAL_PRIORITY_LOW_EXT = 128,
VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT = 256,
VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT = 512,
VK_QUEUE_GLOBAL_PRIORITY_REALTIME_EXT = 1024,
} VkQueueGlobalPriorityEXT;
Priority values are sorted in ascending order. A comparison operation on the enum values can be used to determine the priority order.
-
VK_QUEUE_GLOBAL_PRIORITY_LOW_EXT
is below the system default. Useful for non-interactive tasks. -
VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT
is the system default priority. -
VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT
is above the system default. -
VK_QUEUE_GLOBAL_PRIORITY_REALTIME_EXT
is the highest priority. Useful for critical tasks.
Queues with higher system priority may be allotted more processing time than queues with lower priority. An implementation may allow a higher-priority queue to starve a lower-priority queue until the higher-priority queue has no further commands to execute.
Priorities imply no ordering or scheduling constraints.
No specific guarantees are made about higher priority queues receiving more processing time or better quality of service than lower priority queues.
The global priority level of a queue takes precedence over the per-process
queue priority (VkDeviceQueueCreateInfo
::pQueuePriorities
).
Abuse of this feature may result in starving the rest of the system of
implementation resources.
Therefore, the driver implementation may deny requests to acquire a
priority above the default priority
(VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT
) if the caller does not have
sufficient privileges.
In this scenario VK_ERROR_NOT_PERMITTED_EXT
is returned.
The driver implementation may fail the queue allocation request if
resources required to complete the operation have been exhausted (either by
the same process or a different process).
In this scenario VK_ERROR_INITIALIZATION_FAILED
is returned.
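As an informal example (not part of the normative text), and assuming the VK_EXT_global_priority device extension is enabled, a high-priority queue can be requested by chaining the structure into a VkDeviceQueueCreateInfo (queueCreateInfo below) before device creation:
VkDeviceQueueGlobalPriorityCreateInfoEXT globalPriorityInfo;
globalPriorityInfo.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_GLOBAL_PRIORITY_CREATE_INFO_EXT;
globalPriorityInfo.pNext = NULL;
globalPriorityInfo.globalPriority = VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT;

queueCreateInfo.pNext = &globalPriorityInfo;
/* vkCreateDevice may then fail with VK_ERROR_NOT_PERMITTED_EXT if the caller lacks
 * sufficient privileges, or VK_ERROR_INITIALIZATION_FAILED if resources are exhausted */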
To retrieve a handle to a VkQueue object, call:
void vkGetDeviceQueue(
VkDevice device,
uint32_t queueFamilyIndex,
uint32_t queueIndex,
VkQueue* pQueue);
-
device
is the logical device that owns the queue. -
queueFamilyIndex
is the index of the queue family to which the queue belongs. -
queueIndex
is the index within this queue family of the queue to retrieve. -
pQueue
is a pointer to a VkQueue object that will be filled with the handle for the requested queue.
vkGetDeviceQueue
must only be used to get queues that were created
with the flags
parameter of VkDeviceQueueCreateInfo
set to zero.
To get queues that were created with a non-zero flags
parameter use
vkGetDeviceQueue2.
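As an informal example (not part of the normative text), retrieving the first queue of a family that was created with flags equal to zero looks as follows; graphicsQueueFamilyIndex is assumed to match a VkDeviceQueueCreateInfo used at device creation:
VkQueue graphicsQueue;
vkGetDeviceQueue(device, graphicsQueueFamilyIndex, 0, &graphicsQueue);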
To retrieve a handle to a VkQueue object with specific VkDeviceQueueCreateFlags creation flags, call:
void vkGetDeviceQueue2(
VkDevice device,
const VkDeviceQueueInfo2* pQueueInfo,
VkQueue* pQueue);
-
device
is the logical device that owns the queue. -
pQueueInfo
points to an instance of the VkDeviceQueueInfo2 structure, describing the parameters used to create the device queue. -
pQueue
is a pointer to a VkQueue object that will be filled with the handle for the requested queue.
The VkDeviceQueueInfo2
structure is defined as:
typedef struct VkDeviceQueueInfo2 {
VkStructureType sType;
const void* pNext;
VkDeviceQueueCreateFlags flags;
uint32_t queueFamilyIndex;
uint32_t queueIndex;
} VkDeviceQueueInfo2;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. The pNext chain of VkDeviceQueueInfo2 is used to provide additional parameters to vkGetDeviceQueue2. -
flags
is a VkDeviceQueueCreateFlags value indicating the flags used to create the device queue. -
queueFamilyIndex
is the index of the queue family to which the queue belongs. -
queueIndex
is the index within this queue family of the queue to retrieve.
The queue returned by vkGetDeviceQueue2
must have the same
flags
value from this structure as that used at device creation time
in a VkDeviceQueueCreateInfo
instance.
If no matching flags
were specified at device creation time then
pQueue
will return VK_NULL_HANDLE.
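As an informal example (not part of the normative text), a protected-capable queue created with VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT would be retrieved as follows; protectedQueueFamilyIndex is an assumed value matching the device creation parameters:
VkDeviceQueueInfo2 queueInfo2;
queueInfo2.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_INFO_2;
queueInfo2.pNext = NULL;
queueInfo2.flags = VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT;
queueInfo2.queueFamilyIndex = protectedQueueFamilyIndex;
queueInfo2.queueIndex = 0;

VkQueue protectedQueue;
vkGetDeviceQueue2(device, &queueInfo2, &protectedQueue);
/* protectedQueue is VK_NULL_HANDLE if no queue with matching flags was created */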
4.3.3. Queue Family Index
The queue family index is used in multiple places in Vulkan in order to tie operations to a specific family of queues.
When retrieving a handle to the queue via vkGetDeviceQueue
, the queue
family index is used to select which queue family to retrieve the
VkQueue
handle from as described in the previous section.
When creating a VkCommandPool
object (see
Command Pools), a queue family index is specified
in the VkCommandPoolCreateInfo structure.
Command buffers from this pool can only be submitted on queues
corresponding to this queue family.
When creating VkImage
(see Images) and
VkBuffer
(see Buffers) resources, a set of queue
families is included in the VkImageCreateInfo and
VkBufferCreateInfo structures to specify the queue families that can
access the resource.
When inserting a VkBufferMemoryBarrier or VkImageMemoryBarrier (see Events) a source and destination queue family index is specified to allow the ownership of a buffer or image to be transferred from one queue family to another. See the Resource Sharing section for details.
4.3.4. Queue Priority
Each queue is assigned a priority, as set in the VkDeviceQueueCreateInfo structures when creating the device. The priority of each queue is a normalized floating point value between 0.0 and 1.0, which is then translated to a discrete priority level by the implementation. Higher values indicate a higher priority, with 0.0 being the lowest priority and 1.0 being the highest.
Within the same device, queues with higher priority may be allotted more processing time than queues with lower priority. The implementation makes no guarantees with regard to ordering or scheduling among queues with the same priority, other than the constraints defined by any explicit synchronization primitives. The implementation makes no guarantees with regard to queues across different devices.
An implementation may allow a higher-priority queue to starve a
lower-priority queue on the same VkDevice
until the higher-priority
queue has no further commands to execute.
The relationship of queue priorities must not cause queues on one
VkDevice
to starve queues on another VkDevice
.
No specific guarantees are made about higher priority queues receiving more processing time or better quality of service than lower priority queues.
4.3.5. Queue Submission
Work is submitted to a queue via queue submission commands such as vkQueueSubmit. Queue submission commands define a set of queue operations to be executed by the underlying physical device, including synchronization with semaphores and fences.
Submission commands take as parameters a target queue, zero or more batches of work, and an optional fence to signal upon completion. Each batch consists of three distinct parts:
-
Zero or more semaphores to wait on before execution of the rest of the batch.
-
If present, these describe a semaphore wait operation.
-
-
Zero or more work items to execute.
-
If present, these describe a queue operation matching the work described.
-
-
Zero or more semaphores to signal upon completion of the work items.
-
If present, these describe a semaphore signal operation.
-
If a fence is present in a queue submission, it describes a fence signal operation.
All work described by a queue submission command must be submitted to the queue before the command returns.
Sparse Memory Binding
In Vulkan it is possible to sparsely bind memory to buffers and images as
described in the Sparse Resource chapter.
Sparse memory binding is a queue operation.
A queue whose flags include the VK_QUEUE_SPARSE_BINDING_BIT
must be
able to support the mapping of a virtual address to a physical address on
the device.
This causes an update to the page table mappings on the device.
This update must be synchronized on a queue to avoid corrupting page table
mappings during execution of graphics commands.
By binding the sparse memory resources on queues, all commands that are
dependent on the updated bindings are synchronized to only execute after the
binding is updated.
See the Synchronization and Cache Control chapter for
how this synchronization is accomplished.
4.3.6. Queue Destruction
Queues are created along with a logical device during vkCreateDevice
.
All queues associated with a logical device are destroyed when
vkDestroyDevice
is called on that device.
5. Command Buffers
Command buffers are objects used to record commands which can be subsequently submitted to a device queue for execution. There are two levels of command buffers - primary command buffers, which can execute secondary command buffers, and which are submitted to queues, and secondary command buffers, which can be executed by primary command buffers, and which are not directly submitted to queues.
Command buffers are represented by VkCommandBuffer
handles:
VK_DEFINE_HANDLE(VkCommandBuffer)
Recorded commands include commands to bind pipelines and descriptor sets to the command buffer, commands to modify dynamic state, commands to draw (for graphics rendering), commands to dispatch (for compute), commands to execute secondary command buffers (for primary command buffers only), commands to copy buffers and images, and other commands.
Each command buffer manages state independently of other command buffers. There is no inheritance of state across primary and secondary command buffers, or between secondary command buffers. When a command buffer begins recording, all state in that command buffer is undefined. When secondary command buffer(s) are recorded to execute on a primary command buffer, the secondary command buffer inherits no state from the primary command buffer, and all state of the primary command buffer is undefined after an execute secondary command buffer command is recorded. There is one exception to this rule - if the primary command buffer is inside a render pass instance, then the render pass and subpass state is not disturbed by executing secondary command buffers. Whenever the state of a command buffer is undefined, the application must set all relevant state on the command buffer before any state dependent commands such as draws and dispatches are recorded, otherwise the behavior of executing that command buffer is undefined.
Unless otherwise specified, and without explicit synchronization, the various commands submitted to a queue via command buffers may execute in arbitrary order relative to each other, and/or concurrently. Also, the memory side-effects of those commands may not be directly visible to other commands without explicit memory dependencies. This is true within a command buffer, and across command buffers submitted to a given queue. See the synchronization chapter for information on implicit and explicit synchronization between commands.
5.1. Command Buffer Lifecycle
Each command buffer is always in one of the following states:
- Initial
-
When a command buffer is allocated, it is in the initial state. Some commands are able to reset a command buffer, or a set of command buffers, back to this state from any of the executable, recording or invalid state. Command buffers in the initial state can only be moved to the recording state, or freed.
- Recording
-
vkBeginCommandBuffer changes the state of a command buffer from the initial state to the recording state. Once a command buffer is in the recording state,
vkCmd*
commands can be used to record to the command buffer. - Executable
-
vkEndCommandBuffer ends the recording of a command buffer, and moves it from the recording state to the executable state. Executable command buffers can be submitted, reset, or recorded to another command buffer.
- Pending
-
Queue submission of a command buffer changes the state of a command buffer from the executable state to the pending state. Whilst in the pending state, applications must not attempt to modify the command buffer in any way - as the device may be processing the commands recorded to it. Once execution of a command buffer completes, the command buffer reverts back to either the executable state, or the invalid state if it was recorded with
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT
. A synchronization command should be used to detect when this occurs. - Invalid
-
Some operations, such as modifying or deleting a resource that was used in a command recorded to a command buffer, will transition the state of that command buffer into the invalid state. Command buffers in the invalid state can only be reset or freed.
Any given command that operates on a command buffer has its own requirements on what state a command buffer must be in, which are detailed in the valid usage constraints for that command.
Resetting a command buffer is an operation that discards any previously recorded commands and puts a command buffer in the initial state. Resetting occurs as a result of vkResetCommandBuffer or vkResetCommandPool, or as part of vkBeginCommandBuffer (which additionally puts the command buffer in the recording state).
Secondary command buffers can be recorded to a primary command buffer via vkCmdExecuteCommands. This partially ties the lifecycle of the two command buffers together - if the primary is submitted to a queue, both the primary and any secondaries recorded to it move to the pending state. Once execution of the primary completes, so does any secondary recorded within it, and once all executions of each command buffer complete, they move to the executable state. If a secondary moves to any other state whilst it is recorded to another command buffer, the primary moves to the invalid state. A primary moving to any other state does not affect the state of the secondary. Resetting or freeing a primary command buffer removes the linkage to any secondary command buffers that were recorded to it.
5.2. Command Pools
Command pools are opaque objects that command buffer memory is allocated from, and which allow the implementation to amortize the cost of resource creation across multiple command buffers. Command pools are externally synchronized, meaning that a command pool must not be used concurrently in multiple threads. That includes use via recording commands on any command buffers allocated from the pool, as well as operations that allocate, free, and reset command buffers or the pool itself.
Command pools are represented by VkCommandPool
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkCommandPool)
To create a command pool, call:
VkResult vkCreateCommandPool(
VkDevice device,
const VkCommandPoolCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkCommandPool* pCommandPool);
-
device
is the logical device that creates the command pool. -
pCreateInfo
is a pointer to an instance of the VkCommandPoolCreateInfo structure specifying the state of the command pool object. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pCommandPool
points to a VkCommandPool handle in which the created pool is returned.
The VkCommandPoolCreateInfo
structure is defined as:
typedef struct VkCommandPoolCreateInfo {
VkStructureType sType;
const void* pNext;
VkCommandPoolCreateFlags flags;
uint32_t queueFamilyIndex;
} VkCommandPoolCreateInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is a bitmask of VkCommandPoolCreateFlagBits indicating usage behavior for the pool and command buffers allocated from it. -
queueFamilyIndex
designates a queue family as described in section Queue Family Properties. All command buffers allocated from this command pool must be submitted on queues from the same queue family.
Bits which can be set in VkCommandPoolCreateInfo::flags
to
specify usage behavior for a command pool are:
typedef enum VkCommandPoolCreateFlagBits {
VK_COMMAND_POOL_CREATE_TRANSIENT_BIT = 0x00000001,
VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT = 0x00000002,
VK_COMMAND_POOL_CREATE_PROTECTED_BIT = 0x00000004,
} VkCommandPoolCreateFlagBits;
-
VK_COMMAND_POOL_CREATE_TRANSIENT_BIT
specifies that command buffers allocated from the pool will be short-lived, meaning that they will be reset or freed in a relatively short timeframe. This flag may be used by the implementation to control memory allocation behavior within the pool. -
VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT
allows any command buffer allocated from a pool to be individually reset to the initial state; either by calling vkResetCommandBuffer, or via the implicit reset when calling vkBeginCommandBuffer. If this flag is not set on a pool, then vkResetCommandBuffer must not be called for any command buffer allocated from that pool. -
VK_COMMAND_POOL_CREATE_PROTECTED_BIT
specifies that command buffers allocated from the pool are protected command buffers. If the protected memory feature is not enabled, the VK_COMMAND_POOL_CREATE_PROTECTED_BIT bit of flags must not be set.
typedef VkFlags VkCommandPoolCreateFlags;
VkCommandPoolCreateFlags
is a bitmask type for setting a mask of zero
or more VkCommandPoolCreateFlagBits.
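As an informal example (not part of the normative text), a command pool for a previously selected queue family, allowing individual command buffer resets, might be created as follows; graphicsQueueFamilyIndex is an assumed value:
VkCommandPoolCreateInfo poolCreateInfo;
poolCreateInfo.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
poolCreateInfo.pNext = NULL;
poolCreateInfo.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT;
poolCreateInfo.queueFamilyIndex = graphicsQueueFamilyIndex;

VkCommandPool commandPool;
VkResult result = vkCreateCommandPool(device, &poolCreateInfo, NULL, &commandPool);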
To trim a command pool, call:
void vkTrimCommandPool(
VkDevice device,
VkCommandPool commandPool,
VkCommandPoolTrimFlags flags);
or the equivalent command
void vkTrimCommandPoolKHR(
VkDevice device,
VkCommandPool commandPool,
VkCommandPoolTrimFlags flags);
-
device
is the logical device that owns the command pool. -
commandPool
is the command pool to trim. -
flags
is reserved for future use.
Trimming a command pool recycles unused memory from the command pool back to the system. Command buffers allocated from the pool are not affected by the command.
Note
This command provides applications with some control over the internal memory allocations used by command pools.
Unused memory normally arises from command buffers that have been recorded and later reset, such that they are no longer using the memory. On reset, a command buffer can return memory to its command pool, but the only way to release memory from a command pool to the system requires calling vkResetCommandPool, which cannot be executed while any command buffers from that pool are still in use. Subsequent recording operations into command buffers will re-use this memory but since total memory requirements fluctuate over time, unused memory can accumulate.
In this situation, trimming a command pool may be useful to return unused memory back to the system, returning the total outstanding memory allocated by the pool back to a more “average” value.
Implementations utilize many internal allocation strategies that make it impossible to guarantee that all unused memory is released back to the system. For instance, an implementation of a command pool may involve allocating memory in bulk from the system and sub-allocating from that memory. In such an implementation any live command buffer that holds a reference to a bulk allocation would prevent that allocation from being freed, even if only a small proportion of the bulk allocation is in use.
In most cases trimming will result in a reduction in allocated but unused memory, but it does not guarantee the “ideal” behavior.
Trimming may be an expensive operation, and should not be called frequently. Trimming should be treated as a way to relieve memory pressure after application-known points when there exists enough unused memory that the cost of trimming is “worth” it.
typedef VkFlags VkCommandPoolTrimFlags;
or the equivalent
typedef VkCommandPoolTrimFlags VkCommandPoolTrimFlagsKHR;
VkCommandPoolTrimFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
To reset a command pool, call:
VkResult vkResetCommandPool(
VkDevice device,
VkCommandPool commandPool,
VkCommandPoolResetFlags flags);
-
device
is the logical device that owns the command pool. -
commandPool
is the command pool to reset. -
flags
is a bitmask of VkCommandPoolResetFlagBits controlling the reset operation.
Resetting a command pool recycles all of the resources from all of the command buffers allocated from the command pool back to the command pool. All command buffers that have been allocated from the command pool are put in the initial state.
Any primary command buffer allocated from another VkCommandPool that
is in the recording or executable state and
has a secondary command buffer allocated from commandPool
recorded
into it, becomes invalid.
Bits which can be set in vkResetCommandPool::flags
to control
the reset operation are:
typedef enum VkCommandPoolResetFlagBits {
VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT = 0x00000001,
} VkCommandPoolResetFlagBits;
-
VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT
specifies that resetting a command pool recycles all of the resources from the command pool back to the system.
typedef VkFlags VkCommandPoolResetFlags;
VkCommandPoolResetFlags
is a bitmask type for setting a mask of zero
or more VkCommandPoolResetFlagBits.
To destroy a command pool, call:
void vkDestroyCommandPool(
VkDevice device,
VkCommandPool commandPool,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the command pool. -
commandPool
is the handle of the command pool to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
When a pool is destroyed, all command buffers allocated from the pool are freed.
Any primary command buffer allocated from another VkCommandPool that
is in the recording or executable state and
has a secondary command buffer allocated from commandPool
recorded
into it, becomes invalid.
5.3. Command Buffer Allocation and Management
To allocate command buffers, call:
VkResult vkAllocateCommandBuffers(
VkDevice device,
const VkCommandBufferAllocateInfo* pAllocateInfo,
VkCommandBuffer* pCommandBuffers);
- device is the logical device that owns the command pool.
- pAllocateInfo is a pointer to an instance of the VkCommandBufferAllocateInfo structure describing parameters of the allocation.
- pCommandBuffers is a pointer to an array of VkCommandBuffer handles in which the resulting command buffer objects are returned. The array must be at least the length specified by the commandBufferCount member of pAllocateInfo. Each allocated command buffer begins in the initial state.
vkAllocateCommandBuffers
can be used to create multiple command
buffers.
If the creation of any of those command buffers fails, the implementation
must destroy all successfully created command buffer objects from this
command, set all entries of the pCommandBuffers
array to NULL
and
return the error.
When command buffers are first allocated, they are in the initial state.
The VkCommandBufferAllocateInfo
structure is defined as:
typedef struct VkCommandBufferAllocateInfo {
VkStructureType sType;
const void* pNext;
VkCommandPool commandPool;
VkCommandBufferLevel level;
uint32_t commandBufferCount;
} VkCommandBufferAllocateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- commandPool is the command pool from which the command buffers are allocated.
- level is a VkCommandBufferLevel value specifying the command buffer level.
- commandBufferCount is the number of command buffers to allocate from the pool.
Possible values of VkCommandBufferAllocateInfo::level, specifying the command buffer level, are:
typedef enum VkCommandBufferLevel {
VK_COMMAND_BUFFER_LEVEL_PRIMARY = 0,
VK_COMMAND_BUFFER_LEVEL_SECONDARY = 1,
} VkCommandBufferLevel;
- VK_COMMAND_BUFFER_LEVEL_PRIMARY specifies a primary command buffer.
- VK_COMMAND_BUFFER_LEVEL_SECONDARY specifies a secondary command buffer.
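For illustration, the following sketch allocates three primary command buffers from an existing pool; device and commandPool are assumed handles, and the array length matches commandBufferCount.
VkCommandBufferAllocateInfo allocInfo = {0};
allocInfo.sType              = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
allocInfo.pNext              = NULL;
allocInfo.commandPool        = commandPool;                      /* pool providing the memory */
allocInfo.level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY;  /* primary command buffers */
allocInfo.commandBufferCount = 3;

VkCommandBuffer commandBuffers[3];
VkResult result = vkAllocateCommandBuffers(device, &allocInfo, commandBuffers);
if (result != VK_SUCCESS) {
    /* on failure, every entry of commandBuffers has been set to NULL */
}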
To reset command buffers, call:
VkResult vkResetCommandBuffer(
VkCommandBuffer commandBuffer,
VkCommandBufferResetFlags flags);
- commandBuffer is the command buffer to reset. The command buffer can be in any state other than pending, and is moved into the initial state.
- flags is a bitmask of VkCommandBufferResetFlagBits controlling the reset operation.
Any primary command buffer that is in the recording or executable state and has commandBuffer
recorded into
it, becomes invalid.
Bits which can be set in vkResetCommandBuffer::flags
to control
the reset operation are:
typedef enum VkCommandBufferResetFlagBits {
VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT = 0x00000001,
} VkCommandBufferResetFlagBits;
- VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT specifies that most or all memory resources currently owned by the command buffer should be returned to the parent command pool. If this flag is not set, then the command buffer may hold onto memory resources and reuse them when recording commands. commandBuffer is moved to the initial state.
typedef VkFlags VkCommandBufferResetFlags;
VkCommandBufferResetFlags
is a bitmask type for setting a mask of zero
or more VkCommandBufferResetFlagBits.
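A brief, informal example of resetting a single command buffer and returning its memory to the parent pool; commandBuffer is assumed to have been allocated from a pool created with VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT and must not be in the pending state.
VkResult result = vkResetCommandBuffer(
    commandBuffer,
    VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT);  /* return memory resources to the parent pool */
if (result != VK_SUCCESS) {
    /* handle error */
}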
To free command buffers, call:
void vkFreeCommandBuffers(
VkDevice device,
VkCommandPool commandPool,
uint32_t commandBufferCount,
const VkCommandBuffer* pCommandBuffers);
- device is the logical device that owns the command pool.
- commandPool is the command pool from which the command buffers were allocated.
- commandBufferCount is the length of the pCommandBuffers array.
- pCommandBuffers is an array of handles of command buffers to free.
Any primary command buffer that is in the recording or executable state and has any element of pCommandBuffers
recorded into it, becomes invalid.
5.4. Command Buffer Recording
To begin recording a command buffer, call:
VkResult vkBeginCommandBuffer(
VkCommandBuffer commandBuffer,
const VkCommandBufferBeginInfo* pBeginInfo);
- commandBuffer is the handle of the command buffer which is to be put in the recording state.
- pBeginInfo is an instance of the VkCommandBufferBeginInfo structure, which defines additional information about how the command buffer begins recording.
The VkCommandBufferBeginInfo
structure is defined as:
typedef struct VkCommandBufferBeginInfo {
VkStructureType sType;
const void* pNext;
VkCommandBufferUsageFlags flags;
const VkCommandBufferInheritanceInfo* pInheritanceInfo;
} VkCommandBufferBeginInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkCommandBufferUsageFlagBits specifying usage behavior for the command buffer.
- pInheritanceInfo is a pointer to a VkCommandBufferInheritanceInfo structure, which is used if commandBuffer is a secondary command buffer. If this is a primary command buffer, then this value is ignored.
Bits which can be set in VkCommandBufferBeginInfo::flags
to
specify usage behavior for a command buffer are:
typedef enum VkCommandBufferUsageFlagBits {
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT = 0x00000001,
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT = 0x00000002,
VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT = 0x00000004,
} VkCommandBufferUsageFlagBits;
- VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT specifies that each recording of the command buffer will only be submitted once, and the command buffer will be reset and recorded again between each submission.
- VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT specifies that a secondary command buffer is considered to be entirely inside a render pass. If this is a primary command buffer, then this bit is ignored.
- VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT specifies that a command buffer can be resubmitted to a queue while it is in the pending state, and recorded into multiple primary command buffers.
typedef VkFlags VkCommandBufferUsageFlags;
VkCommandBufferUsageFlags
is a bitmask type for setting a mask of zero
or more VkCommandBufferUsageFlagBits.
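As an informal sketch, beginning a primary command buffer that will be submitted once and then reset; commandBuffer is an assumed handle in the initial state.
VkCommandBufferBeginInfo beginInfo = {0};
beginInfo.sType            = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
beginInfo.pNext            = NULL;
beginInfo.flags            = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
beginInfo.pInheritanceInfo = NULL;  /* ignored for primary command buffers */

VkResult result = vkBeginCommandBuffer(commandBuffer, &beginInfo);
if (result != VK_SUCCESS) {
    /* the command buffer was not moved to the recording state */
}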
If the command buffer is a secondary command buffer, then the
VkCommandBufferInheritanceInfo
structure defines any state that will
be inherited from the primary command buffer:
typedef struct VkCommandBufferInheritanceInfo {
VkStructureType sType;
const void* pNext;
VkRenderPass renderPass;
uint32_t subpass;
VkFramebuffer framebuffer;
VkBool32 occlusionQueryEnable;
VkQueryControlFlags queryFlags;
VkQueryPipelineStatisticFlags pipelineStatistics;
} VkCommandBufferInheritanceInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- renderPass is a VkRenderPass object defining which render passes the VkCommandBuffer will be compatible with and can be executed within. If the VkCommandBuffer will not be executed within a render pass instance, renderPass is ignored.
- subpass is the index of the subpass within the render pass instance that the VkCommandBuffer will be executed within. If the VkCommandBuffer will not be executed within a render pass instance, subpass is ignored.
- framebuffer optionally refers to the VkFramebuffer object that the VkCommandBuffer will be rendering to if it is executed within a render pass instance. It can be VK_NULL_HANDLE if the framebuffer is not known, or if the VkCommandBuffer will not be executed within a render pass instance.
Note
Specifying the exact framebuffer that the secondary command buffer will be executed with may result in better performance at command buffer execution time.
- occlusionQueryEnable specifies whether the command buffer can be executed while an occlusion query is active in the primary command buffer. If this is VK_TRUE, then this command buffer can be executed whether the primary command buffer has an occlusion query active or not. If this is VK_FALSE, then the primary command buffer must not have an occlusion query active.
- queryFlags specifies the query flags that can be used by an active occlusion query in the primary command buffer when this secondary command buffer is executed. If this value includes the VK_QUERY_CONTROL_PRECISE_BIT bit, then the active query can return boolean results or actual sample counts. If this bit is not set, then the active query must not use the VK_QUERY_CONTROL_PRECISE_BIT bit.
- pipelineStatistics is a bitmask of VkQueryPipelineStatisticFlagBits specifying the set of pipeline statistics that can be counted by an active query in the primary command buffer when this secondary command buffer is executed. If this value includes a given bit, then this command buffer can be executed whether the primary command buffer has a pipeline statistics query active that includes this bit or not. If this value excludes a given bit, then the active pipeline statistics query must not be from a query pool that counts that statistic.
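The following non-normative sketch fills VkCommandBufferInheritanceInfo for a secondary command buffer that will execute entirely inside a render pass; renderPass, framebuffer, and secondaryCmdBuffer are assumed handles created elsewhere.
VkCommandBufferInheritanceInfo inheritance = {0};
inheritance.sType                = VK_STRUCTURE_TYPE_COMMAND_BUFFER_INHERITANCE_INFO;
inheritance.renderPass           = renderPass;   /* render pass the secondary command buffer is compatible with */
inheritance.subpass              = 0;            /* subpass it will execute within */
inheritance.framebuffer          = framebuffer;  /* may be VK_NULL_HANDLE if not known */
inheritance.occlusionQueryEnable = VK_FALSE;
inheritance.queryFlags           = 0;
inheritance.pipelineStatistics   = 0;

VkCommandBufferBeginInfo beginInfo = {0};
beginInfo.sType            = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
beginInfo.flags            = VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT;
beginInfo.pInheritanceInfo = &inheritance;

vkBeginCommandBuffer(secondaryCmdBuffer, &beginInfo);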
If VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT
was not set when
creating a command buffer, that command buffer must not be submitted to a
queue whilst it is already in the pending
state.
If VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT
is not set on a
secondary command buffer, that command buffer must not be used more than
once in a given primary command buffer.
Note
On some implementations, not using the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT bit enables command buffers to be patched in-place if needed, rather than creating a copy of the command buffer.
If a command buffer is in the invalid, or
executable state, and the command buffer was allocated from a command pool
with the VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT
flag set,
then vkBeginCommandBuffer
implicitly resets the command buffer,
behaving as if vkResetCommandBuffer
had been called with
VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT
not set.
After the implicit reset, commandBuffer
is moved to the
recording state.
If the pNext
chain of VkCommandBufferInheritanceInfo includes a
VkCommandBufferInheritanceConditionalRenderingInfoEXT
structure, then
that structure controls whether a command buffer can be executed while
conditional rendering is active in the
primary command buffer.
The VkCommandBufferInheritanceConditionalRenderingInfoEXT
structure is
defined as:
typedef struct VkCommandBufferInheritanceConditionalRenderingInfoEXT {
VkStructureType sType;
const void* pNext;
VkBool32 conditionalRenderingEnable;
} VkCommandBufferInheritanceConditionalRenderingInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- conditionalRenderingEnable specifies whether the command buffer can be executed while conditional rendering is active in the primary command buffer. If this is VK_TRUE, then this command buffer can be executed whether the primary command buffer has active conditional rendering or not. If this is VK_FALSE, then the primary command buffer must not have conditional rendering active.
If this structure is not present, the behavior is as if
conditionalRenderingEnable
is VK_FALSE
.
Once recording starts, an application records a sequence of commands
(vkCmd*
) to set state in the command buffer, draw, dispatch, and other
commands.
Several commands can also be recorded indirectly from VkBuffer
content, see Device-Generated Commands.
To complete recording of a command buffer, call:
VkResult vkEndCommandBuffer(
VkCommandBuffer commandBuffer);
-
commandBuffer
is the command buffer to complete recording.
If there was an error during recording, the application will be notified by
an unsuccessful return code returned by vkEndCommandBuffer
.
If the application wishes to further use the command buffer, the command
buffer must be reset.
The command buffer must have been in the recording state, and is moved to the executable state.
When a command buffer is in the executable state, it can be submitted to a queue for execution.
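As a short, informal example, recording a single copy command and completing the command buffer; commandBuffer is assumed to be in the recording state, and srcBuffer and dstBuffer are assumed VkBuffer handles with at least 256 bytes of addressable range.
VkBufferCopy region = {0};
region.srcOffset = 0;
region.dstOffset = 0;
region.size      = 256;

vkCmdCopyBuffer(commandBuffer, srcBuffer, dstBuffer, 1, &region);  /* a vkCmd* recording command */

VkResult result = vkEndCommandBuffer(commandBuffer);  /* moves the command buffer to the executable state */
if (result != VK_SUCCESS) {
    /* recording failed; the command buffer must be reset before further use */
}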
5.5. Command Buffer Submission
To submit command buffers to a queue, call:
VkResult vkQueueSubmit(
VkQueue queue,
uint32_t submitCount,
const VkSubmitInfo* pSubmits,
VkFence fence);
- queue is the queue that the command buffers will be submitted to.
- submitCount is the number of elements in the pSubmits array.
- pSubmits is a pointer to an array of VkSubmitInfo structures, each specifying a command buffer submission batch.
- fence is an optional handle to a fence to be signaled once all submitted command buffers have completed execution. If fence is not VK_NULL_HANDLE, it defines a fence signal operation.
Note
Submission can be a high overhead operation, and applications should attempt to batch work together into as few calls to vkQueueSubmit as possible.
vkQueueSubmit
is a queue submission
command, with each batch defined by an element of pSubmits
as an
instance of the VkSubmitInfo structure.
Batches begin execution in the order they appear in pSubmits
, but may
complete out of order.
Fence and semaphore operations submitted with vkQueueSubmit have additional ordering constraints compared to other submission commands, with dependencies involving previous and subsequent queue operations. Information about these additional constraints can be found in the semaphore and fence sections of the synchronization chapter.
Details on the interaction of pWaitDstStageMask
with synchronization
are described in the semaphore wait
operation section of the synchronization chapter.
The order that batches appear in pSubmits
is used to determine
submission order, and thus all the
implicit ordering guarantees that respect it.
Other than these implicit ordering guarantees and any explicit synchronization primitives, these batches may overlap or
otherwise execute out of order.
If any command buffer submitted to this queue is in the
executable state, it is moved to the
pending state.
Once execution of all submissions of a command buffer complete, it moves
from the pending state, back to the
executable state.
If a command buffer was recorded with the
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT
flag, it instead moves
back to the invalid state.
If vkQueueSubmit
fails, it may return
VK_ERROR_OUT_OF_HOST_MEMORY
or VK_ERROR_OUT_OF_DEVICE_MEMORY
.
If it does, the implementation must ensure that the state and contents of
any resources or synchronization primitives referenced by the submitted
command buffers and any semaphores referenced by pSubmits
is
unaffected by the call or its failure.
If vkQueueSubmit
fails in such a way that the implementation is unable
to make that guarantee, the implementation must return
VK_ERROR_DEVICE_LOST
.
See Lost Device.
The VkSubmitInfo
structure is defined as:
typedef struct VkSubmitInfo {
VkStructureType sType;
const void* pNext;
uint32_t waitSemaphoreCount;
const VkSemaphore* pWaitSemaphores;
const VkPipelineStageFlags* pWaitDstStageMask;
uint32_t commandBufferCount;
const VkCommandBuffer* pCommandBuffers;
uint32_t signalSemaphoreCount;
const VkSemaphore* pSignalSemaphores;
} VkSubmitInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- waitSemaphoreCount is the number of semaphores upon which to wait before executing the command buffers for the batch.
- pWaitSemaphores is a pointer to an array of semaphores upon which to wait before the command buffers for this batch begin execution. If semaphores to wait on are provided, they define a semaphore wait operation.
- pWaitDstStageMask is a pointer to an array of pipeline stages at which each corresponding semaphore wait will occur.
- commandBufferCount is the number of command buffers to execute in the batch.
- pCommandBuffers is a pointer to an array of command buffers to execute in the batch.
- signalSemaphoreCount is the number of semaphores to be signaled once the commands specified in pCommandBuffers have completed execution.
- pSignalSemaphores is a pointer to an array of semaphores which will be signaled when the command buffers for this batch have completed execution. If semaphores to be signaled are provided, they define a semaphore signal operation.
The order that command buffers appear in pCommandBuffers
is used to
determine submission order, and thus
all the implicit ordering guarantees that
respect it.
Other than these implicit ordering guarantees and any explicit synchronization primitives, these command buffers may overlap or
otherwise execute out of order.
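An informal sketch of a single-batch submission that waits on one semaphore, executes one command buffer, signals another semaphore, and signals a fence; queue, commandBuffer, imageAvailableSemaphore, renderFinishedSemaphore, and fence are assumed handles created elsewhere.
VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;  /* stage at which the wait occurs */

VkSubmitInfo submitInfo = {0};
submitInfo.sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.waitSemaphoreCount   = 1;
submitInfo.pWaitSemaphores      = &imageAvailableSemaphore;   /* semaphore wait operation */
submitInfo.pWaitDstStageMask    = &waitStage;
submitInfo.commandBufferCount   = 1;
submitInfo.pCommandBuffers      = &commandBuffer;
submitInfo.signalSemaphoreCount = 1;
submitInfo.pSignalSemaphores    = &renderFinishedSemaphore;   /* semaphore signal operation */

VkResult result = vkQueueSubmit(queue, 1, &submitInfo, fence); /* fence signaled when the batch completes */
if (result != VK_SUCCESS) {
    /* handle out-of-memory errors or device loss */
}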
To specify the values to use when waiting for and signaling semaphores whose
current payload refers to a
Direct3D 12 fence, add the VkD3D12FenceSubmitInfoKHR structure to the
pNext
chain of the VkSubmitInfo structure.
The VkD3D12FenceSubmitInfoKHR
structure is defined as:
typedef struct VkD3D12FenceSubmitInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t waitSemaphoreValuesCount;
const uint64_t* pWaitSemaphoreValues;
uint32_t signalSemaphoreValuesCount;
const uint64_t* pSignalSemaphoreValues;
} VkD3D12FenceSubmitInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- waitSemaphoreValuesCount is the number of semaphore wait values specified in pWaitSemaphoreValues.
- pWaitSemaphoreValues is an array of length waitSemaphoreValuesCount containing values for the corresponding semaphores in VkSubmitInfo::pWaitSemaphores to wait for.
- signalSemaphoreValuesCount is the number of semaphore signal values specified in pSignalSemaphoreValues.
- pSignalSemaphoreValues is an array of length signalSemaphoreValuesCount containing values for the corresponding semaphores in VkSubmitInfo::pSignalSemaphores to set when signaled.
If the semaphore in VkSubmitInfo::pWaitSemaphores
or
VkSubmitInfo::pSignalSemaphores
corresponding to an entry in
pWaitSemaphoreValues
or pSignalSemaphoreValues
respectively does
not currently have a payload
referring to a Direct3D 12 fence, the implementation must ignore the value
in the pWaitSemaphoreValues
or pSignalSemaphoreValues
entry.
When submitting work that operates on memory imported from a Direct3D 11
resource to a queue, the keyed mutex mechanism may be used in addition to
Vulkan semaphores to synchronize the work.
Keyed mutexes are a property of a properly created shareable Direct3D 11
resource.
They can only be used if the imported resource was created with the
D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX
flag.
To acquire keyed mutexes before submitted work and/or release them after,
add a VkWin32KeyedMutexAcquireReleaseInfoKHR structure to the
pNext
chain of the VkSubmitInfo structure.
The VkWin32KeyedMutexAcquireReleaseInfoKHR
structure is defined as:
typedef struct VkWin32KeyedMutexAcquireReleaseInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t acquireCount;
const VkDeviceMemory* pAcquireSyncs;
const uint64_t* pAcquireKeys;
const uint32_t* pAcquireTimeouts;
uint32_t releaseCount;
const VkDeviceMemory* pReleaseSyncs;
const uint64_t* pReleaseKeys;
} VkWin32KeyedMutexAcquireReleaseInfoKHR;
- acquireCount is the number of entries in the pAcquireSyncs, pAcquireKeys, and pAcquireTimeouts arrays.
- pAcquireSyncs is a pointer to an array of VkDeviceMemory objects which were imported from Direct3D 11 resources.
- pAcquireKeys is a pointer to an array of mutex key values to wait for prior to beginning the submitted work. Entries refer to the keyed mutex associated with the corresponding entries in pAcquireSyncs.
- pAcquireTimeouts is an array of timeout values, in millisecond units, for each acquire specified in pAcquireKeys.
- releaseCount is the number of entries in the pReleaseSyncs and pReleaseKeys arrays.
- pReleaseSyncs is a pointer to an array of VkDeviceMemory objects which were imported from Direct3D 11 resources.
- pReleaseKeys is a pointer to an array of mutex key values to set when the submitted work has completed. Entries refer to the keyed mutex associated with the corresponding entries in pReleaseSyncs.
When submitting work that operates on memory imported from a Direct3D 11
resource to a queue, the keyed mutex mechanism may be used in addition to
Vulkan semaphores to synchronize the work.
Keyed mutexes are a property of a properly created shareable Direct3D 11
resource.
They can only be used if the imported resource was created with the
D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX
flag.
To acquire keyed mutexes before submitted work and/or release them after,
add a VkWin32KeyedMutexAcquireReleaseInfoNV structure to the
pNext
chain of the VkSubmitInfo structure.
The VkWin32KeyedMutexAcquireReleaseInfoNV
structure is defined as:
typedef struct VkWin32KeyedMutexAcquireReleaseInfoNV {
VkStructureType sType;
const void* pNext;
uint32_t acquireCount;
const VkDeviceMemory* pAcquireSyncs;
const uint64_t* pAcquireKeys;
const uint32_t* pAcquireTimeoutMilliseconds;
uint32_t releaseCount;
const VkDeviceMemory* pReleaseSyncs;
const uint64_t* pReleaseKeys;
} VkWin32KeyedMutexAcquireReleaseInfoNV;
- acquireCount is the number of entries in the pAcquireSyncs, pAcquireKeys, and pAcquireTimeoutMilliseconds arrays.
- pAcquireSyncs is a pointer to an array of VkDeviceMemory objects which were imported from Direct3D 11 resources.
- pAcquireKeys is a pointer to an array of mutex key values to wait for prior to beginning the submitted work. Entries refer to the keyed mutex associated with the corresponding entries in pAcquireSyncs.
- pAcquireTimeoutMilliseconds is an array of timeout values, in millisecond units, for each acquire specified in pAcquireKeys.
- releaseCount is the number of entries in the pReleaseSyncs and pReleaseKeys arrays.
- pReleaseSyncs is a pointer to an array of VkDeviceMemory objects which were imported from Direct3D 11 resources.
- pReleaseKeys is a pointer to an array of mutex key values to set when the submitted work has completed. Entries refer to the keyed mutex associated with the corresponding entries in pReleaseSyncs.
If the pNext
chain of VkSubmitInfo includes a
VkProtectedSubmitInfo
structure, then the structure indicates whether
the batch is protected.
The VkProtectedSubmitInfo
structure is defined as:
typedef struct VkProtectedSubmitInfo {
VkStructureType sType;
const void* pNext;
VkBool32 protectedSubmit;
} VkProtectedSubmitInfo;
- protectedSubmit specifies whether the batch is protected. If protectedSubmit is VK_TRUE, the batch is protected. If protectedSubmit is VK_FALSE, the batch is unprotected. If the VkSubmitInfo::pNext chain does not contain this structure, the batch is unprotected.
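A minimal, non-normative sketch of marking a batch as protected by chaining VkProtectedSubmitInfo into VkSubmitInfo::pNext; protectedQueue and protectedCommandBuffer are assumed handles with protected-memory support.
VkProtectedSubmitInfo protectedInfo = {0};
protectedInfo.sType           = VK_STRUCTURE_TYPE_PROTECTED_SUBMIT_INFO;
protectedInfo.protectedSubmit = VK_TRUE;               /* this batch is protected */

VkSubmitInfo submitInfo = {0};
submitInfo.sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.pNext              = &protectedInfo;        /* extend the batch via the pNext chain */
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers    = &protectedCommandBuffer;

vkQueueSubmit(protectedQueue, 1, &submitInfo, VK_NULL_HANDLE);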
If the pNext
chain of VkSubmitInfo includes a
VkDeviceGroupSubmitInfo
structure, then that structure includes device
indices and masks specifying which physical devices execute semaphore
operations and command buffers.
The VkDeviceGroupSubmitInfo
structure is defined as:
typedef struct VkDeviceGroupSubmitInfo {
VkStructureType sType;
const void* pNext;
uint32_t waitSemaphoreCount;
const uint32_t* pWaitSemaphoreDeviceIndices;
uint32_t commandBufferCount;
const uint32_t* pCommandBufferDeviceMasks;
uint32_t signalSemaphoreCount;
const uint32_t* pSignalSemaphoreDeviceIndices;
} VkDeviceGroupSubmitInfo;
or the equivalent
typedef VkDeviceGroupSubmitInfo VkDeviceGroupSubmitInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- waitSemaphoreCount is the number of elements in the pWaitSemaphoreDeviceIndices array.
- pWaitSemaphoreDeviceIndices is an array of device indices indicating which physical device executes the semaphore wait operation in the corresponding element of VkSubmitInfo::pWaitSemaphores.
- commandBufferCount is the number of elements in the pCommandBufferDeviceMasks array.
- pCommandBufferDeviceMasks is an array of device masks indicating which physical devices execute the command buffer in the corresponding element of VkSubmitInfo::pCommandBuffers. A physical device executes the command buffer if the corresponding bit is set in the mask.
- signalSemaphoreCount is the number of elements in the pSignalSemaphoreDeviceIndices array.
- pSignalSemaphoreDeviceIndices is an array of device indices indicating which physical device executes the semaphore signal operation in the corresponding element of VkSubmitInfo::pSignalSemaphores.
If this structure is not present, semaphore operations and command buffers execute on device index zero.
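An informal sketch of submitting one command buffer to physical devices 0 and 1 of a device group by chaining VkDeviceGroupSubmitInfo; queue and commandBuffer are assumed handles created from a logical device that spans at least two physical devices.
uint32_t deviceMask = 0x3;  /* bits 0 and 1: execute on physical devices 0 and 1 */

VkDeviceGroupSubmitInfo deviceGroupInfo = {0};
deviceGroupInfo.sType                     = VK_STRUCTURE_TYPE_DEVICE_GROUP_SUBMIT_INFO;
deviceGroupInfo.commandBufferCount        = 1;
deviceGroupInfo.pCommandBufferDeviceMasks = &deviceMask;

VkSubmitInfo submitInfo = {0};
submitInfo.sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.pNext              = &deviceGroupInfo;
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers    = &commandBuffer;

vkQueueSubmit(queue, 1, &submitInfo, VK_NULL_HANDLE);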
5.6. Queue Forward Progress
The application must ensure that command buffer submissions will be able to
complete without any subsequent operations by the application on any queue.
After any call to vkQueueSubmit
, for every queued wait on a semaphore
there must be a prior signal of that semaphore that will not be consumed by
a different wait on the semaphore.
Command buffers in the submission can include vkCmdWaitEvents
commands that wait on events that will not be signaled by earlier commands
in the queue.
Such events must be signaled by the application using vkSetEvent, and
the vkCmdWaitEvents
commands that wait upon them must not be inside a
render pass instance.
Implementations may have limits on how long the command buffer will wait,
in order to avoid interfering with progress of other clients of the device.
If the event is not signaled within these limits, results are undefined and
may include device loss.
5.7. Secondary Command Buffer Execution
A secondary command buffer must not be directly submitted to a queue. Instead, secondary command buffers are recorded to execute as part of a primary command buffer with the command:
void vkCmdExecuteCommands(
VkCommandBuffer commandBuffer,
uint32_t commandBufferCount,
const VkCommandBuffer* pCommandBuffers);
- commandBuffer is a handle to a primary command buffer that the secondary command buffers are executed in.
- commandBufferCount is the length of the pCommandBuffers array.
- pCommandBuffers is an array of secondary command buffer handles, which are recorded to execute in the primary command buffer in the order they are listed in the array.
If any element of pCommandBuffers
was not recorded with the
VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT
flag, and it was recorded
into any other primary command buffer which is currently in the
executable or recording state, that primary
command buffer becomes invalid.
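As an informal example, executing two pre-recorded secondary command buffers from a primary command buffer; primaryCmdBuffer, secondaryCmdBuffer0, and secondaryCmdBuffer1 are assumed handles, and the current render pass instance is assumed to have been begun with VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS.
VkCommandBuffer secondaries[2] = { secondaryCmdBuffer0, secondaryCmdBuffer1 };
vkCmdExecuteCommands(primaryCmdBuffer, 2, secondaries);  /* secondaries execute in array order */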
5.8. Command Buffer Device Mask
Each command buffer has a piece of state storing the current device mask of the command buffer. This mask controls which physical devices within the logical device all subsequent commands will execute on, including state-setting commands, action commands, and synchronization commands.
Scissor, exclusive scissor, and viewport state can be set to different values on each physical device (only when set as dynamic state), and each physical device will render using its local copy of the state. Other state is shared between physical devices, such that all physical devices use the most recently set values for the state. However, when recording an action command that uses a piece of state, the most recent command that set that state must have included all physical devices that execute the action command in its current device mask.
The command buffer’s device mask is orthogonal to the
pCommandBufferDeviceMasks
member of VkDeviceGroupSubmitInfo.
Commands only execute on a physical device if the device index is set in
both device masks.
If the pNext
chain of VkCommandBufferBeginInfo includes a
VkDeviceGroupCommandBufferBeginInfo
structure, then that structure
includes an initial device mask for the command buffer.
The VkDeviceGroupCommandBufferBeginInfo
structure is defined as:
typedef struct VkDeviceGroupCommandBufferBeginInfo {
VkStructureType sType;
const void* pNext;
uint32_t deviceMask;
} VkDeviceGroupCommandBufferBeginInfo;
or the equivalent
typedef VkDeviceGroupCommandBufferBeginInfo VkDeviceGroupCommandBufferBeginInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- deviceMask is the initial value of the command buffer’s device mask.
The initial device mask also acts as an upper bound on the set of devices that can ever be in the device mask in the command buffer.
If this structure is not present, the initial value of a command buffer’s device mask is set to include all physical devices in the logical device when the command buffer begins recording.
To update the current device mask of a command buffer, call:
void vkCmdSetDeviceMask(
VkCommandBuffer commandBuffer,
uint32_t deviceMask);
or the equivalent command
void vkCmdSetDeviceMaskKHR(
VkCommandBuffer commandBuffer,
uint32_t deviceMask);
- commandBuffer is the command buffer whose current device mask is modified.
- deviceMask is the new value of the current device mask.
deviceMask
is used to filter out subsequent commands from executing on
all physical devices whose bit indices are not set in the mask, except
commands beginning a render pass instance, commands transitioning to the
next subpass in the render pass instance, and commands ending a render pass
instance, which always execute on the set of physical devices whose bit
indices are included in the deviceMask
member of the instance of the
VkDeviceGroupRenderPassBeginInfoKHR structure passed to the command
beginning the corresponding render pass instance.
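A brief, non-normative sketch of restricting a command buffer to physical devices 0 and 1 at begin time and then narrowing subsequent commands to device 0; commandBuffer is an assumed handle allocated on a device group.
VkDeviceGroupCommandBufferBeginInfo deviceGroupBegin = {0};
deviceGroupBegin.sType      = VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO;
deviceGroupBegin.deviceMask = 0x3;           /* initial device mask: devices 0 and 1 */

VkCommandBufferBeginInfo beginInfo = {0};
beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
beginInfo.pNext = &deviceGroupBegin;

vkBeginCommandBuffer(commandBuffer, &beginInfo);
vkCmdSetDeviceMask(commandBuffer, 0x1);      /* subsequent commands execute on device 0 only */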
6. Synchronization and Cache Control
Synchronization of access to resources is primarily the responsibility of the application in Vulkan. The order of execution of commands with respect to the host and other commands on the device has few implicit guarantees, and needs to be explicitly specified. Memory caches and other optimizations are also explicitly managed, requiring that the flow of data through the system is largely under application control.
Whilst some implicit guarantees exist between commands, five explicit synchronization mechanisms are exposed by Vulkan:
- Fences: Fences can be used to communicate to the host that execution of some task on the device has completed.
- Semaphores: Semaphores can be used to control resource access across multiple queues.
- Events: Events provide a fine-grained synchronization primitive which can be signaled either within a command buffer or by the host, and can be waited upon within a command buffer or queried on the host.
- Pipeline Barriers: Pipeline barriers also provide synchronization control within a command buffer, but at a single point, rather than with separate signal and wait operations.
- Render Passes: Render passes provide a useful synchronization framework for most rendering tasks, built upon the concepts in this chapter. Many cases that would otherwise need an application to use other synchronization primitives can be expressed more efficiently as part of a render pass.
6.1. Execution and Memory Dependencies
An operation is an arbitrary amount of work to be executed on the host, a device, or an external entity such as a presentation engine. Synchronization commands introduce explicit execution dependencies, and memory dependencies between two sets of operations defined by the command’s two synchronization scopes.
The synchronization scopes define which other operations a synchronization command is able to create execution dependencies with. Any type of operation that is not in a synchronization command’s synchronization scopes will not be included in the resulting dependency. For example, for many synchronization commands, the synchronization scopes can be limited to just operations executing in specific pipeline stages, which allows other pipeline stages to be excluded from a dependency. Other scoping options are possible, depending on the particular command.
An execution dependency is a guarantee that for two sets of operations, the first set must happen-before the second set. If an operation happens-before another operation, then the first operation must complete before the second operation is initiated. More precisely:
- Let A and B be separate sets of operations.
- Let S be a synchronization command.
- Let AS and BS be the synchronization scopes of S.
- Let A' be the intersection of sets A and AS.
- Let B' be the intersection of sets B and BS.
- Submitting A, S and B for execution, in that order, will result in execution dependency E between A' and B'.
- Execution dependency E guarantees that A' happens-before B'.
An execution dependency chain is a sequence of execution dependencies that form a happens-before relation between the first dependency’s A' and the final dependency’s B'. For each consecutive pair of execution dependencies, a chain exists if the intersection of BS in the first dependency and AS in the second dependency is not an empty set. The formation of a single execution dependency from an execution dependency chain can be described by substituting the following in the description of execution dependencies:
- Let S be a set of synchronization commands that generate an execution dependency chain.
- Let AS be the first synchronization scope of the first command in S.
- Let BS be the second synchronization scope of the last command in S.
Note
An execution dependency is inherently also multiple execution dependencies - a dependency exists between each subset of A' and each subset of B', and the same is true for execution dependency chains. For example, a synchronization command with multiple pipeline stages in its stage masks effectively generates one dependency between each source stage and each destination stage. This can be useful to think about when considering how execution chains are formed if they do not involve all parts of a synchronization command’s dependency. Similarly, any set of adjacent dependencies in an execution dependency chain can be considered an execution dependency chain in its own right.
Execution dependencies alone are not sufficient to guarantee that values resulting from writes in one set of operations can be read from another set of operations.
Three additional types of operation are used to control memory access. Availability operations cause the values generated by specified memory write accesses to become available to a memory domain for future access. Any available value remains available until a subsequent write to the same memory location occurs (whether it is made available or not) or the memory is freed. Memory domain operations cause writes that are available to a source memory domain to become available to a destination memory domain (an example of this is making writes available to the host domain available to the device domain). Visibility operations cause values available to a memory domain to become visible to specified memory accesses.
Availability, visibility, memory domains, and memory domain operations are formally defined in the Availability and Visibility section of the Memory Model chapter. Which API operations perform each of these operations is defined in Availability, Visibility, and Domain Operations.
A memory dependency is an execution dependency which includes availability and visibility operations such that:
- The first set of operations happens-before the availability operation.
- The availability operation happens-before the visibility operation.
- The visibility operation happens-before the second set of operations.
Once written values are made visible to a particular type of memory access, they can be read or written by that type of memory access. Most synchronization commands in Vulkan define a memory dependency.
The specific memory accesses that are made available and visible are defined by the access scopes of a memory dependency. Any type of access that is in a memory dependency’s first access scope and occurs in A' is made available. Any type of access that is in a memory dependency’s second access scope and occurs in B' has any available writes made visible to it. Any type of operation that is not in a synchronization command’s access scopes will not be included in the resulting dependency.
A memory dependency enforces availability and visibility of memory accesses and execution order between two sets of operations. Adding to the description of execution dependency chains:
- Let a be the set of memory accesses performed by A'.
- Let b be the set of memory accesses performed by B'.
- Let aS be the first access scope of the first command in S.
- Let bS be the second access scope of the last command in S.
- Let a' be the intersection of sets a and aS.
- Let b' be the intersection of sets b and bS.
- Submitting A, S and B for execution, in that order, will result in a memory dependency m between A' and B'.
- Memory dependency m guarantees that:
  - Memory writes in a' are made available.
  - Available memory writes, including those from a', are made visible to b'.
Note
Execution and memory dependencies are used to solve data hazards, i.e. to ensure that read and write operations occur in a well-defined order. Write-after-read hazards can be solved with just an execution dependency, but read-after-write and write-after-write hazards need appropriate memory dependencies to be included between them. If an application does not include dependencies to solve these hazards, the results and execution orders of memory accesses are undefined.
6.1.1. Image Layout Transitions
Image subresources can be transitioned from one layout to another as part of a memory dependency (e.g. by using an image memory barrier). When a layout transition is specified in a memory dependency, it happens-after the availability operations in the memory dependency, and happens-before the visibility operations. Image layout transitions may perform read and write accesses on all memory bound to the image subresource range, so applications must ensure that all memory writes have been made available before a layout transition is executed. Available memory is automatically made visible to a layout transition, and writes performed by a layout transition are automatically made available.
Layout transitions always apply to a particular image subresource range, and
specify both an old layout and new layout.
If the old layout does not match the new layout, a transition occurs.
The old layout must match the current layout of the image subresource
range, with one exception.
The old layout can always be specified as VK_IMAGE_LAYOUT_UNDEFINED
,
though doing so invalidates the contents of the image subresource range.
As image layout transitions may perform read and write accesses on the
memory bound to the image, if the image subresource affected by the layout
transition is bound to peer memory for any device in the current device mask
then the memory heap the bound memory comes from must support the
VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT
and
VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT
capabilities as returned by
vkGetDeviceGroupPeerMemoryFeatures.
Note
Setting the old layout to VK_IMAGE_LAYOUT_UNDEFINED implies that the contents of the image subresource need not be preserved. Implementations may use this information to avoid performing expensive data transition operations.
Note
Applications must ensure that layout transitions happen-after all operations accessing the image with the old layout, and happen-before any operations that will access the image with the new layout. Layout transitions are potentially read/write operations, so not defining appropriate memory dependencies to guarantee this will result in a data race.
Image layout transitions interact with memory aliasing.
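As an informal illustration of a layout transition expressed through the pipeline barrier command described later in this chapter, the following sketch transitions one mip level and array layer of an assumed image from a transfer destination layout to a shader-read layout after a copy; commandBuffer and image are assumed handles.
VkImageMemoryBarrier barrier = {0};
barrier.sType                           = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.srcAccessMask                   = VK_ACCESS_TRANSFER_WRITE_BIT;           /* writes to make available */
barrier.dstAccessMask                   = VK_ACCESS_SHADER_READ_BIT;              /* accesses to make them visible to */
barrier.oldLayout                       = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;   /* must match the current layout */
barrier.newLayout                       = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.srcQueueFamilyIndex             = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex             = VK_QUEUE_FAMILY_IGNORED;
barrier.image                           = image;
barrier.subresourceRange.aspectMask     = VK_IMAGE_ASPECT_COLOR_BIT;
barrier.subresourceRange.baseMipLevel   = 0;
barrier.subresourceRange.levelCount     = 1;
barrier.subresourceRange.baseArrayLayer = 0;
barrier.subresourceRange.layerCount     = 1;

vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_TRANSFER_BIT,          /* source stage mask */
                     VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,   /* destination stage mask */
                     0, 0, NULL, 0, NULL, 1, &barrier);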
6.1.2. Pipeline Stages
The work performed by an action
or synchronization command consists of multiple operations, which are
performed as a sequence of logically independent steps known as pipeline
stages.
The exact pipeline stages executed depend on the particular command that is
used, and current command buffer state when the command was recorded.
Drawing commands, dispatching commands,
copy commands, clear commands, and synchronization commands all execute in different sets of
pipeline stages.
Synchronization commands do not execute in a defined
pipeline, but do execute VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
and
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
.
Note
Operations performed by synchronization commands (e.g. availability and visibility operations) are not executed by a defined pipeline stage. However other commands can still synchronize with them via the synchronization scopes.
Execution of operations across pipeline stages must adhere to implicit ordering guarantees, particularly including pipeline stage order. Otherwise, execution across pipeline stages may overlap or execute out of order with regards to other stages, unless otherwise enforced by an execution dependency.
Several of the synchronization commands include pipeline stage parameters, restricting the synchronization scopes for that command to just those stages. This allows fine grained control over the exact execution dependencies and accesses performed by action commands. Implementations should use these pipeline stages to avoid unnecessary stalls or cache flushing.
Bits which can be set, specifying pipeline stages, are:
typedef enum VkPipelineStageFlagBits {
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT = 0x00000001,
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT = 0x00000002,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT = 0x00000004,
VK_PIPELINE_STAGE_VERTEX_SHADER_BIT = 0x00000008,
VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT = 0x00000010,
VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT = 0x00000020,
VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT = 0x00000040,
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT = 0x00000080,
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT = 0x00000100,
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT = 0x00000200,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT = 0x00000400,
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT = 0x00000800,
VK_PIPELINE_STAGE_TRANSFER_BIT = 0x00001000,
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT = 0x00002000,
VK_PIPELINE_STAGE_HOST_BIT = 0x00004000,
VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT = 0x00008000,
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT = 0x00010000,
VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT = 0x01000000,
VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT = 0x00040000,
VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX = 0x00020000,
VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV = 0x00400000,
VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV = 0x00200000,
VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV = 0x02000000,
VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV = 0x00080000,
VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV = 0x00100000,
VK_PIPELINE_STAGE_FRAGMENT_DENSITY_PROCESS_BIT_EXT = 0x00800000,
} VkPipelineStageFlagBits;
- VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT specifies the stage of the pipeline where any commands are initially received by the queue.
- VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT specifies the stage of the pipeline where Draw/DispatchIndirect data structures are consumed. This stage also includes reading commands written by vkCmdProcessCommandsNVX.
- VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV specifies the task shader stage.
- VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV specifies the mesh shader stage.
- VK_PIPELINE_STAGE_VERTEX_INPUT_BIT specifies the stage of the pipeline where vertex and index buffers are consumed.
- VK_PIPELINE_STAGE_VERTEX_SHADER_BIT specifies the vertex shader stage.
- VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT specifies the tessellation control shader stage.
- VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT specifies the tessellation evaluation shader stage.
- VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT specifies the geometry shader stage.
- VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT specifies the fragment shader stage.
- VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT specifies the stage of the pipeline where early fragment tests (depth and stencil tests before fragment shading) are performed. This stage also includes subpass load operations for framebuffer attachments with a depth/stencil format.
- VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT specifies the stage of the pipeline where late fragment tests (depth and stencil tests after fragment shading) are performed. This stage also includes subpass store operations for framebuffer attachments with a depth/stencil format.
- VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT specifies the stage of the pipeline after blending where the final color values are output from the pipeline. This stage also includes subpass load and store operations and multisample resolve operations for framebuffer attachments with a color format.
- VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT specifies the execution of a compute shader.
- VK_PIPELINE_STAGE_TRANSFER_BIT specifies the execution of copy commands. This includes the operations resulting from all copy commands, clear commands (with the exception of vkCmdClearAttachments), and vkCmdCopyQueryPoolResults.
- VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT specifies the final stage in the pipeline where operations generated by all commands complete execution.
- VK_PIPELINE_STAGE_HOST_BIT specifies a pseudo-stage indicating execution on the host of reads/writes of device memory. This stage is not invoked by any commands recorded in a command buffer.
- VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV specifies the execution of the ray tracing shader stages.
- VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV specifies the execution of vkCmdBuildAccelerationStructureNV, vkCmdCopyAccelerationStructureNV, and vkCmdWriteAccelerationStructuresPropertiesNV.
- VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT specifies the execution of all graphics pipeline stages, and is equivalent to the logical OR of:
  - VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
  - VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
  - VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV
  - VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV
  - VK_PIPELINE_STAGE_VERTEX_INPUT_BIT
  - VK_PIPELINE_STAGE_VERTEX_SHADER_BIT
  - VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT
  - VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT
  - VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT
  - VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT
  - VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
  - VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
  - VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
  - VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
  - VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT
  - VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT
  - VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV
  - VK_PIPELINE_STAGE_FRAGMENT_DENSITY_PROCESS_BIT_EXT
- VK_PIPELINE_STAGE_ALL_COMMANDS_BIT is equivalent to the logical OR of every other pipeline stage flag that is supported on the queue it is used with.
- VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT specifies the stage of the pipeline where the predicate of conditional rendering is consumed.
- VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT specifies the stage of the pipeline where vertex attribute output values are written to the transform feedback buffers.
- VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX specifies the stage of the pipeline where device-side generation of commands via vkCmdProcessCommandsNVX is handled.
- VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV specifies the stage of the pipeline where the shading rate image is read to determine the shading rate for portions of a rasterized primitive.
- VK_PIPELINE_STAGE_FRAGMENT_DENSITY_PROCESS_BIT_EXT specifies the stage of the pipeline where the fragment density map is read to generate the fragment areas.
Note
An execution dependency with only VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT in its source stage mask effectively does not wait for any prior work, and an execution dependency with only VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT in its destination stage mask effectively does not block any subsequent work. When defining a memory dependency, using only VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT or VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT would never make any accesses available and/or visible, because these stages do not perform memory accesses. These stages are instead useful for accomplishing layout transitions and queue ownership operations when the required execution dependency is satisfied by other means, for example by semaphore operations between queues.
typedef VkFlags VkPipelineStageFlags;
VkPipelineStageFlags
is a bitmask type for setting a mask of zero or
more VkPipelineStageFlagBits.
If a synchronization command includes a source stage mask, its first synchronization scope only includes execution of the pipeline stages specified in that mask, and its first access scope only includes memory access performed by pipeline stages specified in that mask. If a synchronization command includes a destination stage mask, its second synchronization scope only includes execution of the pipeline stages specified in that mask, and its second access scope only includes memory access performed by pipeline stages specified in that mask.
Note
Including a particular pipeline stage in the first synchronization scope of a command implicitly includes logically earlier pipeline stages in the synchronization scope. Similarly, the second synchronization scope includes logically later pipeline stages. However, note that access scopes are not affected in this way - only the precise stages specified are considered part of each access scope.
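As an informal illustration of how stage masks and access masks define the two synchronization and access scopes, the following sketch uses the pipeline barrier command described later in this chapter to make compute shader writes available and visible to subsequent transfer reads; commandBuffer is an assumed handle.
VkMemoryBarrier memoryBarrier = {0};
memoryBarrier.sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
memoryBarrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;   /* first access scope: shader writes */
memoryBarrier.dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT;  /* second access scope: transfer reads */

vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,   /* first synchronization scope */
                     VK_PIPELINE_STAGE_TRANSFER_BIT,         /* second synchronization scope */
                     0,
                     1, &memoryBarrier,
                     0, NULL, 0, NULL);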
Certain pipeline stages are only available on queues that support a particular set of operations. The following table lists, for each pipeline stage flag, which queue capability flag must be supported by the queue. When multiple flags are enumerated in the second column of the table, it means that the pipeline stage is supported on the queue if it supports any of the listed capability flags. For further details on queue capabilities see Physical Device Enumeration and Queues.
Pipeline stage flag | Required queue capability flag
---|---
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT | None required
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT | VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_VERTEX_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT | VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_TRANSFER_BIT | VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT or VK_QUEUE_TRANSFER_BIT
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT | None required
VK_PIPELINE_STAGE_HOST_BIT | None required
VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT | None required
VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT | VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX | VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV | VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV | VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_FRAGMENT_DENSITY_PROCESS_BIT_EXT | VK_QUEUE_GRAPHICS_BIT
Pipeline stages that execute as a result of a command logically complete execution in a specific order, such that completion of a logically later pipeline stage must not happen-before completion of a logically earlier stage. This means that including any stage in the source stage mask for a particular synchronization command also implies that any logically earlier stages are included in AS for that command.
Similarly, initiation of a logically earlier pipeline stage must not happen-after initiation of a logically later pipeline stage. Including any given stage in the destination stage mask for a particular synchronization command also implies that any logically later stages are included in BS for that command.
Note
Implementations may not support synchronization at every pipeline stage for every synchronization operation. If a pipeline stage that an implementation does not support synchronization for appears in a source stage mask, it may substitute any logically later stage in its place for the first synchronization scope. If a pipeline stage that an implementation does not support synchronization for appears in a destination stage mask, it may substitute any logically earlier stage in its place for the second synchronization scope. For example, if an implementation is unable to signal an event immediately after vertex shader execution is complete, it may instead signal the event after color attachment output has completed. If an implementation makes such a substitution, it must not affect the semantics of execution or memory dependencies or image and buffer memory barriers.
The order and set of pipeline stages executed by a given command is determined by the command’s pipeline type, as described below:
For the graphics primitive shading pipeline, the following stages occur in this order:
- VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
- VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
- VK_PIPELINE_STAGE_VERTEX_INPUT_BIT
- VK_PIPELINE_STAGE_VERTEX_SHADER_BIT
- VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT
- VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT
- VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT
- VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT
- VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV
- VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
- VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT
- VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
- VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
- VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
For the graphics mesh shading pipeline, the following stages occur in this order:
- VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
- VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
- VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV
- VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV
- VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV
- VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
- VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT
- VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
- VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
- VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
For graphics pipeline commands executing in a render pass with a fragment
density map attachment, the pipeline stage where the fragment density map
read happens has no particular order relative to the other stages except
that it happens before VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
.
- VK_PIPELINE_STAGE_FRAGMENT_DENSITY_PROCESS_BIT_EXT
For the compute pipeline, the following stages occur in this order:
- VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
- VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
- VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT
- VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
The conditional rendering stage is formally part of both the graphics and the compute pipeline. The pipeline stage where the predicate read happens has unspecified order relative to other stages of these pipelines:
- VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT
For the transfer pipeline, the following stages occur in this order:
- VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
- VK_PIPELINE_STAGE_TRANSFER_BIT
- VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
For host operations, only one pipeline stage occurs, so no order is guaranteed:
- VK_PIPELINE_STAGE_HOST_BIT
For the command processing pipeline, the following stages occur in this order:
- VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
- VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
- VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
For the ray tracing shader pipeline, only one pipeline stage occurs, so no order is guaranteed:
- VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV
For ray tracing acceleration structure operations, only one pipeline stage occurs, so no order is guaranteed:
- VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV
6.1.3. Access Types
Memory in Vulkan can be accessed from within shader invocations and via some fixed-function stages of the pipeline. The access type is a function of the descriptor type used, or how a fixed-function stage accesses memory. Each access type corresponds to a bit flag in VkAccessFlagBits.
Some synchronization commands take sets of access types as parameters to define the access scopes of a memory dependency. If a synchronization command includes a source access mask, its first access scope only includes accesses via the access types specified in that mask. Similarly, if a synchronization command includes a destination access mask, its second access scope only includes accesses via the access types specified in that mask.
Access types that can be set in an access mask include:
typedef enum VkAccessFlagBits {
VK_ACCESS_INDIRECT_COMMAND_READ_BIT = 0x00000001,
VK_ACCESS_INDEX_READ_BIT = 0x00000002,
VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT = 0x00000004,
VK_ACCESS_UNIFORM_READ_BIT = 0x00000008,
VK_ACCESS_INPUT_ATTACHMENT_READ_BIT = 0x00000010,
VK_ACCESS_SHADER_READ_BIT = 0x00000020,
VK_ACCESS_SHADER_WRITE_BIT = 0x00000040,
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT = 0x00000080,
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT = 0x00000100,
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT = 0x00000200,
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT = 0x00000400,
VK_ACCESS_TRANSFER_READ_BIT = 0x00000800,
VK_ACCESS_TRANSFER_WRITE_BIT = 0x00001000,
VK_ACCESS_HOST_READ_BIT = 0x00002000,
VK_ACCESS_HOST_WRITE_BIT = 0x00004000,
VK_ACCESS_MEMORY_READ_BIT = 0x00008000,
VK_ACCESS_MEMORY_WRITE_BIT = 0x00010000,
VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT = 0x02000000,
VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT = 0x04000000,
VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT = 0x08000000,
VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT = 0x00100000,
VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX = 0x00020000,
VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX = 0x00040000,
VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT = 0x00080000,
VK_ACCESS_SHADING_RATE_IMAGE_READ_BIT_NV = 0x00800000,
VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_NV = 0x00200000,
VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_NV = 0x00400000,
VK_ACCESS_FRAGMENT_DENSITY_MAP_READ_BIT_EXT = 0x01000000,
} VkAccessFlagBits;
- VK_ACCESS_INDIRECT_COMMAND_READ_BIT specifies read access to indirect command data read as part of an indirect drawing or dispatch command.
- VK_ACCESS_INDEX_READ_BIT specifies read access to an index buffer as part of an indexed drawing command, bound by vkCmdBindIndexBuffer.
- VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT specifies read access to a vertex buffer as part of a drawing command, bound by vkCmdBindVertexBuffers.
- VK_ACCESS_UNIFORM_READ_BIT specifies read access to a uniform buffer.
- VK_ACCESS_INPUT_ATTACHMENT_READ_BIT specifies read access to an input attachment within a render pass during fragment shading.
- VK_ACCESS_SHADER_READ_BIT specifies read access to a storage buffer, uniform texel buffer, storage texel buffer, sampled image, or storage image.
- VK_ACCESS_SHADER_WRITE_BIT specifies write access to a storage buffer, storage texel buffer, or storage image.
- VK_ACCESS_COLOR_ATTACHMENT_READ_BIT specifies read access to a color attachment, such as via blending, logic operations, or via certain subpass load operations. It does not include advanced blend operations.
- VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT specifies write access to a color or resolve attachment during a render pass or via certain subpass load and store operations.
- VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT specifies read access to a depth/stencil attachment, via depth or stencil operations or via certain subpass load operations.
- VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT specifies write access to a depth/stencil attachment, via depth or stencil operations or via certain subpass load and store operations.
- VK_ACCESS_TRANSFER_READ_BIT specifies read access to an image or buffer in a copy operation.
- VK_ACCESS_TRANSFER_WRITE_BIT specifies write access to an image or buffer in a clear or copy operation.
- VK_ACCESS_HOST_READ_BIT specifies read access by a host operation. Accesses of this type are not performed through a resource, but directly on memory.
- VK_ACCESS_HOST_WRITE_BIT specifies write access by a host operation. Accesses of this type are not performed through a resource, but directly on memory.
- VK_ACCESS_MEMORY_READ_BIT specifies read access via non-specific entities. These entities include the Vulkan device and host, but may also include entities external to the Vulkan device or otherwise not part of the core Vulkan pipeline. When included in a destination access mask, makes all available writes visible to all future read accesses on entities known to the Vulkan device.
- VK_ACCESS_MEMORY_WRITE_BIT specifies write access via non-specific entities. These entities include the Vulkan device and host, but may also include entities external to the Vulkan device or otherwise not part of the core Vulkan pipeline. When included in a source access mask, all writes that are performed by entities known to the Vulkan device are made available. When included in a destination access mask, makes all available writes visible to all future write accesses on entities known to the Vulkan device.
- VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT specifies read access to a predicate as part of conditional rendering.
- VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT specifies write access to a transform feedback buffer made when transform feedback is active.
- VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT specifies read access to a transform feedback counter buffer which is read when vkCmdBeginTransformFeedbackEXT executes.
- VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT specifies write access to a transform feedback counter buffer which is written when vkCmdEndTransformFeedbackEXT executes.
- VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX specifies reads from VkBuffer inputs to vkCmdProcessCommandsNVX.
- VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX specifies writes to the target command buffer in vkCmdProcessCommandsNVX.
- VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT is similar to VK_ACCESS_COLOR_ATTACHMENT_READ_BIT, but also includes advanced blend operations.
- VK_ACCESS_SHADING_RATE_IMAGE_READ_BIT_NV specifies read access to a shading rate image as part of a drawing command, as bound by vkCmdBindShadingRateImageNV.
- VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_NV specifies read access to an acceleration structure as part of a trace or build command.
- VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_NV specifies write access to an acceleration structure as part of a build command.
- VK_ACCESS_FRAGMENT_DENSITY_MAP_READ_BIT_EXT specifies read access to a fragment density map attachment during dynamic fragment density map operations.
Certain access types are only performed by a subset of pipeline stages. Any synchronization command that takes both stage masks and access masks uses both to define the access scopes - only the specified access types performed by the specified stages are included in the access scope. An application must not specify an access flag in a synchronization command if it does not include a pipeline stage in the corresponding stage mask that is able to perform accesses of that type. The following table lists, for each access flag, which pipeline stages can perform that type of access.
Access flag | Supported pipeline stages |
---|---|
[Table rows not recoverable from this extraction: for each access flag listed above, the supported pipeline stages are those able to perform that type of access.]
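As an informal illustration of matching stage and access masks, the following sketch records a buffer memory barrier that makes transfer writes available to, and visible for, vertex attribute reads; each access flag is paired only with a stage able to perform that access. The command buffer and buffer handles are assumed to have been created elsewhere.
#include <vulkan/vulkan.h>

/* Informal sketch: make transfer writes to a buffer available to, and visible
 * for, vertex attribute reads in a later draw.  Each access mask is paired
 * only with a stage able to perform that access type:
 * VK_ACCESS_TRANSFER_WRITE_BIT with VK_PIPELINE_STAGE_TRANSFER_BIT, and
 * VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT with VK_PIPELINE_STAGE_VERTEX_INPUT_BIT. */
static void barrierTransferToVertexInput(VkCommandBuffer commandBuffer, VkBuffer buffer)
{
    const VkBufferMemoryBarrier barrier = {
        .sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
        .dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT,
        .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .buffer = buffer,
        .offset = 0,
        .size = VK_WHOLE_SIZE,
    };

    vkCmdPipelineBarrier(commandBuffer,
                         VK_PIPELINE_STAGE_TRANSFER_BIT,      /* srcStageMask */
                         VK_PIPELINE_STAGE_VERTEX_INPUT_BIT,  /* dstStageMask */
                         0,                                   /* dependencyFlags */
                         0, NULL,                             /* global memory barriers */
                         1, &barrier,                         /* buffer memory barriers */
                         0, NULL);                            /* image memory barriers */
}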
If a memory object does not have the
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
property, then
vkFlushMappedMemoryRanges must be called in order to guarantee that
writes to the memory object from the host are made available to the host
domain, where they can be further made available to the device domain via a
domain operation.
Similarly, vkInvalidateMappedMemoryRanges must be called to guarantee
that writes which are available to the host domain are made visible to host
operations.
If the memory object does have the
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
property flag, writes to the
memory object from the host are automatically made available to the host
domain.
Similarly, writes made available to the host domain are automatically made
visible to the host.
Note
The vkQueueSubmit command automatically performs a domain operation from host to device for all writes performed before the command executes, so in most cases an explicit memory barrier is not needed for this case. In the few circumstances where a submit does not occur between the host write and the device read access, writes can be made available by using an explicit memory barrier. |
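For example, a host write to non-coherent mapped memory might be made available to the host domain as in the following sketch; the device and memory handles are assumptions, and VK_WHOLE_SIZE is used so the flushed range satisfies the nonCoherentAtomSize alignment rules.
#include <string.h>
#include <vulkan/vulkan.h>

/* Informal sketch: copy data into non-coherent mapped memory, then make the
 * host writes available to the host domain with vkFlushMappedMemoryRanges.
 * The device and memory handles are assumed to exist. */
static VkResult writeAndFlush(VkDevice device, VkDeviceMemory memory,
                              const void* data, VkDeviceSize size)
{
    void* mapped = NULL;
    VkResult result = vkMapMemory(device, memory, 0, VK_WHOLE_SIZE, 0, &mapped);
    if (result != VK_SUCCESS)
        return result;

    memcpy(mapped, data, (size_t)size);

    const VkMappedMemoryRange range = {
        .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
        .memory = memory,
        .offset = 0,
        .size = VK_WHOLE_SIZE,
    };
    result = vkFlushMappedMemoryRanges(device, 1, &range);

    vkUnmapMemory(device, memory);
    return result;
}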
typedef VkFlags VkAccessFlags;
VkAccessFlags
is a bitmask type for setting a mask of zero or more
VkAccessFlagBits.
6.1.4. Framebuffer Region Dependencies
Pipeline stages that operate on, or with respect to, the framebuffer are collectively the framebuffer-space pipeline stages. These stages are:
-
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT
-
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
-
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
-
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
For these pipeline stages, an execution or memory dependency from the first set of operations to the second set can either be a single framebuffer-global dependency, or split into multiple framebuffer-local dependencies. A dependency with non-framebuffer-space pipeline stages is neither framebuffer-global nor framebuffer-local.
A framebuffer region is a set of sample (x, y, layer, sample) coordinates that is a subset of the entire framebuffer.
Both synchronization scopes of a framebuffer-local dependency include only the operations performed within corresponding framebuffer regions (as defined below). No ordering guarantees are made between different framebuffer regions for a framebuffer-local dependency.
Both synchronization scopes of a framebuffer-global dependency include operations on all framebuffer-regions.
If the first synchronization scope includes operations on pixels/fragments with N samples and the second synchronization scope includes operations on pixels/fragments with M samples, where N does not equal M, then a framebuffer region containing all samples at a given (x, y, layer) coordinate in the first synchronization scope corresponds to a region containing all samples at the same coordinate in the second synchronization scope. In other words, it is a pixel granularity dependency. If N equals M, then a framebuffer region containing a single (x, y, layer, sample) coordinate in the first synchronization scope corresponds to a region containing the same sample at the same coordinate in the second synchronization scope. In other words, it is a sample granularity dependency.
Note
Since fragment invocations are not specified to run in any particular groupings, the size of a framebuffer region is implementation-dependent, not known to the application, and must be assumed to be no larger than specified above. |
Note
Practically, the pixel vs sample granularity dependency means that if an input attachment has a different number of samples than the pipeline's rasterizationSamples, a framebuffer-local dependency involving that attachment is only guaranteed at pixel granularity, so all samples at a given (x, y, layer) coordinate must be treated as a single framebuffer region. |
If a synchronization command includes a dependencyFlags
parameter, and
specifies the VK_DEPENDENCY_BY_REGION_BIT
flag, then it defines
framebuffer-local dependencies for the framebuffer-space pipeline stages in
that synchronization command, for all framebuffer regions.
If no dependencyFlags
parameter is included, or the
VK_DEPENDENCY_BY_REGION_BIT
flag is not specified, then a
framebuffer-global dependency is specified for those stages.
The VK_DEPENDENCY_BY_REGION_BIT
flag does not affect the dependencies
between non-framebuffer-space pipeline stages, nor does it affect the
dependencies between framebuffer-space and non-framebuffer-space pipeline
stages.
Note
Framebuffer-local dependencies are more optimal for most architectures; particularly tile-based architectures - which can keep framebuffer-regions entirely in on-chip registers and thus avoid external bandwidth across such a dependency. Including a framebuffer-global dependency in your rendering will usually force all implementations to flush data to memory, or to a higher level cache, breaking any potential locality optimizations. |
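As an informal example, a framebuffer-local subpass dependency between a subpass that writes a color attachment and a later subpass that reads it as an input attachment might be declared as follows; the subpass indices are placeholders.
#include <vulkan/vulkan.h>

/* Informal sketch: a framebuffer-local dependency from subpass 0 (writing a
 * color attachment) to subpass 1 (reading it as an input attachment).
 * VK_DEPENDENCY_BY_REGION_BIT limits the dependency to corresponding
 * framebuffer regions. */
static const VkSubpassDependency colorToInputAttachmentDependency = {
    .srcSubpass      = 0,
    .dstSubpass      = 1,
    .srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
    .srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
    .dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT,
};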
6.1.5. View-Local Dependencies
In a render pass instance that has multiview enabled, dependencies can be either view-local or view-global.
A view-local dependency only includes operations from a single source view from the source subpass in the first synchronization scope, and only includes operations from a single destination view from the destination subpass in the second synchronization scope. A view-global dependency includes all views in the view mask of the source and destination subpasses in the corresponding synchronization scopes.
If a synchronization command includes a dependencyFlags
parameter and
specifies the VK_DEPENDENCY_VIEW_LOCAL_BIT
flag, then it defines
view-local dependencies for that synchronization command, for all views.
If no dependencyFlags
parameter is included or the
VK_DEPENDENCY_VIEW_LOCAL_BIT
flag is not specified, then a view-global
dependency is specified.
6.1.6. Device-Local Dependencies
Dependencies can be either device-local or non-device-local.
A device-local dependency acts as multiple separate dependencies, one for
each physical device that executes the synchronization command, where each
dependency only includes operations from that physical device in both
synchronization scopes.
A non-device-local dependency is a single dependency where both
synchronization scopes include operations from all physical devices that
participate in the synchronization command.
For subpass dependencies, all physical devices in the
VkDeviceGroupRenderPassBeginInfo::deviceMask
participate in the
dependency, and for pipeline barriers all physical devices that are set in
the command buffer’s current device mask participate in the dependency.
If a synchronization command includes a dependencyFlags
parameter and
specifies the VK_DEPENDENCY_DEVICE_GROUP_BIT
flag, then it defines a
non-device-local dependency for that synchronization command.
If no dependencyFlags
parameter is included or the
VK_DEPENDENCY_DEVICE_GROUP_BIT
flag is not specified, then it defines
device-local dependencies for that synchronization command, for all
participating physical devices.
Semaphore and event dependencies are device-local and only execute on the one physical device that performs the dependency.
6.2. Implicit Synchronization Guarantees
A small number of implicit ordering guarantees are provided by Vulkan, ensuring that the order in which commands are submitted is meaningful, and avoiding unnecessary complexity in common operations.
Submission order is a fundamental ordering in Vulkan, giving meaning to the order in which action and synchronization commands are recorded and submitted to a single queue. Explicit and implicit ordering guarantees between commands in Vulkan all work on the premise that this ordering is meaningful. This order does not itself define any execution or memory dependencies; synchronization commands and other orderings within the API use this ordering to define their scopes.
Submission order for any given set of commands is based on the order in which they were recorded to command buffers and then submitted. This order is determined as follows:
-
The initial order is determined by the order in which vkQueueSubmit commands are executed on the host, for a single queue, from first to last.
-
The order in which VkSubmitInfo structures are specified in the
pSubmits
parameter of vkQueueSubmit, from lowest index to highest. -
The order in which command buffers are specified in the
pCommandBuffers
member of VkSubmitInfo, from lowest index to highest. -
The order in which commands were recorded to a command buffer on the host, from first to last:
-
For commands recorded outside a render pass, this includes all other commands recorded outside a render pass, including vkCmdBeginRenderPass and vkCmdEndRenderPass commands; it does not directly include commands inside a render pass.
-
For commands recorded inside a render pass, this includes all other commands recorded inside the same subpass, including the vkCmdBeginRenderPass and vkCmdEndRenderPass commands that delimit the same render pass instance; it does not include commands recorded to other subpasses.
-
Action and synchronization
commands recorded to a command buffer execute the
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
pipeline stage in
submission order - forming an implicit
execution dependency between this stage in each command.
State commands do not execute any operations on the device, instead they set the state of the command buffer when they execute on the host, in the order that they are recorded. Action commands consume the current state of the command buffer when they are recorded, and will execute state changes on the device as required to match the recorded state.
Query commands, the order of primitives passing through the graphics pipeline, and image layout transitions as part of an image memory barrier provide additional guarantees based on submission order.
Execution of pipeline stages within a given command also has a loose ordering, dependent only on a single command.
6.3. Fences
Fences are a synchronization primitive that can be used to insert a dependency from a queue to the host. Fences have two states - signaled and unsignaled. A fence can be signaled as part of the execution of a queue submission command. Fences can be unsignaled on the host with vkResetFences. Fences can be waited on by the host with the vkWaitForFences command, and the current state can be queried with vkGetFenceStatus.
As with most objects in Vulkan, fences are an interface to internal data which is typically opaque to applications. This internal data is referred to as a fence’s payload.
However, in order to enable communication with agents outside of the current device, it is necessary to be able to export that payload to a commonly understood format, and subsequently import from that format as well.
The internal data of a fence may include a reference to any resources and pending work associated with signal or unsignal operations performed on that fence object. Mechanisms to import and export that internal data to and from fences are provided below. These mechanisms indirectly enable applications to share fence state between two or more fences and other synchronization primitives across process and API boundaries.
Fences are represented by VkFence
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFence)
To create a fence, call:
VkResult vkCreateFence(
VkDevice device,
const VkFenceCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkFence* pFence);
-
device
is the logical device that creates the fence. -
pCreateInfo
is a pointer to an instance of the VkFenceCreateInfo
structure which contains information about how the fence is to be created. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pFence
points to a handle in which the resulting fence object is returned.
The VkFenceCreateInfo
structure is defined as:
typedef struct VkFenceCreateInfo {
VkStructureType sType;
const void* pNext;
VkFenceCreateFlags flags;
} VkFenceCreateInfo;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
flags
is a bitmask of VkFenceCreateFlagBits specifying the initial state and behavior of the fence.
typedef enum VkFenceCreateFlagBits {
VK_FENCE_CREATE_SIGNALED_BIT = 0x00000001,
} VkFenceCreateFlagBits;
-
VK_FENCE_CREATE_SIGNALED_BIT
specifies that the fence object is created in the signaled state. Otherwise, it is created in the unsignaled state.
typedef VkFlags VkFenceCreateFlags;
VkFenceCreateFlags
is a bitmask type for setting a mask of zero or
more VkFenceCreateFlagBits.
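For example, a fence that starts out signaled, so that the first wait on it returns immediately, might be created as in the following sketch; the device handle is assumed to exist.
#include <vulkan/vulkan.h>

/* Informal sketch: create a fence in the signaled state so that the first
 * wait on it does not block. */
static VkResult createSignaledFence(VkDevice device, VkFence* pFence)
{
    const VkFenceCreateInfo createInfo = {
        .sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO,
        .flags = VK_FENCE_CREATE_SIGNALED_BIT,
    };
    return vkCreateFence(device, &createInfo, NULL, pFence);
}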
To create a fence whose payload can be exported to external handles, add
the VkExportFenceCreateInfo structure to the pNext
chain of the
VkFenceCreateInfo structure.
The VkExportFenceCreateInfo
structure is defined as:
typedef struct VkExportFenceCreateInfo {
VkStructureType sType;
const void* pNext;
VkExternalFenceHandleTypeFlags handleTypes;
} VkExportFenceCreateInfo;
or the equivalent
typedef VkExportFenceCreateInfo VkExportFenceCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleTypes
is a bitmask of VkExternalFenceHandleTypeFlagBits specifying one or more fence handle types the application can export from the resulting fence. The application can request multiple handle types for the same fence.
To specify additional attributes of NT handles exported from a fence, add
the VkExportFenceWin32HandleInfoKHR structure to the pNext
chain
of the VkFenceCreateInfo structure.
The VkExportFenceWin32HandleInfoKHR
structure is defined as:
typedef struct VkExportFenceWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportFenceWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
pAttributes
is a pointer to a Windows SECURITY_ATTRIBUTES
structure specifying security attributes of the handle. -
dwAccess
is a DWORD
specifying access rights of the handle. -
name
is a NULL-terminated UTF-16 string to associate with the underlying synchronization primitive referenced by NT handles exported from the created fence.
If this structure is not present, or if pAttributes
is set to NULL
,
default security descriptor values will be used, and child processes created
by the application will not inherit the handle, as described in the MSDN
documentation for “Synchronization Object Security and Access Rights”1.
Further, if the structure is not present, the access rights will be
DXGI_SHARED_RESOURCE_READ
| DXGI_SHARED_RESOURCE_WRITE
for handles of the following types:
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT
To export a Windows handle representing the state of a fence, call:
VkResult vkGetFenceWin32HandleKHR(
VkDevice device,
const VkFenceGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
-
device
is the logical device that created the fence being exported. -
pGetWin32HandleInfo
is a pointer to an instance of the VkFenceGetWin32HandleInfoKHR structure containing parameters of the export operation. -
pHandle
will return the Windows handle representing the fence state.
For handle types defined as NT handles, the handles returned by
vkGetFenceWin32HandleKHR
are owned by the application.
To avoid leaking resources, the application must release ownership of them
using the CloseHandle
system call when they are no longer needed.
Exporting a Windows handle from a fence may have side effects depending on the transference of the specified handle type, as described in Importing Fence Payloads.
The VkFenceGetWin32HandleInfoKHR
structure is defined as:
typedef struct VkFenceGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkExternalFenceHandleTypeFlagBits handleType;
} VkFenceGetWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
fence
is the fence from which state will be exported. -
handleType
is the type of handle requested.
The properties of the handle returned depend on the value of
handleType
.
See VkExternalFenceHandleTypeFlagBits for a description of the
properties of the defined external fence handle types.
To export a POSIX file descriptor representing the payload of a fence, call:
VkResult vkGetFenceFdKHR(
VkDevice device,
const VkFenceGetFdInfoKHR* pGetFdInfo,
int* pFd);
-
device
is the logical device that created the fence being exported. -
pGetFdInfo
is a pointer to an instance of the VkFenceGetFdInfoKHR structure containing parameters of the export operation. -
pFd
will return the file descriptor representing the fence payload.
Each call to vkGetFenceFdKHR
must create a new file descriptor and
transfer ownership of it to the application.
To avoid leaking resources, the application must release ownership of the
file descriptor when it is no longer needed.
Note
Ownership can be released in many ways. For example, the application can call close() on the file descriptor, or transfer ownership back to Vulkan by using the file descriptor to import a fence payload.
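As an informal sketch, exporting an opaque file descriptor from a fence might look as follows, assuming VK_KHR_external_fence_fd is enabled and the fence was created with a matching VkExportFenceCreateInfo; the function pointer is retrieved with vkGetDeviceProcAddr since this is an extension command.
#include <vulkan/vulkan.h>

/* Informal sketch: export an opaque POSIX file descriptor from a fence.
 * The returned descriptor is owned by the application; it must eventually be
 * released, e.g. with close(), unless ownership is transferred back to Vulkan
 * by importing it. */
static VkResult exportFenceOpaqueFd(VkDevice device, VkFence fence, int* pFd)
{
    PFN_vkGetFenceFdKHR pfnGetFenceFdKHR =
        (PFN_vkGetFenceFdKHR)vkGetDeviceProcAddr(device, "vkGetFenceFdKHR");
    if (pfnGetFenceFdKHR == NULL)
        return VK_ERROR_EXTENSION_NOT_PRESENT;

    const VkFenceGetFdInfoKHR getFdInfo = {
        .sType = VK_STRUCTURE_TYPE_FENCE_GET_FD_INFO_KHR,
        .fence = fence,
        .handleType = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT,
    };
    return pfnGetFenceFdKHR(device, &getFdInfo, pFd);
}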
If pGetFdInfo
->handleType
is
VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT
and the fence is signaled at
the time vkGetFenceFdKHR
is called, pFd
may return the value
-1
instead of a valid file descriptor.
Where supported by the operating system, the implementation must set the
file descriptor to be closed automatically when an execve
system call
is made.
Exporting a file descriptor from a fence may have side effects depending on the transference of the specified handle type, as described in Importing Fence State.
The VkFenceGetFdInfoKHR
structure is defined as:
typedef struct VkFenceGetFdInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkExternalFenceHandleTypeFlagBits handleType;
} VkFenceGetFdInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
fence
is the fence from which state will be exported. -
handleType
is the type of handle requested.
The properties of the file descriptor returned depend on the value of
handleType
.
See VkExternalFenceHandleTypeFlagBits for a description of the
properties of the defined external fence handle types.
To destroy a fence, call:
void vkDestroyFence(
VkDevice device,
VkFence fence,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the fence. -
fence
is the handle of the fence to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
To query the status of a fence from the host, call:
VkResult vkGetFenceStatus(
VkDevice device,
VkFence fence);
-
device
is the logical device that owns the fence. -
fence
is the handle of the fence to query.
Upon success, vkGetFenceStatus
returns the status of the fence object,
with the following return codes:
Status | Meaning |
---|---|
VK_SUCCESS | The fence specified by fence is signaled. |
VK_NOT_READY | The fence specified by fence is unsignaled. |
VK_ERROR_DEVICE_LOST | The device has been lost. See Lost Device. |
If a queue submission command is pending execution, then the value returned by this command may immediately be out of date.
If the device has been lost (see Lost Device),
vkGetFenceStatus
may return any of the above status codes.
If the device has been lost and vkGetFenceStatus
is called repeatedly,
it will eventually return either VK_SUCCESS
or
VK_ERROR_DEVICE_LOST
.
To set the state of fences to unsignaled from the host, call:
VkResult vkResetFences(
VkDevice device,
uint32_t fenceCount,
const VkFence* pFences);
-
device
is the logical device that owns the fences. -
fenceCount
is the number of fences to reset. -
pFences
is a pointer to an array of fence handles to reset.
If any member of pFences
currently has its
payload imported with temporary
permanence, that fence’s prior permanent payload is first restored.
The remaining operations described therefore operate on the restored
payload.
When vkResetFences is executed on the host, it defines a fence unsignal operation for each fence, which resets the fence to the unsignaled state.
If any member of pFences
is already in the unsignaled state when
vkResetFences is executed, then vkResetFences has no effect on
that fence.
When a fence is submitted to a queue as part of a queue submission command, it defines a memory dependency on the batches that were submitted as part of that command, and defines a fence signal operation which sets the fence to the signaled state.
The first synchronization scope includes every batch submitted in the same queue submission command. Fence signal operations that are defined by vkQueueSubmit additionally include in the first synchronization scope all commands that occur earlier in submission order.
The second synchronization scope only includes the fence signal operation.
The first access scope includes all memory access performed by the device.
The second access scope is empty.
To wait for one or more fences to enter the signaled state on the host, call:
VkResult vkWaitForFences(
VkDevice device,
uint32_t fenceCount,
const VkFence* pFences,
VkBool32 waitAll,
uint64_t timeout);
-
device
is the logical device that owns the fences. -
fenceCount
is the number of fences to wait on. -
pFences
is a pointer to an array of fenceCount
fence handles. -
waitAll
is the condition that must be satisfied to successfully unblock the wait. If waitAll
is VK_TRUE
, then the condition is that all fences in pFences
are signaled. Otherwise, the condition is that at least one fence in pFences
is signaled. -
timeout
is the timeout period in units of nanoseconds. timeout
is adjusted to the closest value allowed by the implementation-dependent timeout accuracy, which may be substantially longer than one nanosecond, and may be longer than the requested period.
If the condition is satisfied when vkWaitForFences
is called, then
vkWaitForFences
returns immediately.
If the condition is not satisfied at the time vkWaitForFences
is
called, then vkWaitForFences
will block and wait up to timeout
nanoseconds for the condition to become satisfied.
If timeout
is zero, then vkWaitForFences
does not wait, but
simply returns the current state of the fences.
VK_TIMEOUT
will be returned in this case if the condition is not
satisfied, even though no actual wait was performed.
If the specified timeout period expires before the condition is satisfied,
vkWaitForFences
returns VK_TIMEOUT
.
If the condition is satisfied before timeout
nanoseconds has expired,
vkWaitForFences
returns VK_SUCCESS
.
If device loss occurs (see Lost Device) before
the timeout has expired, vkWaitForFences
must return in finite time
with either VK_SUCCESS
or VK_ERROR_DEVICE_LOST
.
Note
While vkWaitForFences must return in finite time after device loss, no guarantee is made that it returns immediately upon device loss. |
An execution dependency is defined by waiting for a fence to become signaled, either via vkWaitForFences or by polling on vkGetFenceStatus.
The first synchronization scope includes only the fence signal operation.
The second synchronization scope includes the host operations of vkWaitForFences or vkGetFenceStatus indicating that the fence has become signaled.
Note
Signaling a fence and waiting on the host does not guarantee that the results of memory accesses will be visible to the host, as the access scope of a memory dependency defined by a fence only includes device access. A memory barrier or other memory dependency must be used to guarantee this. See the description of host access types for more information. |
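Putting the preceding rules together, a typical submit / wait / reset cycle might look like the following sketch; the queue, command buffer, and fence are assumed to have been created elsewhere.
#include <vulkan/vulkan.h>

/* Informal sketch: submit one command buffer with a fence, wait for the
 * fence on the host, then reset it for reuse. */
static VkResult submitAndWait(VkDevice device, VkQueue queue,
                              VkCommandBuffer commandBuffer, VkFence fence)
{
    const VkSubmitInfo submitInfo = {
        .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
        .commandBufferCount = 1,
        .pCommandBuffers = &commandBuffer,
    };

    VkResult result = vkQueueSubmit(queue, 1, &submitInfo, fence);
    if (result != VK_SUCCESS)
        return result;

    /* Wait up to one second; VK_TIMEOUT is returned if the fence is still
     * unsignaled when the timeout expires. */
    result = vkWaitForFences(device, 1, &fence, VK_TRUE, 1000000000ull);
    if (result != VK_SUCCESS)
        return result;

    /* Return the fence to the unsignaled state so it can be reused. */
    return vkResetFences(device, 1, &fence);
}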
6.3.1. Alternate Methods to Signal Fences
Besides submitting a fence to a queue as part of a queue submission command, a fence may also be signaled when a particular event occurs on a device or display.
To create a fence that will be signaled when an event occurs on a device, call:
VkResult vkRegisterDeviceEventEXT(
VkDevice device,
const VkDeviceEventInfoEXT* pDeviceEventInfo,
const VkAllocationCallbacks* pAllocator,
VkFence* pFence);
-
device
is a logical device on which the event may occur. -
pDeviceEventInfo
is a pointer to an instance of the VkDeviceEventInfoEXT structure describing the event of interest to the application. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pFence
points to a handle in which the resulting fence object is returned.
The VkDeviceEventInfoEXT
structure is defined as:
typedef struct VkDeviceEventInfoEXT {
VkStructureType sType;
const void* pNext;
VkDeviceEventTypeEXT deviceEvent;
} VkDeviceEventInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
deviceEvent
is a VkDeviceEventTypeEXT value specifying when the fence will be signaled.
Possible values of VkDeviceEventInfoEXT::deviceEvent
, specifying when
a fence will be signaled, are:
typedef enum VkDeviceEventTypeEXT {
VK_DEVICE_EVENT_TYPE_DISPLAY_HOTPLUG_EXT = 0,
} VkDeviceEventTypeEXT;
-
VK_DEVICE_EVENT_TYPE_DISPLAY_HOTPLUG_EXT
specifies that the fence is signaled when a display is plugged into or unplugged from the specified device. Applications can use this notification to determine when they need to re-enumerate the available displays on a device.
To create a fence that will be signaled when an event occurs on a VkDisplayKHR object, call:
VkResult vkRegisterDisplayEventEXT(
VkDevice device,
VkDisplayKHR display,
const VkDisplayEventInfoEXT* pDisplayEventInfo,
const VkAllocationCallbacks* pAllocator,
VkFence* pFence);
-
device
is a logical device associated with display.
-
display
is the display on which the event may occur. -
pDisplayEventInfo
is a pointer to an instance of the VkDisplayEventInfoEXT structure describing the event of interest to the application. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pFence
points to a handle in which the resulting fence object is returned.
The VkDisplayEventInfoEXT
structure is defined as:
typedef struct VkDisplayEventInfoEXT {
VkStructureType sType;
const void* pNext;
VkDisplayEventTypeEXT displayEvent;
} VkDisplayEventInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
displayEvent
is a VkDisplayEventTypeEXT specifying when the fence will be signaled.
Possible values of VkDisplayEventInfoEXT::displayEvent
,
specifying when a fence will be signaled, are:
typedef enum VkDisplayEventTypeEXT {
VK_DISPLAY_EVENT_TYPE_FIRST_PIXEL_OUT_EXT = 0,
} VkDisplayEventTypeEXT;
-
VK_DISPLAY_EVENT_TYPE_FIRST_PIXEL_OUT_EXT
specifies that the fence is signaled when the first pixel of the next display refresh cycle leaves the display engine for the display.
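As an informal sketch, registering a fence that is signaled when the first pixel of the next refresh cycle leaves the display engine might look as follows, assuming the VK_EXT_display_control extension is enabled; the display handle is assumed to have been obtained elsewhere.
#include <vulkan/vulkan.h>

/* Informal sketch: create a fence signaled on the next first-pixel-out event
 * for the given display. */
static VkResult registerFirstPixelOutFence(VkDevice device, VkDisplayKHR display,
                                           VkFence* pFence)
{
    PFN_vkRegisterDisplayEventEXT pfnRegisterDisplayEventEXT =
        (PFN_vkRegisterDisplayEventEXT)vkGetDeviceProcAddr(device, "vkRegisterDisplayEventEXT");
    if (pfnRegisterDisplayEventEXT == NULL)
        return VK_ERROR_EXTENSION_NOT_PRESENT;

    const VkDisplayEventInfoEXT eventInfo = {
        .sType = VK_STRUCTURE_TYPE_DISPLAY_EVENT_INFO_EXT,
        .displayEvent = VK_DISPLAY_EVENT_TYPE_FIRST_PIXEL_OUT_EXT,
    };
    return pfnRegisterDisplayEventEXT(device, display, &eventInfo, NULL, pFence);
}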
6.3.2. Importing Fence Payloads
Applications can import a fence payload into an existing fence using an external fence handle. The effects of the import operation will be either temporary or permanent, as specified by the application. If the import is temporary, the fence will be restored to its permanent state the next time that fence is passed to vkResetFences.
Note
Restoring a fence to its prior permanent payload is a distinct operation from resetting a fence payload. See vkResetFences for more detail. |
Performing a subsequent temporary import on a fence before resetting it has
no effect on this requirement; the next unsignal of the fence must still
restore its last permanent state.
A permanent payload import behaves as if the target fence was destroyed, and
a new fence was created with the same handle but the imported payload.
Because importing a fence payload temporarily or permanently detaches the
existing payload from a fence, similar usage restrictions to those applied
to vkDestroyFence
are applied to any command that imports a fence
payload.
Which of these import types is used is referred to as the import operation’s
permanence.
Each handle type supports either one or both types of permanence.
The implementation must perform the import operation by either referencing or copying the payload referred to by the specified external fence handle, depending on the handle’s type. The import method used is referred to as the handle type’s transference. When using handle types with reference transference, importing a payload to a fence adds the fence to the set of all fences sharing that payload. This set includes the fence from which the payload was exported. Fence signaling, waiting, and resetting operations performed on any fence in the set must behave as if the set were a single fence. Importing a payload using handle types with copy transference creates a duplicate copy of the payload at the time of import, but makes no further reference to it. Fence signaling, waiting, and resetting operations performed on the target of copy imports must not affect any other fence or payload.
Export operations have the same transference as the specified handle type’s import operations. Additionally, exporting a fence payload to a handle with copy transference has the same side effects on the source fence’s payload as executing a fence reset operation. If the fence was using a temporarily imported payload, the fence’s prior permanent payload will be restored.
Note
The tables Handle Types Supported by VkImportFenceWin32HandleInfoKHR and Handle Types Supported by VkImportFenceFdInfoKHR define the permanence and transference of each handle type. |
External synchronization allows
implementations to modify an object’s internal state, i.e. payload, without
internal synchronization.
However, for fences sharing a payload across processes, satisfying the
external synchronization requirements of VkFence
parameters as if all
fences in the set were the same object is sometimes infeasible.
Satisfying valid usage constraints on the state of a fence would similarly
require impractical coordination or levels of trust between processes.
Therefore, these constraints only apply to a specific fence handle, not to
its payload.
For distinct fence objects which share a payload:
-
If multiple commands which queue a signal operation, or which unsignal a fence, are called concurrently, behavior will be as if the commands were called in an arbitrary sequential order.
-
If a queue submission command is called with a fence that is sharing a payload, and the payload is already associated with another queue command that has not yet completed execution, either one or both of the commands will cause the fence to become signaled when they complete execution.
-
If a fence payload is reset while it is associated with a queue command that has not yet completed execution, the payload will become unsignaled, but may become signaled again when the command completes execution.
-
In the preceding cases, any of the devices associated with the fences sharing the payload may be lost, or any of the queue submission or fence reset commands may return
VK_ERROR_INITIALIZATION_FAILED
.
Other than these non-deterministic results, behavior is well defined. In particular:
-
The implementation must not crash or enter an internally inconsistent state where future valid Vulkan commands might cause undefined results,
-
Timeouts on future wait commands on fences sharing the payload must be effective.
Note
These rules allow processes to synchronize access to shared memory without trusting each other. However, such processes must still be cautious not to use the shared fence for more than synchronizing access to the shared memory. For example, a process should not use a fence with shared payload to tell when commands it submitted to a queue have completed and objects used by those commands may be destroyed, since the other process can accidentally or maliciously cause the fence to signal before the commands actually complete. |
When a fence is using an imported payload, its
VkExportFenceCreateInfo::handleTypes
value is that specified
when creating the fence from which the payload was exported, rather than
that specified when creating the fence.
Additionally, VkExternalFenceProperties::exportFromImportedHandleTypes
restricts which handle types can be exported from such a fence based on the
specific handle type used to import the current payload.
Passing a fence to vkAcquireNextImageKHR is equivalent to temporarily
importing a fence payload to that fence.
Note
Because the exportable handle types of an imported fence correspond to its current imported payload, and vkAcquireNextImageKHR behaves the same as a temporary import operation for which the source fence is opaque to the application, applications have no way of determining whether any external handle types can be exported from a fence in this state. Therefore, applications must not attempt to export handles from fences using a temporarily imported payload from vkAcquireNextImageKHR. |
When importing a fence payload, it is the responsibility of the application
to ensure the external handles meet all valid usage requirements.
However, implementations must perform sufficient validation of external
handles to ensure that the operation results in a valid fence which will not
cause program termination, device loss, queue stalls, host thread stalls, or
corruption of other resources when used as allowed according to its import
parameters.
If the external handle provided does not meet these requirements, the
implementation must fail the fence payload import operation with the error
code VK_ERROR_INVALID_EXTERNAL_HANDLE
.
To import a fence payload from a Windows handle, call:
VkResult vkImportFenceWin32HandleKHR(
VkDevice device,
const VkImportFenceWin32HandleInfoKHR* pImportFenceWin32HandleInfo);
-
device
is the logical device that created the fence. -
pImportFenceWin32HandleInfo
points to a VkImportFenceWin32HandleInfoKHR structure specifying the fence and import parameters.
Importing a fence payload from Windows handles does not transfer ownership
of the handle to the Vulkan implementation.
For handle types defined as NT handles, the application must release
ownership using the CloseHandle
system call when the handle is no
longer needed.
Applications can import the same fence payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.
The VkImportFenceWin32HandleInfoKHR
structure is defined as:
typedef struct VkImportFenceWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkFenceImportFlags flags;
VkExternalFenceHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportFenceWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
fence
is the fence into which the state will be imported. -
flags
is a bitmask of VkFenceImportFlagBits specifying additional parameters for the fence payload import operation. -
handleType
specifies the type of handle
. -
handle
is the external handle to import, or NULL
. -
name
is the NULL-terminated UTF-16 string naming the underlying synchronization primitive to import, or NULL
.
The handle types supported by handleType
are:
Handle Type | Transference | Permanence Supported |
---|---|---|
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT | Reference | Temporary, Permanent |
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT | Reference | Temporary, Permanent |
To import a fence payload from a POSIX file descriptor, call:
VkResult vkImportFenceFdKHR(
VkDevice device,
const VkImportFenceFdInfoKHR* pImportFenceFdInfo);
-
device
is the logical device that created the fence. -
pImportFenceFdInfo
points to a VkImportFenceFdInfoKHR structure specifying the fence and import parameters.
Importing a fence payload from a file descriptor transfers ownership of the file descriptor from the application to the Vulkan implementation. The application must not perform any operations on the file descriptor after a successful import.
Applications can import the same fence payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.
The VkImportFenceFdInfoKHR
structure is defined as:
typedef struct VkImportFenceFdInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkFenceImportFlags flags;
VkExternalFenceHandleTypeFlagBits handleType;
int fd;
} VkImportFenceFdInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
fence
is the fence into which the payload will be imported. -
flags
is a bitmask of VkFenceImportFlagBits specifying additional parameters for the fence payload import operation. -
handleType
specifies the type of fd
. -
fd
is the external handle to import.
The handle types supported by handleType
are:
Handle Type | Transference | Permanence Supported |
---|---|---|
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT | Reference | Temporary, Permanent |
VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT | Copy | Temporary |
If handleType
is VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT
, the
special value -1
for fd
is treated like a valid sync file descriptor
referring to an object that has already signaled.
The import operation will succeed and the VkFence
will have a
temporarily imported payload as if a valid file descriptor had been
provided.
Note
This special behavior for importing an invalid sync file descriptor allows
easier interoperability with other system APIs which use the convention that
an invalid sync file descriptor represents work that has already completed
and does not need to be waited for.
It is consistent with the option for implementations to return a -1 file descriptor from vkGetFenceFdKHR when the fence is already signaled. |
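As an informal sketch, temporarily importing a sync file descriptor into an existing fence might look as follows, assuming VK_KHR_external_fence_fd is enabled; passing -1 as the descriptor imports an already-signaled payload as described above.
#include <vulkan/vulkan.h>

/* Informal sketch: temporarily import a sync file descriptor into a fence.
 * Sync FD handles have copy transference and only temporary permanence, so
 * VK_FENCE_IMPORT_TEMPORARY_BIT is set.  Ownership of a valid fd passes to
 * the implementation on success. */
static VkResult importSyncFdIntoFence(VkDevice device, VkFence fence, int fd)
{
    PFN_vkImportFenceFdKHR pfnImportFenceFdKHR =
        (PFN_vkImportFenceFdKHR)vkGetDeviceProcAddr(device, "vkImportFenceFdKHR");
    if (pfnImportFenceFdKHR == NULL)
        return VK_ERROR_EXTENSION_NOT_PRESENT;

    const VkImportFenceFdInfoKHR importInfo = {
        .sType = VK_STRUCTURE_TYPE_IMPORT_FENCE_FD_INFO_KHR,
        .fence = fence,
        .flags = VK_FENCE_IMPORT_TEMPORARY_BIT,
        .handleType = VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT,
        .fd = fd,
    };
    return pfnImportFenceFdKHR(device, &importInfo);
}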
Bits which can be set in
VkImportFenceWin32HandleInfoKHR::flags
and
VkImportFenceFdInfoKHR::flags
specifying additional parameters of a fence import operation are:
typedef enum VkFenceImportFlagBits {
VK_FENCE_IMPORT_TEMPORARY_BIT = 0x00000001,
VK_FENCE_IMPORT_TEMPORARY_BIT_KHR = VK_FENCE_IMPORT_TEMPORARY_BIT,
} VkFenceImportFlagBits;
or the equivalent
typedef VkFenceImportFlagBits VkFenceImportFlagBitsKHR;
-
VK_FENCE_IMPORT_TEMPORARY_BIT
specifies that the fence payload will be imported only temporarily, as described in Importing Fence Payloads, regardless of the permanence of handleType
.
typedef VkFlags VkFenceImportFlags;
or the equivalent
typedef VkFenceImportFlags VkFenceImportFlagsKHR;
VkFenceImportFlags
is a bitmask type for setting a mask of zero or
more VkFenceImportFlagBits.
6.4. Semaphores
Semaphores are a synchronization primitive that can be used to insert a dependency between batches submitted to queues. Semaphores have two states - signaled and unsignaled. The state of a semaphore can be signaled after execution of a batch of commands is completed. A batch can wait for a semaphore to become signaled before it begins execution, and the semaphore is also unsignaled before the batch begins execution.
As with most objects in Vulkan, semaphores are an interface to internal data which is typically opaque to applications. This internal data is referred to as a semaphore’s payload.
However, in order to enable communication with agents outside of the current device, it is necessary to be able to export that payload to a commonly understood format, and subsequently import from that format as well.
The internal data of a semaphore may include a reference to any resources and pending work associated with signal or unsignal operations performed on that semaphore object. Mechanisms to import and export that internal data to and from semaphores are provided below. These mechanisms indirectly enable applications to share semaphore state between two or more semaphores and other synchronization primitives across process and API boundaries.
Semaphores are represented by VkSemaphore
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSemaphore)
To create a semaphore, call:
VkResult vkCreateSemaphore(
VkDevice device,
const VkSemaphoreCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSemaphore* pSemaphore);
-
device
is the logical device that creates the semaphore. -
pCreateInfo
is a pointer to an instance of the VkSemaphoreCreateInfo
structure which contains information about how the semaphore is to be created. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pSemaphore
points to a handle in which the resulting semaphore object is returned.
When created, the semaphore is in the unsignaled state.
The VkSemaphoreCreateInfo
structure is defined as:
typedef struct VkSemaphoreCreateInfo {
VkStructureType sType;
const void* pNext;
VkSemaphoreCreateFlags flags;
} VkSemaphoreCreateInfo;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use.
typedef VkFlags VkSemaphoreCreateFlags;
VkSemaphoreCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
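For example, a semaphore might be created as in the following sketch; the device handle is assumed to exist.
#include <vulkan/vulkan.h>

/* Informal sketch: create a semaphore; it starts in the unsignaled state. */
static VkResult createSemaphore(VkDevice device, VkSemaphore* pSemaphore)
{
    const VkSemaphoreCreateInfo createInfo = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
    };
    return vkCreateSemaphore(device, &createInfo, NULL, pSemaphore);
}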
To create a semaphore whose payload can be exported to external handles,
add the VkExportSemaphoreCreateInfo structure to the pNext
chain
of the VkSemaphoreCreateInfo structure.
The VkExportSemaphoreCreateInfo
structure is defined as:
typedef struct VkExportSemaphoreCreateInfo {
VkStructureType sType;
const void* pNext;
VkExternalSemaphoreHandleTypeFlags handleTypes;
} VkExportSemaphoreCreateInfo;
or the equivalent
typedef VkExportSemaphoreCreateInfo VkExportSemaphoreCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleTypes
is a bitmask of VkExternalSemaphoreHandleTypeFlagBits specifying one or more semaphore handle types the application can export from the resulting semaphore. The application can request multiple handle types for the same semaphore.
To specify additional attributes of NT handles exported from a semaphore,
add the VkExportSemaphoreWin32HandleInfoKHR
structure to the
pNext
chain of the VkSemaphoreCreateInfo structure.
The VkExportSemaphoreWin32HandleInfoKHR
structure is defined as:
typedef struct VkExportSemaphoreWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportSemaphoreWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
pAttributes
is a pointer to a Windows SECURITY_ATTRIBUTES
structure specifying security attributes of the handle. -
dwAccess
is a DWORD
specifying access rights of the handle. -
name
is a NULL-terminated UTF-16 string to associate with the underlying synchronization primitive referenced by NT handles exported from the created semaphore.
If this structure is not present, or if pAttributes
is set to NULL
,
default security descriptor values will be used, and child processes created
by the application will not inherit the handle, as described in the MSDN
documentation for “Synchronization Object Security and Access Rights”1.
Further, if the structure is not present, the access rights will be
DXGI_SHARED_RESOURCE_READ
| DXGI_SHARED_RESOURCE_WRITE
for handles of the following types:
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT
And
GENERIC_ALL
for handles of the following types:
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT
To export a Windows handle representing the payload of a semaphore, call:
VkResult vkGetSemaphoreWin32HandleKHR(
VkDevice device,
const VkSemaphoreGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
-
device
is the logical device that created the semaphore being exported. -
pGetWin32HandleInfo
is a pointer to an instance of the VkSemaphoreGetWin32HandleInfoKHR structure containing parameters of the export operation. -
pHandle
will return the Windows handle representing the semaphore state.
For handle types defined as NT handles, the handles returned by
vkGetSemaphoreWin32HandleKHR
are owned by the application.
To avoid leaking resources, the application must release ownership of them
using the CloseHandle
system call when they are no longer needed.
Exporting a Windows handle from a semaphore may have side effects depending on the transference of the specified handle type, as described in Importing Semaphore Payloads.
The VkSemaphoreGetWin32HandleInfoKHR
structure is defined as:
typedef struct VkSemaphoreGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkExternalSemaphoreHandleTypeFlagBits handleType;
} VkSemaphoreGetWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
semaphore
is the semaphore from which state will be exported. -
handleType
is the type of handle requested.
The properties of the handle returned depend on the value of
handleType
.
See VkExternalSemaphoreHandleTypeFlagBits for a description of the
properties of the defined external semaphore handle types.
To export a POSIX file descriptor representing the payload of a semaphore, call:
VkResult vkGetSemaphoreFdKHR(
VkDevice device,
const VkSemaphoreGetFdInfoKHR* pGetFdInfo,
int* pFd);
-
device
is the logical device that created the semaphore being exported. -
pGetFdInfo
is a pointer to an instance of the VkSemaphoreGetFdInfoKHR structure containing parameters of the export operation. -
pFd
will return the file descriptor representing the semaphore payload.
Each call to vkGetSemaphoreFdKHR
must create a new file descriptor
and transfer ownership of it to the application.
To avoid leaking resources, the application must release ownership of the
file descriptor when it is no longer needed.
Note
Ownership can be released in many ways. For example, the application can call close() on the file descriptor, or transfer ownership back to Vulkan by using the file descriptor to import a semaphore payload.
Where supported by the operating system, the implementation must set the
file descriptor to be closed automatically when an execve
system call
is made.
Exporting a file descriptor from a semaphore may have side effects depending on the transference of the specified handle type, as described in Importing Semaphore State.
The VkSemaphoreGetFdInfoKHR
structure is defined as:
typedef struct VkSemaphoreGetFdInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkExternalSemaphoreHandleTypeFlagBits handleType;
} VkSemaphoreGetFdInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
semaphore
is the semaphore from which state will be exported. -
handleType
is the type of handle requested.
The properties of the file descriptor returned depend on the value of
handleType
.
See VkExternalSemaphoreHandleTypeFlagBits for a description of the
properties of the defined external semaphore handle types.
To destroy a semaphore, call:
void vkDestroySemaphore(
VkDevice device,
VkSemaphore semaphore,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the semaphore. -
semaphore
is the handle of the semaphore to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
6.4.1. Semaphore Signaling
When a batch is submitted to a queue via a queue submission, and it includes semaphores to be signaled, it defines a memory dependency on the batch, and defines semaphore signal operations which set the semaphores to the signaled state.
The first synchronization scope includes every command submitted in the same batch. Semaphore signal operations that are defined by vkQueueSubmit additionally include all commands that occur earlier in submission order.
The second synchronization scope includes only the semaphore signal operation.
The first access scope includes all memory access performed by the device.
The second access scope is empty.
6.4.2. Semaphore Waiting & Unsignaling
When a batch is submitted to a queue via a queue submission, and it includes semaphores to be waited on, it defines a memory dependency between prior semaphore signal operations and the batch, and defines semaphore unsignal operations which set the semaphores to the unsignaled state.
The first synchronization scope includes all semaphore signal operations that operate on semaphores waited on in the same batch, and that happen-before the wait completes.
The second synchronization scope
includes every command submitted in the same batch.
In the case of vkQueueSubmit, the second synchronization scope is
limited to operations on the pipeline stages determined by the
destination stage mask specified
by the corresponding element of pWaitDstStageMask
.
Also, in the case of vkQueueSubmit, the second synchronization scope
additionally includes all commands that occur later in
submission order.
The first access scope is empty.
The second access scope includes all memory access performed by the device.
The semaphore unsignal operation happens-after the first set of operations in the execution dependency, and happens-before the second set of operations in the execution dependency.
Note
Unlike fences or events, the act of waiting for a semaphore also unsignals that semaphore. If two operations are separately specified to wait for the same semaphore, and there are no other execution dependencies between those operations, behaviour is undefined. An execution dependency must be present that guarantees that the semaphore unsignal operation for the first of those waits, happens-before the semaphore is signalled again, and before the second unsignal operation. Semaphore waits and signals should thus occur in discrete 1:1 pairs. |
Note
A common scenario for using pWaitDstStageMask with a value other than VK_PIPELINE_STAGE_ALL_COMMANDS_BIT is waiting on a semaphore signaled by the presentation engine before writing to the acquired presentable image. If an image layout transition needs to be performed on a presentable image before it is used in a framebuffer, that can be performed as the first operation submitted to the queue after acquiring the image, and should not prevent other work from overlapping with the presentation operation. For example, an image memory barrier can use VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT as both its source and destination stage masks, with a srcAccessMask of 0 and a dstAccessMask of VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT, transitioning the image from VK_IMAGE_LAYOUT_PRESENT_SRC_KHR to VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. Alternatively, oldLayout can be VK_IMAGE_LAYOUT_UNDEFINED if the image contents need not be preserved. This barrier accomplishes a dependency chain between previous presentation operations and subsequent color attachment output operations, with the layout transition performed in between, and does not introduce a dependency between previous work and any vertex processing stages. More precisely, the semaphore signals after the presentation operation completes, the semaphore wait stalls the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT stage, and the dependency from that same stage to itself allows the layout transition to be performed in between.
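Combining semaphore waits and signals, a sketch of a queue submission that waits on one semaphore (for example, one signaled by vkAcquireNextImageKHR) at the color attachment output stage and signals another semaphore on completion might look as follows; the handles are assumed to have been created elsewhere.
#include <vulkan/vulkan.h>

/* Informal sketch: submit one command buffer that waits on waitSemaphore at
 * the color attachment output stage and signals signalSemaphore and fence
 * when the batch completes.  Each semaphore wait consumes exactly one prior
 * signal. */
static VkResult submitWithSemaphores(VkQueue queue, VkCommandBuffer commandBuffer,
                                     VkSemaphore waitSemaphore,
                                     VkSemaphore signalSemaphore, VkFence fence)
{
    const VkPipelineStageFlags waitDstStageMask =
        VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;

    const VkSubmitInfo submitInfo = {
        .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
        .waitSemaphoreCount = 1,
        .pWaitSemaphores = &waitSemaphore,
        .pWaitDstStageMask = &waitDstStageMask,
        .commandBufferCount = 1,
        .pCommandBuffers = &commandBuffer,
        .signalSemaphoreCount = 1,
        .pSignalSemaphores = &signalSemaphore,
    };
    return vkQueueSubmit(queue, 1, &submitInfo, fence);
}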
6.4.3. Semaphore State Requirements For Wait Operations
Before waiting on a semaphore, the application must ensure the semaphore is in a valid state for a wait operation. Specifically, when a semaphore wait and unsignal operation is submitted to a queue:
-
The semaphore must be signaled, or have an associated semaphore signal operation that is pending execution.
-
There must be no other queue waiting on the same semaphore when the operation executes.
6.4.4. Importing Semaphore Payloads
Applications can import a semaphore payload into an existing semaphore
using an external semaphore handle.
The effects of the import operation will be either temporary or permanent,
as specified by the application.
If the import is temporary, the implementation must restore the semaphore
to its prior permanent state after submitting the next semaphore wait
operation.
Performing a subsequent temporary import on a semaphore before performing a
semaphore wait has no effect on this requirement; the next wait submitted on
the semaphore must still restore its last permanent state.
A permanent payload import behaves as if the target semaphore was destroyed,
and a new semaphore was created with the same handle but the imported
payload.
Because importing a semaphore payload temporarily or permanently detaches
the existing payload from a semaphore, similar usage restrictions to those
applied to vkDestroySemaphore
are applied to any command that imports
a semaphore payload.
Which of these import types is used is referred to as the import operation’s
permanence.
Each handle type supports either one or both types of permanence.
The implementation must perform the import operation by either referencing or copying the payload referred to by the specified external semaphore handle, depending on the handle’s type. The import method used is referred to as the handle type’s transference. When using handle types with reference transference, importing a payload to a semaphore adds the semaphore to the set of all semaphores sharing that payload. This set includes the semaphore from which the payload was exported. Semaphore signaling and waiting operations performed on any semaphore in the set must behave as if the set were a single semaphore. Importing a payload using handle types with copy transference creates a duplicate copy of the payload at the time of import, but makes no further reference to it. Semaphore signaling and waiting operations performed on the target of copy imports must not affect any other semaphore or payload.
Export operations have the same transference as the specified handle type’s import operations. Additionally, exporting a semaphore payload to a handle with copy transference has the same side effects on the source semaphore’s payload as executing a semaphore wait operation. If the semaphore was using a temporarily imported payload, the semaphore’s prior permanent payload will be restored.
Note
The tables Handle Types Supported by VkImportSemaphoreWin32HandleInfoKHR and Handle Types Supported by VkImportSemaphoreFdInfoKHR define the permanence and transference of each handle type. |
External synchronization allows
implementations to modify an object’s internal state, i.e. payload, without
internal synchronization.
However, for semaphores sharing a payload across processes, satisfying the
external synchronization requirements of VkSemaphore
parameters as if
all semaphores in the set were the same object is sometimes infeasible.
Satisfying the wait operation
state requirements would similarly require impractical coordination or
levels of trust between processes.
Therefore, these constraints only apply to a specific semaphore handle, not
to its payload.
For distinct semaphore objects which share a payload, if the semaphores are
passed to separate queue submission commands concurrently, behavior will be
as if the commands were called in an arbitrary sequential order.
If the wait operation state
requirements are violated for the shared payload by a queue submission
command, or if a signal operation is queued for a shared payload that is
already signaled or has a pending signal operation, effects must be limited
to one or more of the following:
- Returning VK_ERROR_INITIALIZATION_FAILED from the command which resulted in the violation.
- Losing the logical device on which the violation occurred immediately or at a future time, resulting in a VK_ERROR_DEVICE_LOST error from subsequent commands, including the one causing the violation.
- Continuing execution of the violating command or operation as if the semaphore wait completed successfully after an implementation-dependent timeout. In this case, the state of the payload becomes undefined, and future operations on semaphores sharing the payload will be subject to these same rules. The semaphore must be destroyed or have its payload replaced by an import operation to again have a well-defined state.
Note
These rules allow processes to synchronize access to shared memory without trusting each other. However, such processes must still be cautious not to use the shared semaphore for more than synchronizing access to the shared memory. For example, a process should not use a shared semaphore as part of an execution dependency chain that, when complete, leads to objects being destroyed, if it does not trust other processes sharing the semaphore payload.
When a semaphore is using an imported payload, its
VkExportSemaphoreCreateInfo::handleTypes
value is that specified
when creating the semaphore from which the payload was exported, rather than
that specified when creating the semaphore.
Additionally,
VkExternalSemaphoreProperties::exportFromImportedHandleTypes restricts
which handle types can be exported from such a semaphore based on the
specific handle type used to import the current payload.
Passing a semaphore to vkAcquireNextImageKHR is equivalent to
temporarily importing a semaphore payload to that semaphore.
Note
Because the exportable handle types of an imported semaphore correspond to its current imported payload, and vkAcquireNextImageKHR behaves the same as a temporary import operation for which the source semaphore is opaque to the application, applications have no way of determining whether any external handle types can be exported from a semaphore in this state. Therefore, applications must not attempt to export external handles from semaphores using a temporarily imported payload from vkAcquireNextImageKHR.
When importing a semaphore payload, it is the responsibility of the
application to ensure the external handles meet all valid usage
requirements.
However, implementations must perform sufficient validation of external
handles to ensure that the operation results in a valid semaphore which will
not cause program termination, device loss, queue stalls, or corruption of
other resources when used as allowed according to its import parameters, and
excepting those side effects allowed for violations of the
valid semaphore state for wait
operations rules.
If the external handle provided does not meet these requirements, the
implementation must fail the semaphore payload import operation with the
error code VK_ERROR_INVALID_EXTERNAL_HANDLE
.
To import a semaphore payload from a Windows handle, call:
VkResult vkImportSemaphoreWin32HandleKHR(
VkDevice device,
const VkImportSemaphoreWin32HandleInfoKHR* pImportSemaphoreWin32HandleInfo);
- device is the logical device that created the semaphore.
- pImportSemaphoreWin32HandleInfo points to a VkImportSemaphoreWin32HandleInfoKHR structure specifying the semaphore and import parameters.
Importing a semaphore payload from Windows handles does not transfer
ownership of the handle to the Vulkan implementation.
For handle types defined as NT handles, the application must release
ownership using the CloseHandle
system call when the handle is no
longer needed.
Applications can import the same semaphore payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.
The VkImportSemaphoreWin32HandleInfoKHR
structure is defined as:
typedef struct VkImportSemaphoreWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkSemaphoreImportFlags flags;
VkExternalSemaphoreHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportSemaphoreWin32HandleInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- semaphore is the semaphore into which the payload will be imported.
- flags is a bitmask of VkSemaphoreImportFlagBits specifying additional parameters for the semaphore payload import operation.
- handleType specifies the type of handle.
- handle is the external handle to import, or NULL.
- name is a NULL-terminated UTF-16 string naming the underlying synchronization primitive to import, or NULL.
The handle types supported by handleType are:

Handle Type | Transference | Permanence Supported |
---|---|---|
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT | Reference | Temporary, Permanent |
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT | Reference | Temporary, Permanent |
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT | Reference | Temporary, Permanent |
To import a semaphore payload from a POSIX file descriptor, call:
VkResult vkImportSemaphoreFdKHR(
VkDevice device,
const VkImportSemaphoreFdInfoKHR* pImportSemaphoreFdInfo);
- device is the logical device that created the semaphore.
- pImportSemaphoreFdInfo points to a VkImportSemaphoreFdInfoKHR structure specifying the semaphore and import parameters.
Importing a semaphore payload from a file descriptor transfers ownership of the file descriptor from the application to the Vulkan implementation. The application must not perform any operations on the file descriptor after a successful import.
Applications can import the same semaphore payload into multiple instances of Vulkan, into the same instance from which it was exported, and multiple times into a given Vulkan instance.
The VkImportSemaphoreFdInfoKHR
structure is defined as:
typedef struct VkImportSemaphoreFdInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkSemaphoreImportFlags flags;
VkExternalSemaphoreHandleTypeFlagBits handleType;
int fd;
} VkImportSemaphoreFdInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- semaphore is the semaphore into which the payload will be imported.
- flags is a bitmask of VkSemaphoreImportFlagBits specifying additional parameters for the semaphore payload import operation.
- handleType specifies the type of fd.
- fd is the external handle to import.
The handle types supported by handleType are:

Handle Type | Transference | Permanence Supported |
---|---|---|
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT | Reference | Temporary, Permanent |
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT | Copy | Temporary |
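The following non-normative sketch shows such an import, assuming the VK_KHR_external_semaphore_fd extension is enabled, the vkImportSemaphoreFdKHR entry point has been obtained, and that device, semaphore, and payloadFd (a file descriptor received from elsewhere; all names are hypothetical) are valid:

/* Temporarily import an externally produced sync fd payload. */
VkImportSemaphoreFdInfoKHR importInfo = {
    .sType      = VK_STRUCTURE_TYPE_IMPORT_SEMAPHORE_FD_INFO_KHR,
    .pNext      = NULL,
    .semaphore  = semaphore,
    .flags      = VK_SEMAPHORE_IMPORT_TEMPORARY_BIT,   /* sync fd payloads only support temporary imports */
    .handleType = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT,
    .fd         = payloadFd,
};
VkResult result = vkImportSemaphoreFdKHR(device, &importInfo);
if (result == VK_ERROR_INVALID_EXTERNAL_HANDLE) {
    /* The handle did not meet the import requirements. */
}
/* On success, ownership of payloadFd has transferred to the implementation;
 * the application must not use or close it afterwards. */

On success, the next semaphore wait operation submitted on semaphore consumes the imported payload and then restores the semaphore's prior permanent state.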
Additional parameters of a semaphore import operation are specified by
VkImportSemaphoreWin32HandleInfoKHR::flags
or
VkImportSemaphoreFdInfoKHR::flags
.
Bits which can be set include:
typedef enum VkSemaphoreImportFlagBits {
VK_SEMAPHORE_IMPORT_TEMPORARY_BIT = 0x00000001,
VK_SEMAPHORE_IMPORT_TEMPORARY_BIT_KHR = VK_SEMAPHORE_IMPORT_TEMPORARY_BIT,
} VkSemaphoreImportFlagBits;
or the equivalent
typedef VkSemaphoreImportFlagBits VkSemaphoreImportFlagBitsKHR;
These bits have the following meanings:
- VK_SEMAPHORE_IMPORT_TEMPORARY_BIT specifies that the semaphore payload will be imported only temporarily, as described in Importing Semaphore Payloads, regardless of the permanence of handleType.
typedef VkFlags VkSemaphoreImportFlags;
or the equivalent
typedef VkSemaphoreImportFlags VkSemaphoreImportFlagsKHR;
VkSemaphoreImportFlags
is a bitmask type for setting a mask of zero or
more VkSemaphoreImportFlagBits.
6.5. Events
Events are a synchronization primitive that can be used to insert a fine-grained dependency between commands submitted to the same queue, or between the host and a queue. Events must not be used to insert a dependency between commands submitted to different queues. Events have two states - signaled and unsignaled. An application can signal an event, or unsignal it, on either the host or the device. A device can wait for an event to become signaled before executing further operations. No command exists to wait for an event to become signaled on the host, but the current state of an event can be queried.
Events are represented by VkEvent
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkEvent)
To create an event, call:
VkResult vkCreateEvent(
VkDevice device,
const VkEventCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkEvent* pEvent);
- device is the logical device that creates the event.
- pCreateInfo is a pointer to an instance of the VkEventCreateInfo structure which contains information about how the event is to be created.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pEvent points to a handle in which the resulting event object is returned.
When created, the event object is in the unsignaled state.
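For illustration only, a minimal sketch of event creation and destruction (assuming device is a valid logical device) might look like:

/* Create an event; it is initially unsignaled. */
VkEventCreateInfo eventInfo = {
    .sType = VK_STRUCTURE_TYPE_EVENT_CREATE_INFO,
    .pNext = NULL,
    .flags = 0,                      /* reserved for future use */
};
VkEvent event = VK_NULL_HANDLE;
VkResult result = vkCreateEvent(device, &eventInfo, NULL, &event);
/* ... signal, reset, wait on, or query the event ... */
vkDestroyEvent(device, event, NULL);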
The VkEventCreateInfo
structure is defined as:
typedef struct VkEventCreateInfo {
VkStructureType sType;
const void* pNext;
VkEventCreateFlags flags;
} VkEventCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
typedef VkFlags VkEventCreateFlags;
VkEventCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
To destroy an event, call:
void vkDestroyEvent(
VkDevice device,
VkEvent event,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the event.
- event is the handle of the event to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
To query the state of an event from the host, call:
VkResult vkGetEventStatus(
VkDevice device,
VkEvent event);
- device is the logical device that owns the event.
- event is the handle of the event to query.
Upon success, vkGetEventStatus
returns the state of the event object
with the following return codes:
Status | Meaning |
---|---|
VK_EVENT_SET | The event specified by event is signaled. |
VK_EVENT_RESET | The event specified by event is unsignaled. |
If a vkCmdSetEvent
or vkCmdResetEvent
command is in a command
buffer that is in the pending state, then the
value returned by this command may immediately be out of date.
The state of an event can be updated by the host.
The state of the event is immediately changed, and subsequent calls to
vkGetEventStatus
will return the new state.
If an event is already in the requested state, then updating it to the same
state has no effect.
To set the state of an event to signaled from the host, call:
VkResult vkSetEvent(
VkDevice device,
VkEvent event);
- device is the logical device that owns the event.
- event is the event to set.
When vkSetEvent is executed on the host, it defines an event signal operation which sets the event to the signaled state.
If event
is already in the signaled state when vkSetEvent is
executed, then vkSetEvent has no effect, and no event signal operation
occurs.
To set the state of an event to unsignaled from the host, call:
VkResult vkResetEvent(
VkDevice device,
VkEvent event);
- device is the logical device that owns the event.
- event is the event to reset.
When vkResetEvent is executed on the host, it defines an event unsignal operation which resets the event to the unsignaled state.
If event
is already in the unsignaled state when vkResetEvent is
executed, then vkResetEvent has no effect, and no event unsignal
operation occurs.
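The host-side commands above can be combined as in the following non-normative sketch (device and event assumed valid):

vkSetEvent(device, event);                        /* event signal operation   */
VkResult status = vkGetEventStatus(device, event);
/* status is VK_EVENT_SET unless the device has since reset the event. */
vkResetEvent(device, event);                      /* event unsignal operation */
/* If a pending command buffer contains vkCmdSetEvent or vkCmdResetEvent for
 * this event, the value returned by vkGetEventStatus may already be stale. */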
The state of an event can also be updated on the device by commands inserted in command buffers.
To set the state of an event to signaled from a device, call:
void vkCmdSetEvent(
VkCommandBuffer commandBuffer,
VkEvent event,
VkPipelineStageFlags stageMask);
- commandBuffer is the command buffer into which the command is recorded.
- event is the event that will be signaled.
- stageMask specifies the source stage mask used to determine when the event is signaled.
When vkCmdSetEvent is submitted to a queue, it defines an execution dependency on commands that were submitted before it, and defines an event signal operation which sets the event to the signaled state.
The first synchronization scope
includes all commands that occur earlier in
submission order.
The synchronization scope is limited to operations on the pipeline stages
determined by the source stage
mask specified by stageMask
.
The second synchronization scope includes only the event signal operation.
If event
is already in the signaled state when vkCmdSetEvent is
executed on the device, then vkCmdSetEvent has no effect, no event
signal operation occurs, and no execution dependency is generated.
To set the state of an event to unsignaled from a device, call:
void vkCmdResetEvent(
VkCommandBuffer commandBuffer,
VkEvent event,
VkPipelineStageFlags stageMask);
- commandBuffer is the command buffer into which the command is recorded.
- event is the event that will be unsignaled.
- stageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask used to determine when the event is unsignaled.
When vkCmdResetEvent is submitted to a queue, it defines an execution dependency on commands that were submitted before it, and defines an event unsignal operation which resets the event to the unsignaled state.
The first synchronization scope
includes all commands that occur earlier in
submission order.
The synchronization scope is limited to operations on the pipeline stages
determined by the source stage
mask specified by stageMask
.
The second synchronization scope includes only the event unsignal operation.
If event
is already in the unsignaled state when vkCmdResetEvent
is executed on the device, then vkCmdResetEvent has no effect, no
event unsignal operation occurs, and no execution dependency is generated.
To wait for one or more events to enter the signaled state on a device, call:
void vkCmdWaitEvents(
VkCommandBuffer commandBuffer,
uint32_t eventCount,
const VkEvent* pEvents,
VkPipelineStageFlags srcStageMask,
VkPipelineStageFlags dstStageMask,
uint32_t memoryBarrierCount,
const VkMemoryBarrier* pMemoryBarriers,
uint32_t bufferMemoryBarrierCount,
const VkBufferMemoryBarrier* pBufferMemoryBarriers,
uint32_t imageMemoryBarrierCount,
const VkImageMemoryBarrier* pImageMemoryBarriers);
- commandBuffer is the command buffer into which the command is recorded.
- eventCount is the length of the pEvents array.
- pEvents is an array of event object handles to wait on.
- srcStageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask.
- dstStageMask is a bitmask of VkPipelineStageFlagBits specifying the destination stage mask.
- memoryBarrierCount is the length of the pMemoryBarriers array.
- pMemoryBarriers is a pointer to an array of VkMemoryBarrier structures.
- bufferMemoryBarrierCount is the length of the pBufferMemoryBarriers array.
- pBufferMemoryBarriers is a pointer to an array of VkBufferMemoryBarrier structures.
- imageMemoryBarrierCount is the length of the pImageMemoryBarriers array.
- pImageMemoryBarriers is a pointer to an array of VkImageMemoryBarrier structures.
When vkCmdWaitEvents
is submitted to a queue, it defines a memory
dependency between prior event signal operations on the same queue or the
host, and subsequent commands.
vkCmdWaitEvents
must not be used to wait on event signal operations
occurring on other queues.
The first synchronization scope only includes event signal operations that
operate on members of pEvents
, and the operations that happened-before
the event signal operations.
Event signal operations performed by vkCmdSetEvent that occur earlier
in submission order are included in the
first synchronization scope, if the logically latest pipeline stage in their stageMask
parameter is
logically earlier than or equal
to the logically latest pipeline
stage in srcStageMask
.
Event signal operations performed by vkSetEvent are only included in
the first synchronization scope if VK_PIPELINE_STAGE_HOST_BIT
is
included in srcStageMask
.
The second synchronization scope
includes all commands that occur later in
submission order.
The second synchronization scope is limited to operations on the pipeline
stages determined by the destination stage mask specified by dstStageMask
.
The first access scope is
limited to access in the pipeline stages determined by the
source stage mask specified by
srcStageMask
.
Within that, the first access scope only includes the first access scopes
defined by elements of the pMemoryBarriers
,
pBufferMemoryBarriers
and pImageMemoryBarriers
arrays, which
each define a set of memory barriers.
If no memory barriers are specified, then the first access scope includes no
accesses.
The second access scope is
limited to access in the pipeline stages determined by the
destination stage mask specified
by dstStageMask
.
Within that, the second access scope only includes the second access scopes
defined by elements of the pMemoryBarriers
,
pBufferMemoryBarriers
and pImageMemoryBarriers
arrays, which
each define a set of memory barriers.
If no memory barriers are specified, then the second access scope includes
no accesses.
Note
vkCmdWaitEvents is used with vkCmdSetEvent to define a memory dependency between two sets of action commands, roughly in the same way as pipeline barriers, but split into two commands such that work between the two may execute unhindered.
Note
Applications should be careful to avoid race conditions when using events. There is no direct ordering guarantee between a vkCmdResetEvent command and a vkCmdWaitEvents command submitted after it, so some other execution dependency must be included between these commands (e.g. a semaphore).
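The following non-normative sketch illustrates the split-barrier pattern described in the first note, assuming commandBuffer and event are valid and that the data produced by the dispatch is later read in a fragment shader:

/* Producer: compute work that writes data. */
vkCmdDispatch(commandBuffer, 64, 1, 1);

/* Signal the event once compute shader execution completes. */
vkCmdSetEvent(commandBuffer, event, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT);

/* ... unrelated work recorded here is not held back by the dependency ... */

/* Consumer side of the split barrier. */
VkMemoryBarrier barrier = {
    .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
    .pNext         = NULL,
    .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
};
vkCmdWaitEvents(commandBuffer, 1, &event,
                VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,    /* srcStageMask */
                VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,   /* dstStageMask */
                1, &barrier, 0, NULL, 0, NULL);
/* Fragment shader reads recorded after this point see the compute writes. */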
6.6. Pipeline Barriers
vkCmdPipelineBarrier is a synchronization command that inserts a dependency between commands submitted to the same queue, or between commands in the same subpass.
To record a pipeline barrier, call:
void vkCmdPipelineBarrier(
VkCommandBuffer commandBuffer,
VkPipelineStageFlags srcStageMask,
VkPipelineStageFlags dstStageMask,
VkDependencyFlags dependencyFlags,
uint32_t memoryBarrierCount,
const VkMemoryBarrier* pMemoryBarriers,
uint32_t bufferMemoryBarrierCount,
const VkBufferMemoryBarrier* pBufferMemoryBarriers,
uint32_t imageMemoryBarrierCount,
const VkImageMemoryBarrier* pImageMemoryBarriers);
- commandBuffer is the command buffer into which the command is recorded.
- srcStageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask.
- dstStageMask is a bitmask of VkPipelineStageFlagBits specifying the destination stage mask.
- dependencyFlags is a bitmask of VkDependencyFlagBits specifying how execution and memory dependencies are formed.
- memoryBarrierCount is the length of the pMemoryBarriers array.
- pMemoryBarriers is a pointer to an array of VkMemoryBarrier structures.
- bufferMemoryBarrierCount is the length of the pBufferMemoryBarriers array.
- pBufferMemoryBarriers is a pointer to an array of VkBufferMemoryBarrier structures.
- imageMemoryBarrierCount is the length of the pImageMemoryBarriers array.
- pImageMemoryBarriers is a pointer to an array of VkImageMemoryBarrier structures.
When vkCmdPipelineBarrier is submitted to a queue, it defines a memory dependency between commands that were submitted before it, and those submitted after it.
If vkCmdPipelineBarrier was recorded outside a render pass instance,
the first synchronization scope
includes all commands that occur earlier in
submission order.
If vkCmdPipelineBarrier was recorded inside a render pass instance,
the first synchronization scope includes only commands that occur earlier in
submission order within the same
subpass.
In either case, the first synchronization scope is limited to operations on
the pipeline stages determined by the
source stage mask specified by
srcStageMask
.
If vkCmdPipelineBarrier was recorded outside a render pass instance,
the second synchronization scope
includes all commands that occur later in
submission order.
If vkCmdPipelineBarrier was recorded inside a render pass instance,
the second synchronization scope includes only commands that occur later in
submission order within the same
subpass.
In either case, the second synchronization scope is limited to operations on
the pipeline stages determined by the
destination stage mask specified
by dstStageMask
.
The first access scope is
limited to access in the pipeline stages determined by the
source stage mask specified by
srcStageMask
.
Within that, the first access scope only includes the first access scopes
defined by elements of the pMemoryBarriers
,
pBufferMemoryBarriers
and pImageMemoryBarriers
arrays, which
each define a set of memory barriers.
If no memory barriers are specified, then the first access scope includes no
accesses.
The second access scope is
limited to access in the pipeline stages determined by the
destination stage mask specified
by dstStageMask
.
Within that, the second access scope only includes the second access scopes
defined by elements of the pMemoryBarriers
,
pBufferMemoryBarriers
and pImageMemoryBarriers
arrays, which
each define a set of memory barriers.
If no memory barriers are specified, then the second access scope includes
no accesses.
If dependencyFlags
includes VK_DEPENDENCY_BY_REGION_BIT
, then
any dependency between framebuffer-space pipeline stages is
framebuffer-local - otherwise it is
framebuffer-global.
Bits which can be set in vkCmdPipelineBarrier
::dependencyFlags
,
specifying how execution and memory dependencies are formed, are:
typedef enum VkDependencyFlagBits {
VK_DEPENDENCY_BY_REGION_BIT = 0x00000001,
VK_DEPENDENCY_DEVICE_GROUP_BIT = 0x00000004,
VK_DEPENDENCY_VIEW_LOCAL_BIT = 0x00000002,
VK_DEPENDENCY_VIEW_LOCAL_BIT_KHR = VK_DEPENDENCY_VIEW_LOCAL_BIT,
VK_DEPENDENCY_DEVICE_GROUP_BIT_KHR = VK_DEPENDENCY_DEVICE_GROUP_BIT,
} VkDependencyFlagBits;
- VK_DEPENDENCY_BY_REGION_BIT specifies that dependencies will be framebuffer-local.
- VK_DEPENDENCY_VIEW_LOCAL_BIT specifies that a subpass has more than one view.
- VK_DEPENDENCY_DEVICE_GROUP_BIT specifies that dependencies are non-device-local.
typedef VkFlags VkDependencyFlags;
VkDependencyFlags
is a bitmask type for setting a mask of zero or more
VkDependencyFlagBits.
6.6.1. Subpass Self-dependency
If vkCmdPipelineBarrier
is called inside a render pass instance, the
following restrictions apply.
For a given subpass to allow a pipeline barrier, the render pass must
declare a self-dependency from that subpass to itself.
That is, there must exist a VkSubpassDependency
in the subpass
dependency list for the render pass with srcSubpass
and
dstSubpass
equal to that subpass index.
More than one self-dependency can be declared for each subpass.
Self-dependencies must only include pipeline stage bits that are graphics
stages.
Self-dependencies must not have any earlier pipeline stages depend on any
later pipeline stages (according to the order of
graphics pipeline stages), unless
all of the stages are
framebuffer-space stages.
If the source and destination stage masks both include framebuffer-space
stages, then dependencyFlags
must include
VK_DEPENDENCY_BY_REGION_BIT
.
If the subpass has more than one view, then dependencyFlags
must
include VK_DEPENDENCY_VIEW_LOCAL_BIT
.
A vkCmdPipelineBarrier
command inside a render pass instance must be
a subset of one of the self-dependencies of the subpass it is used in,
meaning that the stage masks and access masks must each include only a
subset of the bits of the corresponding mask in that self-dependency.
If the self-dependency has VK_DEPENDENCY_BY_REGION_BIT
or VK_DEPENDENCY_VIEW_LOCAL_BIT
set, then so must the pipeline barrier.
Pipeline barriers within a render pass instance can only be types
VkMemoryBarrier
or VkImageMemoryBarrier
.
If a VkImageMemoryBarrier
is used, the image and image subresource
range specified in the barrier must be a subset of one of the image views
used by the framebuffer in the current subpass.
Additionally, oldLayout
must be equal to newLayout
, and both
the srcQueueFamilyIndex
and dstQueueFamilyIndex
must be
VK_QUEUE_FAMILY_IGNORED
.
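As a non-normative sketch, a self-dependency and the matching in-subpass barrier for the common case of reading a just-written color attachment as an input attachment (handles and indices assumed) could be expressed as:

/* Declared in the render pass: a self-dependency on subpass 0. */
VkSubpassDependency selfDependency = {
    .srcSubpass      = 0,
    .dstSubpass      = 0,                                  /* same subpass index */
    .srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
    .srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
    .dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT,        /* both masks are framebuffer-space stages */
};

/* Recorded inside subpass 0: a barrier that is a subset of the self-dependency. */
VkMemoryBarrier barrier = {
    .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
    .pNext         = NULL,
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
};
vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                     VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                     VK_DEPENDENCY_BY_REGION_BIT,
                     1, &barrier, 0, NULL, 0, NULL);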
6.7. Memory Barriers
Memory barriers are used to explicitly control access to buffer and image subresource ranges. Memory barriers are used to transfer ownership between queue families, change image layouts, and define availability and visibility operations. They explicitly define the access types and buffer and image subresource ranges that are included in the access scopes of a memory dependency that is created by a synchronization command that includes them.
6.7.1. Global Memory Barriers
Global memory barriers apply to memory accesses involving all memory objects that exist at the time of its execution.
The VkMemoryBarrier
structure is defined as:
typedef struct VkMemoryBarrier {
VkStructureType sType;
const void* pNext;
VkAccessFlags srcAccessMask;
VkAccessFlags dstAccessMask;
} VkMemoryBarrier;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- srcAccessMask is a bitmask of VkAccessFlagBits specifying a source access mask.
- dstAccessMask is a bitmask of VkAccessFlagBits specifying a destination access mask.
The first access scope is
limited to access types in the source access
mask specified by srcAccessMask
.
The second access scope is
limited to access types in the destination
access mask specified by dstAccessMask
.
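For illustration, a global memory barrier making a transfer write available and visible to a subsequent compute shader read might be recorded as follows (commandBuffer assumed valid; this is a sketch, not a required pattern):

VkMemoryBarrier globalBarrier = {
    .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
    .pNext         = NULL,
    .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,     /* first access scope  */
    .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,        /* second access scope */
};
vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_TRANSFER_BIT,        /* srcStageMask */
                     VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,  /* dstStageMask */
                     0,                                     /* dependencyFlags */
                     1, &globalBarrier, 0, NULL, 0, NULL);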
6.7.2. Buffer Memory Barriers
Buffer memory barriers only apply to memory accesses involving a specific buffer range. That is, a memory dependency formed from a buffer memory barrier is scoped to access via the specified buffer range. Buffer memory barriers can also be used to define a queue family ownership transfer for the specified buffer range.
The VkBufferMemoryBarrier
structure is defined as:
typedef struct VkBufferMemoryBarrier {
VkStructureType sType;
const void* pNext;
VkAccessFlags srcAccessMask;
VkAccessFlags dstAccessMask;
uint32_t srcQueueFamilyIndex;
uint32_t dstQueueFamilyIndex;
VkBuffer buffer;
VkDeviceSize offset;
VkDeviceSize size;
} VkBufferMemoryBarrier;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- srcAccessMask is a bitmask of VkAccessFlagBits specifying a source access mask.
- dstAccessMask is a bitmask of VkAccessFlagBits specifying a destination access mask.
- srcQueueFamilyIndex is the source queue family for a queue family ownership transfer.
- dstQueueFamilyIndex is the destination queue family for a queue family ownership transfer.
- buffer is a handle to the buffer whose backing memory is affected by the barrier.
- offset is an offset in bytes into the backing memory for buffer; this is relative to the base offset as bound to the buffer (see vkBindBufferMemory).
- size is a size in bytes of the affected area of backing memory for buffer, or VK_WHOLE_SIZE to use the range from offset to the end of the buffer.
The first access scope is
limited to access to memory through the specified buffer range, via access
types in the source access mask specified
by srcAccessMask
.
If srcAccessMask
includes VK_ACCESS_HOST_WRITE_BIT
, memory
writes performed by that access type are also made visible, as that access
type is not performed through a resource.
The second access scope is limited to access to memory through the specified buffer range, via access types in the destination access mask specified by dstAccessMask.
If dstAccessMask
includes VK_ACCESS_HOST_WRITE_BIT
or
VK_ACCESS_HOST_READ_BIT
, available memory writes are also made visible
to accesses of those types, as those access types are not performed through
a resource.
If srcQueueFamilyIndex
is not equal to dstQueueFamilyIndex
, and
srcQueueFamilyIndex
is equal to the current queue family, then the
memory barrier defines a queue
family release operation for the specified buffer range, and the second
access scope includes no access, as if dstAccessMask
was 0
.
If dstQueueFamilyIndex
is not equal to srcQueueFamilyIndex
, and
dstQueueFamilyIndex
is equal to the current queue family, then the
memory barrier defines a queue
family acquire operation for the specified buffer range, and the first
access scope includes no access, as if srcAccessMask
was 0
.
6.7.3. Image Memory Barriers
Image memory barriers only apply to memory accesses involving a specific image subresource range. That is, a memory dependency formed from an image memory barrier is scoped to access via the specified image subresource range. Image memory barriers can also be used to define image layout transitions or a queue family ownership transfer for the specified image subresource range.
The VkImageMemoryBarrier
structure is defined as:
typedef struct VkImageMemoryBarrier {
VkStructureType sType;
const void* pNext;
VkAccessFlags srcAccessMask;
VkAccessFlags dstAccessMask;
VkImageLayout oldLayout;
VkImageLayout newLayout;
uint32_t srcQueueFamilyIndex;
uint32_t dstQueueFamilyIndex;
VkImage image;
VkImageSubresourceRange subresourceRange;
} VkImageMemoryBarrier;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- srcAccessMask is a bitmask of VkAccessFlagBits specifying a source access mask.
- dstAccessMask is a bitmask of VkAccessFlagBits specifying a destination access mask.
- oldLayout is the old layout in an image layout transition.
- newLayout is the new layout in an image layout transition.
- srcQueueFamilyIndex is the source queue family for a queue family ownership transfer.
- dstQueueFamilyIndex is the destination queue family for a queue family ownership transfer.
- image is a handle to the image affected by this barrier.
- subresourceRange describes the image subresource range within image that is affected by this barrier.
The first access scope is
limited to access to memory through the specified image subresource range,
via access types in the source access mask
specified by srcAccessMask
.
If srcAccessMask
includes VK_ACCESS_HOST_WRITE_BIT
, memory
writes performed by that access type are also made visible, as that access
type is not performed through a resource.
The second access scope is
limited to access to memory through the specified image subresource range,
via access types in the destination access
mask specified by dstAccessMask
.
If dstAccessMask
includes VK_ACCESS_HOST_WRITE_BIT
or
VK_ACCESS_HOST_READ_BIT
, available memory writes are also made visible
to accesses of those types, as those access types are not performed through
a resource.
If srcQueueFamilyIndex
is not equal to dstQueueFamilyIndex
, and
srcQueueFamilyIndex
is equal to the current queue family, then the
memory barrier defines a queue
family release operation for the specified image subresource range, and
the second access scope includes no access, as if dstAccessMask
was
0
.
If dstQueueFamilyIndex
is not equal to srcQueueFamilyIndex
, and
dstQueueFamilyIndex
is equal to the current queue family, then the
memory barrier defines a queue
family acquire operation for the specified image subresource range, and
the first access scope includes no access, as if srcAccessMask
was
0
.
If oldLayout
is not equal to newLayout
, then the memory barrier
defines an image layout
transition for the specified image subresource range.
Layout transitions that are performed via image memory barriers execute in their entirety in submission order, relative to other image layout transitions submitted to the same queue, including those performed by render passes. In effect there is an implicit execution dependency from each such layout transition to all layout transitions previously submitted to the same queue.
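A common non-normative example of such a transition is preparing a freshly created color image for a transfer write; image and commandBuffer are assumed valid, and the image is assumed to have a single mip level and array layer:

VkImageMemoryBarrier layoutBarrier = {
    .sType               = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
    .pNext               = NULL,
    .srcAccessMask       = 0,                          /* nothing to make available */
    .dstAccessMask       = VK_ACCESS_TRANSFER_WRITE_BIT,
    .oldLayout           = VK_IMAGE_LAYOUT_UNDEFINED,
    .newLayout           = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,    /* no ownership transfer */
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .image               = image,
    .subresourceRange    = {
        .aspectMask     = VK_IMAGE_ASPECT_COLOR_BIT,
        .baseMipLevel   = 0,
        .levelCount     = 1,
        .baseArrayLayer = 0,
        .layerCount     = 1,
    },
};
vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,  /* no prior work to wait for */
                     VK_PIPELINE_STAGE_TRANSFER_BIT,
                     0, 0, NULL, 0, NULL, 1, &layoutBarrier);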
The image layout of each image subresource of a depth/stencil image created
with VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
is
dependent on the last sample locations used to render to the image
subresource as a depth/stencil attachment; thus, when the image member
of a VkImageMemoryBarrier is an image created with this flag, the
application can chain a VkSampleLocationsInfoEXT structure to the
pNext
chain of VkImageMemoryBarrier
to specify the sample
locations to use during the image layout transition.
If the VkSampleLocationsInfoEXT
structure in the pNext
chain of
VkImageMemoryBarrier
does not match the sample location state last
used to render to the image subresource range specified by
subresourceRange
or if no VkSampleLocationsInfoEXT
structure is
in the pNext
chain of VkImageMemoryBarrier
then the contents of
the given image subresource range becomes undefined as if oldLayout
would equal VK_IMAGE_LAYOUT_UNDEFINED
.
If image
has a multi-planar format and the image is disjoint, then
including VK_IMAGE_ASPECT_COLOR_BIT
in the aspectMask
member of
subresourceRange
is equivalent to including
VK_IMAGE_ASPECT_PLANE_0_BIT
, VK_IMAGE_ASPECT_PLANE_1_BIT
, and
(for three-plane formats only) VK_IMAGE_ASPECT_PLANE_2_BIT
.
6.7.4. Queue Family Ownership Transfer
Resources created with a VkSharingMode of
VK_SHARING_MODE_EXCLUSIVE
must have their ownership explicitly
transferred from one queue family to another in order to access their
content in a well-defined manner on a queue in a different queue family.
Resources shared with external APIs or instances using external memory must
also explicitly manage ownership transfers between local and external queues
(or equivalent constructs in external APIs) regardless of the
VkSharingMode specified when creating them.
The special queue family index VK_QUEUE_FAMILY_EXTERNAL
represents any
queue external to the resource’s current Vulkan instance, as long as the
queue uses the same underlying physical device
or device group
and uses the same driver version as the resource’s VkDevice, as
indicated by VkPhysicalDeviceIDProperties::deviceUUID
and
VkPhysicalDeviceIDProperties::driverUUID
.
The special queue family index VK_QUEUE_FAMILY_FOREIGN_EXT
represents
any queue external to the resource’s current Vulkan instance, regardless of
the queue’s underlying physical device or driver version.
This includes, for example, queues for fixed-function image processing
devices, media codec devices, and display devices, as well as all queues
that use the same underlying physical device
(or device group)
and driver version as the resource’s VkDevice.
If memory dependencies are correctly expressed between uses of such a
resource between two queues in different families, but no ownership transfer
is defined, the contents of that resource are undefined for any read
accesses performed by the second queue family.
Note
If an application does not need the contents of a resource to remain valid when transferring from one queue family to another, then the ownership transfer should be skipped.
Note
Applications should expect transfers to/from VK_QUEUE_FAMILY_FOREIGN_EXT to be more expensive than transfers to/from VK_QUEUE_FAMILY_EXTERNAL.
A queue family ownership transfer consists of two distinct parts:
- Release exclusive ownership from the source queue family
- Acquire exclusive ownership for the destination queue family
An application must ensure that these operations occur in the correct order by defining an execution dependency between them, e.g. using a semaphore.
A release operation is used to
release exclusive ownership of a range of a buffer or image subresource
range.
A release operation is defined by executing a
buffer memory barrier (for a
buffer range) or an image memory
barrier (for an image subresource range), on a queue from the source queue
family.
The srcQueueFamilyIndex
parameter of the barrier must be set to the
source queue family index, and the dstQueueFamilyIndex
parameter to
the destination queue family index.
dstStageMask
is ignored for such a barrier, such that no visibility
operation is executed - the value of this mask does not affect the validity
of the barrier.
The release operation happens-after the availability operation.
An acquire operation is used
to acquire exclusive ownership of a range of a buffer or image subresource
range.
An acquire operation is defined by executing a
buffer memory barrier (for a
buffer range) or an image memory
barrier (for an image subresource range), on a queue from the destination
queue family.
The buffer range or image subresource range specified in an acquire
operation must match exactly that of a previous release operation.
The srcQueueFamilyIndex
parameter of the barrier must be set to the
source queue family index, and the dstQueueFamilyIndex
parameter to
the destination queue family index.
srcStageMask
is ignored for such a barrier, such that no availability
operation is executed - the value of this mask does not affect the validity
of the barrier.
The acquire operation happens-before the visibility operation.
Note
Whilst it is not invalid to provide destination or source access masks for memory barriers used for release or acquire operations, respectively, they have no practical effect. Access after a release operation has undefined results, and so visibility for those accesses has no practical effect. Similarly, write access before an acquire operation will produce undefined results for future access, so availability of those writes has no practical use. In an earlier version of the specification, these were required to match on both sides - but this was subsequently relaxed. These masks should be set to 0.
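The following non-normative sketch transfers ownership of a whole buffer from a graphics queue family to a compute queue family; graphicsFamilyIndex, computeFamilyIndex, the command buffers, and buffer are hypothetical, and the release and acquire submissions are assumed to be ordered by a semaphore (not shown):

/* Release, recorded on a command buffer executing on the graphics queue. */
VkBufferMemoryBarrier release = {
    .sType               = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
    .pNext               = NULL,
    .srcAccessMask       = VK_ACCESS_TRANSFER_WRITE_BIT,  /* make the writes available */
    .dstAccessMask       = 0,                             /* ignored for a release     */
    .srcQueueFamilyIndex = graphicsFamilyIndex,
    .dstQueueFamilyIndex = computeFamilyIndex,
    .buffer              = buffer,
    .offset              = 0,
    .size                = VK_WHOLE_SIZE,
};
vkCmdPipelineBarrier(graphicsCmdBuf,
                     VK_PIPELINE_STAGE_TRANSFER_BIT,
                     VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, /* dstStageMask is ignored */
                     0, 0, NULL, 1, &release, 0, NULL);

/* Acquire, recorded on a command buffer executing on the compute queue.
 * The buffer range must match the release exactly. */
VkBufferMemoryBarrier acquire = release;
acquire.srcAccessMask = 0;                                 /* ignored for an acquire */
acquire.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
vkCmdPipelineBarrier(computeCmdBuf,
                     VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,    /* srcStageMask is ignored */
                     VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
                     0, 0, NULL, 1, &acquire, 0, NULL);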
If the transfer is via an image memory barrier, and an
image layout transition is
desired, then the values of oldLayout
and newLayout
in the
release memory barrier must be equal to values of oldLayout
and
newLayout
in the acquire memory barrier.
Although the image layout transition is submitted twice, it will only be
executed once.
A layout transition specified in this way happens-after the release
operation and happens-before the acquire operation.
If the values of srcQueueFamilyIndex
and dstQueueFamilyIndex
are
equal, no ownership transfer is performed, and the barrier operates as if
they were both set to VK_QUEUE_FAMILY_IGNORED
.
Queue family ownership transfers may perform read and write accesses on all memory bound to the image subresource or buffer range, so applications must ensure that all memory writes have been made available before a queue family ownership transfer is executed. Available memory is automatically made visible to queue family release and acquire operations, and writes performed by those operations are automatically made available.
Once a queue family has acquired ownership of a buffer range or image
subresource range of an VK_SHARING_MODE_EXCLUSIVE
resource, its
contents are undefined to other queue families unless ownership is
transferred.
The contents of any portion of another resource which aliases memory that is
bound to the transferred buffer or image subresource range are undefined
after a release or acquire operation.
6.8. Wait Idle Operations
To wait on the host for the completion of outstanding queue operations for a given queue, call:
VkResult vkQueueWaitIdle(
VkQueue queue);
- queue is the queue on which to wait.
vkQueueWaitIdle
is equivalent to submitting a fence to a queue and
waiting with an infinite timeout for that fence to signal.
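That equivalence can be sketched (non-normatively, with device and queue assumed valid) as:

VkFenceCreateInfo fenceInfo = { .sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO };
VkFence fence;
vkCreateFence(device, &fenceInfo, NULL, &fence);
vkQueueSubmit(queue, 0, NULL, fence);          /* no work; the fence signals when prior submissions complete */
vkWaitForFences(device, 1, &fence, VK_TRUE, UINT64_MAX);
vkDestroyFence(device, fence, NULL);
/* Equivalent in effect to: vkQueueWaitIdle(queue); */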
To wait on the host for the completion of outstanding queue operations for all queues on a given logical device, call:
VkResult vkDeviceWaitIdle(
VkDevice device);
- device is the logical device to idle.
vkDeviceWaitIdle
is equivalent to calling vkQueueWaitIdle
for
all queues owned by device
.
6.9. Host Write Ordering Guarantees
When batches of command buffers are submitted to a queue via vkQueueSubmit, it defines a memory dependency with prior host operations, and execution of command buffers submitted to the queue.
The first synchronization scope is defined by the host execution model, but includes execution of vkQueueSubmit on the host and anything that happened-before it.
The second synchronization scope includes all commands submitted in the same queue submission, and all commands that occur later in submission order.
The first access scope includes all host writes to mappable device memory that are available to the host memory domain.
The second access scope includes all memory access performed by the device.
6.10. Synchronization and Multiple Physical Devices
If a logical device includes more than one physical device, then fences, semaphores, and events all still have a single instance of the signaled state.
A fence becomes signaled when all physical devices complete the necessary queue operations.
Semaphore wait and signal operations all include a device index that is the sole physical device that performs the operation. These indices are provided in the VkDeviceGroupSubmitInfo and VkDeviceGroupBindSparseInfo structures. Semaphores are not exclusively owned by any physical device. For example, a semaphore can be signaled by one physical device and then waited on by a different physical device.
An event can only be waited on by the same physical device that signaled it (or the host).
6.11. Calibrated timestamps
In order to be able to correlate the time a particular operation took place at on timelines of different time domains (e.g. a device operation vs a host operation), Vulkan allows querying calibrated timestamps from multiple time domains.
To query calibrated timestamps from a set of time domains, call:
VkResult vkGetCalibratedTimestampsEXT(
VkDevice device,
uint32_t timestampCount,
const VkCalibratedTimestampInfoEXT* pTimestampInfos,
uint64_t* pTimestamps,
uint64_t* pMaxDeviation);
- device is the logical device used to perform the query.
- timestampCount is the number of timestamps to query.
- pTimestampInfos is a pointer to an array of timestampCount structures of type VkCalibratedTimestampInfoEXT, describing the time domains the calibrated timestamps should be captured from.
- pTimestamps is a pointer to an array of timestampCount 64-bit unsigned integer values in which the requested calibrated timestamp values are returned.
- pMaxDeviation is a pointer to a 64-bit unsigned integer value in which the strictly positive maximum deviation, in nanoseconds, of the calibrated timestamp values is returned.
Note
The maximum deviation may vary between calls to vkGetCalibratedTimestampsEXT, even for the same set of time domains.
Calibrated timestamp values can be extrapolated to estimate future coinciding timestamp values, however, depending on the nature of the time domains and other properties of the platform extrapolating values over a sufficiently long period of time may no longer be accurate enough to fit any particular purpose so applications are expected to re-calibrate the timestamps on a regular basis.
The VkCalibratedTimestampInfoEXT
structure is defined as:
typedef struct VkCalibratedTimestampInfoEXT {
VkStructureType sType;
const void* pNext;
VkTimeDomainEXT timeDomain;
} VkCalibratedTimestampInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- timeDomain is a VkTimeDomainEXT value specifying the time domain from which the calibrated timestamp value should be returned.
The set of supported time domains consists of:
typedef enum VkTimeDomainEXT {
VK_TIME_DOMAIN_DEVICE_EXT = 0,
VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT = 1,
VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT = 2,
VK_TIME_DOMAIN_QUERY_PERFORMANCE_COUNTER_EXT = 3,
} VkTimeDomainEXT;
- VK_TIME_DOMAIN_DEVICE_EXT specifies the device time domain. Timestamp values in this time domain are comparable with device timestamp values captured using vkCmdWriteTimestamp and are defined to be incrementing according to the timestampPeriod of the device.
- VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT specifies the CLOCK_MONOTONIC time domain available on POSIX platforms.
- VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT specifies the CLOCK_MONOTONIC_RAW time domain available on POSIX platforms.
- VK_TIME_DOMAIN_QUERY_PERFORMANCE_COUNTER_EXT specifies the performance counter (QPC) time domain available on Windows.
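For illustration, correlating the device clock with CLOCK_MONOTONIC could look like the following sketch, assuming the VK_EXT_calibrated_timestamps extension is enabled, the entry point has been obtained, and both time domains were reported as supported:

VkCalibratedTimestampInfoEXT infos[2] = {
    { .sType = VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT,
      .timeDomain = VK_TIME_DOMAIN_DEVICE_EXT },
    { .sType = VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT,
      .timeDomain = VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT },
};
uint64_t timestamps[2];
uint64_t maxDeviation;
vkGetCalibratedTimestampsEXT(device, 2, infos, timestamps, &maxDeviation);
/* timestamps[0] (device ticks) and timestamps[1] (CLOCK_MONOTONIC nanoseconds)
 * were captured within maxDeviation nanoseconds of each other. */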
7. Render Pass
A render pass represents a collection of attachments, subpasses, and dependencies between the subpasses, and describes how the attachments are used over the course of the subpasses. The use of a render pass in a command buffer is a render pass instance.
Render passes are represented by VkRenderPass
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkRenderPass)
An attachment description describes the properties of an attachment including its format, sample count, and how its contents are treated at the beginning and end of each render pass instance.
A subpass represents a phase of rendering that reads and writes a subset of the attachments in a render pass. Rendering commands are recorded into a particular subpass of a render pass instance.
A subpass description describes the subset of attachments that is involved in the execution of a subpass. Each subpass can read from some attachments as input attachments, write to some as color attachments or depth/stencil attachments, and perform multisample resolve operations to resolve attachments. A subpass description can also include a set of preserve attachments, which are attachments that are not read or written by the subpass but whose contents must be preserved throughout the subpass.
A subpass uses an attachment if the attachment is a color, depth/stencil,
resolve, or input attachment for that subpass (as determined by the
pColorAttachments
, pDepthStencilAttachment
,
pResolveAttachments
, and pInputAttachments
members of
VkSubpassDescription, respectively).
A subpass does not use an attachment if that attachment is preserved by the
subpass.
The first use of an attachment is in the lowest numbered subpass that uses
that attachment.
Similarly, the last use of an attachment is in the highest numbered
subpass that uses that attachment.
The subpasses in a render pass all render to the same dimensions, and fragments for pixel (x,y,layer) in one subpass can only read attachment contents written by previous subpasses at that same (x,y,layer) location.
Note
By describing a complete set of subpasses in advance, render passes provide the implementation an opportunity to optimize the storage and transfer of attachment data between subpasses. In practice, this means that subpasses with a simple framebuffer-space dependency may be merged into a single tiled rendering pass, keeping the attachment data on-chip for the duration of a render pass instance. However, it is also quite common for a render pass to only contain a single subpass.
Subpass dependencies describe execution and memory dependencies between subpasses.
A subpass dependency chain is a sequence of subpass dependencies in a render pass, where the source subpass of each subpass dependency (after the first) equals the destination subpass of the previous dependency.
Execution of subpasses may overlap or execute out of order with regards to other subpasses, unless otherwise enforced by an execution dependency. Each subpass only respects submission order for commands recorded in the same subpass, and the vkCmdBeginRenderPass and vkCmdEndRenderPass commands that delimit the render pass - commands within other subpasses are not included. This affects most other implicit ordering guarantees.
A render pass describes the structure of subpasses and attachments
independent of any specific image views for the attachments.
The specific image views that will be used for the attachments, and their
dimensions, are specified in VkFramebuffer
objects.
Framebuffers are created with respect to a specific render pass that the
framebuffer is compatible with (see Render Pass
Compatibility).
Collectively, a render pass and a framebuffer define the complete render
target state for one or more subpasses as well as the algorithmic
dependencies between the subpasses.
The various pipeline stages of the drawing commands for a given subpass may execute concurrently and/or out of order, both within and across drawing commands, whilst still respecting pipeline order. However for a given (x,y,layer,sample) sample location, certain per-sample operations are performed in rasterization order.
7.1. Render Pass Creation
To create a render pass, call:
VkResult vkCreateRenderPass(
VkDevice device,
const VkRenderPassCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkRenderPass* pRenderPass);
- device is the logical device that creates the render pass.
- pCreateInfo is a pointer to an instance of the VkRenderPassCreateInfo structure that describes the parameters of the render pass.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pRenderPass points to a VkRenderPass handle in which the resulting render pass object is returned.
The VkRenderPassCreateInfo
structure is defined as:
typedef struct VkRenderPassCreateInfo {
VkStructureType sType;
const void* pNext;
VkRenderPassCreateFlags flags;
uint32_t attachmentCount;
const VkAttachmentDescription* pAttachments;
uint32_t subpassCount;
const VkSubpassDescription* pSubpasses;
uint32_t dependencyCount;
const VkSubpassDependency* pDependencies;
} VkRenderPassCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- attachmentCount is the number of attachments used by this render pass, or zero indicating no attachments. Attachments are referred to by zero-based indices in the range [0, attachmentCount).
- pAttachments points to an array of attachmentCount VkAttachmentDescription structures describing properties of the attachments, or NULL if attachmentCount is zero.
- subpassCount is the number of subpasses to create for this render pass. Subpasses are referred to by zero-based indices in the range [0, subpassCount). A render pass must have at least one subpass.
- pSubpasses points to an array of subpassCount VkSubpassDescription structures describing properties of the subpasses.
- dependencyCount is the number of dependencies between pairs of subpasses, or zero indicating no dependencies.
- pDependencies points to an array of dependencyCount VkSubpassDependency structures describing dependencies between pairs of subpasses, or NULL if dependencyCount is zero.
Note
Care should be taken to avoid a data race here; if any subpasses access attachments with overlapping memory locations, and one of those accesses is a write, a subpass dependency needs to be included between them.
typedef VkFlags VkRenderPassCreateFlags;
VkRenderPassCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
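As a non-normative sketch, a minimal render pass with a single color attachment and a single subpass (the format is an assumption; device is assumed valid) can be created as follows:

VkAttachmentDescription colorAttachment = {
    .format         = VK_FORMAT_B8G8R8A8_UNORM,       /* assumed to match the target image view */
    .samples        = VK_SAMPLE_COUNT_1_BIT,
    .loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR,
    .storeOp        = VK_ATTACHMENT_STORE_OP_STORE,
    .stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
    .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
    .initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED,
    .finalLayout    = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
};
VkAttachmentReference colorRef = {
    .attachment = 0,
    .layout     = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
};
VkSubpassDescription subpass = {
    .pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .colorAttachmentCount = 1,
    .pColorAttachments    = &colorRef,
};
VkRenderPassCreateInfo renderPassInfo = {
    .sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
    .attachmentCount = 1,
    .pAttachments    = &colorAttachment,
    .subpassCount    = 1,
    .pSubpasses      = &subpass,
};
VkRenderPass renderPass;
VkResult result = vkCreateRenderPass(device, &renderPassInfo, NULL, &renderPass);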
If the VkRenderPassCreateInfo
::pNext
chain includes a
VkRenderPassMultiviewCreateInfo
structure, then that structure
includes an array of view masks, view offsets, and correlation masks for the
render pass.
The VkRenderPassMultiviewCreateInfo
structure is defined as:
typedef struct VkRenderPassMultiviewCreateInfo {
VkStructureType sType;
const void* pNext;
uint32_t subpassCount;
const uint32_t* pViewMasks;
uint32_t dependencyCount;
const int32_t* pViewOffsets;
uint32_t correlationMaskCount;
const uint32_t* pCorrelationMasks;
} VkRenderPassMultiviewCreateInfo;
or the equivalent
typedef VkRenderPassMultiviewCreateInfo VkRenderPassMultiviewCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- subpassCount is zero or is the number of subpasses in the render pass.
- pViewMasks points to an array of subpassCount view masks, where each mask is a bitfield of view indices describing which views rendering is broadcast to in each subpass, when multiview is enabled. If subpassCount is zero, each view mask is treated as zero.
- dependencyCount is zero or the number of dependencies in the render pass.
- pViewOffsets points to an array of dependencyCount view offsets, one for each dependency. If dependencyCount is zero, each dependency's view offset is treated as zero. Each view offset controls which views in the source subpass the views in the destination subpass depend on.
- correlationMaskCount is zero or the number of correlation masks.
- pCorrelationMasks is an array of view masks indicating sets of views that may be more efficient to render concurrently.
When a subpass uses a non-zero view mask, multiview functionality is
considered to be enabled.
Multiview is all-or-nothing for a render pass - that is, either all
subpasses must have a non-zero view mask (though some subpasses may have
only one view) or all must be zero.
Multiview causes all drawing and clear commands in the subpass to behave as
if they were broadcast to each view, where a view is represented by one
layer of the framebuffer attachments.
All draws and clears are broadcast to each view index whose bit is set in
the view mask.
The view index is provided in the ViewIndex
shader input variable, and
color, depth/stencil, and input attachments all read/write the layer of the
framebuffer corresponding to the view index.
If the view mask is zero for all subpasses, multiview is considered to be disabled and all drawing commands execute normally, without this additional broadcasting.
Some implementations may not support multiview in conjunction with geometry shaders or tessellation shaders.
When multiview is enabled, the VK_DEPENDENCY_VIEW_LOCAL_BIT
bit in a
dependency can be used to express a view-local dependency, meaning that
each view in the destination subpass depends on a single view in the source
subpass.
Unlike pipeline barriers, a subpass dependency can potentially have a
different view mask in the source subpass and the destination subpass.
If the dependency is view-local, then each view (dstView) in the
destination subpass depends on the view dstView +
pViewOffsets[dependency] in the source subpass.
If there is not such a view in the source subpass, then this dependency does
not affect that view in the destination subpass.
If the dependency is not view-local, then all views in the destination
subpass depend on all views in the source subpass, and the view offset is
ignored.
A non-zero view offset is not allowed in a self-dependency.
The elements of pCorrelationMasks
are a set of masks of views
indicating that views in the same mask may exhibit spatial coherency
between the views, making it more efficient to render them concurrently.
Correlation masks must not have a functional effect on the results of the
multiview rendering.
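A non-normative sketch of enabling two-view multiview for a single-subpass render pass, reusing the renderPassInfo variable from the earlier render pass sketch, follows:

uint32_t viewMask        = 0x3;   /* broadcast draws and clears to views 0 and 1 */
uint32_t correlationMask = 0x3;   /* the two views may be rendered concurrently  */
VkRenderPassMultiviewCreateInfo multiviewInfo = {
    .sType                = VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO,
    .pNext                = NULL,
    .subpassCount         = 1,
    .pViewMasks           = &viewMask,
    .correlationMaskCount = 1,
    .pCorrelationMasks    = &correlationMask,
};
renderPassInfo.pNext = &multiviewInfo;   /* chain before calling vkCreateRenderPass */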
When multiview is enabled, at the beginning of each subpass all non-render pass state is undefined. In particular, each time vkCmdBeginRenderPass or vkCmdNextSubpass is called the graphics pipeline must be bound, any relevant descriptor sets or vertex/index buffers must be bound, and any relevant dynamic state or push constants must be set before they are used.
A multiview subpass can declare that its shaders will write per-view
attributes for all views in a single invocation, by setting the
VK_SUBPASS_DESCRIPTION_PER_VIEW_ATTRIBUTES_BIT_NVX
bit in the subpass
description.
The only supported per-view attributes are position and viewport mask, and
per-view position and viewport masks are written to output array variables
decorated with PositionPerViewNV
and ViewportMaskPerViewNV
,
respectively.
If VK_NV_viewport_array2
is not supported and enabled,
ViewportMaskPerViewNV
must not be used.
Values written to elements of PositionPerViewNV
and
ViewportMaskPerViewNV
must not depend on the ViewIndex
.
The shader must also write to an output variable decorated with
Position
, and the value written to Position
must equal the value
written to PositionPerViewNV
[ViewIndex
].
Similarly, if ViewportMaskPerViewNV
is written to then the shader must
also write to an output variable decorated with ViewportMaskNV
, and the
value written to ViewportMaskNV
must equal the value written to
ViewportMaskPerViewNV
[ViewIndex
].
Implementations will either use values taken from Position
and
ViewportMaskNV
and invoke the shader once for each view, or will use
values taken from PositionPerViewNV
and ViewportMaskPerViewNV
and
invoke the shader fewer times.
The values written to Position
and ViewportMaskNV
must not depend
on the values written to PositionPerViewNV
and
ViewportMaskPerViewNV
, or vice versa (to allow compilers to eliminate
the unused outputs).
All attributes that do not have *PerViewNV
counterparts must not depend
on ViewIndex
.
Per-view attributes are all-or-nothing for a subpass.
That is, all pipelines compiled against a subpass that includes the
VK_SUBPASS_DESCRIPTION_PER_VIEW_ATTRIBUTES_BIT_NVX
bit must write
per-view attributes to the *PerViewNV[]
shader outputs, in addition to the
non-per-view (e.g. Position
) outputs.
Pipelines compiled against a subpass that does not include this bit must
not include the *PerViewNV[]
outputs in their interfaces.
If the VkRenderPassCreateInfo
::pNext
chain includes a
VkRenderPassFragmentDensityMapCreateInfoEXT
structure, then that
structure includes a fragment density map attachment for the render pass.
The VkRenderPassFragmentDensityMapCreateInfoEXT
structure is defined
as:
typedef struct VkRenderPassFragmentDensityMapCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkAttachmentReference fragmentDensityMapAttachment;
} VkRenderPassFragmentDensityMapCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- fragmentDensityMapAttachment is the fragment density map to use for the render pass.
The fragment density map attachment is read at an implementation-dependent
time either by the host during vkCmdBeginRenderPass, if the attachment’s
image view was not created with flags containing
VK_IMAGE_VIEW_CREATE_FRAGMENT_DENSITY_MAP_DYNAMIC_BIT_EXT, or by the device
when drawing commands in the render pass execute
VK_PIPELINE_STAGE_FRAGMENT_DENSITY_PROCESS_BIT_EXT.
If this structure is not present, it is as if
fragmentDensityMapAttachment
was given as VK_ATTACHMENT_UNUSED
.
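As a purely informative sketch, the following shows how this structure might be chained into render pass creation, assuming attachment 1 of the render pass is the fragment density map and uses the VK_IMAGE_LAYOUT_FRAGMENT_DENSITY_MAP_OPTIMAL_EXT layout; the remaining creation parameters are assumed to be filled in elsewhere.

// Hypothetical example: attachment 1 of the render pass is the fragment density map.
VkRenderPassFragmentDensityMapCreateInfoEXT fdmInfo = {
    .sType = VK_STRUCTURE_TYPE_RENDER_PASS_FRAGMENT_DENSITY_MAP_CREATE_INFO_EXT,
    .pNext = NULL,
    .fragmentDensityMapAttachment = {
        .attachment = 1,
        .layout     = VK_IMAGE_LAYOUT_FRAGMENT_DENSITY_MAP_OPTIMAL_EXT,
    },
};

VkRenderPassCreateInfo renderPassInfo = {
    .sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
    .pNext = &fdmInfo,   // attaches the fragment density map description
    // remaining members describe the attachments and subpasses as usual
};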
The VkAttachmentDescription
structure is defined as:
typedef struct VkAttachmentDescription {
VkAttachmentDescriptionFlags flags;
VkFormat format;
VkSampleCountFlagBits samples;
VkAttachmentLoadOp loadOp;
VkAttachmentStoreOp storeOp;
VkAttachmentLoadOp stencilLoadOp;
VkAttachmentStoreOp stencilStoreOp;
VkImageLayout initialLayout;
VkImageLayout finalLayout;
} VkAttachmentDescription;
- flags is a bitmask of VkAttachmentDescriptionFlagBits specifying additional properties of the attachment.
- format is a VkFormat value specifying the format of the image view that will be used for the attachment.
- samples is the number of samples of the image as defined in VkSampleCountFlagBits.
- loadOp is a VkAttachmentLoadOp value specifying how the contents of color and depth components of the attachment are treated at the beginning of the subpass where it is first used.
- storeOp is a VkAttachmentStoreOp value specifying how the contents of color and depth components of the attachment are treated at the end of the subpass where it is last used.
- stencilLoadOp is a VkAttachmentLoadOp value specifying how the contents of stencil components of the attachment are treated at the beginning of the subpass where it is first used.
- stencilStoreOp is a VkAttachmentStoreOp value specifying how the contents of stencil components of the attachment are treated at the end of the last subpass where it is used.
- initialLayout is the layout the attachment image subresource will be in when a render pass instance begins.
- finalLayout is the layout the attachment image subresource will be transitioned to when a render pass instance ends. During a render pass instance, an attachment can use a different layout in each subpass, if desired.
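As a purely informative sketch, the following fills in a VkAttachmentDescription for a hypothetical single-sampled color attachment (an assumed VK_FORMAT_B8G8R8A8_UNORM swapchain image) that is cleared at the start of the render pass, stored at the end, and transitioned to a presentable layout.

// Hypothetical color attachment: cleared on load, kept for presentation at the end.
VkAttachmentDescription colorAttachment = {
    .flags          = 0,
    .format         = VK_FORMAT_B8G8R8A8_UNORM,           // assumed swapchain format
    .samples        = VK_SAMPLE_COUNT_1_BIT,
    .loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR,
    .storeOp        = VK_ATTACHMENT_STORE_OP_STORE,
    .stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE,     // color format: stencil ops ignored
    .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
    .initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED,           // previous contents are discarded
    .finalLayout    = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,     // ready for presentation afterwards
};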
If the attachment uses a color format, then loadOp
and storeOp
are used, and stencilLoadOp
and stencilStoreOp
are ignored.
If the format has depth and/or stencil components, loadOp
and
storeOp
apply only to the depth data, while stencilLoadOp
and
stencilStoreOp
define how the stencil data is handled.
loadOp
and stencilLoadOp
define the load operations that
execute as part of the first subpass that uses the attachment.
storeOp
and stencilStoreOp
define the store operations that
execute as part of the last subpass that uses the attachment.
The load operation for each sample in an attachment happens-before any
recorded command which accesses the sample in the first subpass where the
attachment is used.
Load operations for attachments with a depth/stencil format execute in the
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
pipeline stage.
Load operations for attachments with a color format execute in the
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
pipeline stage.
The store operation for each sample in an attachment happens-after any
recorded command which accesses the sample in the last subpass where the
attachment is used.
Store operations for attachments with a depth/stencil format execute in the
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
pipeline stage.
Store operations for attachments with a color format execute in the
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
pipeline stage.
If an attachment is not used by any subpass, then loadOp
,
storeOp
, stencilStoreOp
, and stencilLoadOp
are ignored,
and the attachment’s memory contents will not be modified by execution of a
render pass instance.
The load and store operations apply on the first and last use of each view in the render pass, respectively. If a view index of an attachment is not included in the view mask in any subpass that uses it, then the load and store operations are ignored, and the attachment’s memory contents will not be modified by execution of a render pass instance.
During a render pass instance, input/color attachments with color formats
that have a component size of 8, 16, or 32 bits must be represented in the
attachment’s format throughout the instance.
Attachments with other floating- or fixed-point color formats, or with depth
components may be represented in a format with a precision higher than the
attachment format, but must be represented with the same range.
When such a component is loaded via the loadOp
, it will be converted
into an implementation-dependent format used by the render pass.
Such components must be converted from the render pass format, to the
format of the attachment, before they are resolved or stored at the end of a
render pass instance via storeOp
.
Conversions occur as described in Numeric
Representation and Computation and Fixed-Point
Data Conversions.
If flags
includes VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT
, then
the attachment is treated as if it shares physical memory with another
attachment in the same render pass.
This information limits the ability of the implementation to reorder certain
operations (like layout transitions and the loadOp
) such that it is
not improperly reordered against other uses of the same physical memory via
a different attachment.
This is described in more detail below.
To specify which aspects of an input attachment can be read, add a
VkRenderPassInputAttachmentAspectCreateInfo structure to the pNext chain of
the VkRenderPassCreateInfo structure:
The VkRenderPassInputAttachmentAspectCreateInfo
structure is defined
as:
typedef struct VkRenderPassInputAttachmentAspectCreateInfo {
VkStructureType sType;
const void* pNext;
uint32_t aspectReferenceCount;
const VkInputAttachmentAspectReference* pAspectReferences;
} VkRenderPassInputAttachmentAspectCreateInfo;
or the equivalent
typedef VkRenderPassInputAttachmentAspectCreateInfo VkRenderPassInputAttachmentAspectCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- aspectReferenceCount is the number of elements in the pAspectReferences array.
- pAspectReferences points to an array of aspectReferenceCount VkInputAttachmentAspectReference structures describing which aspect(s) can be accessed for a given input attachment within a given subpass.
The VkInputAttachmentAspectReference
structure specifies an aspect
mask for a specific input attachment of a specific subpass in the render
pass.
subpass and inputAttachmentIndex index into the render pass as:
pCreateInfo::pSubpasses[subpass].pInputAttachments[inputAttachmentIndex]
typedef struct VkInputAttachmentAspectReference {
uint32_t subpass;
uint32_t inputAttachmentIndex;
VkImageAspectFlags aspectMask;
} VkInputAttachmentAspectReference;
or the equivalent
typedef VkInputAttachmentAspectReference VkInputAttachmentAspectReferenceKHR;
- subpass is an index into the pSubpasses array of the parent VkRenderPassCreateInfo structure.
- inputAttachmentIndex is an index into the pInputAttachments of the specified subpass.
- aspectMask is a mask of which aspect(s) can be accessed within the specified subpass.
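As a purely informative sketch, the following restricts a hypothetical input attachment (input attachment 0 of subpass 1) so that only its depth aspect may be read:

// Hypothetical: subpass 1 reads only the depth aspect of input attachment 0.
VkInputAttachmentAspectReference aspectRef = {
    .subpass              = 1,
    .inputAttachmentIndex = 0,
    .aspectMask           = VK_IMAGE_ASPECT_DEPTH_BIT,
};

VkRenderPassInputAttachmentAspectCreateInfo aspectInfo = {
    .sType                = VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO,
    .pNext                = NULL,
    .aspectReferenceCount = 1,
    .pAspectReferences    = &aspectRef,
};
// aspectInfo is then chained into VkRenderPassCreateInfo::pNext.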
editing-note: TODO (Jon) - it’s unclear whether the following two paragraphs are intended to apply to VkAttachmentDescription, one of the extension structures described immediately above, or something else. The following description of VkAttachmentDescriptionFlagBits should probably be moved up to near VkAttachmentDescription.
An application must only access the specified aspect(s).
An application can access any aspect of an input attachment that does not have a specified aspect mask.
Bits which can be set in VkAttachmentDescription::flags
describing additional properties of the attachment are:
typedef enum VkAttachmentDescriptionFlagBits {
VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT = 0x00000001,
} VkAttachmentDescriptionFlagBits;
- VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT specifies that the attachment aliases the same device memory as other attachments.
typedef VkFlags VkAttachmentDescriptionFlags;
VkAttachmentDescriptionFlags
is a bitmask type for setting a mask of
zero or more VkAttachmentDescriptionFlagBits.
Possible values of VkAttachmentDescription::loadOp
and
stencilLoadOp
, specifying how the contents of the attachment are
treated, are:
typedef enum VkAttachmentLoadOp {
VK_ATTACHMENT_LOAD_OP_LOAD = 0,
VK_ATTACHMENT_LOAD_OP_CLEAR = 1,
VK_ATTACHMENT_LOAD_OP_DONT_CARE = 2,
} VkAttachmentLoadOp;
- VK_ATTACHMENT_LOAD_OP_LOAD specifies that the previous contents of the image within the render area will be preserved. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_READ_BIT.
- VK_ATTACHMENT_LOAD_OP_CLEAR specifies that the contents within the render area will be cleared to a uniform value, which is specified when a render pass instance is begun. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.
- VK_ATTACHMENT_LOAD_OP_DONT_CARE specifies that the previous contents within the area need not be preserved; the contents of the attachment will be undefined inside the render area. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.
Possible values of VkAttachmentDescription::storeOp
and
stencilStoreOp
, specifying how the contents of the attachment are
treated, are:
typedef enum VkAttachmentStoreOp {
VK_ATTACHMENT_STORE_OP_STORE = 0,
VK_ATTACHMENT_STORE_OP_DONT_CARE = 1,
} VkAttachmentStoreOp;
- VK_ATTACHMENT_STORE_OP_STORE specifies the contents generated during the render pass and within the render area are written to memory. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.
- VK_ATTACHMENT_STORE_OP_DONT_CARE specifies the contents within the render area are not needed after rendering, and may be discarded; the contents of the attachment will be undefined inside the render area. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.
editing-note: TODO (Jon) - the following text may need to be moved back to combine with vkCreateRenderPass above for automatic ref page generation.
If a render pass uses multiple attachments that alias the same device
memory, those attachments must each include the
VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT
bit in their attachment
description flags.
Attachments aliasing the same memory occurs in multiple ways:
- Multiple attachments being assigned the same image view as part of framebuffer creation.
- Attachments using distinct image views that correspond to the same image subresource of an image.
- Attachments using views of distinct image subresources which are bound to overlapping memory ranges.
Note
Render passes must include subpass dependencies (either directly or via a
subpass dependency chain) between any two subpasses that operate on the same
attachment or aliasing attachments and those subpass dependencies must
include execution and memory dependencies separating uses of the aliases, if
at least one of those subpasses writes to one of the aliases.
These dependencies must not include the VK_DEPENDENCY_BY_REGION_BIT if the
aliases are views of distinct image subresources which overlap in memory.
Multiple attachments that alias the same memory must not be used in a single subpass. A given attachment index must not be used multiple times in a single subpass, with one exception: two subpass attachments can use the same attachment index if at least one use is as an input attachment and neither use is as a resolve or preserve attachment. In other words, the same view can be used simultaneously as an input and color or depth/stencil attachment, but must not be used as multiple color or depth/stencil attachments nor as resolve or preserve attachments. The precise set of valid scenarios is described in more detail below.
If a set of attachments alias each other, then all except the first to be
used in the render pass must use an initialLayout
of
VK_IMAGE_LAYOUT_UNDEFINED
, since the earlier uses of the other aliases
make their contents undefined.
Once an alias has been used and a different alias has been used after it,
the first alias must not be used in any later subpasses.
However, an application can assign the same image view to multiple aliasing
attachment indices, which allows that image view to be used multiple times
even if other aliases are used in between.
Note
Once an attachment needs the VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT bit,
there should be no additional cost of introducing additional aliases, and
using these additional aliases may allow more efficient clearing of the
attachments on multiple uses via VK_ATTACHMENT_LOAD_OP_CLEAR.
The VkSubpassDescription
structure is defined as:
typedef struct VkSubpassDescription {
VkSubpassDescriptionFlags flags;
VkPipelineBindPoint pipelineBindPoint;
uint32_t inputAttachmentCount;
const VkAttachmentReference* pInputAttachments;
uint32_t colorAttachmentCount;
const VkAttachmentReference* pColorAttachments;
const VkAttachmentReference* pResolveAttachments;
const VkAttachmentReference* pDepthStencilAttachment;
uint32_t preserveAttachmentCount;
const uint32_t* pPreserveAttachments;
} VkSubpassDescription;
- flags is a bitmask of VkSubpassDescriptionFlagBits specifying usage of the subpass.
- pipelineBindPoint is a VkPipelineBindPoint value specifying whether this is a compute or graphics subpass. Currently, only graphics subpasses are supported.
- inputAttachmentCount is the number of input attachments.
- pInputAttachments is an array of VkAttachmentReference structures (defined below) that lists which of the render pass’s attachments can be read in the fragment shader stage during the subpass, and what layout each attachment will be in during the subpass. Each element of the array corresponds to an input attachment unit number in the shader, i.e. if the shader declares an input variable layout(input_attachment_index=X, set=Y, binding=Z) then it uses the attachment provided in pInputAttachments[X]. Input attachments must also be bound to the pipeline with a descriptor set, with the input attachment descriptor written in the location (set=Y, binding=Z). Fragment shaders can use subpass input variables to access the contents of an input attachment at the fragment’s (x, y, layer) framebuffer coordinates.
- colorAttachmentCount is the number of color attachments.
- pColorAttachments is an array of colorAttachmentCount VkAttachmentReference structures that lists which of the render pass’s attachments will be used as color attachments in the subpass, and what layout each attachment will be in during the subpass. Each element of the array corresponds to a fragment shader output location, i.e. if the shader declared an output variable layout(location=X) then it uses the attachment provided in pColorAttachments[X].
- pResolveAttachments is NULL or an array of colorAttachmentCount VkAttachmentReference structures that lists which of the render pass’s attachments are resolved to at the end of the subpass, and what layout each attachment will be in during the multisample resolve operation. If pResolveAttachments is not NULL, each of its elements corresponds to a color attachment (the element in pColorAttachments at the same index), and a multisample resolve operation is defined for each attachment. At the end of each subpass, multisample resolve operations read the subpass’s color attachments, and resolve the samples for each pixel to the same pixel location in the corresponding resolve attachments, unless the resolve attachment index is VK_ATTACHMENT_UNUSED. If the first use of an attachment in a render pass is as a resolve attachment, then the loadOp is effectively ignored as the resolve is guaranteed to overwrite all pixels in the render area.
- pDepthStencilAttachment is a pointer to a VkAttachmentReference specifying which attachment will be used for depth/stencil data and the layout it will be in during the subpass. Setting the attachment index to VK_ATTACHMENT_UNUSED or leaving this pointer as NULL indicates that no depth/stencil attachment will be used in the subpass.
- preserveAttachmentCount is the number of preserved attachments.
- pPreserveAttachments is an array of preserveAttachmentCount render pass attachment indices describing the attachments that are not used by a subpass, but whose contents must be preserved throughout the subpass.
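As a purely informative sketch, the following describes a hypothetical graphics subpass with one color attachment (render pass attachment 0) and one depth/stencil attachment (render pass attachment 1), and no input, resolve, or preserve attachments:

// Hypothetical subpass: one color attachment (index 0) and a depth/stencil attachment (index 1).
VkAttachmentReference colorRef = {
    .attachment = 0,
    .layout     = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
};
VkAttachmentReference depthRef = {
    .attachment = 1,
    .layout     = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};

VkSubpassDescription subpass = {
    .flags                   = 0,
    .pipelineBindPoint       = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .inputAttachmentCount    = 0,
    .pInputAttachments       = NULL,
    .colorAttachmentCount    = 1,
    .pColorAttachments       = &colorRef,      // fragment output location 0
    .pResolveAttachments     = NULL,           // no multisample resolve
    .pDepthStencilAttachment = &depthRef,
    .preserveAttachmentCount = 0,
    .pPreserveAttachments    = NULL,
};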
The contents of an attachment within the render area become undefined at the start of a subpass S if all of the following conditions are true:
- The attachment is used as a color, depth/stencil, or resolve attachment in any subpass in the render pass.
- There is a subpass S1 that uses or preserves the attachment, and a subpass dependency from S1 to S.
- The attachment is not used or preserved in subpass S.
Once the contents of an attachment become undefined in subpass S, they remain undefined for subpasses in subpass dependency chains starting with subpass S until they are written again. However, they remain valid for subpasses in other subpass dependency chains starting with subpass S1 if those subpasses use or preserve the attachment.
Bits which can be set in VkSubpassDescription::flags
,
specifying usage of the subpass, are:
typedef enum VkSubpassDescriptionFlagBits {
VK_SUBPASS_DESCRIPTION_PER_VIEW_ATTRIBUTES_BIT_NVX = 0x00000001,
VK_SUBPASS_DESCRIPTION_PER_VIEW_POSITION_X_ONLY_BIT_NVX = 0x00000002,
} VkSubpassDescriptionFlagBits;
- VK_SUBPASS_DESCRIPTION_PER_VIEW_ATTRIBUTES_BIT_NVX specifies that shaders compiled for this subpass write the attributes for all views in a single invocation of each vertex processing stage. All pipelines compiled against a subpass that includes this bit must write per-view attributes to the *PerViewNV[] shader outputs, in addition to the non-per-view (e.g. Position) outputs.
- VK_SUBPASS_DESCRIPTION_PER_VIEW_POSITION_X_ONLY_BIT_NVX specifies that shaders compiled for this subpass use per-view positions which only differ in value in the x component. Per-view viewport mask can also be used.
typedef VkFlags VkSubpassDescriptionFlags;
VkSubpassDescriptionFlags
is a bitmask type for setting a mask of zero
or more VkSubpassDescriptionFlagBits.
The VkAttachmentReference
structure is defined as:
typedef struct VkAttachmentReference {
uint32_t attachment;
VkImageLayout layout;
} VkAttachmentReference;
- attachment is the index of the attachment of the render pass, and corresponds to the index of the corresponding element in the pAttachments array of the VkRenderPassCreateInfo structure. If any color or depth/stencil attachments are VK_ATTACHMENT_UNUSED, then no writes occur for those attachments.
- layout is a VkImageLayout value specifying the layout the attachment uses during the subpass.
The VkSubpassDependency
structure is defined as:
typedef struct VkSubpassDependency {
uint32_t srcSubpass;
uint32_t dstSubpass;
VkPipelineStageFlags srcStageMask;
VkPipelineStageFlags dstStageMask;
VkAccessFlags srcAccessMask;
VkAccessFlags dstAccessMask;
VkDependencyFlags dependencyFlags;
} VkSubpassDependency;
- srcSubpass is the subpass index of the first subpass in the dependency, or VK_SUBPASS_EXTERNAL.
- dstSubpass is the subpass index of the second subpass in the dependency, or VK_SUBPASS_EXTERNAL.
- srcStageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask.
- dstStageMask is a bitmask of VkPipelineStageFlagBits specifying the destination stage mask.
- srcAccessMask is a bitmask of VkAccessFlagBits specifying a source access mask.
- dstAccessMask is a bitmask of VkAccessFlagBits specifying a destination access mask.
- dependencyFlags is a bitmask of VkDependencyFlagBits.
If srcSubpass
is equal to dstSubpass
then the
VkSubpassDependency describes a
subpass
self-dependency, and only constrains the pipeline barriers allowed within
a subpass instance.
Otherwise, when a render pass instance which includes a subpass dependency
is submitted to a queue, it defines a memory dependency between the
subpasses identified by srcSubpass
and dstSubpass
.
If srcSubpass
is equal to VK_SUBPASS_EXTERNAL
, the first
synchronization scope includes
commands that occur earlier in submission
order than the vkCmdBeginRenderPass used to begin the render pass
instance.
Otherwise, the first set of commands includes all commands submitted as part
of the subpass instance identified by srcSubpass
and any load, store
or multisample resolve operations on attachments used in srcSubpass
.
In either case, the first synchronization scope is limited to operations on
the pipeline stages determined by the
source stage mask specified by
srcStageMask
.
If dstSubpass
is equal to VK_SUBPASS_EXTERNAL
, the second
synchronization scope includes
commands that occur later in submission
order than the vkCmdEndRenderPass used to end the render pass
instance.
Otherwise, the second set of commands includes all commands submitted as
part of the subpass instance identified by dstSubpass
and any load,
store or multisample resolve operations on attachments used in
dstSubpass
.
In either case, the second synchronization scope is limited to operations on
the pipeline stages determined by the
destination stage mask specified
by dstStageMask
.
The first access scope is
limited to access in the pipeline stages determined by the
source stage mask specified by
srcStageMask
.
It is also limited to access types in the source access mask specified by srcAccessMask
.
The second access scope is
limited to access in the pipeline stages determined by the
destination stage mask specified
by dstStageMask
.
It is also limited to access types in the destination access mask specified by dstAccessMask
.
The availability and visibility operations defined by a subpass dependency affect the execution of image layout transitions within the render pass.
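As a purely informative sketch, the following dependency could be used when a hypothetical subpass 1 reads, as an input attachment, an attachment that subpass 0 wrote as a color attachment:

// Hypothetical dependency: subpass 1 reads, as an input attachment, what subpass 0 wrote
// as a color attachment.
VkSubpassDependency dependency = {
    .srcSubpass      = 0,
    .dstSubpass      = 1,
    .srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
    .srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
    .dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT,   // framebuffer-local dependency
};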
Note
For non-attachment resources, the memory dependency expressed by a subpass
dependency is nearly identical to that of a VkMemoryBarrier (with matching
srcAccessMask and dstAccessMask parameters) submitted as part of a
vkCmdPipelineBarrier (with matching srcStageMask and dstStageMask
parameters), except that its scopes are limited to the identified subpasses
rather than potentially affecting everything before and after.
For attachments, however, subpass dependencies work more like a
VkImageMemoryBarrier defined similarly to the VkMemoryBarrier above, with
the queue family indices set to VK_QUEUE_FAMILY_IGNORED, and with the old
and new layouts set to the layouts the attachment uses in srcSubpass and
dstSubpass, respectively.
When multiview is enabled, the execution of the multiple views of one
subpass may not occur simultaneously or even back-to-back, and rather may
be interleaved with the execution of other subpasses.
The load and store operations apply to attachments on a per-view basis.
For example, an attachment using VK_ATTACHMENT_LOAD_OP_CLEAR
will have
each view cleared on first use, but the first use of one view may be
temporally distant from the first use of another view.
Note
A good mental model for multiview is to think of a multiview subpass as if it were a collection of individual (per-view) subpasses that are logically grouped together and described as a single multiview subpass in the API. Similarly, a multiview attachment can be thought of like several individual attachments that happen to be layers in a single image. A view-local dependency between two multiview subpasses acts like a set of one-to-one dependencies between corresponding pairs of per-view subpasses. A view-global dependency between two multiview subpasses acts like a set of N × M dependencies between all pairs of per-view subpasses in the source and destination. Thus, it is a more compact representation which also makes clear the commonality and reuse that is present between views in a subpass. This interpretation motivates the answers to questions like “when does the load op apply” - it is on the first use of each view of an attachment, as if each view were a separate attachment.
If any two subpasses of a render pass activate transform feedback to the same bound transform feedback buffers, a subpass dependency must be included (either directly or via some intermediate subpasses) between them.
editing-note: The following two alleged implicit dependencies are practically no-ops, as the operations they describe are already guaranteed by semaphores and submission order (so they’re almost entirely no-ops on their own). The only reason they exist is because it simplifies reasoning about where automatic layout transitions happen. Further rewrites of this chapter could potentially remove the need for these.
If there is no subpass dependency from VK_SUBPASS_EXTERNAL
to the
first subpass that uses an attachment, then an implicit subpass dependency
exists from VK_SUBPASS_EXTERNAL
to the first subpass it is used in.
The subpass dependency operates as if defined with the following parameters:
VkSubpassDependency implicitDependency = {
    .srcSubpass = VK_SUBPASS_EXTERNAL,
    .dstSubpass = firstSubpass, // First subpass attachment is used in
    .srcStageMask = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
    .srcAccessMask = 0,
    .dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dependencyFlags = 0,
};
Similarly, if there is no subpass dependency from the last subpass that uses
an attachment to VK_SUBPASS_EXTERNAL
, then an implicit subpass
dependency exists from the last subpass it is used in to
VK_SUBPASS_EXTERNAL
.
The subpass dependency operates as if defined with the following parameters:
VkSubpassDependency implicitDependency = {
    .srcSubpass = lastSubpass, // Last subpass attachment is used in
    .dstSubpass = VK_SUBPASS_EXTERNAL,
    .srcStageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
    .srcAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = 0,
    .dependencyFlags = 0,
};
As subpasses may overlap or execute out of order with regards to other subpasses unless a subpass dependency chain describes otherwise, the layout transitions required between subpasses cannot be known to an application. Instead, an application provides the layout that each attachment must be in at the start and end of a render pass, and the layout it must be in during each subpass it is used in. The implementation then must execute layout transitions between subpasses in order to guarantee that the images are in the layouts required by each subpass, and in the final layout at the end of the render pass.
Automatic layout transitions apply to the entire image subresource attached to the framebuffer. If the attachment view is a 2D or 2D array view of a 3D image, even if the attachment view only refers to a subset of the slices of the selected mip level of the 3D image, automatic layout transitions apply to the entire subresource referenced which is the entire mip level in this case.
Automatic layout transitions away from the layout used in a subpass
happen-after the availability operations for all dependencies with that
subpass as the srcSubpass
.
Automatic layout transitions into the layout used in a subpass happen-before
the visibility operations for all dependencies with that subpass as the
dstSubpass
.
Automatic layout transitions away from initialLayout happen-after the
availability operations for all dependencies with a srcSubpass
equal
to VK_SUBPASS_EXTERNAL
, where dstSubpass
uses the attachment
that will be transitioned.
For attachments created with VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT
,
automatic layout transitions away from initialLayout
happen-after the
availability operations for all dependencies with a srcSubpass
equal
to VK_SUBPASS_EXTERNAL
, where dstSubpass
uses any aliased
attachment.
Automatic layout transitions into finalLayout happen-before the
visibility operations for all dependencies with a dstSubpass
equal to
VK_SUBPASS_EXTERNAL
, where srcSubpass
uses the attachment that
will be transitioned.
For attachments created with VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT
,
automatic layout transitions into finalLayout
happen-before the
visibility operations for all dependencies with a dstSubpass
equal to
VK_SUBPASS_EXTERNAL
, where srcSubpass
uses any aliased
attachment.
The image layout of the depth aspect of a depth/stencil attachment referring
to an image created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
is dependent
on the last sample locations used to render to the attachment, thus
automatic layout transitions use the sample locations state specified in
VkRenderPassSampleLocationsBeginInfoEXT.
Automatic layout transitions of an attachment referring to a depth/stencil
image created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
use the
sample locations the image subresource range referenced by the attachment
was last rendered with.
If the current render pass does not use the attachment as a depth/stencil
attachment in any subpass that happens-before, the automatic layout
transition uses the sample locations state specified in the
sampleLocationsInfo
member of the element of the
VkRenderPassSampleLocationsBeginInfoEXT
::pAttachmentInitialSampleLocations
array for which the attachmentIndex
member equals the attachment index
of the attachment, if one is specified.
Otherwise, the automatic layout transition uses the sample locations state
specified in the sampleLocationsInfo
member of the element of the
VkRenderPassSampleLocationsBeginInfoEXT
::pPostSubpassSampleLocations
array for which the subpassIndex
member equals the index of the
subpass that last used the attachment as a depth/stencil attachment, if one
is specified.
If no sample locations state has been specified for an automatic layout
transition performed on an attachment referring to a depth/stencil image
created with VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT,
the contents of the depth aspect of the depth/stencil attachment become
undefined as if the layout of the attachment was transitioned from the
VK_IMAGE_LAYOUT_UNDEFINED
layout.
If two subpasses use the same attachment in different layouts, and both layouts are read-only, no subpass dependency needs to be specified between those subpasses. If an implementation treats those layouts separately, it must insert an implicit subpass dependency between those subpasses to separate the uses in each layout. The subpass dependency operates as if defined with the following parameters:
// Used for input attachments
VkPipelineStageFlags inputAttachmentStages = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
VkAccessFlags inputAttachmentAccess = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;
// Used for depth/stencil attachments
VkPipelineStageFlags depthStencilAttachmentStages = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT | VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT;
VkAccessFlags depthStencilAttachmentAccess = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT;
VkSubpassDependency implicitDependency = {
    .srcSubpass = firstSubpass,
    .dstSubpass = secondSubpass,
    .srcStageMask = inputAttachmentStages | depthStencilAttachmentStages,
    .dstStageMask = inputAttachmentStages | depthStencilAttachmentStages,
    .srcAccessMask = inputAttachmentAccess | depthStencilAttachmentAccess,
    .dstAccessMask = inputAttachmentAccess | depthStencilAttachmentAccess,
    .dependencyFlags = 0,
};
If a subpass uses the same attachment as both an input attachment and either a color attachment or a depth/stencil attachment, writes via the color or depth/stencil attachment are not automatically made visible to reads via the input attachment, causing a feedback loop, except in any of the following conditions:
- If the color components or depth/stencil components read by the input attachment are mutually exclusive with the components written by the color or depth/stencil attachments, then there is no feedback loop. This requires the graphics pipelines used by the subpass to disable writes to color components that are read as inputs via the colorWriteMask, and to disable writes to depth/stencil components that are read as inputs via depthWriteEnable or stencilTestEnable.
- If the attachment is used as an input attachment and depth/stencil attachment only, and the depth/stencil attachment is not written to.
- If a memory dependency is inserted between when the attachment is written and when it is subsequently read by later fragments. Pipeline barriers expressing a subpass self-dependency are the only way to achieve this, and one must be inserted every time a fragment will read values at a particular sample (x, y, layer, sample) coordinate, if those values have been written since the most recent pipeline barrier; or since the start of the subpass if there have been no pipeline barriers since the start of the subpass.
An attachment used as both an input attachment and a color attachment must
be in the
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
or
VK_IMAGE_LAYOUT_GENERAL
layout.
An attachment used as an input attachment and depth/stencil attachment must
be in the
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
,
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL
, or
VK_IMAGE_LAYOUT_GENERAL
layout.
An attachment must not be used as both a depth/stencil attachment and a
color attachment.
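As a purely informative sketch of the last option above, the following pipeline barrier could be recorded inside the subpass between the write and the subsequent read; it assumes the render pass declares a matching self-dependency (srcSubpass equal to dstSubpass) whose stage masks, access masks, and dependency flags are supersets of those used here, and that commandBuffer is the command buffer being recorded.

// Hypothetical self-dependency barrier recorded inside the subpass, between the draw
// that writes the color attachment and the draw that reads it as an input attachment.
VkMemoryBarrier selfDependencyBarrier = {
    .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
    .pNext         = NULL,
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
};

vkCmdPipelineBarrier(
    commandBuffer,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,   // srcStageMask
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,           // dstStageMask
    VK_DEPENDENCY_BY_REGION_BIT,                     // framebuffer-space stages: by-region
    1, &selfDependencyBarrier,                       // memory barriers
    0, NULL,                                         // buffer memory barriers
    0, NULL);                                        // image memory barriers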
A more extensible version of render pass creation is also defined below.
To create a render pass, call:
VkResult vkCreateRenderPass2KHR(
VkDevice device,
const VkRenderPassCreateInfo2KHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkRenderPass* pRenderPass);
- device is the logical device that creates the render pass.
- pCreateInfo is a pointer to an instance of the VkRenderPassCreateInfo2KHR structure that describes the parameters of the render pass.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pRenderPass points to a VkRenderPass handle in which the resulting render pass object is returned.
This command is functionally identical to vkCreateRenderPass, but
includes extensible sub-structures that include sType
and pNext
parameters, allowing them to be more easily extended.
The VkRenderPassCreateInfo2KHR
structure is defined as:
typedef struct VkRenderPassCreateInfo2KHR {
VkStructureType sType;
const void* pNext;
VkRenderPassCreateFlags flags;
uint32_t attachmentCount;
const VkAttachmentDescription2KHR* pAttachments;
uint32_t subpassCount;
const VkSubpassDescription2KHR* pSubpasses;
uint32_t dependencyCount;
const VkSubpassDependency2KHR* pDependencies;
uint32_t correlatedViewMaskCount;
const uint32_t* pCorrelatedViewMasks;
} VkRenderPassCreateInfo2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- attachmentCount is the number of attachments used by this render pass.
- pAttachments points to an array of attachmentCount VkAttachmentDescription2KHR structures describing the attachments used by the render pass.
- subpassCount is the number of subpasses to create.
- pSubpasses points to an array of subpassCount VkSubpassDescription2KHR structures describing each subpass.
- dependencyCount is the number of dependencies between pairs of subpasses.
- pDependencies points to an array of dependencyCount VkSubpassDependency2KHR structures describing dependencies between pairs of subpasses.
- correlatedViewMaskCount is the number of correlation masks.
- pCorrelatedViewMasks is an array of view masks indicating sets of views that may be more efficient to render concurrently.
Parameters defined by this structure with the same name as those in
VkRenderPassCreateInfo have the identical effect to those parameters;
the child structures are variants of those used in
VkRenderPassCreateInfo which include sType
and pNext
parameters, allowing them to be extended.
If the VkSubpassDescription2KHR::viewMask
member of any element
of pSubpasses
is not zero, multiview functionality is considered to
be enabled for this render pass.
correlatedViewMaskCount
and pCorrelatedViewMasks
have the same
effect as VkRenderPassMultiviewCreateInfo::correlationMaskCount
and VkRenderPassMultiviewCreateInfo::pCorrelationMasks
,
respectively.
The VkAttachmentDescription2KHR
structure is defined as:
typedef struct VkAttachmentDescription2KHR {
VkStructureType sType;
const void* pNext;
VkAttachmentDescriptionFlags flags;
VkFormat format;
VkSampleCountFlagBits samples;
VkAttachmentLoadOp loadOp;
VkAttachmentStoreOp storeOp;
VkAttachmentLoadOp stencilLoadOp;
VkAttachmentStoreOp stencilStoreOp;
VkImageLayout initialLayout;
VkImageLayout finalLayout;
} VkAttachmentDescription2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkAttachmentDescriptionFlagBits specifying additional properties of the attachment.
- format is a VkFormat value specifying the format of the image that will be used for the attachment.
- samples is the number of samples of the image as defined in VkSampleCountFlagBits.
- loadOp is a VkAttachmentLoadOp value specifying how the contents of color and depth components of the attachment are treated at the beginning of the subpass where it is first used.
- storeOp is a VkAttachmentStoreOp value specifying how the contents of color and depth components of the attachment are treated at the end of the subpass where it is last used.
- stencilLoadOp is a VkAttachmentLoadOp value specifying how the contents of stencil components of the attachment are treated at the beginning of the subpass where it is first used.
- stencilStoreOp is a VkAttachmentStoreOp value specifying how the contents of stencil components of the attachment are treated at the end of the last subpass where it is used.
- initialLayout is the layout the attachment image subresource will be in when a render pass instance begins.
- finalLayout is the layout the attachment image subresource will be transitioned to when a render pass instance ends.
Parameters defined by this structure with the same name as those in VkAttachmentDescription have the identical effect to those parameters.
The VkSubpassDescription2KHR
structure is defined as:
typedef struct VkSubpassDescription2KHR {
VkStructureType sType;
const void* pNext;
VkSubpassDescriptionFlags flags;
VkPipelineBindPoint pipelineBindPoint;
uint32_t viewMask;
uint32_t inputAttachmentCount;
const VkAttachmentReference2KHR* pInputAttachments;
uint32_t colorAttachmentCount;
const VkAttachmentReference2KHR* pColorAttachments;
const VkAttachmentReference2KHR* pResolveAttachments;
const VkAttachmentReference2KHR* pDepthStencilAttachment;
uint32_t preserveAttachmentCount;
const uint32_t* pPreserveAttachments;
} VkSubpassDescription2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkSubpassDescriptionFlagBits specifying usage of the subpass.
- pipelineBindPoint is a VkPipelineBindPoint value specifying the pipeline type supported for this subpass.
- viewMask is a bitfield of view indices describing which views rendering is broadcast to in this subpass, when multiview is enabled.
- inputAttachmentCount is the number of input attachments.
- pInputAttachments is an array of VkAttachmentReference2KHR structures defining the input attachments for this subpass and their layouts.
- colorAttachmentCount is the number of color attachments.
- pColorAttachments is an array of VkAttachmentReference2KHR structures defining the color attachments for this subpass and their layouts.
- pResolveAttachments is an optional array of colorAttachmentCount VkAttachmentReference2KHR structures defining the resolve attachments for this subpass and their layouts.
- pDepthStencilAttachment is a pointer to a VkAttachmentReference2KHR specifying the depth/stencil attachment for this subpass and its layout.
- preserveAttachmentCount is the number of preserved attachments.
- pPreserveAttachments is an array of preserveAttachmentCount render pass attachment indices identifying attachments that are not used by this subpass, but whose contents must be preserved throughout the subpass.
Parameters defined by this structure with the same name as those in VkSubpassDescription have the identical effect to those parameters.
viewMask
has the same effect for the described subpass as
VkRenderPassMultiviewCreateInfo::pViewMasks
has on each
corresponding subpass.
The VkAttachmentReference2KHR
structure is defined as:
typedef struct VkAttachmentReference2KHR {
VkStructureType sType;
const void* pNext;
uint32_t attachment;
VkImageLayout layout;
VkImageAspectFlags aspectMask;
} VkAttachmentReference2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- attachment is either an integer value identifying an attachment at the corresponding index in VkRenderPassCreateInfo::pAttachments, or VK_ATTACHMENT_UNUSED to signify that this attachment is not used.
- layout is a VkImageLayout value specifying the layout the attachment uses during the subpass.
- aspectMask is a mask of which aspect(s) can be accessed within the specified subpass as an input attachment.
Parameters defined by this structure with the same name as those in VkAttachmentReference have the identical effect to those parameters.
aspectMask
has the same effect for the described attachment as
VkInputAttachmentAspectReference::aspectMask
has on each
corresponding attachment.
It is ignored when this structure is used to describe anything other than an
input attachment reference.
The VkSubpassDependency2KHR
structure is defined as:
typedef struct VkSubpassDependency2KHR {
VkStructureType sType;
const void* pNext;
uint32_t srcSubpass;
uint32_t dstSubpass;
VkPipelineStageFlags srcStageMask;
VkPipelineStageFlags dstStageMask;
VkAccessFlags srcAccessMask;
VkAccessFlags dstAccessMask;
VkDependencyFlags dependencyFlags;
int32_t viewOffset;
} VkSubpassDependency2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- srcSubpass is the subpass index of the first subpass in the dependency, or VK_SUBPASS_EXTERNAL.
- dstSubpass is the subpass index of the second subpass in the dependency, or VK_SUBPASS_EXTERNAL.
- srcStageMask is a bitmask of VkPipelineStageFlagBits specifying the source stage mask.
- dstStageMask is a bitmask of VkPipelineStageFlagBits specifying the destination stage mask.
- srcAccessMask is a bitmask of VkAccessFlagBits specifying a source access mask.
- dstAccessMask is a bitmask of VkAccessFlagBits specifying a destination access mask.
- dependencyFlags is a bitmask of VkDependencyFlagBits.
- viewOffset controls which views in the source subpass the views in the destination subpass depend on.
Parameters defined by this structure with the same name as those in VkSubpassDependency have the identical effect to those parameters.
viewOffset
has the same effect for the described subpass dependency as
VkRenderPassMultiviewCreateInfo::pViewOffsets
has on each
corresponding subpass dependency.
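As a purely informative sketch, the following creates a hypothetical single-subpass render pass through vkCreateRenderPass2KHR, using one color attachment in an assumed VK_FORMAT_B8G8R8A8_UNORM format; device is assumed to be a logical device with VK_KHR_create_renderpass2 enabled, and members not listed are left zero by the designated initializers.

// Hypothetical single-subpass render pass created through the extensible KHR path.
VkAttachmentDescription2KHR colorAttachment2 = {
    .sType          = VK_STRUCTURE_TYPE_ATTACHMENT_DESCRIPTION_2_KHR,
    .format         = VK_FORMAT_B8G8R8A8_UNORM,        // assumed swapchain format
    .samples        = VK_SAMPLE_COUNT_1_BIT,
    .loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR,
    .storeOp        = VK_ATTACHMENT_STORE_OP_STORE,
    .stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
    .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
    .initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED,
    .finalLayout    = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
};

VkAttachmentReference2KHR colorRef2 = {
    .sType      = VK_STRUCTURE_TYPE_ATTACHMENT_REFERENCE_2_KHR,
    .attachment = 0,
    .layout     = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
    .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,   // ignored for non-input attachments
};

VkSubpassDescription2KHR subpass2 = {
    .sType                = VK_STRUCTURE_TYPE_SUBPASS_DESCRIPTION_2_KHR,
    .pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .viewMask             = 0,                 // multiview disabled
    .colorAttachmentCount = 1,
    .pColorAttachments    = &colorRef2,
};

VkRenderPassCreateInfo2KHR createInfo2 = {
    .sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO_2_KHR,
    .attachmentCount = 1,
    .pAttachments    = &colorAttachment2,
    .subpassCount    = 1,
    .pSubpasses      = &subpass2,
};

VkRenderPass renderPass;
VkResult result = vkCreateRenderPass2KHR(device, &createInfo2, NULL, &renderPass);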
To destroy a render pass, call:
void vkDestroyRenderPass(
VkDevice device,
VkRenderPass renderPass,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the render pass.
- renderPass is the handle of the render pass to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
7.2. Render Pass Compatibility
Framebuffers and graphics pipelines are created based on a specific render pass object. They must only be used with that render pass object, or one compatible with it.
Two attachment references are compatible if they have matching format and
sample count, or are both VK_ATTACHMENT_UNUSED
or the pointer that
would contain the reference is NULL
.
Two arrays of attachment references are compatible if all corresponding
pairs of attachments are compatible.
If the arrays are of different lengths, attachment references not present in
the smaller array are treated as VK_ATTACHMENT_UNUSED
.
Two render passes are compatible if their corresponding color, input, resolve, and depth/stencil attachment references are compatible and if they are otherwise identical except for:
- Initial and final image layout in attachment descriptions
- Load and store operations in attachment descriptions
- Image layout in attachment references
A framebuffer is compatible with a render pass if it was created using the same render pass or a compatible render pass.
7.3. Framebuffers
Render passes operate in conjunction with framebuffers. Framebuffers represent a collection of specific memory attachments that a render pass instance uses.
Framebuffers are represented by VkFramebuffer
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFramebuffer)
To create a framebuffer, call:
VkResult vkCreateFramebuffer(
VkDevice device,
const VkFramebufferCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkFramebuffer* pFramebuffer);
- device is the logical device that creates the framebuffer.
- pCreateInfo points to a VkFramebufferCreateInfo structure which describes additional information about framebuffer creation.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pFramebuffer points to a VkFramebuffer handle in which the resulting framebuffer object is returned.
The VkFramebufferCreateInfo
structure is defined as:
typedef struct VkFramebufferCreateInfo {
VkStructureType sType;
const void* pNext;
VkFramebufferCreateFlags flags;
VkRenderPass renderPass;
uint32_t attachmentCount;
const VkImageView* pAttachments;
uint32_t width;
uint32_t height;
uint32_t layers;
} VkFramebufferCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- renderPass is a render pass that defines what render passes the framebuffer will be compatible with. See Render Pass Compatibility for details.
- attachmentCount is the number of attachments.
- pAttachments is an array of VkImageView handles, each of which will be used as the corresponding attachment in a render pass instance.
- width, height and layers define the dimensions of the framebuffer. If the render pass uses multiview, then layers must be one and each attachment requires a number of layers that is greater than the maximum bit index set in the view mask in the subpasses in which it is used.
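As a purely informative sketch, the following creates a hypothetical 1280×720 framebuffer with a color and a depth/stencil attachment; colorImageView, depthImageView, renderPass, and device are assumed to have been created earlier, with the image views matching the formats and order of the render pass attachments.

// Hypothetical framebuffer whose attachment order matches the render pass it will be used with.
VkImageView fbAttachments[2] = { colorImageView, depthImageView };   // assumed to exist

VkFramebufferCreateInfo framebufferInfo = {
    .sType           = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
    .pNext           = NULL,
    .flags           = 0,
    .renderPass      = renderPass,        // render pass the framebuffer must be compatible with
    .attachmentCount = 2,
    .pAttachments    = fbAttachments,
    .width           = 1280,
    .height          = 720,
    .layers          = 1,
};

VkFramebuffer framebuffer;
VkResult result = vkCreateFramebuffer(device, &framebufferInfo, NULL, &framebuffer);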
Applications must ensure that all accesses to memory that backs image subresources used as attachments in a given renderpass instance either happen-before the load operations for those attachments, or happen-after the store operations for those attachments.
For depth/stencil attachments, each aspect can be used separately as
attachments and non-attachments as long as the non-attachment accesses are
also via an image subresource in either the
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL
layout or
the VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL
layout,
and the attachment resource uses whichever of those two layouts the image
accesses do not.
Use of non-attachment aspects in this case is only well defined if the
attachment is used in the subpass where the non-attachment access is being
made, or the layout of the image subresource is constant throughout the
entire render pass instance, including the initialLayout
and
finalLayout
.
Note
These restrictions mean that the render pass has full knowledge of all uses of all of the attachments, so that the implementation is able to make correct decisions about when and how to perform layout transitions, when to overlap execution of subpasses, etc.
It is legal for a subpass to use no color or depth/stencil attachments, and
rather use shader side effects such as image stores and atomics to produce
an output.
In this case, the subpass continues to use the width
, height
,
and layers
of the framebuffer to define the dimensions of the
rendering area, and the rasterizationSamples
from each pipeline’s
VkPipelineMultisampleStateCreateInfo to define the number of samples
used in rasterization; however, if
VkPhysicalDeviceFeatures::variableMultisampleRate
is
VK_FALSE
, then all pipelines to be bound with a given zero-attachment
subpass must have the same value for
VkPipelineMultisampleStateCreateInfo::rasterizationSamples
.
typedef VkFlags VkFramebufferCreateFlags;
VkFramebufferCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
To destroy a framebuffer, call:
void vkDestroyFramebuffer(
VkDevice device,
VkFramebuffer framebuffer,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the framebuffer.
- framebuffer is the handle of the framebuffer to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
7.4. Render Pass Commands
An application records the commands for a render pass instance one subpass at a time, by beginning a render pass instance, iterating over the subpasses to record commands for that subpass, and then ending the render pass instance.
To begin a render pass instance, call:
void vkCmdBeginRenderPass(
VkCommandBuffer commandBuffer,
const VkRenderPassBeginInfo* pRenderPassBegin,
VkSubpassContents contents);
- commandBuffer is the command buffer in which to record the command.
- pRenderPassBegin is a pointer to a VkRenderPassBeginInfo structure (defined below) which specifies the render pass to begin an instance of, and the framebuffer the instance uses.
- contents is a VkSubpassContents value specifying how the commands in the first subpass will be provided.
After beginning a render pass instance, the command buffer is ready to record the commands for the first subpass of that render pass.
Alternatively, to begin a render pass, call:
void vkCmdBeginRenderPass2KHR(
VkCommandBuffer commandBuffer,
const VkRenderPassBeginInfo* pRenderPassBegin,
const VkSubpassBeginInfoKHR* pSubpassBeginInfo);
- commandBuffer is the command buffer in which to record the command.
- pRenderPassBegin is a pointer to a VkRenderPassBeginInfo structure (defined below) which indicates the render pass to begin an instance of, and the framebuffer the instance uses.
- pSubpassBeginInfo is a pointer to a VkSubpassBeginInfoKHR structure which contains information about the subpass which is about to begin rendering.
After beginning a render pass instance, the command buffer is ready to record the commands for the first subpass of that render pass.
The VkRenderPassBeginInfo
structure is defined as:
typedef struct VkRenderPassBeginInfo {
VkStructureType sType;
const void* pNext;
VkRenderPass renderPass;
VkFramebuffer framebuffer;
VkRect2D renderArea;
uint32_t clearValueCount;
const VkClearValue* pClearValues;
} VkRenderPassBeginInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- renderPass is the render pass to begin an instance of.
- framebuffer is the framebuffer containing the attachments that are used with the render pass.
- renderArea is the render area that is affected by the render pass instance, and is described in more detail below.
- clearValueCount is the number of elements in pClearValues.
- pClearValues is an array of VkClearValue structures that contains clear values for each attachment, if the attachment uses a loadOp value of VK_ATTACHMENT_LOAD_OP_CLEAR or if the attachment has a depth/stencil format and uses a stencilLoadOp value of VK_ATTACHMENT_LOAD_OP_CLEAR. The array is indexed by attachment number. Only elements corresponding to cleared attachments are used. Other elements of pClearValues are ignored.
renderArea
is the render area that is affected by the render pass
instance.
The effects of attachment load, store and multisample resolve operations are
restricted to the pixels whose x and y coordinates fall within the render
area on all attachments.
The render area extends to all layers of framebuffer
.
The application must ensure (using scissor if necessary) that all rendering
is contained within the render area.
The render area must be contained within the framebuffer dimensions.
When multiview is enabled, the resolve operation at the end of a subpass applies to all views in the view mask.
Note
There may be a performance cost for using a render area smaller than the framebuffer, unless it matches the render area granularity for the render pass.
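As a purely informative sketch, the following begins a render pass instance over a hypothetical 1280×720 framebuffer, providing clear values for attachment 0 (color) and attachment 1 (depth/stencil); commandBuffer, renderPass, and framebuffer are assumed to have been created earlier.

// Hypothetical render pass instance covering the whole 1280x720 framebuffer, clearing
// attachment 0 (color) and attachment 1 (depth/stencil).
VkClearValue clearValues[2];
clearValues[0].color.float32[0] = 0.0f;   // R
clearValues[0].color.float32[1] = 0.0f;   // G
clearValues[0].color.float32[2] = 0.0f;   // B
clearValues[0].color.float32[3] = 1.0f;   // A
clearValues[1].depthStencil.depth   = 1.0f;
clearValues[1].depthStencil.stencil = 0;

VkRenderPassBeginInfo beginInfo = {
    .sType           = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
    .pNext           = NULL,
    .renderPass      = renderPass,
    .framebuffer     = framebuffer,
    .renderArea      = { .offset = { 0, 0 }, .extent = { 1280, 720 } },
    .clearValueCount = 2,
    .pClearValues    = clearValues,
};

vkCmdBeginRenderPass(commandBuffer, &beginInfo, VK_SUBPASS_CONTENTS_INLINE);
// ... record draw commands for the first subpass ...
vkCmdEndRenderPass(commandBuffer);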
The image layout of the depth aspect of a depth/stencil attachment referring
to an image created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
is dependent
on the last sample locations used to render to the image subresource, thus
preserving the contents of such depth/stencil attachments across subpass
boundaries requires the application to specify these sample locations
whenever a layout transition of the attachment may occur.
This information can be provided by chaining an instance of the
VkRenderPassSampleLocationsBeginInfoEXT
structure to the pNext
chain of VkRenderPassBeginInfo
.
The VkRenderPassSampleLocationsBeginInfoEXT
structure is defined as:
typedef struct VkRenderPassSampleLocationsBeginInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t attachmentInitialSampleLocationsCount;
const VkAttachmentSampleLocationsEXT* pAttachmentInitialSampleLocations;
uint32_t postSubpassSampleLocationsCount;
const VkSubpassSampleLocationsEXT* pPostSubpassSampleLocations;
} VkRenderPassSampleLocationsBeginInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- attachmentInitialSampleLocationsCount is the number of elements in the pAttachmentInitialSampleLocations array.
- pAttachmentInitialSampleLocations is an array of attachmentInitialSampleLocationsCount VkAttachmentSampleLocationsEXT structures specifying the attachment indices and their corresponding sample location state. Each element of pAttachmentInitialSampleLocations can specify the sample location state to use in the automatic layout transition performed to transition a depth/stencil attachment from the initial layout of the attachment to the image layout specified for the attachment in the first subpass using it.
- postSubpassSampleLocationsCount is the number of elements in the pPostSubpassSampleLocations array.
- pPostSubpassSampleLocations is an array of postSubpassSampleLocationsCount VkSubpassSampleLocationsEXT structures specifying the subpass indices and their corresponding sample location state. Each element of pPostSubpassSampleLocations can specify the sample location state to use in the automatic layout transition performed to transition the depth/stencil attachment used by the specified subpass to the image layout specified in a dependent subpass or to the final layout of the attachment in case the specified subpass is the last subpass using that attachment. In addition, if VkPhysicalDeviceSampleLocationsPropertiesEXT::variableSampleLocations is VK_FALSE, each element of pPostSubpassSampleLocations must specify the sample location state that matches the sample locations used by all pipelines that will be bound to a command buffer during the specified subpass. If variableSampleLocations is VK_TRUE, the sample locations used for rasterization do not depend on pPostSubpassSampleLocations.
The VkAttachmentSampleLocationsEXT
structure is defined as:
typedef struct VkAttachmentSampleLocationsEXT {
uint32_t attachmentIndex;
VkSampleLocationsInfoEXT sampleLocationsInfo;
} VkAttachmentSampleLocationsEXT;
- attachmentIndex is the index of the attachment for which the sample locations state is provided.
- sampleLocationsInfo is the sample locations state to use for the layout transition of the given attachment from the initial layout of the attachment to the image layout specified for the attachment in the first subpass using it.
If the image referenced by the framebuffer attachment at index
attachmentIndex
was not created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
then the
values specified in sampleLocationsInfo
are ignored.
The VkSubpassSampleLocationsEXT
structure is defined as:
typedef struct VkSubpassSampleLocationsEXT {
uint32_t subpassIndex;
VkSampleLocationsInfoEXT sampleLocationsInfo;
} VkSubpassSampleLocationsEXT;
- subpassIndex is the index of the subpass for which the sample locations state is provided.
- sampleLocationsInfo is the sample locations state to use for the layout transition of the depth/stencil attachment away from the image layout the attachment is used with in the subpass specified in subpassIndex.
If the image referenced by the depth/stencil attachment used in the subpass
identified by subpassIndex
was not created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
or if the
subpass does not use a depth/stencil attachment, and
VkPhysicalDeviceSampleLocationsPropertiesEXT::variableSampleLocations
is VK_TRUE
then the values specified in sampleLocationsInfo
are
ignored.
The VkSubpassBeginInfoKHR
structure is defined as:
typedef struct VkSubpassBeginInfoKHR {
VkStructureType sType;
const void* pNext;
VkSubpassContents contents;
} VkSubpassBeginInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- contents is a VkSubpassContents value specifying how the commands in the next subpass will be provided.
Possible values of vkCmdBeginRenderPass::contents, specifying how the
commands in the first subpass will be provided, are:
typedef enum VkSubpassContents {
VK_SUBPASS_CONTENTS_INLINE = 0,
VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS = 1,
} VkSubpassContents;
- VK_SUBPASS_CONTENTS_INLINE specifies that the contents of the subpass will be recorded inline in the primary command buffer, and secondary command buffers must not be executed within the subpass.
- VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS specifies that the contents are recorded in secondary command buffers that will be called from the primary command buffer, and vkCmdExecuteCommands is the only valid command on the command buffer until vkCmdNextSubpass or vkCmdEndRenderPass.
If the pNext
chain of VkRenderPassBeginInfo includes a
VkDeviceGroupRenderPassBeginInfo
structure, then that structure
includes a device mask and set of render areas for the render pass instance.
The VkDeviceGroupRenderPassBeginInfo
structure is defined as:
typedef struct VkDeviceGroupRenderPassBeginInfo {
VkStructureType sType;
const void* pNext;
uint32_t deviceMask;
uint32_t deviceRenderAreaCount;
const VkRect2D* pDeviceRenderAreas;
} VkDeviceGroupRenderPassBeginInfo;
or the equivalent
typedef VkDeviceGroupRenderPassBeginInfo VkDeviceGroupRenderPassBeginInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- deviceMask is the device mask for the render pass instance.
- deviceRenderAreaCount is the number of elements in the pDeviceRenderAreas array.
- pDeviceRenderAreas is an array of structures of type VkRect2D defining the render area for each physical device.
The deviceMask
serves several purposes.
It is an upper bound on the set of physical devices that can be used during
the render pass instance, and the initial device mask when the render pass
instance begins.
In addition, commands transitioning to the next subpass in the render pass
instance and commands ending the render pass instance, and, accordingly,
render pass attachment load, store, and resolve operations and subpass
dependencies corresponding to the render pass instance, are executed on the
physical devices included in the device mask provided here.
If deviceRenderAreaCount
is not zero, then the elements of
pDeviceRenderAreas
override the value of
VkRenderPassBeginInfo::renderArea
, and provide a render area
specific to each physical device.
These render areas serve the same purpose as
VkRenderPassBeginInfo::renderArea
, including controlling the
region of attachments that are cleared by VK_ATTACHMENT_LOAD_OP_CLEAR
and that are resolved into resolve attachments.
If this structure is not present, the render pass instance’s device mask is
the value of VkDeviceGroupCommandBufferBeginInfo::deviceMask
.
If this structure is not present or if deviceRenderAreaCount
is zero,
VkRenderPassBeginInfo::renderArea
is used for all physical
devices.
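As a sketch only, the following assumes a device group containing two physical devices and splits a width x height framebuffer between them; the device mask, the split, and the renderPass, framebuffer, width and height names are illustrative assumptions.

/* Hypothetical split: physical device 0 renders the left half of the
 * framebuffer and physical device 1 the right half. */
const VkRect2D deviceRenderAreas[2] = {
    { { 0, 0 },                    { width / 2, height } },
    { { (int32_t)(width / 2), 0 }, { width - width / 2, height } },
};

const VkDeviceGroupRenderPassBeginInfo deviceGroupBeginInfo = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_GROUP_RENDER_PASS_BEGIN_INFO,
    .deviceMask = 0x3,              /* physical devices 0 and 1 */
    .deviceRenderAreaCount = 2,
    .pDeviceRenderAreas = deviceRenderAreas,
};

const VkRenderPassBeginInfo renderPassBeginInfo = {
    .sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
    .pNext = &deviceGroupBeginInfo,
    .renderPass = renderPass,       /* assumed to exist */
    .framebuffer = framebuffer,     /* assumed to exist */
    /* renderArea is ignored here because deviceRenderAreaCount is not zero */
    .renderArea = { { 0, 0 }, { width, height } },
};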
To query the render area granularity, call:
void vkGetRenderAreaGranularity(
VkDevice device,
VkRenderPass renderPass,
VkExtent2D* pGranularity);
- device is the logical device that owns the render pass.
- renderPass is a handle to a render pass.
- pGranularity points to a VkExtent2D structure in which the granularity is returned.
The conditions leading to an optimal renderArea are:
- the offset.x member in renderArea is a multiple of the width member of the returned VkExtent2D (the horizontal granularity).
- the offset.y member in renderArea is a multiple of the height member of the returned VkExtent2D (the vertical granularity).
- either the extent.width member in renderArea is a multiple of the horizontal granularity or offset.x + extent.width is equal to the width of the framebuffer in the VkRenderPassBeginInfo.
- either the extent.height member in renderArea is a multiple of the vertical granularity or offset.y + extent.height is equal to the height of the framebuffer in the VkRenderPassBeginInfo.
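A possible way to honor these conditions, shown only as a sketch: query the granularity and snap an arbitrary rectangle to it, clamping against the framebuffer dimensions. The initial rectangle and the fbWidth and fbHeight names are assumptions of the example.

VkExtent2D granularity;
vkGetRenderAreaGranularity(device, renderPass, &granularity);

/* Hypothetical desired render area. */
VkRect2D renderArea = { { 37, 21 }, { 300, 200 } };

/* Align the offset down and the far edge up to the granularity, clamping to
 * the framebuffer dimensions. */
uint32_t x0 = (uint32_t)renderArea.offset.x / granularity.width  * granularity.width;
uint32_t y0 = (uint32_t)renderArea.offset.y / granularity.height * granularity.height;
uint32_t x1 = (uint32_t)renderArea.offset.x + renderArea.extent.width;
uint32_t y1 = (uint32_t)renderArea.offset.y + renderArea.extent.height;
x1 = ((x1 + granularity.width  - 1) / granularity.width)  * granularity.width;
y1 = ((y1 + granularity.height - 1) / granularity.height) * granularity.height;
if (x1 > fbWidth)  x1 = fbWidth;
if (y1 > fbHeight) y1 = fbHeight;

renderArea.offset.x = (int32_t)x0;
renderArea.offset.y = (int32_t)y0;
renderArea.extent.width  = x1 - x0;
renderArea.extent.height = y1 - y0;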
Subpass dependencies are not affected by the render area, and apply to the entire image subresources attached to the framebuffer as specified in the description of automatic layout transitions. Similarly, pipeline barriers are valid even if their effect extends outside the render area.
To transition to the next subpass in the render pass instance after recording the commands for a subpass, call:
void vkCmdNextSubpass(
VkCommandBuffer commandBuffer,
VkSubpassContents contents);
- commandBuffer is the command buffer in which to record the command.
- contents specifies how the commands in the next subpass will be provided, in the same fashion as the corresponding parameter of vkCmdBeginRenderPass.
The subpass index for a render pass begins at zero when
vkCmdBeginRenderPass
is recorded, and increments each time
vkCmdNextSubpass
is recorded.
Moving to the next subpass automatically performs any multisample resolve
operations in the subpass being ended.
End-of-subpass multisample resolves are treated as color attachment writes
for the purposes of synchronization.
That is, they are considered to execute in the
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
pipeline stage and their
writes are synchronized with VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT
.
Synchronization between rendering within a subpass and any resolve
operations at the end of the subpass occurs automatically, without need for
explicit dependencies or pipeline barriers.
However, if the resolve attachment is also used in a different subpass, an
explicit dependency is needed.
After transitioning to the next subpass, the application can record the commands for that subpass.
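Putting the pieces together, a render pass instance with two subpasses recorded inline might look like the following sketch; commandBuffer and renderPassBeginInfo are assumed to have been prepared as described above.

/* Record a two-subpass render pass instance with all commands inline. */
vkCmdBeginRenderPass(commandBuffer, &renderPassBeginInfo,
                     VK_SUBPASS_CONTENTS_INLINE);

/* ... bind pipelines/descriptor sets and draw for subpass 0 ... */

vkCmdNextSubpass(commandBuffer, VK_SUBPASS_CONTENTS_INLINE);

/* ... bind pipelines/descriptor sets and draw for subpass 1 ... */

vkCmdEndRenderPass(commandBuffer);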
To transition to the next subpass in the render pass instance after recording the commands for a subpass, call:
void vkCmdNextSubpass2KHR(
VkCommandBuffer commandBuffer,
const VkSubpassBeginInfoKHR* pSubpassBeginInfo,
const VkSubpassEndInfoKHR* pSubpassEndInfo);
- commandBuffer is the command buffer in which to record the command.
- pSubpassBeginInfo is a pointer to a VkSubpassBeginInfoKHR structure which contains information about the subpass which is about to begin rendering.
- pSubpassEndInfo is a pointer to a VkSubpassEndInfoKHR structure which contains information about how the previous subpass will be ended.
vkCmdNextSubpass2KHR
is semantically identical to
vkCmdNextSubpass, except that it is extensible, and that
contents
is provided as part of an extensible structure instead of as
a flat parameter.
To record a command to end a render pass instance after recording the commands for the last subpass, call:
void vkCmdEndRenderPass(
VkCommandBuffer commandBuffer);
- commandBuffer is the command buffer in which to end the current render pass instance.
Ending a render pass instance performs any multisample resolve operations on the final subpass.
To record a command to end a render pass instance after recording the commands for the last subpass, call:
void vkCmdEndRenderPass2KHR(
VkCommandBuffer commandBuffer,
const VkSubpassEndInfoKHR* pSubpassEndInfo);
- commandBuffer is the command buffer in which to end the current render pass instance.
- pSubpassEndInfo is a pointer to a VkSubpassEndInfoKHR structure which contains information about how the previous subpass will be ended.
vkCmdEndRenderPass2KHR
is semantically identical to
vkCmdEndRenderPass, except that it is extensible.
The VkSubpassEndInfoKHR
structure is defined as:
typedef struct VkSubpassEndInfoKHR {
VkStructureType sType;
const void* pNext;
} VkSubpassEndInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
8. Shaders
A shader specifies programmable operations that execute for each vertex, control point, tessellated vertex, primitive, fragment, or workgroup in the corresponding stage(s) of the graphics and compute pipelines.
Graphics pipelines include vertex shader execution as a result of primitive assembly, followed, if enabled, by tessellation control and evaluation shaders operating on patches, geometry shaders, if enabled, operating on primitives, and fragment shaders, if present, operating on fragments generated by Rasterization. In this specification, vertex, tessellation control, tessellation evaluation and geometry shaders are collectively referred to as vertex processing stages and occur in the logical pipeline before rasterization. The fragment shader occurs logically after rasterization.
Only the compute shader stage is included in a compute pipeline. Compute shaders operate on compute invocations in a workgroup.
Shaders can read from input variables, and read from and write to output variables. Input and output variables can be used to transfer data between shader stages, or to allow the shader to interact with values that exist in the execution environment. Similarly, the execution environment provides constants that describe capabilities.
Shader variables are associated with execution environment-provided inputs and outputs using built-in decorations in the shader. The available decorations for each stage are documented in the following subsections.
8.1. Shader Modules
Shader modules contain shader code and one or more entry points. Shaders are selected from a shader module by specifying an entry point as part of pipeline creation. The stages of a pipeline can use shaders that come from different modules. The shader code defining a shader module must be in the SPIR-V format, as described by the Vulkan Environment for SPIR-V appendix.
Shader modules are represented by VkShaderModule
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkShaderModule)
To create a shader module, call:
VkResult vkCreateShaderModule(
VkDevice device,
const VkShaderModuleCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkShaderModule* pShaderModule);
- device is the logical device that creates the shader module.
- pCreateInfo is a pointer to an instance of the VkShaderModuleCreateInfo structure.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pShaderModule points to a VkShaderModule handle in which the resulting shader module object is returned.
Once a shader module has been created, any entry points it contains can be used in pipeline shader stages as described in Compute Pipelines and Graphics Pipelines.
If the shader stage fails to compile, VK_ERROR_INVALID_SHADER_NV will be
generated and the compile log will be reported back to the application by
VK_EXT_debug_report if enabled.
The VkShaderModuleCreateInfo
structure is defined as:
typedef struct VkShaderModuleCreateInfo {
VkStructureType sType;
const void* pNext;
VkShaderModuleCreateFlags flags;
size_t codeSize;
const uint32_t* pCode;
} VkShaderModuleCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- codeSize is the size, in bytes, of the code pointed to by pCode.
- pCode points to code that is used to create the shader module. The type and format of the code is determined from the content of the memory addressed by pCode.
typedef VkFlags VkShaderModuleCreateFlags;
VkShaderModuleCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
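For illustration, a minimal sketch of shader module creation, assuming spirvWords points to a SPIR-V binary of spirvSizeInBytes bytes (a multiple of 4) loaded by the application:

const VkShaderModuleCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
    .codeSize = spirvSizeInBytes,   /* size in bytes, not words */
    .pCode = spirvWords,            /* const uint32_t* */
};

VkShaderModule shaderModule;
VkResult result = vkCreateShaderModule(device, &createInfo, NULL, &shaderModule);
if (result != VK_SUCCESS) {
    /* handle the error */
}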
To use a VkValidationCacheEXT to cache shader validation results, add
a VkShaderModuleValidationCacheCreateInfoEXT to the pNext
chain
of the VkShaderModuleCreateInfo structure, specifying the cache object
to use.
The VkShaderModuleValidationCacheCreateInfoEXT
struct is defined as:
typedef struct VkShaderModuleValidationCacheCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkValidationCacheEXT validationCache;
} VkShaderModuleValidationCacheCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- validationCache is the validation cache object from which the results of prior validation attempts will be fetched, and to which new validation results for this VkShaderModule will be written (if not already present).
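A sketch of how the cache might be chained during shader module creation; validationCache, spirvWords and spirvSizeInBytes are assumed to exist already (see the Validation Cache section and the previous example):

const VkShaderModuleValidationCacheCreateInfoEXT validationCacheInfo = {
    .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_VALIDATION_CACHE_CREATE_INFO_EXT,
    .validationCache = validationCache,
};

const VkShaderModuleCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
    .pNext = &validationCacheInfo,  /* chain the cache into the create info */
    .codeSize = spirvSizeInBytes,
    .pCode = spirvWords,
};

VkShaderModule shaderModule;
VkResult result = vkCreateShaderModule(device, &createInfo, NULL, &shaderModule);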
To destroy a shader module, call:
void vkDestroyShaderModule(
VkDevice device,
VkShaderModule shaderModule,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the shader module.
- shaderModule is the handle of the shader module to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
A shader module can be destroyed while pipelines created using its shaders are still in use.
8.2. Shader Execution
At each stage of the pipeline, multiple invocations of a shader may execute simultaneously. Further, invocations of a single shader produced as the result of different commands may execute simultaneously. The relative execution order of invocations of the same shader type is undefined. Shader invocations may complete in a different order than that in which the primitives they originated from were drawn or dispatched by the application. However, fragment shader outputs are written to attachments in rasterization order.
The relative execution order of invocations of different shader types is largely undefined. However, when invoking a shader whose inputs are generated from a previous pipeline stage, the shader invocations from the previous stage are guaranteed to have executed far enough to generate input values for all required inputs.
8.3. Shader Memory Access Ordering
The order in which image or buffer memory is read or written by shaders is largely undefined. For some shader types (vertex, tessellation evaluation, and in some cases, fragment), even the number of shader invocations that may perform loads and stores is undefined.
In particular, the following rules apply:
- Vertex and tessellation evaluation shaders will be invoked at least once for each unique vertex, as defined in those sections.
- Fragment shaders will be invoked zero or more times, as defined in that section.
- The relative execution order of invocations of the same shader type is undefined. A store issued by a shader when working on primitive B might complete prior to a store for primitive A, even if primitive A is specified prior to primitive B. This applies even to fragment shaders; while fragment shader outputs are always written to the framebuffer in rasterization order, stores executed by fragment shader invocations are not.
- The relative execution order of invocations of different shader types is largely undefined.
Note
The above limitations on shader invocation order make some forms of synchronization between shader invocations within a single set of primitives unimplementable. For example, having one invocation poll memory written by another invocation assumes that the other invocation has been launched and will complete its writes in finite time.
The Memory Model appendix defines the terminology and rules for how to correctly communicate between shader invocations, such as when a write is Visible-To a read, and what constitutes a Data Race.
Applications must not cause a data race.
8.4. Shader Inputs and Outputs
Data is passed into and out of shaders using variables with input or output
storage class, respectively.
User-defined inputs and outputs are connected between stages by matching
their Location
decorations.
Additionally, data can be provided by or communicated to special functions
provided by the execution environment using BuiltIn
decorations.
In many cases, the same BuiltIn
decoration can be used in multiple
shader stages with similar meaning.
The specific behavior of variables decorated as BuiltIn
is documented
in the following sections.
8.5. Task Shaders
Task shaders operate in conjunction with mesh shaders to produce a collection of primitives that will be processed by subsequent stages of the graphics pipeline. Their primary purpose is to create a variable number of subsequent mesh shader invocations.
Task shaders are invoked via the execution of the programmable mesh shading pipeline.
The task shader has no fixed-function inputs other than variables identifying the specific workgroup and invocation. The only fixed output of the task shader is a task count, identifying the number of mesh shader workgroups to create. The task shader can write additional outputs to task memory, which can be read by all of the mesh shader workgroups it created.
8.5.1. Task Shader Execution
Task workloads are formed from groups of work items called workgroups and
processed by the task shader in the current graphics pipeline.
A workgroup is a collection of shader invocations that execute the same
shader, potentially in parallel.
Task shaders execute in global workgroups which are divided into a number
of local workgroups with a size that can be set by assigning a value to
the LocalSize
execution mode or via an object decorated by the
WorkgroupSize
decoration.
An invocation within a local workgroup can share data with other members of
the local workgroup through shared variables and issue memory and control
flow barriers to synchronize with other members of the local workgroup.
8.6. Mesh Shaders
Mesh shaders operate in workgroups to produce a collection of primitives that will be processed by subsequent stages of the graphics pipeline. Each workgroup emits zero or more output primitives and the group of vertices and their associated data required for each output primitive.
Mesh shaders are invoked via the execution of the programmable mesh shading pipeline.
The only inputs available to the mesh shader are variables identifying the specific workgroup and invocation and, if applicable, any outputs written to task memory by the task shader that spawned the mesh shader’s workgroup. The mesh shader can operate without a task shader as well.
The invocations of the mesh shader workgroup write an output mesh, comprising a set of primitives with per-primitive attributes, a set of vertices with per-vertex attributes, and an array of indices identifying the mesh vertices that belong to each primitive. The primitives of this mesh are then processed by subsequent graphics pipeline stages, where the outputs of the mesh shader form an interface with the fragment shader.
8.6.1. Mesh Shader Execution
Mesh workloads are formed from groups of work items called workgroups and
processed by the mesh shader in the current graphics pipeline.
A workgroup is a collection of shader invocations that execute the same
shader, potentially in parallel.
Mesh shaders execute in global workgroups which are divided into a number
of local workgroups with a size that can be set by assigning a value to
the LocalSize
execution mode or via an object decorated by the
WorkgroupSize
decoration.
An invocation within a local workgroup can share data with other members of
the local workgroup through shared variables and issue memory and control
flow barriers to synchronize with other members of the local workgroup.
The global workgroups may be generated explicitly via the API, or implicitly through the task shader’s work creation mechanism.
8.7. Vertex Shaders
Each vertex shader invocation operates on one vertex and its associated vertex attribute data, and outputs one vertex and associated data. Graphics pipelines using primitive shading must include a vertex shader, and the vertex shader stage is always the first shader stage in the graphics pipeline.
8.7.1. Vertex Shader Execution
A vertex shader must be executed at least once for each vertex specified by a draw command. If the subpass includes multiple views in its view mask, the shader may be invoked separately for each view. During execution, the shader is presented with the index of the vertex and instance for which it has been invoked. Input variables declared in the vertex shader are filled by the implementation with the values of vertex attributes associated with the invocation being executed.
If the same vertex is specified multiple times in a draw command (e.g. by including the same index value multiple times in an index buffer) the implementation may reuse the results of vertex shading if it can statically determine that the vertex shader invocations will produce identical results.
Note
It is implementation-dependent when and if results of vertex shading are
reused, and thus how many times the vertex shader will be executed.
This is true also if the vertex shader contains stores or atomic operations
(see Shader Memory Access Ordering).
8.8. Tessellation Control Shaders
The tessellation control shader is used to read an input patch provided by
the application and to produce an output patch.
Each tessellation control shader invocation operates on an input patch
(after all control points in the patch are processed by a vertex shader) and
its associated data, and outputs a single control point of the output patch
and its associated data, and can also output additional per-patch data.
The input patch is sized according to the patchControlPoints
member of
VkPipelineTessellationStateCreateInfo, as part of input assembly.
The size of the output patch is controlled by the OpExecutionMode
OutputVertices
specified in the tessellation control or tessellation
evaluation shaders, which must be specified in at least one of the shaders.
The size of the input and output patches must each be greater than zero and
less than or equal to
VkPhysicalDeviceLimits
::maxTessellationPatchSize
.
8.8.1. Tessellation Control Shader Execution
A tessellation control shader is invoked at least once for each output vertex in a patch. If the subpass includes multiple views in its view mask, the shader may be invoked separately for each view.
Inputs to the tessellation control shader are generated by the vertex
shader.
Each invocation of the tessellation control shader can read the attributes
of any incoming vertices and their associated data.
The invocations corresponding to a given patch execute logically in
parallel, with undefined relative execution order.
However, the OpControlBarrier
instruction can be used to provide
limited control of the execution order by synchronizing invocations within a
patch, effectively dividing tessellation control shader execution into a set
of phases.
Tessellation control shaders will read undefined values if one invocation
reads a per-vertex or per-patch attribute written by another invocation at
any point during the same phase, or if two invocations attempt to write
different values to the same per-patch output in a single phase.
8.9. Tessellation Evaluation Shaders
The Tessellation Evaluation Shader operates on an input patch of control points and their associated data, and a single input barycentric coordinate indicating the invocation’s relative position within the subdivided patch, and outputs a single vertex and its associated data.
8.9.1. Tessellation Evaluation Shader Execution
A tessellation evaluation shader is invoked at least once for each unique vertex generated by the tessellator. If the subpass includes multiple views in its view mask, the shader may be invoked separately for each view.
8.10. Geometry Shaders
The geometry shader operates on a group of vertices and their associated data assembled from a single input primitive, and emits zero or more output primitives and the group of vertices and their associated data required for each output primitive.
8.10.1. Geometry Shader Execution
A geometry shader is invoked at least once for each primitive produced by the tessellation stages, or at least once for each primitive generated by primitive assembly when tessellation is not in use. A shader can request that the geometry shader runs multiple instances. A geometry shader is invoked at least once for each instance. If the subpass includes multiple views in its view mask, the shader may be invoked separately for each view.
8.11. Fragment Shaders
Fragment shaders are invoked as the result of rasterization in a graphics pipeline. Each fragment shader invocation operates on a single fragment and its associated data. With few exceptions, fragment shaders do not have access to any data associated with other fragments and are considered to execute in isolation of fragment shader invocations associated with other fragments.
8.11.1. Fragment Shader Execution
For each fragment generated by rasterization, a fragment shader may be invoked. A fragment shader must not be invoked if the Early Per-Fragment Tests cause it to have no coverage. If the subpass includes multiple views in its view mask, the shader may be invoked separately for each view.
Furthermore, if it is determined that a fragment generated as the result of rasterizing a first primitive will have its outputs entirely overwritten by a fragment generated as the result of rasterizing a second primitive in the same subpass, and the fragment shader used for the fragment has no other side effects, then the fragment shader may not be executed for the fragment from the first primitive.
Relative ordering of execution of different fragment shader invocations is not defined.
For each fragment generated by a primitive, the number of times the fragment shader is invoked is implementation-dependent, but must obey the following constraints:
- Each covered sample is included in a single fragment shader invocation.
- When sample shading is not enabled, there is at least one fragment shader invocation.
- When sample shading is enabled, the minimum number of fragment shader invocations is as defined in Shading Rate Image and Sample Shading.
When there is more than one fragment shader invocation per fragment, the association of samples to invocations is implementation-dependent.
In addition to the conditions outlined above for the invocation of a fragment shader, a fragment shader invocation may be produced as a helper invocation. A helper invocation is a fragment shader invocation that is created solely for the purposes of evaluating derivatives for use in non-helper fragment shader invocations. Stores and atomics performed by helper invocations must not have any effect on memory, and values returned by atomic instructions in helper invocations are undefined.
If the render pass has a fragment density map attachment, more than one
fragment shader invocation may be invoked for each covered sample.
Stores and atomics performed by these additional invocations have the normal
effect.
Such additional invocations are only produced if
VkPhysicalDeviceFragmentDensityMapPropertiesEXT
::fragmentDensityInvocations
is VK_TRUE
.
Note
Implementations may generate these additional fragment shader invocations in order to make transitions between fragment areas with different fragment densities more smooth.
8.11.2. Early Fragment Tests
An explicit control is provided to allow fragment shaders to enable early
fragment tests.
If the fragment shader specifies the EarlyFragmentTests
OpExecutionMode
, the per-fragment tests described in
Early Fragment Test Mode are performed prior to
fragment shader execution.
Otherwise, they are performed after fragment shader execution.
If the fragment shader additionally specifies the PostDepthCoverage
OpExecutionMode
, the value of a variable decorated with the
SampleMask
built-in
reflects the coverage after the early fragment tests.
Otherwise, it reflects the coverage before the early fragment tests.
8.12. Compute Shaders
Compute shaders are invoked via vkCmdDispatch and vkCmdDispatchIndirect commands. In general, they have access to similar resources as shader stages executing as part of a graphics pipeline.
Compute workloads are formed from groups of work items called workgroups and
processed by the compute shader in the current compute pipeline.
A workgroup is a collection of shader invocations that execute the same
shader, potentially in parallel.
Compute shaders execute in global workgroups which are divided into a
number of local workgroups with a size that can be set by assigning a
value to the LocalSize
execution mode or via an object decorated by the
WorkgroupSize
decoration.
An invocation within a local workgroup can share data with other members of
the local workgroup through shared variables and issue memory and control
flow barriers to synchronize with other members of the local workgroup.
8.13. Interpolation Decorations
Interpolation decorations control the behavior of attribute interpolation in
the fragment shader stage.
Interpolation decorations can be applied to Input
storage class
variables in the fragment shader stage’s interface, and control the
interpolation behavior of those variables.
Inputs that could be interpolated can be decorated by at most one of the following decorations:
- Flat: no interpolation
- NoPerspective: linear, non-perspective interpolation
Fragment input variables decorated with neither Flat
nor
NoPerspective
use perspective-correct interpolation (for
lines and
polygons).
The presence of and type of interpolation is controlled by the above
interpolation decorations as well as the auxiliary decorations Centroid
and Sample
.
A variable decorated with Flat
will not be interpolated.
Instead, it will have the same value for every fragment within a triangle.
This value will come from a single provoking
vertex.
A variable decorated with Flat
can also be decorated with
Centroid
or Sample
, which will mean the same thing as decorating
it only as Flat
.
For fragment shader input variables decorated with neither Centroid
nor
Sample
, the assigned variable may be interpolated anywhere within the
fragment and a single value may be assigned to each sample within the
fragment.
If a fragment shader input is decorated with Centroid
, a single value
may be assigned to that variable for all samples in the fragment, but that
value must be interpolated to a location that lies in both the fragment and
in the primitive being rendered, including any of the fragment’s samples
covered by the primitive.
Because the location at which the variable is interpolated may be different
in neighboring fragments, and derivatives may be computed by computing
differences between neighboring fragments, derivatives of centroid-sampled
inputs may be less accurate than those for non-centroid interpolated
variables.
If
VkPipelineViewportShadingRateImageStateCreateInfoNV::shadingRateImageEnable
is enabled, implementations may estimate derivatives using differencing
without dividing by the distance between adjacent sample locations when the
fragment size is larger than one pixel.
The PostDepthCoverage
execution mode does not affect the determination of the centroid location.
If a fragment shader input is decorated with Sample
, a separate value
must be assigned to that variable for each covered sample in the fragment,
and that value must be sampled at the location of the individual sample.
When rasterizationSamples
is VK_SAMPLE_COUNT_1_BIT
, the fragment
center must be used for Centroid
, Sample
, and undecorated
attribute interpolation.
Fragment shader inputs that are signed or unsigned integers, integer
vectors, or any double-precision floating-point type must be decorated with
Flat
.
When the VK_AMD_shader_explicit_vertex_parameter device extension is
enabled, inputs can also be decorated with the CustomInterpAMD
interpolation decoration, including fragment shader inputs that are signed
or unsigned integers, integer vectors, or any double-precision
floating-point type.
Inputs decorated with CustomInterpAMD can only be accessed by the
extended instruction InterpolateAtVertexAMD, which allows accessing the
value of the input for individual vertices of the primitive.
When the fragmentShaderBarycentric feature is enabled, inputs can also be
decorated with the PerVertexNV interpolation decoration, including
fragment shader inputs that are signed or unsigned integers, integer
vectors, or any double-precision floating-point type.
Inputs decorated with PerVertexNV
can only be accessed using an extra
array dimension, where the extra index identifies one of the vertices of the
primitive that produced the fragment.
8.14. Ray Generation Shaders
A ray generation shader is similar to a compute shader.
Its main purpose is to execute ray tracing queries using OpTraceNV
instructions and process the results.
8.14.1. Ray Generation Shader Execution
One ray generation shader is executed per ray tracing dispatch.
Its location in the shader binding table (see Shader
Binding Table for details) is passed directly into vkCmdTraceRaysNV
using the raygenShaderBindingTableBuffer
and
raygenShaderBindingOffset
parameters.
8.15. Intersection Shaders
Intersection shaders enable the implementation of arbitrary, application defined geometric primitives. An intersection shader for a primitive is executed whenever its axis-aligned bounding box is hit by a ray.
A built-in intersection shader for triangle primitives is used
automatically whenever geometry of type VK_GEOMETRY_TYPE_TRIANGLES_NV
is specified.
Like other ray tracing shader domains, an intersection shader operates on a
single ray at a time.
It also operates on a single primitive at a time.
It is therefore the purpose of an intersection shader to compute the
ray-primitive intersections and report them.
To report an intersection, the shader calls the OpReportIntersectionNV
instruction.
An intersection shader communicates with any-hit and closest hit shaders by generating attribute values that they can read. Intersection shaders cannot read or modify the ray payload.
8.15.1. Intersection Shader Execution
The order in which intersections are found along a ray, and therefore the order in which intersection shaders are executed, is unspecified.
The intersection shader of the closest AABB which intersects the ray is guaranteed to be executed at some point during traversal, unless the ray is forcibly terminated.
8.16. Any-Hit Shaders
The any-hit shader is executed after the intersection shader reports an
intersection that lies within the current [tmin,tmax] of the ray.
The main use of any hit shaders is to programmatically decide whether or not
an intersection will be accepted.
The intersection will be accepted unless the shader calls the
OpIgnoreIntersectionNV
instruction.
8.16.1. Any Hit Shader Execution
The order in which intersections are found along a ray, and therefore the order in which any-hit shaders are executed, is unspecified.
The any-hit shader of the closest hit is guaranteed to be executed at some point during traversal, unless the ray is forcibly terminated.
8.17. Closest Hit Shaders
Closest hit shaders have read-only access to the attributes generated by the
corresponding intersection shader, and can read or modify the ray payload.
They also have access to a number of system-generated values.
Closest hit shaders can call OpTraceNV
to recursively trace rays.
8.17.1. Closest Hit Shader Execution
Exactly one closest hit shader is executed when traversal is finished and an intersection has been found and accepted.
8.18. Miss Shaders
Miss shaders can access the ray payload and can trace new rays through the
OpTraceNV
instruction, but cannot access attributes since they are not
associated with an intersection.
8.18.1. Miss Shader Execution
A miss shader is executed instead of a closest hit shader if no intersection was found during traversal.
8.19. Callable Shaders
Callable shaders can access a callable payload that works similarly to ray payloads to do subroutine work.
8.19.1. Callable Shader Execution
A callable shader is executed by calling OpExecuteCallableNV
from an
allowed shader stage.
8.20. Static Use
A SPIR-V module declares a global object in memory using the OpVariable
instruction, which results in a pointer x
to that object.
A specific entry point in a SPIR-V module is said to statically use that
object if that entry point’s call tree contains a function that contains a
memory instruction or image instruction with x
as an id
operand.
See the “Memory Instructions” and “Image Instructions” subsections of
section 3 “Binary Form” of the SPIR-V specification for the complete list
of SPIR-V memory instructions.
Static use is not used to control the behavior of variables with Input
and Output
storage.
The effects of those variables are applied based only on whether they are
present in a shader entry point’s interface.
8.21. Invocation and Derivative Groups
An invocation group (see the subsection “Control Flow” of section 2 of
the SPIR-V specification) for a compute shader is the set of invocations in
a single local workgroup.
For graphics shaders, an invocation group is an implementation-dependent
subset of the set of shader invocations of a given shader stage which are
produced by a single drawing command.
For indirect drawing commands with drawCount
greater than one,
invocations from separate draws are in distinct invocation groups.
Note
Because the partitioning of invocations into invocation groups is implementation-dependent and not observable, applications generally need to assume the worst case of all invocations in a draw belonging to a single invocation group.
A derivative group (see the subsection “Control Flow” of section 2 of
the SPIR-V 1.00 Revision 4 specification) is a set of invocations which are
used together to compute a derivative.
For a fragment shader, a derivative group is generated by a single primitive
(point, line, or triangle) and includes any helper invocations needed to
compute derivatives.
If the subgroupSize
field of VkPhysicalDeviceSubgroupProperties
is at least 4, a derivative group for a fragment shader corresponds to a
single subgroup quad.
Otherwise, a derivative group is the set of invocations generated by a
single primitive.
A derivative group for a compute shader is a single local workgroup.
Derivatives are undefined for a sampled image instruction if the instruction is in flow control that is not uniform across the derivative group.
8.22. Subgroups
A subgroup (see the subsection “Control Flow” of section 2 of the SPIR-V 1.3 Revision 1 specification) is a set of invocations that can synchronize and share data with each other efficiently. An invocation group is partitioned into one or more subgroups.
Subgroup operations are divided into various categories as described in VkSubgroupFeatureFlagBits.
8.22.1. Basic Subgroup Operations
The basic subgroup operations allow two classes of functionality within
shaders: elect and barrier.
Invocations within a subgroup can choose a single invocation to perform
some task for the subgroup as a whole using elect.
Invocations within a subgroup can perform a subgroup barrier to ensure the
ordering of execution or memory accesses within a subgroup.
Barriers can be performed on buffer memory accesses, WorkgroupLocal
memory accesses, and image memory accesses to ensure that any results
written are visible by other invocations within the subgroup.
An OpControlBarrier
can also be used to perform a full execution
control barrier.
A full execution control barrier will ensure that each active invocation
within the subgroup reaches a point of execution before any are allowed to
continue.
8.22.2. Vote Subgroup Operations
The vote subgroup operations allow invocations within a subgroup to compare values across a subgroup. The types of votes enabled are:
- Do all active subgroup invocations agree that an expression is true?
- Do any active subgroup invocations evaluate an expression to true?
- Do all active subgroup invocations have the same value of an expression?
Note
These operations are useful in combination with control flow in that they allow for developers to check whether conditions match across the subgroup and choose potentially faster code-paths in these cases.
8.22.3. Arithmetic Subgroup Operations
The arithmetic subgroup operations allow invocations to perform scan and reduction operations across a subgroup. For reduction operations, each invocation in a subgroup will obtain the same result of these arithmetic operations applied across the subgroup. For scan operations, each invocation in the subgroup will perform an inclusive or exclusive scan, cumulatively applying the operation across the invocations in a subgroup in linear order. The operations supported are add, mul, min, max, and, or, xor.
8.22.4. Ballot Subgroup Operations
The ballot subgroup operations allow invocations to perform more complex votes across the subgroup. The ballot functionality allows all invocations within a subgroup to provide a boolean value and get as a result what each invocation provided as their boolean value. The broadcast functionality allows values to be broadcast from an invocation to all other invocations within the subgroup, given that the invocation to be broadcast from is known at pipeline creation time.
8.22.5. Shuffle Subgroup Operations
The shuffle subgroup operations allow invocations to read values from other invocations within a subgroup.
8.22.6. Shuffle Relative Subgroup Operations
The shuffle relative subgroup operations allow invocations to read values from other invocations within the subgroup relative to the current invocation in the group. The relative operations supported allow data to be shifted up and down through the invocations within a subgroup.
8.22.7. Clustered Subgroup Operations
The clustered subgroup operations allow invocations to perform an operation among partitions of a subgroup, such that the operation is only performed within the subgroup invocations within a partition. The partitions for clustered subgroup operations are consecutive power-of-two size groups of invocations and the cluster size must be known at pipeline creation time. The operations supported are add, mul, min, max, and, or, xor.
8.22.8. Quad Subgroup Operations
The quad subgroup operations allow clusters of 4 invocations (a quad), to
share data efficiently with each other.
For fragment shaders, if the subgroupSize
field of
VkPhysicalDeviceSubgroupProperties is at least 4, each quad
corresponds to one of the groups of four shader invocations used for
derivatives.
For compute shaders using the DerivativeGroupQuadsNV
or
DerivativeGroupLinearNV
execution modes, each quad corresponds to one
of the groups of four shader invocations used for
derivatives.
The invocations in each quad are ordered to have attribute values of
P(i0,j0), P(i1,j0), P(i0,j1), and P(i1,j1), respectively.
8.22.9. Partitioned Subgroup Operations
The partitioned subgroup operations allow invocations to perform an operation among partitions of a subgroup, such that the operation is only performed within the subgroup invocations within a partition. The partitions for partitioned subgroup operations can group the invocations into arbitrary subsets and can be computed at runtime. The operations supported are add, mul, min, max, and, or, xor.
8.23. Validation Cache
Validation cache objects allow the result of internal validation to be reused, both within a single application run and between multiple runs. Reuse within a single run is achieved by passing the same validation cache object when creating supported Vulkan objects. Reuse across runs of an application is achieved by retrieving validation cache contents in one run of an application, saving the contents, and using them to preinitialize a validation cache on a subsequent run. The contents of the validation cache objects are managed by the validation layers. Applications can manage the host memory consumed by a validation cache object and control the amount of data retrieved from a validation cache object.
Validation cache objects are represented by VkValidationCacheEXT
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkValidationCacheEXT)
To create validation cache objects, call:
VkResult vkCreateValidationCacheEXT(
VkDevice device,
const VkValidationCacheCreateInfoEXT* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkValidationCacheEXT* pValidationCache);
- device is the logical device that creates the validation cache object.
- pCreateInfo is a pointer to a VkValidationCacheCreateInfoEXT structure that contains the initial parameters for the validation cache object.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pValidationCache is a pointer to a VkValidationCacheEXT handle in which the resulting validation cache object is returned.
Note
Applications can track and manage the total host memory size of a
validation cache object using the pAllocator.
Once created, a validation cache can be passed to the
vkCreateShaderModule
command as part of the
VkShaderModuleCreateInfo
pNext
chain.
If a VkShaderModuleValidationCacheCreateInfoEXT
object is part of the
VkShaderModuleCreateInfo
::pNext
chain, and its
validationCache
field is not VK_NULL_HANDLE, the implementation
will query it for possible reuse opportunities and update it with new
content.
The use of the validation cache object in these commands is internally
synchronized, and the same validation cache object can be used in multiple
threads simultaneously.
Note
Implementations should make every effort to limit any critical sections to
the actual accesses to the cache, which is expected to be significantly
shorter than the duration of the vkCreateShaderModule command.
The VkValidationCacheCreateInfoEXT
structure is defined as:
typedef struct VkValidationCacheCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkValidationCacheCreateFlagsEXT flags;
size_t initialDataSize;
const void* pInitialData;
} VkValidationCacheCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- initialDataSize is the number of bytes in pInitialData. If initialDataSize is zero, the validation cache will initially be empty.
- pInitialData is a pointer to previously retrieved validation cache data. If the validation cache data is incompatible (as defined below) with the device, the validation cache will be initially empty. If initialDataSize is zero, pInitialData is ignored.
typedef VkFlags VkValidationCacheCreateFlagsEXT;
VkValidationCacheCreateFlagsEXT
is a bitmask type for setting a mask,
but is currently reserved for future use.
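As a sketch, a validation cache might be created from data saved in a previous run; cacheData and cacheDataSize are assumptions of the example and can be NULL and zero on a first run.

const VkValidationCacheCreateInfoEXT createInfo = {
    .sType = VK_STRUCTURE_TYPE_VALIDATION_CACHE_CREATE_INFO_EXT,
    .initialDataSize = cacheDataSize,  /* 0 on a first run */
    .pInitialData = cacheData,         /* ignored when initialDataSize is 0 */
};

VkValidationCacheEXT validationCache;
VkResult result = vkCreateValidationCacheEXT(device, &createInfo, NULL,
                                             &validationCache);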
Validation cache objects can be merged using the command:
VkResult vkMergeValidationCachesEXT(
VkDevice device,
VkValidationCacheEXT dstCache,
uint32_t srcCacheCount,
const VkValidationCacheEXT* pSrcCaches);
- device is the logical device that owns the validation cache objects.
- dstCache is the handle of the validation cache to merge results into.
- srcCacheCount is the length of the pSrcCaches array.
- pSrcCaches is an array of validation cache handles, which will be merged into dstCache. The previous contents of dstCache are included after the merge.
Note
The details of the merge operation are implementation dependent, but implementations should merge the contents of the specified validation caches and prune duplicate entries.
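For example (a sketch; the cache handles threadCache0, threadCache1 and mainCache are assumed to have been created earlier), two per-thread caches could be merged into a main cache as follows:

const VkValidationCacheEXT srcCaches[2] = { threadCache0, threadCache1 };
VkResult result = vkMergeValidationCachesEXT(device, mainCache,
                                             2, srcCaches);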
Data can be retrieved from a validation cache object using the command:
VkResult vkGetValidationCacheDataEXT(
VkDevice device,
VkValidationCacheEXT validationCache,
size_t* pDataSize,
void* pData);
- device is the logical device that owns the validation cache.
- validationCache is the validation cache to retrieve data from.
- pDataSize is a pointer to a value related to the amount of data in the validation cache, as described below.
- pData is either NULL or a pointer to a buffer.
If pData
is NULL
, then the maximum size of the data that can be
retrieved from the validation cache, in bytes, is returned in
pDataSize
.
Otherwise, pDataSize
must point to a variable set by the user to the
size of the buffer, in bytes, pointed to by pData
, and on return the
variable is overwritten with the amount of data actually written to
pData
.
If pDataSize
is less than the maximum size that can be retrieved by
the validation cache, at most pDataSize
bytes will be written to
pData
, and vkGetValidationCacheDataEXT
will return
VK_INCOMPLETE
.
Any data written to pData
is valid and can be provided as the
pInitialData
member of the VkValidationCacheCreateInfoEXT
structure passed to vkCreateValidationCacheEXT
.
Two calls to vkGetValidationCacheDataEXT
with the same parameters
must retrieve the same data unless a command that modifies the contents of
the cache is called between them.
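A sketch of the usual two-call idiom for retrieving the cache contents; error handling and the destination of the saved blob are left to the application (malloc/free require <stdlib.h>).

size_t dataSize = 0;
/* First call: query the maximum size of the retrievable data. */
VkResult result = vkGetValidationCacheDataEXT(device, validationCache,
                                              &dataSize, NULL);
if (result == VK_SUCCESS && dataSize > 0) {
    void* data = malloc(dataSize);
    /* Second call: retrieve the data itself. */
    result = vkGetValidationCacheDataEXT(device, validationCache,
                                         &dataSize, data);
    if (result == VK_SUCCESS) {
        /* e.g. write dataSize bytes from data to a file for the next run */
    }
    free(data);
}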
Applications can store the data retrieved from the validation cache, and
use these data, possibly in a future run of the application, to populate new
validation cache objects.
The results of validation, however, may depend on the vendor ID, device ID,
driver version, and other details of the device.
To enable applications to detect when previously retrieved data is
incompatible with the device, the initial bytes written to pData
must
be a header consisting of the following members:
| Offset | Size | Meaning |
|---|---|---|
| 0 | 4 | length in bytes of the entire validation cache header written as a stream of bytes, with the least significant byte first |
| 4 | 4 | a VkValidationCacheHeaderVersionEXT value written as a stream of bytes, with the least significant byte first |
| 8 | VK_UUID_SIZE | a layer commit ID expressed as a UUID, which uniquely identifies the version of the validation layers used to generate these validation results |
The first four bytes encode the length of the entire validation cache header, in bytes. This value includes all fields in the header including the validation cache version field and the size of the length field.
The next four bytes encode the validation cache version, as described for VkValidationCacheHeaderVersionEXT. A consumer of the validation cache should use the cache version to interpret the remainder of the cache header.
If pDataSize
is less than what is necessary to store this header,
nothing will be written to pData
and zero will be written to
pDataSize
.
Possible values of the second group of four bytes in the header returned by vkGetValidationCacheDataEXT, encoding the validation cache version, are:
typedef enum VkValidationCacheHeaderVersionEXT {
VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT = 1,
} VkValidationCacheHeaderVersionEXT;
- VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT specifies version one of the validation cache.
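A sketch of decoding the first two header fields from a retrieved blob, assuming data points to at least eight bytes returned by vkGetValidationCacheDataEXT:

const uint8_t* bytes = (const uint8_t*)data;
/* Both fields are written least significant byte first. */
uint32_t headerLength  =  (uint32_t)bytes[0]        | ((uint32_t)bytes[1] << 8) |
                         ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
uint32_t headerVersion =  (uint32_t)bytes[4]        | ((uint32_t)bytes[5] << 8) |
                         ((uint32_t)bytes[6] << 16) | ((uint32_t)bytes[7] << 24);
if (headerVersion != VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT) {
    /* The saved data uses an unknown header version; discard it and start
     * with an empty cache instead. */
}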
To destroy a validation cache, call:
void vkDestroyValidationCacheEXT(
VkDevice device,
VkValidationCacheEXT validationCache,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the validation cache object.
- validationCache is the handle of the validation cache to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
9. Pipelines
The following figure shows a block diagram of the Vulkan pipelines. Some Vulkan commands specify geometric objects to be drawn or computational work to be performed, while others specify state controlling how objects are handled by the various pipeline stages, or control data transfer between memory organized as images and buffers. Commands are effectively sent through a processing pipeline, either a graphics pipeline or a compute pipeline.
The graphics pipeline can be operated in two modes, as either primitive shading or mesh shading pipeline.
Primitive Shading
The first stage of the graphics pipeline (Input Assembler) assembles vertices to form geometric primitives such as points, lines, and triangles, based on a requested primitive topology. In the next stage (Vertex Shader) vertices can be transformed, computing positions and attributes for each vertex. If tessellation and/or geometry shaders are supported, they can then generate multiple primitives from a single input primitive, possibly changing the primitive topology or generating additional attribute data in the process.
Mesh Shading
When using the mesh shading pipeline input primitives are not assembled implicitly, but explicitly through the (Mesh Shader). The work on the mesh pipeline is initiated by the application drawing a set of mesh tasks.
If an optional (Task Shader) is active, each task triggers the execution of a task shader workgroup that will generate a new set of tasks upon completion. Each of these spawned tasks, or each of the original dispatched tasks if no task shader is present, triggers the execution of a mesh shader workgroup that produces an output mesh with a variable-sized number of primitives assembled from vertices stored in the output mesh.
Common
The final resulting primitives are clipped to a clip volume in preparation for the next stage, Rasterization. The rasterizer produces a series of framebuffer addresses and values using a two-dimensional description of a point, line segment, or triangle. Each fragment so produced is fed to the next stage (Fragment Shader) that performs operations on individual fragments before they finally alter the framebuffer. These operations include conditional updates into the framebuffer based on incoming and previously stored depth values (to effect depth buffering), blending of incoming fragment colors with stored colors, as well as masking, stenciling, and other logical operations on fragment values.
Framebuffer operations read and write the color and depth/stencil attachments of the framebuffer for a given subpass of a render pass instance. The attachments can be used as input attachments in the fragment shader in a later subpass of the same render pass.
The compute pipeline is a separate pipeline from the graphics pipeline, which operates on one-, two-, or three-dimensional workgroups which can read from and write to buffer and image memory.
This ordering is meant only as a tool for describing Vulkan, not as a strict rule of how Vulkan is implemented, and we present it only as a means to organize the various operations of the pipelines. Actual ordering guarantees between pipeline stages are explained in detail in the synchronization chapter.
Each pipeline is controlled by a monolithic object created from a description of all of the shader stages and any relevant fixed-function stages. Linking the whole pipeline together allows the optimization of shaders based on their input/outputs and eliminates expensive draw time state validation.
A pipeline object is bound to the current state using vkCmdBindPipeline. Any pipeline object state that is specified as dynamic is not applied to the current state when the pipeline object is bound, but is instead set by dynamic state setting commands.
No state, including dynamic state, is inherited from one command buffer to another.
Compute and graphics pipelines are each represented by VkPipeline
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipeline)
9.1. Compute Pipelines
Compute pipelines consist of a single static compute shader stage and the pipeline layout.
The compute pipeline represents a compute shader and is created by calling
vkCreateComputePipelines
with module
and pName
selecting
an entry point from a shader module, where that entry point defines a valid
compute shader, in the VkPipelineShaderStageCreateInfo
structure
contained within the VkComputePipelineCreateInfo
structure.
To create compute pipelines, call:
VkResult vkCreateComputePipelines(
VkDevice device,
VkPipelineCache pipelineCache,
uint32_t createInfoCount,
const VkComputePipelineCreateInfo* pCreateInfos,
const VkAllocationCallbacks* pAllocator,
VkPipeline* pPipelines);
- device is the logical device that creates the compute pipelines.
- pipelineCache is either VK_NULL_HANDLE, indicating that pipeline caching is disabled; or the handle of a valid pipeline cache object, in which case use of that cache is enabled for the duration of the command.
- createInfoCount is the length of the pCreateInfos and pPipelines arrays.
- pCreateInfos is an array of VkComputePipelineCreateInfo structures.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pPipelines is a pointer to an array in which the resulting compute pipeline objects are returned.
pPipelines
array is created based on the corresponding element of thepCreateInfos
array”? Also for vkCreateGraphicsPipelines below.
The VkComputePipelineCreateInfo
structure is defined as:
typedef struct VkComputePipelineCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineCreateFlags flags;
VkPipelineShaderStageCreateInfo stage;
VkPipelineLayout layout;
VkPipeline basePipelineHandle;
int32_t basePipelineIndex;
} VkComputePipelineCreateInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is a bitmask of VkPipelineCreateFlagBits specifying how the pipeline will be generated. -
stage
is a VkPipelineShaderStageCreateInfo describing the compute shader. -
layout
is the description of binding locations used by both the pipeline and descriptor sets used with the pipeline. -
basePipelineHandle
is a pipeline to derive from -
basePipelineIndex
is an index into thepCreateInfos
parameter to use as a pipeline to derive from
The parameters basePipelineHandle and basePipelineIndex are described in more detail in Pipeline Derivatives.
stage points to a structure of type VkPipelineShaderStageCreateInfo.
The VkPipelineShaderStageCreateInfo
structure is defined as:
typedef struct VkPipelineShaderStageCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineShaderStageCreateFlags flags;
VkShaderStageFlagBits stage;
VkShaderModule module;
const char* pName;
const VkSpecializationInfo* pSpecializationInfo;
} VkPipelineShaderStageCreateInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
stage
is a VkShaderStageFlagBits value specifying a single pipeline stage. -
module
is a VkShaderModule object that contains the shader for this stage. -
pName
is a pointer to a null-terminated UTF-8 string specifying the entry point name of the shader for this stage. -
pSpecializationInfo
is a pointer to VkSpecializationInfo, as described in Specialization Constants, and can beNULL
.
typedef VkFlags VkPipelineShaderStageCreateFlags;
VkPipelineShaderStageCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use.
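As an informal, non-normative illustration, the following C sketch fills in the two structures above to create a single compute pipeline. The handles device, computeShaderModule, and pipelineLayout are assumed to have been created earlier (for example with vkCreateDevice, vkCreateShaderModule, and vkCreatePipelineLayout), and the entry point name "main" is an assumption about the SPIR-V module.
const VkComputePipelineCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
    .stage = {
        .sType  = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
        .stage  = VK_SHADER_STAGE_COMPUTE_BIT,
        .module = computeShaderModule,    /* assumed: a valid VkShaderModule */
        .pName  = "main",                 /* entry point in the SPIR-V module */
        .pSpecializationInfo = NULL,      /* no specialization constants */
    },
    .layout             = pipelineLayout, /* assumed: a valid VkPipelineLayout */
    .basePipelineHandle = VK_NULL_HANDLE,
    .basePipelineIndex  = -1,             /* not deriving from another pipeline */
};

VkPipeline computePipeline;
VkResult result = vkCreateComputePipelines(device, VK_NULL_HANDLE /* no cache */,
                                           1, &createInfo, NULL, &computePipeline);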
Commands and structures which need to specify one or more shader stages do so using a bitmask whose bits correspond to stages. Bits which can be set to specify shader stages are:
typedef enum VkShaderStageFlagBits {
VK_SHADER_STAGE_VERTEX_BIT = 0x00000001,
VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT = 0x00000002,
VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT = 0x00000004,
VK_SHADER_STAGE_GEOMETRY_BIT = 0x00000008,
VK_SHADER_STAGE_FRAGMENT_BIT = 0x00000010,
VK_SHADER_STAGE_COMPUTE_BIT = 0x00000020,
VK_SHADER_STAGE_ALL_GRAPHICS = 0x0000001F,
VK_SHADER_STAGE_ALL = 0x7FFFFFFF,
VK_SHADER_STAGE_RAYGEN_BIT_NV = 0x00000100,
VK_SHADER_STAGE_ANY_HIT_BIT_NV = 0x00000200,
VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV = 0x00000400,
VK_SHADER_STAGE_MISS_BIT_NV = 0x00000800,
VK_SHADER_STAGE_INTERSECTION_BIT_NV = 0x00001000,
VK_SHADER_STAGE_CALLABLE_BIT_NV = 0x00002000,
VK_SHADER_STAGE_TASK_BIT_NV = 0x00000040,
VK_SHADER_STAGE_MESH_BIT_NV = 0x00000080,
} VkShaderStageFlagBits;
-
VK_SHADER_STAGE_VERTEX_BIT
specifies the vertex stage. -
VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT
specifies the tessellation control stage. -
VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT
specifies the tessellation evaluation stage. -
VK_SHADER_STAGE_GEOMETRY_BIT
specifies the geometry stage. -
VK_SHADER_STAGE_FRAGMENT_BIT
specifies the fragment stage. -
VK_SHADER_STAGE_COMPUTE_BIT
specifies the compute stage. -
VK_SHADER_STAGE_TASK_BIT_NV
specifies the task stage. -
VK_SHADER_STAGE_MESH_BIT_NV
specifies the mesh stage. -
VK_SHADER_STAGE_ALL_GRAPHICS
is a combination of bits used as shorthand to specify all graphics stages defined above (excluding the compute stage). -
VK_SHADER_STAGE_ALL
is a combination of bits used as shorthand to specify all shader stages supported by the device, including all additional stages which are introduced by extensions. -
VK_SHADER_STAGE_RAYGEN_BIT_NV
specifies the ray generation stage. -
VK_SHADER_STAGE_ANY_HIT_BIT_NV
specifies the any hit stage. -
VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV
specifies the closest hit stage. -
VK_SHADER_STAGE_MISS_BIT_NV
specifies the miss stage. -
VK_SHADER_STAGE_INTERSECTION_BIT_NV
specifies the intersection stage. -
VK_SHADER_STAGE_CALLABLE_BIT_NV
specifies the callable stage.
typedef VkFlags VkShaderStageFlags;
VkShaderStageFlags is a bitmask type for setting a mask of zero or more VkShaderStageFlagBits.
9.2. Graphics Pipelines
Graphics pipelines consist of multiple shader stages, multiple fixed-function pipeline stages, and a pipeline layout.
To create graphics pipelines, call:
VkResult vkCreateGraphicsPipelines(
VkDevice device,
VkPipelineCache pipelineCache,
uint32_t createInfoCount,
const VkGraphicsPipelineCreateInfo* pCreateInfos,
const VkAllocationCallbacks* pAllocator,
VkPipeline* pPipelines);
-
device
is the logical device that creates the graphics pipelines. -
pipelineCache
is either VK_NULL_HANDLE, indicating that pipeline caching is disabled; or the handle of a valid pipeline cache object, in which case use of that cache is enabled for the duration of the command. -
createInfoCount
is the length of thepCreateInfos
andpPipelines
arrays. -
pCreateInfos
is an array of VkGraphicsPipelineCreateInfo structures. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pPipelines
is a pointer to an array in which the resulting graphics pipeline objects are returned.
The VkGraphicsPipelineCreateInfo structure includes an array of shader create info structures containing all the desired active shader stages, as well as creation info to define all relevant fixed-function stages, and a pipeline layout.
The VkGraphicsPipelineCreateInfo
structure is defined as:
typedef struct VkGraphicsPipelineCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineCreateFlags flags;
uint32_t stageCount;
const VkPipelineShaderStageCreateInfo* pStages;
const VkPipelineVertexInputStateCreateInfo* pVertexInputState;
const VkPipelineInputAssemblyStateCreateInfo* pInputAssemblyState;
const VkPipelineTessellationStateCreateInfo* pTessellationState;
const VkPipelineViewportStateCreateInfo* pViewportState;
const VkPipelineRasterizationStateCreateInfo* pRasterizationState;
const VkPipelineMultisampleStateCreateInfo* pMultisampleState;
const VkPipelineDepthStencilStateCreateInfo* pDepthStencilState;
const VkPipelineColorBlendStateCreateInfo* pColorBlendState;
const VkPipelineDynamicStateCreateInfo* pDynamicState;
VkPipelineLayout layout;
VkRenderPass renderPass;
uint32_t subpass;
VkPipeline basePipelineHandle;
int32_t basePipelineIndex;
} VkGraphicsPipelineCreateInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is a bitmask of VkPipelineCreateFlagBits specifying how the pipeline will be generated. -
stageCount
is the number of entries in thepStages
array. -
pStages
is an array of sizestageCount
structures of type VkPipelineShaderStageCreateInfo describing the set of the shader stages to be included in the graphics pipeline. -
pVertexInputState
is a pointer to an instance of the VkPipelineVertexInputStateCreateInfo structure. It is ignored if the pipeline includes a mesh shader stage. -
pInputAssemblyState
is a pointer to an instance of the VkPipelineInputAssemblyStateCreateInfo structure which determines input assembly behavior, as described in Drawing Commands. It is ignored if the pipeline includes a mesh shader stage. -
pTessellationState
is a pointer to an instance of the VkPipelineTessellationStateCreateInfo structure, and is ignored if the pipeline does not include a tessellation control shader stage and tessellation evaluation shader stage. -
pViewportState
is a pointer to an instance of the VkPipelineViewportStateCreateInfo structure, and is ignored if the pipeline has rasterization disabled. -
pRasterizationState
is a pointer to an instance of the VkPipelineRasterizationStateCreateInfo structure. -
pMultisampleState
is a pointer to an instance of the VkPipelineMultisampleStateCreateInfo, and is ignored if the pipeline has rasterization disabled. -
pDepthStencilState
is a pointer to an instance of the VkPipelineDepthStencilStateCreateInfo structure, and is ignored if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use a depth/stencil attachment. -
pColorBlendState
is a pointer to an instance of the VkPipelineColorBlendStateCreateInfo structure, and is ignored if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments. -
pDynamicState
is a pointer to VkPipelineDynamicStateCreateInfo and is used to indicate which properties of the pipeline state object are dynamic and can be changed independently of the pipeline state. This can beNULL
, which means no state in the pipeline is considered dynamic. -
layout
is the description of binding locations used by both the pipeline and descriptor sets used with the pipeline. -
renderPass
is a handle to a render pass object describing the environment in which the pipeline will be used; the pipeline must only be used with an instance of any render pass compatible with the one provided. See Render Pass Compatibility for more information. -
subpass
is the index of the subpass in the render pass where this pipeline will be used. -
basePipelineHandle
is a pipeline to derive from. -
basePipelineIndex
is an index into thepCreateInfos
parameter to use as a pipeline to derive from.
The parameters basePipelineHandle and basePipelineIndex are described in more detail in Pipeline Derivatives.
pStages points to an array of VkPipelineShaderStageCreateInfo structures, which were previously described in Compute Pipelines.
pDynamicState points to a structure of type VkPipelineDynamicStateCreateInfo.
If any shader stage fails to compile, the compile log will be reported back to the application, and VK_ERROR_INVALID_SHADER_NV will be generated.
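As an informal, non-normative sketch, a minimal rasterizing graphics pipeline with only vertex and fragment shaders might be assembled as follows. The handles vertModule, fragModule, pipelineLayout, renderPass, and device are assumed to exist already, the 640x480 framebuffer size is purely illustrative, and a real application would fill the fixed-function state according to its own needs.
VkPipelineShaderStageCreateInfo stages[2] = {
    { .sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
      .stage = VK_SHADER_STAGE_VERTEX_BIT,
      .module = vertModule,                 /* assumed VkShaderModule */
      .pName = "main" },
    { .sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
      .stage = VK_SHADER_STAGE_FRAGMENT_BIT,
      .module = fragModule,                 /* assumed VkShaderModule */
      .pName = "main" },
};

/* Fixed-function state; members not listed are left zero-initialized. */
VkPipelineVertexInputStateCreateInfo vertexInput = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO };
VkPipelineInputAssemblyStateCreateInfo inputAssembly = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO,
    .topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST };
VkViewport viewport = { 0.0f, 0.0f, 640.0f, 480.0f, 0.0f, 1.0f };  /* illustrative size */
VkRect2D scissor = { { 0, 0 }, { 640, 480 } };
VkPipelineViewportStateCreateInfo viewportState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
    .viewportCount = 1, .pViewports = &viewport,
    .scissorCount = 1, .pScissors = &scissor };
VkPipelineRasterizationStateCreateInfo raster = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
    .lineWidth = 1.0f };
VkPipelineMultisampleStateCreateInfo multisample = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
    .rasterizationSamples = VK_SAMPLE_COUNT_1_BIT };
VkPipelineColorBlendAttachmentState blendAttachment = {
    .colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                      VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT };
VkPipelineColorBlendStateCreateInfo colorBlend = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
    .attachmentCount = 1, .pAttachments = &blendAttachment };

VkGraphicsPipelineCreateInfo info = {
    .sType               = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .stageCount          = 2,
    .pStages             = stages,
    .pVertexInputState   = &vertexInput,
    .pInputAssemblyState = &inputAssembly,
    .pViewportState      = &viewportState,
    .pRasterizationState = &raster,
    .pMultisampleState   = &multisample,
    .pColorBlendState    = &colorBlend,    /* pDepthStencilState omitted: no depth/stencil attachment assumed */
    .layout              = pipelineLayout, /* assumed VkPipelineLayout */
    .renderPass          = renderPass,     /* assumed VkRenderPass */
    .subpass             = 0,
    .basePipelineIndex   = -1,
};

VkPipeline graphicsPipeline;
VkResult result = vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &info,
                                            NULL, &graphicsPipeline);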
Possible values of the flags member of VkGraphicsPipelineCreateInfo, VkComputePipelineCreateInfo, and VkRayTracingPipelineCreateInfoNV, specifying how a pipeline is created, are:
typedef enum VkPipelineCreateFlagBits {
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT = 0x00000001,
VK_PIPELINE_CREATE_ALLOW_DERIVATIVES_BIT = 0x00000002,
VK_PIPELINE_CREATE_DERIVATIVE_BIT = 0x00000004,
VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT = 0x00000008,
VK_PIPELINE_CREATE_DISPATCH_BASE = 0x00000010,
VK_PIPELINE_CREATE_DEFER_COMPILE_BIT_NV = 0x00000020,
VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT_KHR = VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT,
VK_PIPELINE_CREATE_DISPATCH_BASE_KHR = VK_PIPELINE_CREATE_DISPATCH_BASE,
} VkPipelineCreateFlagBits;
-
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT
specifies that the created pipeline will not be optimized. Using this flag may reduce the time taken to create the pipeline. -
VK_PIPELINE_CREATE_ALLOW_DERIVATIVES_BIT
specifies that the pipeline to be created is allowed to be the parent of a pipeline that will be created in a subsequent call to vkCreateGraphicsPipelines or vkCreateComputePipelines. -
VK_PIPELINE_CREATE_DERIVATIVE_BIT
specifies that the pipeline to be created will be a child of a previously created parent pipeline. -
VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT
specifies that any shader input variables decorated asViewIndex
will be assigned values as if they were decorated asDeviceIndex
. -
VK_PIPELINE_CREATE_DISPATCH_BASE
specifies that a compute pipeline can be used with vkCmdDispatchBase with a non-zero base workgroup. -
VK_PIPELINE_CREATE_DEFER_COMPILE_BIT_NV
specifies that a pipeline is created with all shaders in the deferred state. Before using the pipeline, the application must call vkCompileDeferredNV exactly once on each shader in the pipeline.
It is valid to set both VK_PIPELINE_CREATE_ALLOW_DERIVATIVES_BIT and VK_PIPELINE_CREATE_DERIVATIVE_BIT. This allows a pipeline to be both a parent and possibly a child in a pipeline hierarchy. See Pipeline Derivatives for more information.
typedef VkFlags VkPipelineCreateFlags;
VkPipelineCreateFlags is a bitmask type for setting a mask of zero or more VkPipelineCreateFlagBits.
The VkPipelineDynamicStateCreateInfo
structure is defined as:
typedef struct VkPipelineDynamicStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineDynamicStateCreateFlags flags;
uint32_t dynamicStateCount;
const VkDynamicState* pDynamicStates;
} VkPipelineDynamicStateCreateInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
dynamicStateCount
is the number of elements in thepDynamicStates
array. -
pDynamicStates
is an array of VkDynamicState values specifying which pieces of pipeline state will use the values from dynamic state commands rather than from pipeline state creation info.
typedef VkFlags VkPipelineDynamicStateCreateFlags;
VkPipelineDynamicStateCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use.
The source of different pieces of dynamic state is specified by the VkPipelineDynamicStateCreateInfo::pDynamicStates property of the currently active pipeline, each of whose elements must be one of the values:
typedef enum VkDynamicState {
VK_DYNAMIC_STATE_VIEWPORT = 0,
VK_DYNAMIC_STATE_SCISSOR = 1,
VK_DYNAMIC_STATE_LINE_WIDTH = 2,
VK_DYNAMIC_STATE_DEPTH_BIAS = 3,
VK_DYNAMIC_STATE_BLEND_CONSTANTS = 4,
VK_DYNAMIC_STATE_DEPTH_BOUNDS = 5,
VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK = 6,
VK_DYNAMIC_STATE_STENCIL_WRITE_MASK = 7,
VK_DYNAMIC_STATE_STENCIL_REFERENCE = 8,
VK_DYNAMIC_STATE_VIEWPORT_W_SCALING_NV = 1000087000,
VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT = 1000099000,
VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT = 1000143000,
VK_DYNAMIC_STATE_VIEWPORT_SHADING_RATE_PALETTE_NV = 1000164004,
VK_DYNAMIC_STATE_VIEWPORT_COARSE_SAMPLE_ORDER_NV = 1000164006,
VK_DYNAMIC_STATE_EXCLUSIVE_SCISSOR_NV = 1000205001,
} VkDynamicState;
-
VK_DYNAMIC_STATE_VIEWPORT
specifies that thepViewports
state inVkPipelineViewportStateCreateInfo
will be ignored and must be set dynamically with vkCmdSetViewport before any draw commands. The number of viewports used by a pipeline is still specified by theviewportCount
member ofVkPipelineViewportStateCreateInfo
. -
VK_DYNAMIC_STATE_SCISSOR
specifies that thepScissors
state inVkPipelineViewportStateCreateInfo
will be ignored and must be set dynamically with vkCmdSetScissor before any draw commands. The number of scissor rectangles used by a pipeline is still specified by thescissorCount
member ofVkPipelineViewportStateCreateInfo
. -
VK_DYNAMIC_STATE_LINE_WIDTH
specifies that thelineWidth
state inVkPipelineRasterizationStateCreateInfo
will be ignored and must be set dynamically with vkCmdSetLineWidth before any draw commands that generate line primitives for the rasterizer. -
VK_DYNAMIC_STATE_DEPTH_BIAS
specifies that thedepthBiasConstantFactor
,depthBiasClamp
anddepthBiasSlopeFactor
states inVkPipelineRasterizationStateCreateInfo
will be ignored and must be set dynamically with vkCmdSetDepthBias before any draws are performed withdepthBiasEnable
inVkPipelineRasterizationStateCreateInfo
set toVK_TRUE
. -
VK_DYNAMIC_STATE_BLEND_CONSTANTS
specifies that theblendConstants
state inVkPipelineColorBlendStateCreateInfo
will be ignored and must be set dynamically with vkCmdSetBlendConstants before any draws are performed with a pipeline state withVkPipelineColorBlendAttachmentState
memberblendEnable
set toVK_TRUE
and any of the blend functions using a constant blend color. -
VK_DYNAMIC_STATE_DEPTH_BOUNDS
specifies that theminDepthBounds
andmaxDepthBounds
states of VkPipelineDepthStencilStateCreateInfo will be ignored and must be set dynamically with vkCmdSetDepthBounds before any draws are performed with a pipeline state withVkPipelineDepthStencilStateCreateInfo
memberdepthBoundsTestEnable
set toVK_TRUE
. -
VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK
specifies that thecompareMask
state inVkPipelineDepthStencilStateCreateInfo
for bothfront
andback
will be ignored and must be set dynamically with vkCmdSetStencilCompareMask before any draws are performed with a pipeline state withVkPipelineDepthStencilStateCreateInfo
memberstencilTestEnable
set toVK_TRUE
-
VK_DYNAMIC_STATE_STENCIL_WRITE_MASK
specifies that thewriteMask
state inVkPipelineDepthStencilStateCreateInfo
for bothfront
andback
will be ignored and must be set dynamically with vkCmdSetStencilWriteMask before any draws are performed with a pipeline state withVkPipelineDepthStencilStateCreateInfo
memberstencilTestEnable
set toVK_TRUE
-
VK_DYNAMIC_STATE_STENCIL_REFERENCE
specifies that thereference
state inVkPipelineDepthStencilStateCreateInfo
for bothfront
andback
will be ignored and must be set dynamically with vkCmdSetStencilReference before any draws are performed with a pipeline state withVkPipelineDepthStencilStateCreateInfo
memberstencilTestEnable
set toVK_TRUE
-
VK_DYNAMIC_STATE_VIEWPORT_W_SCALING_NV
specifies that thepViewportScalings
state in VkPipelineViewportWScalingStateCreateInfoNV will be ignored and must be set dynamically with vkCmdSetViewportWScalingNV before any draws are performed with a pipeline state with VkPipelineViewportWScalingStateCreateInfoNV memberviewportScalingEnable
set toVK_TRUE
-
VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT
specifies that thepDiscardRectangles
state in VkPipelineDiscardRectangleStateCreateInfoEXT will be ignored and must be set dynamically with vkCmdSetDiscardRectangleEXT before any draw or clear commands. The VkDiscardRectangleModeEXT and the number of active discard rectangles is still specified by thediscardRectangleMode
anddiscardRectangleCount
members ofVkPipelineDiscardRectangleStateCreateInfoEXT
. -
VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT
specifies that thesampleLocationsInfo
state in VkPipelineSampleLocationsStateCreateInfoEXT will be ignored and must be set dynamically with vkCmdSetSampleLocationsEXT before any draw or clear commands. Enabling custom sample locations is still indicated by thesampleLocationsEnable
member ofVkPipelineSampleLocationsStateCreateInfoEXT
. -
VK_DYNAMIC_STATE_EXCLUSIVE_SCISSOR_NV
specifies that thepExclusiveScissors
state inVkPipelineViewportExclusiveScissorStateCreateInfoNV
will be ignored and must be set dynamically with vkCmdSetExclusiveScissorNV before any draw commands. The number of exclusive scissor rectangles used by a pipeline is still specified by theexclusiveScissorCount
member ofVkPipelineViewportExclusiveScissorStateCreateInfoNV
. -
VK_DYNAMIC_STATE_VIEWPORT_SHADING_RATE_PALETTE_NV
specifies that thepShadingRatePalettes
state in VkPipelineViewportShadingRateImageStateCreateInfoNV will be ignored and must be set dynamically with vkCmdSetViewportShadingRatePaletteNV before any draw commands. -
VK_DYNAMIC_STATE_VIEWPORT_COARSE_SAMPLE_ORDER_NV
specifies that the coarse sample order state in VkPipelineViewportCoarseSampleOrderStateCreateInfoNV will be ignored and must be set dynamically with vkCmdSetCoarseSampleOrderNV before any draw commands.
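As an informal, non-normative illustration, the sketch below marks the viewport and scissor as dynamic when creating a pipeline and then supplies that state while recording the command buffer; cmdBuffer and extent (a VkExtent2D) are assumed to exist elsewhere.
const VkDynamicState dynamicStates[] = {
    VK_DYNAMIC_STATE_VIEWPORT,
    VK_DYNAMIC_STATE_SCISSOR,
};

const VkPipelineDynamicStateCreateInfo dynamicState = {
    .sType             = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
    .dynamicStateCount = 2,
    .pDynamicStates    = dynamicStates,
};
/* pDynamicState of VkGraphicsPipelineCreateInfo would point at &dynamicState. */

/* Later, while recording the command buffer, the dynamic state must be set
 * before any draw command that uses this pipeline. */
const VkViewport viewport = { 0.0f, 0.0f, (float)extent.width, (float)extent.height, 0.0f, 1.0f };
const VkRect2D   scissor  = { { 0, 0 }, extent };
vkCmdSetViewport(cmdBuffer, 0, 1, &viewport);
vkCmdSetScissor(cmdBuffer, 0, 1, &scissor);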
9.2.1. Valid Combinations of Stages for Graphics Pipelines
The geometric primitive processing can either be handled on a per primitive basis by the vertex, tessellation, and geometry shader stages, or on a per mesh basis using task and mesh shader stages. If the pipeline includes a mesh shader stage, it uses the mesh pipeline, otherwise it uses the primitive pipeline.
If a task shader is omitted, the task shading stage is skipped.
If tessellation shader stages are omitted, the tessellation shading and fixed-function stages of the pipeline are skipped.
If a geometry shader is omitted, the geometry shading stage is skipped.
If a fragment shader is omitted, fragment color outputs have undefined values, and the fragment depth value is unmodified. This can be useful for depth-only rendering.
Presence of a shader stage in a pipeline is indicated by including a valid VkPipelineShaderStageCreateInfo with module and pName selecting an entry point from a shader module, where that entry point is valid for the stage specified by stage.
Presence of some of the fixed-function stages in the pipeline is implicitly derived from enabled shaders and provided state. For example, the fixed-function tessellator is always present when the pipeline has valid Tessellation Control and Tessellation Evaluation shaders.
Example combinations of active pipeline shader stages include:
- Depth/stencil-only rendering in a subpass with no color attachments: vertex shader only.
- Color-only rendering in a subpass with no depth/stencil attachment: vertex shader and fragment shader.
- Rendering pipeline with tessellation and geometry shaders: vertex, tessellation control, tessellation evaluation, geometry, and fragment shaders.
- Rendering pipeline with task and mesh shaders: task, mesh, and fragment shaders.
9.3. Pipeline destruction
To destroy a graphics or compute pipeline, call:
void vkDestroyPipeline(
VkDevice device,
VkPipeline pipeline,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the pipeline. -
pipeline
is the handle of the pipeline to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
9.4. Multiple Pipeline Creation
Multiple pipelines can be created simultaneously by passing an array of
VkGraphicsPipelineCreateInfo
or VkComputePipelineCreateInfo
structures into the vkCreateGraphicsPipelines and
vkCreateComputePipelines commands, respectively.
Applications can group together similar pipelines to be created in a single call, and implementations are encouraged to look for reuse opportunities within a group-create.
When an application attempts to create many pipelines in a single command, it is possible that some subset may fail creation. In that case, the corresponding entries in the pPipelines output array will be filled with VK_NULL_HANDLE values. If any pipeline fails creation (for example, due to out of memory errors), the vkCreate*Pipelines commands will return an error code. The implementation will attempt to create all pipelines, and only return VK_NULL_HANDLE values for those that actually failed.
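As an informal, non-normative sketch, a batched creation call and the per-element check might look as follows; createInfos is assumed to be an array of fully populated VkComputePipelineCreateInfo structures and pipelineCache an existing cache or VK_NULL_HANDLE.
enum { PIPELINE_COUNT = 4 };                 /* hypothetical batch size */
VkPipeline pipelines[PIPELINE_COUNT];

VkResult result = vkCreateComputePipelines(device, pipelineCache,
                                           PIPELINE_COUNT, createInfos,
                                           NULL, pipelines);
if (result != VK_SUCCESS) {
    /* Some subset may have failed; successfully created entries are still valid. */
    for (uint32_t i = 0; i < PIPELINE_COUNT; ++i) {
        if (pipelines[i] == VK_NULL_HANDLE) {
            /* recreate individually, fall back, or report the failure */
        }
    }
}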
9.5. Pipeline Derivatives
A pipeline derivative is a child pipeline created from a parent pipeline, where the child and parent are expected to have much commonality. The goal of derivative pipelines is that they be cheaper to create using the parent as a starting point, and that it be more efficient (on either host or device) to switch/bind between children of the same parent.
A derivative pipeline is created by setting the VK_PIPELINE_CREATE_DERIVATIVE_BIT flag in the Vk*PipelineCreateInfo structure. If this is set, then exactly one of the basePipelineHandle or basePipelineIndex members of the structure must have a valid handle/index, and specifies the parent pipeline. If basePipelineHandle is used, the parent pipeline must have already been created. If basePipelineIndex is used, then the parent is being created in the same command. VK_NULL_HANDLE acts as the invalid handle for basePipelineHandle, and -1 is the invalid index for basePipelineIndex.
If basePipelineIndex is used, the base pipeline must appear earlier in the array. The base pipeline must have been created with the VK_PIPELINE_CREATE_ALLOW_DERIVATIVES_BIT flag set.
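As a non-normative sketch, the following creates a parent and a child pipeline in a single call, with the child referring to the parent by index; both VkComputePipelineCreateInfo structures are assumed to be otherwise fully populated (stage, layout, and so on).
VkComputePipelineCreateInfo infos[2];
/* infos[0] and infos[1] are assumed to be filled in as shown earlier. */

/* Element 0 is the parent: it allows derivatives and derives from nothing. */
infos[0].flags              = VK_PIPELINE_CREATE_ALLOW_DERIVATIVES_BIT;
infos[0].basePipelineHandle = VK_NULL_HANDLE;
infos[0].basePipelineIndex  = -1;

/* Element 1 is the child: it derives from element 0 of this same array. */
infos[1].flags              = VK_PIPELINE_CREATE_DERIVATIVE_BIT;
infos[1].basePipelineHandle = VK_NULL_HANDLE;   /* the index is used instead */
infos[1].basePipelineIndex  = 0;                /* parent appears earlier in the array */

VkPipeline pipelines[2];
VkResult result = vkCreateComputePipelines(device, VK_NULL_HANDLE, 2, infos,
                                           NULL, pipelines);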
9.6. Pipeline Cache
Pipeline cache objects allow the result of pipeline construction to be reused between pipelines and between runs of an application. Reuse between pipelines is achieved by passing the same pipeline cache object when creating multiple related pipelines. Reuse across runs of an application is achieved by retrieving pipeline cache contents in one run of an application, saving the contents, and using them to preinitialize a pipeline cache on a subsequent run. The contents of the pipeline cache objects are managed by the implementation. Applications can manage the host memory consumed by a pipeline cache object and control the amount of data retrieved from a pipeline cache object.
Pipeline cache objects are represented by VkPipelineCache handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipelineCache)
To create pipeline cache objects, call:
VkResult vkCreatePipelineCache(
VkDevice device,
const VkPipelineCacheCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkPipelineCache* pPipelineCache);
-
device
is the logical device that creates the pipeline cache object. -
pCreateInfo
is a pointer to a VkPipelineCacheCreateInfo structure that contains the initial parameters for the pipeline cache object. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pPipelineCache
is a pointer to a VkPipelineCache handle in which the resulting pipeline cache object is returned.
Note
Applications can track and manage the total host memory size of a pipeline cache object using the pAllocator.
Once created, a pipeline cache can be passed to the vkCreateGraphicsPipelines and vkCreateComputePipelines commands. If the pipeline cache passed into these commands is not VK_NULL_HANDLE, the implementation will query it for possible reuse opportunities and update it with new content. The use of the pipeline cache object in these commands is internally synchronized, and the same pipeline cache object can be used in multiple threads simultaneously.
Note
Implementations should make every effort to limit any critical sections to the actual accesses to the cache, which is expected to be significantly shorter than the duration of the vkCreate*Pipelines commands.
The VkPipelineCacheCreateInfo
structure is defined as:
typedef struct VkPipelineCacheCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineCacheCreateFlags flags;
size_t initialDataSize;
const void* pInitialData;
} VkPipelineCacheCreateInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
initialDataSize
is the number of bytes inpInitialData
. IfinitialDataSize
is zero, the pipeline cache will initially be empty. -
pInitialData
is a pointer to previously retrieved pipeline cache data. If the pipeline cache data is incompatible (as defined below) with the device, the pipeline cache will be initially empty. IfinitialDataSize
is zero,pInitialData
is ignored.
typedef VkFlags VkPipelineCacheCreateFlags;
VkPipelineCacheCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use.
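As an informal, non-normative sketch, a pipeline cache might be seeded with data saved by a previous run; loadCacheFile is a hypothetical helper that reads the previously retrieved blob from disk and returns NULL (with a zero size) when none exists.
size_t cacheDataSize = 0;
void*  cacheData     = loadCacheFile("pipeline_cache.bin", &cacheDataSize); /* hypothetical helper */

const VkPipelineCacheCreateInfo cacheInfo = {
    .sType           = VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO,
    .initialDataSize = cacheDataSize,   /* zero means start with an empty cache */
    .pInitialData    = cacheData,       /* ignored when initialDataSize is zero */
};

VkPipelineCache pipelineCache;
VkResult result = vkCreatePipelineCache(device, &cacheInfo, NULL, &pipelineCache);
free(cacheData);   /* the implementation copies what it needs during creation */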
Pipeline cache objects can be merged using the command:
VkResult vkMergePipelineCaches(
VkDevice device,
VkPipelineCache dstCache,
uint32_t srcCacheCount,
const VkPipelineCache* pSrcCaches);
-
device
is the logical device that owns the pipeline cache objects. -
dstCache
is the handle of the pipeline cache to merge results into. -
srcCacheCount
is the length of thepSrcCaches
array. -
pSrcCaches
is an array of pipeline cache handles, which will be merged intodstCache
. The previous contents ofdstCache
are included after the merge.
Note
The details of the merge operation are implementation dependent, but implementations should merge the contents of the specified pipeline caches and prune duplicate entries.
Data can be retrieved from a pipeline cache object using the command:
VkResult vkGetPipelineCacheData(
VkDevice device,
VkPipelineCache pipelineCache,
size_t* pDataSize,
void* pData);
-
device
is the logical device that owns the pipeline cache. -
pipelineCache
is the pipeline cache to retrieve data from. -
pDataSize
is a pointer to a value related to the amount of data in the pipeline cache, as described below. -
pData
is eitherNULL
or a pointer to a buffer.
If pData is NULL, then the maximum size of the data that can be retrieved from the pipeline cache, in bytes, is returned in pDataSize. Otherwise, pDataSize must point to a variable set by the user to the size of the buffer, in bytes, pointed to by pData, and on return the variable is overwritten with the amount of data actually written to pData.
If pDataSize is less than the maximum size that can be retrieved by the pipeline cache, at most pDataSize bytes will be written to pData, and vkGetPipelineCacheData will return VK_INCOMPLETE.
Any data written to pData is valid and can be provided as the pInitialData member of the VkPipelineCacheCreateInfo structure passed to vkCreatePipelineCache.
Two calls to vkGetPipelineCacheData with the same parameters must retrieve the same data unless a command that modifies the contents of the cache is called between them.
Applications can store the data retrieved from the pipeline cache, and use these data, possibly in a future run of the application, to populate new pipeline cache objects.
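As an informal, non-normative sketch, the usual two-call idiom for saving the cache at application shutdown might look as follows; persisting the blob to disk is left to the application.
size_t dataSize = 0;
/* First call: query the size of the data currently in the cache. */
vkGetPipelineCacheData(device, pipelineCache, &dataSize, NULL);

void* data = malloc(dataSize);
/* Second call: retrieve the data itself. */
VkResult result = vkGetPipelineCacheData(device, pipelineCache, &dataSize, data);
if (result == VK_SUCCESS) {
    /* dataSize now holds the number of bytes actually written; persist the
     * blob (for example to a file) for use in a future run. */
}
free(data);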
The results of pipeline compiles, however, may depend on the vendor ID, device ID, driver version, and other details of the device. To enable applications to detect when previously retrieved data is incompatible with the device, the initial bytes written to pData must be a header consisting of the following members:
Offset | Size | Meaning
---|---|---
0 | 4 | length in bytes of the entire pipeline cache header written as a stream of bytes, with the least significant byte first
4 | 4 | a VkPipelineCacheHeaderVersion value written as a stream of bytes, with the least significant byte first
8 | 4 | a vendor ID equal to VkPhysicalDeviceProperties::vendorID written as a stream of bytes, with the least significant byte first
12 | 4 | a device ID equal to VkPhysicalDeviceProperties::deviceID written as a stream of bytes, with the least significant byte first
16 | VK_UUID_SIZE | a pipeline cache ID equal to VkPhysicalDeviceProperties::pipelineCacheUUID
The first four bytes encode the length of the entire pipeline cache header, in bytes. This value includes all fields in the header including the pipeline cache version field and the size of the length field.
The next four bytes encode the pipeline cache version, as described for VkPipelineCacheHeaderVersion. A consumer of the pipeline cache should use the cache version to interpret the remainder of the cache header.
If pDataSize is less than what is necessary to store this header, nothing will be written to pData and zero will be written to pDataSize.
Possible values of the second group of four bytes in the header returned by vkGetPipelineCacheData, encoding the pipeline cache version, are:
typedef enum VkPipelineCacheHeaderVersion {
VK_PIPELINE_CACHE_HEADER_VERSION_ONE = 1,
} VkPipelineCacheHeaderVersion;
-
VK_PIPELINE_CACHE_HEADER_VERSION_ONE
specifies version one of the pipeline cache.
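Before passing previously saved data to vkCreatePipelineCache, an application can compare the header against the current device. The following is an informal, non-normative sketch; props is assumed to have been obtained with vkGetPhysicalDeviceProperties, and <string.h> and <stdint.h> are assumed to be included.
/* Read a little-endian 32-bit word from the header byte stream. */
static uint32_t readU32LE(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

int cacheDataIsCompatible(const uint8_t* data, size_t size,
                          const VkPhysicalDeviceProperties* props)
{
    if (size < 16 + VK_UUID_SIZE) return 0;   /* too small to hold the header */
    uint32_t headerLength  = readU32LE(data + 0);
    uint32_t headerVersion = readU32LE(data + 4);
    uint32_t vendorID      = readU32LE(data + 8);
    uint32_t deviceID      = readU32LE(data + 12);
    return headerLength >= 16 + VK_UUID_SIZE &&
           headerVersion == VK_PIPELINE_CACHE_HEADER_VERSION_ONE &&
           vendorID == props->vendorID &&
           deviceID == props->deviceID &&
           memcmp(data + 16, props->pipelineCacheUUID, VK_UUID_SIZE) == 0;
}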
To destroy a pipeline cache, call:
void vkDestroyPipelineCache(
VkDevice device,
VkPipelineCache pipelineCache,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the pipeline cache object. -
pipelineCache
is the handle of the pipeline cache to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
9.7. Specialization Constants
Specialization constants are a mechanism whereby constants in a SPIR-V module can have their constant value specified at the time the VkPipeline is created. This allows a SPIR-V module to have constants that can be modified while executing an application that uses the Vulkan API.
Note
Specialization constants are useful to allow a compute shader to have its local workgroup size changed at runtime by the user, for example.
Each instance of the VkPipelineShaderStageCreateInfo structure contains a parameter pSpecializationInfo, which can be NULL to indicate no specialization constants, or point to a VkSpecializationInfo structure.
The VkSpecializationInfo
structure is defined as:
typedef struct VkSpecializationInfo {
uint32_t mapEntryCount;
const VkSpecializationMapEntry* pMapEntries;
size_t dataSize;
const void* pData;
} VkSpecializationInfo;
-
mapEntryCount
is the number of entries in thepMapEntries
array. -
pMapEntries
is a pointer to an array ofVkSpecializationMapEntry
which maps constant IDs to offsets inpData
. -
dataSize
is the byte size of thepData
buffer. -
pData
contains the actual constant values to specialize with.
pMapEntries
points to a structure of type
VkSpecializationMapEntry.
The VkSpecializationMapEntry
structure is defined as:
typedef struct VkSpecializationMapEntry {
uint32_t constantID;
uint32_t offset;
size_t size;
} VkSpecializationMapEntry;
-
constantID
is the ID of the specialization constant in SPIR-V. -
offset
is the byte offset of the specialization constant value within the supplied data buffer. -
size
is the byte size of the specialization constant value within the supplied data buffer.
If a constantID value is not a specialization constant ID used in the shader, that map entry does not affect the behavior of the pipeline.
In human readable SPIR-V:
OpDecorate %x SpecId 13 ; decorate .x component of WorkgroupSize with ID 13
OpDecorate %y SpecId 42 ; decorate .y component of WorkgroupSize with ID 42
OpDecorate %z SpecId 3 ; decorate .z component of WorkgroupSize with ID 3
OpDecorate %wgsize BuiltIn WorkgroupSize ; decorate WorkgroupSize onto constant
%i32 = OpTypeInt 32 0 ; declare an unsigned 32-bit type
%uvec3 = OpTypeVector %i32 3 ; declare a 3 element vector type of unsigned 32-bit
%x = OpSpecConstant %i32 1 ; declare the .x component of WorkgroupSize
%y = OpSpecConstant %i32 1 ; declare the .y component of WorkgroupSize
%z = OpSpecConstant %i32 1 ; declare the .z component of WorkgroupSize
%wgsize = OpSpecConstantComposite %uvec3 %x %y %z ; declare WorkgroupSize
From the above we have three specialization constants, one for each of the x, y & z elements of the WorkgroupSize vector.
Now to specialize the above via the specialization constants mechanism:
const VkSpecializationMapEntry entries[] =
{
{
13, // constantID
0 * sizeof(uint32_t), // offset
sizeof(uint32_t) // size
},
{
42, // constantID
1 * sizeof(uint32_t), // offset
sizeof(uint32_t) // size
},
{
3, // constantID
2 * sizeof(uint32_t), // offset
sizeof(uint32_t) // size
}
};
const uint32_t data[] = { 16, 8, 4 }; // our workgroup size is 16x8x4
const VkSpecializationInfo info =
{
3, // mapEntryCount
entries, // pMapEntries
3 * sizeof(uint32_t), // dataSize
data, // pData
};
Then when calling vkCreateComputePipelines, and passing the VkSpecializationInfo we defined as the pSpecializationInfo parameter of VkPipelineShaderStageCreateInfo, we will create a compute pipeline with the runtime-specified local workgroup size.
Another example would be that an application has a SPIR-V module that has some platform-dependent constants they wish to use.
In human readable SPIR-V:
OpDecorate %1 SpecId 0 ; decorate our signed 32-bit integer constant
OpDecorate %2 SpecId 12 ; decorate our 32-bit floating-point constant
%i32 = OpTypeInt 32 1 ; declare a signed 32-bit type
%float = OpTypeFloat 32 ; declare a 32-bit floating-point type
%1 = OpSpecConstant %i32 -1 ; some signed 32-bit integer constant
%2 = OpSpecConstant %float 0.5 ; some 32-bit floating-point constant
From the above we have two specialization constants, one is a signed 32-bit integer and the second is a 32-bit floating-point.
Now to specialize the above via the specialization constants mechanism:
struct SpecializationData {
int32_t data0;
float data1;
};
const VkSpecializationMapEntry entries[] =
{
{
0, // constantID
offsetof(SpecializationData, data0), // offset
sizeof(SpecializationData::data0) // size
},
{
12, // constantID
offsetof(SpecializationData, data1), // offset
sizeof(SpecializationData::data1) // size
}
};
SpecializationData data;
data.data0 = -42; // set the data for the 32-bit integer
data.data1 = 42.0f; // set the data for the 32-bit floating-point
const VkSpecializationInfo info =
{
2, // mapEntryCount
entries, // pMapEntries
sizeof(data), // dataSize
&data, // pData
};
It is legal for a SPIR-V module with specializations to be compiled into a pipeline where no specialization info was provided. SPIR-V specialization constants contain default values such that if a specialization is not provided, the default value will be used. In the examples above, it would be valid for an application to only specialize some of the specialization constants within the SPIR-V module, and let the other constants use their default values encoded within the OpSpecConstant declarations.
9.8. Pipeline Binding
Once a pipeline has been created, it can be bound to the command buffer using the command:
void vkCmdBindPipeline(
VkCommandBuffer commandBuffer,
VkPipelineBindPoint pipelineBindPoint,
VkPipeline pipeline);
-
commandBuffer
is the command buffer that the pipeline will be bound to. -
pipelineBindPoint
is a VkPipelineBindPoint value specifying whether to bind to the compute or graphics bind point. Binding one does not disturb the other. -
pipeline
is the pipeline to be bound.
Once bound, a pipeline binding affects subsequent graphics or compute commands in the command buffer until a different pipeline is bound to the bind point. The pipeline bound to VK_PIPELINE_BIND_POINT_COMPUTE controls the behavior of vkCmdDispatch and vkCmdDispatchIndirect. The pipeline bound to VK_PIPELINE_BIND_POINT_GRAPHICS controls the behavior of all drawing commands. The pipeline bound to VK_PIPELINE_BIND_POINT_RAY_TRACING_NV controls the behavior of vkCmdTraceRaysNV. No other commands are affected by the pipeline state.
Possible values of vkCmdBindPipeline::pipelineBindPoint, specifying the bind point of a pipeline object, are:
typedef enum VkPipelineBindPoint {
VK_PIPELINE_BIND_POINT_GRAPHICS = 0,
VK_PIPELINE_BIND_POINT_COMPUTE = 1,
VK_PIPELINE_BIND_POINT_RAY_TRACING_NV = 1000165000,
} VkPipelineBindPoint;
-
VK_PIPELINE_BIND_POINT_COMPUTE
specifies binding as a compute pipeline. -
VK_PIPELINE_BIND_POINT_GRAPHICS
specifies binding as a graphics pipeline. -
VK_PIPELINE_BIND_POINT_RAY_TRACING_NV
specifies binding as a ray tracing pipeline.
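As an informal, non-normative illustration, recording a compute dispatch with a previously created pipeline might look as follows; cmdBuffer is assumed to be a command buffer in the recording state, and any required descriptor sets are assumed to be bound separately with vkCmdBindDescriptorSets.
vkCmdBindPipeline(cmdBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipeline);
/* Binding at the compute bind point does not disturb any bound graphics pipeline. */
vkCmdDispatch(cmdBuffer, 64, 1, 1);   /* workgroup counts are illustrative */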
9.9. Dynamic State
When a pipeline object is bound, any pipeline object state that is not specified as dynamic is applied to the command buffer state. Pipeline object state that is specified as dynamic is not applied to the command buffer state at this time. Instead, dynamic state can be modified at any time and persists for the lifetime of the command buffer, or until modified by another dynamic state setting command or another pipeline bind.
When a pipeline object is bound, the following applies to each state parameter:
-
If the state is not specified as dynamic in the new pipeline object, then that command buffer state is overwritten by the state in the new pipeline object.
-
If the state is specified as dynamic in both the new and the previous pipeline object, then that command buffer state is not disturbed.
-
If the state is specified as dynamic in the new pipeline object but is not specified as dynamic in the previous pipeline object, then that command buffer state becomes undefined. If the state is an array, then the entire array becomes undefined.
-
If the state is an array specified as dynamic in both the new and the previous pipeline object, and the array size is not the same in both pipeline objects, then that command buffer state becomes undefined.
Dynamic state setting commands must not be issued for state that is not specified as dynamic in the bound pipeline object.
Dynamic state that does not affect the result of operations can be left undefined.
Note
For example, if blending is disabled by the pipeline object state then the dynamic color blend constants do not need to be specified in the command buffer, even if this state is specified as dynamic in the pipeline object.
9.10. Pipeline Shader Information
Information about a particular shader that has been compiled as part of a pipeline object can be extracted by calling:
VkResult vkGetShaderInfoAMD(
VkDevice device,
VkPipeline pipeline,
VkShaderStageFlagBits shaderStage,
VkShaderInfoTypeAMD infoType,
size_t* pInfoSize,
void* pInfo);
-
device
is the device that createdpipeline
. -
pipeline
is the target of the query. -
shaderStage
identifies the particular shader within the pipeline about which information is being queried. -
infoType
describes what kind of information is being queried. -
pInfoSize
is a pointer to a value related to the amount of data the query returns, as described below. -
pInfo
is either NULL or a pointer to a buffer.
If pInfo is NULL, then the maximum size of the information that can be retrieved about the shader, in bytes, is returned in pInfoSize. Otherwise, pInfoSize must point to a variable set by the user to the size of the buffer, in bytes, pointed to by pInfo, and on return the variable is overwritten with the amount of data actually written to pInfo. If pInfoSize is less than the maximum size that can be retrieved about the shader, then at most pInfoSize bytes will be written to pInfo, and vkGetShaderInfoAMD will return VK_INCOMPLETE.
Not all information is available for every shader and implementations may not support all kinds of information for any shader. When a certain type of information is unavailable, the function returns VK_ERROR_FEATURE_NOT_PRESENT. If information is successfully and fully queried, the function will return VK_SUCCESS.
For infoType VK_SHADER_INFO_TYPE_STATISTICS_AMD, an instance of VkShaderStatisticsInfoAMD will be written to the buffer pointed to by pInfo. This structure will be populated with statistics regarding the physical device resources used by that shader along with other miscellaneous information and is described in further detail below.
For infoType VK_SHADER_INFO_TYPE_DISASSEMBLY_AMD, pInfo points to a UTF-8 null-terminated string containing human-readable disassembly. The exact formatting and contents of the disassembly string are vendor-specific.
The formatting and contents of all other types of information, including infoType VK_SHADER_INFO_TYPE_BINARY_AMD, are left to the vendor and are not further specified by this extension.
Possible values of vkGetShaderInfoAMD::infoType, specifying the information being queried from a shader, are:
typedef enum VkShaderInfoTypeAMD {
VK_SHADER_INFO_TYPE_STATISTICS_AMD = 0,
VK_SHADER_INFO_TYPE_BINARY_AMD = 1,
VK_SHADER_INFO_TYPE_DISASSEMBLY_AMD = 2,
} VkShaderInfoTypeAMD;
-
VK_SHADER_INFO_TYPE_STATISTICS_AMD
specifies that device resources used by a shader will be queried. -
VK_SHADER_INFO_TYPE_BINARY_AMD
specifies that implementation-specific information will be queried. -
VK_SHADER_INFO_TYPE_DISASSEMBLY_AMD
specifies that human-readable disassembly of a shader will be queried.
The VkShaderStatisticsInfoAMD
structure is defined as:
typedef struct VkShaderStatisticsInfoAMD {
VkShaderStageFlags shaderStageMask;
VkShaderResourceUsageAMD resourceUsage;
uint32_t numPhysicalVgprs;
uint32_t numPhysicalSgprs;
uint32_t numAvailableVgprs;
uint32_t numAvailableSgprs;
uint32_t computeWorkGroupSize[3];
} VkShaderStatisticsInfoAMD;
-
shaderStageMask
are the combination of logical shader stages contained within this shader. -
resourceUsage
is an instance of VkShaderResourceUsageAMD describing internal physical device resources used by this shader. -
numPhysicalVgprs
is the maximum number of vector instruction general-purpose registers (VGPRs) available to the physical device. -
numPhysicalSgprs
is the maximum number of scalar instruction general-purpose registers (SGPRs) available to the physical device. -
numAvailableVgprs
is the maximum limit of VGPRs made available to the shader compiler. -
numAvailableSgprs
is the maximum limit of SGPRs made available to the shader compiler. -
computeWorkGroupSize
is the local workgroup size of this shader in { X, Y, Z } dimensions.
Some implementations may merge multiple logical shader stages together in a single shader. In such cases, shaderStageMask will contain a bitmask of all of the stages that are active within that shader. Consequently, if specifying those stages as input to vkGetShaderInfoAMD, the same output information may be returned for all such shader stage queries.
The number of available VGPRs and SGPRs (numAvailableVgprs and numAvailableSgprs respectively) are the shader-addressable subset of physical registers that is given as a limit to the compiler for register assignment. These values may further be limited by implementations due to performance optimizations where register pressure is a bottleneck.
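As an informal, non-normative sketch (assuming the VK_AMD_shader_info device extension has been enabled when creating the device), the statistics for the fragment stage of a pipeline might be queried as follows.
PFN_vkGetShaderInfoAMD pfnGetShaderInfoAMD =
    (PFN_vkGetShaderInfoAMD)vkGetDeviceProcAddr(device, "vkGetShaderInfoAMD");

VkShaderStatisticsInfoAMD statistics;
size_t infoSize = sizeof(statistics);
if (pfnGetShaderInfoAMD != NULL &&
    pfnGetShaderInfoAMD(device, pipeline, VK_SHADER_STAGE_FRAGMENT_BIT,
                        VK_SHADER_INFO_TYPE_STATISTICS_AMD,
                        &infoSize, &statistics) == VK_SUCCESS) {
    /* statistics.resourceUsage.numUsedVgprs, statistics.numAvailableVgprs,
     * and the other members can now be logged or displayed. */
}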
The VkShaderResourceUsageAMD
structure is defined as:
typedef struct VkShaderResourceUsageAMD {
uint32_t numUsedVgprs;
uint32_t numUsedSgprs;
uint32_t ldsSizePerLocalWorkGroup;
size_t ldsUsageSizeInBytes;
size_t scratchMemUsageInBytes;
} VkShaderResourceUsageAMD;
-
numUsedVgprs
is the number of vector instruction general-purpose registers used by this shader. -
numUsedSgprs
is the number of scalar instruction general-purpose registers used by this shader. -
ldsSizePerLocalWorkGroup
is the maximum local data store size per work group in bytes. -
ldsUsageSizeInBytes
is the LDS usage size in bytes per work group by this shader. -
scratchMemUsageInBytes
is the scratch memory usage in bytes by this shader.
9.11. Ray Tracing Pipeline
Ray tracing pipelines consist of multiple shader stages, fixed-function traversal stages, and a pipeline layout.
To create ray tracing pipelines, call:
VkResult vkCreateRayTracingPipelinesNV(
VkDevice device,
VkPipelineCache pipelineCache,
uint32_t createInfoCount,
const VkRayTracingPipelineCreateInfoNV* pCreateInfos,
const VkAllocationCallbacks* pAllocator,
VkPipeline* pPipelines);
-
device
is the logical device that creates the ray tracing pipelines. -
pipelineCache
is either VK_NULL_HANDLE, indicating that pipeline caching is disabled, or the handle of a valid pipeline cache object, in which case use of that cache is enabled for the duration of the command. -
createInfoCount
is the length of thepCreateInfos
andpPipelines
arrays. -
pCreateInfos
is an array ofVkRayTracingPipelineCreateInfoNV
structures. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pPipelines
is a pointer to an array in which the resulting ray tracing pipeline objects are returned.
The VkRayTracingPipelineCreateInfoNV
structure is defined as:
typedef struct VkRayTracingPipelineCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkPipelineCreateFlags flags;
uint32_t stageCount;
const VkPipelineShaderStageCreateInfo* pStages;
uint32_t groupCount;
const VkRayTracingShaderGroupCreateInfoNV* pGroups;
uint32_t maxRecursionDepth;
VkPipelineLayout layout;
VkPipeline basePipelineHandle;
int32_t basePipelineIndex;
} VkRayTracingPipelineCreateInfoNV;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is a bitmask of VkPipelineCreateFlagBits specifying how the pipeline will be generated. -
stageCount
is the number of entries in thepStages
array. -
pStages
is an array of sizestageCount
structures of type VkPipelineShaderStageCreateInfo describing the set of the shader stages to be included in the ray tracing pipeline. -
groupCount
is the number of entries in thepGroups
array. -
pGroups
is an array of sizegroupCount
structures of type VkRayTracingShaderGroupCreateInfoNV describing the set of the shader stages to be included in each shader group in the ray tracing pipeline. -
maxRecursionDepth
is the maximum recursion that will be called from this pipeline. -
layout
is the description of binding locations used by both the pipeline and descriptor sets used with the pipeline. -
basePipelineHandle
is a pipeline to derive from. -
basePipelineIndex
is an index into thepCreateInfos
parameter to use as a pipeline to derive from.
The parameters basePipelineHandle and basePipelineIndex are described in more detail in Pipeline Derivatives.
The VkRayTracingShaderGroupCreateInfoNV
structure is defined as:
typedef struct VkRayTracingShaderGroupCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkRayTracingShaderGroupTypeNV type;
uint32_t generalShader;
uint32_t closestHitShader;
uint32_t anyHitShader;
uint32_t intersectionShader;
} VkRayTracingShaderGroupCreateInfoNV;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
type
is the type of hit group specified in this structure. -
generalShader
is the index of the ray generation, miss, or callable shader fromVkRayTracingPipelineCreateInfoNV
::pStages
in the group if the shader group hastype
ofVK_RAY_TRACING_SHADER_GROUP_TYPE_GENERAL_NV
andVK_SHADER_UNUSED_NV
otherwise. -
closestHitShader
is the optional index of the closest hit shader fromVkRayTracingPipelineCreateInfoNV
::pStages
in the group if the shader group hastype
ofVK_RAY_TRACING_SHADER_GROUP_TYPE_TRIANGLES_HIT_GROUP_NV
orVK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_NV
andVK_SHADER_UNUSED_NV
otherwise. -
anyHitShader
is the optional index of the any hit shader fromVkRayTracingPipelineCreateInfoNV
::pStages
in the group if the shader group hastype
ofVK_RAY_TRACING_SHADER_GROUP_TYPE_TRIANGLES_HIT_GROUP_NV
orVK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_NV
andVK_SHADER_UNUSED_NV
otherwise. -
intersectionShader
is the index of the intersection shader fromVkRayTracingPipelineCreateInfoNV
::pStages
in the group if the shader group hastype
ofVK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_NV
andVK_SHADER_UNUSED_NV
otherwise.
Possible values of type in VkRayTracingShaderGroupCreateInfoNV are:
typedef enum VkRayTracingShaderGroupTypeNV {
VK_RAY_TRACING_SHADER_GROUP_TYPE_GENERAL_NV = 0,
VK_RAY_TRACING_SHADER_GROUP_TYPE_TRIANGLES_HIT_GROUP_NV = 1,
VK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_NV = 2,
} VkRayTracingShaderGroupTypeNV;
-
VK_RAY_TRACING_SHADER_GROUP_TYPE_GENERAL_NV
indicates a shader group with a singleVK_SHADER_STAGE_RAYGEN_BIT_NV
,VK_SHADER_STAGE_MISS_BIT_NV
, orVK_SHADER_STAGE_CALLABLE_BIT_NV
shader in it. -
VK_RAY_TRACING_SHADER_GROUP_TYPE_TRIANGLES_HIT_GROUP_NV
specifies a shader group that only hits triangles and must not contain an intersection shader, only closest hit and any-hit. -
VK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_NV
specifies a shader group that only intersects with custom geometry and must contain an intersection shader and may contain closest hit and any-hit shaders.
Note
For current group types, the hit group type could be inferred from the presence or absence of the intersection shader, but we provide the type explicitly for future hit groups that do not have that property.
To query the opaque handles of shaders in the ray tracing pipeline, call:
VkResult vkGetRayTracingShaderGroupHandlesNV(
VkDevice device,
VkPipeline pipeline,
uint32_t firstGroup,
uint32_t groupCount,
size_t dataSize,
void* pData);
-
device
is the logical device that contains the ray tracing pipeline. -
pipeline
is the ray tracing pipeline object that contains the shaders. -
firstGroup
is the index of the first group to retrieve a handle for from theVkPipelineShaderStageCreateInfo
::pGroups
array. -
groupCount
is the number of shader handles to retrieve. -
dataSize
is the size in bytes of the buffer pointed to bypData
. -
pData
is a pointer to a user-allocated buffer where the results will be written.
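As an informal, non-normative sketch, the opaque handles for the first few shader groups might be retrieved as follows; rtProperties is assumed to be a VkPhysicalDeviceRayTracingPropertiesNV structure previously filled in through vkGetPhysicalDeviceProperties2, and the handles would subsequently be copied into the application's shader binding table buffer.
uint32_t groupCount = 3;   /* illustrative: e.g. one raygen, one miss, one hit group */
size_t   dataSize   = (size_t)rtProperties.shaderGroupHandleSize * groupCount;
uint8_t* handleData = (uint8_t*)malloc(dataSize);

VkResult result = vkGetRayTracingShaderGroupHandlesNV(device, rayTracingPipeline,
                                                      0 /* firstGroup */, groupCount,
                                                      dataSize, handleData);
/* On success, handleData holds groupCount consecutive opaque handles of
 * shaderGroupHandleSize bytes each. */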
Ray tracing pipelines can contain more shaders than a graphics or compute pipeline, so to allow parallel compilation of shaders within a pipeline, an application can choose to defer compilation until a later point in time.
To compile a deferred shader in a pipeline, call:
VkResult vkCompileDeferredNV(
VkDevice device,
VkPipeline pipeline,
uint32_t shader);
-
device
is the logical device that contains the ray tracing pipeline. -
pipeline
is the ray tracing pipeline object that contains the shaders. -
shader
is the index of the shader to compile.
10. Memory Allocation
Vulkan memory is broken up into two categories, host memory and device memory.
10.1. Host Memory
Host memory is memory needed by the Vulkan implementation for non-device-visible storage.
Note
This memory may be used to store the implementation’s representation and state of Vulkan objects.
Vulkan provides applications the opportunity to perform host memory allocations on behalf of the Vulkan implementation. If this feature is not used, the implementation will perform its own memory allocations. Since most memory allocations are off the critical path, this is not meant as a performance feature. Rather, this can be useful for certain embedded systems, for debugging purposes (e.g. putting a guard page after all host allocations), or for memory allocation logging.
Allocators are provided by the application as a pointer to a
VkAllocationCallbacks
structure:
typedef struct VkAllocationCallbacks {
void* pUserData;
PFN_vkAllocationFunction pfnAllocation;
PFN_vkReallocationFunction pfnReallocation;
PFN_vkFreeFunction pfnFree;
PFN_vkInternalAllocationNotification pfnInternalAllocation;
PFN_vkInternalFreeNotification pfnInternalFree;
} VkAllocationCallbacks;
-
pUserData
is a value to be interpreted by the implementation of the callbacks. When any of the callbacks inVkAllocationCallbacks
are called, the Vulkan implementation will pass this value as the first parameter to the callback. This value can vary each time an allocator is passed into a command, even when the same object takes an allocator in multiple commands. -
pfnAllocation
is a pointer to an application-defined memory allocation function of type PFN_vkAllocationFunction. -
pfnReallocation
is a pointer to an application-defined memory reallocation function of type PFN_vkReallocationFunction. -
pfnFree
is a pointer to an application-defined memory free function of type PFN_vkFreeFunction. -
pfnInternalAllocation
is a pointer to an application-defined function that is called by the implementation when the implementation makes internal allocations, and it is of type PFN_vkInternalAllocationNotification. -
pfnInternalFree
is a pointer to an application-defined function that is called by the implementation when the implementation frees internal allocations, and it is of type PFN_vkInternalFreeNotification.
The type of pfnAllocation
is:
typedef void* (VKAPI_PTR *PFN_vkAllocationFunction)(
void* pUserData,
size_t size,
size_t alignment,
VkSystemAllocationScope allocationScope);
-
pUserData
is the value specified for VkAllocationCallbacks::pUserData
in the allocator specified by the application. -
size
is the size in bytes of the requested allocation. -
alignment
is the requested alignment of the allocation in bytes and must be a power of two. -
allocationScope
is a VkSystemAllocationScope value specifying the allocation scope of the lifetime of the allocation, as described here.
If pfnAllocation is unable to allocate the requested memory, it must return NULL. If the allocation was successful, it must return a valid pointer to a memory allocation containing at least size bytes, and with the pointer value being a multiple of alignment.
Note
Correct Vulkan operation cannot be assumed if the application does not follow these rules.
If pfnAllocation returns NULL, and if the implementation is unable to continue correct processing of the current command without the requested allocation, it must treat this as a run-time error, and generate VK_ERROR_OUT_OF_HOST_MEMORY at the appropriate time for the command in which the condition was detected, as described in Return Codes. If the implementation is able to continue correct processing of the current command without the requested allocation, then it may do so, and must not generate VK_ERROR_OUT_OF_HOST_MEMORY as a result of this failed allocation.
The type of pfnReallocation
is:
typedef void* (VKAPI_PTR *PFN_vkReallocationFunction)(
void* pUserData,
void* pOriginal,
size_t size,
size_t alignment,
VkSystemAllocationScope allocationScope);
-
pUserData
is the value specified for VkAllocationCallbacks::pUserData
in the allocator specified by the application. -
pOriginal
must be eitherNULL
or a pointer previously returned bypfnReallocation
orpfnAllocation
of the same allocator. -
size
is the size in bytes of the requested allocation. -
alignment
is the requested alignment of the allocation in bytes and must be a power of two. -
allocationScope
is a VkSystemAllocationScope value specifying the allocation scope of the lifetime of the allocation, as described here.
pfnReallocation must return an allocation with enough space for size bytes, and the contents of the original allocation from bytes zero to min(original size, new size) - 1 must be preserved in the returned allocation. If size is larger than the old size, the contents of the additional space are undefined. If satisfying these requirements involves creating a new allocation, then the old allocation should be freed.
If pOriginal is NULL, then pfnReallocation must behave equivalently to a call to PFN_vkAllocationFunction with the same parameter values (without pOriginal).
If size is zero, then pfnReallocation must behave equivalently to a call to PFN_vkFreeFunction with the same pUserData parameter value, and pMemory equal to pOriginal.
If pOriginal is non-NULL, the implementation must ensure that alignment is equal to the alignment used to originally allocate pOriginal.
If this function fails and pOriginal is non-NULL the application must not free the old allocation.
pfnReallocation must follow the same rules for return values as PFN_vkAllocationFunction.
The type of pfnFree
is:
typedef void (VKAPI_PTR *PFN_vkFreeFunction)(
void* pUserData,
void* pMemory);
-
pUserData
is the value specified for VkAllocationCallbacks::pUserData
in the allocator specified by the application. -
pMemory
is the allocation to be freed.
pMemory
may be NULL
, which the callback must handle safely.
If pMemory
is non-NULL
, it must be a pointer previously allocated
by pfnAllocation
or pfnReallocation
.
The application should free this memory.
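For illustration only, the callbacks could be gathered into a VkAllocationCallbacks structure and passed to a creation command along the lines of the following sketch; myAllocation, myReallocation and myFree are hypothetical application functions with the signatures described in this section:
VkAllocationCallbacks allocator = {
    .pUserData = NULL,
    .pfnAllocation = myAllocation,
    .pfnReallocation = myReallocation,
    .pfnFree = myFree,
    .pfnInternalAllocation = NULL,   // informational callbacks omitted;
    .pfnInternalFree = NULL          // they must be NULL or non-NULL together
};

VkInstanceCreateInfo createInfo = { .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
VkInstance instance;
VkResult result = vkCreateInstance(&createInfo, &allocator, &instance);
// A compatible allocator must later be passed to vkDestroyInstance.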
The type of pfnInternalAllocation
is:
typedef void (VKAPI_PTR *PFN_vkInternalAllocationNotification)(
void* pUserData,
size_t size,
VkInternalAllocationType allocationType,
VkSystemAllocationScope allocationScope);
-
pUserData
is the value specified for VkAllocationCallbacks::pUserData
in the allocator specified by the application. -
size
is the requested size of an allocation. -
allocationType
is a VkInternalAllocationType value specifying the requested type of an allocation. -
allocationScope
is a VkSystemAllocationScope value specifying the allocation scope of the lifetime of the allocation, as described here.
This is a purely informational callback.
The type of pfnInternalFree
is:
typedef void (VKAPI_PTR *PFN_vkInternalFreeNotification)(
void* pUserData,
size_t size,
VkInternalAllocationType allocationType,
VkSystemAllocationScope allocationScope);
-
pUserData
is the value specified for VkAllocationCallbacks::pUserData
in the allocator specified by the application. -
size
is the requested size of an allocation. -
allocationType
is a VkInternalAllocationType value specifying the requested type of an allocation. -
allocationScope
is a VkSystemAllocationScope value specifying the allocation scope of the lifetime of the allocation, as described here.
Each allocation has an allocation scope which defines its lifetime and
which object it is associated with.
Possible values passed to the allocationScope
parameter of the
callback functions specified by VkAllocationCallbacks, indicating the
allocation scope, are:
typedef enum VkSystemAllocationScope {
VK_SYSTEM_ALLOCATION_SCOPE_COMMAND = 0,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT = 1,
VK_SYSTEM_ALLOCATION_SCOPE_CACHE = 2,
VK_SYSTEM_ALLOCATION_SCOPE_DEVICE = 3,
VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE = 4,
} VkSystemAllocationScope;
-
VK_SYSTEM_ALLOCATION_SCOPE_COMMAND
specifies that the allocation is scoped to the duration of the Vulkan command. -
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT
specifies that the allocation is scoped to the lifetime of the Vulkan object that is being created or used. -
VK_SYSTEM_ALLOCATION_SCOPE_CACHE
specifies that the allocation is scoped to the lifetime of a VkPipelineCache
or VkValidationCacheEXT
object. -
VK_SYSTEM_ALLOCATION_SCOPE_DEVICE
specifies that the allocation is scoped to the lifetime of the Vulkan device. -
VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE
specifies that the allocation is scoped to the lifetime of the Vulkan instance.
Most Vulkan commands operate on a single object, or there is a sole object
that is being created or manipulated.
When an allocation uses an allocation scope of
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT
or
VK_SYSTEM_ALLOCATION_SCOPE_CACHE
, the allocation is scoped to the
object being created or manipulated.
When an implementation requires host memory, it will make callbacks to the application using the most specific allocator and allocation scope available:
-
If an allocation is scoped to the duration of a command, the allocator will use the
VK_SYSTEM_ALLOCATION_SCOPE_COMMAND
allocation scope. The most specific allocator available is used: if the object being created or manipulated has an allocator, that object’s allocator will be used, else if the parent VkDevice
has an allocator it will be used, else if the parent VkInstance
has an allocator it will be used. Else, -
If an allocation is associated with an object of type
VkValidationCacheEXT
or VkPipelineCache
, the allocator will use the VK_SYSTEM_ALLOCATION_SCOPE_CACHE
allocation scope. The most specific allocator available is used (cache, else device, else instance). Else, -
If an allocation is scoped to the lifetime of an object, that object is being created or manipulated by the command, and that object’s type is not
VkDevice
or VkInstance
, the allocator will use an allocation scope of VK_SYSTEM_ALLOCATION_SCOPE_OBJECT
. The most specific allocator available is used (object, else device, else instance). Else, -
If an allocation is scoped to the lifetime of a device, the allocator will use an allocation scope of
VK_SYSTEM_ALLOCATION_SCOPE_DEVICE
. The most specific allocator available is used (device, else instance). Else, -
If the allocation is scoped to the lifetime of an instance and the instance has an allocator, its allocator will be used with an allocation scope of
VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE
. -
Otherwise an implementation will allocate memory through an alternative mechanism that is unspecified.
Objects that are allocated from pools do not specify their own allocator. When an implementation requires host memory for such an object, that memory is sourced from the object’s parent pool’s allocator.
The application is not expected to handle allocating memory that is intended
for execution by the host due to the complexities of differing security
implementations across multiple platforms.
The implementation will allocate such memory internally and invoke an
application provided informational callback when these internal
allocations are allocated and freed.
Upon allocation of executable memory, pfnInternalAllocation
will be
called.
Upon freeing executable memory, pfnInternalFree
will be called.
An implementation will only call an informational callback for executable
memory allocations and frees.
The allocationType
parameter to the pfnInternalAllocation
and
pfnInternalFree
functions may be one of the following values:
typedef enum VkInternalAllocationType {
VK_INTERNAL_ALLOCATION_TYPE_EXECUTABLE = 0,
} VkInternalAllocationType;
-
VK_INTERNAL_ALLOCATION_TYPE_EXECUTABLE
specifies that the allocation is intended for execution by the host.
An implementation must only make calls into an application-provided allocator during the execution of an API command. An implementation must only make calls into an application-provided allocator from the same thread that called the provoking API command. The implementation should not synchronize calls to any of the callbacks. If synchronization is needed, the callbacks must provide it themselves. The informational callbacks are subject to the same restrictions as the allocation callbacks.
If an implementation intends to make calls through a
VkAllocationCallbacks
structure between the time a vkCreate*
command returns and the time a corresponding vkDestroy*
command
begins, that implementation must save a copy of the allocator before the
vkCreate*
command returns.
The callback functions and any data structures they rely upon must remain
valid for the lifetime of the object they are associated with.
If an allocator is provided to a vkCreate*
command, a compatible
allocator must be provided to the corresponding vkDestroy*
command.
Two VkAllocationCallbacks
structures are compatible if memory
allocated with pfnAllocation
or pfnReallocation
in each can be
freed with pfnReallocation
or pfnFree
in the other.
An allocator must not be provided to a vkDestroy*
command if an
allocator was not provided to the corresponding vkCreate*
command.
If a non-NULL
allocator is used, the pfnAllocation
,
pfnReallocation
and pfnFree
members must be non-NULL
and
point to valid implementations of the callbacks.
An application can choose to not provide informational callbacks by setting
both pfnInternalAllocation
and pfnInternalFree
to NULL
.
pfnInternalAllocation
and pfnInternalFree
must either both be
NULL
or both be non-NULL
.
If pfnAllocation
or pfnReallocation
fail, the implementation
may fail object creation and/or generate an
VK_ERROR_OUT_OF_HOST_MEMORY
error, as appropriate.
Allocation callbacks must not call any Vulkan commands.
The following sets of rules define when an implementation is permitted to call the allocator callbacks.
pfnAllocation
or pfnReallocation
may be called in the following
situations:
-
Allocations scoped to a
VkDevice
or VkInstance
may be allocated from any API command. -
Allocations scoped to a command may be allocated from any API command.
-
Allocations scoped to a
VkPipelineCache
may only be allocated from:-
vkCreatePipelineCache
-
vkMergePipelineCaches
for dstCache
-
vkCreateGraphicsPipelines
for pipelineCache
-
vkCreateComputePipelines
for pipelineCache
-
-
Allocations scoped to a
VkValidationCacheEXT
may only be allocated from:-
vkCreateValidationCacheEXT
-
vkMergeValidationCachesEXT
for dstCache
-
vkCreateShaderModule
for validationCache
in VkShaderModuleValidationCacheCreateInfoEXT
-
-
Allocations scoped to a
VkDescriptorPool
may only be allocated from:-
any command that takes the pool as a direct argument
-
vkAllocateDescriptorSets
for the descriptorPool
member of its pAllocateInfo
parameter -
vkCreateDescriptorPool
-
-
Allocations scoped to a
VkCommandPool
may only be allocated from:-
any command that takes the pool as a direct argument
-
vkCreateCommandPool
-
vkAllocateCommandBuffers
for the commandPool
member of its pAllocateInfo
parameter -
any
vkCmd*
command whose commandBuffer
was allocated from that VkCommandPool
-
-
Allocations scoped to any other object may only be allocated in that object’s
vkCreate*
command.
pfnFree
may be called in the following situations:
-
Allocations scoped to a
VkDevice
or VkInstance
may be freed from any API command. -
Allocations scoped to a command must be freed by any API command which allocates such memory.
-
Allocations scoped to a
VkPipelineCache
may be freed from vkDestroyPipelineCache
. -
Allocations scoped to a
VkValidationCacheEXT
may be freed from vkDestroyValidationCacheEXT
. -
Allocations scoped to a
VkDescriptorPool
may be freed from-
any command that takes the pool as a direct argument
-
-
Allocations scoped to a
VkCommandPool
may be freed from:-
any command that takes the pool as a direct argument
-
vkResetCommandBuffer
whose commandBuffer
was allocated from that VkCommandPool
-
-
Allocations scoped to any other object may be freed in that object’s
vkDestroy*
command. -
Any command that allocates host memory may also free host memory of the same scope.
10.2. Device Memory
Device memory is memory that is visible to the device — for example the contents of the image or buffer objects, which can be natively used by the device.
Memory properties of a physical device describe the memory heaps and memory types available.
To query memory properties, call:
void vkGetPhysicalDeviceMemoryProperties(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceMemoryProperties* pMemoryProperties);
-
physicalDevice
is the handle to the device to query. -
pMemoryProperties
points to an instance of VkPhysicalDeviceMemoryProperties
structure in which the properties are returned.
The VkPhysicalDeviceMemoryProperties
structure is defined as:
typedef struct VkPhysicalDeviceMemoryProperties {
uint32_t memoryTypeCount;
VkMemoryType memoryTypes[VK_MAX_MEMORY_TYPES];
uint32_t memoryHeapCount;
VkMemoryHeap memoryHeaps[VK_MAX_MEMORY_HEAPS];
} VkPhysicalDeviceMemoryProperties;
-
memoryTypeCount
is the number of valid elements in the memoryTypes
array. -
memoryTypes
is an array of VkMemoryType structures describing the memory types that can be used to access memory allocated from the heaps specified by memoryHeaps
. -
memoryHeapCount
is the number of valid elements in the memoryHeaps
array. -
memoryHeaps
is an array of VkMemoryHeap structures describing the memory heaps from which memory can be allocated.
The VkPhysicalDeviceMemoryProperties
structure describes a number of
memory heaps as well as a number of memory types that can be used to
access memory allocated in those heaps.
Each heap describes a memory resource of a particular size, and each memory
type describes a set of memory properties (e.g. host cached vs uncached)
that can be used with a given memory heap.
Allocations using a particular memory type will consume resources from the
heap indicated by that memory type’s heap index.
More than one memory type may share each heap, and the heaps and memory
types provide a mechanism to advertise an accurate size of the physical
memory resources while allowing the memory to be used with a variety of
different properties.
The number of memory heaps is given by memoryHeapCount
and is less
than or equal to VK_MAX_MEMORY_HEAPS
.
Each heap is described by an element of the memoryHeaps
array as a
VkMemoryHeap structure.
The number of memory types available across all memory heaps is given by
memoryTypeCount
and is less than or equal to
VK_MAX_MEMORY_TYPES
.
Each memory type is described by an element of the memoryTypes
array
as a VkMemoryType structure.
At least one heap must include VK_MEMORY_HEAP_DEVICE_LOCAL_BIT
in
VkMemoryHeap::flags
.
If there are multiple heaps that all have similar performance
characteristics, they may all include
VK_MEMORY_HEAP_DEVICE_LOCAL_BIT
.
In a unified memory architecture (UMA) system there is often only a single
memory heap which is considered to be equally “local” to the host and to
the device, and such an implementation must advertise the heap as
device-local.
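As a non-normative usage sketch, an application could sum the device-local heap sizes reported for a physical device as follows; physicalDevice is assumed to be a valid VkPhysicalDevice handle:
VkPhysicalDeviceMemoryProperties memProperties;
vkGetPhysicalDeviceMemoryProperties(physicalDevice, &memProperties);

VkDeviceSize deviceLocalBytes = 0;
for (uint32_t i = 0; i < memProperties.memoryHeapCount; ++i) {
    if (memProperties.memoryHeaps[i].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT)
        deviceLocalBytes += memProperties.memoryHeaps[i].size;
}
// deviceLocalBytes now holds the total size of all device-local heaps.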
Each memory type returned by vkGetPhysicalDeviceMemoryProperties must
have its propertyFlags
set to one of the following values:
-
0
-
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
|
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
-
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
|
VK_MEMORY_PROPERTY_HOST_CACHED_BIT
-
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
|
VK_MEMORY_PROPERTY_HOST_CACHED_BIT
|
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
-
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
-
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
|
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
|
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
-
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
|
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
|
VK_MEMORY_PROPERTY_HOST_CACHED_BIT
-
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
|
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
|
VK_MEMORY_PROPERTY_HOST_CACHED_BIT
|
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
-
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
|
VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT
-
VK_MEMORY_PROPERTY_PROTECTED_BIT
-
VK_MEMORY_PROPERTY_PROTECTED_BIT
| VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
There must be at least one memory type with both the
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
and
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
bits set in its
propertyFlags
.
There must be at least one memory type with the
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
bit set in its
propertyFlags
.
For each pair of elements X and Y returned in memoryTypes
, X
must be placed at a lower index position than Y if:
-
either the set of bit flags returned in the
propertyFlags
member of X is a strict subset of the set of bit flags returned in the propertyFlags
member of Y. -
or the
propertyFlags
members of X and Y are equal, and X belongs to a memory heap with greater performance (as determined in an implementation-specific manner).
Note
There is no ordering requirement between X and Y elements for the case where
their propertyFlags members are not in a subset relation.
This ordering requirement enables applications to use a simple search loop to select the desired memory type along the lines of:
// Find a memory in `memoryTypeBitsRequirement` that includes all of `requiredProperties`
int32_t findProperties(const VkPhysicalDeviceMemoryProperties* pMemoryProperties,
uint32_t memoryTypeBitsRequirement,
VkMemoryPropertyFlags requiredProperties) {
const uint32_t memoryCount = pMemoryProperties->memoryTypeCount;
for (uint32_t memoryIndex = 0; memoryIndex < memoryCount; ++memoryIndex) {
const uint32_t memoryTypeBits = (1 << memoryIndex);
const bool isRequiredMemoryType = memoryTypeBitsRequirement & memoryTypeBits;
const VkMemoryPropertyFlags properties =
pMemoryProperties->memoryTypes[memoryIndex].propertyFlags;
const bool hasRequiredProperties =
(properties & requiredProperties) == requiredProperties;
if (isRequiredMemoryType && hasRequiredProperties)
return static_cast<int32_t>(memoryIndex);
}
// failed to find memory type
return -1;
}
// Try to find an optimal memory type, or if it does not exist try fallback memory type
// `device` is the VkDevice
// `image` is the VkImage that requires memory to be bound
// `memoryProperties` properties as returned by vkGetPhysicalDeviceMemoryProperties
// `requiredProperties` are the property flags that must be present
// `optimalProperties` are the property flags that are preferred by the application
VkMemoryRequirements memoryRequirements;
vkGetImageMemoryRequirements(device, image, &memoryRequirements);
int32_t memoryType =
findProperties(&memoryProperties, memoryRequirements.memoryTypeBits, optimalProperties);
if (memoryType == -1) // not found; try fallback properties
memoryType =
findProperties(&memoryProperties, memoryRequirements.memoryTypeBits, requiredProperties);
To query memory properties, call:
void vkGetPhysicalDeviceMemoryProperties2(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceMemoryProperties2* pMemoryProperties);
or the equivalent command
void vkGetPhysicalDeviceMemoryProperties2KHR(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceMemoryProperties2* pMemoryProperties);
-
physicalDevice
is the handle to the device to query. -
pMemoryProperties
points to an instance of VkPhysicalDeviceMemoryProperties2
structure in which the properties are returned.
vkGetPhysicalDeviceMemoryProperties2
behaves similarly to
vkGetPhysicalDeviceMemoryProperties, with the ability to return
extended information in a pNext
chain of output structures.
The VkPhysicalDeviceMemoryProperties2
structure is defined as:
typedef struct VkPhysicalDeviceMemoryProperties2 {
VkStructureType sType;
void* pNext;
VkPhysicalDeviceMemoryProperties memoryProperties;
} VkPhysicalDeviceMemoryProperties2;
or the equivalent
typedef VkPhysicalDeviceMemoryProperties2 VkPhysicalDeviceMemoryProperties2KHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
memoryProperties
is a structure of type VkPhysicalDeviceMemoryProperties which is populated with the same values as in vkGetPhysicalDeviceMemoryProperties.
The VkMemoryHeap
structure is defined as:
typedef struct VkMemoryHeap {
VkDeviceSize size;
VkMemoryHeapFlags flags;
} VkMemoryHeap;
-
size
is the total memory size in bytes in the heap. -
flags
is a bitmask of VkMemoryHeapFlagBits specifying attribute flags for the heap.
Bits which may be set in VkMemoryHeap::flags
, indicating
attribute flags for the heap, are:
typedef enum VkMemoryHeapFlagBits {
VK_MEMORY_HEAP_DEVICE_LOCAL_BIT = 0x00000001,
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT = 0x00000002,
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT_KHR = VK_MEMORY_HEAP_MULTI_INSTANCE_BIT,
} VkMemoryHeapFlagBits;
-
VK_MEMORY_HEAP_DEVICE_LOCAL_BIT
specifies that the heap corresponds to device local memory. Device local memory may have different performance characteristics than host local memory, and may support different memory property flags. -
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
specifies that in a logical device representing more than one physical device, there is a per-physical device instance of the heap memory. By default, an allocation from such a heap will be replicated to each physical device’s instance of the heap.
typedef VkFlags VkMemoryHeapFlags;
VkMemoryHeapFlags
is a bitmask type for setting a mask of zero or more
VkMemoryHeapFlagBits.
The VkMemoryType
structure is defined as:
typedef struct VkMemoryType {
VkMemoryPropertyFlags propertyFlags;
uint32_t heapIndex;
} VkMemoryType;
-
heapIndex
describes which memory heap this memory type corresponds to, and must be less than memoryHeapCount
from the VkPhysicalDeviceMemoryProperties structure. -
propertyFlags
is a bitmask of VkMemoryPropertyFlagBits of properties for this memory type.
Bits which may be set in VkMemoryType::propertyFlags
,
indicating properties of a memory heap, are:
typedef enum VkMemoryPropertyFlagBits {
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT = 0x00000001,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT = 0x00000002,
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT = 0x00000004,
VK_MEMORY_PROPERTY_HOST_CACHED_BIT = 0x00000008,
VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT = 0x00000010,
VK_MEMORY_PROPERTY_PROTECTED_BIT = 0x00000020,
} VkMemoryPropertyFlagBits;
-
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
bit specifies that memory allocated with this type is the most efficient for device access. This property will be set if and only if the memory type belongs to a heap with the VK_MEMORY_HEAP_DEVICE_LOCAL_BIT
set. -
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
bit specifies that memory allocated with this type can be mapped for host access using vkMapMemory. -
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
bit specifies that the host cache management commands vkFlushMappedMemoryRanges and vkInvalidateMappedMemoryRanges are not needed to flush host writes to the device or make device writes visible to the host, respectively. -
VK_MEMORY_PROPERTY_HOST_CACHED_BIT
bit specifies that memory allocated with this type is cached on the host. Host memory accesses to uncached memory are slower than to cached memory, however uncached memory is always host coherent. -
VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT
bit specifies that the memory type only allows device access to the memory. Memory types must not have both VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT
and VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
set. Additionally, the object’s backing memory may be provided by the implementation lazily as specified in Lazily Allocated Memory. -
VK_MEMORY_PROPERTY_PROTECTED_BIT
bit specifies that the memory type only allows device access to the memory, and allows protected queue operations to access the memory. Memory types must not have VK_MEMORY_PROPERTY_PROTECTED_BIT
set and any of VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
set, or VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
set, or VK_MEMORY_PROPERTY_HOST_CACHED_BIT
set.
typedef VkFlags VkMemoryPropertyFlags;
VkMemoryPropertyFlags
is a bitmask type for setting a mask of zero or
more VkMemoryPropertyFlagBits.
A Vulkan device operates on data in device memory via memory objects that
are represented in the API by a VkDeviceMemory
handle:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDeviceMemory)
To allocate memory objects, call:
VkResult vkAllocateMemory(
VkDevice device,
const VkMemoryAllocateInfo* pAllocateInfo,
const VkAllocationCallbacks* pAllocator,
VkDeviceMemory* pMemory);
-
device
is the logical device that owns the memory. -
pAllocateInfo
is a pointer to an instance of the VkMemoryAllocateInfo structure describing parameters of the allocation. A successful returned allocation must use the requested parameters — no substitution is permitted by the implementation. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pMemory
is a pointer to a VkDeviceMemory handle in which information about the allocated memory is returned.
Allocations returned by vkAllocateMemory
are guaranteed to meet any
alignment requirement of the implementation.
For example, if an implementation requires 128 byte alignment for images and
64 byte alignment for buffers, the device memory returned through this
mechanism would be 128-byte aligned.
This ensures that applications can correctly suballocate objects of
different types (with potentially different alignment requirements) in the
same memory object.
When memory is allocated, its contents are undefined with the following constraint:
-
The contents of unprotected memory must not be a function of data from protected memory objects, even if those memory objects were previously freed.
Note
The contents of memory allocated by one application should not be a function of data from protected memory objects of another application, even if those memory objects were previously freed.
The maximum number of valid memory allocations that can exist
simultaneously within a VkDevice may be restricted by implementation-
or platform-dependent limits.
If a call to vkAllocateMemory would cause the total number of
allocations to exceed these limits, such a call will fail and must return
VK_ERROR_TOO_MANY_OBJECTS
.
The
maxMemoryAllocationCount
feature describes the number of allocations that can exist simultaneously
before encountering these internal limits.
Some platforms may have a limit on the maximum size of a single allocation.
For example, certain systems may fail to create allocations with a size
greater than or equal to 4GB.
Such a limit is implementation-dependent, and if such a failure occurs then
the error VK_ERROR_OUT_OF_DEVICE_MEMORY
must be returned.
This limit is advertised in
VkPhysicalDeviceMaintenance3Properties::maxMemoryAllocationSize
.
The cumulative memory size allocated to a heap can be limited by the size
of the specified heap.
In such cases, allocated memory is tracked on a per-device and per-heap
basis.
Some platforms allow overallocation into other heaps.
The overallocation behavior can be specified through the
VK_AMD_memory_overallocation_behavior
extension.
The VkMemoryAllocateInfo
structure is defined as:
typedef struct VkMemoryAllocateInfo {
VkStructureType sType;
const void* pNext;
VkDeviceSize allocationSize;
uint32_t memoryTypeIndex;
} VkMemoryAllocateInfo;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
allocationSize
is the size of the allocation in bytes. -
memoryTypeIndex
is an index identifying a memory type from the memoryTypes
array of the VkPhysicalDeviceMemoryProperties structure.
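As an informal example, an allocation request could be built from a memory type index selected with the search loop shown earlier; device and memoryTypeIndex are assumed to be valid values obtained elsewhere, and the 64 MiB size is arbitrary:
VkMemoryAllocateInfo allocateInfo = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
    .pNext = NULL,
    .allocationSize = 64 * 1024 * 1024,   // 64 MiB
    .memoryTypeIndex = memoryTypeIndex
};

VkDeviceMemory memory = VK_NULL_HANDLE;
VkResult result = vkAllocateMemory(device, &allocateInfo, NULL, &memory);
if (result != VK_SUCCESS) {
    // e.g. VK_ERROR_OUT_OF_DEVICE_MEMORY or VK_ERROR_TOO_MANY_OBJECTS
}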
An instance of the VkMemoryAllocateInfo structure defines a memory
import operation if the pNext
chain contains an instance of one of the
following structures:
-
VkImportMemoryWin32HandleInfoKHR with non-zero
handleType
value -
VkImportMemoryFdInfoKHR with a non-zero
handleType
value -
VkImportMemoryHostPointerInfoEXT with a non-zero
handleType
value -
VkImportAndroidHardwareBufferInfoANDROID with a non-
NULL
buffer
value
Importing memory must not modify the content of the memory. Implementations must ensure that importing memory does not enable the importing Vulkan instance to access any memory or resources in other Vulkan instances other than that corresponding to the memory object imported. Implementations must also ensure accessing imported memory which has not been initialized does not allow the importing Vulkan instance to obtain data from the exporting Vulkan instance or vice-versa.
Note
How exported and imported memory is isolated is left to the implementation, but applications should be aware that such isolation may prevent implementations from placing multiple exportable memory objects in the same physical or virtual page. Hence, applications should avoid creating many small external memory objects whenever possible.
When performing a memory import operation, it is the responsibility of the
application to ensure the external handles meet all valid usage
requirements.
However, implementations must perform sufficient validation of external
handles to ensure that the operation results in a valid memory object which
will not cause program termination, device loss, queue stalls, or corruption
of other resources when used as allowed according to its allocation
parameters.
If the external handle provided does not meet these requirements, the
implementation must fail the memory import operation with the error code
VK_ERROR_INVALID_EXTERNAL_HANDLE
.
If the pNext
chain includes a VkMemoryDedicatedAllocateInfo
structure, then that structure includes a handle of the sole buffer or image
resource that the memory can be bound to.
The VkMemoryDedicatedAllocateInfo
structure is defined as:
typedef struct VkMemoryDedicatedAllocateInfo {
VkStructureType sType;
const void* pNext;
VkImage image;
VkBuffer buffer;
} VkMemoryDedicatedAllocateInfo;
or the equivalent
typedef VkMemoryDedicatedAllocateInfo VkMemoryDedicatedAllocateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
image
is VK_NULL_HANDLE or a handle of an image which this memory will be bound to. -
buffer
is VK_NULL_HANDLE or a handle of a buffer which this memory will be bound to.
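For illustration (non-normative), a dedicated allocation for a single image could be requested by chaining the structure as follows; device, image and memoryTypeIndex are assumed to be valid values chosen against the image’s memory requirements:
VkMemoryRequirements requirements;
vkGetImageMemoryRequirements(device, image, &requirements);

VkMemoryDedicatedAllocateInfo dedicatedInfo = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO,
    .pNext = NULL,
    .image = image,              // the sole resource this memory may be bound to
    .buffer = VK_NULL_HANDLE
};

VkMemoryAllocateInfo allocateInfo = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
    .pNext = &dedicatedInfo,
    .allocationSize = requirements.size,
    .memoryTypeIndex = memoryTypeIndex
};

VkDeviceMemory memory;
vkAllocateMemory(device, &allocateInfo, NULL, &memory);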
If the pNext
chain includes a
VkDedicatedAllocationMemoryAllocateInfoNV
structure, then that
structure includes a handle of the sole buffer or image resource that the
memory can be bound to.
The VkDedicatedAllocationMemoryAllocateInfoNV
structure is defined as:
typedef struct VkDedicatedAllocationMemoryAllocateInfoNV {
VkStructureType sType;
const void* pNext;
VkImage image;
VkBuffer buffer;
} VkDedicatedAllocationMemoryAllocateInfoNV;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
image
is VK_NULL_HANDLE or a handle of an image which this memory will be bound to. -
buffer
is VK_NULL_HANDLE or a handle of a buffer which this memory will be bound to.
When allocating memory that may be exported to another process or Vulkan
instance, add a VkExportMemoryAllocateInfo structure to the
pNext
chain of the VkMemoryAllocateInfo structure, specifying
the handle types that may be exported.
The VkExportMemoryAllocateInfo structure is defined as:
typedef struct VkExportMemoryAllocateInfo {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlags handleTypes;
} VkExportMemoryAllocateInfo;
or the equivalent
typedef VkExportMemoryAllocateInfo VkExportMemoryAllocateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleTypes
is a bitmask of VkExternalMemoryHandleTypeFlagBits specifying one or more memory handle types the application can export from the resulting allocation. The application can request multiple handle types for the same allocation.
To specify additional attributes of NT handles exported from a memory
object, add the VkExportMemoryWin32HandleInfoKHR structure to the
pNext
chain of the VkMemoryAllocateInfo structure.
The VkExportMemoryWin32HandleInfoKHR
structure is defined as:
typedef struct VkExportMemoryWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportMemoryWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
pAttributes
is a pointer to a Windows SECURITY_ATTRIBUTES
structure specifying security attributes of the handle. -
dwAccess
is a DWORD
specifying access rights of the handle. -
name
is a NULL-terminated UTF-16 string to associate with the underlying resource referenced by NT handles exported from the created memory.
If this structure is not present, or if pAttributes
is set to NULL
,
default security descriptor values will be used, and child processes created
by the application will not inherit the handle, as described in the MSDN
documentation for “Synchronization Object Security and Access Rights”[1].
Further, if the structure is not present, the access rights will be
DXGI_SHARED_RESOURCE_READ
| DXGI_SHARED_RESOURCE_WRITE
for handles of the following types:
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_BIT
And
GENERIC_ALL
for handles of the following types:
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_HEAP_BIT
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_RESOURCE_BIT
To import memory from a Windows handle, add a
VkImportMemoryWin32HandleInfoKHR structure to the pNext
chain of
the VkMemoryAllocateInfo structure.
The VkImportMemoryWin32HandleInfoKHR
structure is defined as:
typedef struct VkImportMemoryWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportMemoryWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleType
specifies the type of handle
or name
. -
handle
is the external handle to import, or NULL
. -
name
is a NULL-terminated UTF-16 string naming the underlying memory resource to import, or NULL
.
Importing memory objects from Windows handles does not transfer ownership of
the handle to the Vulkan implementation.
For handle types defined as NT handles, the application must release
ownership using the CloseHandle
system call when the handle is no
longer needed.
Applications can import the same underlying memory into multiple instances
of Vulkan, into the same instance from which it was exported, and multiple
times into a given Vulkan instance.
In all cases, each import operation must create a distinct
VkDeviceMemory
object.
To export a Windows handle representing the underlying resources of a Vulkan device memory object, call:
VkResult vkGetMemoryWin32HandleKHR(
VkDevice device,
const VkMemoryGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
-
device
is the logical device that created the device memory being exported. -
pGetWin32HandleInfo
is a pointer to an instance of the VkMemoryGetWin32HandleInfoKHR structure containing parameters of the export operation. -
pHandle
will return the Windows handle representing the underlying resources of the device memory object.
For handle types defined as NT handles, the handles returned by
vkGetMemoryWin32HandleKHR
are owned by the application.
To avoid leaking resources, the application must release ownership of them
using the CloseHandle
system call when they are no longer needed.
The VkMemoryGetWin32HandleInfoKHR
structure is defined as:
typedef struct VkMemoryGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
VkExternalMemoryHandleTypeFlagBits handleType;
} VkMemoryGetWin32HandleInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
memory
is the memory object from which the handle will be exported. -
handleType
is the type of handle requested.
The properties of the handle returned depend on the value of
handleType
.
See VkExternalMemoryHandleTypeFlagBits for a description of the
properties of the defined external memory handle types.
Windows memory handles compatible with Vulkan may also be created by non-Vulkan APIs using methods beyond the scope of this specification. To determine the correct parameters to use when importing such handles, call:
VkResult vkGetMemoryWin32HandlePropertiesKHR(
VkDevice device,
VkExternalMemoryHandleTypeFlagBits handleType,
HANDLE handle,
VkMemoryWin32HandlePropertiesKHR* pMemoryWin32HandleProperties);
-
device
is the logical device that will be importing handle
. -
handleType
is the type of the handle handle
. -
handle
is the handle which will be imported. -
pMemoryWin32HandleProperties
will return properties of handle
.
The VkMemoryWin32HandlePropertiesKHR
structure returned is defined as:
typedef struct VkMemoryWin32HandlePropertiesKHR {
VkStructureType sType;
void* pNext;
uint32_t memoryTypeBits;
} VkMemoryWin32HandlePropertiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
memoryTypeBits
is a bitmask containing one bit set for every memory type which the specified Windows handle can be imported as.
To import memory from a POSIX file descriptor handle, add a
VkImportMemoryFdInfoKHR structure to the pNext
chain of the
VkMemoryAllocateInfo structure.
The VkImportMemoryFdInfoKHR
structure is defined as:
typedef struct VkImportMemoryFdInfoKHR {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagBits handleType;
int fd;
} VkImportMemoryFdInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleType
specifies the handle type of fd
. -
fd
is the external handle to import.
Importing memory from a file descriptor transfers ownership of the file descriptor from the application to the Vulkan implementation. The application must not perform any operations on the file descriptor after a successful import.
Applications can import the same underlying memory into multiple instances
of Vulkan, into the same instance from which it was exported, and multiple
times into a given Vulkan instance.
In all cases, each import operation must create a distinct
VkDeviceMemory
object.
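As an informal sketch, importing an opaque file descriptor previously exported with vkGetMemoryFdKHR (described below) could look as follows; for opaque file descriptors the allocation size and memory type index are assumed to match those of the exporting allocation:
VkImportMemoryFdInfoKHR importInfo = {
    .sType = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
    .pNext = NULL,
    .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
    .fd = fd
};

VkMemoryAllocateInfo allocateInfo = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
    .pNext = &importInfo,
    .allocationSize = exportedSize,        // size of the exporting allocation
    .memoryTypeIndex = exportedTypeIndex   // memory type of the exporting allocation
};

VkDeviceMemory importedMemory;
if (vkAllocateMemory(device, &allocateInfo, NULL, &importedMemory) == VK_SUCCESS) {
    // Ownership of fd has transferred to the implementation; do not close() it.
}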
To export a POSIX file descriptor representing the underlying resources of a Vulkan device memory object, call:
VkResult vkGetMemoryFdKHR(
VkDevice device,
const VkMemoryGetFdInfoKHR* pGetFdInfo,
int* pFd);
-
device
is the logical device that created the device memory being exported. -
pGetFdInfo
is a pointer to an instance of the VkMemoryGetFdInfoKHR structure containing parameters of the export operation. -
pFd
will return a file descriptor representing the underlying resources of the device memory object.
Each call to vkGetMemoryFdKHR
must create a new file descriptor and
transfer ownership of it to the application.
To avoid leaking resources, the application must release ownership of the
file descriptor using the close
system call when it is no longer
needed, or by importing a Vulkan memory object from it.
Where supported by the operating system, the implementation must set the
file descriptor to be closed automatically when an execve
system call
is made.
The VkMemoryGetFdInfoKHR
structure is defined as:
typedef struct VkMemoryGetFdInfoKHR {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
VkExternalMemoryHandleTypeFlagBits handleType;
} VkMemoryGetFdInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
memory
is the memory object from which the handle will be exported. -
handleType
is the type of handle requested.
The properties of the file descriptor exported depend on the value of
handleType
.
See VkExternalMemoryHandleTypeFlagBits for a description of the
properties of the defined external memory handle types.
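A non-normative export sketch follows; exportableMemory is assumed to have been allocated with a matching VkExportMemoryAllocateInfo as shown earlier:
VkMemoryGetFdInfoKHR getFdInfo = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR,
    .pNext = NULL,
    .memory = exportableMemory,
    .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT
};

int fd = -1;
if (vkGetMemoryFdKHR(device, &getFdInfo, &fd) == VK_SUCCESS) {
    // The application owns fd and must either close() it or consume it
    // by importing it into another Vulkan memory object.
}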
Note
The size of the exported file may be larger than the size requested by
VkMemoryAllocateInfo::allocationSize.
POSIX file descriptor memory handles compatible with Vulkan may also be created by non-Vulkan APIs using methods beyond the scope of this specification. To determine the correct parameters to use when importing such handles, call:
VkResult vkGetMemoryFdPropertiesKHR(
VkDevice device,
VkExternalMemoryHandleTypeFlagBits handleType,
int fd,
VkMemoryFdPropertiesKHR* pMemoryFdProperties);
-
device
is the logical device that will be importing fd
. -
handleType
is the type of the handle fd
. -
fd
is the handle which will be imported. -
pMemoryFdProperties
is a pointer to a VkMemoryFdPropertiesKHR structure in which the properties of the handle fd
are returned.
The VkMemoryFdPropertiesKHR
structure returned is defined as:
typedef struct VkMemoryFdPropertiesKHR {
VkStructureType sType;
void* pNext;
uint32_t memoryTypeBits;
} VkMemoryFdPropertiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
memoryTypeBits
is a bitmask containing one bit set for every memory type which the specified file descriptor can be imported as.
To import memory from a host pointer, add a
VkImportMemoryHostPointerInfoEXT structure to the pNext
chain of
the VkMemoryAllocateInfo structure.
The VkImportMemoryHostPointerInfoEXT
structure is defined as:
typedef struct VkImportMemoryHostPointerInfoEXT {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagBits handleType;
void* pHostPointer;
} VkImportMemoryHostPointerInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleType
specifies the handle type. -
pHostPointer
is the host pointer to import from.
Importing memory from a host pointer shares ownership of the memory between the host and the Vulkan implementation. The application can continue to access the memory through the host pointer but it is the application’s responsibility to synchronize device and non-device access to the underlying memory as defined in Host Access to Device Memory Objects.
Applications can import the same underlying memory into multiple instances of Vulkan and multiple times into a given Vulkan instance. However, implementations may fail to import the same underlying memory multiple times into a given physical device due to platform constraints.
Importing memory from a particular host pointer may not be possible due to
additional platform-specific restrictions beyond the scope of this
specification in which case the implementation must fail the memory import
operation with the error code VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR
.
The application must ensure that the imported memory range remains valid and accessible for the lifetime of the imported memory object.
To determine the correct parameters to use when importing host pointers, call:
VkResult vkGetMemoryHostPointerPropertiesEXT(
VkDevice device,
VkExternalMemoryHandleTypeFlagBits handleType,
const void* pHostPointer,
VkMemoryHostPointerPropertiesEXT* pMemoryHostPointerProperties);
-
device
is the logical device that will be importing pHostPointer
. -
handleType
is the type of the handle pHostPointer
. -
pHostPointer
is the host pointer to import from.
The VkMemoryHostPointerPropertiesEXT
structure is defined as:
typedef struct VkMemoryHostPointerPropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t memoryTypeBits;
} VkMemoryHostPointerPropertiesEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
memoryTypeBits
is a bitmask containing one bit set for every memory type which the specified host pointer can be imported as.
To import memory created outside of the current Vulkan instance from an
Android hardware buffer, add a
VkImportAndroidHardwareBufferInfoANDROID
structure to the pNext
chain of the VkMemoryAllocateInfo structure.
The VkImportAndroidHardwareBufferInfoANDROID
structure is defined as:
typedef struct VkImportAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
struct AHardwareBuffer* buffer;
} VkImportAndroidHardwareBufferInfoANDROID;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
buffer
is the Android hardware buffer to import.
If the vkAllocateMemory command succeeds, the implementation must acquire a reference to the imported hardware buffer, which it must release when the device memory object is freed. If the command fails, the implementation must not retain a reference.
To export an Android hardware buffer representing the underlying resources of a Vulkan device memory object, call:
VkResult vkGetMemoryAndroidHardwareBufferANDROID(
VkDevice device,
const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo,
struct AHardwareBuffer** pBuffer);
-
device
is the logical device that created the device memory being exported. -
pInfo
is a pointer to an instance of the VkMemoryGetAndroidHardwareBufferInfoANDROID structure containing parameters of the export operation. -
pBuffer
will return an Android hardware buffer representing the underlying resources of the device memory object.
Each call to vkGetMemoryAndroidHardwareBufferANDROID
must return an
Android hardware buffer with a new reference acquired in addition to the
reference held by the VkDeviceMemory.
To avoid leaking resources, the application must release the reference by
calling AHardwareBuffer_release
when it is no longer needed.
When called with the same handle in
VkMemoryGetAndroidHardwareBufferInfoANDROID::memory
,
vkGetMemoryAndroidHardwareBufferANDROID
must return the same Android
hardware buffer object.
If the device memory was created by importing an Android hardware buffer,
vkGetMemoryAndroidHardwareBufferANDROID
must return that same Android
hardware buffer object.
The VkMemoryGetAndroidHardwareBufferInfoANDROID
structure is defined
as:
typedef struct VkMemoryGetAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
} VkMemoryGetAndroidHardwareBufferInfoANDROID;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
memory
is the memory object from which the Android hardware buffer will be exported.
To determine the memory parameters to use when importing an Android hardware buffer, call:
VkResult vkGetAndroidHardwareBufferPropertiesANDROID(
VkDevice device,
const struct AHardwareBuffer* buffer,
VkAndroidHardwareBufferPropertiesANDROID* pProperties);
-
device
is the logical device that will be importing buffer
. -
buffer
is the Android hardware buffer which will be imported. -
pProperties
is a pointer to a VkAndroidHardwareBufferPropertiesANDROID structure in which the properties of buffer
are returned.
The VkAndroidHardwareBufferPropertiesANDROID
structure returned is
defined as:
typedef struct VkAndroidHardwareBufferPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkDeviceSize allocationSize;
uint32_t memoryTypeBits;
} VkAndroidHardwareBufferPropertiesANDROID;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
allocationSize
is the size of the external memory. -
memoryTypeBits
is a bitmask containing one bit set for every memory type which the specified Android hardware buffer can be imported as.
To obtain format properties of an Android hardware buffer, include an
instance of VkAndroidHardwareBufferFormatPropertiesANDROID
in the
pNext
chain of the VkAndroidHardwareBufferPropertiesANDROID
instance passed to vkGetAndroidHardwareBufferPropertiesANDROID.
This structure is defined as:
typedef struct VkAndroidHardwareBufferFormatPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkFormat format;
uint64_t externalFormat;
VkFormatFeatureFlags formatFeatures;
VkComponentMapping samplerYcbcrConversionComponents;
VkSamplerYcbcrModelConversion suggestedYcbcrModel;
VkSamplerYcbcrRange suggestedYcbcrRange;
VkChromaLocation suggestedXChromaOffset;
VkChromaLocation suggestedYChromaOffset;
} VkAndroidHardwareBufferFormatPropertiesANDROID;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
format
is the Vulkan format corresponding to the Android hardware buffer’s format, or VK_FORMAT_UNDEFINED
if there is not an equivalent Vulkan format. -
externalFormat
is an implementation-defined external format identifier for use with VkExternalFormatANDROID. It must not be zero. -
formatFeatures
describes the capabilities of this external format when used with an image bound to memory imported from buffer
. -
samplerYcbcrConversionComponents
is the component swizzle that should be used in VkSamplerYcbcrConversionCreateInfo. -
suggestedYcbcrModel
is a suggested color model to use in the VkSamplerYcbcrConversionCreateInfo. -
suggestedYcbcrRange
is a suggested numerical value range to use in VkSamplerYcbcrConversionCreateInfo. -
suggestedXChromaOffset
is a suggested X chroma offset to use in VkSamplerYcbcrConversionCreateInfo. -
suggestedYChromaOffset
is a suggested Y chroma offset to use in VkSamplerYcbcrConversionCreateInfo.
If the Android hardware buffer has one of the formats listed in the
Format Equivalence
table, then format
must have the equivalent Vulkan format listed in
the table.
Otherwise, format
may be VK_FORMAT_UNDEFINED
, indicating the
Android hardware buffer can only be used with an external format.
The formatFeatures
member must include
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT
and at least one of
VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT
or
VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT
, and should include
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT
and
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT
.
Android hardware buffers with the same external format must have the same
support for VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT
,
VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT
,
VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT
,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT
,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT
,
and
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_FORCEABLE_BIT
in formatFeatures.
Other format features may differ between Android hardware buffers that have
the same external format.
This allows applications to use the same VkSamplerYcbcrConversion
object (and samplers and pipelines created from them) for any Android
hardware buffers that have the same external format.
If format
is not VK_FORMAT_UNDEFINED
, then the value of
samplerYcbcrConversionComponents
must be valid when used as the
components
member of VkSamplerYcbcrConversionCreateInfo with
that format.
If format
is VK_FORMAT_UNDEFINED
, all members of
samplerYcbcrConversionComponents
must be
VK_COMPONENT_SWIZZLE_IDENTITY
.
Implementations may not always be able to determine the color model,
numerical range, or chroma offsets of the image contents, so the values in
VkAndroidHardwareBufferFormatPropertiesANDROID
are only suggestions.
Applications should treat these values as sensible defaults to use in the
absence of more reliable information obtained through some other means.
If the underlying physical device is also usable via OpenGL ES with the
GL_OES_EGL_image_external
extension, the implementation should suggest values that will produce
similar sampled values as would be obtained by sampling the same external
image via samplerExternalOES
in OpenGL ES using equivalent sampler
parameters.
Note
Since GL_OES_EGL_image_external does not require the same sampling and conversion calculations as Vulkan does, achieving identical results between APIs may not be possible on some implementations.
When allocating memory that may be exported to another process or Vulkan
instance, add a VkExportMemoryAllocateInfoNV structure to the
pNext
chain of the VkMemoryAllocateInfo structure, specifying
the handle types that may be exported.
The VkExportMemoryAllocateInfoNV structure is defined as:
typedef struct VkExportMemoryAllocateInfoNV {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagsNV handleTypes;
} VkExportMemoryAllocateInfoNV;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleTypes
is a bitmask of VkExternalMemoryHandleTypeFlagBitsNV specifying one or more memory handle types that may be exported. Multiple handle types may be requested for the same allocation as long as they are compatible, as reported by vkGetPhysicalDeviceExternalImageFormatPropertiesNV.
When VkExportMemoryAllocateInfoNV::handleTypes
includes
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV
, add a
VkExportMemoryWin32HandleInfoNV
to the pNext
chain of the
VkExportMemoryAllocateInfoNV structure to specify security attributes
and access rights for the memory object’s external handle.
The VkExportMemoryWin32HandleInfoNV
structure is defined as:
typedef struct VkExportMemoryWin32HandleInfoNV {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
} VkExportMemoryWin32HandleInfoNV;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
pAttributes
is a pointer to a Windows SECURITY_ATTRIBUTES
structure specifying security attributes of the handle. -
dwAccess
is a DWORD
specifying access rights of the handle.
If this structure is not present, or if pAttributes
is set to NULL
,
default security descriptor values will be used, and child processes created
by the application will not inherit the handle, as described in the MSDN
documentation for “Synchronization Object Security and Access Rights”[1].
Further, if the structure is not present, the access rights will be
DXGI_SHARED_RESOURCE_READ | DXGI_SHARED_RESOURCE_WRITE.
To import memory created on the same physical device but outside of the
current Vulkan instance, add a VkImportMemoryWin32HandleInfoNV
structure to the pNext
chain of the VkMemoryAllocateInfo
structure, specifying a handle to and the type of the memory.
The VkImportMemoryWin32HandleInfoNV
structure is defined as:
typedef struct VkImportMemoryWin32HandleInfoNV {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagsNV handleType;
HANDLE handle;
} VkImportMemoryWin32HandleInfoNV;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
handleType
is 0
or a VkExternalMemoryHandleTypeFlagBitsNV value specifying the type of memory handle in handle
. -
handle
is a Windows HANDLE
referring to the memory.
If handleType
is 0
, this structure is ignored by consumers of the
VkMemoryAllocateInfo structure it is chained from.
Possible values of VkImportMemoryWin32HandleInfoNV::handleType
,
specifying the type of an external memory handle, are:
typedef enum VkExternalMemoryHandleTypeFlagBitsNV {
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV = 0x00000001,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT_NV = 0x00000002,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_IMAGE_BIT_NV = 0x00000004,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_IMAGE_KMT_BIT_NV = 0x00000008,
} VkExternalMemoryHandleTypeFlagBitsNV;
-
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT_NV
specifies a handle to memory returned by vkGetMemoryWin32HandleNV. -
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV
specifies a handle to memory returned by vkGetMemoryWin32HandleNV, or one duplicated from such a handle using DuplicateHandle()
. -
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_IMAGE_BIT_NV
specifies a valid NT handle to memory returned by IDXGIResource1::CreateSharedHandle()
, or a handle duplicated from such a handle using DuplicateHandle()
. -
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_IMAGE_KMT_BIT_NV
specifies a handle to memory returned by IDXGIResource::GetSharedHandle()
.
typedef VkFlags VkExternalMemoryHandleTypeFlagsNV;
VkExternalMemoryHandleTypeFlagsNV
is a bitmask type for setting a mask
of zero or more VkExternalMemoryHandleTypeFlagBitsNV.
To retrieve the handle corresponding to a device memory object created with
VkExportMemoryAllocateInfoNV::handleTypes
set to include
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV
or
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT_NV
, call:
VkResult vkGetMemoryWin32HandleNV(
VkDevice device,
VkDeviceMemory memory,
VkExternalMemoryHandleTypeFlagsNV handleType,
HANDLE* pHandle);
-
device
is the logical device that owns the memory. -
memory
is the VkDeviceMemory object. -
handleType
is a bitmask of VkExternalMemoryHandleTypeFlagBitsNV containing a single bit specifying the type of handle requested. -
pHandle
points to a Windows HANDLE
in which the handle is returned.
If the pNext
chain of VkMemoryAllocateInfo includes a
VkMemoryAllocateFlagsInfo
structure, then that structure includes
flags and a device mask controlling how many instances of the memory will be
allocated.
The VkMemoryAllocateFlagsInfo
structure is defined as:
typedef struct VkMemoryAllocateFlagsInfo {
VkStructureType sType;
const void* pNext;
VkMemoryAllocateFlags flags;
uint32_t deviceMask;
} VkMemoryAllocateFlagsInfo;
or the equivalent
typedef VkMemoryAllocateFlagsInfo VkMemoryAllocateFlagsInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
flags
is a bitmask of VkMemoryAllocateFlagBits controlling the allocation. -
deviceMask
is a mask of physical devices in the logical device, indicating that memory must be allocated on each device in the mask, if VK_MEMORY_ALLOCATE_DEVICE_MASK_BIT
is set in flags
.
If VK_MEMORY_ALLOCATE_DEVICE_MASK_BIT
is not set, the number of
instances allocated depends on whether
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
is set in the memory heap.
If VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
is set, then memory is allocated
for every physical device in the logical device (as if deviceMask
has
bits set for all device indices).
If VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
is not set, then a single
instance of memory is allocated (as if deviceMask
is set to one).
On some implementations, allocations from a multi-instance heap may consume
memory on all physical devices even if the deviceMask
excludes some
devices.
If VkPhysicalDeviceGroupProperties::subsetAllocation
is
VK_TRUE
, then memory is only consumed for the devices in the device
mask.
Note
In practice, most allocations on a multi-instance heap will be allocated across all physical devices. Unicast allocation support is an optional optimization for a minority of allocations.
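As an informative sketch, an allocation explicitly replicated across the first two physical devices of a device group might chain VkMemoryAllocateFlagsInfo as below; the allocation size and memory type index are assumed to have been determined earlier.
// Informative sketch: allocate one instance on device index 0 and one on index 1.
VkMemoryAllocateFlagsInfo flagsInfo = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_FLAGS_INFO,
    .flags = VK_MEMORY_ALLOCATE_DEVICE_MASK_BIT,
    .deviceMask = 0x3,   // bits 0 and 1: both physical devices in the group
};
VkMemoryAllocateInfo allocInfo = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
    .pNext = &flagsInfo,
    .allocationSize = size,            // assumed from memory requirements
    .memoryTypeIndex = memoryTypeIndex,
};
VkDeviceMemory memory;
VkResult result = vkAllocateMemory(device, &allocInfo, NULL, &memory);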
Bits which can be set in VkMemoryAllocateFlagsInfo::flags
,
controlling device memory allocation, are:
typedef enum VkMemoryAllocateFlagBits {
VK_MEMORY_ALLOCATE_DEVICE_MASK_BIT = 0x00000001,
VK_MEMORY_ALLOCATE_DEVICE_MASK_BIT_KHR = VK_MEMORY_ALLOCATE_DEVICE_MASK_BIT,
} VkMemoryAllocateFlagBits;
or the equivalent
typedef VkMemoryAllocateFlagBits VkMemoryAllocateFlagBitsKHR;
- VK_MEMORY_ALLOCATE_DEVICE_MASK_BIT specifies that memory will be allocated for the devices in VkMemoryAllocateFlagsInfo::deviceMask.
typedef VkFlags VkMemoryAllocateFlags;
or the equivalent
typedef VkMemoryAllocateFlags VkMemoryAllocateFlagsKHR;
VkMemoryAllocateFlags
is a bitmask type for setting a mask of zero or
more VkMemoryAllocateFlagBits.
To free a memory object, call:
void vkFreeMemory(
VkDevice device,
VkDeviceMemory memory,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that owns the memory.
- memory is the VkDeviceMemory object to be freed.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
Before freeing a memory object, an application must ensure the memory object is no longer in use by the device—for example by command buffers in the pending state. The memory can remain bound to images or buffers at the time the memory object is freed, but any further use of them (on host or device) for anything other than destroying those objects will result in undefined behavior. If there are still any bound images or buffers, the memory may not be immediately released by the implementation, but must be released by the time all bound images and buffers have been destroyed. Once memory is released, it is returned to the heap from which it was allocated.
How memory objects are bound to Images and Buffers is described in detail in the Resource Memory Association section.
If a memory object is mapped at the time it is freed, it is implicitly unmapped.
Note
As described below, host writes are not implicitly flushed when the memory object is unmapped, but the implementation must guarantee that writes that have not been flushed do not affect any other memory.
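An informative sketch of releasing a memory object after ensuring it is no longer in use by the device follows; it conservatively uses vkDeviceWaitIdle, and the buffer bound to the memory is assumed to have been created elsewhere.
// Informative sketch: ensure no pending device use, then destroy and free.
vkDeviceWaitIdle(device);                 // conservative; waiting on a fence also suffices
vkDestroyBuffer(device, buffer, NULL);    // 'buffer' assumed bound to 'memory'
vkFreeMemory(device, memory, NULL);       // the memory is implicitly unmapped if mapped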
10.2.1. Host Access to Device Memory Objects
Memory objects created with vkAllocateMemory are not directly host accessible.
Memory objects created with the memory property
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
are considered mappable.
Memory objects must be mappable in order to be successfully mapped on the
host.
To retrieve a host virtual address pointer to a region of a mappable memory object, call:
VkResult vkMapMemory(
VkDevice device,
VkDeviceMemory memory,
VkDeviceSize offset,
VkDeviceSize size,
VkMemoryMapFlags flags,
void** ppData);
- device is the logical device that owns the memory.
- memory is the VkDeviceMemory object to be mapped.
- offset is a zero-based byte offset from the beginning of the memory object.
- size is the size of the memory range to map, or VK_WHOLE_SIZE to map from offset to the end of the allocation.
- flags is reserved for future use.
- ppData points to a pointer in which is returned a host-accessible pointer to the beginning of the mapped range. This pointer minus offset must be aligned to at least VkPhysicalDeviceLimits::minMemoryMapAlignment.
After a successful call to vkMapMemory
the memory object memory
is considered to be currently host mapped.
It is an application error to call vkMapMemory
on a memory object that
is already host mapped.
Note
vkMapMemory
does not check whether the device memory is currently in
use before returning the host-accessible pointer.
The application must guarantee that any previously submitted command that
writes to this range has completed before the host reads from or writes to
that range, and that any previously submitted command that reads from that
range has completed before the host writes to that region (see
here for details on fulfilling
such a guarantee).
If the device memory was allocated without the
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
set, these guarantees must be
made for an extended range: the application must round down the start of
the range to the nearest multiple of
VkPhysicalDeviceLimits::nonCoherentAtomSize
, and round the end
of the range up to the nearest multiple of
VkPhysicalDeviceLimits::nonCoherentAtomSize
.
While a range of device memory is host mapped, the application is responsible for synchronizing both device and host access to that memory range.
Note
It is important for the application developer to become meticulously familiar with all of the mechanisms described in the chapter on Synchronization and Cache Control, as they are crucial to maintaining memory access ordering.
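The following informative sketch maps a non-coherent allocation, writes to it from the host, and flushes the written range after rounding it outward to multiples of nonCoherentAtomSize as described above; the function name and parameters are illustrative only, and the limits structure is assumed to have been queried earlier.
// Informative sketch only; requires <vulkan/vulkan.h> and <string.h>.
void writeAndFlush(VkDevice device, VkDeviceMemory memory,
                   const VkPhysicalDeviceLimits* limits,
                   VkDeviceSize writeOffset, const void* srcData, size_t writeSize)
{
    VkDeviceSize atom = limits->nonCoherentAtomSize;

    void* data = NULL;
    vkMapMemory(device, memory, 0, VK_WHOLE_SIZE, 0, &data);
    memcpy((char*)data + writeOffset, srcData, writeSize);

    // Round the flushed range outward to multiples of nonCoherentAtomSize;
    // clamp to the allocation size, or use VK_WHOLE_SIZE, if the rounded end
    // would pass the end of the allocation.
    VkDeviceSize flushStart = (writeOffset / atom) * atom;
    VkDeviceSize flushEnd   = ((writeOffset + writeSize + atom - 1) / atom) * atom;
    VkMappedMemoryRange range = {
        .sType  = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
        .memory = memory,
        .offset = flushStart,
        .size   = flushEnd - flushStart,
    };
    vkFlushMappedMemoryRanges(device, 1, &range);
    vkUnmapMemory(device, memory);
}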
typedef VkFlags VkMemoryMapFlags;
VkMemoryMapFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
Two commands are provided to enable applications to work with non-coherent
memory allocations: vkFlushMappedMemoryRanges
and
vkInvalidateMappedMemoryRanges
.
Note
If the memory object was created with the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT property, vkFlushMappedMemoryRanges and vkInvalidateMappedMemoryRanges are unnecessary and may have a performance cost. However, availability and visibility operations still need to be managed on the device.
To flush ranges of non-coherent memory from the host caches, call:
VkResult vkFlushMappedMemoryRanges(
VkDevice device,
uint32_t memoryRangeCount,
const VkMappedMemoryRange* pMemoryRanges);
- device is the logical device that owns the memory ranges.
- memoryRangeCount is the length of the pMemoryRanges array.
- pMemoryRanges is a pointer to an array of VkMappedMemoryRange structures describing the memory ranges to flush.
vkFlushMappedMemoryRanges
guarantees that host writes to the memory
ranges described by pMemoryRanges
are made available to the host
memory domain, such that they can be made available to the device memory
domain via memory
domain operations using the VK_ACCESS_HOST_WRITE_BIT
access type.
Within each range described by pMemoryRanges
, each set of
nonCoherentAtomSize
bytes in that range is flushed if any byte in that
set has been written by the host since it was first host mapped, or the last
time it was flushed.
If pMemoryRanges
includes sets of nonCoherentAtomSize
bytes
where no bytes have been written by the host, those bytes must not be
flushed.
Unmapping non-coherent memory does not implicitly flush the host mapped memory, and host writes that have not been flushed may not ever be visible to the device. However, implementations must ensure that writes that have not been flushed do not become visible to any other memory.
Note
The above guarantee avoids a potential memory corruption in scenarios where host writes to a mapped memory object have not been flushed before the memory is unmapped (or freed), and the virtual address range is subsequently reused for a different mapping (or memory allocation).
To invalidate ranges of non-coherent memory from the host caches, call:
VkResult vkInvalidateMappedMemoryRanges(
VkDevice device,
uint32_t memoryRangeCount,
const VkMappedMemoryRange* pMemoryRanges);
- device is the logical device that owns the memory ranges.
- memoryRangeCount is the length of the pMemoryRanges array.
- pMemoryRanges is a pointer to an array of VkMappedMemoryRange structures describing the memory ranges to invalidate.
vkInvalidateMappedMemoryRanges
guarantees that device writes to the
memory ranges described by pMemoryRanges
, which have been made
available to the host memory domain using the VK_ACCESS_HOST_WRITE_BIT
and VK_ACCESS_HOST_READ_BIT
access
types, are made visible to the host.
If a range of non-coherent memory is written by the host and then
invalidated without first being flushed, its contents are undefined.
Within each range described by pMemoryRanges
, each set of
nonCoherentAtomSize
bytes in that range is invalidated if any byte in
that set has been written by the device since it was first host mapped, or
the last time it was invalidated.
Note
Mapping non-coherent memory does not implicitly invalidate the mapped memory, and device writes that have not been invalidated must be made visible before the host reads or overwrites them.
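As an informative sketch, a host readback from non-coherent memory might invalidate the whole mapping before reading; it assumes the device work that wrote the memory has already completed (for example, via a fence wait) and that the memory type is host-visible but not host-coherent.
// Informative sketch: make device writes visible to the host before reading.
void* data = NULL;
vkMapMemory(device, memory, 0, VK_WHOLE_SIZE, 0, &data);

VkMappedMemoryRange range = {
    .sType  = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
    .memory = memory,
    .offset = 0,
    .size   = VK_WHOLE_SIZE,   // whole mapping, so no atom-size rounding is needed
};
vkInvalidateMappedMemoryRanges(device, 1, &range);
// Host reads through 'data' now observe the device writes.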
The VkMappedMemoryRange
structure is defined as:
typedef struct VkMappedMemoryRange {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
VkDeviceSize offset;
VkDeviceSize size;
} VkMappedMemoryRange;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- memory is the memory object to which this range belongs.
- offset is the zero-based byte offset from the beginning of the memory object.
- size is either the size of the range, or VK_WHOLE_SIZE to affect the range from offset to the end of the current mapping of the allocation.
To unmap a memory object once host access to it is no longer needed by the application, call:
void vkUnmapMemory(
VkDevice device,
VkDeviceMemory memory);
- device is the logical device that owns the memory.
- memory is the memory object to be unmapped.
10.2.2. Lazily Allocated Memory
If the memory object is allocated from a heap with the
VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT
bit set, that object’s backing
memory may be provided by the implementation lazily.
The actual committed size of the memory may initially be as small as zero
(or as large as the requested size), and monotonically increases as
additional memory is needed.
A memory type with this flag set is only allowed to be bound to a
VkImage
whose usage flags include
VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT
.
Note
Using lazily allocated memory objects for framebuffer attachments that are not needed once a render pass instance has completed may allow some implementations to never allocate memory for such attachments.
To determine the amount of lazily-allocated memory that is currently committed for a memory object, call:
void vkGetDeviceMemoryCommitment(
VkDevice device,
VkDeviceMemory memory,
VkDeviceSize* pCommittedMemoryInBytes);
- device is the logical device that owns the memory.
- memory is the memory object being queried.
- pCommittedMemoryInBytes is a pointer to a VkDeviceSize value in which the number of bytes currently committed is returned, on success.
The implementation may update the commitment at any time, and the value returned by this query may be out of date.
The implementation guarantees to allocate any committed memory from the heapIndex indicated by the memory type that the memory object was created with.
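For illustration, an application might poll the current commitment of a lazily allocated memory object as sketched below.
// Informative sketch: query the commitment of a lazily allocated memory object.
VkDeviceSize committed = 0;
vkGetDeviceMemoryCommitment(device, memory, &committed);
// 'committed' may already be out of date by the time it is read.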
10.2.3. Protected Memory
Protected memory divides device memory into protected device memory and unprotected device memory.
Protected memory adds the following concepts:
- Memory:
  - Unprotected device memory, which can be visible to the device and can be visible to the host
  - Protected device memory, which can be visible to the device but must not be visible to the host
- Resources:
  - Unprotected images and unprotected buffers, to which unprotected memory can be bound
  - Protected images and protected buffers, to which protected memory can be bound
- Command buffers:
  - Unprotected command buffers, which can be submitted to a device queue to execute unprotected queue operations
  - Protected command buffers, which can be submitted to a protected-capable device queue to execute protected queue operations
- Device queues:
  - Unprotected device queues, to which unprotected command buffers can be submitted
  - Protected-capable device queues, to which unprotected command buffers or protected command buffers can be submitted
- Queue submissions:
  - Unprotected queue submissions, through which unprotected command buffers can be submitted
  - Protected queue submissions, through which protected command buffers can be submitted
- Queue operations:
  - Unprotected queue operations, during which any read from or write to protected memory results in undefined behavior, but is subject to the inviolable rules below.
  - Protected queue operations, during which:
    - Any write to unprotected memory results in undefined behavior, but is subject to the inviolable rules below.
    - Except for framebuffer-space pipeline stages, the compute shader stage, and the transfer stage, any read from or write to protected memory results in undefined behavior, but is subject to the inviolable rules below.
    - Any query results in undefined behavior, but is subject to the inviolable rules below.
Protected memory inviolable rules
Implementations must ensure that neither correct nor incorrect usage by an application affects the integrity of the memory protection system.
The implementation must guarantee that:
- Protected device memory must not be visible to the host.
- Values written to unprotected device memory must not be a function of data from protected memory.
Incorrect usage by an application of the memory protection system results in undefined behavior which may include process termination or device loss.
10.2.4. External Memory Handle Types
Android Hardware Buffer
Android’s NDK defines AHardwareBuffer
objects, which represent device
memory that is shareable across processes and that can be accessed by a
variety of media APIs and the hardware used to implement them.
These Android hardware buffer objects may be imported into
VkDeviceMemory objects for access via Vulkan, or exported from Vulkan.
To remove an unnecessary compile-time dependency, an incomplete type
definition of AHardwareBuffer
is provided in the Vulkan headers:
struct AHardwareBuffer;
The actual AHardwareBuffer
type is defined in Android NDK headers.
Note
The NDK format, usage, and size/dimensions of an AHardwareBuffer object can be obtained with the AHardwareBuffer_describe NDK function.
Android hardware buffer objects are reference-counted using Android NDK
functions outside of the scope of this specification.
A VkDeviceMemory imported from an Android hardware buffer or that can
be exported to an Android hardware buffer must acquire a reference to its
AHardwareBuffer
object, and must release this reference when the
device memory is freed.
During the host execution of a Vulkan command that has an Android hardware
buffer as a parameter (including indirect parameters via pNext
chains), the application must not decrement the Android hardware buffer’s
reference count to zero.
Android hardware buffers can be mapped and unmapped for CPU access using the NDK functions. These lock and unlock APIs are considered to acquire and release ownership of the Android hardware buffer, and applications must follow the rules described in External Resource Sharing to transfer ownership between the Vulkan instance and these native APIs.
Android hardware buffers can be shared with external APIs and Vulkan instances on the same device, and also with foreign devices. When transferring ownership of the Android hardware buffer, the external and foreign special queue families described in Queue Family Ownership Transfer are not identical. All APIs which produce or consume Android hardware buffers are considered to use foreign devices, except OpenGL ES contexts and Vulkan logical devices that have matching device and driver UUIDs. Implementations may treat a transfer to or from the foreign queue family as if it were a transfer to or from the external queue family when the Android hardware buffer’s usage only permits it to be used on the same physical device.
Android Hardware Buffer Optimal Usages
Vulkan buffer and image usage flags do not correspond exactly to Android
hardware buffer usage flags.
When allocating Android hardware buffers with non-Vulkan APIs, if any AHARDWAREBUFFER_USAGE_GPU_* usage bits are included, by default the allocator must allocate the memory in such a way that it supports Vulkan usages and creation flags in the usage equivalence table which do not have Android hardware buffer equivalents.
The VkAndroidHardwareBufferUsageANDROID structure can be attached to
the pNext
chain of a VkImageFormatProperties2 instance passed to
vkGetPhysicalDeviceImageFormatProperties2 to obtain optimal Android
hardware buffer usage flags for specific Vulkan resource creation
parameters.
Some usage flags returned by these commands are required based on the input parameters, but additional vendor-specific usage flags (AHARDWAREBUFFER_USAGE_VENDOR_*) may also be returned.
Any Android hardware buffer allocated with these vendor-specific usage flags
and imported to Vulkan must only be bound to resources created with
parameters that are a subset of the parameters used to obtain the Android
hardware buffer usage, since the memory may have been allocated in a way
incompatible with other parameters.
If an Android hardware buffer is successfully allocated with additional
non-vendor-specific usage flags in addition to the recommended usage, it
must support being used in the same ways as an Android hardware buffer
allocated with only the recommended usage, and also in ways indicated by the
additional usage.
Android Hardware Buffer External Formats
Android hardware buffers may represent images using implementation-specific formats, layouts, color models, etc., which do not have Vulkan equivalents. Such external formats are commonly used by external image sources such as video decoders or cameras. Vulkan can import Android hardware buffers that have external formats, but since the image contents are in an undiscoverable and possibly proprietary representation, images with external formats must only be used as sampled images, must only be sampled with a sampler that has Y’CBCR conversion enabled, and must have optimal tiling.
Images that will be backed by an Android hardware buffer can use an
external format by setting VkImageCreateInfo::format
to
VK_FORMAT_UNDEFINED
and including an instance of
VkExternalFormatANDROID in the pNext
chain.
Images can be created with an external format even if the Android hardware
buffer has a format which has an
equivalent Vulkan format
to enable consistent handling of images from sources that might use either
category of format.
However, all images created with an external format are subject to the valid
usage requirements associated with external formats, even if the Android
hardware buffer’s format has a Vulkan equivalent.
The external format of an Android hardware buffer can be obtained by
passing an instance of VkAndroidHardwareBufferFormatPropertiesANDROID
to vkGetAndroidHardwareBufferPropertiesANDROID.
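As an informative sketch, the properties of an Android hardware buffer, including any external format, might be queried before import as follows; the AHardwareBuffer pointer is assumed to have been obtained from a producer such as a camera or media codec, and the device is assumed to enable VK_ANDROID_external_memory_android_hardware_buffer.
// Informative sketch: query memory and format properties of an AHardwareBuffer.
VkAndroidHardwareBufferFormatPropertiesANDROID formatProps = {
    .sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_FORMAT_PROPERTIES_ANDROID,
};
VkAndroidHardwareBufferPropertiesANDROID props = {
    .sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_PROPERTIES_ANDROID,
    .pNext = &formatProps,
};
vkGetAndroidHardwareBufferPropertiesANDROID(device, hardwareBuffer, &props);
// props.allocationSize and props.memoryTypeBits feed the VkMemoryAllocateInfo;
// formatProps.externalFormat is non-zero when the buffer has an external format.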
Android Hardware Buffer Image Resources
Android hardware buffers have intrinsic width, height, format, and usage
properties, so Vulkan images bound to memory imported from an Android
hardware buffer must use dedicated allocations:
VkMemoryDedicatedRequirements::requiresDedicatedAllocation must be VK_TRUE for images created with VkExternalMemoryImageCreateInfo::handleTypes that includes VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID.
When creating an image that will be bound to an imported Android hardware
buffer, the image creation parameters must be equivalent to the
AHardwareBuffer
properties as described by the valid usage of
VkMemoryAllocateInfo.
Similarly, device memory allocated for a dedicated image must not be
exported to an Android hardware buffer until it has been bound to that
image, and the implementation must return an Android hardware buffer with
properties derived from the image:
- The width and height members of AHardwareBuffer_Desc must be the same as the width and height members of VkImageCreateInfo::extent, respectively.
- The layers member of AHardwareBuffer_Desc must be the same as the arrayLayers member of VkImageCreateInfo.
- The format member of AHardwareBuffer_Desc must be equivalent to VkImageCreateInfo::format as defined by AHardwareBuffer Format Equivalence.
- The usage member of AHardwareBuffer_Desc must include bits corresponding to bits included in VkImageCreateInfo::usage and VkImageCreateInfo::flags where such a correspondence exists according to AHardwareBuffer Usage Equivalence. It may also include additional usage bits, including vendor-specific usages. Presence of vendor usage bits may make the Android hardware buffer only usable in ways indicated by the image creation parameters, even when used outside Vulkan, in a similar way that allocating the Android hardware buffer with usage returned in VkAndroidHardwareBufferUsageANDROID does.
Implementations may support fewer combinations of image creation parameters
for images with Android hardware buffer external handle type than for
non-external images.
Support for a given set of parameters can be determined by passing
VkExternalImageFormatProperties to
vkGetPhysicalDeviceImageFormatProperties2 with handleType
set to
VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID
.
Any Android hardware buffer successfully allocated outside Vulkan with usage that includes AHARDWAREBUFFER_USAGE_GPU_* must be supported when using equivalent Vulkan image parameters.
If a given choice of image parameters are supported for import, they can
also be used to create an image and memory that will be exported to an
Android hardware buffer.
Table (AHardwareBuffer Format Equivalence): AHardwareBuffer Format | Vulkan Format
Table (AHardwareBuffer Usage Equivalence): AHardwareBuffer Usage | Vulkan Usage or Creation Flag
- 2: The AHARDWAREBUFFER_USAGE_GPU_MIPMAP_COMPLETE flag does not correspond to a Vulkan image usage or creation flag. Instead, its presence indicates that the Android hardware buffer contains a complete mipmap chain, and its absence indicates that the Android hardware buffer contains only a single mip level.
Android Hardware Buffer Buffer Resources
Android hardware buffers with a format of AHARDWAREBUFFER_FORMAT_BLOB
and usage that includes AHARDWAREBUFFER_USAGE_GPU_DATA_BUFFER
can be
used as the backing store for VkBuffer objects.
Such Android hardware buffers have a size in bytes specified by their width; height and layers are both 1.
Unlike images, buffer resources backed by Android hardware buffers do not require dedicated allocations.
Exported AHardwareBuffer objects that do not have dedicated images must have a format of AHARDWAREBUFFER_FORMAT_BLOB, usage must include AHARDWAREBUFFER_USAGE_GPU_DATA_BUFFER, width must equal the device memory allocation size, and height and layers must be 1.
10.2.5. Peer Memory Features
Peer memory is memory that is allocated for a given physical device and then bound to a resource and accessed by a different physical device, in a logical device that represents multiple physical devices. Some ways of reading and writing peer memory may not be supported by a device.
To determine how peer memory can be accessed, call:
void vkGetDeviceGroupPeerMemoryFeatures(
VkDevice device,
uint32_t heapIndex,
uint32_t localDeviceIndex,
uint32_t remoteDeviceIndex,
VkPeerMemoryFeatureFlags* pPeerMemoryFeatures);
or the equivalent command
void vkGetDeviceGroupPeerMemoryFeaturesKHR(
VkDevice device,
uint32_t heapIndex,
uint32_t localDeviceIndex,
uint32_t remoteDeviceIndex,
VkPeerMemoryFeatureFlags* pPeerMemoryFeatures);
- device is the logical device that owns the memory.
- heapIndex is the index of the memory heap from which the memory is allocated.
- localDeviceIndex is the device index of the physical device that performs the memory access.
- remoteDeviceIndex is the device index of the physical device that the memory is allocated for.
- pPeerMemoryFeatures is a pointer to a bitmask of VkPeerMemoryFeatureFlagBits indicating which types of memory accesses are supported for the combination of heap, local, and remote devices.
Bits which may be set in the value returned for
vkGetDeviceGroupPeerMemoryFeatures::pPeerMemoryFeatures
,
indicating the supported peer memory features, are:
typedef enum VkPeerMemoryFeatureFlagBits {
VK_PEER_MEMORY_FEATURE_COPY_SRC_BIT = 0x00000001,
VK_PEER_MEMORY_FEATURE_COPY_DST_BIT = 0x00000002,
VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT = 0x00000004,
VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT = 0x00000008,
VK_PEER_MEMORY_FEATURE_COPY_SRC_BIT_KHR = VK_PEER_MEMORY_FEATURE_COPY_SRC_BIT,
VK_PEER_MEMORY_FEATURE_COPY_DST_BIT_KHR = VK_PEER_MEMORY_FEATURE_COPY_DST_BIT,
VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT_KHR = VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT,
VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT_KHR = VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT,
} VkPeerMemoryFeatureFlagBits;
or the equivalent
typedef VkPeerMemoryFeatureFlagBits VkPeerMemoryFeatureFlagBitsKHR;
- VK_PEER_MEMORY_FEATURE_COPY_SRC_BIT specifies that the memory can be accessed as the source of a vkCmdCopyBuffer, vkCmdCopyImage, vkCmdCopyBufferToImage, or vkCmdCopyImageToBuffer command.
- VK_PEER_MEMORY_FEATURE_COPY_DST_BIT specifies that the memory can be accessed as the destination of a vkCmdCopyBuffer, vkCmdCopyImage, vkCmdCopyBufferToImage, or vkCmdCopyImageToBuffer command.
- VK_PEER_MEMORY_FEATURE_GENERIC_SRC_BIT specifies that the memory can be read as any memory access type.
- VK_PEER_MEMORY_FEATURE_GENERIC_DST_BIT specifies that the memory can be written as any memory access type. Shader atomics are considered to be writes.
Note
The peer memory features of a memory heap also apply to any accesses that may be performed during image layout transitions.
VK_PEER_MEMORY_FEATURE_COPY_DST_BIT
must be supported for all host
local heaps and for at least one device local heap.
If a device does not support a peer memory feature, it is still valid to use a resource that includes both local and peer memory bindings with the corresponding access type as long as only the local bindings are actually accessed. For example, an application doing split-frame rendering would use framebuffer attachments that include both local and peer memory bindings, but would scissor the rendering to only update local memory.
typedef VkFlags VkPeerMemoryFeatureFlags;
or the equivalent
typedef VkPeerMemoryFeatureFlags VkPeerMemoryFeatureFlagsKHR;
VkPeerMemoryFeatureFlags
is a bitmask type for setting a mask of zero
or more VkPeerMemoryFeatureFlagBits.
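For illustration, a sketch of checking whether one physical device in a device group can copy into memory allocated for another follows; the heap index is assumed to have been chosen from the device's memory properties.
// Informative sketch: can device index 1 copy into memory allocated for device 0?
VkPeerMemoryFeatureFlags features = 0;
vkGetDeviceGroupPeerMemoryFeatures(device, heapIndex,
                                   1 /* localDeviceIndex  */,
                                   0 /* remoteDeviceIndex */,
                                   &features);
if (features & VK_PEER_MEMORY_FEATURE_COPY_DST_BIT) {
    // vkCmdCopyBuffer/vkCmdCopyImage executed on device 1 may write this peer memory.
}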
11. Resource Creation
Vulkan supports two primary resource types: buffers and images. Resources are views of memory with associated formatting and dimensionality. Buffers are essentially unformatted arrays of bytes whereas images contain format information, can be multidimensional and may have associated metadata.
11.1. Buffers
Buffers represent linear arrays of data which are used for various purposes by binding them to a graphics or compute pipeline via descriptor sets or via certain commands, or by directly specifying them as parameters to certain commands.
Buffers are represented by VkBuffer
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkBuffer)
To create buffers, call:
VkResult vkCreateBuffer(
VkDevice device,
const VkBufferCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkBuffer* pBuffer);
- device is the logical device that creates the buffer object.
- pCreateInfo is a pointer to an instance of the VkBufferCreateInfo structure containing parameters affecting creation of the buffer.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pBuffer points to a VkBuffer handle in which the resulting buffer object is returned.
The VkBufferCreateInfo
structure is defined as:
typedef struct VkBufferCreateInfo {
VkStructureType sType;
const void* pNext;
VkBufferCreateFlags flags;
VkDeviceSize size;
VkBufferUsageFlags usage;
VkSharingMode sharingMode;
uint32_t queueFamilyIndexCount;
const uint32_t* pQueueFamilyIndices;
} VkBufferCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkBufferCreateFlagBits specifying additional parameters of the buffer.
- size is the size in bytes of the buffer to be created.
- usage is a bitmask of VkBufferUsageFlagBits specifying allowed usages of the buffer.
- sharingMode is a VkSharingMode value specifying the sharing mode of the buffer when it will be accessed by multiple queue families.
- queueFamilyIndexCount is the number of entries in the pQueueFamilyIndices array.
- pQueueFamilyIndices is a list of queue families that will access this buffer (ignored if sharingMode is not VK_SHARING_MODE_CONCURRENT).
editing-note
(Jon) Should the constraint on usage != 0 be converted to a Valid Usage statement? See gitlab #854.
Bits which can be set in VkBufferCreateInfo::usage
, specifying
usage behavior of a buffer, are:
typedef enum VkBufferUsageFlagBits {
VK_BUFFER_USAGE_TRANSFER_SRC_BIT = 0x00000001,
VK_BUFFER_USAGE_TRANSFER_DST_BIT = 0x00000002,
VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT = 0x00000004,
VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT = 0x00000008,
VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT = 0x00000010,
VK_BUFFER_USAGE_STORAGE_BUFFER_BIT = 0x00000020,
VK_BUFFER_USAGE_INDEX_BUFFER_BIT = 0x00000040,
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT = 0x00000080,
VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT = 0x00000100,
VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_BUFFER_BIT_EXT = 0x00000800,
VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_COUNTER_BUFFER_BIT_EXT = 0x00001000,
VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT = 0x00000200,
VK_BUFFER_USAGE_RAY_TRACING_BIT_NV = 0x00000400,
} VkBufferUsageFlagBits;
- VK_BUFFER_USAGE_TRANSFER_SRC_BIT specifies that the buffer can be used as the source of a transfer command (see the definition of VK_PIPELINE_STAGE_TRANSFER_BIT).
- VK_BUFFER_USAGE_TRANSFER_DST_BIT specifies that the buffer can be used as the destination of a transfer command.
- VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT specifies that the buffer can be used to create a VkBufferView suitable for occupying a VkDescriptorSet slot of type VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER.
- VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT specifies that the buffer can be used to create a VkBufferView suitable for occupying a VkDescriptorSet slot of type VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER.
- VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT specifies that the buffer can be used in a VkDescriptorBufferInfo suitable for occupying a VkDescriptorSet slot either of type VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC.
- VK_BUFFER_USAGE_STORAGE_BUFFER_BIT specifies that the buffer can be used in a VkDescriptorBufferInfo suitable for occupying a VkDescriptorSet slot either of type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC.
- VK_BUFFER_USAGE_INDEX_BUFFER_BIT specifies that the buffer is suitable for passing as the buffer parameter to vkCmdBindIndexBuffer.
- VK_BUFFER_USAGE_VERTEX_BUFFER_BIT specifies that the buffer is suitable for passing as an element of the pBuffers array to vkCmdBindVertexBuffers.
- VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT specifies that the buffer is suitable for passing as the buffer parameter to vkCmdDrawIndirect, vkCmdDrawIndexedIndirect, vkCmdDrawMeshTasksIndirectNV, vkCmdDrawMeshTasksIndirectCountNV, or vkCmdDispatchIndirect. It is also suitable for passing as the buffer member of VkIndirectCommandsTokenNVX, or the sequencesCountBuffer or sequencesIndexBuffer member of VkCmdProcessCommandsInfoNVX.
- VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT specifies that the buffer is suitable for passing as the buffer parameter to vkCmdBeginConditionalRenderingEXT.
- VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_BUFFER_BIT_EXT specifies that the buffer is suitable for binding as a transform feedback buffer with vkCmdBindTransformFeedbackBuffersEXT.
- VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_COUNTER_BUFFER_BIT_EXT specifies that the buffer is suitable for use as a counter buffer with vkCmdBeginTransformFeedbackEXT and vkCmdEndTransformFeedbackEXT.
- VK_BUFFER_USAGE_RAY_TRACING_BIT_NV specifies that the buffer is suitable for use in vkCmdTraceRaysNV and vkCmdBuildAccelerationStructureNV.
typedef VkFlags VkBufferUsageFlags;
VkBufferUsageFlags
is a bitmask type for setting a mask of zero or
more VkBufferUsageFlagBits.
Bits which can be set in VkBufferCreateInfo::flags
, specifying
additional parameters of a buffer, are:
typedef enum VkBufferCreateFlagBits {
VK_BUFFER_CREATE_SPARSE_BINDING_BIT = 0x00000001,
VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT = 0x00000002,
VK_BUFFER_CREATE_SPARSE_ALIASED_BIT = 0x00000004,
VK_BUFFER_CREATE_PROTECTED_BIT = 0x00000008,
} VkBufferCreateFlagBits;
- VK_BUFFER_CREATE_SPARSE_BINDING_BIT specifies that the buffer will be backed using sparse memory binding.
- VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT specifies that the buffer can be partially backed using sparse memory binding. Buffers created with this flag must also be created with the VK_BUFFER_CREATE_SPARSE_BINDING_BIT flag.
- VK_BUFFER_CREATE_SPARSE_ALIASED_BIT specifies that the buffer will be backed using sparse memory binding with memory ranges that might also simultaneously be backing another buffer (or another portion of the same buffer). Buffers created with this flag must also be created with the VK_BUFFER_CREATE_SPARSE_BINDING_BIT flag.
- VK_BUFFER_CREATE_PROTECTED_BIT specifies that the buffer is a protected buffer.
See Sparse Resource Features and Physical Device Features for details of the sparse memory features supported on a device.
typedef VkFlags VkBufferCreateFlags;
VkBufferCreateFlags
is a bitmask type for setting a mask of zero or
more VkBufferCreateFlagBits.
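As an informative sketch, a buffer usable as a vertex buffer and as a transfer destination, accessed by a single queue family, could be created as follows; the size is illustrative.
// Informative sketch: create an exclusive-mode vertex/transfer-destination buffer.
VkBufferCreateInfo bufferInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .size  = 65536,
    .usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,   // pQueueFamilyIndices is ignored
};
VkBuffer buffer;
VkResult result = vkCreateBuffer(device, &bufferInfo, NULL, &buffer);
// The buffer still needs backing memory; see vkGetBufferMemoryRequirements and
// vkBindBufferMemory in the Resource Memory Association section.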
If the pNext
chain includes a
VkDedicatedAllocationBufferCreateInfoNV
structure, then that structure
includes an enable controlling whether the buffer will have a dedicated
memory allocation bound to it.
The VkDedicatedAllocationBufferCreateInfoNV
structure is defined as:
typedef struct VkDedicatedAllocationBufferCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkBool32 dedicatedAllocation;
} VkDedicatedAllocationBufferCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- dedicatedAllocation specifies whether the buffer will have a dedicated allocation bound to it.
To define a set of external memory handle types that may be used as backing
store for a buffer, add a VkExternalMemoryBufferCreateInfo structure
to the pNext
chain of the VkBufferCreateInfo structure.
The VkExternalMemoryBufferCreateInfo
structure is defined as:
typedef struct VkExternalMemoryBufferCreateInfo {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlags handleTypes;
} VkExternalMemoryBufferCreateInfo;
or the equivalent
typedef VkExternalMemoryBufferCreateInfo VkExternalMemoryBufferCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- handleTypes is a bitmask of VkExternalMemoryHandleTypeFlagBits specifying one or more external memory handle types.
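For illustration, a buffer whose memory may later be imported from or exported to an opaque POSIX file descriptor might be created as sketched below; usage and size are illustrative.
// Informative sketch: declare external memory handle types at buffer creation.
VkExternalMemoryBufferCreateInfo externalInfo = {
    .sType = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_BUFFER_CREATE_INFO,
    .handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
};
VkBufferCreateInfo bufferInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = &externalInfo,
    .size  = 65536,
    .usage = VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
};
VkBuffer buffer;
vkCreateBuffer(device, &bufferInfo, NULL, &buffer);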
To destroy a buffer, call:
void vkDestroyBuffer(
VkDevice device,
VkBuffer buffer,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the buffer.
- buffer is the buffer to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
11.2. Buffer Views
A buffer view represents a contiguous range of a buffer and a specific format to be used to interpret the data. Buffer views are used to enable shaders to access buffer contents interpreted as formatted data. In order to create a valid buffer view, the buffer must have been created with at least one of the following usage flags:
- VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT
- VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT
Buffer views are represented by VkBufferView
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkBufferView)
To create a buffer view, call:
VkResult vkCreateBufferView(
VkDevice device,
const VkBufferViewCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkBufferView* pView);
- device is the logical device that creates the buffer view.
- pCreateInfo is a pointer to an instance of the VkBufferViewCreateInfo structure containing parameters to be used to create the buffer view.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pView points to a VkBufferView handle in which the resulting buffer view object is returned.
The VkBufferViewCreateInfo
structure is defined as:
typedef struct VkBufferViewCreateInfo {
VkStructureType sType;
const void* pNext;
VkBufferViewCreateFlags flags;
VkBuffer buffer;
VkFormat format;
VkDeviceSize offset;
VkDeviceSize range;
} VkBufferViewCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- buffer is a VkBuffer on which the view will be created.
- format is a VkFormat describing the format of the data elements in the buffer.
- offset is an offset in bytes from the base address of the buffer. Accesses to the buffer view from shaders use addressing that is relative to this starting offset.
- range is a size in bytes of the buffer view. If range is equal to VK_WHOLE_SIZE, the range from offset to the end of the buffer is used. If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a multiple of the texel block size of format, the nearest smaller multiple is used.
typedef VkFlags VkBufferViewCreateFlags;
VkBufferViewCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
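As an informative sketch, a buffer view interpreting a buffer's contents as four-component float texels might be created as follows; the buffer is assumed to have been created with VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT.
// Informative sketch: view an entire uniform texel buffer as RGBA32F texels.
VkBufferViewCreateInfo viewInfo = {
    .sType  = VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO,
    .buffer = buffer,
    .format = VK_FORMAT_R32G32B32A32_SFLOAT,
    .offset = 0,
    .range  = VK_WHOLE_SIZE,
};
VkBufferView view;
vkCreateBufferView(device, &viewInfo, NULL, &view);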
To destroy a buffer view, call:
void vkDestroyBufferView(
VkDevice device,
VkBufferView bufferView,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the buffer view.
- bufferView is the buffer view to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
11.3. Images
Images represent multidimensional - up to 3 - arrays of data which can be used for various purposes (e.g. attachments, textures), by binding them to a graphics or compute pipeline via descriptor sets, or by directly specifying them as parameters to certain commands.
Images are represented by VkImage
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkImage)
To create images, call:
VkResult vkCreateImage(
VkDevice device,
const VkImageCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkImage* pImage);
- device is the logical device that creates the image.
- pCreateInfo is a pointer to an instance of the VkImageCreateInfo structure containing parameters to be used to create the image.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pImage points to a VkImage handle in which the resulting image object is returned.
The VkImageCreateInfo
structure is defined as:
typedef struct VkImageCreateInfo {
VkStructureType sType;
const void* pNext;
VkImageCreateFlags flags;
VkImageType imageType;
VkFormat format;
VkExtent3D extent;
uint32_t mipLevels;
uint32_t arrayLayers;
VkSampleCountFlagBits samples;
VkImageTiling tiling;
VkImageUsageFlags usage;
VkSharingMode sharingMode;
uint32_t queueFamilyIndexCount;
const uint32_t* pQueueFamilyIndices;
VkImageLayout initialLayout;
} VkImageCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkImageCreateFlagBits describing additional parameters of the image.
- imageType is a VkImageType value specifying the basic dimensionality of the image. Layers in array textures do not count as a dimension for the purposes of the image type.
- format is a VkFormat describing the format and type of the texel blocks that will be contained in the image.
- extent is a VkExtent3D describing the number of data elements in each dimension of the base level.
- mipLevels describes the number of levels of detail available for minified sampling of the image.
- arrayLayers is the number of layers in the image.
- samples is a VkSampleCountFlagBits specifying the number of samples per texel.
- tiling is a VkImageTiling value specifying the tiling arrangement of the texel blocks in memory.
- usage is a bitmask of VkImageUsageFlagBits describing the intended usage of the image.
- sharingMode is a VkSharingMode value specifying the sharing mode of the image when it will be accessed by multiple queue families.
- queueFamilyIndexCount is the number of entries in the pQueueFamilyIndices array.
- pQueueFamilyIndices is a list of queue families that will access this image (ignored if sharingMode is not VK_SHARING_MODE_CONCURRENT).
- initialLayout is a VkImageLayout value specifying the initial VkImageLayout of all image subresources of the image. See Image Layouts.
Images created with tiling
equal to VK_IMAGE_TILING_LINEAR
have
further restrictions on their limits and capabilities compared to images
created with tiling
equal to VK_IMAGE_TILING_OPTIMAL
.
Creation of images with tiling VK_IMAGE_TILING_LINEAR
may not be
supported unless other parameters meet all of the constraints:
- imageType is VK_IMAGE_TYPE_2D
- format is not a depth/stencil format
- mipLevels is 1
- arrayLayers is 1
- samples is VK_SAMPLE_COUNT_1_BIT
- usage only includes VK_IMAGE_USAGE_TRANSFER_SRC_BIT and/or VK_IMAGE_USAGE_TRANSFER_DST_BIT
Implementations may support additional limits and capabilities beyond those listed above.
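As an informative sketch, a linear image that satisfies the constraints listed above (a single-level, single-layer, single-sampled 2D transfer-only image) could be created as follows; the format and extent are illustrative.
// Informative sketch: a linear 2D image usable only as a transfer source.
VkImageCreateInfo imageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .imageType = VK_IMAGE_TYPE_2D,
    .format = VK_FORMAT_R8G8B8A8_UNORM,
    .extent = { 256, 256, 1 },
    .mipLevels = 1,
    .arrayLayers = 1,
    .samples = VK_SAMPLE_COUNT_1_BIT,
    .tiling = VK_IMAGE_TILING_LINEAR,
    .usage = VK_IMAGE_USAGE_TRANSFER_SRC_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED,
};
VkImage image;
VkResult result = vkCreateImage(device, &imageInfo, NULL, &image);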
To determine the set of valid usage
bits for a given format, call
vkGetPhysicalDeviceFormatProperties.
If the size of the resultant image would exceed maxResourceSize
, then
vkCreateImage
must fail and return
VK_ERROR_OUT_OF_DEVICE_MEMORY
.
This failure may occur even when all image creation parameters satisfy
their valid usage requirements.
Note
For images created without VK_IMAGE_CREATE_EXTENDED_USAGE_BIT, a usage bit is valid if it is supported for the format the image is created with. For images created with VK_IMAGE_CREATE_EXTENDED_USAGE_BIT, a usage bit is valid if it is supported for at least one of the formats a VkImageView created from the image can have.
If the pNext
chain of VkImageCreateInfo includes a
VkImageStencilUsageCreateInfoEXT
structure, then that structure
includes the usage flags specific to the stencil aspect of the image for an
image with a depth-stencil format.
The VkImageStencilUsageCreateInfoEXT
structure is defined as:
typedef struct VkImageStencilUsageCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkImageUsageFlags stencilUsage;
} VkImageStencilUsageCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- stencilUsage is a bitmask of VkImageUsageFlagBits describing the intended usage of the stencil aspect of the image.
When this structure is not present in the pNext chain of VkImageCreateInfo, the implicit value of stencilUsage matches that of VkImageCreateInfo::usage.
When this structure is present, VkImageCreateInfo::usage specifies the intended usage of the depth aspect of the image, and VkImageStencilUsageCreateInfoEXT::stencilUsage specifies the intended usage of the stencil aspect of the image.
However, for the purposes of determining image-specific valid usage conditions, the image itself is considered to be created with a particular VkImageUsageFlagBits value if either VkImageCreateInfo::usage or VkImageStencilUsageCreateInfoEXT::stencilUsage includes that bit value.
This structure can also be included in the pNext
chain of
VkPhysicalDeviceImageFormatInfo2 to query additional capabilities
specific to image creation parameter combinations including a separate set
of usage flags for the stencil aspect of the image using
vkGetPhysicalDeviceImageFormatProperties2.
When this structure is not present in the pNext
chain of
VkPhysicalDeviceImageFormatInfo2
then the implicit value of
stencilUsage
matches that of
VkPhysicalDeviceImageFormatInfo2
::usage
.
If the pNext
chain includes a
VkDedicatedAllocationImageCreateInfoNV
structure, then that structure
includes an enable controlling whether the image will have a dedicated
memory allocation bound to it.
The VkDedicatedAllocationImageCreateInfoNV
structure is defined as:
typedef struct VkDedicatedAllocationImageCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkBool32 dedicatedAllocation;
} VkDedicatedAllocationImageCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- dedicatedAllocation specifies whether the image will have a dedicated allocation bound to it.
Note
Using a dedicated allocation for color and depth/stencil attachments or other large images may improve performance on some devices.
To define a set of external memory handle types that may be used as backing
store for an image, add a VkExternalMemoryImageCreateInfo structure to
the pNext
chain of the VkImageCreateInfo structure.
The VkExternalMemoryImageCreateInfo
structure is defined as:
typedef struct VkExternalMemoryImageCreateInfo {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlags handleTypes;
} VkExternalMemoryImageCreateInfo;
or the equivalent
typedef VkExternalMemoryImageCreateInfo VkExternalMemoryImageCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- handleTypes is a bitmask of VkExternalMemoryHandleTypeFlagBits specifying one or more external memory handle types.
If the pNext
chain includes a VkExternalMemoryImageCreateInfoNV
structure, then that structure defines a set of external memory handle types
that may be used as backing store for the image.
The VkExternalMemoryImageCreateInfoNV
structure is defined as:
typedef struct VkExternalMemoryImageCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagsNV handleTypes;
} VkExternalMemoryImageCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- handleTypes is a bitmask of VkExternalMemoryHandleTypeFlagBitsNV specifying one or more external memory handle types.
To create an image with an
external
format, include an instance of VkExternalFormatANDROID
in the
pNext
chain of VkImageCreateInfo.
VkExternalFormatANDROID
is defined as:
typedef struct VkExternalFormatANDROID {
VkStructureType sType;
void* pNext;
uint64_t externalFormat;
} VkExternalFormatANDROID;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- externalFormat is an implementation-defined identifier for the external format.
If externalFormat is zero, the effect is as if the VkExternalFormatANDROID structure was not present. Otherwise, the image will have the specified external format.
If the pNext
chain of VkImageCreateInfo includes a
VkImageSwapchainCreateInfoKHR
structure, then that structure includes
a swapchain handle indicating that the image will be bound to memory from
that swapchain.
The VkImageSwapchainCreateInfoKHR
structure is defined as:
typedef struct VkImageSwapchainCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkSwapchainKHR swapchain;
} VkImageSwapchainCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- swapchain is VK_NULL_HANDLE or a handle of a swapchain that the image will be bound to.
If the pNext
list of VkImageCreateInfo includes a
VkImageFormatListCreateInfoKHR
structure, then that structure contains
a list of all formats that can be used when creating views of this image.
The VkImageFormatListCreateInfoKHR
structure is defined as:
typedef struct VkImageFormatListCreateInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t viewFormatCount;
const VkFormat* pViewFormats;
} VkImageFormatListCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- viewFormatCount is the number of entries in the pViewFormats array.
- pViewFormats is an array of all formats which can be used when creating views of this image.
If viewFormatCount
is zero, pViewFormats
is ignored and the
image is created as if the VkImageFormatListCreateInfoKHR
structure
were not included in the pNext
list of VkImageCreateInfo.
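For illustration, a mutable-format image whose views may use either the UNORM or the SRGB interpretation of its texels might declare its view formats as sketched below; the extent and usage are illustrative, and the device is assumed to support VK_KHR_image_format_list.
// Informative sketch: restrict a mutable-format image to two view formats.
VkFormat viewFormats[2] = { VK_FORMAT_R8G8B8A8_UNORM, VK_FORMAT_R8G8B8A8_SRGB };
VkImageFormatListCreateInfoKHR formatList = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO_KHR,
    .viewFormatCount = 2,
    .pViewFormats = viewFormats,
};
VkImageCreateInfo imageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .pNext = &formatList,
    .flags = VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT,
    .imageType = VK_IMAGE_TYPE_2D,
    .format = VK_FORMAT_R8G8B8A8_UNORM,
    .extent = { 1024, 1024, 1 },
    .mipLevels = 1,
    .arrayLayers = 1,
    .samples = VK_SAMPLE_COUNT_1_BIT,
    .tiling = VK_IMAGE_TILING_OPTIMAL,
    .usage = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
};
VkImage image;
vkCreateImage(device, &imageInfo, NULL, &image);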
If the pNext
chain of VkImageCreateInfo contains
VkImageDrmFormatModifierListCreateInfoEXT, then the image will be
created with one of the Linux DRM format
modifiers listed in the structure.
The choice of modifier is implementation-dependent.
The VkImageDrmFormatModifierListCreateInfoEXT structure is defined as:
typedef struct VkImageDrmFormatModifierListCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t drmFormatModifierCount;
const uint64_t* pDrmFormatModifiers;
} VkImageDrmFormatModifierListCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- drmFormatModifierCount is the length of the pDrmFormatModifiers array.
- pDrmFormatModifiers is an array of Linux DRM format modifiers.
If the pNext
chain of VkImageCreateInfo contains
VkImageDrmFormatModifierExplicitCreateInfoEXT, then the image will be
created with the Linux DRM format modifier
and memory layout defined by the structure.
The VkImageDrmFormatModifierExplicitCreateInfoEXT structure is defined as:
typedef struct VkImageDrmFormatModifierExplicitCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint64_t drmFormatModifier;
uint32_t drmFormatModifierPlaneCount;
const VkSubresourceLayout* pPlaneLayouts;
} VkImageDrmFormatModifierExplicitCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- drmFormatModifier is the Linux DRM format modifier with which the image will be created.
- drmFormatModifierPlaneCount is the number of memory planes in the image (as reported by VkDrmFormatModifierPropertiesEXT) as well as the length of the pPlaneLayouts array.
- pPlaneLayouts is an array of VkSubresourceLayout structures that describe the image's memory planes.
The ith member of pPlaneLayouts describes the layout of the image's ith memory plane (that is, VK_IMAGE_ASPECT_MEMORY_PLANE_i_BIT_EXT).
In each element of pPlaneLayouts, the implementation must ignore size.
The implementation calculates the size of each plane, which the application
can query with vkGetImageSubresourceLayout.
When creating an image with
VkImageDrmFormatModifierExplicitCreateInfoEXT, it is the application’s
responsibility to satisfy all Valid Usage requirements.
However, the implementation must validate that the provided
pPlaneLayouts
, when combined with the provided drmFormatModifier
and other creation parameters in VkImageCreateInfo and its pNext
chain, produce a valid image.
(This validation is necessarily implementation-dependent and outside the
scope of Vulkan, and therefore not described by Valid Usage requirements).
If this validation fails, then vkCreateImage returns
VK_ERROR_INVALID_DRM_FORMAT_MODIFIER_PLANE_LAYOUT_EXT
.
Bits which can be set in VkImageCreateInfo::usage
, specifying
intended usage of an image, are:
typedef enum VkImageUsageFlagBits {
VK_IMAGE_USAGE_TRANSFER_SRC_BIT = 0x00000001,
VK_IMAGE_USAGE_TRANSFER_DST_BIT = 0x00000002,
VK_IMAGE_USAGE_SAMPLED_BIT = 0x00000004,
VK_IMAGE_USAGE_STORAGE_BIT = 0x00000008,
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT = 0x00000010,
VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT = 0x00000020,
VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT = 0x00000040,
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT = 0x00000080,
VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV = 0x00000100,
VK_IMAGE_USAGE_FRAGMENT_DENSITY_MAP_BIT_EXT = 0x00000200,
} VkImageUsageFlagBits;
- VK_IMAGE_USAGE_TRANSFER_SRC_BIT specifies that the image can be used as the source of a transfer command.
- VK_IMAGE_USAGE_TRANSFER_DST_BIT specifies that the image can be used as the destination of a transfer command.
- VK_IMAGE_USAGE_SAMPLED_BIT specifies that the image can be used to create a VkImageView suitable for occupying a VkDescriptorSet slot either of type VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE or VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, and be sampled by a shader.
- VK_IMAGE_USAGE_STORAGE_BIT specifies that the image can be used to create a VkImageView suitable for occupying a VkDescriptorSet slot of type VK_DESCRIPTOR_TYPE_STORAGE_IMAGE.
- VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT specifies that the image can be used to create a VkImageView suitable for use as a color or resolve attachment in a VkFramebuffer.
- VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT specifies that the image can be used to create a VkImageView suitable for use as a depth/stencil attachment in a VkFramebuffer.
- VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT specifies that the memory bound to this image will have been allocated with the VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT (see Memory Allocation for more detail). This bit can be set for any image that can be used to create a VkImageView suitable for use as a color, resolve, depth/stencil, or input attachment.
- VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT specifies that the image can be used to create a VkImageView suitable for occupying a VkDescriptorSet slot of type VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT; be read from a shader as an input attachment; and be used as an input attachment in a framebuffer.
- VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV specifies that the image can be used to create a VkImageView suitable for use as a shading rate image.
typedef VkFlags VkImageUsageFlags;
VkImageUsageFlags
is a bitmask type for setting a mask of zero or more
VkImageUsageFlagBits.
Bits which can be set in VkImageCreateInfo::flags
, specifying
additional parameters of an image, are:
typedef enum VkImageCreateFlagBits {
VK_IMAGE_CREATE_SPARSE_BINDING_BIT = 0x00000001,
VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT = 0x00000002,
VK_IMAGE_CREATE_SPARSE_ALIASED_BIT = 0x00000004,
VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT = 0x00000008,
VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT = 0x00000010,
VK_IMAGE_CREATE_ALIAS_BIT = 0x00000400,
VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT = 0x00000040,
VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT = 0x00000020,
VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT = 0x00000080,
VK_IMAGE_CREATE_EXTENDED_USAGE_BIT = 0x00000100,
VK_IMAGE_CREATE_PROTECTED_BIT = 0x00000800,
VK_IMAGE_CREATE_DISJOINT_BIT = 0x00000200,
VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV = 0x00002000,
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT = 0x00001000,
VK_IMAGE_CREATE_SUBSAMPLED_BIT_EXT = 0x00004000,
VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT_KHR = VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT,
VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT_KHR = VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT,
VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT_KHR = VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT,
VK_IMAGE_CREATE_EXTENDED_USAGE_BIT_KHR = VK_IMAGE_CREATE_EXTENDED_USAGE_BIT,
VK_IMAGE_CREATE_DISJOINT_BIT_KHR = VK_IMAGE_CREATE_DISJOINT_BIT,
VK_IMAGE_CREATE_ALIAS_BIT_KHR = VK_IMAGE_CREATE_ALIAS_BIT,
} VkImageCreateFlagBits;
- VK_IMAGE_CREATE_SPARSE_BINDING_BIT specifies that the image will be backed using sparse memory binding.
- VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT specifies that the image can be partially backed using sparse memory binding. Images created with this flag must also be created with the VK_IMAGE_CREATE_SPARSE_BINDING_BIT flag.
- VK_IMAGE_CREATE_SPARSE_ALIASED_BIT specifies that the image will be backed using sparse memory binding with memory ranges that might also simultaneously be backing another image (or another portion of the same image). Images created with this flag must also be created with the VK_IMAGE_CREATE_SPARSE_BINDING_BIT flag.
- VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT specifies that the image can be used to create a VkImageView with a different format from the image. For multi-planar formats, VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT specifies that a VkImageView can be created of a plane of the image.
- VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT specifies that the image can be used to create a VkImageView of type VK_IMAGE_VIEW_TYPE_CUBE or VK_IMAGE_VIEW_TYPE_CUBE_ARRAY.
- VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT specifies that the image can be used to create a VkImageView of type VK_IMAGE_VIEW_TYPE_2D or VK_IMAGE_VIEW_TYPE_2D_ARRAY.
- VK_IMAGE_CREATE_PROTECTED_BIT specifies that the image is a protected image.
- VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT specifies that the image can be used with a non-zero value of the splitInstanceBindRegionCount member of a VkBindImageMemoryDeviceGroupInfo structure passed into vkBindImageMemory2. This flag also has the effect of making the image use the standard sparse image block dimensions.
- VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT specifies that the image having a compressed format can be used to create a VkImageView with an uncompressed format where each texel in the image view corresponds to a compressed texel block of the image.
- VK_IMAGE_CREATE_EXTENDED_USAGE_BIT specifies that the image can be created with usage flags that are not supported for the format the image is created with but are supported for at least one format a VkImageView created from the image can have.
- VK_IMAGE_CREATE_DISJOINT_BIT specifies that an image with a multi-planar format must have each plane separately bound to memory, rather than having a single memory binding for the whole image; the presence of this bit distinguishes a disjoint image from an image without this bit set.
- VK_IMAGE_CREATE_ALIAS_BIT specifies that two images created with the same creation parameters and aliased to the same memory can interpret the contents of the memory consistently with each other, subject to the rules described in the Memory Aliasing section. This flag further specifies that each plane of a disjoint image can share an in-memory non-linear representation with single-plane images, and that a single-plane image can share an in-memory non-linear representation with a plane of a multi-planar disjoint image, according to the rules in Compatible formats of planes of multi-planar formats. If the pNext chain includes a VkExternalMemoryImageCreateInfo or VkExternalMemoryImageCreateInfoNV structure whose handleTypes member is not 0, it is as if VK_IMAGE_CREATE_ALIAS_BIT is set.
- VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT specifies that an image with a depth or depth/stencil format can be used with custom sample locations when used as a depth/stencil attachment.
- VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV specifies that the image is a corner-sampled image.
- VK_IMAGE_CREATE_SUBSAMPLED_BIT_EXT specifies that an image can be in a subsampled format which may be more optimal when written as an attachment by a render pass that has a fragment density map attachment. Accessing a subsampled image has additional considerations:
  - Image data read as an image sampler is undefined if the sampler was not created with flags containing VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT or was not sampled through the use of a combined image sampler with an immutable sampler in VkDescriptorSetLayoutBinding.
  - Image data read with an input attachment is undefined if the contents were not written as an attachment in an earlier subpass of the same render pass.
  - Image data read with load operations may be resampled to the fragment density of the render pass.
  - Image contents outside of the render area become undefined if the image is stored as a render pass attachment.
See Sparse Resource Features and Sparse Physical Device Features for more details.
typedef VkFlags VkImageCreateFlags;
VkImageCreateFlags
is a bitmask type for setting a mask of zero or
more VkImageCreateFlagBits.
Possible values of VkImageCreateInfo::imageType
, specifying the
basic dimensionality of an image, are:
typedef enum VkImageType {
VK_IMAGE_TYPE_1D = 0,
VK_IMAGE_TYPE_2D = 1,
VK_IMAGE_TYPE_3D = 2,
} VkImageType;
- VK_IMAGE_TYPE_1D specifies a one-dimensional image.
- VK_IMAGE_TYPE_2D specifies a two-dimensional image.
- VK_IMAGE_TYPE_3D specifies a three-dimensional image.
Possible values of VkImageCreateInfo::tiling
, specifying the
tiling arrangement of texel blocks in an image, are:
typedef enum VkImageTiling {
VK_IMAGE_TILING_OPTIMAL = 0,
VK_IMAGE_TILING_LINEAR = 1,
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT = 1000158000,
} VkImageTiling;
- VK_IMAGE_TILING_OPTIMAL specifies optimal tiling (texels are laid out in an implementation-dependent arrangement, for more optimal memory access).
- VK_IMAGE_TILING_LINEAR specifies linear tiling (texels are laid out in memory in row-major order, possibly with some padding on each row).
- VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT indicates that the image’s tiling is defined by a Linux DRM format modifier. The modifier is specified at image creation with VkImageDrmFormatModifierListCreateInfoEXT or VkImageDrmFormatModifierExplicitCreateInfoEXT, and can be queried with vkGetImageDrmFormatModifierPropertiesEXT.
To query the memory layout of an image subresource, call:
void vkGetImageSubresourceLayout(
VkDevice device,
VkImage image,
const VkImageSubresource* pSubresource,
VkSubresourceLayout* pLayout);
- device is the logical device that owns the image.
- image is the image whose layout is being queried.
- pSubresource is a pointer to a VkImageSubresource structure selecting a specific image subresource from the image.
- pLayout points to a VkSubresourceLayout structure in which the layout is returned.
If the image is linear, then the returned layout is valid for host access.
If the image’s
tiling is VK_IMAGE_TILING_LINEAR
and its
format is a
multi-planar format,
then vkGetImageSubresourceLayout
describes one
format plane
of the image.
If the image’s tiling is VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
, then
vkGetImageSubresourceLayout
describes one memory plane of the image.
If the image’s tiling is VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
and
the image is non-linear, then the returned
layout has an implementation-dependent meaning; the vendor of the image’s
DRM format modifier may provide
documentation that explains how to interpret the returned layout.
vkGetImageSubresourceLayout
is invariant for the lifetime of a single
image.
However, the subresource layout of images in Android hardware buffer
external memory is not known until the image has been bound to memory, so
calling vkGetImageSubresourceLayout
for such an image before it has
been bound will result in undefined behavior.
The VkImageSubresource
structure is defined as:
typedef struct VkImageSubresource {
VkImageAspectFlags aspectMask;
uint32_t mipLevel;
uint32_t arrayLayer;
} VkImageSubresource;
- aspectMask is a VkImageAspectFlags selecting the image aspect.
- mipLevel selects the mipmap level.
- arrayLayer selects the array layer.
Information about the layout of the image subresource is returned in a
VkSubresourceLayout
structure:
typedef struct VkSubresourceLayout {
VkDeviceSize offset;
VkDeviceSize size;
VkDeviceSize rowPitch;
VkDeviceSize arrayPitch;
VkDeviceSize depthPitch;
} VkSubresourceLayout;
- offset is the byte offset from the start of the image or the plane where the image subresource begins.
- size is the size in bytes of the image subresource. size includes any extra memory that is required based on rowPitch.
- rowPitch describes the number of bytes between each row of texels in an image.
- arrayPitch describes the number of bytes between each array layer of an image.
- depthPitch describes the number of bytes between each slice of a 3D image.
If the image is linear, then rowPitch
,
arrayPitch
and depthPitch
describe the layout of the image
subresource in linear memory.
For uncompressed formats, rowPitch
is the number of bytes between
texels with the same x coordinate in adjacent rows (y coordinates differ by
one).
arrayPitch
is the number of bytes between texels with the same x and y
coordinate in adjacent array layers of the image (array layer values differ
by one).
depthPitch
is the number of bytes between texels with the same x and y
coordinate in adjacent slices of a 3D image (z coordinates differ by one).
Expressed as an addressing formula, the starting byte of a texel in the
image subresource has address:
// (x,y,z,layer) are in texel coordinates
address(x,y,z,layer) = layer*arrayPitch + z*depthPitch + y*rowPitch + x*elementSize + offset
For compressed formats, the rowPitch
is the number of bytes between
compressed texel blocks in adjacent rows.
arrayPitch
is the number of bytes between compressed texel blocks in
adjacent array layers.
depthPitch
is the number of bytes between compressed texel blocks in
adjacent slices of a 3D image.
// (x,y,z,layer) are in compressed texel block coordinates
address(x,y,z,layer) = layer*arrayPitch + z*depthPitch + y*rowPitch + x*compressedTexelBlockByteSize + offset;
The value of arrayPitch
is undefined for images that were not created
as arrays.
depthPitch
is defined only for 3D images.
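As an informal illustration of the addressing formula above, the following sketch (not part of the API) computes the starting byte of a texel in a linear, uncompressed, non-disjoint image subresource from a queried VkSubresourceLayout. The device, image, subresource selection, and elementSize (the texel size in bytes for the image's format) are assumed to be provided by the application.
// Hypothetical helper: compute the byte offset of texel (x, y, z, layer)
// within a linear, uncompressed image subresource, using the pitches
// returned by vkGetImageSubresourceLayout.
VkDeviceSize texelAddress(VkDevice device, VkImage image,
                          const VkImageSubresource* pSubresource,
                          uint32_t x, uint32_t y, uint32_t z,
                          uint32_t layer, VkDeviceSize elementSize)
{
    VkSubresourceLayout layout;
    vkGetImageSubresourceLayout(device, image, pSubresource, &layout);
    return layer * layout.arrayPitch + z * layout.depthPitch +
           y * layout.rowPitch + x * elementSize + layout.offset;
}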
If the image has a
single-plane
color format
and its tiling is VK_IMAGE_TILING_LINEAR
, then the aspectMask
member of VkImageSubresource
must be
VK_IMAGE_ASPECT_COLOR_BIT
.
If the image has a depth/stencil format
and its tiling is VK_IMAGE_TILING_LINEAR
, then aspectMask
must be either VK_IMAGE_ASPECT_DEPTH_BIT
or
VK_IMAGE_ASPECT_STENCIL_BIT
.
On implementations that store depth and stencil aspects separately, querying
each of these image subresource layouts will return a different offset
and size
representing the region of memory used for that aspect.
On implementations that store depth and stencil aspects interleaved, the
same offset
and size
are returned and represent the interleaved
memory allocation.
If the image has a
multi-planar format
and its tiling is VK_IMAGE_TILING_LINEAR
, then the aspectMask
member of VkImageSubresource
must be
VK_IMAGE_ASPECT_PLANE_0_BIT
, VK_IMAGE_ASPECT_PLANE_1_BIT
, or
(for 3-plane formats only) VK_IMAGE_ASPECT_PLANE_2_BIT
.
Querying each of these image subresource layouts will return a different
offset
and size
representing the region of memory used for that
plane.
If the image is disjoint, then the offset
is relative to the base
address of the plane.
If the image is non-disjoint, then the offset
is relative to the
base address of the image.
If the image’s tiling is VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
, then
the aspectMask
member of VkImageSubresource
must be one of
VK_IMAGE_ASPECT_MEMORY_PLANE_i_BIT_EXT, where the maximum allowed
plane index i is defined by the
drmFormatModifierPlaneCount
associated with the image’s format
and
modifier.
The memory range used by the subresource is described by offset
and
size
.
If the image is disjoint, then the offset
is relative to the base
address of the memory plane.
If the image is non-disjoint, then the offset
is relative to the
base address of the image.
If the image is non-linear, then
rowPitch
, arrayPitch
, and depthPitch
have an
implementation-dependent meaning.
If an image was created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
,
then the image has a Linux DRM format
modifier.
To query the modifier, call:
VkResult vkGetImageDrmFormatModifierPropertiesEXT(
VkDevice device,
VkImage image,
VkImageDrmFormatModifierPropertiesEXT* pProperties);
- device is the logical device that owns the image.
- image is the queried image.
- pProperties will return properties of the image’s DRM format modifier.
The VkImageDrmFormatModifierPropertiesEXT structure is defined as:
typedef struct VkImageDrmFormatModifierPropertiesEXT {
VkStructureType sType;
void* pNext;
uint64_t drmFormatModifier;
} VkImageDrmFormatModifierPropertiesEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- drmFormatModifier returns the image’s Linux DRM format modifier.
If the image
was created with
VkImageDrmFormatModifierListCreateInfoEXT, then the returned
drmFormatModifier
must belong to the list of modifiers provided at
time of image creation in
VkImageDrmFormatModifierListCreateInfoEXT::pDrmFormatModifiers
.
If the image
was created with
VkImageDrmFormatModifierExplicitCreateInfoEXT, then the returned
drmFormatModifier
must be the modifier provided at time of image
creation in
VkImageDrmFormatModifierExplicitCreateInfoEXT::drmFormatModifier
.
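A minimal sketch of this query follows, assuming device and image are a valid logical device and an image created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT.
// Sketch: query which DRM format modifier the implementation chose for
// an image created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT.
VkImageDrmFormatModifierPropertiesEXT modProps = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_PROPERTIES_EXT,
    .pNext = NULL,
};
VkResult result = vkGetImageDrmFormatModifierPropertiesEXT(device, image, &modProps);
// On success, modProps.drmFormatModifier is one of the modifiers provided
// at image creation time (via the list or explicit create info).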
To destroy an image, call:
void vkDestroyImage(
VkDevice device,
VkImage image,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the image.
- image is the image to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
11.3.1. Image Format Features
Valid usage of a VkImage may be constrained by the image’s format features, defined below. Such constraints are documented in the affected valid usage statement.
- If the image was created with VK_IMAGE_TILING_LINEAR, then its set of format features is the value of VkFormatProperties::linearTilingFeatures found by calling vkGetPhysicalDeviceFormatProperties on the same format as VkImageCreateInfo::format.
- If the image was created with VK_IMAGE_TILING_OPTIMAL, but without an external format, then its set of format features is the value of VkFormatProperties::optimalTilingFeatures found by calling vkGetPhysicalDeviceFormatProperties on the same format as VkImageCreateInfo::format.
- If the image was created with an external format, then its set of format features is the value of VkAndroidHardwareBufferFormatPropertiesANDROID::formatFeatures found by calling vkGetAndroidHardwareBufferPropertiesANDROID on the Android hardware buffer that was imported to the VkDeviceMemory to which the image is bound.
- If the image was created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT, then:
  - The image’s DRM format modifier is the value of VkImageDrmFormatModifierPropertiesEXT::drmFormatModifier found by calling vkGetImageDrmFormatModifierPropertiesEXT.
  - Let VkDrmFormatModifierPropertiesListEXT::pDrmFormatModifierProperties be the array found by calling vkGetPhysicalDeviceFormatProperties2 on the same format as VkImageCreateInfo::format.
  - Let VkDrmFormatModifierPropertiesEXT prop be the array element whose drmFormatModifier member is the value of the image’s DRM format modifier.
  - Then the image’s set of format features is the value of prop::drmFormatModifierTilingFeatures.
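The first two cases above reduce to a single query against the physical device. The following sketch shows that lookup, assuming physicalDevice and imageCreateInfo are the application's physical device handle and the VkImageCreateInfo used to create the image.
// Sketch: determine an image's set of format features for the
// non-external, non-DRM-modifier tiling cases described above.
VkFormatProperties props;
vkGetPhysicalDeviceFormatProperties(physicalDevice, imageCreateInfo.format, &props);
VkFormatFeatureFlags features =
    (imageCreateInfo.tiling == VK_IMAGE_TILING_LINEAR)
        ? props.linearTilingFeatures
        : props.optimalTilingFeatures;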
11.3.2. Corner-Sampled Images
A corner-sampled image is an image where unnormalized texel coordinates are centered on integer values rather than half-integer values.
A corner-sampled image has a number of differences compared to a conventional image:
- Texels are centered on integer coordinates. See Unnormalized Texel Coordinate Operations.
- Normalized coordinates are scaled using coord * (dim - 1) rather than coord * dim, where dim is the size of one dimension of the image. See normalized texel coordinate transform.
- Partial derivatives are scaled using coord * (dim - 1) rather than coord * dim. See Scale Factor Operation.
- Calculation of the next higher lod size goes according to ⌈dim / 2⌉ rather than ⌊dim / 2⌋. See Image Miplevel Sizing.
- The minimum level size is 2x2 for 2D images and 2x2x2 for 3D images. See Image Miplevel Sizing.
Corner-sampling is only supported for 2D and 3D images.
When sampling a corner-sampled image, the sampler addressing mode must be
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE
.
Corner-sampled images are not supported as cubemaps or depth/stencil images.
11.3.3. Image Miplevel Sizing
A complete mipmap chain is the full set of miplevels, from the largest miplevel provided, down to the minimum miplevel size.
Conventional Images
For conventional images, the dimensions of each successive miplevel, n+1, are:
- width_(n+1) = max(⌊width_n / 2⌋, 1)
- height_(n+1) = max(⌊height_n / 2⌋, 1)
- depth_(n+1) = max(⌊depth_n / 2⌋, 1)
where width_n, height_n, and depth_n are the dimensions of the next larger miplevel, n.
The minimum miplevel size is:
- 1 for one-dimensional images,
- 1x1 for two-dimensional images, and
- 1x1x1 for three-dimensional images.
The number of levels in a complete mipmap chain is:
- ⌊log2(max(width_0, height_0, depth_0))⌋ + 1
where width_0, height_0, and depth_0 are the dimensions of the largest (most detailed) miplevel, 0.
Corner-Sampled Images
For corner-sampled images, the dimensions of each successive miplevel, n+1, are:
- width_(n+1) = max(⌈width_n / 2⌉, 2)
- height_(n+1) = max(⌈height_n / 2⌉, 2)
- depth_(n+1) = max(⌈depth_n / 2⌉, 2)
where width_n, height_n, and depth_n are the dimensions of the next larger miplevel, n.
The minimum miplevel size is:
- 2x2 for two-dimensional images, and
- 2x2x2 for three-dimensional images.
The number of levels in a complete mipmap chain is:
- ⌈log2(max(width_0, height_0, depth_0))⌉
where width_0, height_0, and depth_0 are the dimensions of the largest (most detailed) miplevel, 0.
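The two mipmap-chain formulas above can be evaluated directly on the host. The following sketch computes the complete chain length for both image kinds from the largest miplevel's extent; it assumes C99 math.h is available and that the extent values are at least 1 (and at least 2 for the corner-sampled case).
// Sketch: number of levels in a complete mipmap chain for the largest
// miplevel extent (w0, h0, d0), for conventional and corner-sampled images.
#include <math.h>
#include <stdint.h>

static uint32_t maxDim3(uint32_t w, uint32_t h, uint32_t d)
{
    uint32_t m = w > h ? w : h;
    return d > m ? d : m;
}

uint32_t conventionalMipLevels(uint32_t w0, uint32_t h0, uint32_t d0)
{
    // floor(log2(max(w0, h0, d0))) + 1
    return (uint32_t)floor(log2((double)maxDim3(w0, h0, d0))) + 1u;
}

uint32_t cornerSampledMipLevels(uint32_t w0, uint32_t h0, uint32_t d0)
{
    // ceil(log2(max(w0, h0, d0)))
    return (uint32_t)ceil(log2((double)maxDim3(w0, h0, d0)));
}
For example, a 1024x512 2D image has 11 levels in a conventional complete chain and 10 levels in a corner-sampled one.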
11.4. Image Layouts
Images are stored in implementation-dependent opaque layouts in memory.
Each layout has limitations on what kinds of operations are supported for
image subresources using the layout.
At any given time, the data representing an image subresource in memory
exists in a particular layout which is determined by the most recent layout
transition that was performed on that image subresource.
Applications have control over which layout each image subresource uses, and
can transition an image subresource from one layout to another.
Transitions can happen with an image memory barrier, included as part of a
vkCmdPipelineBarrier
or a vkCmdWaitEvents
command buffer command
(see Image Memory Barriers), or as part of a subpass
dependency within a render pass (see VkSubpassDependency
).
The image layout is per-image subresource, and separate image subresources
of the same image can be in different layouts at the same time with one
exception - depth and stencil aspects of a given image subresource must
always be in the same layout.
Note
Each layout may offer optimal performance for a specific usage of image memory. For example, an image with a layout of VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL may provide optimal performance for use as a color attachment, but be unsupported for use in transfer commands.
Upon creation, all image subresources of an image are initially in the same
layout, where that layout is selected by the
VkImageCreateInfo
::initialLayout
member.
The initialLayout
must be either VK_IMAGE_LAYOUT_UNDEFINED
or
VK_IMAGE_LAYOUT_PREINITIALIZED
.
If it is VK_IMAGE_LAYOUT_PREINITIALIZED
, then the image data can be
preinitialized by the host while using this layout, and the transition away
from this layout will preserve that data.
If it is VK_IMAGE_LAYOUT_UNDEFINED
, then the contents of the data are
considered to be undefined, and the transition away from this layout is not
guaranteed to preserve that data.
For either of these initial layouts, any image subresources must be
transitioned to another layout before they are accessed by the device.
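A minimal sketch of such a transition follows, assuming commandBuffer is a command buffer in the recording state and image is a color image created with transfer-destination usage; the chosen layouts, stages, and access masks here are illustrative rather than required.
// Sketch: transition all subresources of a color image out of its initial
// VK_IMAGE_LAYOUT_UNDEFINED layout before writing to it with a transfer.
VkImageMemoryBarrier barrier = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
    .srcAccessMask = 0,
    .dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
    .oldLayout = VK_IMAGE_LAYOUT_UNDEFINED,
    .newLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .image = image,
    .subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT,
                          0, VK_REMAINING_MIP_LEVELS,
                          0, VK_REMAINING_ARRAY_LAYERS },
};
vkCmdPipelineBarrier(commandBuffer,
    VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT,
    0, 0, NULL, 0, NULL, 1, &barrier);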
Host access to image memory is only well-defined for
linear images and for image subresources of those
images which are currently in either the
VK_IMAGE_LAYOUT_PREINITIALIZED
or VK_IMAGE_LAYOUT_GENERAL
layout.
Calling vkGetImageSubresourceLayout for a linear image returns a
subresource layout mapping that is valid for either of those image layouts.
The set of image layouts consists of:
typedef enum VkImageLayout {
VK_IMAGE_LAYOUT_UNDEFINED = 0,
VK_IMAGE_LAYOUT_GENERAL = 1,
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL = 2,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL = 3,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL = 4,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL = 5,
VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL = 6,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL = 7,
VK_IMAGE_LAYOUT_PREINITIALIZED = 8,
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL = 1000117000,
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL = 1000117001,
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR = 1000001002,
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR = 1000111000,
VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV = 1000164003,
VK_IMAGE_LAYOUT_FRAGMENT_DENSITY_MAP_OPTIMAL_EXT = 1000218000,
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL_KHR = VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL,
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL_KHR = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL,
} VkImageLayout;
The type(s) of device access supported by each layout are:
- VK_IMAGE_LAYOUT_UNDEFINED does not support device access. This layout must only be used as the initialLayout member of VkImageCreateInfo or VkAttachmentDescription, or as the oldLayout in an image transition. When transitioning out of this layout, the contents of the memory are not guaranteed to be preserved.
- VK_IMAGE_LAYOUT_PREINITIALIZED does not support device access. This layout must only be used as the initialLayout member of VkImageCreateInfo or VkAttachmentDescription, or as the oldLayout in an image transition. When transitioning out of this layout, the contents of the memory are preserved. This layout is intended to be used as the initial layout for an image whose contents are written by the host, and hence the data can be written to memory immediately, without first executing a layout transition. Currently, VK_IMAGE_LAYOUT_PREINITIALIZED is only useful with linear images because there is not a standard layout defined for VK_IMAGE_TILING_OPTIMAL images.
- VK_IMAGE_LAYOUT_GENERAL supports all types of device access.
- VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL must only be used as a color or resolve attachment in a VkFramebuffer. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT usage bit enabled.
- VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL must only be used as a depth/stencil attachment in a VkFramebuffer. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT usage bit enabled.
- VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL must only be used as a read-only depth/stencil attachment in a VkFramebuffer and/or as a read-only image in a shader (which can be read as a sampled image, combined image/sampler and/or input attachment). This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT usage bit enabled. Only image views created with a usage value including VK_IMAGE_USAGE_SAMPLED_BIT can be used as a sampled image or combined image/sampler in a shader. Similarly, only image views created with a usage value including VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT can be used as input attachments.
- VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL must only be used as a depth/stencil attachment in a VkFramebuffer, where the depth aspect is read-only, and/or as a read-only image in a shader (which can be read as a sampled image, combined image/sampler and/or input attachment) where only the depth aspect is accessed. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT usage bit enabled. Only image views created with a usage value including VK_IMAGE_USAGE_SAMPLED_BIT can be used as a sampled image or combined image/sampler in a shader. Similarly, only image views created with a usage value including VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT can be used as input attachments.
- VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL must only be used as a depth/stencil attachment in a VkFramebuffer, where the stencil aspect is read-only, and/or as a read-only image in a shader (which can be read as a sampled image, combined image/sampler and/or input attachment) where only the stencil aspect is accessed. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT usage bit enabled. Only image views created with a usage value including VK_IMAGE_USAGE_SAMPLED_BIT can be used as a sampled image or combined image/sampler in a shader. Similarly, only image views created with a usage value including VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT can be used as input attachments.
- VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL must only be used as a read-only image in a shader (which can be read as a sampled image, combined image/sampler and/or input attachment). This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_SAMPLED_BIT or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT usage bit enabled.
- VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL must only be used as a source image of a transfer command (see the definition of VK_PIPELINE_STAGE_TRANSFER_BIT). This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_TRANSFER_SRC_BIT usage bit enabled.
- VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL must only be used as a destination image of a transfer command. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_TRANSFER_DST_BIT usage bit enabled.
- VK_IMAGE_LAYOUT_PRESENT_SRC_KHR must only be used for presenting a presentable image for display. A swapchain’s image must be transitioned to this layout before calling vkQueuePresentKHR, and must be transitioned away from this layout after calling vkAcquireNextImageKHR.
- VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR is valid only for shared presentable images, and must be used for any usage the image supports.
- VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV must only be used as a read-only shading-rate image. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV usage bit enabled.
- VK_IMAGE_LAYOUT_FRAGMENT_DENSITY_MAP_OPTIMAL_EXT must only be used as a fragment density map attachment in a VkRenderPass. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_FRAGMENT_DENSITY_MAP_BIT_EXT usage bit enabled.
The layout of each image subresource is not a state of the image subresource
itself, but is rather a property of how the data in memory is organized, and
thus for each mechanism of accessing an image in the API the application
must specify a parameter or structure member that indicates which image
layout the image subresource(s) are considered to be in when the image will
be accessed.
For transfer commands, this is a parameter to the command (see Clear Commands
and Copy Commands).
For use as a framebuffer attachment, this is a member in the substructures
of the VkRenderPassCreateInfo
(see Render Pass).
For use in a descriptor set, this is a member in the
VkDescriptorImageInfo
structure (see Descriptor Set Updates).
11.4.1. Image Layout Matching Rules
At the time that any command buffer command accessing an image executes on any queue, the layouts of the image subresources that are accessed must all match exactly the layout specified via the API controlling those accesses, except in case of accesses to an image with a depth/stencil format performed through descriptors referring to only a single aspect of the image, where the following relaxed matching rules apply:
- Descriptors referring just to the depth aspect of a depth/stencil image only need to match in the image layout of the depth aspect, thus VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL and VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL are considered to match.
- Descriptors referring just to the stencil aspect of a depth/stencil image only need to match in the image layout of the stencil aspect, thus VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL and VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL are considered to match.
When performing a layout transition on an image subresource, the old layout
value must either equal the current layout of the image subresource (at the
time the transition executes), or else be VK_IMAGE_LAYOUT_UNDEFINED
(implying that the contents of the image subresource need not be preserved).
The new layout used in a transition must not be
VK_IMAGE_LAYOUT_UNDEFINED
or VK_IMAGE_LAYOUT_PREINITIALIZED
.
The image layout of each image subresource of a depth/stencil image created
with VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
is
dependent on the last sample locations used to render to the image
subresource as a depth/stencil attachment, thus applications must provide
the same sample locations that were last used to render to the given image
subresource whenever a layout transition of the image subresource happens,
otherwise the contents of the depth aspect of the image subresource become
undefined.
In addition, depth reads from a depth/stencil attachment referring to an
image subresource range of a depth/stencil image created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
using
different sample locations than what have been last used to perform depth
writes to the image subresources of the same image subresource range return
undefined values.
Similarly, depth writes to a depth/stencil attachment referring to an image
subresource range of a depth/stencil image created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
using
different sample locations than what have been last used to perform depth
writes to the image subresources of the same image subresource range make
the contents of the depth aspect of those image subresources undefined.
11.5. Image Views
Image objects are not directly accessed by pipeline shaders for reading or writing image data. Instead, image views representing contiguous ranges of the image subresources and containing additional metadata are used for that purpose. Views must be created on images of compatible types, and must represent a valid subset of image subresources.
Image views are represented by VkImageView
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkImageView)
The types of image views that can be created are:
typedef enum VkImageViewType {
VK_IMAGE_VIEW_TYPE_1D = 0,
VK_IMAGE_VIEW_TYPE_2D = 1,
VK_IMAGE_VIEW_TYPE_3D = 2,
VK_IMAGE_VIEW_TYPE_CUBE = 3,
VK_IMAGE_VIEW_TYPE_1D_ARRAY = 4,
VK_IMAGE_VIEW_TYPE_2D_ARRAY = 5,
VK_IMAGE_VIEW_TYPE_CUBE_ARRAY = 6,
} VkImageViewType;
The exact image view type is partially implicit, based on the image’s type
and sample count, as well as the view creation parameters as described in
the image view compatibility table
for vkCreateImageView.
This table also shows which SPIR-V OpTypeImage
Dim
and
Arrayed
parameters correspond to each image view type.
To create an image view, call:
VkResult vkCreateImageView(
VkDevice device,
const VkImageViewCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkImageView* pView);
- device is the logical device that creates the image view.
- pCreateInfo is a pointer to an instance of the VkImageViewCreateInfo structure containing parameters to be used to create the image view.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pView points to a VkImageView handle in which the resulting image view object is returned.
Some of the image creation parameters are inherited by the view.
In particular, image view creation inherits the implicit parameter
usage
specifying the allowed usages of the image view that, by
default, takes the value of the corresponding usage
parameter
specified in VkImageCreateInfo
at image creation time
, except if the image has a depth-stencil format,
subresourceRange.aspectMask
specified in the pCreateInfo
parameter includes VK_IMAGE_ASPECT_STENCIL_BIT
, and the pNext
chain of VkImageCreateInfo
specified at image creation time contained
an instance of VkImageStencilUsageCreateInfoEXT in which case it takes
the value of the stencilUsage
member of that structure
.
This implicit parameter can be overridden by chaining a
VkImageViewUsageCreateInfo structure through the pNext
member to
VkImageViewCreateInfo
as described later in this section.
The remaining parameters are contained in the pCreateInfo
.
The VkImageViewCreateInfo
structure is defined as:
typedef struct VkImageViewCreateInfo {
VkStructureType sType;
const void* pNext;
VkImageViewCreateFlags flags;
VkImage image;
VkImageViewType viewType;
VkFormat format;
VkComponentMapping components;
VkImageSubresourceRange subresourceRange;
} VkImageViewCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkImageViewCreateFlagBits describing additional parameters of the image view.
- image is a VkImage on which the view will be created.
- viewType is a VkImageViewType value specifying the type of the image view.
- format is a VkFormat describing the format and type used to interpret texel blocks in the image.
- components is a VkComponentMapping structure specifying a remapping of color components (or of depth or stencil components after they have been converted into color components).
- subresourceRange is a VkImageSubresourceRange structure selecting the set of mipmap levels and array layers to be accessible to the view.
If image
was created with the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT
flag,
and if the format
of the image is not
multi-planar,
format
can be different from the image’s format, but if
image
was created without the
VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT
flag and
they are not equal they must be compatible.
Image format compatibility is defined in the
Format Compatibility Classes
section.
Views of compatible formats will have the same mapping between texel
coordinates and memory locations irrespective of the format
, with only
the interpretation of the bit pattern changing.
Note
Values intended to be used with one view format may not be exactly preserved when written or read through a different format. For example, an integer value that happens to have the bit pattern of a floating point denorm or NaN may be flushed or canonicalized when written or read through a view with a floating point format. Similarly, a value written through a signed normalized format that has a bit pattern exactly equal to -2^b may be changed to -2^b + 1 as described in Conversion from Normalized Fixed-Point to Floating-Point.
If image
was created with the
VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT
flag, format
must be compatible with the image’s format as described above, or must
be an uncompressed format in which case it must be size-compatible with
the image’s format, as defined for
copying data between images. In
this case the resulting image view’s texel dimensions equal the dimensions
of the selected mip level divided by the compressed texel block size and
rounded up.
If the image view is to be used with a sampler which supports
sampler Y’CBCR conversion, an identically
defined object of type VkSamplerYcbcrConversion to that used to
create the sampler must be passed to vkCreateImageView in a
VkSamplerYcbcrConversionInfo added to the pNext
chain of
VkImageViewCreateInfo.
If the image has a
multi-planar
format
and subresourceRange.aspectMask
is
VK_IMAGE_ASPECT_COLOR_BIT
, format
must be identical to the
image format
, and the sampler to be used with the image view must
enable sampler Y’CBCR conversion.
If image
was created with the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT
and the image has a
multi-planar
format
, and if subresourceRange.aspectMask
is
VK_IMAGE_ASPECT_PLANE_0_BIT
, VK_IMAGE_ASPECT_PLANE_1_BIT
, or
VK_IMAGE_ASPECT_PLANE_2_BIT
, format
must be
compatible with the corresponding
plane of the image, and the sampler to be used with the image view must not
enable sampler Y’CBCR conversion.
The width
and height
of the single-plane image view must be
derived from the multi-planar image’s dimensions in the manner listed for
plane compatibility for the plane.
Any view of an image plane will have the same mapping between texel coordinates and memory locations as used by the channels of the color aspect, subject to the formulae relating texel coordinates to lower-resolution planes as described in Chroma Reconstruction. That is, if an R or B plane has a reduced resolution relative to the G plane of the multi-planar image, the image view operates using the (uplane, vplane) unnormalized coordinates of the reduced-resolution plane, and these coordinates access the same memory locations as the (ucolor, vcolor) unnormalized coordinates of the color aspect for which chroma reconstruction operations operate on the same (uplane, vplane) or (iplane, jplane) coordinates.
[Table: image view compatibility requirements, relating each SPIR-V OpTypeImage Dim/Arrayed/MS combination (1D, 1D array, 2D, 2D array, 2D multisample, 2D multisample array, cube, cube array, and 3D) to the corresponding image parameters and view parameters; the image and view parameter columns were not preserved here.]
Bits which can be set in VkImageViewCreateInfo::flags
,
specifying additional parameters of an image, are:
typedef enum VkImageViewCreateFlagBits {
VK_IMAGE_VIEW_CREATE_FRAGMENT_DENSITY_MAP_DYNAMIC_BIT_EXT = 0x00000001,
} VkImageViewCreateFlagBits;
- VK_IMAGE_VIEW_CREATE_FRAGMENT_DENSITY_MAP_DYNAMIC_BIT_EXT prohibits the implementation from accessing the fragment density map by the host during vkCmdBeginRenderPass, as the contents are expected to change after recording.
typedef VkFlags VkImageViewCreateFlags;
VkImageViewCreateFlags
is a bitmask type for setting a mask of zero or
more VkImageViewCreateFlagBits.
The set of usages for the created image view can be restricted compared to
the parent image’s usage
flags by chaining a
VkImageViewUsageCreateInfo
structure through the pNext
member to
VkImageViewCreateInfo
.
The VkImageViewUsageCreateInfo
structure is defined as:
typedef struct VkImageViewUsageCreateInfo {
VkStructureType sType;
const void* pNext;
VkImageUsageFlags usage;
} VkImageViewUsageCreateInfo;
or the equivalent
typedef VkImageViewUsageCreateInfo VkImageViewUsageCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- usage is a bitmask describing the allowed usages of the image view. See VkImageUsageFlagBits for a description of the supported bits.
When this structure is chained to VkImageViewCreateInfo
the
usage
field overrides the implicit usage
parameter inherited
from image creation time and its value is used instead for the purposes of
determining the valid usage conditions of VkImageViewCreateInfo.
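As a sketch of this override, the following fragment chains a VkImageViewUsageCreateInfo into the view creation parameters; viewInfo is assumed to be the VkImageViewCreateInfo from the earlier creation sketch.
// Sketch: restrict the view to sampling only, even if the parent image
// was created with additional usage bits.
VkImageViewUsageCreateInfo viewUsage = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
    .pNext = NULL,
    .usage = VK_IMAGE_USAGE_SAMPLED_BIT,
};
viewInfo.pNext = &viewUsage;  // the view's usage is now VK_IMAGE_USAGE_SAMPLED_BIT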
The VkImageSubresourceRange
structure is defined as:
typedef struct VkImageSubresourceRange {
VkImageAspectFlags aspectMask;
uint32_t baseMipLevel;
uint32_t levelCount;
uint32_t baseArrayLayer;
uint32_t layerCount;
} VkImageSubresourceRange;
- aspectMask is a bitmask of VkImageAspectFlagBits specifying which aspect(s) of the image are included in the view.
- baseMipLevel is the first mipmap level accessible to the view.
- levelCount is the number of mipmap levels (starting from baseMipLevel) accessible to the view.
- baseArrayLayer is the first array layer accessible to the view.
- layerCount is the number of array layers (starting from baseArrayLayer) accessible to the view.
The number of mipmap levels and array layers must be a subset of the image
subresources in the image.
If an application wants to use all mip levels or layers in an image after
the baseMipLevel
or baseArrayLayer
, it can set levelCount
and layerCount
to the special values VK_REMAINING_MIP_LEVELS
and
VK_REMAINING_ARRAY_LAYERS
without knowing the exact number of mip
levels or layers.
For cube and cube array image views, the layers of the image view starting
at baseArrayLayer
correspond to faces in the order +X, -X, +Y, -Y, +Z,
-Z.
For cube arrays, each set of six sequential layers is a single cube, so the
number of cube maps in a cube map array view is layerCount
/ 6, and
image array layer (baseArrayLayer
+ i) is face index
(i mod 6) of cube i / 6.
If the number of layers in the view, whether set explicitly in
layerCount
or implied by VK_REMAINING_ARRAY_LAYERS
, is not a
multiple of 6, behavior when indexing the last cube is undefined.
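The cube-array indexing rule above can be restated as a small calculation; in this fragment, baseArrayLayer and the view-relative layer index i are assumed to be supplied by the application.
// Sketch: map a cube array view layer back to an image array layer,
// a cube index, and a face index in +X, -X, +Y, -Y, +Z, -Z order.
uint32_t imageLayer = baseArrayLayer + i;
uint32_t cubeIndex  = i / 6;
uint32_t faceIndex  = i % 6;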
aspectMask
must be only VK_IMAGE_ASPECT_COLOR_BIT
,
VK_IMAGE_ASPECT_DEPTH_BIT
or VK_IMAGE_ASPECT_STENCIL_BIT
if
format
is a color, depth-only or stencil-only format,
respectively, except if format
is a
multi-planar format.
If using a depth/stencil format with both depth and stencil components,
aspectMask
must include at least one of
VK_IMAGE_ASPECT_DEPTH_BIT
and VK_IMAGE_ASPECT_STENCIL_BIT
, and
can include both.
When the VkImageSubresourceRange
structure is used to select a subset
of the slices of a 3D image’s mip level in order to create a 2D or 2D array
image view of a 3D image created with
VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT
, baseArrayLayer
and
layerCount
specify the first slice index and the number of slices to
include in the created image view.
Such an image view can be used as a framebuffer attachment that refers only
to the specified range of slices of the selected mip level.
However, any layout transitions performed on such an attachment view during
a render pass instance still apply to the entire subresource referenced
which includes all the slices of the selected mip level.
When using an imageView of a depth/stencil image to populate a descriptor
set (e.g. for sampling in the shader, or for use as an input attachment),
the aspectMask
must only include one bit and selects whether the
imageView is used for depth reads (i.e. using a floating-point sampler or
input attachment in the shader) or stencil reads (i.e. using an unsigned
integer sampler or input attachment in the shader).
When an imageView of a depth/stencil image is used as a depth/stencil
framebuffer attachment, the aspectMask
is ignored and both depth and
stencil image subresources are used.
The components
member is of type VkComponentMapping, and
describes a remapping from components of the image to components of the
vector returned by shader image instructions.
This remapping must be identity for storage image descriptors, input
attachment descriptors,
framebuffer attachments, and any VkImageView
used with a combined
image sampler that enables sampler Y’CBCR
conversion.
When creating a VkImageView
, if sampler
Y’CBCR conversion is enabled in the sampler, the aspectMask
of a
subresourceRange
used by the VkImageView
must be
VK_IMAGE_ASPECT_COLOR_BIT
.
When creating a VkImageView
, if sampler Y’CBCR conversion is not
enabled in the sampler and the image format
is
multi-planar, the
image must have been created with VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT
,
and the aspectMask
of the VkImageView
’s subresourceRange
must be VK_IMAGE_ASPECT_PLANE_0_BIT
,
VK_IMAGE_ASPECT_PLANE_1_BIT
or VK_IMAGE_ASPECT_PLANE_2_BIT
.
Bits which can be set in an aspect mask to specify aspects of an image for purposes such as identifying a subresource, are:
typedef enum VkImageAspectFlagBits {
VK_IMAGE_ASPECT_COLOR_BIT = 0x00000001,
VK_IMAGE_ASPECT_DEPTH_BIT = 0x00000002,
VK_IMAGE_ASPECT_STENCIL_BIT = 0x00000004,
VK_IMAGE_ASPECT_METADATA_BIT = 0x00000008,
VK_IMAGE_ASPECT_PLANE_0_BIT = 0x00000010,
VK_IMAGE_ASPECT_PLANE_1_BIT = 0x00000020,
VK_IMAGE_ASPECT_PLANE_2_BIT = 0x00000040,
VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT_EXT = 0x00000080,
VK_IMAGE_ASPECT_MEMORY_PLANE_1_BIT_EXT = 0x00000100,
VK_IMAGE_ASPECT_MEMORY_PLANE_2_BIT_EXT = 0x00000200,
VK_IMAGE_ASPECT_MEMORY_PLANE_3_BIT_EXT = 0x00000400,
VK_IMAGE_ASPECT_PLANE_0_BIT_KHR = VK_IMAGE_ASPECT_PLANE_0_BIT,
VK_IMAGE_ASPECT_PLANE_1_BIT_KHR = VK_IMAGE_ASPECT_PLANE_1_BIT,
VK_IMAGE_ASPECT_PLANE_2_BIT_KHR = VK_IMAGE_ASPECT_PLANE_2_BIT,
} VkImageAspectFlagBits;
- VK_IMAGE_ASPECT_COLOR_BIT specifies the color aspect.
- VK_IMAGE_ASPECT_DEPTH_BIT specifies the depth aspect.
- VK_IMAGE_ASPECT_STENCIL_BIT specifies the stencil aspect.
- VK_IMAGE_ASPECT_METADATA_BIT specifies the metadata aspect, used for sparse resource operations.
typedef VkFlags VkImageAspectFlags;
VkImageAspectFlags
is a bitmask type for setting a mask of zero or
more VkImageAspectFlagBits.
The VkComponentMapping
structure is defined as:
typedef struct VkComponentMapping {
VkComponentSwizzle r;
VkComponentSwizzle g;
VkComponentSwizzle b;
VkComponentSwizzle a;
} VkComponentMapping;
- r is a VkComponentSwizzle specifying the component value placed in the R component of the output vector.
- g is a VkComponentSwizzle specifying the component value placed in the G component of the output vector.
- b is a VkComponentSwizzle specifying the component value placed in the B component of the output vector.
- a is a VkComponentSwizzle specifying the component value placed in the A component of the output vector.
Possible values of the members of VkComponentMapping, specifying the component values placed in each component of the output vector, are:
typedef enum VkComponentSwizzle {
VK_COMPONENT_SWIZZLE_IDENTITY = 0,
VK_COMPONENT_SWIZZLE_ZERO = 1,
VK_COMPONENT_SWIZZLE_ONE = 2,
VK_COMPONENT_SWIZZLE_R = 3,
VK_COMPONENT_SWIZZLE_G = 4,
VK_COMPONENT_SWIZZLE_B = 5,
VK_COMPONENT_SWIZZLE_A = 6,
} VkComponentSwizzle;
- VK_COMPONENT_SWIZZLE_IDENTITY specifies that the component is set to the identity swizzle.
- VK_COMPONENT_SWIZZLE_ZERO specifies that the component is set to zero.
- VK_COMPONENT_SWIZZLE_ONE specifies that the component is set to either 1 or 1.0, depending on whether the type of the image view format is integer or floating-point respectively, as determined by the Format Definition section for each VkFormat.
- VK_COMPONENT_SWIZZLE_R specifies that the component is set to the value of the R component of the image.
- VK_COMPONENT_SWIZZLE_G specifies that the component is set to the value of the G component of the image.
- VK_COMPONENT_SWIZZLE_B specifies that the component is set to the value of the B component of the image.
- VK_COMPONENT_SWIZZLE_A specifies that the component is set to the value of the A component of the image.
Setting the identity swizzle on a component is equivalent to setting the identity mapping on that component. That is:
Component | Identity Mapping
---|---
components.r | VK_COMPONENT_SWIZZLE_R
components.g | VK_COMPONENT_SWIZZLE_G
components.b | VK_COMPONENT_SWIZZLE_B
components.a | VK_COMPONENT_SWIZZLE_A
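A small sketch of a non-identity mapping follows; it reads the image's B and R components swapped while leaving G and A in place, which an application might use, for example, when viewing BGRA-ordered data through an RGBA-style shader.
// Sketch: a component mapping that swaps the R and B components.
VkComponentMapping swapRB = {
    .r = VK_COMPONENT_SWIZZLE_B,
    .g = VK_COMPONENT_SWIZZLE_IDENTITY,
    .b = VK_COMPONENT_SWIZZLE_R,
    .a = VK_COMPONENT_SWIZZLE_IDENTITY,
};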
If the pNext
list includes a VkImageViewASTCDecodeModeEXT
structure, then that structure includes a parameter that specifies the
decode mode for image views using ASTC compressed formats.
The VkImageViewASTCDecodeModeEXT
structure is defined as:
typedef struct VkImageViewASTCDecodeModeEXT {
VkStructureType sType;
const void* pNext;
VkFormat decodeMode;
} VkImageViewASTCDecodeModeEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- decodeMode is the intermediate format used to decode ASTC compressed formats.
If format
uses sRGB encoding then the decodeMode
has no effect.
To destroy an image view, call:
void vkDestroyImageView(
VkDevice device,
VkImageView imageView,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the image view.
- imageView is the image view to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
11.5.1. Image View Format Features
Valid usage of a VkImageView may be constrained by the image view’s format features, defined below. Such constraints are documented in the affected valid usage statement.
- If the view’s image was created with VK_IMAGE_TILING_LINEAR, then the image view’s set of format features is the value of VkFormatProperties::linearTilingFeatures found by calling vkGetPhysicalDeviceFormatProperties on the same format as VkImageViewCreateInfo::format.
- If the view’s image was created with VK_IMAGE_TILING_OPTIMAL, but without an external format, then the image view’s set of format features is the value of VkFormatProperties::optimalTilingFeatures found by calling vkGetPhysicalDeviceFormatProperties on the same format as VkImageViewCreateInfo::format.
- If the view’s image was created with an external format, then the image view’s set of format features is the value of VkAndroidHardwareBufferFormatPropertiesANDROID::formatFeatures found by calling vkGetAndroidHardwareBufferPropertiesANDROID on the Android hardware buffer that was imported to the VkDeviceMemory to which the image is bound.
- If the view’s image was created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT, then:
  - The image’s DRM format modifier is the value of VkImageDrmFormatModifierPropertiesEXT::drmFormatModifier found by calling vkGetImageDrmFormatModifierPropertiesEXT.
  - Let VkDrmFormatModifierPropertiesListEXT::pDrmFormatModifierProperties be the array found by calling vkGetPhysicalDeviceFormatProperties2 on the same format as VkImageViewCreateInfo::format.
  - Let VkDrmFormatModifierPropertiesEXT prop be the array element whose drmFormatModifier member is the value of the image’s DRM format modifier.
  - Then the image view’s set of format features is the value of prop::drmFormatModifierTilingFeatures.
11.6. Resource Memory Association
Resources are initially created as virtual allocations with no backing memory. Device memory is allocated separately (see Device Memory) and then associated with the resource. This association is done differently for sparse and non-sparse resources.
Resources created with any of the sparse creation flags are considered sparse resources. Resources created without these flags are non-sparse. The details on resource memory association for sparse resources is described in Sparse Resources.
Non-sparse resources must be bound completely and contiguously to a single
VkDeviceMemory
object before the resource is passed as a parameter to
any of the following operations:
- creating image or buffer views
- updating descriptor sets
- recording commands in a command buffer
Once bound, the memory binding is immutable for the lifetime of the resource.
In a logical device representing more than one physical device, buffer and image resources exist on all physical devices but can be bound to memory differently on each. Each such replicated resource is an instance of the resource. For sparse resources, each instance can be bound to memory arbitrarily differently. For non-sparse resources, each instance can either be bound to the local or a peer instance of the memory, or for images can be bound to rectangular regions from the local and/or peer instances. When a resource is used in a descriptor set, each physical device interprets the descriptor according to its own instance’s binding to memory.
Note
There are no new copy commands to transfer data between physical devices. Instead, an application can create a resource with a peer mapping and use it as the source or destination of a transfer command executed by a single physical device to copy the data from one physical device to another.
To determine the memory requirements for a buffer resource, call:
void vkGetBufferMemoryRequirements(
VkDevice device,
VkBuffer buffer,
VkMemoryRequirements* pMemoryRequirements);
- device is the logical device that owns the buffer.
- buffer is the buffer to query.
- pMemoryRequirements points to an instance of the VkMemoryRequirements structure in which the memory requirements of the buffer object are returned.
To determine the memory requirements for an image resource which is not
created with the VK_IMAGE_CREATE_DISJOINT_BIT
flag set, call:
void vkGetImageMemoryRequirements(
VkDevice device,
VkImage image,
VkMemoryRequirements* pMemoryRequirements);
- device is the logical device that owns the image.
- image is the image to query.
- pMemoryRequirements points to an instance of the VkMemoryRequirements structure in which the memory requirements of the image object are returned.
The VkMemoryRequirements
structure is defined as:
typedef struct VkMemoryRequirements {
VkDeviceSize size;
VkDeviceSize alignment;
uint32_t memoryTypeBits;
} VkMemoryRequirements;
- size is the size, in bytes, of the memory allocation required for the resource.
- alignment is the alignment, in bytes, of the offset within the allocation required for the resource.
- memoryTypeBits is a bitmask and contains one bit set for every supported memory type for the resource. Bit i is set if and only if the memory type i in the VkPhysicalDeviceMemoryProperties structure for the physical device is supported for the resource.
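A common way to use memoryTypeBits is to intersect it with the application's desired memory properties when choosing a memory type index for vkAllocateMemory. The following sketch shows one such selection loop; the helper name and its use of UINT32_MAX as a "not found" value are conventions of this example, not of the API.
// Sketch: pick a memory type index permitted by memoryTypeBits that also
// has all of the requested property flags.
uint32_t findMemoryType(VkPhysicalDevice physicalDevice,
                        uint32_t memoryTypeBits,
                        VkMemoryPropertyFlags required)
{
    VkPhysicalDeviceMemoryProperties memProps;
    vkGetPhysicalDeviceMemoryProperties(physicalDevice, &memProps);
    for (uint32_t i = 0; i < memProps.memoryTypeCount; ++i) {
        if ((memoryTypeBits & (1u << i)) &&
            (memProps.memoryTypes[i].propertyFlags & required) == required)
            return i;
    }
    return UINT32_MAX;  // no suitable memory type found
}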
The precise size of images that will be bound to external Android hardware
buffer memory is unknown until the memory has been imported or allocated, so
calling vkGetImageMemoryRequirements with such an image before it has
been bound to memory will result in undefined behavior.
When importing Android hardware buffer memory, the allocationSize
can
be determined by calling vkGetAndroidHardwareBufferPropertiesANDROID.
When allocating new memory for an image that can be exported to an Android
hardware buffer, the memory’s allocationSize
must be zero; the actual
size will be determined by the dedicated image’s parameters.
After the memory has been allocated, the amount of space allocated from the
memory’s heap can be obtained by getting the image’s memory requirements or
by calling vkGetAndroidHardwareBufferPropertiesANDROID with the
Android hardware buffer exported from the memory.
The implementation guarantees certain properties about the memory requirements returned by vkGetBufferMemoryRequirements and vkGetImageMemoryRequirements:
- The memoryTypeBits member always contains at least one bit set.
- If buffer is a VkBuffer not created with the VK_BUFFER_CREATE_SPARSE_BINDING_BIT bit set, or if image is a linear image, then the memoryTypeBits member always contains at least one bit set corresponding to a VkMemoryType with a propertyFlags that has both the VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT bit and the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT bit set. In other words, mappable coherent memory can always be attached to these objects.
- If buffer was created with VkExternalMemoryBufferCreateInfo::handleTypes set to 0 or image was created with VkExternalMemoryImageCreateInfo::handleTypes set to 0, the memoryTypeBits member always contains at least one bit set corresponding to a VkMemoryType with a propertyFlags that has the VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT bit set.
- The memoryTypeBits member is identical for all VkBuffer objects created with the same value for the flags and usage members in the VkBufferCreateInfo structure and the handleTypes member of the VkExternalMemoryBufferCreateInfo structure passed to vkCreateBuffer. Further, if usage1 and usage2 of type VkBufferUsageFlags are such that the bits set in usage2 are a subset of the bits set in usage1, and they have the same flags and VkExternalMemoryBufferCreateInfo::handleTypes, then the bits set in memoryTypeBits returned for usage1 must be a subset of the bits set in memoryTypeBits returned for usage2, for all values of flags.
- The alignment member is a power of two.
- The alignment member is identical for all VkBuffer objects created with the same combination of values for the usage and flags members in the VkBufferCreateInfo structure passed to vkCreateBuffer.
- The alignment member satisfies the buffer descriptor offset alignment requirements associated with the VkBuffer’s usage:
  - If usage included VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT or VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT, alignment must be an integer multiple of VkPhysicalDeviceLimits::minTexelBufferOffsetAlignment.
  - If usage included VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, alignment must be an integer multiple of VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment.
  - If usage included VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, alignment must be an integer multiple of VkPhysicalDeviceLimits::minStorageBufferOffsetAlignment.
- For images created with a color format, the memoryTypeBits member is identical for all VkImage objects created with the same combination of values for the tiling member, the VK_IMAGE_CREATE_SPARSE_BINDING_BIT bit of the flags member, the VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT bit of the flags member, the handleTypes member of VkExternalMemoryImageCreateInfo, and the VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT of the usage member in the VkImageCreateInfo structure passed to vkCreateImage.
- For images created with a depth/stencil format, the memoryTypeBits member is identical for all VkImage objects created with the same combination of values for the format member, the tiling member, the VK_IMAGE_CREATE_SPARSE_BINDING_BIT bit of the flags member, the VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT bit of the flags member, the handleTypes member of VkExternalMemoryImageCreateInfo, and the VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT of the usage member in the VkImageCreateInfo structure passed to vkCreateImage.
- If the memory requirements are for a VkImage, the memoryTypeBits member must not refer to a VkMemoryType with a propertyFlags that has the VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT bit set if the vkGetImageMemoryRequirements::image did not have VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT bit set in the usage member of the VkImageCreateInfo structure passed to vkCreateImage.
- If the memory requirements are for a VkBuffer, the memoryTypeBits member must not refer to a VkMemoryType with a propertyFlags that has the VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT bit set.
  Note: The implication of this requirement is that lazily allocated memory is disallowed for buffers in all cases.
- The size member is identical for all VkBuffer objects created with the same combination of creation parameters specified in VkBufferCreateInfo and its pNext chain.
- The size member is identical for all VkImage objects created with the same combination of creation parameters specified in VkImageCreateInfo and its pNext chain.
  Note: This, however, does not imply that they interpret the contents of the bound memory identically with each other. That additional guarantee, however, can be explicitly requested using VK_IMAGE_CREATE_ALIAS_BIT.
To determine the memory requirements for a buffer resource, call:
void vkGetBufferMemoryRequirements2(
VkDevice device,
const VkBufferMemoryRequirementsInfo2* pInfo,
VkMemoryRequirements2* pMemoryRequirements);
or the equivalent command
void vkGetBufferMemoryRequirements2KHR(
VkDevice device,
const VkBufferMemoryRequirementsInfo2* pInfo,
VkMemoryRequirements2* pMemoryRequirements);
- device is the logical device that owns the buffer.
- pInfo is a pointer to an instance of the VkBufferMemoryRequirementsInfo2 structure containing parameters required for the memory requirements query.
- pMemoryRequirements points to an instance of the VkMemoryRequirements2 structure in which the memory requirements of the buffer object are returned.
The VkBufferMemoryRequirementsInfo2
structure is defined as:
typedef struct VkBufferMemoryRequirementsInfo2 {
VkStructureType sType;
const void* pNext;
VkBuffer buffer;
} VkBufferMemoryRequirementsInfo2;
or the equivalent
typedef VkBufferMemoryRequirementsInfo2 VkBufferMemoryRequirementsInfo2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- buffer is the buffer to query.
To determine the memory requirements for an image resource, call:
void vkGetImageMemoryRequirements2(
VkDevice device,
const VkImageMemoryRequirementsInfo2* pInfo,
VkMemoryRequirements2* pMemoryRequirements);
or the equivalent command
void vkGetImageMemoryRequirements2KHR(
VkDevice device,
const VkImageMemoryRequirementsInfo2* pInfo,
VkMemoryRequirements2* pMemoryRequirements);
- device is the logical device that owns the image.
- pInfo is a pointer to an instance of the VkImageMemoryRequirementsInfo2 structure containing parameters required for the memory requirements query.
- pMemoryRequirements points to an instance of the VkMemoryRequirements2 structure in which the memory requirements of the image object are returned.
The VkImageMemoryRequirementsInfo2
structure is defined as:
typedef struct VkImageMemoryRequirementsInfo2 {
VkStructureType sType;
const void* pNext;
VkImage image;
} VkImageMemoryRequirementsInfo2;
or the equivalent
typedef VkImageMemoryRequirementsInfo2 VkImageMemoryRequirementsInfo2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- image is the image to query.
To determine the memory requirements for a plane of a disjoint image, add a
VkImagePlaneMemoryRequirementsInfo
to the pNext
chain of the
VkImageMemoryRequirementsInfo2
structure.
The VkImagePlaneMemoryRequirementsInfo
structure is defined as:
typedef struct VkImagePlaneMemoryRequirementsInfo {
VkStructureType sType;
const void* pNext;
VkImageAspectFlagBits planeAspect;
} VkImagePlaneMemoryRequirementsInfo;
or the equivalent
typedef VkImagePlaneMemoryRequirementsInfo VkImagePlaneMemoryRequirementsInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- planeAspect is the aspect corresponding to the image plane to query.
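As an informal illustration (not part of this Specification), a query for a single plane of a disjoint image might be assembled as in the following C sketch; the function and parameter names are illustrative only, and the image is assumed to have been created with VK_IMAGE_CREATE_DISJOINT_BIT.
#include <vulkan/vulkan.h>

/* Sketch: query the memory requirements of plane 1 of a disjoint
 * multi-planar image by chaining VkImagePlaneMemoryRequirementsInfo. */
static void queryPlane1Requirements(VkDevice device, VkImage disjointImage,
                                    VkMemoryRequirements2* pReqs)
{
    VkImagePlaneMemoryRequirementsInfo planeInfo = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO,
        .pNext = NULL,
        .planeAspect = VK_IMAGE_ASPECT_PLANE_1_BIT,
    };
    VkImageMemoryRequirementsInfo2 info = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2,
        .pNext = &planeInfo,
        .image = disjointImage,
    };
    pReqs->sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2;
    pReqs->pNext = NULL;
    vkGetImageMemoryRequirements2(device, &info, pReqs);
    /* pReqs->memoryRequirements now describes plane 1 only. */
}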
The VkMemoryRequirements2
structure is defined as:
typedef struct VkMemoryRequirements2 {
VkStructureType sType;
void* pNext;
VkMemoryRequirements memoryRequirements;
} VkMemoryRequirements2;
or the equivalent
typedef VkMemoryRequirements2 VkMemoryRequirements2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- memoryRequirements is a structure of type VkMemoryRequirements describing the memory requirements of the resource.
To determine the dedicated allocation requirements of a buffer or image
resource, add a VkMemoryDedicatedRequirements structure to the
pNext
chain of the VkMemoryRequirements2 structure passed as the
pMemoryRequirements
parameter of vkGetBufferMemoryRequirements2
or vkGetImageMemoryRequirements2
.
The VkMemoryDedicatedRequirements
structure is defined as:
typedef struct VkMemoryDedicatedRequirements {
VkStructureType sType;
void* pNext;
VkBool32 prefersDedicatedAllocation;
VkBool32 requiresDedicatedAllocation;
} VkMemoryDedicatedRequirements;
or the equivalent
typedef VkMemoryDedicatedRequirements VkMemoryDedicatedRequirementsKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- prefersDedicatedAllocation specifies that the implementation would prefer a dedicated allocation for this resource. The application is still free to suballocate the resource, but it may get better performance if a dedicated allocation is used.
- requiresDedicatedAllocation specifies that a dedicated allocation is required for this resource.
When the implementation sets requiresDedicatedAllocation
to
VK_TRUE
, it must also set prefersDedicatedAllocation
to
VK_TRUE
.
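As an informal illustration (not part of this Specification), the following C sketch chains a VkMemoryDedicatedRequirements structure into an image memory requirements query; the function name is illustrative only.
#include <vulkan/vulkan.h>

/* Sketch: query an image's memory requirements together with its
 * dedicated-allocation preferences. */
static void queryImageRequirements(VkDevice device, VkImage image)
{
    VkMemoryDedicatedRequirements dedicated = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS,
        .pNext = NULL,
    };
    VkMemoryRequirements2 reqs = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2,
        .pNext = &dedicated,
    };
    VkImageMemoryRequirementsInfo2 info = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2,
        .pNext = NULL,
        .image = image,
    };
    vkGetImageMemoryRequirements2(device, &info, &reqs);

    if (dedicated.requiresDedicatedAllocation || dedicated.prefersDedicatedAllocation) {
        /* Allocate a VkDeviceMemory object dedicated to this image, typically
         * by chaining VkMemoryDedicatedAllocateInfo into VkMemoryAllocateInfo. */
    }
}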
If the VkMemoryDedicatedRequirements
structure is included in the
pNext
chain of the VkMemoryRequirements2 structure passed as the
pMemoryRequirements
parameter of a
vkGetBufferMemoryRequirements2
call, requiresDedicatedAllocation
may be VK_TRUE
under one of the following conditions:
- The pNext chain of VkBufferCreateInfo for the call to vkCreateBuffer used to create the buffer being queried contained an instance of VkExternalMemoryBufferCreateInfo, and any of the handle types specified in VkExternalMemoryBufferCreateInfo::handleTypes requires dedicated allocation, as reported by vkGetPhysicalDeviceExternalBufferProperties in VkExternalBufferProperties::externalMemoryProperties::externalMemoryFeatures. In this case, the requiresDedicatedAllocation field will be set to VK_TRUE.
In all other cases, requiresDedicatedAllocation
must be set to
VK_FALSE
by the implementation whenever a
VkMemoryDedicatedRequirements
structure is included in the pNext
chain of the VkMemoryRequirements2
structure passed to a call to
vkGetBufferMemoryRequirements2
.
If the VkMemoryDedicatedRequirements
structure is included in the
pNext
chain of the VkMemoryRequirements2
structure passed as the
pMemoryRequirements
parameter of a
vkGetBufferMemoryRequirements2
call and
VK_BUFFER_CREATE_SPARSE_BINDING_BIT
was set in
VkBufferCreateInfo
::flags
when buffer
was created then the
implementation must set both prefersDedicatedAllocation
and
requiresDedicatedAllocation
to VK_FALSE
.
If the VkMemoryDedicatedRequirements
structure is included in the
pNext
chain of the VkMemoryRequirements2
structure passed as the
pMemoryRequirements
parameter of a vkGetImageMemoryRequirements2
call, requiresDedicatedAllocation
may be VK_TRUE
under one of
the following conditions:
- The pNext chain of VkImageCreateInfo for the call to vkCreateImage used to create the image being queried contained an instance of VkExternalMemoryImageCreateInfo, and any of the handle types specified in VkExternalMemoryImageCreateInfo::handleTypes requires dedicated allocation, as reported by vkGetPhysicalDeviceImageFormatProperties2 in VkExternalImageFormatProperties::externalMemoryProperties::externalMemoryFeatures. In this case, the requiresDedicatedAllocation field will be set to VK_TRUE.
In all other cases, requiresDedicatedAllocation
must be set to
VK_FALSE
by the implementation whenever a
VkMemoryDedicatedRequirements
structure is included in the pNext
chain of the VkMemoryRequirements2
structure passed to a call to
vkGetImageMemoryRequirements2
.
If the VkMemoryDedicatedRequirements
structure is included in the
pNext
chain of the VkMemoryRequirements2
structure passed as the
pMemoryRequirements
parameter of a vkGetImageMemoryRequirements2
call and VK_IMAGE_CREATE_SPARSE_BINDING_BIT
was set in
VkImageCreateInfo
::flags
when image
was created then the
implementation must set both prefersDedicatedAllocation
and
requiresDedicatedAllocation
to VK_FALSE
.
To attach memory to a buffer object, call:
VkResult vkBindBufferMemory(
VkDevice device,
VkBuffer buffer,
VkDeviceMemory memory,
VkDeviceSize memoryOffset);
- device is the logical device that owns the buffer and memory.
- buffer is the buffer to be attached to memory.
- memory is a VkDeviceMemory object describing the device memory to attach.
- memoryOffset is the start offset of the region of memory which is to be bound to the buffer. The number of bytes returned in the VkMemoryRequirements::size member in memory, starting from memoryOffset bytes, will be bound to the specified buffer.
vkBindBufferMemory
is equivalent to passing the same parameters
through VkBindBufferMemoryInfo to vkBindBufferMemory2.
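As an informal illustration (not part of this Specification), the following C sketch allocates memory for a buffer and binds it at offset zero; findMemoryTypeIndex is the hypothetical helper sketched earlier, and the desired property flags are an assumption.
#include <vulkan/vulkan.h>

/* Sketch: allocate device-local memory sized for a buffer and bind it. */
static VkResult allocateAndBindBuffer(VkDevice device,
                                      VkPhysicalDevice physicalDevice,
                                      VkBuffer buffer,
                                      VkDeviceMemory* pMemory)
{
    VkMemoryRequirements reqs;
    vkGetBufferMemoryRequirements(device, buffer, &reqs);

    VkMemoryAllocateInfo allocInfo = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .pNext = NULL,
        .allocationSize = reqs.size,
        .memoryTypeIndex = findMemoryTypeIndex(physicalDevice,
                                               reqs.memoryTypeBits,
                                               VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
    };
    VkResult result = vkAllocateMemory(device, &allocInfo, NULL, pMemory);
    if (result != VK_SUCCESS)
        return result;

    /* memoryOffset must satisfy reqs.alignment; zero always does. */
    return vkBindBufferMemory(device, buffer, *pMemory, 0);
}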
To attach memory to buffer objects for one or more buffers at a time, call:
VkResult vkBindBufferMemory2(
VkDevice device,
uint32_t bindInfoCount,
const VkBindBufferMemoryInfo* pBindInfos);
or the equivalent command
VkResult vkBindBufferMemory2KHR(
VkDevice device,
uint32_t bindInfoCount,
const VkBindBufferMemoryInfo* pBindInfos);
- device is the logical device that owns the buffers and memory.
- bindInfoCount is the number of elements in pBindInfos.
- pBindInfos is a pointer to an array of structures of type VkBindBufferMemoryInfo, describing buffers and memory to bind.
On some implementations, it may be more efficient to batch memory bindings into a single command.
VkBindBufferMemoryInfo
contains members corresponding to the
parameters of vkBindBufferMemory.
The VkBindBufferMemoryInfo
structure is defined as:
typedef struct VkBindBufferMemoryInfo {
VkStructureType sType;
const void* pNext;
VkBuffer buffer;
VkDeviceMemory memory;
VkDeviceSize memoryOffset;
} VkBindBufferMemoryInfo;
or the equivalent
typedef VkBindBufferMemoryInfo VkBindBufferMemoryInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- buffer is the buffer to be attached to memory.
- memory is a VkDeviceMemory object describing the device memory to attach.
- memoryOffset is the start offset of the region of memory which is to be bound to the buffer. The number of bytes returned in the VkMemoryRequirements::size member in memory, starting from memoryOffset bytes, will be bound to the specified buffer.
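As an informal illustration (not part of this Specification), the following C sketch binds two buffers into one shared allocation with a single vkBindBufferMemory2 call; the handles and the offset are illustrative, and the offsets are assumed to already satisfy each buffer's alignment and memoryTypeBits requirements.
#include <vulkan/vulkan.h>

/* Sketch: batch two buffer bindings into one command. */
static VkResult bindTwoBuffers(VkDevice device, VkDeviceMemory sharedMemory,
                               VkBuffer vertexBuffer, VkBuffer indexBuffer,
                               VkDeviceSize indexBufferOffset)
{
    VkBindBufferMemoryInfo bindInfos[2] = {
        {
            .sType = VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO,
            .pNext = NULL,
            .buffer = vertexBuffer,
            .memory = sharedMemory,
            .memoryOffset = 0,
        },
        {
            .sType = VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO,
            .pNext = NULL,
            .buffer = indexBuffer,
            .memory = sharedMemory,
            .memoryOffset = indexBufferOffset,
        },
    };
    return vkBindBufferMemory2(device, 2, bindInfos);
}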
If the pNext chain of VkBindBufferMemoryInfo includes a VkBindBufferMemoryDeviceGroupInfo structure, then that structure determines how memory is bound to buffers across multiple devices in a device group.
The VkBindBufferMemoryDeviceGroupInfo structure is defined as:
typedef struct VkBindBufferMemoryDeviceGroupInfo {
VkStructureType sType;
const void* pNext;
uint32_t deviceIndexCount;
const uint32_t* pDeviceIndices;
} VkBindBufferMemoryDeviceGroupInfo;
or the equivalent
typedef VkBindBufferMemoryDeviceGroupInfo VkBindBufferMemoryDeviceGroupInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- deviceIndexCount is the number of elements in pDeviceIndices.
- pDeviceIndices is a pointer to an array of device indices.
If deviceIndexCount
is greater than zero, then on device index i
the buffer is attached to the instance of memory
on the physical
device with device index pDeviceIndices[i].
If deviceIndexCount
is zero and memory
comes from a memory heap
with the VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
bit set, then it is as if
pDeviceIndices
contains consecutive indices from zero to the number of
physical devices in the logical device, minus one.
In other words, by default each physical device attaches to its own instance
of memory
.
If deviceIndexCount
is zero and memory
comes from a memory heap
without the VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
bit set, then it is as
if pDeviceIndices
contains an array of zeros.
In other words, by default each physical device attaches to instance zero.
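As an informal illustration (not part of this Specification), the following C sketch binds a buffer in a two-GPU device group so that each physical device attaches to the other device's memory instance; the function name is illustrative, and the memory type is assumed to support the required peer access.
#include <vulkan/vulkan.h>

/* Sketch: cross-device binding via pDeviceIndices in a two-device group. */
static VkResult bindBufferCrossDevice(VkDevice device, VkBuffer buffer,
                                      VkDeviceMemory memory)
{
    /* device index 0 uses memory instance 1, device index 1 uses instance 0 */
    const uint32_t deviceIndices[2] = { 1, 0 };

    VkBindBufferMemoryDeviceGroupInfo groupInfo = {
        .sType = VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_DEVICE_GROUP_INFO,
        .pNext = NULL,
        .deviceIndexCount = 2,
        .pDeviceIndices = deviceIndices,
    };
    VkBindBufferMemoryInfo bindInfo = {
        .sType = VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO,
        .pNext = &groupInfo,
        .buffer = buffer,
        .memory = memory,
        .memoryOffset = 0,
    };
    return vkBindBufferMemory2(device, 1, &bindInfo);
}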
To attach memory to a VkImage
object created without the
VK_IMAGE_CREATE_DISJOINT_BIT
set, call:
VkResult vkBindImageMemory(
VkDevice device,
VkImage image,
VkDeviceMemory memory,
VkDeviceSize memoryOffset);
- device is the logical device that owns the image and memory.
- image is the image.
- memory is the VkDeviceMemory object describing the device memory to attach.
- memoryOffset is the start offset of the region of memory which is to be bound to the image. The number of bytes returned in the VkMemoryRequirements::size member in memory, starting from memoryOffset bytes, will be bound to the specified image.
vkBindImageMemory
is equivalent to passing the same parameters through
VkBindImageMemoryInfo to vkBindImageMemory2.
To attach memory to image objects for one or more images at a time, call:
VkResult vkBindImageMemory2(
VkDevice device,
uint32_t bindInfoCount,
const VkBindImageMemoryInfo* pBindInfos);
or the equivalent command
VkResult vkBindImageMemory2KHR(
VkDevice device,
uint32_t bindInfoCount,
const VkBindImageMemoryInfo* pBindInfos);
- device is the logical device that owns the images and memory.
- bindInfoCount is the number of elements in pBindInfos.
- pBindInfos is a pointer to an array of structures of type VkBindImageMemoryInfo, describing images and memory to bind.
On some implementations, it may be more efficient to batch memory bindings into a single command.
VkBindImageMemoryInfo
contains members corresponding to the parameters
of vkBindImageMemory.
The VkBindImageMemoryInfo
structure is defined as:
typedef struct VkBindImageMemoryInfo {
VkStructureType sType;
const void* pNext;
VkImage image;
VkDeviceMemory memory;
VkDeviceSize memoryOffset;
} VkBindImageMemoryInfo;
or the equivalent
typedef VkBindImageMemoryInfo VkBindImageMemoryInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- image is the image to be attached to memory.
- memory is a VkDeviceMemory object describing the device memory to attach.
- memoryOffset is the start offset of the region of memory which is to be bound to the image. The number of bytes returned in the VkMemoryRequirements::size member in memory, starting from memoryOffset bytes, will be bound to the specified image.
If the pNext chain of VkBindImageMemoryInfo includes a VkBindImageMemoryDeviceGroupInfo structure, then that structure determines how memory is bound to images across multiple devices in a device group.
The VkBindImageMemoryDeviceGroupInfo structure is defined as:
typedef struct VkBindImageMemoryDeviceGroupInfo {
VkStructureType sType;
const void* pNext;
uint32_t deviceIndexCount;
const uint32_t* pDeviceIndices;
uint32_t splitInstanceBindRegionCount;
const VkRect2D* pSplitInstanceBindRegions;
} VkBindImageMemoryDeviceGroupInfo;
or the equivalent
typedef VkBindImageMemoryDeviceGroupInfo VkBindImageMemoryDeviceGroupInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- deviceIndexCount is the number of elements in pDeviceIndices.
- pDeviceIndices is a pointer to an array of device indices.
- splitInstanceBindRegionCount is the number of elements in pSplitInstanceBindRegions.
- pSplitInstanceBindRegions is a pointer to an array of rectangles describing which regions of the image are attached to each instance of memory.
If deviceIndexCount
is greater than zero, then on device index i
image
is attached to the instance of the memory on the physical device
with device index pDeviceIndices[i].
Let N be the number of physical devices in the logical device.
If splitInstanceBindRegionCount
is greater than zero, then
pSplitInstanceBindRegions
is an array of N² rectangles, where
the image region specified by the rectangle at element i*N+j in
resource instance i is bound to the memory instance j.
The blocks of the memory that are bound to each sparse image block region
use an offset in memory, relative to memoryOffset
, computed as if the
whole image were being bound to a contiguous range of memory.
In other words, horizontally adjacent image blocks use consecutive blocks of
memory, vertically adjacent image blocks are separated by the number of
bytes per block multiplied by the width in blocks of image
, and the
block at (0,0) corresponds to memory starting at memoryOffset
.
If splitInstanceBindRegionCount
and deviceIndexCount
are zero
and the memory comes from a memory heap with the
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
bit set, then it is as if
pDeviceIndices
contains consecutive indices from zero to the number of
physical devices in the logical device, minus one.
In other words, by default each physical device attaches to its own instance
of the memory.
If splitInstanceBindRegionCount
and deviceIndexCount
are zero
and the memory comes from a memory heap without the
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
bit set, then it is as if
pDeviceIndices
contains an array of zeros.
In other words, by default each physical device attaches to instance zero.
If the pNext
chain of VkBindImageMemoryInfo includes a
VkBindImageMemorySwapchainInfoKHR
structure, then that structure
includes a swapchain handle and image index indicating that the image will
be bound to memory from that swapchain.
The VkBindImageMemorySwapchainInfoKHR
structure is defined as:
typedef struct VkBindImageMemorySwapchainInfoKHR {
VkStructureType sType;
const void* pNext;
VkSwapchainKHR swapchain;
uint32_t imageIndex;
} VkBindImageMemorySwapchainInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- swapchain is VK_NULL_HANDLE or a swapchain handle.
- imageIndex is an image index within swapchain.
If swapchain
is not NULL
, the swapchain
and imageIndex
are used to determine the memory that the image is bound to, instead of
memory
and memoryOffset
.
Memory can be bound to a swapchain and use the pDeviceIndices
or
pSplitInstanceBindRegions
members of
VkBindImageMemoryDeviceGroupInfo.
In order to bind planes of a disjoint image, include a
VkBindImagePlaneMemoryInfo
structure in the pNext
chain of
VkBindImageMemoryInfo.
The VkBindImagePlaneMemoryInfo
structure is defined as:
typedef struct VkBindImagePlaneMemoryInfo {
VkStructureType sType;
const void* pNext;
VkImageAspectFlagBits planeAspect;
} VkBindImagePlaneMemoryInfo;
or the equivalent
typedef VkBindImagePlaneMemoryInfo VkBindImagePlaneMemoryInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- planeAspect is the aspect of the disjoint image plane to bind.
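As an informal illustration (not part of this Specification), the following C sketch binds the two planes of a disjoint two-plane image to separate memory objects with a single vkBindImageMemory2 call; the function and parameter names are illustrative, and each memory object is assumed to satisfy the corresponding plane's memory requirements.
#include <vulkan/vulkan.h>

/* Sketch: bind plane 0 and plane 1 of a disjoint image separately. */
static VkResult bindDisjointPlanes(VkDevice device, VkImage image,
                                   VkDeviceMemory plane0Memory,
                                   VkDeviceMemory plane1Memory)
{
    VkBindImagePlaneMemoryInfo plane0 = {
        .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO,
        .pNext = NULL,
        .planeAspect = VK_IMAGE_ASPECT_PLANE_0_BIT,
    };
    VkBindImagePlaneMemoryInfo plane1 = plane0;
    plane1.planeAspect = VK_IMAGE_ASPECT_PLANE_1_BIT;

    VkBindImageMemoryInfo bindInfos[2] = {
        {
            .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO,
            .pNext = &plane0,
            .image = image,
            .memory = plane0Memory,
            .memoryOffset = 0,
        },
        {
            .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO,
            .pNext = &plane1,
            .image = image,
            .memory = plane1Memory,
            .memoryOffset = 0,
        },
    };
    return vkBindImageMemory2(device, 2, bindInfos);
}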
There is an implementation-dependent limit, bufferImageGranularity
,
which specifies a page-like granularity at which linear and non-linear
resources must be placed in adjacent memory locations to avoid aliasing.
Two resources which do not satisfy this granularity requirement are said to
alias.
bufferImageGranularity
is specified in bytes, and must be a power of
two.
Implementations which do not impose a granularity restriction may report a
bufferImageGranularity
value of one.
Note
Despite its name, bufferImageGranularity is really a granularity between "linear" and "non-linear" resources.
Given resourceA at the lower memory offset and resourceB at the higher
memory offset in the same VkDeviceMemory
object, where one resource is
linear and the other is non-linear (as defined in the
Glossary), and the following:
resourceA.end = resourceA.memoryOffset + resourceA.size - 1
resourceA.endPage = resourceA.end & ~(bufferImageGranularity-1)
resourceB.start = resourceB.memoryOffset
resourceB.startPage = resourceB.start & ~(bufferImageGranularity-1)
The following property must hold:
resourceA.endPage < resourceB.startPage
That is, the end of the first resource (A) and the beginning of the second
resource (B) must be on separate “pages” of size
bufferImageGranularity
.
bufferImageGranularity
may be different than the physical page size
of the memory heap.
This restriction is only needed when a linear resource and a non-linear
resource are adjacent in memory and will be used simultaneously.
The memory ranges of adjacent resources can be closer than
bufferImageGranularity
, provided they meet the alignment
requirement for the objects in question.
Sparse block size in bytes and sparse image and buffer memory alignments
must all be multiples of the bufferImageGranularity
.
Therefore, memory bound to sparse resources naturally satisfies the
bufferImageGranularity
.
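As an informal illustration (not part of this Specification), the following C sketch evaluates the page computation given above for a linear resource A placed at the lower offset and a non-linear resource B at the higher offset; the function and parameter names are illustrative only.
#include <stdbool.h>
#include <vulkan/vulkan.h>

/* Sketch: check the bufferImageGranularity property for two adjacent
 * resources in the same allocation (A at the lower offset). */
static bool satisfiesBufferImageGranularity(VkDeviceSize aOffset, VkDeviceSize aSize,
                                            VkDeviceSize bOffset,
                                            VkDeviceSize bufferImageGranularity)
{
    VkDeviceSize aEnd       = aOffset + aSize - 1;
    VkDeviceSize aEndPage   = aEnd & ~(bufferImageGranularity - 1);
    VkDeviceSize bStartPage = bOffset & ~(bufferImageGranularity - 1);
    return aEndPage < bStartPage; /* the two resources are on separate "pages" */
}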
11.7. Resource Sharing Mode
Buffer and image objects are created with a sharing mode controlling how they can be accessed from queues. The supported sharing modes are:
typedef enum VkSharingMode {
VK_SHARING_MODE_EXCLUSIVE = 0,
VK_SHARING_MODE_CONCURRENT = 1,
} VkSharingMode;
-
VK_SHARING_MODE_EXCLUSIVE
specifies that access to any range or image subresource of the object will be exclusive to a single queue family at a time. -
VK_SHARING_MODE_CONCURRENT
specifies that concurrent access to any range or image subresource of the object from multiple queue families is supported.
Ranges of buffers and image subresources of image objects created using
VK_SHARING_MODE_EXCLUSIVE
must only be accessed by queues in the
queue family that has ownership of the resource.
Upon creation, such resources are not owned by any queue family; ownership
is implicitly acquired upon first use within a queue.
Once a resource using VK_SHARING_MODE_EXCLUSIVE
is owned by some queue
family, the application must perform a
queue family ownership transfer to make
the memory contents of a range or image subresource accessible to a
different queue family.
Note
Images still require a layout transition from VK_IMAGE_LAYOUT_UNDEFINED or VK_IMAGE_LAYOUT_PREINITIALIZED before being used on the first queue.
A queue family can take ownership of an image subresource or buffer range
of a resource created with VK_SHARING_MODE_EXCLUSIVE
, without an
ownership transfer, in the same way as for a resource that was just created;
however, taking ownership in this way has the effect that the contents of
the image subresource or buffer range are undefined.
Ranges of buffers and image subresources of image objects created using
VK_SHARING_MODE_CONCURRENT
must only be accessed by queues from the
queue families specified through the queueFamilyIndexCount
and
pQueueFamilyIndices
members of the corresponding create info
structures.
11.7.1. External Resource Sharing
Resources should only be accessed in the Vulkan instance that has exclusive
ownership of their underlying memory.
Only one Vulkan instance has exclusive ownership of a resource’s underlying
memory at a given time, regardless of whether the resource was created using
VK_SHARING_MODE_EXCLUSIVE
or VK_SHARING_MODE_CONCURRENT
.
Applications can transfer ownership of a resource’s underlying memory only
if the memory has been imported from or exported to another instance or
external API using external memory handles.
The semantics for transferring ownership outside of the instance are similar
to those used for transferring ownership of VK_SHARING_MODE_EXCLUSIVE
resources between queues, and are also accomplished using
VkBufferMemoryBarrier or VkImageMemoryBarrier operations.
Applications must:
- Release exclusive ownership from the source instance or API.
- Ensure the release operation has completed using semaphores or fences.
- Acquire exclusive ownership in the destination instance or API.
Unlike queue ownership transfers, the destination instance or API is not
specified explicitly when releasing ownership, nor is the source instance or
API specified when acquiring ownership.
Instead, the image or memory barrier’s dstQueueFamilyIndex
or
srcQueueFamilyIndex
parameters are set to the reserved queue family
index VK_QUEUE_FAMILY_EXTERNAL
or VK_QUEUE_FAMILY_FOREIGN_EXT
to represent the external destination or source respectively.
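As an informal illustration (not part of this Specification), the following C sketch records a release barrier that hands ownership of an externally shared buffer to an external consumer; the function name, stage masks, and access mask are the author's choices, and a matching acquire barrier (with the queue family indices swapped) is assumed to be recorded by the consumer.
#include <vulkan/vulkan.h>

/* Sketch: release ownership of a buffer from srcQueueFamily to an
 * external destination. */
static void releaseBufferToExternal(VkCommandBuffer cmd, VkBuffer buffer,
                                    uint32_t srcQueueFamily)
{
    VkBufferMemoryBarrier barrier = {
        .sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
        .pNext = NULL,
        .srcAccessMask = VK_ACCESS_MEMORY_WRITE_BIT,
        .dstAccessMask = 0, /* the acquire side supplies the destination accesses */
        .srcQueueFamilyIndex = srcQueueFamily,
        .dstQueueFamilyIndex = VK_QUEUE_FAMILY_EXTERNAL,
        .buffer = buffer,
        .offset = 0,
        .size = VK_WHOLE_SIZE,
    };
    vkCmdPipelineBarrier(cmd,
                         VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
                         VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                         0,
                         0, NULL,
                         1, &barrier,
                         0, NULL);
}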
Binding a resource to a memory object shared between multiple Vulkan
instances or other APIs does not change the ownership of the underlying
memory.
The first entity to access the resource implicitly acquires ownership.
Accessing a resource backed by memory that is owned by a particular instance
or API has the same semantics as accessing a VK_SHARING_MODE_EXCLUSIVE
resource, with one exception: Implementations must ensure layout
transitions performed on one member of a set of identical subresources of
identical images that alias the same range of an underlying memory object
affect the layout of all the subresources in the set.
As a corollary, writes to any image subresources in such a set must not
make the contents of memory used by other subresources in the set
undefined.
An application can define the content of a subresource of one image by
performing device writes to an identical subresource of another image
provided both images are bound to the same region of external memory.
Applications may also add resources to such a set after the content of the
existing set members has been defined without making the content undefined
by creating a new image with the initial layout
VK_IMAGE_LAYOUT_UNDEFINED
and binding it to the same region of
external memory as the existing images.
Note
Because layout transitions apply to all identical images aliasing the same region of external memory, the actual layout of the memory backing a new image as well as an existing image with defined content will not be undefined. Such an image is not usable until it acquires ownership of its memory from the existing owner. Therefore, the layout specified as part of this transition will be the true initial layout of the image. The undefined layout specified when creating it is a placeholder to simplify valid usage requirements.
11.8. Memory Aliasing
A range of a VkDeviceMemory
allocation is aliased if it is bound to
multiple resources simultaneously, as described below, via
vkBindImageMemory, vkBindBufferMemory,
via sparse memory bindings, or by binding
the memory to resources in multiple Vulkan instances or external APIs using
external memory handle export and import mechanisms.
Consider two resources, resourceA and resourceB, bound respectively to
memory rangeA and rangeB.
Let paddedRangeA and paddedRangeB be, respectively, rangeA and
rangeB aligned to bufferImageGranularity
.
If the resources are both linear or both non-linear (as defined in the
Glossary), then the resources alias the
memory in the intersection of rangeA and rangeB.
If one resource is linear and the other is non-linear, then the resources
alias the memory in the intersection of paddedRangeA and paddedRangeB.
Applications can alias memory, but use of multiple aliases is subject to several constraints.
Note
Memory aliasing can be useful to reduce the total device memory footprint of an application, if some large resources are used for disjoint periods of time.
When a non-linear,
non-VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
image is bound to an aliased
range, all image subresources of the image overlap the range.
When a linear image is bound to an aliased range, the image subresources
that (according to the image’s advertised layout) include bytes from the
aliased range overlap the range.
When a VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
image has sparse image
blocks bound to an aliased range, only image subresources including those
sparse image blocks overlap the range, and when the memory bound to the
image’s mip tail overlaps an aliased range all image subresources in the mip
tail overlap the range.
Buffers, and linear image subresources in either the
VK_IMAGE_LAYOUT_PREINITIALIZED
or VK_IMAGE_LAYOUT_GENERAL
layouts, are host-accessible subresources.
That is, the host has a well-defined addressing scheme to interpret the
contents, and thus the layout of the data in memory can be consistently
interpreted across aliases if each of those aliases is a host-accessible
subresource.
Non-linear images, and linear image subresources in other layouts, are not
host-accessible.
If two aliases are both host-accessible, then they interpret the contents of the memory in consistent ways, and data written to one alias can be read by the other alias.
If two aliases are both images that were created with identical creation
parameters, both were created with the VK_IMAGE_CREATE_ALIAS_BIT
flag
set, and both are bound identically to memory
except for VkBindImageMemoryDeviceGroupInfo::pDeviceIndices
and
VkBindImageMemoryDeviceGroupInfo::pSplitInstanceBindRegions
,
then they interpret the contents of the memory in consistent ways, and data
written to one alias can be read by the other alias.
Additionally, if an individual plane of a multi-planar image and a single-plane image alias the same memory, then they also interpret the contents of the memory in consistent ways under the same conditions, but with the following modifications:
- Both must have been created with the VK_IMAGE_CREATE_DISJOINT_BIT flag.
- The single-plane image must have a VkFormat that is equivalent to that of the multi-planar image’s individual plane.
- The single-plane image and the individual plane of the multi-planar image must be bound identically to memory except for VkBindImageMemoryDeviceGroupInfo::pDeviceIndices and VkBindImageMemoryDeviceGroupInfo::pSplitInstanceBindRegions.
- The width and height of the single-plane image are derived from the multi-planar image’s dimensions in the manner listed for plane compatibility for the aliased plane.
- If either image’s tiling is VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT, then both images must be linear.
- All other creation parameters must be identical.
Aliases created by binding the same memory to resources in multiple Vulkan instances or external APIs using external memory handle export and import mechanisms interpret the contents of the memory in consistent ways, and data written to one alias can be read by the other alias.
Otherwise, the aliases interpret the contents of the memory differently, and writes via one alias make the contents of memory partially or completely undefined to the other alias. If the first alias is a host-accessible subresource, then the bytes affected are those written by the memory operations according to its addressing scheme. If the first alias is not host-accessible, then the bytes affected are those overlapped by the image subresources that were written. If the second alias is a host-accessible subresource, the affected bytes become undefined. If the second alias is not host-accessible, all sparse image blocks (for sparse partially-resident images) or all image subresources (for non-sparse images and fully-resident sparse images) that overlap the affected bytes become undefined.
If any image subresources are made undefined due to writes to an alias,
then each of those image subresources must have its layout transitioned
from VK_IMAGE_LAYOUT_UNDEFINED
to a valid layout before it is used, or
from VK_IMAGE_LAYOUT_PREINITIALIZED
if the memory has been written by
the host.
If any sparse blocks of a sparse image have been made undefined, then only
the image subresources containing them must be transitioned.
Use of an overlapping range by two aliases must be separated by a memory dependency using the appropriate access types if at least one of those uses performs writes, whether the aliases interpret memory consistently or not. If buffer or image memory barriers are used, the scope of the barrier must contain the entire range and/or set of image subresources that overlap.
If two aliasing image views are used in the same framebuffer, then the
render pass must declare the attachments using the
VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT
, and
follow the other rules listed in that section.
Note
Memory recycled via an application suballocator (i.e. without freeing and reallocating the memory objects) is not substantially different from memory aliasing. However, a suballocator usually waits on a fence before recycling a region of memory, and signaling a fence involves sufficient implicit dependencies to satisfy all the above requirements.
11.9. Acceleration Structures
Acceleration structures are opaque structures that are built by the implementation to more efficiently perform spatial queries on the provided geometric data. For this extension, an acceleration structure is either a top-level acceleration structure containing a set of bottom-level acceleration structures or a bottom-level acceleration structure containing either a set of axis-aligned bounding boxes for custom geometry or a set of triangles.
Each instance in the top-level acceleration structure contains a reference to a bottom-level acceleration structure as well as an instance transform plus information required to index into the shader bindings. The top-level acceleration structure is what is bound to the acceleration descriptor to trace inside the shader in the ray tracing pipeline.
Acceleration structures are represented by VkAccelerationStructureNV
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkAccelerationStructureNV)
To create acceleration structures, call:
VkResult vkCreateAccelerationStructureNV(
VkDevice device,
const VkAccelerationStructureCreateInfoNV* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkAccelerationStructureNV* pAccelerationStructure);
- device is the logical device that creates the acceleration structure object.
- pCreateInfo is a pointer to an instance of the VkAccelerationStructureCreateInfoNV structure containing parameters affecting creation of the acceleration structure.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pAccelerationStructure points to a VkAccelerationStructureNV handle in which the resulting acceleration structure object is returned.
Similar to other objects in Vulkan, the acceleration structure creation
merely creates an object with a specific “shape” as specified by the
information in VkAccelerationStructureInfoNV and compactedSize
in pCreateInfo
.
Populating the data in the object after allocating and binding memory is
done with vkCmdBuildAccelerationStructureNV and
vkCmdCopyAccelerationStructureNV.
Acceleration structure creation uses the count and type information from the geometries, but does not use the data references in the structures.
The VkAccelerationStructureCreateInfoNV
structure is defined as:
typedef struct VkAccelerationStructureCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkDeviceSize compactedSize;
VkAccelerationStructureInfoNV info;
} VkAccelerationStructureCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- compactedSize is the size from the result of vkCmdWriteAccelerationStructuresPropertiesNV if this acceleration structure is going to be the target of a compacting copy.
- info is the VkAccelerationStructureInfoNV structure that specifies further parameters of the created acceleration structure.
The VkAccelerationStructureInfoNV
structure is defined as:
typedef struct VkAccelerationStructureInfoNV {
VkStructureType sType;
const void* pNext;
VkAccelerationStructureTypeNV type;
VkBuildAccelerationStructureFlagsNV flags;
uint32_t instanceCount;
uint32_t geometryCount;
const VkGeometryNV* pGeometries;
} VkAccelerationStructureInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- type is a VkAccelerationStructureTypeNV value specifying the type of acceleration structure that will be created.
- flags is a bitmask of VkBuildAccelerationStructureFlagBitsNV specifying additional parameters of the acceleration structure.
- instanceCount specifies the number of instances that will be in the new acceleration structure.
- geometryCount specifies the number of geometries that will be in the new acceleration structure.
- pGeometries is an array of VkGeometryNV structures, which contain the scene data being passed into the acceleration structure.
VkAccelerationStructureInfoNV
contains information that is used both
for acceleration structure creation with
vkCreateAccelerationStructureNV
and in combination with the actual
geometric data to build the acceleration structure with
vkCmdBuildAccelerationStructureNV.
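As an informal illustration (not part of this Specification), the following C sketch creates an empty bottom-level acceleration structure sized for one triangle geometry; the function name and the chosen build flag are the author's own, and the VK_NV_ray_tracing extension is assumed to be enabled.
#include <vulkan/vulkan.h>

/* Sketch: create a bottom-level acceleration structure for one geometry. */
static VkResult createBottomLevelAS(VkDevice device, const VkGeometryNV* pGeometry,
                                    VkAccelerationStructureNV* pAS)
{
    VkAccelerationStructureCreateInfoNV createInfo = {
        .sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_CREATE_INFO_NV,
        .pNext = NULL,
        .compactedSize = 0, /* not the target of a compacting copy */
        .info = {
            .sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_INFO_NV,
            .pNext = NULL,
            .type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_NV,
            .flags = VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_NV,
            .instanceCount = 0, /* instances are used only by top-level structures */
            .geometryCount = 1,
            .pGeometries = pGeometry,
        },
    };
    return vkCreateAccelerationStructureNV(device, &createInfo, NULL, pAS);
}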
Values which can be set in VkAccelerationStructureInfoNV::type
,
specifying the type of acceleration structure, are:
typedef enum VkAccelerationStructureTypeNV {
VK_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL_NV = 0,
VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_NV = 1,
} VkAccelerationStructureTypeNV;
- VK_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL_NV is a top-level acceleration structure containing instance data referring to bottom-level acceleration structures.
- VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_NV is a bottom-level acceleration structure containing the AABBs or geometry to be intersected.
Bits which can be set in VkAccelerationStructureInfoNV::flags
,
specifying additional parameters for acceleration structure builds, are:
typedef enum VkBuildAccelerationStructureFlagBitsNV {
VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_NV = 0x00000001,
VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_NV = 0x00000002,
VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_NV = 0x00000004,
VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_BUILD_BIT_NV = 0x00000008,
VK_BUILD_ACCELERATION_STRUCTURE_LOW_MEMORY_BIT_NV = 0x00000010,
} VkBuildAccelerationStructureFlagBitsNV;
- VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_NV indicates that the specified acceleration structure can be updated with update of VK_TRUE in vkCmdBuildAccelerationStructureNV.
- VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_NV indicates that the specified acceleration structure can act as the source for vkCmdCopyAccelerationStructureNV with mode of VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_NV to produce a compacted acceleration structure.
- VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_NV indicates that the given acceleration structure build should prioritize trace performance over build time.
- VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_BUILD_BIT_NV indicates that the given acceleration structure build should prioritize build time over trace performance.
- VK_BUILD_ACCELERATION_STRUCTURE_LOW_MEMORY_BIT_NV indicates that this acceleration structure should minimize the size of the scratch memory and the final result build, potentially at the expense of build time or trace performance.
typedef VkFlags VkBuildAccelerationStructureFlagsNV;
VkBuildAccelerationStructureFlagsNV
is a bitmask type for setting a
mask of zero or more VkBuildAccelerationStructureFlagBitsNV.
The VkGeometryNV
structure is defined as:
typedef struct VkGeometryNV {
VkStructureType sType;
const void* pNext;
VkGeometryTypeNV geometryType;
VkGeometryDataNV geometry;
VkGeometryFlagsNV flags;
} VkGeometryNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- geometryType describes which type of geometry this VkGeometryNV refers to.
- geometry contains the geometry data as described in VkGeometryDataNV.
- flags has flags describing options for this geometry.
Geometry types are specified by VkGeometryTypeNV, which takes values:
typedef enum VkGeometryTypeNV {
VK_GEOMETRY_TYPE_TRIANGLES_NV = 0,
VK_GEOMETRY_TYPE_AABBS_NV = 1,
} VkGeometryTypeNV;
- VK_GEOMETRY_TYPE_TRIANGLES_NV indicates that the triangles of VkGeometryDataNV contains valid data.
- VK_GEOMETRY_TYPE_AABBS_NV indicates that the aabbs of VkGeometryDataNV contains valid data.
Bits which can be set in VkGeometryNV::flags
, specifying
additional parameters for acceleration structure builds, are:
typedef enum VkGeometryFlagBitsNV {
VK_GEOMETRY_OPAQUE_BIT_NV = 0x00000001,
VK_GEOMETRY_NO_DUPLICATE_ANY_HIT_INVOCATION_BIT_NV = 0x00000002,
} VkGeometryFlagBitsNV;
-
VK_GEOMETRY_OPAQUE_BIT_NV
indicates that this geometry does not invoke the any-hit shaders even if present in a hit group. -
VK_GEOMETRY_NO_DUPLICATE_ANY_HIT_INVOCATION_BIT_NV
indicates that the implementation must only call the any-hit shader a single time for each primitive in this geometry. If this bit is absent an implementation may invoke the any-hit shader more than once for this geometry.
typedef VkFlags VkGeometryFlagsNV;
VkGeometryFlagsNV
is a bitmask type for setting a mask of zero or more
VkGeometryFlagBitsNV.
The VkGeometryDataNV
structure is defined as:
typedef struct VkGeometryDataNV {
VkGeometryTrianglesNV triangles;
VkGeometryAABBNV aabbs;
} VkGeometryDataNV;
- triangles contains triangle data if VkGeometryNV::geometryType is VK_GEOMETRY_TYPE_TRIANGLES_NV.
- aabbs contains axis-aligned bounding box data if VkGeometryNV::geometryType is VK_GEOMETRY_TYPE_AABBS_NV.
The VkGeometryTrianglesNV
structure is defined as:
typedef struct VkGeometryTrianglesNV {
VkStructureType sType;
const void* pNext;
VkBuffer vertexData;
VkDeviceSize vertexOffset;
uint32_t vertexCount;
VkDeviceSize vertexStride;
VkFormat vertexFormat;
VkBuffer indexData;
VkDeviceSize indexOffset;
uint32_t indexCount;
VkIndexType indexType;
VkBuffer transformData;
VkDeviceSize transformOffset;
} VkGeometryTrianglesNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- vertexData is the buffer containing vertex data for this geometry.
- vertexOffset is the offset in bytes within vertexData containing vertex data for this geometry.
- vertexCount is the number of valid vertices.
- vertexStride is the stride in bytes between each vertex.
- vertexFormat is the format of each vertex element.
- indexData is the buffer containing index data for this geometry.
- indexOffset is the offset in bytes within indexData containing index data for this geometry.
- indexCount is the number of indices to include in this geometry.
- indexType is the format of each index.
- transformData is a buffer containing an optional reference to an array of 32-bit floats representing a 3x4 row major affine transformation matrix for this geometry.
- transformOffset is the offset in bytes in transformData of the transform information described above.
If indexType
is VK_INDEX_TYPE_NONE_NV
, then this structure
describes a set of triangles determined by vertexCount
.
Otherwise, this structure describes a set of indexed triangles determined by
indexCount
.
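As an informal illustration (not part of this Specification), the following C sketch fills a VkGeometryNV describing an indexed triangle mesh; the function name, counts, stride, and formats are illustrative assumptions, and no per-geometry transform is supplied.
#include <vulkan/vulkan.h>

/* Sketch: describe an indexed triangle mesh for acceleration structure builds. */
static VkGeometryNV makeTriangleGeometry(VkBuffer vertexBuffer, uint32_t vertexCount,
                                         VkBuffer indexBuffer, uint32_t indexCount)
{
    VkGeometryNV geometry = {
        .sType = VK_STRUCTURE_TYPE_GEOMETRY_NV,
        .pNext = NULL,
        .geometryType = VK_GEOMETRY_TYPE_TRIANGLES_NV,
        .geometry = {
            .triangles = {
                .sType = VK_STRUCTURE_TYPE_GEOMETRY_TRIANGLES_NV,
                .pNext = NULL,
                .vertexData = vertexBuffer,
                .vertexOffset = 0,
                .vertexCount = vertexCount,
                .vertexStride = 3 * sizeof(float),          /* tightly packed positions */
                .vertexFormat = VK_FORMAT_R32G32B32_SFLOAT,
                .indexData = indexBuffer,
                .indexOffset = 0,
                .indexCount = indexCount,
                .indexType = VK_INDEX_TYPE_UINT32,
                .transformData = VK_NULL_HANDLE,            /* no per-geometry transform */
                .transformOffset = 0,
            },
            .aabbs = {
                .sType = VK_STRUCTURE_TYPE_GEOMETRY_AABB_NV,
                .pNext = NULL,
            },
        },
        .flags = VK_GEOMETRY_OPAQUE_BIT_NV,
    };
    return geometry;
}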
The VkGeometryAABBNV
structure is defined as:
typedef struct VkGeometryAABBNV {
VkStructureType sType;
const void* pNext;
VkBuffer aabbData;
uint32_t numAABBs;
uint32_t stride;
VkDeviceSize offset;
} VkGeometryAABBNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- aabbData is the buffer containing axis-aligned bounding box data.
- numAABBs is the number of AABBs in this geometry.
- stride is the stride in bytes between AABBs in aabbData.
- offset is the offset in bytes of the first AABB in aabbData.
The AABB data in memory is six 32-bit floats consisting of the minimum x, y, and z values followed by the maximum x, y, and z values.
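Expressed as a C structure (purely illustrative; the API only reads raw buffer bytes and this type name is the author's own), each AABB therefore occupies 24 bytes:
/* Illustrative memory layout of one AABB as consumed from aabbData. */
typedef struct AabbPositionsNV {
    float minX, minY, minZ; /* minimum corner */
    float maxX, maxY, maxZ; /* maximum corner */
} AabbPositionsNV;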
To destroy an acceleration structure, call:
void vkDestroyAccelerationStructureNV(
VkDevice device,
VkAccelerationStructureNV accelerationStructure,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the acceleration structure.
- accelerationStructure is the acceleration structure to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
An acceleration structure has memory requirements for the structure object itself, scratch space for the build, and scratch space for the update.
To query the memory requirements call:
void vkGetAccelerationStructureMemoryRequirementsNV(
VkDevice device,
const VkAccelerationStructureMemoryRequirementsInfoNV* pInfo,
VkMemoryRequirements2KHR* pMemoryRequirements);
- device is the logical device on which the acceleration structure was created.
- pInfo specifies the acceleration structure to get memory requirements for.
- pMemoryRequirements returns the requested acceleration structure memory requirements.
The VkAccelerationStructureMemoryRequirementsInfoNV
structure is
defined as:
typedef struct VkAccelerationStructureMemoryRequirementsInfoNV {
VkStructureType sType;
const void* pNext;
VkAccelerationStructureMemoryRequirementsTypeNV type;
VkAccelerationStructureNV accelerationStructure;
} VkAccelerationStructureMemoryRequirementsInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- type selects the type of memory requirement being queried. VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_OBJECT_NV returns the memory requirements for the object itself. VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_BUILD_SCRATCH_NV returns the memory requirements for the scratch memory when doing a build. VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_UPDATE_SCRATCH_NV returns the memory requirements for the scratch memory when doing an update.
- accelerationStructure is the acceleration structure to be queried for memory requirements.
Possible values of type
in
VkAccelerationStructureMemoryRequirementsInfoNV
are:
typedef enum VkAccelerationStructureMemoryRequirementsTypeNV {
VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_OBJECT_NV = 0,
VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_BUILD_SCRATCH_NV = 1,
VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_UPDATE_SCRATCH_NV = 2,
} VkAccelerationStructureMemoryRequirementsTypeNV;
- VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_OBJECT_NV requests the memory requirement for the VkAccelerationStructureNV backing store.
- VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_BUILD_SCRATCH_NV requests the memory requirement for scratch space during the initial build.
- VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_UPDATE_SCRATCH_NV requests the memory requirement for scratch space during an update.
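As an informal illustration (not part of this Specification), the following C sketch queries both the object and build-scratch requirements of an acceleration structure; the function name is illustrative only.
#include <vulkan/vulkan.h>

/* Sketch: query object and build-scratch memory requirements. */
static void queryASRequirements(VkDevice device, VkAccelerationStructureNV as,
                                VkMemoryRequirements2KHR* pObjectReqs,
                                VkMemoryRequirements2KHR* pBuildScratchReqs)
{
    VkAccelerationStructureMemoryRequirementsInfoNV info = {
        .sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_INFO_NV,
        .pNext = NULL,
        .type = VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_OBJECT_NV,
        .accelerationStructure = as,
    };
    pObjectReqs->sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2_KHR;
    pObjectReqs->pNext = NULL;
    vkGetAccelerationStructureMemoryRequirementsNV(device, &info, pObjectReqs);

    info.type = VK_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_TYPE_BUILD_SCRATCH_NV;
    pBuildScratchReqs->sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2_KHR;
    pBuildScratchReqs->pNext = NULL;
    vkGetAccelerationStructureMemoryRequirementsNV(device, &info, pBuildScratchReqs);
}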
To attach memory to one or more acceleration structures at a time, call:
VkResult vkBindAccelerationStructureMemoryNV(
VkDevice device,
uint32_t bindInfoCount,
const VkBindAccelerationStructureMemoryInfoNV* pBindInfos);
- device is the logical device that owns the acceleration structures and memory.
- bindInfoCount is the number of elements in pBindInfos.
- pBindInfos is a pointer to an array of structures of type VkBindAccelerationStructureMemoryInfoNV, describing acceleration structures and memory to bind.
The VkBindAccelerationStructureMemoryInfoNV
structure is defined as:
typedef struct VkBindAccelerationStructureMemoryInfoNV {
VkStructureType sType;
const void* pNext;
VkAccelerationStructureNV accelerationStructure;
VkDeviceMemory memory;
VkDeviceSize memoryOffset;
uint32_t deviceIndexCount;
const uint32_t* pDeviceIndices;
} VkBindAccelerationStructureMemoryInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- accelerationStructure is the acceleration structure to be attached to memory.
- memory is a VkDeviceMemory object describing the device memory to attach.
- memoryOffset is the start offset of the region of memory that is to be bound to the acceleration structure. The number of bytes returned in the VkMemoryRequirements::size member in memory, starting from memoryOffset bytes, will be bound to the specified acceleration structure.
- deviceIndexCount is the number of elements in pDeviceIndices.
- pDeviceIndices is a pointer to an array of device indices.
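As an informal illustration (not part of this Specification), the following C sketch allocates memory for the acceleration structure object requirements obtained above and binds it; findMemoryTypeIndex is the hypothetical helper sketched earlier, and the property flags are an assumption.
#include <vulkan/vulkan.h>

/* Sketch: allocate and bind backing memory for an acceleration structure. */
static VkResult allocateAndBindAS(VkDevice device, VkPhysicalDevice physicalDevice,
                                  VkAccelerationStructureNV as,
                                  const VkMemoryRequirements* pReqs,
                                  VkDeviceMemory* pMemory)
{
    VkMemoryAllocateInfo allocInfo = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .pNext = NULL,
        .allocationSize = pReqs->size,
        .memoryTypeIndex = findMemoryTypeIndex(physicalDevice, pReqs->memoryTypeBits,
                                               VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
    };
    VkResult result = vkAllocateMemory(device, &allocInfo, NULL, pMemory);
    if (result != VK_SUCCESS)
        return result;

    VkBindAccelerationStructureMemoryInfoNV bindInfo = {
        .sType = VK_STRUCTURE_TYPE_BIND_ACCELERATION_STRUCTURE_MEMORY_INFO_NV,
        .pNext = NULL,
        .accelerationStructure = as,
        .memory = *pMemory,
        .memoryOffset = 0,
        .deviceIndexCount = 0,
        .pDeviceIndices = NULL,
    };
    return vkBindAccelerationStructureMemoryNV(device, 1, &bindInfo);
}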
To allow constructing geometry instances with device code if desired, we need to be able to query an opaque handle for an acceleration structure. This handle is a value of 8 bytes. To get this handle, call:
VkResult vkGetAccelerationStructureHandleNV(
VkDevice device,
VkAccelerationStructureNV accelerationStructure,
size_t dataSize,
void* pData);
- device is the logical device that owns the acceleration structures.
- accelerationStructure is the acceleration structure.
- dataSize is the size in bytes of the buffer pointed to by pData.
- pData is a pointer to a user-allocated buffer where the results will be written.
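As an informal illustration (not part of this Specification), the following C sketch retrieves the 8-byte handle into a uint64_t; the wrapper function name is illustrative only.
#include <stdint.h>
#include <vulkan/vulkan.h>

/* Sketch: fetch the opaque 8-byte handle used to reference a bottom-level
 * acceleration structure from instance data. */
static VkResult getASHandle(VkDevice device, VkAccelerationStructureNV as,
                            uint64_t* pHandle)
{
    return vkGetAccelerationStructureHandleNV(device, as, sizeof(uint64_t), pHandle);
}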
12. Samplers
VkSampler
objects represent the state of an image sampler which is
used by the implementation to read image data and apply filtering and other
transformations for the shader.
Samplers are represented by VkSampler
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSampler)
To create a sampler object, call:
VkResult vkCreateSampler(
VkDevice device,
const VkSamplerCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSampler* pSampler);
-
device
is the logical device that creates the sampler. -
pCreateInfo
is a pointer to an instance of the VkSamplerCreateInfo structure specifying the state of the sampler object. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pSampler
points to a VkSampler handle in which the resulting sampler object is returned.
The VkSamplerCreateInfo
structure is defined as:
typedef struct VkSamplerCreateInfo {
VkStructureType sType;
const void* pNext;
VkSamplerCreateFlags flags;
VkFilter magFilter;
VkFilter minFilter;
VkSamplerMipmapMode mipmapMode;
VkSamplerAddressMode addressModeU;
VkSamplerAddressMode addressModeV;
VkSamplerAddressMode addressModeW;
float mipLodBias;
VkBool32 anisotropyEnable;
float maxAnisotropy;
VkBool32 compareEnable;
VkCompareOp compareOp;
float minLod;
float maxLod;
VkBorderColor borderColor;
VkBool32 unnormalizedCoordinates;
} VkSamplerCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkSamplerCreateFlagBits describing additional parameters of the sampler.
- magFilter is a VkFilter value specifying the magnification filter to apply to lookups.
- minFilter is a VkFilter value specifying the minification filter to apply to lookups.
- mipmapMode is a VkSamplerMipmapMode value specifying the mipmap filter to apply to lookups.
- addressModeU is a VkSamplerAddressMode value specifying the addressing mode for outside [0..1] range for U coordinate.
- addressModeV is a VkSamplerAddressMode value specifying the addressing mode for outside [0..1] range for V coordinate.
- addressModeW is a VkSamplerAddressMode value specifying the addressing mode for outside [0..1] range for W coordinate.
- mipLodBias is the bias to be added to mipmap LOD (level-of-detail) calculation and bias provided by image sampling functions in SPIR-V, as described in the Level-of-Detail Operation section.
- anisotropyEnable is VK_TRUE to enable anisotropic filtering, as described in the Texel Anisotropic Filtering section, or VK_FALSE otherwise.
- maxAnisotropy is the anisotropy value clamp used by the sampler when anisotropyEnable is VK_TRUE. If anisotropyEnable is VK_FALSE, maxAnisotropy is ignored.
- compareEnable is VK_TRUE to enable comparison against a reference value during lookups, or VK_FALSE otherwise.
  - Note: Some implementations will default to shader state if this member does not match.
- compareOp is a VkCompareOp value specifying the comparison function to apply to fetched data before filtering as described in the Depth Compare Operation section.
- minLod and maxLod are the values used to clamp the computed LOD value, as described in the Level-of-Detail Operation section.
- borderColor is a VkBorderColor value specifying the predefined border color to use.
- unnormalizedCoordinates controls whether to use unnormalized or normalized texel coordinates to address texels of the image. When set to VK_TRUE, the range of the image coordinates used to lookup the texel is in the range of zero to the image dimensions for x, y and z. When set to VK_FALSE, the range of image coordinates is zero to one. When unnormalizedCoordinates is VK_TRUE, samplers have the following requirements:
  - minFilter and magFilter must be equal.
  - mipmapMode must be VK_SAMPLER_MIPMAP_MODE_NEAREST.
  - minLod and maxLod must be zero.
  - addressModeU and addressModeV must each be either VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE or VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER.
  - anisotropyEnable must be VK_FALSE.
  - compareEnable must be VK_FALSE.
  - The sampler must not enable sampler Y’CBCR conversion.
  - flags must not include VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT.
- When unnormalizedCoordinates is VK_TRUE, images the sampler is used with in the shader have the following requirements:
  - The viewType must be either VK_IMAGE_VIEW_TYPE_1D or VK_IMAGE_VIEW_TYPE_2D.
  - The image view must have a single layer and a single mip level.
- When unnormalizedCoordinates is VK_TRUE, image built-in functions in the shader that use the sampler have the following requirements:
  - The functions must not use projection.
  - The functions must not use offsets.
Note
Mapping of OpenGL to Vulkan filter modes: There are no Vulkan filter modes that directly correspond to OpenGL minification filters of GL_LINEAR or GL_NEAREST, but they can be emulated using VK_SAMPLER_MIPMAP_MODE_NEAREST, minLod = 0, and maxLod = 0.25, and using minFilter = VK_FILTER_LINEAR or minFilter = VK_FILTER_NEAREST, respectively. Note that using a maxLod of zero would cause magnification to always be performed.
The maximum number of sampler objects which can be simultaneously created
on a device is implementation-dependent and specified by the
maxSamplerAllocationCount
member of the VkPhysicalDeviceLimits structure.
If maxSamplerAllocationCount
is exceeded, vkCreateSampler
will
return VK_ERROR_TOO_MANY_OBJECTS
.
Since VkSampler is a non-dispatchable handle type, implementations
may return the same handle for sampler state vectors that are identical.
In such cases, all such objects would only count once against the
maxSamplerAllocationCount
limit.
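As an informal illustration (not part of this Specification), the following C sketch creates a trilinear sampler with repeat addressing; the function name and the particular parameter choices are the author's own.
#include <vulkan/vulkan.h>

/* Sketch: create a trilinear sampler with repeat addressing and normalized
 * coordinates. */
static VkResult createTrilinearSampler(VkDevice device, float maxLod, VkSampler* pSampler)
{
    VkSamplerCreateInfo createInfo = {
        .sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO,
        .pNext = NULL,
        .flags = 0,
        .magFilter = VK_FILTER_LINEAR,
        .minFilter = VK_FILTER_LINEAR,
        .mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR,
        .addressModeU = VK_SAMPLER_ADDRESS_MODE_REPEAT,
        .addressModeV = VK_SAMPLER_ADDRESS_MODE_REPEAT,
        .addressModeW = VK_SAMPLER_ADDRESS_MODE_REPEAT,
        .mipLodBias = 0.0f,
        .anisotropyEnable = VK_FALSE,
        .maxAnisotropy = 1.0f,
        .compareEnable = VK_FALSE,
        .compareOp = VK_COMPARE_OP_ALWAYS,
        .minLod = 0.0f,
        .maxLod = maxLod,
        .borderColor = VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK,
        .unnormalizedCoordinates = VK_FALSE,
    };
    return vkCreateSampler(device, &createInfo, NULL, pSampler);
}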
Bits which can be set in VkSamplerCreateInfo::flags
, specifying
additional parameters of a sampler, are:
typedef enum VkSamplerCreateFlagBits {
VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT = 0x00000001,
VK_SAMPLER_CREATE_SUBSAMPLED_COARSE_RECONSTRUCTION_BIT_EXT = 0x00000002,
} VkSamplerCreateFlagBits;
- VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT specifies that the sampler will read from an image created with flags containing VK_IMAGE_CREATE_SUBSAMPLED_BIT_EXT.
- VK_SAMPLER_CREATE_SUBSAMPLED_COARSE_RECONSTRUCTION_BIT_EXT specifies that the implementation may use approximations when reconstructing a full color value for texture access from a subsampled image.
Note
The approximations used when VK_SAMPLER_CREATE_SUBSAMPLED_COARSE_RECONSTRUCTION_BIT_EXT is specified are implementation defined.
typedef VkFlags VkSamplerCreateFlags;
VkSamplerCreateFlags
is a bitmask type for setting a mask of zero or
more VkSamplerCreateFlagBits.
If the pNext
chain of VkSamplerCreateInfo includes a
VkSamplerReductionModeCreateInfoEXT
structure, then that structure
includes a mode that controls how texture filtering combines texel values.
The VkSamplerReductionModeCreateInfoEXT
structure is defined as:
typedef struct VkSamplerReductionModeCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkSamplerReductionModeEXT reductionMode;
} VkSamplerReductionModeCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- reductionMode is an enum of type VkSamplerReductionModeEXT that controls how texture filtering combines texel values.
If this structure is not present, reductionMode
is considered to be
VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE_EXT
.
Reduction modes are specified by VkSamplerReductionModeEXT, which takes values:
typedef enum VkSamplerReductionModeEXT {
VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE_EXT = 0,
VK_SAMPLER_REDUCTION_MODE_MIN_EXT = 1,
VK_SAMPLER_REDUCTION_MODE_MAX_EXT = 2,
} VkSamplerReductionModeEXT;
-
VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE_EXT
specifies that texel values are combined by computing a weighted average of values in the footprint, using weights as specified in the image operations chapter. -
VK_SAMPLER_REDUCTION_MODE_MIN_EXT
specifies that texel values are combined by taking the component-wise minimum of values in the footprint with non-zero weights. -
VK_SAMPLER_REDUCTION_MODE_MAX_EXT
specifies that texel values are combined by taking the component-wise maximum of values in the footprint with non-zero weights.
Possible values of the VkSamplerCreateInfo::magFilter
and
minFilter
parameters, specifying filters used for texture lookups,
are:
typedef enum VkFilter {
VK_FILTER_NEAREST = 0,
VK_FILTER_LINEAR = 1,
VK_FILTER_CUBIC_IMG = 1000015000,
} VkFilter;
-
VK_FILTER_NEAREST
specifies nearest filtering. -
VK_FILTER_LINEAR
specifies linear filtering. -
VK_FILTER_CUBIC_IMG
specifies cubic filtering.
These filters are described in detail in Texel Filtering.
Possible values of the VkSamplerCreateInfo::mipmapMode
,
specifying the mipmap mode used for texture lookups, are:
typedef enum VkSamplerMipmapMode {
VK_SAMPLER_MIPMAP_MODE_NEAREST = 0,
VK_SAMPLER_MIPMAP_MODE_LINEAR = 1,
} VkSamplerMipmapMode;
-
VK_SAMPLER_MIPMAP_MODE_NEAREST
specifies nearest filtering. -
VK_SAMPLER_MIPMAP_MODE_LINEAR
specifies linear filtering.
These modes are described in detail in Texel Filtering.
Possible values of the VkSamplerCreateInfo::addressMode*
parameters, specifying the behavior of sampling with coordinates outside the
range [0,1] for the respective u, v, or w coordinate
as defined in the Wrapping Operation
section, are:
typedef enum VkSamplerAddressMode {
VK_SAMPLER_ADDRESS_MODE_REPEAT = 0,
VK_SAMPLER_ADDRESS_MODE_MIRRORED_REPEAT = 1,
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE = 2,
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER = 3,
VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE = 4,
} VkSamplerAddressMode;
-
VK_SAMPLER_ADDRESS_MODE_REPEAT
specifies that the repeat wrap mode will be used. -
VK_SAMPLER_ADDRESS_MODE_MIRRORED_REPEAT
specifies that the mirrored repeat wrap mode will be used. -
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE
specifies that the clamp to edge wrap mode will be used. -
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER
specifies that the clamp to border wrap mode will be used. -
VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE specifies that the mirror clamp to edge wrap mode will be used. This is only valid if the VK_KHR_sampler_mirror_clamp_to_edge extension is enabled.
Possible values of VkSamplerCreateInfo::borderColor
, specifying
the border color used for texture lookups, are:
typedef enum VkBorderColor {
VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK = 0,
VK_BORDER_COLOR_INT_TRANSPARENT_BLACK = 1,
VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK = 2,
VK_BORDER_COLOR_INT_OPAQUE_BLACK = 3,
VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE = 4,
VK_BORDER_COLOR_INT_OPAQUE_WHITE = 5,
} VkBorderColor;
-
VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK
specifies a transparent, floating-point format, black color. -
VK_BORDER_COLOR_INT_TRANSPARENT_BLACK
specifies a transparent, integer format, black color. -
VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK
specifies an opaque, floating-point format, black color. -
VK_BORDER_COLOR_INT_OPAQUE_BLACK
specifies an opaque, integer format, black color. -
VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE
specifies an opaque, floating-point format, white color. -
VK_BORDER_COLOR_INT_OPAQUE_WHITE
specifies an opaque, integer format, white color.
These colors are described in detail in Texel Replacement.
To destroy a sampler, call:
void vkDestroySampler(
VkDevice device,
VkSampler sampler,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the sampler. -
sampler
is the sampler to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
12.1. Sampler Y’CBCR conversion
To create a sampler with Y’CBCR conversion enabled, add a
VkSamplerYcbcrConversionInfo to the pNext
chain of the
VkSamplerCreateInfo structure.
To create a sampler Y’CBCR conversion, the
samplerYcbcrConversion
feature must be enabled.
Conversion must be fixed at pipeline creation time, through use of a
combined image sampler with an immutable sampler in
VkDescriptorSetLayoutBinding
.
A VkSamplerYcbcrConversionInfo must be provided for samplers to be
used with image views that access VK_IMAGE_ASPECT_COLOR_BIT
if the
format appears in Formats requiring sampler Y’CBCR conversion for VK_IMAGE_ASPECT_COLOR_BIT
image views
, or if the image view has an
external format
.
The VkSamplerYcbcrConversionInfo
structure is defined as:
typedef struct VkSamplerYcbcrConversionInfo {
VkStructureType sType;
const void* pNext;
VkSamplerYcbcrConversion conversion;
} VkSamplerYcbcrConversionInfo;
or the equivalent
typedef VkSamplerYcbcrConversionInfo VkSamplerYcbcrConversionInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- conversion is a VkSamplerYcbcrConversion handle created with vkCreateSamplerYcbcrConversion.
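A minimal informative sketch of enabling the conversion on a sampler follows; it assumes myConversion is a VkSamplerYcbcrConversion created with vkCreateSamplerYcbcrConversion (described below), and that mySamplerCreateInfo and myDevice are as in the earlier sampler sketch:
VkSamplerYcbcrConversionInfo myConversionInfo =
{
    VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO,    // sType
    NULL,                                               // pNext
    myConversion                                        // conversion
};

VkSamplerCreateInfo myYcbcrSamplerCreateInfo = mySamplerCreateInfo;  // start from the base sampler state
myYcbcrSamplerCreateInfo.pNext = &myConversionInfo;                  // enable Y'CBCR conversion

// Samplers that enable a conversion are restricted to clamp-to-edge addressing,
// no anisotropy, and normalized coordinates.
myYcbcrSamplerCreateInfo.addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
myYcbcrSamplerCreateInfo.addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
myYcbcrSamplerCreateInfo.addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
myYcbcrSamplerCreateInfo.anisotropyEnable = VK_FALSE;
myYcbcrSamplerCreateInfo.unnormalizedCoordinates = VK_FALSE;

VkSampler myYcbcrSampler;
VkResult myResult = vkCreateSampler(myDevice, &myYcbcrSamplerCreateInfo, NULL, &myYcbcrSampler);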
A sampler Y’CBCR conversion is an opaque representation of a
device-specific sampler Y’CBCR conversion description, represented as a
VkSamplerYcbcrConversion
handle:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSamplerYcbcrConversion)
or the equivalent
typedef VkSamplerYcbcrConversion VkSamplerYcbcrConversionKHR;
To create a VkSamplerYcbcrConversion, call:
VkResult vkCreateSamplerYcbcrConversion(
VkDevice device,
const VkSamplerYcbcrConversionCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSamplerYcbcrConversion* pYcbcrConversion);
or the equivalent command
VkResult vkCreateSamplerYcbcrConversionKHR(
VkDevice device,
const VkSamplerYcbcrConversionCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSamplerYcbcrConversion* pYcbcrConversion);
- device is the logical device that creates the sampler Y’CBCR conversion.
- pCreateInfo is a pointer to an instance of the VkSamplerYcbcrConversionCreateInfo specifying the requested sampler Y’CBCR conversion.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pYcbcrConversion points to a VkSamplerYcbcrConversion handle in which the resulting sampler Y’CBCR conversion is returned.
The interpretation of the configured sampler Y’CBCR conversion is described in more detail in the description of sampler Y’CBCR conversion in the Image Operations chapter.
The VkSamplerYcbcrConversionCreateInfo
structure is defined as:
typedef struct VkSamplerYcbcrConversionCreateInfo {
VkStructureType sType;
const void* pNext;
VkFormat format;
VkSamplerYcbcrModelConversion ycbcrModel;
VkSamplerYcbcrRange ycbcrRange;
VkComponentMapping components;
VkChromaLocation xChromaOffset;
VkChromaLocation yChromaOffset;
VkFilter chromaFilter;
VkBool32 forceExplicitReconstruction;
} VkSamplerYcbcrConversionCreateInfo;
or the equivalent
typedef VkSamplerYcbcrConversionCreateInfo VkSamplerYcbcrConversionCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- format is the format of the image from which color information will be retrieved.
- ycbcrModel describes the color matrix for conversion between color models.
- ycbcrRange describes whether the encoded values have headroom and foot room, or whether the encoding uses the full numerical range.
- components applies a swizzle based on VkComponentSwizzle enums prior to range expansion and color model conversion.
- xChromaOffset describes the sample location associated with downsampled chroma channels in the x dimension. xChromaOffset has no effect for formats in which chroma channels are the same resolution as the luma channel.
- yChromaOffset describes the sample location associated with downsampled chroma channels in the y dimension. yChromaOffset has no effect for formats in which the chroma channels are not downsampled vertically.
- chromaFilter is the filter for chroma reconstruction.
- forceExplicitReconstruction can be used to ensure that reconstruction is done explicitly, if supported.
If the pNext
chain has an instance of VkExternalFormatANDROID
with non-zero externalFormat
member, the sampler Y’CBCR conversion
object represents an external format conversion, and format
must be
VK_FORMAT_UNDEFINED
.
Such conversions must only be used to sample image views with a matching
external
format.
When creating an external format conversion, the value of components
is ignored.
If chromaFilter
is VK_FILTER_NEAREST
, chroma samples are
reconstructed to luma channel resolution using nearest-neighbour sampling.
Otherwise, chroma samples are reconstructed using interpolation.
More details can be found in the
description of sampler Y’CBCR conversion in the Image
Operations chapter.
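An informative sketch of creating a conversion for a common two-plane 4:2:0 format follows; it assumes the implementation supports VK_FORMAT_G8_B8R8_2PLANE_420_UNORM with linear chroma filtering, and reuses myDevice from the earlier examples:
const VkSamplerYcbcrConversionCreateInfo myConversionCreateInfo =
{
    VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO,   // sType
    NULL,                                                     // pNext
    VK_FORMAT_G8_B8R8_2PLANE_420_UNORM,                       // format
    VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709,              // ycbcrModel
    VK_SAMPLER_YCBCR_RANGE_ITU_NARROW,                        // ycbcrRange
    { VK_COMPONENT_SWIZZLE_IDENTITY,                          // components.r
      VK_COMPONENT_SWIZZLE_IDENTITY,                          // components.g
      VK_COMPONENT_SWIZZLE_IDENTITY,                          // components.b
      VK_COMPONENT_SWIZZLE_IDENTITY },                        // components.a
    VK_CHROMA_LOCATION_COSITED_EVEN,                          // xChromaOffset
    VK_CHROMA_LOCATION_COSITED_EVEN,                          // yChromaOffset
    VK_FILTER_LINEAR,                                         // chromaFilter
    VK_FALSE                                                  // forceExplicitReconstruction
};

VkSamplerYcbcrConversion myConversion;
VkResult myResult = vkCreateSamplerYcbcrConversion(myDevice, &myConversionCreateInfo, NULL, &myConversion);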
VkSamplerYcbcrModelConversion defines the conversion from the source color model to the shader color model. Possible values are:
typedef enum VkSamplerYcbcrModelConversion {
VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY = 0,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_IDENTITY = 1,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709 = 2,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601 = 3,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020 = 4,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY_KHR = VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_IDENTITY_KHR = VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_IDENTITY,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709_KHR = VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601_KHR = VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601,
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020_KHR = VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020,
} VkSamplerYcbcrModelConversion;
or the equivalent
typedef VkSamplerYcbcrModelConversion VkSamplerYcbcrModelConversionKHR;
- VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY specifies that the input values to the conversion are unmodified.
- VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_IDENTITY specifies no model conversion but the inputs are range expanded as for Y’CBCR.
- VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709 specifies the color model conversion from Y’CBCR to R’G’B' defined in BT.709 and described in the “BT.709 Y’CBCR conversion” section of the Khronos Data Format Specification.
- VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601 specifies the color model conversion from Y’CBCR to R’G’B' defined in BT.601 and described in the “BT.601 Y’CBCR conversion” section of the Khronos Data Format Specification.
- VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020 specifies the color model conversion from Y’CBCR to R’G’B' defined in BT.2020 and described in the “BT.2020 Y’CBCR conversion” section of the Khronos Data Format Specification.
In the VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_*
color models, for the
input to the sampler Y’CBCR range expansion and model conversion:
- the Y (Y' luma) channel corresponds to the G channel of an RGB image.
- the CB (CB or “U” blue color difference) channel corresponds to the B channel of an RGB image.
- the CR (CR or “V” red color difference) channel corresponds to the R channel of an RGB image.
- the alpha channel, if present, is not modified by color model conversion.
These rules reflect the mapping of channels after the channel swizzle
operation (controlled by
VkSamplerYcbcrConversionCreateInfo::components
).
Note
For example, a “YUVA” 32-bit format comprising four 8-bit channels can be implemented as an RGBA format with a component swizzle that maps Y' to G, CB to B, and CR to R, following the channel correspondences above.
The VkSamplerYcbcrRange enum describes whether color channels are encoded using the full range of numerical values or whether values are reserved for headroom and foot room. VkSamplerYcbcrRange is defined as:
typedef enum VkSamplerYcbcrRange {
VK_SAMPLER_YCBCR_RANGE_ITU_FULL = 0,
VK_SAMPLER_YCBCR_RANGE_ITU_NARROW = 1,
VK_SAMPLER_YCBCR_RANGE_ITU_FULL_KHR = VK_SAMPLER_YCBCR_RANGE_ITU_FULL,
VK_SAMPLER_YCBCR_RANGE_ITU_NARROW_KHR = VK_SAMPLER_YCBCR_RANGE_ITU_NARROW,
} VkSamplerYcbcrRange;
or the equivalent
typedef VkSamplerYcbcrRange VkSamplerYcbcrRangeKHR;
- VK_SAMPLER_YCBCR_RANGE_ITU_FULL specifies that the full range of the encoded values is valid and interpreted according to the ITU “full range” quantization rules.
- VK_SAMPLER_YCBCR_RANGE_ITU_NARROW specifies that headroom and foot room are reserved in the numerical range of encoded values, and the remaining values are expanded according to the ITU “narrow range” quantization rules.
The formulae for these conversions are described in the Sampler Y’CBCR Range Expansion section of the Image Operations chapter.
No range modification takes place if ycbcrModel
is
VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY
; the ycbcrRange
field of VkSamplerYcbcrConversionCreateInfo
is ignored in this case.
The VkChromaLocation enum, which defines the location of downsampled chroma channel samples relative to the luma samples, is defined as:
typedef enum VkChromaLocation {
VK_CHROMA_LOCATION_COSITED_EVEN = 0,
VK_CHROMA_LOCATION_MIDPOINT = 1,
VK_CHROMA_LOCATION_COSITED_EVEN_KHR = VK_CHROMA_LOCATION_COSITED_EVEN,
VK_CHROMA_LOCATION_MIDPOINT_KHR = VK_CHROMA_LOCATION_MIDPOINT,
} VkChromaLocation;
or the equivalent
typedef VkChromaLocation VkChromaLocationKHR;
- VK_CHROMA_LOCATION_COSITED_EVEN specifies that downsampled chroma samples are aligned with luma samples with even coordinates.
- VK_CHROMA_LOCATION_MIDPOINT specifies that downsampled chroma samples are located half way between each even luma sample and the nearest higher odd luma sample.
To destroy a sampler Y’CBCR conversion, call:
void vkDestroySamplerYcbcrConversion(
VkDevice device,
VkSamplerYcbcrConversion ycbcrConversion,
const VkAllocationCallbacks* pAllocator);
or the equivalent command
void vkDestroySamplerYcbcrConversionKHR(
VkDevice device,
VkSamplerYcbcrConversion ycbcrConversion,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the Y’CBCR conversion.
- ycbcrConversion is the conversion to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
13. Resource Descriptors
A descriptor is an opaque data structure representing a shader resource
such as a buffer, buffer view, image view, sampler, or combined image
sampler.
Descriptors are organised into descriptor sets, which are bound during
command recording for use in subsequent draw commands.
The arrangement of content in each descriptor set is determined by a
descriptor set layout, which determines what descriptors can be stored
within it.
The sequence of descriptor set layouts that can be used by a pipeline is
specified in a pipeline layout.
Each pipeline object can use up to maxBoundDescriptorSets
(see
Limits) descriptor sets.
Shaders access resources via variables decorated with a descriptor set and binding number that link them to a descriptor in a descriptor set. The shader interface mapping to bound descriptor sets is described in the Shader Resource Interface section.
13.1. Descriptor Types
There are a number of different types of descriptor supported by Vulkan, corresponding to different resources or usage. The following sections describe the API definitions of each descriptor type. The mapping of each type to SPIR-V is listed in the Shader Resource and Descriptor Type Correspondence and Shader Resource and Storage Class Correspondence tables in the Shader Interfaces chapter.
13.1.1. Storage Image
A storage image (VK_DESCRIPTOR_TYPE_STORAGE_IMAGE
) is a descriptor
type associated with an image resource via an
image view that load, store, and atomic
operations can be performed on.
Storage image loads are supported in all shader stages for image views whose
format features contain
VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT
.
Stores to storage images are supported in compute shaders for image views
whose format features contain
VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT
.
Atomic operations on storage images are supported in compute shaders for
image views whose format features
contain
VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT
.
When the fragmentStoresAndAtomics
feature is enabled, stores and atomic
operations are also supported for storage images in fragment shaders with
the same set of image formats as supported in compute shaders.
When the vertexPipelineStoresAndAtomics
feature is enabled, stores and atomic
operations are also supported in vertex, tessellation, and geometry shaders
with the same set of image formats as supported in compute shaders.
The image subresources for a storage image must be in the
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
or
VK_IMAGE_LAYOUT_GENERAL
layout in order to access its data in a
shader.
13.1.2. Sampler
A sampler descriptor (VK_DESCRIPTOR_TYPE_SAMPLER
) is a descriptor
type associated with a sampler object, used to control the
behaviour of sampling operations performed on a
sampled image.
13.1.3. Sampled Image
A sampled image (VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE
) is a descriptor
type associated with an image resource via an
image view that sampling operations
can be performed on.
Shaders combine a sampled image variable and a sampler variable to perform sampling operations.
Sampled images are supported in all shader stages for image views whose
format features contain
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT
.
The image subresources for a sampled image must be in the
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
,
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL
,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
, or
VK_IMAGE_LAYOUT_GENERAL
layout in order to access its data in a
shader.
13.1.4. Combined Image Sampler
A combined image sampler (VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER
)
is a single descriptor type associated with both a sampler and
an image resource, combining both a
sampler and sampled image descriptor into a single descriptor.
If the descriptor refers to a sampler that performs Y’CBCR conversion or samples a subsampled image, the sampler must only be used to sample the image in the same descriptor. Otherwise, the sampler and image in this type of descriptor can be used freely with any other samplers and images.
The image subresources for a combined image sampler must be in the
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
,
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL
,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
, or
VK_IMAGE_LAYOUT_GENERAL
layout in order to access its data in a
shader.
Note
On some implementations, it may be more efficient to sample from an image using a combination of sampler and sampled image that are stored together in the descriptor set in a combined descriptor.
13.1.5. Uniform Texel Buffer
A uniform texel buffer (VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER
) is
a descriptor type associated with a buffer resource
via a buffer view that formatted load
operations can be performed on.
Uniform texel buffers define a tightly-packed 1-dimensional linear array of texels, with texels going through format conversion when read in a shader in the same way as they are for an image.
Load operations from uniform texel buffers are supported in all shader
stages for image formats which report support for the
VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT
feature bit via vkGetPhysicalDeviceFormatProperties in
VkFormatProperties::bufferFeatures
.
13.1.6. Storage Texel Buffer
A storage texel buffer (VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER
) is
a descriptor type associated with a buffer resource
via a buffer view that formatted
load, store, and atomic operations can be performed on.
Storage texel buffers define a tightly-packed 1-dimensional linear array of texels, with texels going through format conversion when read in a shader in the same way as they are for an image. Unlike uniform texel buffers, these buffers can also be written to in the same way as for storage images.
Storage texel buffer loads are supported in all shader stages for texel
buffer formats which report support for the
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT
feature bit via vkGetPhysicalDeviceFormatProperties in
VkFormatProperties::bufferFeatures
.
Stores to storage texel buffers are supported in compute shaders for texel
buffer formats which report support for the
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT
feature via
vkGetPhysicalDeviceFormatProperties in
VkFormatProperties::bufferFeatures
.
Atomic operations on storage texel buffers are supported in compute shaders
for texel buffer formats which report support for the
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_ATOMIC_BIT
feature via vkGetPhysicalDeviceFormatProperties in
VkFormatProperties::bufferFeatures
.
When the fragmentStoresAndAtomics
feature is enabled, stores and atomic
operations are also supported for storage texel buffers in fragment shaders
with the same set of texel buffer formats as supported in compute shaders.
When the vertexPipelineStoresAndAtomics
feature is enabled, stores and atomic
operations are also supported in vertex, tessellation, and geometry shaders
with the same set of texel buffer formats as supported in compute shaders.
13.1.7. Storage Buffer
A storage buffer (VK_DESCRIPTOR_TYPE_STORAGE_BUFFER
) is a descriptor
type associated with a buffer resource directly,
described in a shader as a structure with various members that load, store,
and atomic operations can be performed on.
Note
Atomic operations can only be performed on members of certain types as defined in the SPIR-V environment appendix.
13.1.8. Uniform Buffer
A uniform buffer (VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER
) is a descriptor
type associated with a buffer resource directly,
described in a shader as a structure with various members that load
operations can be performed on.
13.1.9. Dynamic Uniform Buffer
A dynamic uniform buffer (VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC
)
is almost identical to a uniform buffer,
and differs only in how the offset into the buffer is specified.
The base offset calculated by the VkDescriptorBufferInfo when
initially updating the descriptor set is added
to a dynamic offset when binding
the descriptor set.
13.1.10. Dynamic Storage Buffer
A dynamic storage buffer (VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC
)
is almost identical to a storage buffer,
and differs only in how the offset into the buffer is specified.
The base offset calculated by the VkDescriptorBufferInfo when
initially updating the descriptor set is added
to a dynamic offset when binding
the descriptor set.
13.1.11. Inline Uniform Block
An inline uniform block
(VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
) is almost identical to a
uniform buffer, and differs only in taking
its storage directly from the encompassing descriptor set instead of being
backed by buffer memory.
It is typically used to access a small set of constant data that does not
require the additional flexibility provided by the indirection enabled when
using a uniform buffer where the descriptor and the referenced buffer memory
are decoupled.
Compared to push constants, they allow reusing the same set of constant data
across multiple disjoint sets of draw and dispatch commands.
Inline uniform block descriptors cannot be aggregated into arrays. Instead, the array size specified for an inline uniform block descriptor binding specifies the binding’s capacity in bytes.
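For illustration only (VkDescriptorSetLayoutBinding is described in the Descriptor Set Layout section below), a binding exposing 64 bytes of inline uniform data might be declared as follows; the binding number, byte size, and shader stage are arbitrary:
const VkDescriptorSetLayoutBinding myInlineUniformBlockBinding =
{
    0,                                            // binding
    VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT,  // descriptorType
    64,                                           // descriptorCount: capacity in bytes, not an array size
    VK_SHADER_STAGE_FRAGMENT_BIT,                 // stageFlags
    NULL                                          // pImmutableSamplers
};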
13.1.12. Input Attachment
An input attachment (VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT
) is a
descriptor type associated with an image resource via
an image view that can be used for
framebuffer local load operations in
fragment shaders.
All image formats that are supported for color attachments
(VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT
) or depth/stencil attachments
(VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT
) for a given image
tiling mode are also supported for input attachments.
The image subresources for an input attachment must be in the
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
,
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL
,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL
,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
, or
VK_IMAGE_LAYOUT_GENERAL
layout in order to access its data in a
shader.
13.2. Descriptor Sets
Descriptors are grouped together into descriptor set objects. A descriptor set object is an opaque object that contains storage for a set of descriptors, where the types and number of descriptors is defined by a descriptor set layout. The layout object may be used to define the association of each descriptor binding with memory or other implementation resources. The layout is used both for determining the resources that need to be associated with the descriptor set, and determining the interface between shader stages and shader resources.
13.2.1. Descriptor Set Layout
A descriptor set layout object is defined by an array of zero or more descriptor bindings. Each individual descriptor binding is specified by a descriptor type, a count (array size) of the number of descriptors in the binding, a set of shader stages that can access the binding, and (if using immutable samplers) an array of sampler descriptors.
Descriptor set layout objects are represented by VkDescriptorSetLayout
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorSetLayout)
To create descriptor set layout objects, call:
VkResult vkCreateDescriptorSetLayout(
VkDevice device,
const VkDescriptorSetLayoutCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDescriptorSetLayout* pSetLayout);
- device is the logical device that creates the descriptor set layout.
- pCreateInfo is a pointer to an instance of the VkDescriptorSetLayoutCreateInfo structure specifying the state of the descriptor set layout object.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pSetLayout points to a VkDescriptorSetLayout handle in which the resulting descriptor set layout object is returned.
Information about the descriptor set layout is passed in an instance of the
VkDescriptorSetLayoutCreateInfo
structure:
typedef struct VkDescriptorSetLayoutCreateInfo {
VkStructureType sType;
const void* pNext;
VkDescriptorSetLayoutCreateFlags flags;
uint32_t bindingCount;
const VkDescriptorSetLayoutBinding* pBindings;
} VkDescriptorSetLayoutCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkDescriptorSetLayoutCreateFlagBits specifying options for descriptor set layout creation.
- bindingCount is the number of elements in pBindings.
- pBindings is a pointer to an array of VkDescriptorSetLayoutBinding structures.
Bits which can be set in VkDescriptorSetLayoutCreateInfo::flags
to specify options for descriptor set layout are:
typedef enum VkDescriptorSetLayoutCreateFlagBits {
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR = 0x00000001,
VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT = 0x00000002,
} VkDescriptorSetLayoutCreateFlagBits;
- VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies that descriptor sets must not be allocated using this layout, and descriptors are instead pushed by vkCmdPushDescriptorSetKHR.
- VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT specifies that descriptor sets using this layout must be allocated from a descriptor pool created with the VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT bit set. Descriptor set layouts created with this bit set have alternate limits for the maximum number of descriptors per-stage and per-pipeline layout. The non-UpdateAfterBind limits only count descriptors in sets created without this flag. The UpdateAfterBind limits count all descriptors, but the limits may be higher than the non-UpdateAfterBind limits.
typedef VkFlags VkDescriptorSetLayoutCreateFlags;
VkDescriptorSetLayoutCreateFlags
is a bitmask type for setting a mask
of zero or more VkDescriptorSetLayoutCreateFlagBits.
The VkDescriptorSetLayoutBinding
structure is defined as:
typedef struct VkDescriptorSetLayoutBinding {
uint32_t binding;
VkDescriptorType descriptorType;
uint32_t descriptorCount;
VkShaderStageFlags stageFlags;
const VkSampler* pImmutableSamplers;
} VkDescriptorSetLayoutBinding;
- binding is the binding number of this entry and corresponds to a resource of the same binding number in the shader stages.
- descriptorType is a VkDescriptorType specifying which type of resource descriptors are used for this binding.
- descriptorCount is the number of descriptors contained in the binding, accessed in a shader as an array, except if descriptorType is VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT, in which case descriptorCount is the size in bytes of the inline uniform block. If descriptorCount is zero this binding entry is reserved and the resource must not be accessed from any stage via this binding within any pipeline using the set layout.
- stageFlags is a bitmask of VkShaderStageFlagBits specifying which pipeline shader stages can access a resource for this binding. VK_SHADER_STAGE_ALL is a shorthand specifying that all defined shader stages, including any additional stages defined by extensions, can access the resource. If a shader stage is not included in stageFlags, then a resource must not be accessed from that stage via this binding within any pipeline using the set layout. Other than input attachments, which are limited to the fragment shader, there are no limitations on what combinations of stages can use a descriptor binding, and in particular a binding can be used by both graphics stages and the compute stage.
- pImmutableSamplers affects initialization of samplers. If descriptorType specifies a VK_DESCRIPTOR_TYPE_SAMPLER or VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER type descriptor, then pImmutableSamplers can be used to initialize a set of immutable samplers. Immutable samplers are permanently bound into the set layout; later binding a sampler into an immutable sampler slot in a descriptor set is not allowed. If pImmutableSamplers is not NULL, then it is considered to be a pointer to an array of sampler handles that will be consumed by the set layout and used for the corresponding binding. If pImmutableSamplers is NULL, then the sampler slots are dynamic and sampler handles must be bound into descriptor sets using this layout. If descriptorType is not one of these descriptor types, then pImmutableSamplers is ignored.
The above layout definition allows the descriptor bindings to be specified
sparsely such that not all binding numbers between 0 and the maximum binding
number need to be specified in the pBindings
array.
Bindings that are not specified have a descriptorCount
and
stageFlags
of zero, and the value of descriptorType
is
undefined.
However, all binding numbers between 0 and the maximum binding number in the
VkDescriptorSetLayoutCreateInfo::pBindings
array may consume
memory in the descriptor set layout even if not all descriptor bindings are
used, though it should not consume additional memory from the descriptor
pool.
Note
The maximum binding number specified should be as compact as possible to avoid wasted memory.
If the pNext
chain of a VkDescriptorSetLayoutCreateInfo
structure includes a VkDescriptorSetLayoutBindingFlagsCreateInfoEXT
structure, then that structure includes an array of flags, one for each
descriptor set layout binding.
The VkDescriptorSetLayoutBindingFlagsCreateInfoEXT structure is defined as:
typedef struct VkDescriptorSetLayoutBindingFlagsCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t bindingCount;
const VkDescriptorBindingFlagsEXT* pBindingFlags;
} VkDescriptorSetLayoutBindingFlagsCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- bindingCount is zero or the number of elements in pBindingFlags.
- pBindingFlags is a pointer to an array of VkDescriptorBindingFlagsEXT bitfields, one for each descriptor set layout binding.
If bindingCount
is zero or if this structure is not in the pNext
chain, the VkDescriptorBindingFlagsEXT for each descriptor set layout
binding is considered to be zero.
Otherwise, the descriptor set layout binding at
VkDescriptorSetLayoutCreateInfo::pBindings
[i] uses the flags in
pBindingFlags
[i].
Bits which can be set in each element of
VkDescriptorSetLayoutBindingFlagsCreateInfoEXT::pBindingFlags
to
specify options for the corresponding descriptor set layout binding are:
typedef enum VkDescriptorBindingFlagBitsEXT {
VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT = 0x00000001,
VK_DESCRIPTOR_BINDING_UPDATE_UNUSED_WHILE_PENDING_BIT_EXT = 0x00000002,
VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT = 0x00000004,
VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT = 0x00000008,
} VkDescriptorBindingFlagBitsEXT;
- VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT indicates that if descriptors in this binding are updated between when the descriptor set is bound in a command buffer and when that command buffer is submitted to a queue, then the submission will use the most recently set descriptors for this binding and the updates do not invalidate the command buffer. Descriptor bindings created with this flag are also partially exempt from the external synchronization requirement in vkUpdateDescriptorSetWithTemplateKHR and vkUpdateDescriptorSets. They can be updated concurrently with the set being bound to a command buffer in another thread, but not concurrently with the set being reset or freed.
- VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT indicates that descriptors in this binding that are not dynamically used need not contain valid descriptors at the time the descriptors are consumed. A descriptor is dynamically used if any shader invocation executes an instruction that performs any memory access using the descriptor.
- VK_DESCRIPTOR_BINDING_UPDATE_UNUSED_WHILE_PENDING_BIT_EXT indicates that descriptors in this binding can be updated after a command buffer has bound this descriptor set, or while a command buffer that uses this descriptor set is pending execution, as long as the descriptors that are updated are not used by those command buffers. If VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT is also set, then descriptors can be updated as long as they are not dynamically used by any shader invocations. If VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT is not set, then descriptors can be updated as long as they are not statically used by any shader invocations.
- VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT indicates that this descriptor binding has a variable size that will be specified when a descriptor set is allocated using this layout. The value of descriptorCount is treated as an upper bound on the size of the binding. This must only be used for the last binding in the descriptor set layout (i.e. the binding with the largest value of binding). For the purposes of counting against limits such as maxDescriptorSet* and maxPerStageDescriptor*, the full value of descriptorCount is counted, except for descriptor bindings with a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT, where descriptorCount specifies the upper bound on the byte size of the binding and thus counts against the maxInlineUniformBlockSize limit instead.
typedef VkFlags VkDescriptorBindingFlagsEXT;
VkDescriptorBindingFlagsEXT
is a bitmask type for setting a mask of
zero or more VkDescriptorBindingFlagBitsEXT.
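As an informative sketch (not normative), the binding flags structure can be chained into descriptor set layout creation as follows; myBindings is assumed to be a previously filled array of two VkDescriptorSetLayoutBinding structures, with the variable-sized binding last:
// Per-binding flags: only the last (highest-numbered) binding is
// variable-sized and updatable after bind.
const VkDescriptorBindingFlagsEXT myBindingFlags[2] =
{
    0,
    VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT |
        VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT
};

const VkDescriptorSetLayoutBindingFlagsCreateInfoEXT myBindingFlagsInfo =
{
    VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT,  // sType
    NULL,                                                                   // pNext
    2,                                                                      // bindingCount
    myBindingFlags                                                          // pBindingFlags
};

const VkDescriptorSetLayoutCreateInfo myFlaggedLayoutCreateInfo =
{
    VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,             // sType
    &myBindingFlagsInfo,                                             // pNext
    VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT,  // flags, required for update-after-bind bindings
    2,                                                               // bindingCount
    myBindings                                                       // pBindings
};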
To query information about whether a descriptor set layout can be created, call:
void vkGetDescriptorSetLayoutSupport(
VkDevice device,
const VkDescriptorSetLayoutCreateInfo* pCreateInfo,
VkDescriptorSetLayoutSupport* pSupport);
or the equivalent command
void vkGetDescriptorSetLayoutSupportKHR(
VkDevice device,
const VkDescriptorSetLayoutCreateInfo* pCreateInfo,
VkDescriptorSetLayoutSupport* pSupport);
- device is the logical device that would create the descriptor set layout.
- pCreateInfo is a pointer to an instance of the VkDescriptorSetLayoutCreateInfo structure specifying the state of the descriptor set layout object.
- pSupport points to a VkDescriptorSetLayoutSupport structure in which information about support for the descriptor set layout object is returned.
Some implementations have limitations on what fits in a descriptor set which
are not easily expressible in terms of existing limits like
maxDescriptorSet
*, for example if all descriptor types share a limited
space in memory but each descriptor is a different size or alignment.
This command returns information about whether a descriptor set satisfies
this limit.
If the descriptor set layout satisfies the
VkPhysicalDeviceMaintenance3Properties::maxPerSetDescriptors
limit, this command is guaranteed to return VK_TRUE
in
VkDescriptorSetLayoutSupport::supported
.
If the descriptor set layout exceeds the
VkPhysicalDeviceMaintenance3Properties::maxPerSetDescriptors
limit, whether the descriptor set layout is supported is
implementation-dependent and may depend on whether the descriptor sizes and
alignments cause the layout to exceed an internal limit.
This command does not consider other limits such as
maxPerStageDescriptor
*, and so a descriptor set layout that is
supported according to this command must still satisfy the pipeline layout
limits such as maxPerStageDescriptor
* in order to be used in a
pipeline layout.
Information about support for the descriptor set layout is returned in an
instance of the VkDescriptorSetLayoutSupport
structure:
typedef struct VkDescriptorSetLayoutSupport {
VkStructureType sType;
void* pNext;
VkBool32 supported;
} VkDescriptorSetLayoutSupport;
or the equivalent
typedef VkDescriptorSetLayoutSupport VkDescriptorSetLayoutSupportKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- supported specifies whether the descriptor set layout can be created.
supported
is set to VK_TRUE
if the descriptor set can be
created, or else is set to VK_FALSE
.
If the pNext
chain of a VkDescriptorSetLayoutSupport structure
includes a VkDescriptorSetVariableDescriptorCountLayoutSupportEXT
structure, then that structure returns additional information about whether
the descriptor set layout is supported.
typedef struct VkDescriptorSetVariableDescriptorCountLayoutSupportEXT {
VkStructureType sType;
void* pNext;
uint32_t maxVariableDescriptorCount;
} VkDescriptorSetVariableDescriptorCountLayoutSupportEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- maxVariableDescriptorCount indicates the maximum number of descriptors supported in the highest numbered binding of the layout, if that binding is variable-sized. If the highest numbered binding of the layout has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT, then maxVariableDescriptorCount indicates the maximum byte size supported for the binding, if that binding is variable-sized.
If the create info includes a variable-sized descriptor, then
supported
is determined assuming the requested size of the
variable-sized descriptor, and maxVariableDescriptorCount
is set to
the maximum size of that descriptor that can be successfully created (which
is greater than or equal to the requested size passed in).
If the create info does not include a variable-sized descriptor or if the
VkPhysicalDeviceDescriptorIndexingFeaturesEXT::descriptorBindingVariableDescriptorCount
feature is not enabled, then maxVariableDescriptorCount
is set to
zero.
For the purposes of this command, a variable-sized descriptor binding with a
descriptorCount
of zero is treated as if the descriptorCount
is
one, and thus the binding is not ignored and the maximum descriptor count
will be returned.
If the layout is not supported, then the value written to
maxVariableDescriptorCount
is undefined.
The following examples show a shader snippet using two descriptor sets, and application code that creates corresponding descriptor set layouts.
//
// binding to a single sampled image descriptor in set 0
//
layout (set=0, binding=0) uniform texture2D mySampledImage;
//
// binding to an array of sampled image descriptors in set 0
//
layout (set=0, binding=1) uniform texture2D myArrayOfSampledImages[12];
//
// binding to a single uniform buffer descriptor in set 1
//
layout (set=1, binding=0) uniform myUniformBuffer
{
vec4 myElement[32];
};
...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "mySampledImage"
OpName %14 "myArrayOfSampledImages"
OpName %18 "myUniformBuffer"
OpMemberName %18 0 "myElement"
OpName %20 ""
OpDecorate %9 DescriptorSet 0
OpDecorate %9 Binding 0
OpDecorate %14 DescriptorSet 0
OpDecorate %14 Binding 1
OpDecorate %17 ArrayStride 16
OpMemberDecorate %18 0 Offset 0
OpDecorate %18 Block
OpDecorate %20 DescriptorSet 1
OpDecorate %20 Binding 0
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeImage %6 2D 0 0 0 1 Unknown
%8 = OpTypePointer UniformConstant %7
%9 = OpVariable %8 UniformConstant
%10 = OpTypeInt 32 0
%11 = OpConstant %10 12
%12 = OpTypeArray %7 %11
%13 = OpTypePointer UniformConstant %12
%14 = OpVariable %13 UniformConstant
%15 = OpTypeVector %6 4
%16 = OpConstant %10 32
%17 = OpTypeArray %15 %16
%18 = OpTypeStruct %17
%19 = OpTypePointer Uniform %18
%20 = OpVariable %19 Uniform
...
VkResult myResult;
const VkDescriptorSetLayoutBinding myDescriptorSetLayoutBinding[] =
{
// binding to a single image descriptor
{
0, // binding
VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, // descriptorType
1, // descriptorCount
VK_SHADER_STAGE_FRAGMENT_BIT, // stageFlags
NULL // pImmutableSamplers
},
// binding to an array of image descriptors
{
1, // binding
VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, // descriptorType
12, // descriptorCount
VK_SHADER_STAGE_FRAGMENT_BIT, // stageFlags
NULL // pImmutableSamplers
},
// binding to a single uniform buffer descriptor
{
0, // binding
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, // descriptorType
1, // descriptorCount
VK_SHADER_STAGE_FRAGMENT_BIT, // stageFlags
NULL // pImmutableSamplers
}
};
const VkDescriptorSetLayoutCreateInfo myDescriptorSetLayoutCreateInfo[] =
{
// Create info for first descriptor set with two descriptor bindings
{
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO, // sType
NULL, // pNext
0, // flags
2, // bindingCount
&myDescriptorSetLayoutBinding[0] // pBindings
},
// Create info for second descriptor set with one descriptor binding
{
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO, // sType
NULL, // pNext
0, // flags
1, // bindingCount
&myDescriptorSetLayoutBinding[2] // pBindings
}
};
VkDescriptorSetLayout myDescriptorSetLayout[2];
//
// Create first descriptor set layout
//
myResult = vkCreateDescriptorSetLayout(
myDevice,
&myDescriptorSetLayoutCreateInfo[0],
NULL,
&myDescriptorSetLayout[0]);
//
// Create second descriptor set layout
//
myResult = vkCreateDescriptorSetLayout(
myDevice,
&myDescriptorSetLayoutCreateInfo[1],
NULL,
&myDescriptorSetLayout[1]);
To destroy a descriptor set layout, call:
void vkDestroyDescriptorSetLayout(
VkDevice device,
VkDescriptorSetLayout descriptorSetLayout,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the descriptor set layout.
- descriptorSetLayout is the descriptor set layout to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
13.2.2. Pipeline Layouts
Access to descriptor sets from a pipeline is accomplished through a pipeline layout. Zero or more descriptor set layouts and zero or more push constant ranges are combined to form a pipeline layout object which describes the complete set of resources that can be accessed by a pipeline. The pipeline layout represents a sequence of descriptor sets with each having a specific layout. This sequence of layouts is used to determine the interface between shader stages and shader resources. Each pipeline is created using a pipeline layout.
Pipeline layout objects are represented by VkPipelineLayout
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipelineLayout)
To create a pipeline layout, call:
VkResult vkCreatePipelineLayout(
VkDevice device,
const VkPipelineLayoutCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkPipelineLayout* pPipelineLayout);
- device is the logical device that creates the pipeline layout.
- pCreateInfo is a pointer to an instance of the VkPipelineLayoutCreateInfo structure specifying the state of the pipeline layout object.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pPipelineLayout points to a VkPipelineLayout handle in which the resulting pipeline layout object is returned.
The VkPipelineLayoutCreateInfo structure is defined as:
typedef struct VkPipelineLayoutCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineLayoutCreateFlags flags;
uint32_t setLayoutCount;
const VkDescriptorSetLayout* pSetLayouts;
uint32_t pushConstantRangeCount;
const VkPushConstantRange* pPushConstantRanges;
} VkPipelineLayoutCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- setLayoutCount is the number of descriptor sets included in the pipeline layout.
- pSetLayouts is a pointer to an array of VkDescriptorSetLayout objects.
- pushConstantRangeCount is the number of push constant ranges included in the pipeline layout.
- pPushConstantRanges is a pointer to an array of VkPushConstantRange structures defining a set of push constant ranges for use in a single pipeline layout. In addition to descriptor set layouts, a pipeline layout also describes how many push constants can be accessed by each stage of the pipeline.
Note
Push constants represent a high speed path to modify constant data in pipelines that is expected to outperform memory-backed resource updates.
typedef VkFlags VkPipelineLayoutCreateFlags;
VkPipelineLayoutCreateFlags
is a bitmask type for setting a mask, but
is currently reserved for future use.
The VkPushConstantRange
structure is defined as:
typedef struct VkPushConstantRange {
VkShaderStageFlags stageFlags;
uint32_t offset;
uint32_t size;
} VkPushConstantRange;
- stageFlags is a set of stage flags describing the shader stages that will access a range of push constants. If a particular stage is not included in the range, then accessing members of that range of push constants from the corresponding shader stage will return undefined values.
- offset and size are the start offset and size, respectively, consumed by the range. Both offset and size are in units of bytes and must be a multiple of 4. The layout of the push constant variables is specified in the shader.
Once created, pipeline layouts are used as part of pipeline creation (see Pipelines), as part of binding descriptor sets (see Descriptor Set Binding), and as part of setting push constants (see Push Constant Updates). Pipeline creation accepts a pipeline layout as input, and the layout may be used to map (set, binding, arrayElement) tuples to implementation resources or memory locations within a descriptor set. The assignment of implementation resources depends only on the bindings defined in the descriptor sets that comprise the pipeline layout, and not on any shader source.
All resource variables statically used in all shaders
in a pipeline must be declared with a (set,binding,arrayElement) that
exists in the corresponding descriptor set layout and is of an appropriate
descriptor type and includes the set of shader stages it is used by in
stageFlags
.
The pipeline layout can include entries that are not used by a particular
pipeline, or that are dead-code eliminated from any of the shaders.
The pipeline layout allows the application to provide a consistent set of
bindings across multiple pipeline compiles, which enables those pipelines to
be compiled in a way that the implementation may cheaply switch pipelines
without reprogramming the bindings.
Similarly, the push constant block declared in each shader (if present)
must only place variables at offsets that are each included in a push
constant range with stageFlags
including the bit corresponding to the
shader stage that uses it.
The pipeline layout can include ranges or portions of ranges that are not
used by a particular pipeline, or for which the variables have been
dead-code eliminated from any of the shaders.
There is a limit on the total number of resources of each type that can be included in bindings in all descriptor set layouts in a pipeline layout as shown in Pipeline Layout Resource Limits. The “Total Resources Available” column gives the limit on the number of each type of resource that can be included in bindings in all descriptor sets in the pipeline layout. Some resource types count against multiple limits. Additionally, there are limits on the total number of each type of resource that can be used in any pipeline stage as described in Shader Resource Limits.
Total Resources Available | Resource Types
---|---
maxDescriptorSetSamplers | sampler, combined image sampler
maxDescriptorSetSampledImages | sampled image, combined image sampler, uniform texel buffer
maxDescriptorSetStorageImages | storage image, storage texel buffer
maxDescriptorSetUniformBuffers | uniform buffer, uniform buffer dynamic
maxDescriptorSetUniformBuffersDynamic | uniform buffer dynamic
maxDescriptorSetStorageBuffers | storage buffer, storage buffer dynamic
maxDescriptorSetStorageBuffersDynamic | storage buffer dynamic
maxDescriptorSetInputAttachments | input attachment
maxDescriptorSetInlineUniformBlocks | inline uniform block
maxDescriptorSetAccelerationStructures | acceleration structure
To destroy a pipeline layout, call:
void vkDestroyPipelineLayout(
VkDevice device,
VkPipelineLayout pipelineLayout,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the pipeline layout. -
pipelineLayout
is the pipeline layout to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
Pipeline Layout Compatibility
Two pipeline layouts are defined to be “compatible for push constants” if they were created with identical push constant ranges. Two pipeline layouts are defined to be “compatible for set N” if they were created with identically defined descriptor set layouts for sets zero through N, and if they were created with identical push constant ranges.
When binding a descriptor set (see Descriptor Set Binding) to set number N, if the previously bound descriptor sets for sets zero through N-1 were all bound using compatible pipeline layouts, then performing this binding does not disturb any of the lower numbered sets. If, additionally, the previous bound descriptor set for set N was bound using a pipeline layout compatible for set N, then the bindings in sets numbered greater than N are also not disturbed.
Similarly, when binding a pipeline, the pipeline can correctly access any previously bound descriptor sets which were bound with compatible pipeline layouts, as long as all lower numbered sets were also bound with compatible layouts.
Layout compatibility means that descriptor sets can be bound to a command buffer for use by any pipeline created with a compatible pipeline layout, and without having bound a particular pipeline first. It also means that descriptor sets can remain valid across a pipeline change, and the same resources will be accessible to the newly bound pipeline.
Note
Place the least frequently changing descriptor sets near the start of the pipeline layout, and place the descriptor sets representing the most frequently changing resources near the end. When pipelines are switched, only the descriptor set bindings that have been invalidated will need to be updated and the remainder of the descriptor set bindings will remain in place.
The maximum number of descriptor sets that can be bound to a pipeline
layout is queried from physical device properties (see
maxBoundDescriptorSets
in Limits).
const VkDescriptorSetLayout layouts[] = { layout1, layout2 };
const VkPushConstantRange ranges[] =
{
{
VK_SHADER_STAGE_VERTEX_BIT, // stageFlags
0, // offset
4 // size
},
{
VK_SHADER_STAGE_FRAGMENT_BIT, // stageFlags
4, // offset
4 // size
},
};
const VkPipelineLayoutCreateInfo createInfo =
{
VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO, // sType
NULL, // pNext
0, // flags
2, // setLayoutCount
layouts, // pSetLayouts
2, // pushConstantRangeCount
ranges // pPushConstantRanges
};
VkPipelineLayout myPipelineLayout;
myResult = vkCreatePipelineLayout(
myDevice,
&createInfo,
NULL,
&myPipelineLayout);
13.2.3. Allocation of Descriptor Sets
A descriptor pool maintains a pool of descriptors, from which descriptor sets are allocated. Descriptor pools are externally synchronized, meaning that the application must not allocate and/or free descriptor sets from the same pool in multiple threads simultaneously.
Descriptor pools are represented by VkDescriptorPool
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorPool)
To create a descriptor pool object, call:
VkResult vkCreateDescriptorPool(
VkDevice device,
const VkDescriptorPoolCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDescriptorPool* pDescriptorPool);
- device is the logical device that creates the descriptor pool.
- pCreateInfo is a pointer to an instance of the VkDescriptorPoolCreateInfo structure specifying the state of the descriptor pool object.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pDescriptorPool points to a VkDescriptorPool handle in which the resulting descriptor pool object is returned.
The created descriptor pool is returned in pDescriptorPool.
Additional information about the pool is passed in an instance of the
VkDescriptorPoolCreateInfo
structure:
typedef struct VkDescriptorPoolCreateInfo {
VkStructureType sType;
const void* pNext;
VkDescriptorPoolCreateFlags flags;
uint32_t maxSets;
uint32_t poolSizeCount;
const VkDescriptorPoolSize* pPoolSizes;
} VkDescriptorPoolCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkDescriptorPoolCreateFlagBits specifying certain supported operations on the pool.
- maxSets is the maximum number of descriptor sets that can be allocated from the pool.
- poolSizeCount is the number of elements in pPoolSizes.
- pPoolSizes is a pointer to an array of VkDescriptorPoolSize structures, each containing a descriptor type and number of descriptors of that type to be allocated in the pool.
If multiple VkDescriptorPoolSize
structures appear in the
pPoolSizes
array then the pool will be created with enough storage for
the total number of descriptors of each type.
Fragmentation of a descriptor pool is possible and may lead to descriptor set allocation failures. A failure due to fragmentation is defined as failing a descriptor set allocation despite the sum of all outstanding descriptor set allocations from the pool plus the requested allocation requiring no more than the total number of descriptors requested at pool creation. Implementations provide certain guarantees of when fragmentation must not cause allocation failure, as described below.
If a descriptor pool has not had any descriptor sets freed since it was
created or most recently reset then fragmentation must not cause an
allocation failure (note that this is always the case for a pool created
without the VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT
bit
set).
Additionally, if all sets allocated from the pool since it was created or
most recently reset use the same number of descriptors (of each type) and
the requested allocation also uses that same number of descriptors (of each
type), then fragmentation must not cause an allocation failure.
If an allocation failure occurs due to fragmentation, an application can create an additional descriptor pool to perform further descriptor set allocations.
If flags
has the
VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT
bit set,
descriptor pool creation may fail with the error
VK_ERROR_FRAGMENTATION_EXT
if the total number of descriptors across
all pools (including this one) created with this bit set exceeds
maxUpdateAfterBindDescriptorsInAllPools
, or if fragmentation of the
underlying hardware resources occurs.
In order to be able to allocate descriptor sets having inline uniform block bindings, the descriptor pool must be created specifying the inline uniform block binding capacity of the descriptor pool, in addition to the total inline uniform data capacity in bytes, which is specified through an instance of the VkDescriptorPoolSize structure with a descriptorType value of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT.
This can be done by chaining an instance of the
VkDescriptorPoolInlineUniformBlockCreateInfoEXT
structure to the
pNext
chain of VkDescriptorPoolCreateInfo
.
The VkDescriptorPoolInlineUniformBlockCreateInfoEXT
structure is
defined as:
typedef struct VkDescriptorPoolInlineUniformBlockCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t maxInlineUniformBlockBindings;
} VkDescriptorPoolInlineUniformBlockCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- maxInlineUniformBlockBindings is the number of inline uniform block bindings to allocate.
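As an informative sketch, a pool that can hold inline uniform block data might declare its capacity as follows (the binding count and byte size are arbitrary illustrative values):
// Reserve capacity for up to 4 inline uniform block bindings ...
const VkDescriptorPoolInlineUniformBlockCreateInfoEXT myInlinePoolInfo =
{
    VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_INLINE_UNIFORM_BLOCK_CREATE_INFO_EXT,  // sType
    NULL,                                                                    // pNext
    4                                                                        // maxInlineUniformBlockBindings
};

// ... and 256 bytes of inline uniform data, expressed as a pool size whose
// descriptorCount is a byte count rather than a descriptor count.
const VkDescriptorPoolSize myInlinePoolSize =
{
    VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT,  // type
    256                                           // descriptorCount (bytes)
};
In this sketch, myInlinePoolInfo would be chained to VkDescriptorPoolCreateInfo::pNext and myInlinePoolSize included in its pPoolSizes array.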
Bits which can be set in VkDescriptorPoolCreateInfo::flags
to
enable operations on a descriptor pool are:
typedef enum VkDescriptorPoolCreateFlagBits {
VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT = 0x00000001,
VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT = 0x00000002,
} VkDescriptorPoolCreateFlagBits;
- VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT specifies that descriptor sets can return their individual allocations to the pool, i.e. all of vkAllocateDescriptorSets, vkFreeDescriptorSets, and vkResetDescriptorPool are allowed. Otherwise, descriptor sets allocated from the pool must not be individually freed back to the pool, i.e. only vkAllocateDescriptorSets and vkResetDescriptorPool are allowed.
- VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT specifies that descriptor sets allocated from this pool can include bindings with the VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT bit set. It is valid to allocate descriptor sets that have bindings that do not set the VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT bit from a pool that has VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT set.
typedef VkFlags VkDescriptorPoolCreateFlags;
VkDescriptorPoolCreateFlags
is a bitmask type for setting a mask of
zero or more VkDescriptorPoolCreateFlagBits.
The VkDescriptorPoolSize
structure is defined as:
typedef struct VkDescriptorPoolSize {
VkDescriptorType type;
uint32_t descriptorCount;
} VkDescriptorPoolSize;
- type is the type of descriptor.
- descriptorCount is the number of descriptors of that type to allocate. If type is VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then descriptorCount is the number of bytes to allocate for descriptors of this type.
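The following informative sketch creates a small pool; the set and descriptor counts are arbitrary, and myDevice is assumed to be a valid logical device:
const VkDescriptorPoolSize myPoolSizes[2] =
{
    { VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, 16 },  // type, descriptorCount
    { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 8 }
};

const VkDescriptorPoolCreateInfo myPoolCreateInfo =
{
    VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO,      // sType
    NULL,                                               // pNext
    VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT,  // flags
    8,                                                  // maxSets
    2,                                                  // poolSizeCount
    myPoolSizes                                         // pPoolSizes
};

VkDescriptorPool myDescriptorPool;
VkResult myResult = vkCreateDescriptorPool(myDevice, &myPoolCreateInfo, NULL, &myDescriptorPool);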
To destroy a descriptor pool, call:
void vkDestroyDescriptorPool(
VkDevice device,
VkDescriptorPool descriptorPool,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the descriptor pool.
- descriptorPool is the descriptor pool to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
When a pool is destroyed, all descriptor sets allocated from the pool are implicitly freed and become invalid. Descriptor sets allocated from a given pool do not need to be freed before destroying that descriptor pool.
Descriptor sets are allocated from descriptor pool objects, and are
represented by VkDescriptorSet
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorSet)
To allocate descriptor sets from a descriptor pool, call:
VkResult vkAllocateDescriptorSets(
VkDevice device,
const VkDescriptorSetAllocateInfo* pAllocateInfo,
VkDescriptorSet* pDescriptorSets);
- device is the logical device that owns the descriptor pool.
- pAllocateInfo is a pointer to an instance of the VkDescriptorSetAllocateInfo structure describing parameters of the allocation.
- pDescriptorSets is a pointer to an array of VkDescriptorSet handles in which the resulting descriptor set objects are returned.
The allocated descriptor sets are returned in pDescriptorSets
.
When a descriptor set is allocated, the initial state is largely
uninitialized and all descriptors are undefined.
However, the descriptor set can be bound in a command buffer without
causing errors or exceptions.
For descriptor set bindings created with the
VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT
bit set, all descriptors
in that binding that are dynamically used must have been populated before
the descriptor set is consumed.
For descriptor set bindings created without the
VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT
bit set, all descriptors
in that binding that are statically used must have been populated before
the descriptor set is consumed.
Descriptor bindings with descriptor type of
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
need not be populated
before the descriptor set is consumed.
Entries that are not used by a pipeline can have uninitialized descriptors
or descriptors of resources that have been destroyed, and executing a draw
or dispatch with such a descriptor set bound does not cause undefined
behavior.
This means applications need not populate unused entries with dummy
descriptors.
If a call to vkAllocateDescriptorSets
would cause the total number of
descriptor sets allocated from the pool to exceed the value of
VkDescriptorPoolCreateInfo::maxSets
used to create
pAllocateInfo
→descriptorPool
, then the allocation may fail due
to lack of space in the descriptor pool.
Similarly, the allocation may fail due to lack of space if the call to
vkAllocateDescriptorSets
would cause the number of any given
descriptor type to exceed the sum of all the descriptorCount
members
of each element of VkDescriptorPoolCreateInfo::pPoolSizes with a type member equal to that type.
Additionally, the allocation may also fail if a call to
vkAllocateDescriptorSets
would cause the total number of inline
uniform block bindings allocated from the pool to exceed the value of
VkDescriptorPoolInlineUniformBlockCreateInfoEXT::maxInlineUniformBlockBindings
used to create the descriptor pool.
If the allocation fails due to no more space in the descriptor pool, and not
because of system or device memory exhaustion, then
VK_ERROR_OUT_OF_POOL_MEMORY
must be returned.
vkAllocateDescriptorSets
can be used to create multiple descriptor
sets.
If the creation of any of those descriptor sets fails, then the
implementation must destroy all successfully created descriptor set objects
from this command, set all entries of the pDescriptorSets
array to
VK_NULL_HANDLE and return the error.
The VkDescriptorSetAllocateInfo
structure is defined as:
typedef struct VkDescriptorSetAllocateInfo {
VkStructureType sType;
const void* pNext;
VkDescriptorPool descriptorPool;
uint32_t descriptorSetCount;
const VkDescriptorSetLayout* pSetLayouts;
} VkDescriptorSetAllocateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- descriptorPool is the pool which the sets will be allocated from.
- descriptorSetCount determines the number of descriptor sets to be allocated from the pool.
- pSetLayouts is an array of descriptor set layouts, with each member specifying how the corresponding descriptor set is allocated.
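As an informative, non-normative sketch, the following allocates two descriptor sets that share a single layout; myDevice, myDescriptorPool, and mySetLayout are assumed to have been created earlier.
const VkDescriptorSetLayout setLayouts[2] = { mySetLayout, mySetLayout };

const VkDescriptorSetAllocateInfo allocateInfo =
{
    VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO, // sType
    NULL,                                           // pNext
    myDescriptorPool,                               // descriptorPool
    2,                                              // descriptorSetCount
    setLayouts,                                     // pSetLayouts
};

VkDescriptorSet descriptorSets[2];
VkResult result = vkAllocateDescriptorSets(myDevice, &allocateInfo, descriptorSets);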
If the pNext
chain of a VkDescriptorSetAllocateInfo structure
includes a VkDescriptorSetVariableDescriptorCountAllocateInfoEXT
structure, then that structure includes an array of descriptor counts for
variable descriptor count bindings, one for each descriptor set being
allocated.
The VkDescriptorSetVariableDescriptorCountAllocateInfoEXT
structure is
defined as:
typedef struct VkDescriptorSetVariableDescriptorCountAllocateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t descriptorSetCount;
const uint32_t* pDescriptorCounts;
} VkDescriptorSetVariableDescriptorCountAllocateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- descriptorSetCount is zero or the number of elements in pDescriptorCounts.
- pDescriptorCounts is an array of descriptor counts, with each member specifying the number of descriptors in a variable descriptor count binding in the corresponding descriptor set being allocated.
If descriptorSetCount
is zero or this structure is not included in the
pNext
chain, then the variable lengths are considered to be zero.
Otherwise, pDescriptorCounts
[i] is the number of descriptors in the
variable count descriptor binding in the corresponding descriptor set
layout.
If the variable count descriptor binding in the corresponding descriptor set
layout has a descriptor type of
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
then
pDescriptorCounts
[i] specifies the binding’s capacity in bytes.
If VkDescriptorSetAllocateInfo::pSetLayouts
[i] does not include
a variable count descriptor binding, then pDescriptorCounts
[i] is
ignored.
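As an informative, non-normative sketch, the following requests 16 descriptors for a variable descriptor count binding when allocating a single set; myVariableCountSetLayout is assumed to be a layout whose last binding was created with VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT, and myDevice and myDescriptorPool are assumed to exist.
const uint32_t variableDescriptorCount = 16;

const VkDescriptorSetVariableDescriptorCountAllocateInfoEXT variableCountInfo =
{
    VK_STRUCTURE_TYPE_DESCRIPTOR_SET_VARIABLE_DESCRIPTOR_COUNT_ALLOCATE_INFO_EXT, // sType
    NULL,                                                                         // pNext
    1,                                                                            // descriptorSetCount
    &variableDescriptorCount,                                                     // pDescriptorCounts
};

const VkDescriptorSetAllocateInfo allocateInfo =
{
    VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO, // sType
    &variableCountInfo,                             // pNext
    myDescriptorPool,                               // descriptorPool
    1,                                              // descriptorSetCount
    &myVariableCountSetLayout,                      // pSetLayouts
};

VkDescriptorSet descriptorSet;
VkResult result = vkAllocateDescriptorSets(myDevice, &allocateInfo, &descriptorSet);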
To free allocated descriptor sets, call:
VkResult vkFreeDescriptorSets(
VkDevice device,
VkDescriptorPool descriptorPool,
uint32_t descriptorSetCount,
const VkDescriptorSet* pDescriptorSets);
- device is the logical device that owns the descriptor pool.
- descriptorPool is the descriptor pool from which the descriptor sets were allocated.
- descriptorSetCount is the number of elements in the pDescriptorSets array.
- pDescriptorSets is an array of handles to VkDescriptorSet objects.
After a successful call to vkFreeDescriptorSets
, all descriptor sets
in pDescriptorSets
are invalid.
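As an informative, non-normative sketch, two descriptor sets allocated as in the earlier example could be returned individually, assuming the pool was created with VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT:
// descriptorSets[0] and descriptorSets[1] become invalid after this call
VkResult result = vkFreeDescriptorSets(myDevice, myDescriptorPool, 2, descriptorSets);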
To return all descriptor sets allocated from a given pool to the pool, rather than freeing individual descriptor sets, call:
VkResult vkResetDescriptorPool(
VkDevice device,
VkDescriptorPool descriptorPool,
VkDescriptorPoolResetFlags flags);
- device is the logical device that owns the descriptor pool.
- descriptorPool is the descriptor pool to be reset.
- flags is reserved for future use.
Resetting a descriptor pool recycles all of the resources from all of the descriptor sets allocated from the descriptor pool back to the descriptor pool, and the descriptor sets are implicitly freed.
typedef VkFlags VkDescriptorPoolResetFlags;
VkDescriptorPoolResetFlags
is a bitmask type for setting a mask, but
is currently reserved for future use.
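As an informative, non-normative sketch, an application that allocates transient descriptor sets each frame might reset the whole pool at the start of a frame instead of freeing sets individually:
// All sets previously allocated from myDescriptorPool become invalid here.
vkResetDescriptorPool(myDevice, myDescriptorPool, 0); // flags is reserved for future use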
13.2.4. Descriptor Set Updates
Once allocated, descriptor sets can be updated with a combination of write and copy operations. To update descriptor sets, call:
void vkUpdateDescriptorSets(
VkDevice device,
uint32_t descriptorWriteCount,
const VkWriteDescriptorSet* pDescriptorWrites,
uint32_t descriptorCopyCount,
const VkCopyDescriptorSet* pDescriptorCopies);
- device is the logical device that updates the descriptor sets.
- descriptorWriteCount is the number of elements in the pDescriptorWrites array.
- pDescriptorWrites is a pointer to an array of VkWriteDescriptorSet structures describing the descriptor sets to write to.
- descriptorCopyCount is the number of elements in the pDescriptorCopies array.
- pDescriptorCopies is a pointer to an array of VkCopyDescriptorSet structures describing the descriptor sets to copy between.
The operations described by pDescriptorWrites
are performed first,
followed by the operations described by pDescriptorCopies
.
Within each array, the operations are performed in the order they appear in
the array.
Each element in the pDescriptorWrites
array describes an operation
updating the descriptor set using descriptors for resources specified in the
structure.
Each element in the pDescriptorCopies
array is a
VkCopyDescriptorSet structure describing an operation copying
descriptors between sets.
If the dstSet
member of any element of pDescriptorWrites
or
pDescriptorCopies
is bound, accessed, or modified by any command that
was recorded to a command buffer which is currently in the
recording or executable state,
and any of the descriptor bindings that are updated were not created with
the VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT
or
VK_DESCRIPTOR_BINDING_UPDATE_UNUSED_WHILE_PENDING_BIT_EXT
bits set,
that command buffer becomes invalid.
The VkWriteDescriptorSet
structure is defined as:
typedef struct VkWriteDescriptorSet {
VkStructureType sType;
const void* pNext;
VkDescriptorSet dstSet;
uint32_t dstBinding;
uint32_t dstArrayElement;
uint32_t descriptorCount;
VkDescriptorType descriptorType;
const VkDescriptorImageInfo* pImageInfo;
const VkDescriptorBufferInfo* pBufferInfo;
const VkBufferView* pTexelBufferView;
} VkWriteDescriptorSet;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- dstSet is the destination descriptor set to update.
- dstBinding is the descriptor binding within that set.
- dstArrayElement is the starting element in that array. If the descriptor binding identified by dstSet and dstBinding has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then dstArrayElement specifies the starting byte offset within the binding.
- descriptorCount is the number of descriptors to update (the number of elements in pImageInfo, pBufferInfo, or pTexelBufferView, or a value matching the dataSize member of an instance of VkWriteDescriptorSetInlineUniformBlockEXT in the pNext chain, or a value matching the accelerationStructureCount of an instance of VkWriteDescriptorSetAccelerationStructureNV in the pNext chain). If the descriptor binding identified by dstSet and dstBinding has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then descriptorCount specifies the number of bytes to update.
- descriptorType is a VkDescriptorType specifying the type of each descriptor in pImageInfo, pBufferInfo, or pTexelBufferView, as described below. It must be the same type as that specified in VkDescriptorSetLayoutBinding for dstSet at dstBinding. The type of the descriptor also controls which array the descriptors are taken from.
- pImageInfo points to an array of VkDescriptorImageInfo structures or is ignored, as described below.
- pBufferInfo points to an array of VkDescriptorBufferInfo structures or is ignored, as described below.
- pTexelBufferView points to an array of VkBufferView handles as described in the Buffer Views section or is ignored, as described below.
Only one of pImageInfo
, pBufferInfo
, or pTexelBufferView
members is used according to the descriptor type specified in the
descriptorType
member of the containing VkWriteDescriptorSet
structure,
or none of them in case descriptorType
is
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
, in which case the source
data for the descriptor writes is taken from the instance of
VkWriteDescriptorSetInlineUniformBlockEXT in the pNext
chain of
VkWriteDescriptorSet
,
or if descriptorType
is
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV
, in which case the source
data for the descriptor writes is taken from the instance of
VkWriteDescriptorSetAccelerationStructureNV in the pNext
chain
of VkWriteDescriptorSet
,
as specified below.
If the dstBinding
has fewer than descriptorCount
array elements
remaining starting from dstArrayElement
, then the remainder will be
used to update the subsequent binding - dstBinding
+1 starting at
array element zero.
If a binding has a descriptorCount
of zero, it is skipped.
This behavior applies recursively, with the update affecting consecutive
bindings as needed to update all descriptorCount
descriptors.
Note
The same behavior applies to bindings with a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT, where descriptorCount specifies a number of bytes and dstArrayElement specifies a starting byte offset; if the sum of dstArrayElement and descriptorCount exceeds the byte size of the binding, the remainder is used to update the subsequent binding starting at byte offset zero.
The type of descriptors in a descriptor set is specified by
VkWriteDescriptorSet::descriptorType
, which must be one of the
values:
typedef enum VkDescriptorType {
VK_DESCRIPTOR_TYPE_SAMPLER = 0,
VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER = 1,
VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE = 2,
VK_DESCRIPTOR_TYPE_STORAGE_IMAGE = 3,
VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER = 4,
VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER = 5,
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER = 6,
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER = 7,
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC = 8,
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC = 9,
VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT = 10,
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT = 1000138000,
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV = 1000165000,
} VkDescriptorType;
- VK_DESCRIPTOR_TYPE_SAMPLER specifies a sampler descriptor.
- VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER specifies a combined image sampler descriptor.
- VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE specifies a sampled image descriptor.
- VK_DESCRIPTOR_TYPE_STORAGE_IMAGE specifies a storage image descriptor.
- VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER specifies a uniform texel buffer descriptor.
- VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER specifies a storage texel buffer descriptor.
- VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER specifies a uniform buffer descriptor.
- VK_DESCRIPTOR_TYPE_STORAGE_BUFFER specifies a storage buffer descriptor.
- VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC specifies a dynamic uniform buffer descriptor.
- VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC specifies a dynamic storage buffer descriptor.
- VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT specifies an input attachment descriptor.
- VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT specifies an inline uniform block.
- VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV specifies an acceleration structure descriptor.
When a descriptor set is updated via elements of VkWriteDescriptorSet,
members of pImageInfo
, pBufferInfo
and pTexelBufferView
are only accessed by the implementation when they correspond to the descriptor
type being defined; otherwise they are ignored.
The members accessed are as follows for each descriptor type:
- For VK_DESCRIPTOR_TYPE_SAMPLER, only the sampler member of each element of VkWriteDescriptorSet::pImageInfo is accessed.
- For VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT, only the imageView and imageLayout members of each element of VkWriteDescriptorSet::pImageInfo are accessed.
- For VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, all members of each element of VkWriteDescriptorSet::pImageInfo are accessed.
- For VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC, or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC, all members of each element of VkWriteDescriptorSet::pBufferInfo are accessed.
- For VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER, each element of VkWriteDescriptorSet::pTexelBufferView is accessed.
When updating descriptors with a descriptorType
of
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
, none of the
pImageInfo
, pBufferInfo
, or pTexelBufferView
members are
accessed; instead, the source data of the descriptor update operation is
taken from the instance of VkWriteDescriptorSetInlineUniformBlockEXT
in the pNext
chain of VkWriteDescriptorSet
.
When updating descriptors with a descriptorType
of
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV
, none of the
pImageInfo
, pBufferInfo
, or pTexelBufferView
members are
accessed; instead, the source data of the descriptor update operation is
taken from the instance of VkWriteDescriptorSetAccelerationStructureNV
in the pNext
chain of VkWriteDescriptorSet
.
The VkDescriptorBufferInfo
structure is defined as:
typedef struct VkDescriptorBufferInfo {
VkBuffer buffer;
VkDeviceSize offset;
VkDeviceSize range;
} VkDescriptorBufferInfo;
- buffer is the buffer resource.
- offset is the offset in bytes from the start of buffer. Access to buffer memory via this descriptor uses addressing that is relative to this starting offset.
- range is the size in bytes that is used for this descriptor update, or VK_WHOLE_SIZE to use the range from offset to the end of the buffer.
Note
When setting range to VK_WHOLE_SIZE, the effective range must not be larger than the maximum range for the descriptor type (maxUniformBufferRange or maxStorageBufferRange).
For VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC
and
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC
descriptor types,
offset
is the base offset from which the dynamic offset is applied and
range
is the static size used for all dynamic offsets.
The VkDescriptorImageInfo
structure is defined as:
typedef struct VkDescriptorImageInfo {
VkSampler sampler;
VkImageView imageView;
VkImageLayout imageLayout;
} VkDescriptorImageInfo;
- sampler is a sampler handle, and is used in descriptor updates for types VK_DESCRIPTOR_TYPE_SAMPLER and VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER if the binding being updated does not use immutable samplers.
- imageView is an image view handle, and is used in descriptor updates for types VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, and VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT.
- imageLayout is the layout that the image subresources accessible from imageView will be in at the time this descriptor is accessed. imageLayout is used in descriptor updates for types VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, and VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT.
Members of VkDescriptorImageInfo
that are not used in an update (as
described above) are ignored.
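As an informative, non-normative sketch, the following writes one uniform buffer descriptor and one combined image sampler descriptor into a descriptor set with a single call; myUniformBuffer, mySampler, myImageView, and myDescriptorSet are assumed to have been created earlier, and bindings 0 and 1 are assumed to have the matching descriptor types in the set's layout.
const VkDescriptorBufferInfo bufferInfo =
{
    myUniformBuffer, // buffer
    0,               // offset
    VK_WHOLE_SIZE,   // range
};

const VkDescriptorImageInfo imageInfo =
{
    mySampler,                                // sampler
    myImageView,                              // imageView
    VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL, // imageLayout
};

const VkWriteDescriptorSet writes[2] =
{
    {
        VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET, // sType
        NULL,                                   // pNext
        myDescriptorSet,                        // dstSet
        0,                                      // dstBinding
        0,                                      // dstArrayElement
        1,                                      // descriptorCount
        VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,      // descriptorType
        NULL,                                   // pImageInfo, ignored for this type
        &bufferInfo,                            // pBufferInfo
        NULL,                                   // pTexelBufferView, ignored for this type
    },
    {
        VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,    // sType
        NULL,                                      // pNext
        myDescriptorSet,                           // dstSet
        1,                                         // dstBinding
        0,                                         // dstArrayElement
        1,                                         // descriptorCount
        VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, // descriptorType
        &imageInfo,                                // pImageInfo
        NULL,                                      // pBufferInfo, ignored for this type
        NULL,                                      // pTexelBufferView, ignored for this type
    },
};

vkUpdateDescriptorSets(myDevice, 2, writes, 0, NULL);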
If the descriptorType
member of VkWriteDescriptorSet is
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
then the data to write to
the descriptor set is specified through an instance of
VkWriteDescriptorSetInlineUniformBlockEXT
chained to the pNext
chain of VkWriteDescriptorSet
.
The VkWriteDescriptorSetInlineUniformBlockEXT
structure is defined as:
typedef struct VkWriteDescriptorSetInlineUniformBlockEXT {
VkStructureType sType;
const void* pNext;
uint32_t dataSize;
const void* pData;
} VkWriteDescriptorSetInlineUniformBlockEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- dataSize is the number of bytes of inline uniform block data pointed to by pData.
- pData is a pointer to dataSize number of bytes of data to write to the inline uniform block.
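As an informative, non-normative sketch, 16 bytes of data could be written at byte offset 0 of an inline uniform block binding (binding 3 is assumed here) as follows:
const float myBlockData[4] = { 0.0f, 1.0f, 2.0f, 3.0f }; // 16 bytes of inline data

const VkWriteDescriptorSetInlineUniformBlockEXT inlineUniformBlockWrite =
{
    VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_INLINE_UNIFORM_BLOCK_EXT, // sType
    NULL,                                                            // pNext
    sizeof(myBlockData),                                             // dataSize
    myBlockData,                                                     // pData
};

const VkWriteDescriptorSet write =
{
    VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,      // sType
    &inlineUniformBlockWrite,                    // pNext
    myDescriptorSet,                             // dstSet
    3,                                           // dstBinding
    0,                                           // dstArrayElement, a byte offset for this type
    sizeof(myBlockData),                         // descriptorCount, a byte count for this type
    VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT, // descriptorType
    NULL,                                        // pImageInfo, not accessed
    NULL,                                        // pBufferInfo, not accessed
    NULL,                                        // pTexelBufferView, not accessed
};

vkUpdateDescriptorSets(myDevice, 1, &write, 0, NULL);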
13.2.5. Acceleration Structure
An acceleration structure
(VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV
) is a descriptor type
that is used to retrieve scene geometry from within shaders bound to ray
tracing pipelines.
Shaders have read-only access to the memory.
The VkWriteDescriptorSetAccelerationStructureNV
structure is defined
as:
typedef struct VkWriteDescriptorSetAccelerationStructureNV {
VkStructureType sType;
const void* pNext;
uint32_t accelerationStructureCount;
const VkAccelerationStructureNV* pAccelerationStructures;
} VkWriteDescriptorSetAccelerationStructureNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- accelerationStructureCount is the number of elements in pAccelerationStructures.
- pAccelerationStructures are the acceleration structures to update.
The VkCopyDescriptorSet
structure is defined as:
typedef struct VkCopyDescriptorSet {
VkStructureType sType;
const void* pNext;
VkDescriptorSet srcSet;
uint32_t srcBinding;
uint32_t srcArrayElement;
VkDescriptorSet dstSet;
uint32_t dstBinding;
uint32_t dstArrayElement;
uint32_t descriptorCount;
} VkCopyDescriptorSet;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- srcSet, srcBinding, and srcArrayElement are the source set, binding, and array element, respectively. If the descriptor binding identified by srcSet and srcBinding has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then srcArrayElement specifies the starting byte offset within the binding to copy from.
- dstSet, dstBinding, and dstArrayElement are the destination set, binding, and array element, respectively. If the descriptor binding identified by dstSet and dstBinding has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then dstArrayElement specifies the starting byte offset within the binding to copy to.
- descriptorCount is the number of descriptors to copy from the source to destination. If descriptorCount is greater than the number of remaining array elements in the source or destination binding, those affect consecutive bindings in a manner similar to VkWriteDescriptorSet above. If the descriptor binding identified by srcSet and srcBinding has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then descriptorCount specifies the number of bytes to copy and the remaining array elements in the source or destination binding refer to the remaining number of bytes in those.
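As an informative, non-normative sketch, the following copies three descriptors from binding 2 of one descriptor set to binding 2 of another; srcDescriptorSet and dstDescriptorSet are assumed to have compatible layouts for that binding.
const VkCopyDescriptorSet copy =
{
    VK_STRUCTURE_TYPE_COPY_DESCRIPTOR_SET, // sType
    NULL,                                  // pNext
    srcDescriptorSet,                      // srcSet
    2,                                     // srcBinding
    0,                                     // srcArrayElement
    dstDescriptorSet,                      // dstSet
    2,                                     // dstBinding
    0,                                     // dstArrayElement
    3,                                     // descriptorCount
};

vkUpdateDescriptorSets(myDevice, 0, NULL, 1, &copy);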
13.2.6. Descriptor Update Templates
A descriptor update template specifies a mapping from descriptor update information in host memory to descriptors in a descriptor set. It is designed to avoid passing redundant information to the driver when frequently updating the same set of descriptors in descriptor sets.
Descriptor update template objects are represented by
VkDescriptorUpdateTemplate
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorUpdateTemplate)
or the equivalent
typedef VkDescriptorUpdateTemplate VkDescriptorUpdateTemplateKHR;
13.2.7. Descriptor Set Updates with Templates
Updating a large VkDescriptorSet
array can be an expensive operation
since an application must specify one VkWriteDescriptorSet structure
for each descriptor or descriptor array to update, each of which
re-specifies the same state when updating the same descriptor in multiple
descriptor sets.
For cases when an application wishes to update the same set of descriptors
in multiple descriptor sets allocated using the same
VkDescriptorSetLayout
, vkUpdateDescriptorSetWithTemplate can be
used as a replacement for vkUpdateDescriptorSets.
VkDescriptorUpdateTemplate
allows implementations to convert a set of
descriptor update operations on a single descriptor set to an internal
format that, in conjunction with vkUpdateDescriptorSetWithTemplate
or vkCmdPushDescriptorSetWithTemplateKHR
, can be more efficient compared to calling vkUpdateDescriptorSets
or vkCmdPushDescriptorSetKHR
.
The descriptors themselves are not specified in the
VkDescriptorUpdateTemplate
, rather, offsets into an application
provided pointer to host memory are specified, which are combined with a
pointer passed to vkUpdateDescriptorSetWithTemplate
or vkCmdPushDescriptorSetWithTemplateKHR
.
This allows large batches of updates to be executed without having to
convert application data structures into a strictly-defined Vulkan data
structure.
To create a descriptor update template, call:
VkResult vkCreateDescriptorUpdateTemplate(
VkDevice device,
const VkDescriptorUpdateTemplateCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDescriptorUpdateTemplate* pDescriptorUpdateTemplate);
or the equivalent command
VkResult vkCreateDescriptorUpdateTemplateKHR(
VkDevice device,
const VkDescriptorUpdateTemplateCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDescriptorUpdateTemplate* pDescriptorUpdateTemplate);
- device is the logical device that creates the descriptor update template.
- pCreateInfo is a pointer to an instance of the VkDescriptorUpdateTemplateCreateInfo structure specifying the set of descriptors to update with a single call to vkCmdPushDescriptorSetWithTemplateKHR or vkUpdateDescriptorSetWithTemplate.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pDescriptorUpdateTemplate points to a VkDescriptorUpdateTemplate handle in which the resulting descriptor update template object is returned.
The VkDescriptorUpdateTemplateCreateInfo structure is defined as:
typedef struct VkDescriptorUpdateTemplateCreateInfo {
VkStructureType sType;
const void* pNext;
VkDescriptorUpdateTemplateCreateFlags flags;
uint32_t descriptorUpdateEntryCount;
const VkDescriptorUpdateTemplateEntry* pDescriptorUpdateEntries;
VkDescriptorUpdateTemplateType templateType;
VkDescriptorSetLayout descriptorSetLayout;
VkPipelineBindPoint pipelineBindPoint;
VkPipelineLayout pipelineLayout;
uint32_t set;
} VkDescriptorUpdateTemplateCreateInfo;
or the equivalent
typedef VkDescriptorUpdateTemplateCreateInfo VkDescriptorUpdateTemplateCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- descriptorUpdateEntryCount is the number of elements in the pDescriptorUpdateEntries array.
- pDescriptorUpdateEntries is a pointer to an array of VkDescriptorUpdateTemplateEntry structures describing the descriptors to be updated by the descriptor update template.
- templateType specifies the type of the descriptor update template. If set to VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET it can only be used to update descriptor sets with a fixed descriptorSetLayout. If set to VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR it can only be used to push descriptor sets using the provided pipelineBindPoint, pipelineLayout, and set number.
- descriptorSetLayout is the descriptor set layout used to build the descriptor update template. All descriptor sets which are going to be updated through the newly created descriptor update template must be created with a layout that matches (is the same as, or defined identically to) this layout. This parameter is ignored if templateType is not VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET.
- pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. This parameter is ignored if templateType is not VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR.
- pipelineLayout is a VkPipelineLayout object used to program the bindings. This parameter is ignored if templateType is not VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR.
- set is the set number of the descriptor set in the pipeline layout that will be updated. This parameter is ignored if templateType is not VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR.
typedef VkFlags VkDescriptorUpdateTemplateCreateFlags;
or the equivalent
typedef VkDescriptorUpdateTemplateCreateFlags VkDescriptorUpdateTemplateCreateFlagsKHR;
VkDescriptorUpdateTemplateCreateFlags
is a bitmask type for setting a
mask, but is currently reserved for future use.
The descriptor update template type is determined by the
VkDescriptorUpdateTemplateCreateInfo::templateType
property,
which takes the following values:
typedef enum VkDescriptorUpdateTemplateType {
VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET = 0,
VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR = 1,
VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET,
} VkDescriptorUpdateTemplateType;
or the equivalent
typedef VkDescriptorUpdateTemplateType VkDescriptorUpdateTemplateTypeKHR;
- VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET specifies that the descriptor update template will be used for descriptor set updates only.
- VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR specifies that the descriptor update template will be used for push descriptor updates only.
The VkDescriptorUpdateTemplateEntry
structure is defined as:
typedef struct VkDescriptorUpdateTemplateEntry {
uint32_t dstBinding;
uint32_t dstArrayElement;
uint32_t descriptorCount;
VkDescriptorType descriptorType;
size_t offset;
size_t stride;
} VkDescriptorUpdateTemplateEntry;
or the equivalent
typedef VkDescriptorUpdateTemplateEntry VkDescriptorUpdateTemplateEntryKHR;
- dstBinding is the descriptor binding to update when using this descriptor update template.
- dstArrayElement is the starting element in the array belonging to dstBinding. If the descriptor binding identified by dstBinding has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then dstArrayElement specifies the starting byte offset to update.
- descriptorCount is the number of descriptors to update. If descriptorCount is greater than the number of remaining array elements in the destination binding, those affect consecutive bindings in a manner similar to VkWriteDescriptorSet above. If the descriptor binding identified by dstBinding has a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then descriptorCount specifies the number of bytes to update and the remaining array elements in the destination binding refer to the remaining number of bytes in it.
- descriptorType is a VkDescriptorType specifying the type of the descriptor.
- offset is the offset in bytes of the first binding in the raw data structure.
- stride is the stride in bytes between two consecutive array elements of the descriptor update information in the raw data structure. The actual pointer ptr for each array element j of update entry i is computed using the following formula: const char *ptr = (const char *)pData + pDescriptorUpdateEntries[i].offset + j * pDescriptorUpdateEntries[i].stride. The stride is useful in case the bindings are stored in structs along with other data. If descriptorType is VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT then the value of stride is ignored and the stride is assumed to be 1, i.e. the descriptor update information for them is always specified as a contiguous range.
To destroy a descriptor update template, call:
void vkDestroyDescriptorUpdateTemplate(
VkDevice device,
VkDescriptorUpdateTemplate descriptorUpdateTemplate,
const VkAllocationCallbacks* pAllocator);
or the equivalent command
void vkDestroyDescriptorUpdateTemplateKHR(
VkDevice device,
VkDescriptorUpdateTemplate descriptorUpdateTemplate,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that has been used to create the descriptor update template.
- descriptorUpdateTemplate is the descriptor update template to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
Once a VkDescriptorUpdateTemplate
has been created, descriptor sets
can be updated by calling:
void vkUpdateDescriptorSetWithTemplate(
VkDevice device,
VkDescriptorSet descriptorSet,
VkDescriptorUpdateTemplate descriptorUpdateTemplate,
const void* pData);
or the equivalent command
void vkUpdateDescriptorSetWithTemplateKHR(
VkDevice device,
VkDescriptorSet descriptorSet,
VkDescriptorUpdateTemplate descriptorUpdateTemplate,
const void* pData);
- device is the logical device that updates the descriptor sets.
- descriptorSet is the descriptor set to update.
- descriptorUpdateTemplate is the VkDescriptorUpdateTemplate which specifies the update mapping between pData and the descriptor set to update.
- pData is a pointer to memory which contains one or more structures of VkDescriptorImageInfo, VkDescriptorBufferInfo, or VkBufferView used to write the descriptors.
struct AppBufferView {
VkBufferView bufferView;
uint32_t applicationRelatedInformation;
};
struct AppDataStructure
{
VkDescriptorImageInfo imageInfo; // a single image info
VkDescriptorBufferInfo bufferInfoArray[3]; // 3 buffer infos in an array
AppBufferView bufferView[2]; // An application defined structure containing a bufferView
// ... some more application related data
};
const VkDescriptorUpdateTemplateEntry descriptorUpdateTemplateEntries[] =
{
// binding to a single image descriptor
{
0, // binding
0, // dstArrayElement
1, // descriptorCount
VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, // descriptorType
offsetof(AppDataStructure, imageInfo), // offset
0 // stride is not required if descriptorCount is 1
},
// binding to an array of buffer descriptors
{
1, // binding
0, // dstArrayElement
3, // descriptorCount
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, // descriptorType
offsetof(AppDataStructure, bufferInfoArray), // offset
sizeof(VkDescriptorBufferInfo) // stride, descriptor buffer infos are compact
},
// binding to an array of buffer views
{
2, // binding
0, // dstArrayElement
2, // descriptorCount
VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER, // descriptorType
offsetof(AppDataStructure, bufferView) +
offsetof(AppBufferView, bufferView), // offset
sizeof(AppBufferView) // stride, bufferViews do not have to be compact
},
};
// create a descriptor update template for descriptor set updates
const VkDescriptorUpdateTemplateCreateInfo createInfo =
{
VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO, // sType
NULL, // pNext
0, // flags
3, // descriptorUpdateEntryCount
descriptorUpdateTemplateEntries, // pDescriptorUpdateEntries
VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET, // templateType
myLayout, // descriptorSetLayout
0, // pipelineBindPoint, ignored by given templateType
0, // pipelineLayout, ignored by given templateType
0, // set, ignored by given templateType
};
VkDescriptorUpdateTemplate myDescriptorUpdateTemplate;
VkResult myResult = vkCreateDescriptorUpdateTemplate(
    myDevice,
    &createInfo,
    NULL,
    &myDescriptorUpdateTemplate);
AppDataStructure appData;
// fill appData here or cache it in your engine
vkUpdateDescriptorSetWithTemplate(myDevice, myDescriptorSet, myDescriptorUpdateTemplate, &appData);
13.2.8. Descriptor Set Binding
To bind one or more descriptor sets to a command buffer, call:
void vkCmdBindDescriptorSets(
VkCommandBuffer commandBuffer,
VkPipelineBindPoint pipelineBindPoint,
VkPipelineLayout layout,
uint32_t firstSet,
uint32_t descriptorSetCount,
const VkDescriptorSet* pDescriptorSets,
uint32_t dynamicOffsetCount,
const uint32_t* pDynamicOffsets);
- commandBuffer is the command buffer that the descriptor sets will be bound to.
- pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other.
- layout is a VkPipelineLayout object used to program the bindings.
- firstSet is the set number of the first descriptor set to be bound.
- descriptorSetCount is the number of elements in the pDescriptorSets array.
- pDescriptorSets is an array of handles to VkDescriptorSet objects describing the descriptor sets to write to.
- dynamicOffsetCount is the number of dynamic offsets in the pDynamicOffsets array.
- pDynamicOffsets is a pointer to an array of uint32_t values specifying dynamic offsets.
vkCmdBindDescriptorSets
causes the sets numbered [firstSet
..
firstSet
+descriptorSetCount
-1] to use the bindings stored in
pDescriptorSets
[0..descriptorSetCount
-1] for subsequent
rendering commands (either compute or graphics, according to the
pipelineBindPoint
).
Any bindings that were previously applied via these sets are no longer
valid.
Once bound, a descriptor set affects rendering of subsequent graphics or compute commands in the command buffer until a different set is bound to the same set number, or else until the set is disturbed as described in Pipeline Layout Compatibility.
A compatible descriptor set must be bound for all set numbers that any shaders in a pipeline access, at the time that a draw or dispatch command is recorded to execute using that pipeline. However, if none of the shaders in a pipeline statically use any bindings with a particular set number, then no descriptor set need be bound for that set number, even if the pipeline layout includes a non-trivial descriptor set layout for that set number.
If any of the sets being bound include dynamic uniform or storage buffers,
then pDynamicOffsets
includes one element for each array element in
each dynamic descriptor type binding in each set.
Values are taken from pDynamicOffsets
in an order such that all
entries for set N come before set N+1; within a set, entries are ordered by
the binding numbers in the descriptor set layouts; and within a binding
array, elements are in order.
dynamicOffsetCount
must equal the total number of dynamic descriptors
in the sets being bound.
The effective offset used for dynamic uniform and storage buffer bindings is
the sum of the relative offset taken from pDynamicOffsets
, and the
base address of the buffer plus base offset in the descriptor set.
The range of the dynamic uniform and storage buffer bindings is the buffer
range as specified in the descriptor set.
Each of the pDescriptorSets
must be compatible with the pipeline
layout specified by layout
.
The layout used to program the bindings must also be compatible with the
pipeline used in subsequent graphics or compute commands, as defined in the
Pipeline Layout Compatibility section.
The descriptor set contents bound by a call to vkCmdBindDescriptorSets
may be consumed at the following times:
- For descriptor bindings created with the VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT bit set, the contents may be consumed when the command buffer is submitted to a queue, or during shader execution of the resulting draws and dispatches, or any time in between.
- Otherwise, during host execution of the command, or during shader execution of the resulting draws and dispatches, or any time in between.
Thus, the contents of a descriptor set binding must not be altered (overwritten by an update command, or freed) between the first point in time that it may be consumed, and when the command completes executing on the queue.
The contents of pDynamicOffsets
are consumed immediately during
execution of vkCmdBindDescriptorSets
.
Once all pending uses have completed, it is legal to update and reuse a
descriptor set.
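As an informative, non-normative sketch, a set containing a single dynamic uniform buffer binding could be bound together with its dynamic offset as follows; myCmdBuffer, myPipelineLayout, and myDescriptorSet are assumed to exist, and the offset of 256 bytes is assumed to satisfy minUniformBufferOffsetAlignment.
const uint32_t dynamicOffset = 256; // added to the base offset stored in the descriptor

vkCmdBindDescriptorSets(
    myCmdBuffer,
    VK_PIPELINE_BIND_POINT_GRAPHICS, // pipelineBindPoint
    myPipelineLayout,                // layout
    0,                               // firstSet
    1,                               // descriptorSetCount
    &myDescriptorSet,                // pDescriptorSets
    1,                               // dynamicOffsetCount
    &dynamicOffset);                 // pDynamicOffsets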
13.2.9. Push Descriptor Updates
In addition to allocating descriptor sets and binding them to a command buffer, an application can record descriptor updates into the command buffer.
To push descriptor updates into a command buffer, call:
void vkCmdPushDescriptorSetKHR(
VkCommandBuffer commandBuffer,
VkPipelineBindPoint pipelineBindPoint,
VkPipelineLayout layout,
uint32_t set,
uint32_t descriptorWriteCount,
const VkWriteDescriptorSet* pDescriptorWrites);
- commandBuffer is the command buffer that the descriptors will be recorded in.
- pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of push descriptor bindings for each of graphics and compute, so binding one does not disturb the other.
- layout is a VkPipelineLayout object used to program the bindings.
- set is the set number of the descriptor set in the pipeline layout that will be updated.
- descriptorWriteCount is the number of elements in the pDescriptorWrites array.
- pDescriptorWrites is a pointer to an array of VkWriteDescriptorSet structures describing the descriptors to be updated.
Push descriptors are a small bank of descriptors whose storage is internally managed by the command buffer rather than being written into a descriptor set and later bound to a command buffer. Push descriptors allow for incremental updates of descriptors without managing the lifetime of descriptor sets.
When a command buffer begins recording, all push descriptors have undefined
contents.
Push descriptors can be updated incrementally and cause shaders to use the
updated descriptors for subsequent rendering commands (either compute or
graphics, according to the pipelineBindPoint
) until the descriptor is
overwritten, or else until the set is disturbed as described in
Pipeline Layout Compatibility.
When the set is disturbed or push descriptors with a different descriptor
set layout are set, all push descriptors become invalid.
Valid descriptors must be pushed for all bindings that any shaders in a pipeline access, at the time that a draw or dispatch command is recorded to execute using that pipeline. This includes immutable sampler descriptors, which must be pushed before they are accessed by a pipeline. However, if none of the shaders in a pipeline statically use certain bindings in the push descriptor set, then those descriptors need not be valid.
Push descriptors do not use dynamic offsets.
Instead, the corresponding non-dynamic descriptor types can be used and the
offset
member of VkDescriptorBufferInfo can be changed each
time the descriptor is written.
Each element of pDescriptorWrites
is interpreted as in
VkWriteDescriptorSet, except the dstSet
member is ignored.
To push an immutable sampler, use a VkWriteDescriptorSet with
dstBinding
and dstArrayElement
selecting the immutable sampler’s
binding.
If the descriptor type is VK_DESCRIPTOR_TYPE_SAMPLER
, the
pImageInfo
parameter is ignored and the immutable sampler is taken
from the push descriptor set layout in the pipeline layout.
If the descriptor type is VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER
,
the sampler
member of the pImageInfo
parameter is ignored and
the immutable sampler is taken from the push descriptor set layout in the
pipeline layout.
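As an informative, non-normative sketch, a single uniform buffer descriptor could be pushed for binding 0 of set 0; set 0 of myPipelineLayout is assumed to have been created from a layout with VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR set.
const VkDescriptorBufferInfo bufferInfo =
{
    myUniformBuffer, // buffer
    0,               // offset
    VK_WHOLE_SIZE,   // range
};

const VkWriteDescriptorSet write =
{
    VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET, // sType
    NULL,                                   // pNext
    VK_NULL_HANDLE,                         // dstSet, ignored for push descriptors
    0,                                      // dstBinding
    0,                                      // dstArrayElement
    1,                                      // descriptorCount
    VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,      // descriptorType
    NULL,                                   // pImageInfo
    &bufferInfo,                            // pBufferInfo
    NULL,                                   // pTexelBufferView
};

vkCmdPushDescriptorSetKHR(myCmdBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
                          myPipelineLayout, 0, 1, &write);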
13.2.10. Push Descriptor Updates with Descriptor Update Templates
It is also possible to use a descriptor update template to specify the push descriptors to update. To do so, call:
void vkCmdPushDescriptorSetWithTemplateKHR(
VkCommandBuffer commandBuffer,
VkDescriptorUpdateTemplate descriptorUpdateTemplate,
VkPipelineLayout layout,
uint32_t set,
const void* pData);
- commandBuffer is the command buffer that the descriptors will be recorded in.
- descriptorUpdateTemplate is a descriptor update template which defines how to interpret the descriptor information in pData.
- layout is a VkPipelineLayout object used to program the bindings. It must be compatible with the layout used to create the descriptorUpdateTemplate handle.
- set is the set number of the descriptor set in the pipeline layout that will be updated. This must be the same number used to create the descriptorUpdateTemplate handle.
- pData is a pointer to memory which contains the descriptors for the templated update.
struct AppDataStructure
{
VkDescriptorImageInfo imageInfo; // a single image info
// ... some more application related data
};
const VkDescriptorUpdateTemplateEntry descriptorUpdateTemplateEntries[] =
{
// binding to a single image descriptor
{
0, // binding
0, // dstArrayElement
1, // descriptorCount
VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, // descriptorType
offsetof(AppDataStructure, imageInfo), // offset
0 // stride is not required if descriptorCount is 1
}
};
// create a descriptor update template for descriptor set updates
const VkDescriptorUpdateTemplateCreateInfo createInfo =
{
VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO, // sType
NULL, // pNext
0, // flags
1, // descriptorUpdateEntryCount
descriptorUpdateTemplateEntries, // pDescriptorUpdateEntries
VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR, // templateType
0, // descriptorSetLayout, ignored by given templateType
VK_PIPELINE_BIND_POINT_GRAPHICS, // pipelineBindPoint
myPipelineLayout, // pipelineLayout
0, // set
};
VkDescriptorUpdateTemplate myDescriptorUpdateTemplate;
VkResult myResult = vkCreateDescriptorUpdateTemplate(
    myDevice,
    &createInfo,
    NULL,
    &myDescriptorUpdateTemplate);
AppDataStructure appData;
// fill appData here or cache it in your engine
vkCmdPushDescriptorSetWithTemplateKHR(myCmdBuffer, myDescriptorUpdateTemplate, myPipelineLayout, 0, &appData);
13.2.11. Push Constant Updates
As described above in section Pipeline Layouts, the pipeline layout defines shader push constants which are updated via Vulkan commands rather than via writes to memory or copy commands.
Note
Push constants represent a high speed path to modify constant data in pipelines that is expected to outperform memory-backed resource updates.
The values of push constants are undefined at the start of a command buffer.
To update push constants, call:
void vkCmdPushConstants(
VkCommandBuffer commandBuffer,
VkPipelineLayout layout,
VkShaderStageFlags stageFlags,
uint32_t offset,
uint32_t size,
const void* pValues);
- commandBuffer is the command buffer in which the push constant update will be recorded.
- layout is the pipeline layout used to program the push constant updates.
- stageFlags is a bitmask of VkShaderStageFlagBits specifying the shader stages that will use the push constants in the updated range.
- offset is the start offset of the push constant range to update, in units of bytes.
- size is the size of the push constant range to update, in units of bytes.
- pValues is an array of size bytes containing the new push constant values.
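As an informative, non-normative sketch, a 16-byte push constant range declared for the vertex stage when myPipelineLayout was created could be updated as follows:
const float myColor[4] = { 1.0f, 0.0f, 0.0f, 1.0f };

vkCmdPushConstants(
    myCmdBuffer,
    myPipelineLayout,
    VK_SHADER_STAGE_VERTEX_BIT, // stageFlags, must match the range's stageFlags
    0,                          // offset
    sizeof(myColor),            // size
    myColor);                   // pValues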
14. Shader Interfaces
When a pipeline is created, the set of shaders specified in the
corresponding Vk*PipelineCreateInfo
structure are implicitly linked at
a number of different interfaces.
Interface definitions make use of the following SPIR-V decorations:
- DescriptorSet and Binding
- Location, Component, and Index
- Flat, NoPerspective, Centroid, and Sample
- Block and BufferBlock
- InputAttachmentIndex
- Offset, ArrayStride, and MatrixStride
- BuiltIn
This specification describes valid uses for Vulkan of these decorations. Any other use of one of these decorations is invalid.
14.1. Shader Input and Output Interfaces
When multiple stages are present in a pipeline, the outputs of one stage form an interface with the inputs of the next stage. When such an interface involves a shader, shader outputs are matched against the inputs of the next stage, and shader inputs are matched against the outputs of the previous stage.
There are two classes of variables that can be matched between shader stages, built-in variables and user-defined variables. Each class has a different set of matching criteria. Generally, when non-shader stages are between shader stages, the user-defined variables, and most built-in variables, form an interface between the shader stages.
The variables forming the input or output interfaces are listed as
operands to the OpEntryPoint
instruction and are declared with the
Input
or Output
storage classes, respectively, in the SPIR-V
module.
Output
variables of a shader stage have undefined values until the
shader writes to them or uses the Initializer
operand when declaring
the variable.
14.1.1. Built-in Interface Block
Shader built-in variables meeting the following requirements define the built-in interface block. They must
- be explicitly declared (there are no implicit built-ins),
- be identified with a BuiltIn decoration,
- form object types as described in the Built-in Variables section, and
- be declared in a block whose top-level members are the built-ins.
Built-ins only participate in interface matching if they are declared in
such a block.
They must not have any Location
or Component
decorations.
There must be no more than one built-in interface block per shader per interface.
14.1.2. User-defined Variable Interface
The remaining variables listed by OpEntryPoint
with the Input
or
Output
storage class form the user-defined variable interface.
By default such variables have a type with a width of 32 or 64.
If an implementation supports
storageInputOutput16,
user-defined variables in the Input
and Output
storage classes
can also have types with a width of 16.
These variables must be identified with a Location
decoration and can
also be identified with a Component
decoration.
14.1.3. Interface Matching
A user-defined output variable is considered to match an input variable in
the subsequent stage if the two variables are declared with the same
Location
and Component
decoration and match in type and
decoration, except that interpolation
decorations are not required to match.
XfbBuffer
, XfbStride
, Offset
, and Stream
are also not
required to match for the purposes of interface matching.
For the purposes of interface matching, variables declared without a
Component
decoration are considered to have a Component
decoration
of zero.
Note
Matching rules for passthrough geometry shaders are slightly different and are described in the Passthrough Interface Matching section.
Variables or block members declared as structures are considered to match in type if and only if the structure members match in type, decoration, number, and declaration order. Variables or block members declared as arrays are considered to match in type only if both declarations specify the same element type and size.
Tessellation control and mesh shader per-vertex output variables and blocks, and tessellation control, tessellation evaluation, and geometry shader per-vertex input variables and blocks are required to be declared as arrays, with each element representing input or output values for a single vertex of a multi-vertex primitive. For the purposes of interface matching, the outermost array dimension of such variables and blocks is ignored.
At an interface between two non-fragment shader stages, the built-in interface block must match exactly, as described above, except for per-view outputs as described in Mesh Shader Per-View Outputs. At an interface involving the fragment shader inputs, the presence or absence of any built-in output does not affect the interface matching.
At an interface between two shader stages, the user-defined variable interface must match exactly, as described above.
Any input value to a shader stage is well-defined as long as the preceding stage writes to a matching output, as described above.
Additionally, scalar and vector inputs are well-defined if there is a corresponding output satisfying all of the following conditions:
- the input and output match exactly in decoration,
- the output is a vector with the same basic type and has at least as many components as the input, and
- the common component type of the input and output is 16-bit integer or floating-point, or 32-bit integer or floating-point (64-bit component types are excluded).
In this case, the components of the input will be taken from the first components of the output, and any extra components of the output will be ignored.
14.1.4. Location Assignment
This section describes how many locations are consumed by a given type. As mentioned above, geometry shader inputs, tessellation control shader inputs and outputs, and tessellation evaluation inputs all have an additional level of arrayness relative to other shader inputs and outputs. This outer array level is removed from the type before considering how many locations the type consumes.
The Location
value specifies an interface slot comprised of a 32-bit
four-component vector conveyed between stages.
The Component
specifies
components within these vector
locations.
Only types with widths of
16,
32 or 64 are supported in shader interfaces.
Inputs and outputs of the following types consume a single interface location:
- 16-bit scalar and vector types, and
- 32-bit scalar and vector types, and
- 64-bit scalar and 2-component vector types.
64-bit three- and four-component vectors consume two consecutive locations.
If a declared input or output is an array of size n and each element takes m locations, it will be assigned m × n consecutive locations starting with the location specified.
If the declared input or output is an n × m 16-, 32- or 64-bit matrix, it will be assigned multiple locations starting with the location specified. The number of locations assigned for each matrix will be the same as for an n-element array of m-component vectors.
The layout of a structure type used as an Input
or Output
depends
on whether it is also a Block
(i.e. has a Block
decoration).
If it is not a Block
, then the structure type must have a
Location
decoration.
Its members are assigned consecutive locations in their declaration order,
with the first member assigned to the location specified for the structure
type.
The members, and their nested types, must not themselves have Location
decorations.
If the structure type is a Block
but without a Location
, then each
of its members must have a Location
decoration.
If it is a Block
with a Location
decoration, then its members are
assigned consecutive locations in declaration order, starting from the first
member which is initially assigned the location specified for the
Block
.
Any member with its own Location
decoration is assigned that location.
Each remaining member is assigned the location after the immediately
preceding member in declaration order.
The locations consumed by block and structure members are determined by applying the rules above in a depth-first traversal of the instantiated members as though the structure or block member were declared as an input or output variable of the same type.
Any two inputs listed as operands on the same OpEntryPoint
must not be
assigned the same location, either explicitly or implicitly.
Any two outputs listed as operands on the same OpEntryPoint
must not
be assigned the same location, either explicitly or implicitly.
The number of input and output locations available for a shader input or output interface are limited, and dependent on the shader stage as described in Shader Input and Output Locations. All variables in both the built-in interface block and the user-defined variable interface count against these limits.
Shader Interface | Locations Available
---|---
vertex input | maxVertexInputAttributes
vertex output | maxVertexOutputComponents / 4
tessellation control input | maxTessellationControlPerVertexInputComponents / 4
tessellation control output | maxTessellationControlPerVertexOutputComponents / 4
tessellation evaluation input | maxTessellationEvaluationInputComponents / 4
tessellation evaluation output | maxTessellationEvaluationOutputComponents / 4
geometry input | maxGeometryInputComponents / 4
geometry output | maxGeometryOutputComponents / 4
fragment input | maxFragmentInputComponents / 4
fragment output | maxFragmentOutputAttachments
14.1.5. Component Assignment
The Component
decoration allows the Location
to be more finely
specified for scalars and vectors, down to the individual components within
a location that are consumed.
The components within a location are 0, 1, 2, and 3.
A variable or block member starting at component N will consume components
N, N+1, N+2, …
up through its size.
For 16- and 32-bit types,
it is invalid if this sequence of components gets larger than 3.
A scalar 64-bit type will consume two of these components in sequence, and a
two-component 64-bit vector type will consume all four components available
within a location.
A three- or four-component 64-bit vector type must not specify a
Component
decoration.
A three-component 64-bit vector type will consume all four components of the
first location and components 0 and 1 of the second location.
This leaves components 2 and 3 available for other component-qualified
declarations.
A scalar or two-component 64-bit data type must not specify a
Component
decoration of 1 or 3.
A Component
decoration must not be specified for any type that is not
a scalar or vector.
14.2. Vertex Input Interface
When the vertex stage is present in a pipeline, the vertex shader input
variables form an interface with the vertex input attributes.
The vertex shader input variables are matched by the Location
and
Component
decorations to the vertex input attributes specified in the
pVertexInputState
member of the VkGraphicsPipelineCreateInfo
structure.
The vertex shader input variables listed by OpEntryPoint
with the
Input
storage class form the vertex input interface.
These variables must be identified with a Location
decoration and can
also be identified with a Component
decoration.
For the purposes of interface matching: variables declared without a
Component
decoration are considered to have a Component
decoration
of zero.
The number of available vertex input locations is given by the
maxVertexInputAttributes
member of the VkPhysicalDeviceLimits
structure.
See Attribute Location and Component Assignment for details.
All vertex shader inputs declared as above must have a corresponding attribute and binding in the pipeline.
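As an informative, non-normative sketch, a vertex shader input declared with a Location decoration of 0 and a 32-bit four-component floating-point type could be fed by the following attribute description in pVertexInputState; binding 0 is assumed to refer to a VkVertexInputBindingDescription provided alongside it.
const VkVertexInputAttributeDescription attribute =
{
    0,                             // location, matches the shader's Location decoration
    0,                             // binding
    VK_FORMAT_R32G32B32A32_SFLOAT, // format, a 32-bit four-component format
    0,                             // offset in bytes within each vertex fetched from the binding
};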
14.3. Fragment Output Interface
When the fragment stage is present in a pipeline, the fragment shader
outputs form an interface with the output attachments of the current
subpass.
The fragment shader output variables are matched by the Location
and
Component
decorations to the color attachments specified in the
pColorAttachments
array of the VkSubpassDescription structure
that describes the subpass that the fragment shader is executed in.
The fragment shader output variables listed by OpEntryPoint
with the
Output
storage class form the fragment output interface.
These variables must be identified with a Location
decoration.
They can also be identified with a Component
decoration and/or an
Index
decoration.
For the purposes of interface matching: variables declared without a
Component
decoration are considered to have a Component
decoration
of zero, and variables declared without an Index
decoration are
considered to have an Index
decoration of zero.
A fragment shader output variable identified with a Location
decoration
of i is directed to the color attachment indicated by
pColorAttachments
[i], after passing through the blending unit as
described in Blending, if enabled.
Locations are consumed as described in
Location Assignment.
The number of available fragment output locations is given by the
maxFragmentOutputAttachments
member of the
VkPhysicalDeviceLimits
structure.
Components of the output variables are assigned as described in Component Assignment. Output components identified as 0, 1, 2, and 3 will be directed to the R, G, B, and A inputs to the blending unit, respectively, or to the output attachment if blending is disabled. If two variables are placed within the same location, they must have the same underlying type (floating-point or integer). The input values to blending or color attachment writes are undefined for components which do not correspond to a fragment shader output.
Fragment outputs identified with an Index
of zero are directed to the
first input of the blending unit associated with the corresponding
Location
.
Outputs identified with an Index
of one are directed to the second
input of the corresponding blending unit.
No component aliasing of output variables is allowed, that is there must not be two output variables which have the same location, component, and index, either explicitly declared or implied.
Output values written by a fragment shader must be declared with either
OpTypeFloat
or OpTypeInt
, and a Width of 32.
If storageInputOutput16
is supported, output values written by a
fragment shader can be also declared with either OpTypeFloat
or
OpTypeInt
and a Width of 16.
Composites of these types are also permitted.
If the color attachment has a signed or unsigned normalized fixed-point
format, color values are assumed to be floating-point and are converted to
fixed-point as described in Conversion from Floating-Point to Normalized Fixed-Point; if the color
attachment has an integer format, color values are assumed to be integers
and converted to the bit-depth of the target.
Any value that cannot be represented in the attachment’s format is
undefined.
For any other attachment format no conversion is performed.
If the type of the values written by the fragment shader do not match the
format of the corresponding color attachment, the resulting values are
undefined for those components.
14.4. Fragment Input Attachment Interface
When a fragment stage is present in a pipeline, the fragment shader subpass inputs form an interface with the input attachments of the current subpass.
The fragment shader subpass input variables are matched by InputAttachmentIndex decorations to the input attachments specified in the pInputAttachments array of the VkSubpassDescription structure that describes the subpass that the fragment shader is executed in.
The fragment shader subpass input variables with the UniformConstant storage class and a decoration of InputAttachmentIndex that are statically used by OpEntryPoint form the fragment input attachment interface.
These variables must be declared with a type of OpTypeImage, a Dim operand of SubpassData, and a Sampled operand of 2.
A subpass input variable identified with an InputAttachmentIndex decoration of i reads from the input attachment indicated by the pInputAttachments[i] member of VkSubpassDescription.
If the subpass input variable is declared as an array of size N, it consumes N consecutive input attachments, starting with the index specified.
There must not be more than one input variable with the same InputAttachmentIndex, whether explicitly declared or implied by an array declaration.
The number of available input attachment indices is given by the maxPerStageDescriptorInputAttachments member of the VkPhysicalDeviceLimits structure.
Variables identified with the InputAttachmentIndex must only be used by a fragment stage.
The basic data type (floating-point, integer, unsigned integer) of the subpass input must match the basic format of the corresponding input attachment, or the values of subpass loads from these variables are undefined.
See Input Attachment for more details.
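The following sketch shows the API-side objects such a subpass input variable is matched against: a descriptor binding of type VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT and the corresponding entry in pInputAttachments. The set/binding numbers and the attachment index are illustrative.

```c
#include <vulkan/vulkan.h>

/* Descriptor binding backing a shader variable decorated with
 * InputAttachmentIndex = 0 (and, say, DescriptorSet = 0, Binding = 0). */
static const VkDescriptorSetLayoutBinding inputAttachmentBinding = {
    .binding         = 0,
    .descriptorType  = VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT,
    .descriptorCount = 1,
    .stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT, /* fragment stage only */
};

/* The subpass lists the attachment read by InputAttachmentIndex = 0 at
 * pInputAttachments[0]; "1" is an illustrative index into pAttachments. */
static const VkAttachmentReference inputRef = {
    .attachment = 1,
    .layout     = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
};

static const VkSubpassDescription subpass = {
    .pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .inputAttachmentCount = 1,
    .pInputAttachments    = &inputRef,
    /* color attachments omitted for brevity */
};
```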
14.5. Shader Resource Interface
When a shader stage accesses buffer or image resources, as described in the Resource Descriptors section, the shader resource variables must be matched with the pipeline layout that is provided at pipeline creation time.
The set of shader resources that form the shader resource interface for a stage are the variables statically used by OpEntryPoint with the storage class of Uniform, UniformConstant, or PushConstant.
For the fragment shader, this includes the fragment input attachment interface.
The shader resource interface consists of two sub-interfaces: the push constant interface and the descriptor set interface.
14.5.1. Push Constant Interface
The shader variables defined with a storage class of PushConstant that are statically used by the shader entry points for the pipeline define the push constant interface.
They must be:
- typed as OpTypeStruct,
- identified with a Block decoration, and
- laid out explicitly using the Offset, ArrayStride, and MatrixStride decorations as specified in Offset and Stride Assignment.
There must be no more than one push constant block statically used per shader entry point.
Each statically used member of a push constant block must be placed at an Offset such that the member is entirely contained within the VkPushConstantRange for each OpEntryPoint that uses it, and the stageFlags for that range must specify the appropriate VkShaderStageFlagBits for that stage.
The Offset decoration for any member of a push constant block must not cause the space required for that member to extend outside the range [0, maxPushConstantsSize).
Any member of a push constant block that is declared as an array must only be accessed with dynamically uniform indices.
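As an illustration of how a push constant block relates to the VkPushConstantRange supplied at pipeline layout creation, the following is a minimal sketch; the PushData struct, its members, and the vertex-only stage flags are hypothetical.

```c
#include <vulkan/vulkan.h>

/* Hypothetical push constant block; it must fit in [0, maxPushConstantsSize). */
typedef struct PushData {
    float scale[2];   /* Offset 0 */
    float offset[2];  /* Offset 8 */
} PushData;

/* The range must cover every member statically used by each stage that uses
 * the block, and stageFlags must include that stage. */
static const VkPushConstantRange range = {
    .stageFlags = VK_SHADER_STAGE_VERTEX_BIT,
    .offset     = 0,
    .size       = sizeof(PushData),
};

static const VkPipelineLayoutCreateInfo layoutInfo = {
    .sType                  = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
    .pushConstantRangeCount = 1,
    .pPushConstantRanges    = &range,
};

/* At record time (cmd and pipelineLayout are assumed to be valid handles). */
void record_push_constants(VkCommandBuffer cmd, VkPipelineLayout pipelineLayout)
{
    const PushData data = { { 1.0f, 1.0f }, { 0.0f, 0.0f } };
    vkCmdPushConstants(cmd, pipelineLayout, VK_SHADER_STAGE_VERTEX_BIT,
                       0, sizeof(PushData), &data);
}
```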
14.5.2. Descriptor Set Interface
The descriptor set interface comprises the shader variables with the storage class of Uniform or UniformConstant (including the variables in the fragment input attachment interface) that are statically used by the shader entry points for the pipeline.
These variables must have DescriptorSet and Binding decorations specified, which are assigned and matched with the VkDescriptorSetLayout objects in the pipeline layout as described in DescriptorSet and Binding Assignment.
Variables identified with the UniformConstant storage class are used only as handles to refer to opaque resources.
Such variables must be typed as OpTypeImage, OpTypeSampler, OpTypeSampledImage, or an array of one of these types.
The Sampled Type of an OpTypeImage declaration must match the basic data type of the corresponding resource, or the values obtained by reading or sampling from this image are undefined.
The Image Format of an OpTypeImage declaration must not be Unknown, for variables which are used for OpImageRead, OpImageSparseRead, or OpImageWrite operations, except under the following conditions:
- For OpImageWrite, if the shaderStorageImageWriteWithoutFormat feature is enabled and the shader module declares the StorageImageWriteWithoutFormat capability.
- For OpImageRead or OpImageSparseRead, if the shaderStorageImageReadWithoutFormat feature is enabled and the shader module declares the StorageImageReadWithoutFormat capability.
The Image Format of an OpTypeImage declaration must not be Unknown, for variables which are used for OpAtomic* operations.
Variables identified with the Uniform storage class are used to access transparent buffer backed resources.
Such variables must be:
- typed as OpTypeStruct, or an array of this type,
- identified with a Block or BufferBlock decoration, and
- laid out explicitly using the Offset, ArrayStride, and MatrixStride decorations as specified in Offset and Stride Assignment.
Variables identified with the StorageBuffer storage class are used to access transparent buffer backed resources.
Such variables must be:
- typed as OpTypeStruct, or an array of this type,
- identified with a Block decoration, and
- laid out explicitly using the Offset, ArrayStride, and MatrixStride decorations as specified in Offset and Stride Assignment.
The Offset decoration for any member of a Block-decorated variable in the Uniform storage class must not cause the space required for that variable to extend outside the range [0, maxUniformBufferRange).
The Offset decoration for any member of a Block-decorated variable in the StorageBuffer storage class must not cause the space required for that variable to extend outside the range [0, maxStorageBufferRange).
Variables identified with the Uniform storage class can also be used to access transparent descriptor set backed resources when the variable is assigned to a descriptor set layout binding with a descriptorType of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT.
In this case the variable must be typed as OpTypeStruct and cannot be aggregated into arrays of that type.
Further, the Offset decoration for any member of such a variable must not cause the space required for that variable to extend outside the range [0, maxInlineUniformBlockSize).
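A minimal sketch of the descriptor set layout binding such an inline uniform block variable would be matched against; the binding number, byte size, and stage flags are illustrative. For this descriptor type, descriptorCount gives the block size in bytes.

```c
#include <vulkan/vulkan.h>

/* Binding backing an inline uniform block of 32 bytes.
 * descriptorCount is the size of the block in bytes and must not exceed
 * maxInlineUniformBlockSize; the shader variable is a single, non-arrayed
 * Block-decorated OpTypeStruct. */
static const VkDescriptorSetLayoutBinding inlineBlockBinding = {
    .binding         = 0,
    .descriptorType  = VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT,
    .descriptorCount = 32,
    .stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT,
};
```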
Variables identified with a storage class of UniformConstant and a decoration of InputAttachmentIndex must be declared as described in Fragment Input Attachment Interface.
SPIR-V variables decorated with a descriptor set and binding that identify a combined image sampler descriptor can have a type of OpTypeImage (Sampled=1), OpTypeSampler, or OpTypeSampledImage.
Arrays of any of these types can be indexed with constant integral expressions.
The following features must be enabled and capabilities must be declared in order to index such arrays with dynamically uniform or non-uniform indices:
- Storage images (except storage texel buffers and input attachments):
  - Dynamically uniform: shaderStorageImageArrayDynamicIndexing and StorageImageArrayDynamicIndexing
  - Non-uniform: shaderStorageImageArrayNonUniformIndexing and StorageImageArrayNonUniformIndexingEXT
- Storage texel buffers:
  - Dynamically uniform: shaderStorageTexelBufferArrayDynamicIndexing and StorageTexelBufferArrayDynamicIndexingEXT
  - Non-uniform: shaderStorageTexelBufferArrayNonUniformIndexing and StorageTexelBufferArrayNonUniformIndexingEXT
- Input attachments:
  - Dynamically uniform: shaderInputAttachmentArrayDynamicIndexing and InputAttachmentArrayDynamicIndexingEXT
  - Non-uniform: shaderInputAttachmentArrayNonUniformIndexing and InputAttachmentArrayNonUniformIndexingEXT
- Sampled images (except uniform texel buffers):
  - Dynamically uniform: shaderSampledImageArrayDynamicIndexing and SampledImageArrayDynamicIndexing
  - Non-uniform: shaderSampledImageArrayNonUniformIndexing and SampledImageArrayNonUniformIndexingEXT
- Uniform texel buffers:
  - Dynamically uniform: shaderUniformTexelBufferArrayDynamicIndexing and UniformTexelBufferArrayDynamicIndexingEXT
  - Non-uniform: shaderUniformTexelBufferArrayNonUniformIndexing and UniformTexelBufferArrayNonUniformIndexingEXT
- Uniform buffers:
  - Dynamically uniform: shaderUniformBufferArrayDynamicIndexing and UniformBufferArrayDynamicIndexing
  - Non-uniform: shaderUniformBufferArrayNonUniformIndexing and UniformBufferArrayNonUniformIndexingEXT
- Storage buffers:
  - Dynamically uniform: shaderStorageBufferArrayDynamicIndexing and StorageBufferArrayDynamicIndexing
  - Non-uniform: shaderStorageBufferArrayNonUniformIndexing and StorageBufferArrayNonUniformIndexingEXT
If an instruction loads from or stores to a resource (including atomics and image instructions) and the resource descriptor being accessed is not dynamically uniform, then the corresponding non-uniform indexing feature must be enabled and the capability must be declared. If an instruction loads from or stores to a resource (including atomics and image instructions) and the resource descriptor being accessed is not uniform, then the corresponding dynamic indexing or non-uniform feature must be enabled and the capability must be declared.
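These features come from VK_EXT_descriptor_indexing. The following sketch queries them through vkGetPhysicalDeviceFeatures2 so an application can decide whether a shader may declare the corresponding non-uniform indexing capabilities; it assumes the extension is available on physicalDevice, and the sampled-image check shown is just one representative feature member.

```c
#include <vulkan/vulkan.h>

/* Query the VK_EXT_descriptor_indexing features that gate dynamically uniform
 * and non-uniform indexing of descriptor arrays. */
void query_indexing_features(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceDescriptorIndexingFeaturesEXT indexing = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_FEATURES_EXT,
    };
    VkPhysicalDeviceFeatures2 features2 = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2,
        .pNext = &indexing,
    };
    vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);

    if (indexing.shaderSampledImageArrayNonUniformIndexing) {
        /* The shader may declare the SampledImageArrayNonUniformIndexingEXT
         * capability and index sampled image arrays non-uniformly. */
    }
}
```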
If the combined image sampler enables sampler Y’CBCR conversion or samples a subsampled image, it must be indexed only by constant integral expressions when aggregated into arrays in shader code, irrespective of the shaderSampledImageArrayDynamicIndexing feature.
Shader Resource and Descriptor Type Correspondence
Resource type | Descriptor Type
---|---
sampler | VK_DESCRIPTOR_TYPE_SAMPLER
sampled image | VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE
storage image | VK_DESCRIPTOR_TYPE_STORAGE_IMAGE
combined image sampler | VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER
uniform texel buffer | VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER
storage texel buffer | VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER
uniform buffer | VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC
storage buffer | VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC
input attachment | VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT
inline uniform block | VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
acceleration structure | VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV
Shader Resource and Storage Class Correspondence
Resource type | Storage Class | Type | Decoration(s)1
---|---|---|---
sampler | UniformConstant | OpTypeSampler |
sampled image | UniformConstant | OpTypeImage (Sampled=1) |
storage image | UniformConstant | OpTypeImage (Sampled=2) |
combined image sampler | UniformConstant | OpTypeSampledImage |
uniform texel buffer | UniformConstant | OpTypeImage (Dim=Buffer, Sampled=1) |
storage texel buffer | UniformConstant | OpTypeImage (Dim=Buffer, Sampled=2) |
uniform buffer | Uniform | OpTypeStruct | Block, Offset, (ArrayStride), (MatrixStride)
storage buffer | Uniform | OpTypeStruct | BufferBlock, Offset, (ArrayStride), (MatrixStride)
 | StorageBuffer | OpTypeStruct | Block, Offset, (ArrayStride), (MatrixStride)
input attachment | UniformConstant | OpTypeImage (Dim=SubpassData, Sampled=2) | InputAttachmentIndex
inline uniform block | Uniform | OpTypeStruct | Block, Offset, (ArrayStride), (MatrixStride)
- 1 in addition to DescriptorSet and Binding
14.5.3. DescriptorSet and Binding Assignment
A variable decorated with a DescriptorSet decoration of s and a Binding decoration of b indicates that this variable is associated with the VkDescriptorSetLayoutBinding that has a binding equal to b in pSetLayouts[s] that was specified in VkPipelineLayoutCreateInfo.
DescriptorSet decoration values must be between zero and maxBoundDescriptorSets minus one, inclusive.
Binding decoration values can be any 32-bit unsigned integer value, as described in Descriptor Set Layout.
Each descriptor set has its own binding name space.
If the Binding decoration is used with an array, the entire array is assigned that binding value.
The array must be a single-dimensional array, and the size of the array must be no larger than the number of descriptors in the binding.
If the array is runtime-sized, then array elements greater than or equal to the size of that binding in the bound descriptor set must not be used.
If the array is runtime-sized, the runtimeDescriptorArray feature must be enabled and the RuntimeDescriptorArrayEXT capability must be declared.
The index of each element of the array is referred to as the arrayElement.
For the purposes of interface matching and descriptor set operations, if a resource variable is not an array, it is treated as if it has an arrayElement of zero.
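A minimal sketch of how DescriptorSet and Binding decorations line up with the pipeline layout: pSetLayouts[s] supplies the layout for DescriptorSet s, and each VkDescriptorSetLayoutBinding::binding matches a Binding decoration. The binding numbers, descriptor types, and stage flags here are illustrative, and error handling is omitted.

```c
#include <stddef.h>
#include <vulkan/vulkan.h>

/* Set layout matching a shader that declares DescriptorSet = 0 with
 * Binding = 0 (uniform buffer) and Binding = 1 (combined image sampler). */
void create_layouts(VkDevice device)
{
    const VkDescriptorSetLayoutBinding bindings[2] = {
        { .binding = 0, .descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
          .descriptorCount = 1, .stageFlags = VK_SHADER_STAGE_VERTEX_BIT },
        /* An arrayed shader variable would keep this binding number and
         * raise descriptorCount to the array size. */
        { .binding = 1, .descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
          .descriptorCount = 1, .stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT },
    };

    const VkDescriptorSetLayoutCreateInfo setLayoutInfo = {
        .sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
        .bindingCount = 2,
        .pBindings    = bindings,
    };
    VkDescriptorSetLayout setLayout;
    vkCreateDescriptorSetLayout(device, &setLayoutInfo, NULL, &setLayout);

    /* pSetLayouts[s] corresponds to DescriptorSet = s in the shader. */
    const VkPipelineLayoutCreateInfo pipelineLayoutInfo = {
        .sType          = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
        .setLayoutCount = 1,
        .pSetLayouts    = &setLayout,
    };
    VkPipelineLayout pipelineLayout;
    vkCreatePipelineLayout(device, &pipelineLayoutInfo, NULL, &pipelineLayout);
}
```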
There is a limit on the number of resources of each type that can be accessed by a pipeline stage as shown in Shader Resource Limits. The “Resources Per Stage” column gives the limit on the number each type of resource that can be statically used for an entry point in any given stage in a pipeline. The “Resource Types” column lists which resource types are counted against the limit. Some resource types count against multiple limits.
The pipeline layout may include descriptor sets and bindings which are not referenced by any variables statically used by the entry points for the shader stages in the binding’s stageFlags.
However, if a variable assigned to a given DescriptorSet and Binding is statically used by the entry point for a shader stage, the pipeline layout must contain a descriptor set layout binding in that descriptor set layout and for that binding number, and that binding’s stageFlags must include the appropriate VkShaderStageFlagBits for that stage.
The variable must be of a valid resource type determined by its SPIR-V type and storage class, as defined in Shader Resource and Storage Class Correspondence.
The descriptor set layout binding must be of a corresponding descriptor type, as defined in Shader Resource and Descriptor Type Correspondence.
Note
There are no limits on the number of shader variables that can have overlapping set and binding values in a shader, but which resources are statically used has an impact.
If any shader variable identifying a resource is statically used in a shader, then the underlying descriptor bound at the declared set and binding must support the declared type in the shader when the shader executes.
If multiple shader variables are declared with the same set and binding values, and with the same underlying descriptor type, they can all be statically used within the same shader.
However, accesses are not automatically synchronized.
If multiple shader variables with the same set and binding values are declared in a single shader, but with different declared types, where any of those are not supported by the relevant bound descriptor, that shader can only be executed if the variables with the unsupported type are not statically used.
A noteworthy example of using multiple statically-used shader variables sharing the same descriptor set and binding values is a descriptor of type VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER.
Shader Resource Limits
Resources per Stage | Resource Types
---|---
maxPerStageDescriptorSamplers | sampler, combined image sampler
maxPerStageDescriptorSampledImages | sampled image, combined image sampler, uniform texel buffer
maxPerStageDescriptorStorageImages | storage image, storage texel buffer
maxPerStageDescriptorUniformBuffers | uniform buffer, uniform buffer dynamic
maxPerStageDescriptorStorageBuffers | storage buffer, storage buffer dynamic
maxPerStageDescriptorInputAttachments | input attachment1
maxPerStageDescriptorInlineUniformBlocks | inline uniform block
- 1 Input attachments can only be used in the fragment shader stage
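These per-stage limits are reported in VkPhysicalDeviceLimits. The following sketch simply reads and prints a few of them; it assumes a valid physicalDevice handle.

```c
#include <stdio.h>
#include <vulkan/vulkan.h>

/* Read the per-stage descriptor limits that bound how many resources of each
 * type an entry point may statically use. */
void print_per_stage_limits(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceProperties props;
    vkGetPhysicalDeviceProperties(physicalDevice, &props);
    const VkPhysicalDeviceLimits *limits = &props.limits;

    printf("samplers per stage:          %u\n", limits->maxPerStageDescriptorSamplers);
    printf("sampled images per stage:    %u\n", limits->maxPerStageDescriptorSampledImages);
    printf("storage images per stage:    %u\n", limits->maxPerStageDescriptorStorageImages);
    printf("uniform buffers per stage:   %u\n", limits->maxPerStageDescriptorUniformBuffers);
    printf("storage buffers per stage:   %u\n", limits->maxPerStageDescriptorStorageBuffers);
    printf("input attachments per stage: %u\n", limits->maxPerStageDescriptorInputAttachments);
}
```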
14.5.4. Offset and Stride Assignment
All variables with a storage class of Uniform, StorageBuffer, or PushConstant must be explicitly laid out using the Offset, ArrayStride, and MatrixStride decorations.
Note
The numeric order of Offset decorations does not need to follow member declaration order.
If the scalarBlockLayout feature is enabled, then the layout of blocks in these storage classes must adhere to the Scalar Alignment requirements below.
If the feature is not enabled, they must adhere to the stricter Base Alignment.
Performance Note
Even if scalar alignment is supported, it is generally more performant to use the base alignment.
Base Alignment
There are two different layout requirements depending on the specific resource.
Standard Uniform Buffer Layout
The base alignment of the type of an OpTypeStruct member is defined recursively as follows:
- A scalar of size N has a base alignment of N.
- A two-component vector, with components of size N, has a base alignment of 2 N.
- A three- or four-component vector, with components of size N, has a base alignment of 4 N.
- An array has a base alignment equal to the base alignment of its element type, rounded up to a multiple of 16.
- A structure has a base alignment equal to the largest base alignment of any of its members, rounded up to a multiple of 16.
- A row-major matrix of C columns has a base alignment equal to the base alignment of a vector of C matrix components.
- A column-major matrix has a base alignment equal to the base alignment of the matrix column type.
A member is defined to improperly straddle if either of the following are true:
- It is a vector with total size less than or equal to 16 bytes, and has Offset decorations placing its first byte at F and its last byte at L, where floor(F / 16) != floor(L / 16).
- It is a vector with total size greater than 16 bytes and has its Offset decorations placing its first byte at a non-integer multiple of 16.
Every member of an OpTypeStruct with storage class of Uniform and a decoration of Block (uniform buffers) must be laid out according to the following rules:
- The Offset decoration of a scalar, an array, a structure, or a matrix must be a multiple of its base alignment.
- The Offset decoration of a vector must be an integer multiple of the base alignment of its scalar component type, and must not improperly straddle, as defined above.
- Any ArrayStride or MatrixStride decoration must be an integer multiple of the base alignment of the array or matrix from above.
- The Offset decoration of a member must not place it between the end of a structure or an array and the next multiple of the base alignment of that structure or array.
Note
The std140 layout in GLSL satisfies these rules.
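As a purely illustrative worked example of the base alignment rules, the following host-side C struct mirrors a hypothetical uniform block and records, in comments, the Offset each member would need under the rules above (including why the vec3 member does not improperly straddle).

```c
/* Host-side mirror of a hypothetical uniform block laid out with the base
 * alignment (std140-style) rules above; the comments give the Offset
 * decoration each member would carry. */
typedef struct UniformBlockStd140 {
    float scale;         /* scalar: base alignment 4  -> Offset 0              */
    float _pad0[3];      /* padding so the vec4 below can start at offset 16   */
    float color[4];      /* vec4:   base alignment 16 -> Offset 16             */
    float direction[3];  /* vec3:   base alignment 16 -> Offset 32; it spans
                          * bytes 32..43, floor(32/16) == floor(43/16), so it
                          * does not improperly straddle                        */
    float intensity;     /* scalar: base alignment 4  -> Offset 44             */
    float weights[2][4]; /* float[2] array: element alignment 4 rounds up to 16,
                          * so ArrayStride = 16 and Offset 48; only the first
                          * component of each 16-byte element is used           */
} UniformBlockStd140;
```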
Standard Storage Buffer Layout
Member variables of an OpTypeStruct with a storage class of PushConstant (push constants), a storage class of Uniform with a decoration of BufferBlock (storage buffers), or a storage class of StorageBuffer with a decoration of Block must be laid out as above, except that array and structure base alignments do not need to be rounded up to a multiple of 16.
Note
The std430 layout in GLSL satisfies these rules.
Scalar Alignment
The scalar alignment of the type of an OpTypeStruct member is defined recursively as follows:
- A scalar of size N has a scalar alignment of N.
- A vector has a scalar alignment equal to that of its component type.
- A matrix has a scalar alignment equal to that of its component type.
- An array has a scalar alignment equal to that of its element type.
- A structure has a scalar alignment equal to the largest scalar alignment of any of its members.
Every member of an OpTypeStruct with storage class of Uniform, StorageBuffer, or PushConstant must be laid out according to the following rules:
- The Offset decoration must be a multiple of its scalar alignment.
- Any ArrayStride or MatrixStride decoration must be an integer multiple of the scalar alignment of the array or matrix from above.
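The relaxed scalar alignment rules are gated by the scalarBlockLayout feature from VK_EXT_scalar_block_layout. A minimal sketch of checking for it, assuming the extension is available on physicalDevice:

```c
#include <vulkan/vulkan.h>

/* Check whether blocks may use the relaxed scalar alignment rules. */
int supports_scalar_block_layout(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceScalarBlockLayoutFeaturesEXT scalarLayout = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SCALAR_BLOCK_LAYOUT_FEATURES_EXT,
    };
    VkPhysicalDeviceFeatures2 features2 = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2,
        .pNext = &scalarLayout,
    };
    vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);
    return scalarLayout.scalarBlockLayout == VK_TRUE;
}
```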
14.6. Built-In Variables
Built-in variables are accessed in shaders by declaring a variable decorated with a BuiltIn SPIR-V decoration.
The meaning of each BuiltIn decoration is as follows.
In the remainder of this section, the name of a built-in is used interchangeably with a term equivalent to a variable decorated with that particular built-in.
Built-ins that represent integer values can be declared as either signed or unsigned 32-bit integers.
BaryCoordNV
-
The
BaryCoordNV
decoration can be used to decorate a fragment shader input variable. This variable will contain a three-component floating-point vector with barycentric weights that indicate the location of the fragment relative to the screen-space locations of vertices of its primitive, obtained using perspective interpolation.The
BaryCoordNV
decoration must be used only within fragment shaders.The variable decorated with
BaryCoordNV
must be declared using theInput
storage class.The variable decorated with
BaryCoordNV
must be declared as three-component vector of 32-bit floating-point values. BaryCoordNoPerspAMD
-
The
BaryCoordNoPerspAMD
decoration can be used to decorate a fragment shader input variable. This variable will contain the (I,J) pair of the barycentric coordinates corresponding to the fragment evaluated using linear interpolation at the fragment’s center. The K coordinate of the barycentric coordinates can be derived given the identity I + J + K = 1.0.
BaryCoordNoPerspNV
-
The
BaryCoordNoPerspNV
decoration can be used to decorate a fragment shader input variable. This variable will contain a three-component floating-point vector with barycentric weights that indicate the location of the fragment relative to the screen-space locations of vertices of its primitive, obtained using linear interpolation.The
BaryCoordNoPerspNV
decoration must be used only within fragment shaders.The variable decorated with
BaryCoordNoPerspNV
must be declared using theInput
storage class.The variable decorated with
BaryCoordNoPerspNV
must be declared as three-component vector of 32-bit floating-point values. BaryCoordNoPerspCentroidAMD
-
The
BaryCoordNoPerspCentroidAMD
decoration can be used to decorate a fragment shader input variable. This variable will contain the (I,J) pair of the barycentric coordinates corresponding to the fragment evaluated using linear interpolation at the centroid. The K coordinate of the barycentric coordinates can be derived given the identity I + J + K = 1.0. BaryCoordNoPerspSampleAMD
-
The
BaryCoordNoPerspSampleAMD
decoration can be used to decorate a fragment shader input variable. This variable will contain the (I,J) pair of the barycentric coordinates corresponding to the fragment evaluated using linear interpolation at each covered sample. The K coordinate of the barycentric coordinates can be derived given the identity I + J + K = 1.0. BaryCoordPullModelAMD
-
The
BaryCoordPullModelAMD
decoration can be used to decorate a fragment shader input variable. This variable will contain (1/W, 1/I, 1/J) evaluated at the fragment center and can be used to calculate gradients and then interpolate I, J, and W at any desired sample location. BaryCoordSmoothAMD
-
The
BaryCoordSmoothAMD
decoration can be used to decorate a fragment shader input variable. This variable will contain the (I,J) pair of the barycentric coordinates corresponding to the fragment evaluated using perspective interpolation at the fragment’s center. The K coordinate of the barycentric coordinates can be derived given the identity I + J + K = 1.0. BaryCoordSmoothCentroidAMD
-
The
BaryCoordSmoothCentroidAMD
decoration can be used to decorate a fragment shader input variable. This variable will contain the (I,J) pair of the barycentric coordinates corresponding to the fragment evaluated using perspective interpolation at the centroid. The K coordinate of the barycentric coordinates can be derived given the identity I + J + K = 1.0. BaryCoordSmoothSampleAMD
-
The
BaryCoordSmoothSampleAMD
decoration can be used to decorate a fragment shader input variable. This variable will contain the (I,J) pair of the barycentric coordinates corresponding to the fragment evaluated using perspective interpolation at each covered sample. The K coordinate of the barycentric coordinates can be derived given the identity I + J + K = 1.0.
BaseInstance
-
Decorating a variable with the
BaseInstance
built-in will make that variable contain the integer value corresponding to the first instance that was passed to the command that invoked the current vertex shader invocation.BaseInstance
is thefirstInstance
parameter to a direct drawing command or thefirstInstance
member of a structure consumed by an indirect drawing command.The
BaseInstance
decoration must be used only within vertex shaders.The variable decorated with BaseInstance must be declared using the input storage class.
The variable decorated with BaseInstance must be declared as a scalar 32-bit integer.
BaseVertex
-
Decorating a variable with the
BaseVertex
built-in will make that variable contain the integer value corresponding to the first vertex or vertex offset that was passed to the command that invoked the current vertex shader invocation. For non-indexed drawing commands, this variable is thefirstVertex
parameter to a direct drawing command or thefirstVertex
member of the structure consumed by an indirect drawing command. For indexed drawing commands, this variable is thevertexOffset
parameter to a direct drawing command or thevertexOffset
member of the structure consumed by an indirect drawing command.The
BaseVertex
decoration must be used only within vertex shaders.The variable decorated with
BaseVertex
must be declared using the input storage class.The variable decorated with BaseVertex must be declared as a scalar 32-bit integer.
ClipDistance
-
Decorating a variable with the
ClipDistance
built-in decoration will make that variable contain the mechanism for controlling user clipping.ClipDistance
is an array such that the ith element of the array specifies the clip distance for plane i. A clip distance of 0 means the vertex is on the plane, a positive distance means the vertex is inside the clip half-space, and a negative distance means the point is outside the clip half-space.The
ClipDistance
decoration must be used only within mesh, vertex, fragment, tessellation control, tessellation evaluation, and geometry shaders.In mesh or vertex shaders, any variable decorated with
ClipDistance
must be declared using theOutput
storage class.In fragment shaders, any variable decorated with
ClipDistance
must be declared using theInput
storage class.In tessellation control, tessellation evaluation, or geometry shaders, any variable decorated with
ClipDistance
must not be in a storage class other thanInput
orOutput
.Any variable decorated with
ClipDistance
must be declared as an array of 32-bit floating-point values.
Note
The array variable decorated with |
Note
In the last vertex processing stage, these values will be linearly
interpolated across the primitive and the portion of the primitive with
interpolated distances less than 0 will be considered outside the clip
volume.
If |
ClipDistancePerViewNV
-
Decorating a variable with the
ClipDistancePerViewNV
built-in decoration will make that variable contain the per-view clip distances. The per-view clip distances have the same semantics asClipDistance
.The
ClipDistancePerViewNV
must be used only within mesh shaders.Any variable decorated with
ClipDistancePerViewNV
must be declared using theOutput
storage class, and must also be decorated with thePerViewNV
decoration.Any variable decorated with
ClipDistancePerViewNV
must be declared as a two-dimensional array of 32-bit floating-point values. CullDistance
-
Decorating a variable with the
CullDistance
built-in decoration will make that variable contain the mechanism for controlling user culling. If any member of this array is assigned a negative value for all vertices belonging to a primitive, then the primitive is discarded before rasterization.The
CullDistance
decoration must be used only within mesh, vertex, fragment, tessellation control, tessellation evaluation, and geometry shaders.In mesh or vertex shaders, any variable decorated with
CullDistance
must be declared using theOutput
storage class.In fragment shaders, any variable decorated with
CullDistance
must be declared using theInput
storage class.In tessellation control, tessellation evaluation, or geometry shaders, any variable decorated with
CullDistance
must not be declared in a storage class other than input or output.Any variable decorated with
CullDistance
must be declared as an array of 32-bit floating-point values.
Note
In fragment shaders, the values of the |
Note
If |
CullDistancePerViewNV
-
Decorating a variable with the
CullDistancePerViewNV
built-in decoration will make that variable contain the per-view cull distances. The per-view clip distances have the same semantics asCullDistance
.The
CullDistancePerViewNV
must be used only within mesh shaders.Any variable decorated with
CullDistancePerViewNV
must be declared using theOutput
storage class, and must also be decorated with thePerViewNV
decoration.Any variable decorated with
CullDistancePerViewNV
must be declared as a two-dimensional array of 32-bit floating-point values.
DeviceIndex
-
The
DeviceIndex
decoration can be applied to a shader input which will be filled with the device index of the physical device that is executing the current shader invocation. This value will be in the range \([0,max(1,physicalDeviceCount))\), where physicalDeviceCount is thephysicalDeviceCount
member of VkDeviceGroupDeviceCreateInfo.The
DeviceIndex
decoration can be used in any shader.The variable decorated with
DeviceIndex
must be declared using theInput
storage class.The variable decorated with
DeviceIndex
must be declared as a scalar 32-bit integer.
DrawIndex
-
Decorating a variable with the
DrawIndex
built-in will make that variable contain the integer value corresponding to the zero-based index of the drawing command that invoked the current task, mesh, or vertex shader invocation. For indirect drawing commands,DrawIndex
begins at zero and increments by one for each draw command executed. The number of draw commands is given by thedrawCount
parameter. For direct drawing commands,DrawIndex
is always zero.DrawIndex
is dynamically uniform.The
DrawIndex
decoration must be used only within task, mesh or vertex shaders.The variable decorated with
DrawIndex
must be declared using the input storage class.The variable decorated with
DrawIndex
must be declared as a scalar 32-bit integer.When task or mesh shaders are used, only the first active stage will have proper access to the variable, other stages will have undefined values.
FragCoord
-
Decorating a variable with the
FragCoord
built-in decoration will make that variable contain the framebuffer coordinate \((x,y,z,\frac{1}{w})\) of the fragment being processed. The (x,y) coordinate (0,0) is the upper left corner of the upper left pixel in the framebuffer.When Sample Shading is enabled, the x and y components of
FragCoord
reflect the location of one of the samples corresponding to the shader invocation.Otherwise, the x and y components of
FragCoord
reflect the location of the center of the fragment.The z component of
FragCoord
is the interpolated depth value of the primitive.The w component is the interpolated \(\frac{1}{w}\).
The
FragCoord
decoration must be used only within fragment shaders.The variable decorated with
FragCoord
must be declared using theInput
storage class.The
Centroid
interpolation decoration is ignored, but allowed, onFragCoord
.The variable decorated with
FragCoord
must be declared as a four-component vector of 32-bit floating-point values. FragDepth
-
To have a shader supply a fragment-depth value, the shader must declare the
DepthReplacing
execution mode. Such a shader’s fragment-depth value will come from the variable decorated with theFragDepth
built-in decoration.This value will be used for any subsequent depth testing performed by the implementation or writes to the depth attachment.
The
FragDepth
decoration must be used only within fragment shaders.The variable decorated with
FragDepth
must be declared using theOutput
storage class.The variable decorated with
FragDepth
must be declared as a scalar 32-bit floating-point value.
FragInvocationCountEXT
-
Decorating a variable with the
FragInvocationCountEXT
built-in decoration will make that variable contain the maximum number of fragment shader invocations for the fragment, as determined byminSampleShading
.The
FragInvocationCountEXT
decoration must be used only within fragment shaders and theFragmentDensityEXT
capability must be declared.If Sample Shading is not enabled,
FragInvocationCountEXT
will be filled with a value of 1.The variable decorated with
FragInvocationCountEXT
must be declared using theInput
storage class.The variable decorated with
FragInvocationCountEXT
must be declared as a scalar 32-bit integer.
FragSizeEXT
-
Decorating a variable with the
FragSizeEXT
built-in decoration will make that variable contain the dimensions in pixels of the area that the fragment covers for that invocation.The
FragSizeEXT
decoration must be used only within fragment shaders and theFragmentDensityEXT
capability must be declared.If fragment density map is not enabled,
FragSizeEXT
will be filled with a value of (1,1).The variable decorated with
FragSizeEXT
must be declared using theInput
storage class.The variable decorated with
FragSizeEXT
must be declared as a two-component vector of 32-bit unsigned integer values. FragStencilRefEXT
-
Decorating a variable with the
FragStencilRefEXT
built-in decoration will make that variable contain the new stencil reference value for all samples covered by the fragment. This value will be used as the stencil reference value used in stencil testing.To write to
FragStencilRefEXT
, a shader must declare theStencilRefReplacingEXT
execution mode. If a shader declares theStencilRefReplacingEXT
execution mode and there is an execution path through the shader that does not setFragStencilRefEXT
, then the fragment’s stencil reference value is undefined for executions of the shader that take that path.The
FragStencilRefEXT
decoration must be used only within fragment shaders.The variable decorated with
FragStencilRefEXT
must be declared using theOutput
storage class.The variable decorated with
FragStencilRefEXT
must be declared as a scalar integer value. Only the least significant s bits of the integer value of the variable decorated withFragStencilRefEXT
are considered for stencil testing, where s is the number of bits in the stencil framebuffer attachment, and higher order bits are discarded. FragmentSizeNV
-
Decorating a variable with the
FragmentSizeNV
built-in decoration will make that variable contain the width and height of the fragment.The
FragmentSizeNV
decoration must be used only within fragment shaders.The variable decorated with
FragmentSizeNV
must be declared using theInput
storage class.The variable decorated with
FragmentSizeNV
must be declared as a two-component vector of 32-bit integers. FrontFacing
-
Decorating a variable with the
FrontFacing
built-in decoration will make that variable contain whether the fragment is front or back facing. This variable is non-zero if the current fragment is considered to be part of a front-facing polygon primitive or of a non-polygon primitive and is zero if the fragment is considered to be part of a back-facing polygon primitive.The
FrontFacing
decoration must be used only within fragment shaders.The variable decorated with
FrontFacing
must be declared using theInput
storage class.The variable decorated with
FrontFacing
must be declared as a boolean. FullyCoveredEXT
-
Decorating a variable with the
FullyCoveredEXT
built-in decoration will make that variable indicate whether the fragment area is fully covered by the generating primitive. This variable is non-zero if conservative rasterization is enabled and the current fragment area is fully covered by the generating primitive, and is zero if the fragment is not covered or partially covered, or conservative rasterization is disabled.The
FullyCoveredEXT
decoration must be used only within fragment shaders and theFragmentFullyCoveredEXT
capability must be declared.The variable decorated with
FullyCoveredEXT
must be declared using theInput
storage class.The variable decorated with
FullyCoveredEXT
must be declared as a boolean.If the implementation supports
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::conservativeRasterizationPostDepthCoverage
and thePostDepthCoverage
execution mode is specified theSampleMask
built-in input variable will reflect the coverage after the early per-fragment depth and stencil tests are applied. IfVkPhysicalDeviceConservativeRasterizationPropertiesEXT
::conservativeRasterizationPostDepthCoverage
is not supported thePostDepthCoverage
execution mode must not be specified. GlobalInvocationId
-
Decorating a variable with the
GlobalInvocationId
built-in decoration will make that variable contain the location of the current invocation within the global workgroup. Each component is equal to the index of the local workgroup multiplied by the size of the local workgroup plusLocalInvocationId
.The
GlobalInvocationId
decoration must be used only within task, mesh, or compute shaders.The variable decorated with
GlobalInvocationId
must be declared using theInput
storage class.The variable decorated with
GlobalInvocationId
must be declared as a three-component vector of 32-bit integers. HelperInvocation
-
Decorating a variable with the
HelperInvocation
built-in decoration will make that variable contain whether the current invocation is a helper invocation. This variable is non-zero if the current fragment being shaded is a helper invocation and zero otherwise. A helper invocation is an invocation of the shader that is produced to satisfy internal requirements such as the generation of derivatives.The
HelperInvocation
decoration must be used only within fragment shaders.The variable decorated with
HelperInvocation
must be declared using theInput
storage class.The variable decorated with
HelperInvocation
must be declared as a boolean.
Note
It is very likely that a helper invocation will have a value of
|
HitKindNV
-
A variable decorated with the
HitKindNV
decoration will describe the intersection that triggered the execution of the current shader. The values are determined by the intersection shader.The
HitKindNV
decoration must only be used in any hit and closest hit shaders.Any variable decorated with
HitKindNV
must be declared using theInput
storage class.Any variable decorated with
HitKindNV
must be declared as a scalar 32-bit integer.
HitTNV
-
A variable decorated with the
HitTNV
decoration is equivalent to a variable decorated with theRayTmaxNV
decoration.The
HitTNV
decoration must only be used in any hit and closest hit shaders.Any variable decorated with
HitTNV
must be declared using theInput
storage class.Any variable decorated with
HitTNV
must be declared as a scalar 32-bit floating-point value.
IncomingRayFlagsNV
-
A variable with the
IncomingRayFlagsNV
decoration will contain the ray flags passed in to the trace call that invoked this particular shader.The
IncomingRayFlagsNV
decoration must only be used in the intersection, any hit, closest hit, and miss shaders.Any variable decorated with
IncomingRayFlagsNV
must be declared using theInput
storage class.Any variable decorated with
IncomingRayFlagsNV
must be declared as a scalar 32-bit integer.
InstanceCustomIndexNV
-
A variable decorated with the
InstanceCustomIndexNV
decoration will contain the application-defined value of the instance that intersects the current ray. Only the lower 24 bits are valid, the upper 8 bits will be ignored.The
InstanceCustomIndexNV
decoration must only be used in the intersection, any hit, and closest hit shaders.Any variable decorated with
InstanceCustomIndexNV
must be declared using theInput
storage class.Any variable decorated with
InstanceCustomIndexNV
must be declared as a scalar 32-bit integer.
InstanceId
-
Decorating a variable in an intersection, any-hit, or closest hit shader with the
InstanceId
decoration will make that variable contain the index of the instance that intersects the current ray.The
InstanceId
decoration must be used only within intersection, any-hit, or closest hit shaders.The variable decorated with
InstanceId
must be declared using theInput
storage class.The variable decorated with
InstanceId
must be declared as a scalar 32-bit integer. InvocationId
-
Decorating a variable with the
InvocationId
built-in decoration will make that variable contain the index of the current shader invocation in a geometry shader, or the index of the output patch vertex in a tessellation control shader.In a geometry shader, the index of the current shader invocation ranges from zero to the number of instances declared in the shader minus one. If the instance count of the geometry shader is one or is not specified, then
InvocationId
will be zero.The
InvocationId
decoration must be used only within tessellation control and geometry shaders.The variable decorated with
InvocationId
must be declared using theInput
storage class.The variable decorated with
InvocationId
must be declared as a scalar 32-bit integer. InvocationsPerPixelNV
-
Decorating a variable with the
InvocationsPerPixelNV
built-in decoration will make that variable contain the maximum number of fragment shader invocations per pixel, as derived from the effective shading rate for the fragment. If a primitive does not fully cover a pixel, the number of fragment shader invocations for that pixel may be less than the value ofInvocationsPerPixelNV
. If the shading rate indicates a fragment covering multiple pixels, thenInvocationsPerPixelNV
will be one.The
InvocationsPerPixelNV
decoration must be used only within fragment shaders.The variable decorated with
InvocationsPerPixelNV
must be declared using theInput
storage class.The variable decorated with
InvocationsPerPixelNV
must be declared as a scalar 32-bit integer. InstanceIndex
-
Decorating a variable in a vertex shader with the
InstanceIndex
built-in decoration will make that variable contain the index of the instance that is being processed by the current vertex shader invocation.InstanceIndex
begins at thefirstInstance
parameter to vkCmdDraw or vkCmdDrawIndexed or at thefirstInstance
member of a structure consumed by vkCmdDrawIndirect or vkCmdDrawIndexedIndirect.The
InstanceIndex
decoration must be used only within vertex shaders.The variable decorated with
InstanceIndex
must be declared using theInput
storage class.The variable decorated with
InstanceIndex
must be declared as a scalar 32-bit integer.
LaunchIDNV
-
A variable decorated with the
LaunchIDNV
decoration will specify the index of the work item being processed. One work item is generated for each of the width
×height
×depth
items dispatched by a vkCmdTraceRaysNV command. All shader invocations inherit the same value for variables decorated withLaunchIDNV
.The
LaunchIDNV
decoration must only be used within the ray generation, intersection, any hit, closest hit, and miss shaders.Any variable decorated with
LaunchIDNV
must be declared using theInput
storage class.Any variable decorated with
LaunchIDNV
must be declared as a three-component vector of 32-bit integer values.
LaunchSizeNV
-
A variable decorated with the
LaunchSizeNV
decoration will contain thewidth
,height
, anddepth
dimensions passed to the vkCmdTraceRaysNV command that initiated this shader execution. Thewidth
is in the first component, theheight
is in the second component, and thedepth
is in the third component.The
LaunchSizeNV
decoration must only be used within ray generation, intersection, any hit, closest hit, and miss shaders.Any variable decorated with
LaunchSizeNV
must be declared using theInput
storage class.Any variable decorated with
LaunchSizeNV
must be declared as a three-component vector of 32-bit integer values.
Layer
-
Decorating a variable with the
Layer
built-in decoration will make that variable contain the select layer of a multi-layer framebuffer attachment.In a mesh, vertex, tessellation evaluation, or geometry shader, any variable decorated with
Layer
can be written with the framebuffer layer index to which the primitive produced by that shader will be directed.The last active vertex processing stage (in pipeline order) controls the
Layer
that is used. Outputs in previous shader stages are not used, even if the last stage fails to write theLayer
.If the last active vertex processing stage shader entry point’s interface does not include a variable decorated with
Layer
, then the first layer is used. If a vertex processing stage shader entry point’s interface includes a variable decorated withLayer
, it must write the same value toLayer
for all output vertices of a given primitive. If theLayer
value is less than 0 or greater than or equal to the number of layers in the framebuffer, then primitives may still be rasterized, fragment shaders may be executed, and the framebuffer values for all layers are undefined.The
Layer
decoration must be used only within mesh, vertex, tessellation evaluation, geometry, and fragment shaders.In a mesh, vertex, tessellation evaluation, or geometry shader, any variable decorated with
Layer
must be declared using theOutput
storage class. If such a variable is also decorated withViewportRelativeNV
, then theViewportIndex
is added to the layer that is used for rendering and that is made available in the fragment shader. If the shader writes to a variable decoratedViewportMaskNV
, then the layer selected has a different value for each viewport a primitive is rendered to.In a fragment shader, a variable decorated with
Layer
contains the layer index of the primitive that the fragment invocation belongs to.In a fragment shader, any variable decorated with
Layer
must be declared using theInput
storage class.Any variable decorated with
Layer
must be declared as a scalar 32-bit integer.
LayerPerViewNV
-
Decorating a variable with the
LayerPerViewNV
built-in decoration will make that variable contain the per-view layer information. The per-view layer has the same semantics asLayer
, for each view.The
LayerPerViewNV
must only be used within mesh shaders.Any variable decorated with
LayerPerViewNV
must be declared using theOutput
storage class, and must also be decorated with thePerViewNV
decoration.Any variable decorated with
LayerPerViewNV
must be declared as an array of scalar 32-bit integer values. LocalInvocationId
-
Decorating a variable with the
LocalInvocationId
built-in decoration will make that variable contain the location of the current task, mesh, or compute shader invocation within the local workgroup. Each component ranges from zero through to the size of the workgroup in that dimension minus one.The
LocalInvocationId
decoration must be used only within task, mesh, or compute shaders.The variable decorated with
LocalInvocationId
must be declared using theInput
storage class.The variable decorated with
LocalInvocationId
must be declared as a three-component vector of 32-bit integers.
Note
If the size of the workgroup in a particular dimension is one, then the LocalInvocationId in that dimension will be zero.
LocalInvocationIndex
-
Decorating a variable with the
LocalInvocationIndex
built-in decoration will make that variable contain a one-dimensional representation ofLocalInvocationId
. This is computed as:LocalInvocationIndex = LocalInvocationId.z * WorkgroupSize.x * WorkgroupSize.y + LocalInvocationId.y * WorkgroupSize.x + LocalInvocationId.x;
The
LocalInvocationIndex
decoration must be used only within task, mesh, or compute shaders.The variable decorated with
LocalInvocationIndex
must be declared using theInput
storage class.The variable decorated with
LocalInvocationIndex
must be declared as a scalar 32-bit integer.
MeshViewCountNV
-
Decorating a variable with the
MeshViewCountNV
built-in decoration will make that variable contain the number of views processed by the current mesh or task shader invocations.The
MeshViewCountNV
decoration must only be used in task and mesh shaders.Any variable decorated with
MeshViewCountNV
must be declared using theInput
storage class.Any variable decorated with
MeshViewCountNV
must be declared as a scalar 32-bit integer.
MeshViewIndicesNV
-
Decorating a variable with the
MeshViewIndicesNV
built-in decoration will make that variable contain the mesh view indices. The mesh view indices are an array of values where each element holds the view number of one of the views being processed by the current mesh or task shader invocations. The array elements with indices greater than or equal to MeshViewCountNV
are undefined. If the value ofMeshViewIndicesNV
[i] is j, then any outputs decorated withPerViewNV
will take on the value of array element i when processing primitives for view index j.The
MeshViewIndicesNV
decoration must only be used in task and mesh shaders.Any variable decorated with
MeshViewIndicesNV
must be declared using theInput
storage class.Any variable decorated with
MeshViewIndicesNV
must be declared as an array of scalar 32-bit integers. NumSubgroups
-
Decorating a variable with the
NumSubgroups
built-in decoration will make that variable contain the number of subgroups in the local workgroup.The
NumSubgroups
decoration must be used only within task, mesh, or compute shaders.The variable decorated with
NumSubgroups
must be declared using theInput
storage class.The object decorated with
NumSubgroups
must be declared as a scalar 32-bit integer. NumWorkgroups
-
Decorating a variable with the
NumWorkgroups
built-in decoration will make that variable contain the number of local workgroups that are part of the dispatch that the invocation belongs to. Each component is equal to the values of the workgroup count parameters passed into the dispatch commands.The
NumWorkgroups
decoration must be used only within compute shaders.The variable decorated with
NumWorkgroups
must be declared using theInput
storage class.The variable decorated with
NumWorkgroups
must be declared as a three-component vector of 32-bit integers.
ObjectRayDirectionNV
-
A variable decorated with the
ObjectRayDirectionNV
decoration will specify the direction of the ray being processed, in object space.The
ObjectRayDirectionNV
decoration must only be used within intersection, any hit, closest hit, and miss shaders.Any variable decorated with
ObjectRayDirectionNV
must be declared using theInput
storage class.Any variable decorated with
ObjectRayDirectionNV
must be declared as a three-component vector of 32-bit floating-point values.
ObjectRayOriginNV
-
A variable decorated with the
ObjectRayOriginNV
decoration will specify the origin of the ray being processed, in object space.The
ObjectRayOriginNV
decoration must only be used within intersection, any hit, closest hit, and miss shaders.Any variable decorated with
ObjectRayOriginNV
must be declared using theInput
storage class.Any variable decorated with
ObjectRayOriginNV
must be declared as a three-component vector of 32-bit floating-point values.
ObjectToWorldNV
-
A variable decorated with the
ObjectToWorldNV
decoration will contain the current object-to-world transformation matrix, which is determined by the instance of the current intersection.The
ObjectToWorldNV
decoration must only be used within intersection, any hit, and closest hit shaders.Any variable decorated with
ObjectToWorldNV
must be declared using theInput
storage class.Any variable decorated with
ObjectToWorldNV
must be declared as a matrix with four columns of three-component vectors of 32-bit floating-point values. PatchVertices
-
Decorating a variable with the
PatchVertices
built-in decoration will make that variable contain the number of vertices in the input patch being processed by the shader. A single tessellation control or tessellation evaluation shader can read patches of differing sizes, so the value of thePatchVertices
variable may differ between patches.The
PatchVertices
decoration must be used only within tessellation control and tessellation evaluation shaders.The variable decorated with
PatchVertices
must be declared using theInput
storage class.The variable decorated with
PatchVertices
must be declared as a scalar 32-bit integer. PointCoord
-
Decorating a variable with the
PointCoord
built-in decoration will make that variable contain the coordinate of the current fragment within the point being rasterized, normalized to the size of the point with origin in the upper left corner of the point, as described in Basic Point Rasterization. If the primitive the fragment shader invocation belongs to is not a point, then the variable decorated withPointCoord
contains an undefined value.The
PointCoord
decoration must be used only within fragment shaders.The variable decorated with
PointCoord
must be declared using theInput
storage class.The variable decorated with
PointCoord
must be declared as two-component vector of 32-bit floating-point values.
Note
Depending on how the point is rasterized, |
PointSize
-
Decorating a variable with the
PointSize
built-in decoration will make that variable contain the size of point primitives. The value written to the variable decorated withPointSize
by the last vertex processing stage in the pipeline is used as the framebuffer-space size of points produced by rasterization.The
PointSize
decoration must be used only within mesh, vertex, tessellation control, tessellation evaluation, and geometry shaders.In a mesh or vertex shader, any variable decorated with
PointSize
must be declared using theOutput
storage class.In a tessellation control, tessellation evaluation, or geometry shader, any variable decorated with
PointSize
must be declared using either theInput
orOutput
storage class.Any variable decorated with
PointSize
must be declared as a scalar 32-bit floating-point value.
Note
When |
Position
-
Decorating a variable with the
Position
built-in decoration will make that variable contain the position of the current vertex. In the last vertex processing stage, the value of the variable decorated withPosition
is used in subsequent primitive assembly, clipping, and rasterization operations.The
Position
decoration must be used only within mesh, vertex, tessellation control, tessellation evaluation, and geometry shaders.In a mesh or vertex shader, any variable decorated with
Position
must be declared using theOutput
storage class.In a tessellation control, tessellation evaluation, or geometry shader, any variable decorated with
Position
must not be declared in a storage class other thanInput
orOutput
.Any variable decorated with
Position
must be declared as a four-component vector of 32-bit floating-point values.
Note
When |
PositionPerViewNV
-
Decorating a variable with the
PositionPerViewNV
built-in decoration will make that variable contain the position of the current vertex, for each view.The
PositionPerViewNV
decoration must be used only within mesh, vertex, tessellation control, tessellation evaluation, and geometry shaders.In a vertex shader, any variable decorated with
PositionPerViewNV
must be declared using theOutput
storage class.In a tessellation control, tessellation evaluation, or geometry shader, any variable decorated with
PositionPerViewNV
must not be declared in a storage class other than input or output.Any variable decorated with
PositionPerViewNV
must be declared as an array of four-component vector of 32-bit floating-point values with at least as many elements as the maximum view in the subpass’s view mask plus one. The array must be indexed by a constant or specialization constant.Elements of the array correspond to views in a multiview subpass, and those elements corresponding to views in the view mask of the subpass the shader is compiled against will be used as the position value for those views. For the final vertex processing stage in the pipeline, values written to an output variable decorated with
PositionPerViewNV
are used in subsequent primitive assembly, clipping, and rasterization operations, as withPosition
.PositionPerViewNV
output in an earlier vertex processing stage is available as an input in the subsequent vertex processing stage.If a shader is compiled against a subpass that has the
VK_SUBPASS_DESCRIPTION_PER_VIEW_POSITION_X_ONLY_BIT_NVX
bit set, then the position values for each view must not differ in any component other than the X component. If the values do differ, one will be chosen in an implementation-dependent manner.
PrimitiveCountNV
-
Decorating a variable with the
PrimitiveCountNV
decoration will make that variable contain the primitive count. The primitive count specifies the number of primitives in the output mesh produced by the mesh shader that will be processed by subsequent pipeline stages.The
PrimitiveCountNV
decoration must only be used in mesh shaders.Any variable decorated with
PrimitiveCountNV
must be declared using theOutput
storage class.Any variable decorated with
PrimitiveCountNV
must be declared as a scalar 32-bit integer. PrimitiveId
PrimitiveId
-
Decorating a variable with the PrimitiveId built-in decoration will make that variable contain the index of the current primitive.
The index of the first primitive generated by a drawing command is zero, and the index is incremented after every individual point, line, or triangle primitive is processed. For triangles drawn as points or line segments (see Polygon Mode), the primitive index is incremented only once, even if multiple points or lines are eventually drawn.
Variables decorated with PrimitiveId are reset to zero between each instance drawn. Restarting a primitive topology using primitive restart has no effect on the value of variables decorated with PrimitiveId.
In tessellation control and tessellation evaluation shaders, it will contain the index of the patch within the current set of rendering primitives that corresponds to the shader invocation.
In a geometry shader, it will contain the number of primitives presented as input to the shader since the current set of rendering primitives was started.
In a fragment shader, it will contain the primitive index written by the geometry shader if a geometry shader is present, or the value that would have been presented as input to the geometry shader had it been present.
In an intersection, any hit, or closest hit shader, it will contain the index of the triangle or bounding box being processed.
If a geometry shader is present and the fragment shader reads from an input variable decorated with PrimitiveId, then the geometry shader must write to an output variable decorated with PrimitiveId in all execution paths.
If a mesh shader is present and the fragment shader reads from an input variable decorated with PrimitiveId, then the mesh shader must write to the output variables decorated with PrimitiveId in all execution paths.
The PrimitiveId decoration must be used only within mesh, intersection, any hit, closest hit, fragment, tessellation control, tessellation evaluation, and geometry shaders.
In an intersection, any hit, closest hit, tessellation control, or tessellation evaluation shader, any variable decorated with PrimitiveId must be declared using the Input storage class.
In a geometry shader, any variable decorated with PrimitiveId must be declared using either the Input or Output storage class.
In a mesh shader, any variable decorated with PrimitiveId must be declared using the Output storage class.
In a fragment shader, any variable decorated with PrimitiveId must be declared using the Input storage class, and either the Geometry or Tessellation capability must also be declared.
Any variable decorated with PrimitiveId must be declared as a scalar 32-bit integer.
PrimitiveIndicesNV
-
Decorating a variable with the PrimitiveIndicesNV decoration will make that variable contain the output array of vertex index values. Depending on the output primitive type declared using the execution mode, the indices are split into groups of one (OutputPoints), two (OutputLinesNV), or three (OutputTrianglesNV) indices, and each group generates a primitive.
All index values must be in the range [0, N-1], where N is the value specified by the OutputVertices execution mode. Out-of-bounds index values result in undefined behavior.
The PrimitiveIndicesNV decoration must only be used in mesh shaders.
Any variable decorated with PrimitiveIndicesNV must be declared using the Output storage class.
Any variable decorated with PrimitiveIndicesNV must be declared as an array of scalar 32-bit integers. The array must be sized according to the output primitive type and the OutputPrimitivesNV execution mode, where the size is one of the following (see the sketch after this list):
-
the value specified by OutputPrimitivesNV if the execution mode is OutputPoints,
-
two times the value specified by OutputPrimitivesNV if the execution mode is OutputLinesNV, or
-
three times the value specified by OutputPrimitivesNV if the execution mode is OutputTrianglesNV.
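As a non-normative illustration of this sizing rule, the following C sketch computes the required array length; the enumeration and function names are hypothetical and not part of any API:

```c
#include <stdint.h>

/* Hypothetical enumeration of the mesh shader output primitive execution modes. */
typedef enum {
    OUTPUT_POINTS,
    OUTPUT_LINES_NV,
    OUTPUT_TRIANGLES_NV
} OutputPrimitiveType;

/* Number of elements the PrimitiveIndicesNV array must declare:
 * the number of indices per primitive times the OutputPrimitivesNV value. */
static uint32_t primitiveIndicesArraySize(OutputPrimitiveType type, uint32_t outputPrimitivesNV)
{
    switch (type) {
    case OUTPUT_POINTS:       return 1u * outputPrimitivesNV;
    case OUTPUT_LINES_NV:     return 2u * outputPrimitivesNV;
    case OUTPUT_TRIANGLES_NV: return 3u * outputPrimitivesNV;
    }
    return 0u;
}
```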
RayTmaxNV
-
A variable decorated with the RayTmaxNV decoration will contain the parametric tmax value of the ray being processed. The value is independent of the space in which the ray origin and direction exist.
The tmax value changes throughout the lifetime of the ray query that produced the intersection. In the closest hit shader, the value reflects the closest distance to the intersected primitive. In the any hit shader, it reflects the distance to the primitive currently being intersected. In the intersection shader, it reflects the distance to the closest primitive intersected so far. The value can change in the intersection shader after calling OpReportIntersectionNV if the corresponding any hit shader does not ignore the intersection. In a miss shader, the value is identical to the parameter passed into OpTraceNV.
The RayTmaxNV decoration must only be used within intersection, any hit, closest hit, and miss shaders.
Any variable decorated with RayTmaxNV must be declared with the Input storage class.
Any variable decorated with RayTmaxNV must be declared as a scalar 32-bit floating-point value.
RayTminNV
-
A variable decorated with the RayTminNV decoration will contain the parametric tmin value of the ray being processed. The value is independent of the space in which the ray origin and direction exist.
The tmin value remains constant for the duration of the ray query.
The RayTminNV decoration must only be used within intersection, any hit, closest hit, and miss shaders.
Any variable decorated with RayTminNV must be declared with the Input storage class.
Any variable decorated with RayTminNV must be declared as a scalar 32-bit floating-point value.
SampleId
-
Decorating a variable with the SampleId built-in decoration will make that variable contain the zero-based index of the sample the invocation corresponds to. SampleId ranges from zero to the number of samples in the framebuffer minus one.
If a fragment shader entry point’s interface includes an input variable decorated with SampleId, Sample Shading is considered enabled with a minSampleShading value of 1.0.
The SampleId decoration must be used only within fragment shaders.
The variable decorated with SampleId must be declared using the Input storage class.
The variable decorated with SampleId must be declared as a scalar 32-bit integer.
SampleMask
-
Decorating a variable with the SampleMask built-in decoration will make that variable contain the sample coverage mask for the current fragment shader invocation.
A variable in the Input storage class decorated with SampleMask will contain a bitmask of the set of samples covered by the primitive generating the fragment during rasterization. It has a sample bit set if and only if the sample is considered covered for this fragment shader invocation. SampleMask[] is an array of integers. Bits are mapped to samples in a manner where bit B of mask M (SampleMask[M]) corresponds to sample 32 × M + B.
When state specifies multiple fragment shader invocations for a given fragment, the sample mask for any single fragment shader invocation specifies the subset of the covered samples for the fragment that correspond to the invocation. In this case, the bit corresponding to each covered sample will be set in exactly one fragment shader invocation.
If the PostDepthCoverage execution mode is specified, the sample is considered covered if and only if the sample is covered by the primitive and the sample passes the early per-fragment tests. Otherwise the sample is considered covered if the sample is covered by the primitive, regardless of the result of the fragment tests.
A variable in the Output storage class decorated with SampleMask is an array of integers forming a bit array in a manner similar to an input variable decorated with SampleMask, but where each bit represents coverage as computed by the shader. Modifying the sample mask by writing zero to a bit of SampleMask causes the sample to be considered uncovered. If this variable is also decorated with OverrideCoverageNV, the fragment coverage is replaced with the sample mask bits set in the shader; otherwise the fragment coverage is ANDed with the bits of the sample mask. If the fragment shader is being evaluated at any frequency other than per-fragment, bits of the sample mask not corresponding to the current fragment shader invocation are ignored. This array must be sized in the fragment shader, either implicitly or explicitly, to be no larger than the implementation-dependent maximum sample-mask (as an array of 32-bit elements), determined by the maximum number of samples.
If a fragment shader entry point’s interface includes an output variable decorated with SampleMask, the sample mask will be undefined for any array elements of any fragment shader invocations that fail to assign a value. If a fragment shader entry point’s interface does not include an output variable decorated with SampleMask, the sample mask has no effect on the processing of a fragment.
The SampleMask decoration must be used only within fragment shaders.
Any variable decorated with SampleMask must be declared using either the Input or Output storage class.
Any variable decorated with SampleMask must be declared as an array of 32-bit integers.
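The bit-to-sample mapping described above can be illustrated with a short C sketch (the helper names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit B of mask word M corresponds to sample 32 * M + B. */
static bool sampleCovered(const uint32_t *sampleMask, uint32_t sample)
{
    return (sampleMask[sample / 32u] >> (sample % 32u)) & 1u;
}

/* Writing zero to a bit of an output SampleMask marks that sample as uncovered. */
static void discardSample(uint32_t *sampleMask, uint32_t sample)
{
    sampleMask[sample / 32u] &= ~(1u << (sample % 32u));
}
```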
SamplePosition
-
Decorating a variable with the SamplePosition built-in decoration will make that variable contain the sub-pixel position of the sample being shaded. The top left of the pixel is considered to be at coordinate (0,0) and the bottom right of the pixel is considered to be at coordinate (1,1).
If the render pass has a fragment density map attachment, the variable will instead contain the sub-fragment position of the sample being shaded. The top left of the fragment is considered to be at coordinate (0,0) and the bottom right of the fragment is considered to be at coordinate (1,1) for any fragment area.
If a fragment shader entry point’s interface includes an input variable decorated with SamplePosition, Sample Shading is considered enabled with a minSampleShading value of 1.0.
The SamplePosition decoration must be used only within fragment shaders.
The variable decorated with SamplePosition must be declared using the Input storage class.
If the current pipeline uses custom sample locations, the value of any variable decorated with the SamplePosition built-in decoration is undefined.
The variable decorated with SamplePosition must be declared as a two-component vector of 32-bit floating-point values.
SubgroupId
-
Decorating a variable with the SubgroupId built-in decoration will make that variable contain the index of the subgroup within the local workgroup. This variable is in the range [0, NumSubgroups-1].
The SubgroupId decoration must be used only within task, mesh, or compute shaders.
The variable decorated with SubgroupId must be declared using the Input storage class.
The variable decorated with SubgroupId must be declared as a scalar 32-bit integer.
SubgroupEqMask
-
Decorating a variable with the SubgroupEqMask built-in decoration will make that variable contain the subgroup mask of the current subgroup invocation. The bit corresponding to SubgroupLocalInvocationId is set in the variable decorated with SubgroupEqMask. All other bits are set to zero.
The variable decorated with SubgroupEqMask must be declared using the Input storage class.
The variable decorated with SubgroupEqMask must be declared as a four-component vector of 32-bit integer values.
SubgroupEqMaskKHR is an alias of SubgroupEqMask.
SubgroupGeMask
-
Decorating a variable with the SubgroupGeMask built-in decoration will make that variable contain the subgroup mask of the current subgroup invocation. The bits corresponding to the invocations greater than or equal to SubgroupLocalInvocationId through SubgroupSize-1 are set in the variable decorated with SubgroupGeMask. All other bits are set to zero.
The variable decorated with SubgroupGeMask must be declared using the Input storage class.
The variable decorated with SubgroupGeMask must be declared as a four-component vector of 32-bit integer values.
SubgroupGeMaskKHR is an alias of SubgroupGeMask.
SubgroupGtMask
-
Decorating a variable with the SubgroupGtMask built-in decoration will make that variable contain the subgroup mask of the current subgroup invocation. The bits corresponding to the invocations greater than SubgroupLocalInvocationId through SubgroupSize-1 are set in the variable decorated with SubgroupGtMask. All other bits are set to zero.
The variable decorated with SubgroupGtMask must be declared using the Input storage class.
The variable decorated with SubgroupGtMask must be declared as a four-component vector of 32-bit integer values.
SubgroupGtMaskKHR is an alias of SubgroupGtMask.
SubgroupLeMask
-
Decorating a variable with the SubgroupLeMask built-in decoration will make that variable contain the subgroup mask of the current subgroup invocation. The bits corresponding to the invocations less than or equal to SubgroupLocalInvocationId are set in the variable decorated with SubgroupLeMask. All other bits are set to zero.
The variable decorated with SubgroupLeMask must be declared using the Input storage class.
The variable decorated with SubgroupLeMask must be declared as a four-component vector of 32-bit integer values.
SubgroupLeMaskKHR is an alias of SubgroupLeMask.
SubgroupLtMask
-
Decorating a variable with the SubgroupLtMask built-in decoration will make that variable contain the subgroup mask of the current subgroup invocation. The bits corresponding to the invocations less than SubgroupLocalInvocationId are set in the variable decorated with SubgroupLtMask. All other bits are set to zero.
The variable decorated with SubgroupLtMask must be declared using the Input storage class.
The variable decorated with SubgroupLtMask must be declared as a four-component vector of 32-bit integer values.
SubgroupLtMaskKHR is an alias of SubgroupLtMask.
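The relationship between SubgroupLocalInvocationId and the five mask built-ins above can be illustrated with the following C sketch, written for subgroups of at most 64 invocations using a single 64-bit mask rather than the four-component vector representation; the names are illustrative only:

```c
#include <stdint.h>

typedef struct {
    uint64_t eq, ge, gt, le, lt;
} SubgroupMasks;

/* Compute the Eq/Ge/Gt/Le/Lt masks for invocation `id` in a subgroup of
 * `size` invocations (size <= 64 in this sketch; the built-ins use a
 * four-component vector so that up to 128 invocations can be represented). */
static SubgroupMasks subgroupMasks(uint32_t id, uint32_t size)
{
    uint64_t all = (size >= 64u) ? ~0ull : ((1ull << size) - 1ull);
    SubgroupMasks m;
    m.eq = 1ull << id;       /* only the bit for this invocation            */
    m.lt = m.eq - 1ull;      /* invocations 0 .. id-1                       */
    m.le = m.lt | m.eq;      /* invocations 0 .. id                         */
    m.ge = all & ~m.lt;      /* invocations id .. size-1                    */
    m.gt = all & ~m.le;      /* invocations id+1 .. size-1                  */
    return m;
}
```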
SubgroupLocalInvocationId
-
Decorating a variable with the SubgroupLocalInvocationId built-in decoration will make that variable contain the index of the invocation within the subgroup. This variable is in the range [0, SubgroupSize-1].
The variable decorated with SubgroupLocalInvocationId must be declared using the Input storage class.
The variable decorated with SubgroupLocalInvocationId must be declared as a scalar 32-bit integer.
SubgroupSize
-
Decorating a variable with the SubgroupSize built-in decoration will make that variable contain the implementation-dependent maximum number of invocations in a subgroup. The maximum number of invocations that an implementation can support per subgroup is 128.
The variable decorated with SubgroupSize must be declared using the Input storage class.
The variable decorated with SubgroupSize must be declared as a scalar 32-bit integer.
TaskCountNV
-
Decorating a variable with the TaskCountNV decoration will make that variable contain the task count. The task count specifies the number of subsequent mesh shader workgroups that get generated upon completion of the task shader.
The TaskCountNV decoration must only be used in task shaders.
Any variable decorated with TaskCountNV must be declared using the Output storage class.
Any variable decorated with TaskCountNV must be declared as a scalar 32-bit integer.
TessCoord
-
Decorating a variable with the TessCoord built-in decoration will make that variable contain the three-dimensional (u,v,w) barycentric coordinate of the tessellated vertex within the patch. u, v, and w are in the range [0,1] and vary linearly across the primitive being subdivided. For the tessellation modes of Quads or IsoLines, the third component is always zero.
The TessCoord decoration must be used only within tessellation evaluation shaders.
The variable decorated with TessCoord must be declared using the Input storage class.
The variable decorated with TessCoord must be declared as a three-component vector of 32-bit floating-point values.
TessLevelOuter
-
Decorating a variable with the TessLevelOuter built-in decoration will make that variable contain the outer tessellation levels for the current patch.
In tessellation control shaders, the variable decorated with TessLevelOuter can be written to, which controls the tessellation factors for the resulting patch. These values are used by the tessellator to control primitive tessellation and can be read by tessellation evaluation shaders.
In tessellation evaluation shaders, the variable decorated with TessLevelOuter can read the values written by the tessellation control shader.
The TessLevelOuter decoration must be used only within tessellation control and tessellation evaluation shaders.
In a tessellation control shader, any variable decorated with TessLevelOuter must be declared using the Output storage class.
In a tessellation evaluation shader, any variable decorated with TessLevelOuter must be declared using the Input storage class.
Any variable decorated with TessLevelOuter must be declared as an array of size four, containing 32-bit floating-point values.
TessLevelInner
-
Decorating a variable with the TessLevelInner built-in decoration will make that variable contain the inner tessellation levels for the current patch.
In tessellation control shaders, the variable decorated with TessLevelInner can be written to, which controls the tessellation factors for the resulting patch. These values are used by the tessellator to control primitive tessellation and can be read by tessellation evaluation shaders.
In tessellation evaluation shaders, the variable decorated with TessLevelInner can read the values written by the tessellation control shader.
The TessLevelInner decoration must be used only within tessellation control and tessellation evaluation shaders.
In a tessellation control shader, any variable decorated with TessLevelInner must be declared using the Output storage class.
In a tessellation evaluation shader, any variable decorated with TessLevelInner must be declared using the Input storage class.
Any variable decorated with TessLevelInner must be declared as an array of size two, containing 32-bit floating-point values.
VertexIndex
-
Decorating a variable with the VertexIndex built-in decoration will make that variable contain the index of the vertex that is being processed by the current vertex shader invocation.
For non-indexed draws, this variable begins at the firstVertex parameter to vkCmdDraw or the firstVertex member of a structure consumed by vkCmdDrawIndirect and increments by one for each vertex in the draw. For indexed draws, its value is the content of the index buffer for the vertex plus the vertexOffset parameter to vkCmdDrawIndexed or the vertexOffset member of the structure consumed by vkCmdDrawIndexedIndirect.
The VertexIndex decoration must be used only within vertex shaders.
The variable decorated with VertexIndex must be declared using the Input storage class.
The variable decorated with VertexIndex must be declared as a scalar 32-bit integer.
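The relationship between VertexIndex and the draw parameters described above can be illustrated with the following C sketch; the helper names are hypothetical, and this is an illustration of the arithmetic rather than a statement about how implementations generate the value:

```c
#include <stdint.h>

/* VertexIndex for the i-th vertex of a non-indexed draw (vkCmdDraw or
 * vkCmdDrawIndirect): starts at firstVertex and increments by one. */
static uint32_t vertexIndexNonIndexed(uint32_t firstVertex, uint32_t i)
{
    return firstVertex + i;
}

/* VertexIndex for the i-th vertex of an indexed draw (vkCmdDrawIndexed or
 * vkCmdDrawIndexedIndirect): the fetched index plus vertexOffset. */
static uint32_t vertexIndexIndexed(const uint32_t *indexBuffer, uint32_t firstIndex,
                                   int32_t vertexOffset, uint32_t i)
{
    return (uint32_t)((int32_t)indexBuffer[firstIndex + i] + vertexOffset);
}
```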
ViewIndex
-
The ViewIndex decoration can be applied to a shader input which will be filled with the index of the view that is being processed by the current shader invocation.
If multiview is enabled in the render pass, this value will be one of the bits set in the view mask of the subpass the pipeline is compiled against. If multiview is not enabled in the render pass, this value will be zero.
The ViewIndex decoration must not be used within compute shaders.
The variable decorated with ViewIndex must be declared using the Input storage class.
The variable decorated with ViewIndex must be declared as a scalar 32-bit integer.
ViewportIndex
-
Decorating a variable with the ViewportIndex built-in decoration will make that variable contain the index of the viewport.
In a mesh, vertex, tessellation evaluation, or geometry shader, the variable decorated with ViewportIndex can be written to with the viewport index to which the primitive produced by that shader will be directed. The selected viewport index is used to select the viewport transform, scissor rectangle, and exclusive scissor rectangle.
The last active vertex processing stage (in pipeline order) controls the ViewportIndex that is used. Outputs in previous shader stages are not used, even if the last stage fails to write the ViewportIndex.
If the last active vertex processing stage shader entry point’s interface does not include a variable decorated with ViewportIndex, then the first viewport is used. If a vertex processing stage shader entry point’s interface includes a variable decorated with ViewportIndex, it must write the same value to ViewportIndex for all output vertices of a given primitive.
The ViewportIndex decoration must be used only within mesh, vertex, tessellation evaluation, geometry, and fragment shaders.
In a mesh, vertex, tessellation evaluation, or geometry shader, any variable decorated with ViewportIndex must be declared using the Output storage class.
In a fragment shader, the variable decorated with ViewportIndex contains the viewport index of the primitive that the fragment invocation belongs to. In a fragment shader, any variable decorated with ViewportIndex must be declared using the Input storage class.
Any variable decorated with ViewportIndex must be declared as a scalar 32-bit integer.
ViewportMaskNV
-
Decorating a variable with the ViewportMaskNV built-in decoration will make that variable contain the viewport mask.
In a mesh, vertex, tessellation evaluation, or geometry shader, the variable decorated with ViewportMaskNV can be written to with the mask of the viewports to which the primitive produced by that shader will be directed.
The ViewportMaskNV variable must be an array that has ⌈(VkPhysicalDeviceLimits::maxViewports / 32)⌉ elements. When a shader writes to this variable, bit B of element M controls whether a primitive is emitted to viewport 32 × M + B. The viewports indicated by the mask are used to select the viewport transform, scissor rectangle, and exclusive scissor rectangle that a primitive will be transformed by.
The last active vertex processing stage (in pipeline order) controls the ViewportMaskNV that is used. Outputs in previous shader stages are not used, even if the last stage fails to write the ViewportMaskNV. When ViewportMaskNV is written by the final vertex processing stage, any variable decorated with ViewportIndex in the fragment shader will have the index of the viewport that was used in generating that fragment.
If a vertex processing stage shader entry point’s interface includes a variable decorated with ViewportMaskNV, it must write the same value to ViewportMaskNV for all output vertices of a given primitive.
The ViewportMaskNV decoration must be used only within mesh, vertex, tessellation evaluation, and geometry shaders.
Any variable decorated with ViewportMaskNV must be declared using the Output storage class.
Any variable decorated with ViewportMaskNV must be declared as an array of 32-bit integers.
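A brief C sketch of the element and bit arithmetic described above (the names are illustrative only):

```c
#include <stdint.h>

/* Number of 32-bit elements the ViewportMaskNV array must have,
 * i.e. the ceiling of maxViewports / 32. */
static uint32_t viewportMaskLength(uint32_t maxViewports /* VkPhysicalDeviceLimits::maxViewports */)
{
    return (maxViewports + 31u) / 32u;
}

/* Set the bit that directs the primitive to viewport `v`:
 * bit B of element M selects viewport 32 * M + B. */
static void emitToViewport(uint32_t *viewportMask, uint32_t v)
{
    viewportMask[v / 32u] |= 1u << (v % 32u);
}
```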
ViewportMaskPerViewNV
-
Decorating a variable with the ViewportMaskPerViewNV built-in decoration will make that variable contain the mask of viewports that primitives are broadcast to, for each view.
The ViewportMaskPerViewNV decoration must be used only within mesh, vertex, tessellation control, tessellation evaluation, and geometry shaders.
Any variable decorated with ViewportMaskPerViewNV must be declared using the Output storage class.
The value written to an element of ViewportMaskPerViewNV in the last vertex processing stage is a bitmask indicating which viewports the primitive will be directed to. The primitive will be broadcast to the viewport corresponding to each non-zero bit of the bitmask, and that viewport index is used to select the viewport transform, scissor rectangle, and exclusive scissor rectangle, for each view. The same values must be written to all vertices in a given primitive, or else the set of viewports used for that primitive is undefined.
Any variable decorated with ViewportMaskPerViewNV must be declared as an array of scalar 32-bit integers with at least as many elements as the maximum view in the subpass’s view mask plus one. The array must be indexed by a constant or specialization constant.
Elements of the array correspond to views in a multiview subpass, and those elements corresponding to views in the view mask of the subpass the shader is compiled against will be used as the viewport mask value for those views.
A ViewportMaskPerViewNV output in an earlier vertex processing stage is not available as an input in the subsequent vertex processing stage.
Although ViewportMaskNV is an array, ViewportMaskPerViewNV is not a two-dimensional array. Instead, ViewportMaskPerViewNV is limited to 32 viewports.
WorkgroupId
-
Decorating a variable with the WorkgroupId built-in decoration will make that variable contain the global workgroup that the current invocation is a member of. Each component ranges from a base value to a base + count value, based on the parameters passed into the dispatch commands.
The WorkgroupId decoration must be used only within task, mesh, or compute shaders.
The variable decorated with WorkgroupId must be declared using the Input storage class.
The variable decorated with WorkgroupId must be declared as a three-component vector of 32-bit integers.
WorkgroupSize
-
Decorating an object with the WorkgroupSize built-in decoration will make that object contain the dimensions of a local workgroup. If an object is decorated with the WorkgroupSize decoration, this must take precedence over any execution mode set for LocalSize.
The WorkgroupSize decoration must be used only within task, mesh, or compute shaders.
The object decorated with WorkgroupSize must be a specialization constant or a constant.
The object decorated with WorkgroupSize must be declared as a three-component vector of 32-bit integers.
WorldRayDirectionNV
-
A variable decorated with the WorldRayDirectionNV decoration will specify the direction of the ray being processed, in world space.
The WorldRayDirectionNV decoration must only be used within intersection, any hit, closest hit, and miss shaders.
Any variable decorated with WorldRayDirectionNV must be declared using the Input storage class.
Any variable decorated with WorldRayDirectionNV must be declared as a three-component vector of 32-bit floating-point values.
WorldRayOriginNV
-
A variable decorated with the WorldRayOriginNV decoration will specify the origin of the ray being processed, in world space.
The WorldRayOriginNV decoration must only be used within intersection, any hit, closest hit, and miss shaders.
Any variable decorated with WorldRayOriginNV must be declared using the Input storage class.
Any variable decorated with WorldRayOriginNV must be declared as a three-component vector of 32-bit floating-point values.
WorldToObjectNV
-
A variable decorated with the WorldToObjectNV decoration will contain the current world-to-object transformation matrix, which is determined by the instance of the current intersection.
The WorldToObjectNV decoration must only be used within intersection, any hit, and closest hit shaders.
Any variable decorated with WorldToObjectNV must be declared using the Input storage class.
Any variable decorated with WorldToObjectNV must be declared as a matrix with four columns of three-component vectors of 32-bit floating-point values.
15. Image Operations
15.1. Image Operations Overview
Image Operations are steps performed by SPIR-V image instructions: those instructions that take an OpTypeImage (representing a VkImageView) or OpTypeSampledImage (representing a (VkImageView, VkSampler) pair) and texel coordinates as operands, and that return a value based on one or more neighboring texture elements (texels) in the image.
Note
Texel is a term which is a combination of the words texture and element. Early interactive computer graphics supported texture operations on textures, a small subset of the image operations on images described here. The discrete samples remain essentially equivalent, however, so we retain the historical term texel to refer to them.
SPIR-V Image Instructions include the following functionality:
-
OpImageSample* and OpImageSparseSample* read one or more neighboring texels of the image, and filter the texel values based on the state of the sampler.
-
Instructions with ImplicitLod in the name determine the LOD used in the sampling operation based on the coordinates used in neighboring fragments.
-
Instructions with ExplicitLod in the name determine the LOD used in the sampling operation based on additional coordinates.
-
Instructions with Proj in the name apply homogeneous projection to the coordinates.
-
OpImageFetch and OpImageSparseFetch return a single texel of the image. No sampler is used.
-
OpImage*Gather and OpImageSparse*Gather read neighboring texels and return a single component of each.
-
OpImageRead (and OpImageSparseRead) and OpImageWrite read and write, respectively, a texel in the image. No sampler is used.
-
OpImageSampleFootprintNV identifies and returns information about the set of texels in the image that would be accessed by an equivalent OpImageSample* instruction.
-
Instructions with Dref in the name apply depth comparison on the texel values.
-
Instructions with Sparse in the name additionally return a sparse residency code.
15.1.1. Texel Coordinate Systems
Images are addressed by texel coordinates. There are three texel coordinate systems:
-
normalized texel coordinates [0.0, 1.0]
-
unnormalized texel coordinates [0.0, width / height / depth)
-
integer texel coordinates [0, width / height / depth)
SPIR-V OpImageFetch, OpImageSparseFetch, OpImageRead, OpImageSparseRead, and OpImageWrite instructions use integer texel coordinates.
Other image instructions can use either normalized or unnormalized texel
coordinates (selected by the unnormalizedCoordinates
state of the
sampler used in the instruction), but there are
limitations on what operations, image
state, and sampler state is supported.
Normalized coordinates are logically
converted to unnormalized as part of
image operations, and certain steps are
only performed on normalized coordinates.
The array layer coordinate is always treated as unnormalized even when other
coordinates are normalized.
Normalized texel coordinates are referred to as (s,t,r,q,a), with the coordinates having the following meanings:
-
s: Coordinate in the first dimension of an image.
-
t: Coordinate in the second dimension of an image.
-
r: Coordinate in the third dimension of an image.
-
(s,t,r) are interpreted as a direction vector for Cube images.
-
-
q: Fourth coordinate, for homogeneous (projective) coordinates.
-
a: Coordinate for array layer.
The coordinates are extracted from the SPIR-V operand based on the
dimensionality of the image variable and type of instruction.
For Proj
instructions, the components are in order (s, [t,] [r,] q)
with t and r being conditionally present based on the Dim
of the image.
For non-Proj
instructions, the coordinates are (s [,t] [,r] [,a]), with
t and r being conditionally present based on the Dim
of the image and a
being conditionally present based on the Arrayed
property of the image.
Projective image instructions are not supported on Arrayed
images.
Unnormalized texel coordinates are referred to as (u,v,w,a), with the coordinates having the following meanings:
-
u: Coordinate in the first dimension of an image.
-
v: Coordinate in the second dimension of an image.
-
w: Coordinate in the third dimension of an image.
-
a: Coordinate for array layer.
Only the u and v coordinates are directly extracted from the
SPIR-V operand, because only 1D and 2D (non-Arrayed
) dimensionalities
support unnormalized coordinates.
The components are in order (u [,v]), with v being conditionally
present when the dimensionality is 2D.
When normalized coordinates are converted to unnormalized coordinates, all
four coordinates are used.
Integer texel coordinates are referred to as (i,j,k,l,n), and the
first four in that order have the same meanings as unnormalized texel
coordinates.
They are extracted from the SPIR-V operand in order (i [,j] [,k] [,l]), with j and k conditionally present based on the Dim of the image, and l conditionally present based on the Arrayed property of the image.
n is the sample index and is taken from the Sample
image operand.
For all coordinate types, unused coordinates are assigned a value of zero.
The Texel Coordinate Systems - for the example shown of an 8×4 texel two-dimensional image:
-
Normalized texel coordinates:
-
The s coordinate goes from 0.0 to 1.0, left to right.
-
The t coordinate goes from 0.0 to 1.0, top to bottom.
-
-
Unnormalized texel coordinates:
-
The u coordinate goes from -1.0 to 9.0, left to right. The u coordinate within the range 0.0 to 8.0 is within the image, otherwise it is within the border.
-
The v coordinate goes from -1.0 to 5.0, top to bottom. The v coordinate within the range 0.0 to 4.0 is within the image, otherwise it is within the border.
-
-
Integer texel coordinates:
-
The i coordinate goes from -1 to 8, left to right. The i coordinate within the range 0 to 7 addresses texels within the image, otherwise it addresses a border texel.
-
The j coordinate goes from -1 to 5, top to bottom. The j coordinate within the range 0 to 3 addresses texels within the image, otherwise it addresses a border texel.
-
-
Also shown for linear filtering:
-
Given the unnormalized coordinates (u,v), the four texels selected are i0j0, i1j0, i0j1, and i1j1.
-
The weights α and β.
-
Given the offset Δi and Δj, the four texels selected by the offset are i0j'0, i1j'0, i0j'1, and i1j'1.
-
Note
For formats with reduced-resolution channels, Δi and Δj are relative to the resolution of the highest-resolution channel, and therefore may be divided by two relative to the unnormalized coordinate space of the lower-resolution channels.
The Texel Coordinate Systems - for the example shown of an 8×4 texel two-dimensional image:
-
Texel coordinates as above. Also shown for nearest filtering:
-
Given the unnormalized coordinates (u,v), the texel selected is ij.
-
Given the offset Δi and Δj, the texel selected by the offset is ij'.
-
For corner-sampled images, the texel samples are located at the grid intersections instead of the texel centers.
editing-note
(dgkoch) TODO add a diagram demonstrating this.
15.2. Conversion Formulas
editing-note
(Bill) These Conversion Formulas will likely move to Section 2.7 Fixed-Point Data Conversions (RGB to sRGB and sRGB to RGB) and section 2.6 Numeric Representation and Computation (RGB to Shared Exponent and Shared Exponent to RGB)
15.2.1. RGB to Shared Exponent Conversion
An RGB color (red, green, blue) is transformed to a shared exponent color (redshared, greenshared, blueshared, expshared) as follows:
First, the components (red, green, blue) are clamped to (redclamped, greenclamped, blueclamped) as:
-
redclamped = max(0, min(sharedexpmax, red))
-
greenclamped = max(0, min(sharedexpmax, green))
-
blueclamped = max(0, min(sharedexpmax, blue))
Where:
\[sharedexp_{max} = {{2^N-1}\over{2^N}} \times 2^{(E_{max}-B)}\]
-
N = 9 (number of mantissa bits per component)
-
B = 15 (exponent bias)
-
Emax = 31 (maximum possible biased exponent value)
Note
NaN, if supported, is handled as in IEEE 754-2008.
The largest clamped component, maxclamped, is determined:
-
maxclamped = max(redclamped, greenclamped, blueclamped)
A preliminary shared exponent exp' is computed:
\[exp' = \begin{cases} \left\lfloor \log_2(max_{clamped}) \right\rfloor + (B+1) & \textrm{for}\ max_{clamped} > 2^{-(B+1)} \\ 0 & \textrm{for}\ max_{clamped} \leq 2^{-(B+1)} \end{cases}\]
The shared exponent expshared is computed:
\[exp_{shared} = \begin{cases} exp' & \textrm{if}\ 0 \leq \left\lfloor {{max_{clamped}}\over{2^{(exp'-B-N)}}}+0.5 \right\rfloor < 2^N \\ exp'+1 & \textrm{if}\ \left\lfloor {{max_{clamped}}\over{2^{(exp'-B-N)}}}+0.5 \right\rfloor = 2^N \end{cases}\]
Finally, three integer values in the range 0 to \(2^N-1\) are computed:
\[\begin{aligned} red_{shared} & = \left\lfloor {{red_{clamped}}\over{2^{(exp_{shared}-B-N)}}}+0.5 \right\rfloor \\ green_{shared} & = \left\lfloor {{green_{clamped}}\over{2^{(exp_{shared}-B-N)}}}+0.5 \right\rfloor \\ blue_{shared} & = \left\lfloor {{blue_{clamped}}\over{2^{(exp_{shared}-B-N)}}}+0.5 \right\rfloor \end{aligned}\]
15.2.2. Shared Exponent to RGB
A shared exponent color (redshared, greenshared, blueshared, expshared) is transformed to an RGB color (red, green, blue) as follows:
-
\(red = red_{shared} \times {2^{(exp_{shared}-B-N)}}\)
-
\(green = green_{shared} \times {2^{(exp_{shared}-B-N)}}\)
-
\(blue = blue_{shared} \times {2^{(exp_{shared}-B-N)}}\)
Where:
-
N = 9 (number of mantissa bits per component)
-
B = 15 (exponent bias)
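As a cross-check of the two conversions, the following C sketch implements the round trip using N = 9, B = 15, and Emax = 31; it is illustrative only (rounding to nearest), not a normative restatement of the conversion:

```c
#include <math.h>
#include <stdint.h>

#define RGB9E5_N    9    /* mantissa bits per component     */
#define RGB9E5_B    15   /* exponent bias                   */
#define RGB9E5_EMAX 31   /* maximum possible biased exponent */

/* Largest representable component value: ((2^N - 1) / 2^N) * 2^(Emax - B). */
static float rgb9e5_max(void)
{
    return ((float)((1 << RGB9E5_N) - 1) / (float)(1 << RGB9E5_N)) *
           exp2f((float)(RGB9E5_EMAX - RGB9E5_B));
}

/* RGB to shared exponent: clamp, derive the shared exponent from the largest
 * clamped component, then round each component to an N-bit integer. */
static void rgb_to_shared_exp(float r, float g, float b, uint32_t out[3], int *expShared)
{
    float m  = rgb9e5_max();
    float rc = fmaxf(0.0f, fminf(m, r));
    float gc = fmaxf(0.0f, fminf(m, g));
    float bc = fmaxf(0.0f, fminf(m, b));
    float maxc = fmaxf(rc, fmaxf(gc, bc));

    int e = (maxc > 0.0f) ? (int)floorf(log2f(maxc)) + (RGB9E5_B + 1) : 0;
    if (e < 0) e = 0;
    float scale = exp2f((float)(e - RGB9E5_B - RGB9E5_N));
    if ((int)floorf(maxc / scale + 0.5f) == (1 << RGB9E5_N)) {
        e += 1;                                   /* rounding overflowed the mantissa */
        scale = exp2f((float)(e - RGB9E5_B - RGB9E5_N));
    }
    out[0] = (uint32_t)floorf(rc / scale + 0.5f);
    out[1] = (uint32_t)floorf(gc / scale + 0.5f);
    out[2] = (uint32_t)floorf(bc / scale + 0.5f);
    *expShared = e;
}

/* Shared exponent to RGB: component = shared_component * 2^(expShared - B - N). */
static void shared_exp_to_rgb(const uint32_t in[3], int expShared, float rgb[3])
{
    float scale = exp2f((float)(expShared - RGB9E5_B - RGB9E5_N));
    rgb[0] = (float)in[0] * scale;
    rgb[1] = (float)in[1] * scale;
    rgb[2] = (float)in[2] * scale;
}
```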
15.3. Texel Input Operations
Texel input instructions are SPIR-V image instructions that read from an image. Texel input operations are a set of steps that are performed on state, coordinates, and texel values while processing a texel input instruction, and which are common to some or all texel input instructions. They include the steps described in the following sections, which are performed in the listed order.
For texel input instructions involving multiple texels (for sampling or gathering), these steps are applied for each texel that is used in the instruction. Depending on the type of image instruction, other steps are conditionally performed between these steps or involving multiple coordinate or texel values.
If Chroma Reconstruction is implicit, Texel Filtering instead takes place during chroma reconstruction, before sampler Y’CBCR conversion occurs.
15.3.1. Texel Input Validation Operations
Texel input validation operations inspect instruction/image/sampler state or coordinates, and in certain circumstances cause the texel value to be replaced or become undefined. There are a series of validations that the texel undergoes.
Instruction/Sampler/Image View Validation
There are a number of cases where a SPIR-V instruction can mismatch with the sampler, the image view, or both. There are a number of cases where the sampler can mismatch with the image view. In such cases the value of the texel returned is undefined.
These cases include:
-
The sampler borderColor is an integer type and the image view format is not one of the VkFormat integer types or a stencil component of a depth/stencil format.
-
The sampler borderColor is a float type and the image view format is not one of the VkFormat float types or a depth component of a depth/stencil format.
-
The sampler borderColor is one of the opaque black colors (VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK or VK_BORDER_COLOR_INT_OPAQUE_BLACK) and the image view VkComponentSwizzle for any of the VkComponentMapping components is not VK_COMPONENT_SWIZZLE_IDENTITY.
-
The VkImageLayout of any subresource in the image view does not match that specified in VkDescriptorImageInfo::imageLayout used to write the image descriptor.
-
If the instruction is OpImageRead or OpImageSparseRead and the shaderStorageImageReadWithoutFormat feature is not enabled, or the instruction is OpImageWrite and the shaderStorageImageWriteWithoutFormat feature is not enabled, then the SPIR-V Image Format must be compatible with the image view’s format.
-
The sampler unnormalizedCoordinates is VK_TRUE and any of the limitations of unnormalized coordinates are violated.
-
The sampler was created with flags containing VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT and the image was not created with flags containing VK_IMAGE_CREATE_SUBSAMPLED_BIT_EXT.
-
The sampler was not created with flags containing VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT and the image was created with flags containing VK_IMAGE_CREATE_SUBSAMPLED_BIT_EXT.
-
The sampler was created with flags containing VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT and is used with a function that is not OpImageSampleImplicitLod or OpImageSampleExplicitLod, or is used with operands Offset or ConstOffsets.
-
The SPIR-V instruction is one of the OpImage*Dref* instructions and the sampler compareEnable is VK_FALSE.
-
The SPIR-V instruction is not one of the OpImage*Dref* instructions and the sampler compareEnable is VK_TRUE.
-
The SPIR-V instruction is one of the OpImage*Dref* instructions and the image view format is not one of the depth/stencil formats with a depth component, or the image view aspect is not VK_IMAGE_ASPECT_DEPTH_BIT.
-
The SPIR-V instruction’s image variable’s properties are not compatible with the image view:
-
Rules for viewType:
-
VK_IMAGE_VIEW_TYPE_1D must have Dim = 1D, Arrayed = 0, MS = 0.
-
VK_IMAGE_VIEW_TYPE_2D must have Dim = 2D, Arrayed = 0.
-
VK_IMAGE_VIEW_TYPE_3D must have Dim = 3D, Arrayed = 0, MS = 0.
-
VK_IMAGE_VIEW_TYPE_CUBE must have Dim = Cube, Arrayed = 0, MS = 0.
-
VK_IMAGE_VIEW_TYPE_1D_ARRAY must have Dim = 1D, Arrayed = 1, MS = 0.
-
VK_IMAGE_VIEW_TYPE_2D_ARRAY must have Dim = 2D, Arrayed = 1.
-
VK_IMAGE_VIEW_TYPE_CUBE_ARRAY must have Dim = Cube, Arrayed = 1, MS = 0.
-
If the image was created with VkImageCreateInfo::samples equal to VK_SAMPLE_COUNT_1_BIT, the instruction must have MS = 0.
-
If the image was created with VkImageCreateInfo::samples not equal to VK_SAMPLE_COUNT_1_BIT, the instruction must have MS = 1.
-
If the image was created with VkImageCreateInfo::flags containing VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV, the sampler addressing modes must only use a VkSamplerAddressMode of VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.
-
The SPIR-V instruction is OpImageSampleFootprintNV with Dim = 2D and addressModeU or addressModeV in the sampler is not VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.
-
The SPIR-V instruction is OpImageSampleFootprintNV with Dim = 3D and addressModeU, addressModeV, or addressModeW in the sampler is not VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.
Only OpImageSample* and OpImageSparseSample* can be used with a sampler that enables sampler Y’CBCR conversion.
OpImageFetch, OpImageSparseFetch, OpImage*Gather, and OpImageSparse*Gather must not be used with a sampler that enables sampler Y’CBCR conversion.
The ConstOffset and Offset operands must not be used with a sampler that enables sampler Y’CBCR conversion.
Integer Texel Coordinate Validation
Integer texel coordinates are validated against the size of the image level, and the number of layers and number of samples in the image. For SPIR-V instructions that use integer texel coordinates, this is performed directly on the integer coordinates. For instructions that use normalized or unnormalized texel coordinates, this is performed on the coordinates that result after conversion to integer texel coordinates.
If the integer texel coordinates do not satisfy all of the conditions
-
0 ≤ i < ws
-
0 ≤ j < hs
-
0 ≤ k < ds
-
0 ≤ l < layers
-
0 ≤ n < samples
where:
-
ws = width of the image level
-
hs = height of the image level
-
ds = depth of the image level
-
layers = number of layers in the image
-
samples = number of samples per texel in the image
then the texel fails integer texel coordinate validation.
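These conditions amount to a set of range checks, sketched here in C with illustrative structure and names:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int32_t width, height, depth;   /* dimensions of the selected image level      */
    int32_t layers, samples;        /* layer count and samples per texel            */
} ImageLevelExtent;

/* Returns true if (i, j, k, l, n) passes integer texel coordinate validation. */
static bool texelCoordinatesValid(const ImageLevelExtent *e,
                                  int32_t i, int32_t j, int32_t k,
                                  int32_t l, int32_t n)
{
    return i >= 0 && i < e->width  &&
           j >= 0 && j < e->height &&
           k >= 0 && k < e->depth  &&
           l >= 0 && l < e->layers &&
           n >= 0 && n < e->samples;
}
```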
There are four cases to consider:
-
Valid Texel Coordinates
-
If the texel coordinates pass validation (that is, the coordinates lie within the image),
then the texel value comes from the value in image memory.
-
-
Border Texel
-
If the texel coordinates fail validation, and
-
If the read is the result of an image sample instruction or image gather instruction, and
-
If the image is not a cube image,
then the texel is a border texel and texel replacement is performed.
-
-
Invalid Texel
-
If the texel coordinates fail validation, and
-
If the read is the result of an image fetch instruction, image read instruction, or atomic instruction,
then the texel is an invalid texel and texel replacement is performed.
-
-
Cube Map Edge or Corner
Otherwise the texel coordinates lie on the borders along the edges and corners of a cube map image, and Cube map edge handling is performed.
Cube Map Edge Handling
If the texel coordinates lie on the borders along the edges and corners of a
cube map image, the following steps are performed.
Note that this only occurs when using VK_FILTER_LINEAR
filtering
within a mip level, since VK_FILTER_NEAREST
is treated as using
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE
.
-
Cube Map Edge Texel
-
If the texel lies along the border in either only i or only j
then the texel lies along an edge, so the coordinates (i,j) and the array layer l are transformed to select the adjacent texel from the appropriate neighboring face.
-
-
Cube Map Corner Texel
-
If the texel lies along the border in both i and j
then the texel lies at a corner and there is no unique neighboring face from which to read that texel. The texel should be replaced by the average of the three values of the adjacent texels in each incident face. However, implementations may replace the cube map corner texel by other methods, subject to the constraint that if the three available samples have the same value, the replacement texel also has that value.
-
Sparse Validation
If the texel reads from an unbound region of a sparse image, the texel is a sparse unbound texel, and processing continues with texel replacement.
Layout Validation
If all planes of a disjoint multi-planar image are not in the same image layout when the image is sampled with sampler Y’CBCR conversion, the values returned by texel reads are undefined.
15.3.2. Format Conversion
Texels undergo a format conversion from the VkFormat of the image view to a vector of either floating point or signed or unsigned integer components, with the number of components based on the number of components present in the format.
-
Color formats have one, two, three, or four components, according to the format.
-
Depth/stencil formats are one component. The depth or stencil component is selected by the
aspectMask
of the image view.
Each component is converted based on its type and size (as defined in the Format Definition section for each VkFormat), using the appropriate equations in 16-Bit Floating-Point Numbers, Unsigned 11-Bit Floating-Point Numbers, Unsigned 10-Bit Floating-Point Numbers, Fixed-Point Data Conversion, and Shared Exponent to RGB. Signed integer components smaller than 32 bits are sign-extended.
If the image view format is sRGB, the color components are first converted as if they are UNORM, and then sRGB to linear conversion is applied to the R, G, and B components as described in the “sRGB EOTF” section of the Khronos Data Format Specification. The A component, if present, is unchanged.
If the image view format is block-compressed, then the texel value is first decoded, then converted based on the type and number of components defined by the compressed format.
15.3.3. Texel Replacement
A texel is replaced if it is one (and only one) of:
-
a border texel,
-
an invalid texel, or
-
a sparse unbound texel.
Border texels are replaced with a value based on the image format and the
borderColor
of the sampler.
The border color is:
Sampler borderColor | Corresponding Border Color
---|---
VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK | B = (0.0, 0.0, 0.0, 0.0)
VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK | B = (0.0, 0.0, 0.0, 1.0)
VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE | B = (1.0, 1.0, 1.0, 1.0)
VK_BORDER_COLOR_INT_TRANSPARENT_BLACK | B = (0, 0, 0, 0)
VK_BORDER_COLOR_INT_OPAQUE_BLACK | B = (0, 0, 0, 1)
VK_BORDER_COLOR_INT_OPAQUE_WHITE | B = (1, 1, 1, 1)
This is substituted for the texel value by replacing the number of components in the image format:

Texel Aspect or Format | Component Assignment
---|---
Depth aspect | D = Br
Stencil aspect | S = Br
One component color format | Cr = Br
Two component color format | Crg = (Br,Bg)
Three component color format | Crgb = (Br,Bg,Bb)
Four component color format | Crgba = (Br,Bg,Bb,Ba)
The value returned by a read of an invalid texel is undefined, unless that
read operation is from a buffer resource and the robustBufferAccess
feature is enabled.
In that case, an invalid texel is replaced as described by the
robustBufferAccess
feature.
If the
VkPhysicalDeviceSparseProperties::residencyNonResidentStrict
property is VK_TRUE
, a sparse unbound texel is replaced with 0 or 0.0
values for integer and floating-point components of the image format,
respectively.
If residencyNonResidentStrict
is VK_FALSE
, the value of the
sparse unbound texel is undefined.
15.3.4. Depth Compare Operation
If the image view has a depth/stencil format, the depth component is
selected by the aspectMask
, and the operation is a Dref
instruction, a depth comparison is performed.
The value of the result D is 1.0 if the result of the compare
operation is true, and 0.0 otherwise.
The compare operation is selected by the compareOp
member of the
sampler.
where, in the depth comparison:
-
Dref = shaderOp.Dref (from the optional SPIR-V operand)
-
D = the texel depth value
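The comparison can be illustrated with the following C sketch over the VkCompareOp values of the sampler (illustrative, not normative):

```c
#include <vulkan/vulkan.h>

/* Result D of the depth comparison: 1.0 if comparing Dref against the texel
 * depth value passes under the sampler's compareOp, 0.0 otherwise. */
static float depthCompare(VkCompareOp op, float dref, float d)
{
    switch (op) {
    case VK_COMPARE_OP_NEVER:            return 0.0f;
    case VK_COMPARE_OP_LESS:             return (dref <  d) ? 1.0f : 0.0f;
    case VK_COMPARE_OP_EQUAL:            return (dref == d) ? 1.0f : 0.0f;
    case VK_COMPARE_OP_LESS_OR_EQUAL:    return (dref <= d) ? 1.0f : 0.0f;
    case VK_COMPARE_OP_GREATER:          return (dref >  d) ? 1.0f : 0.0f;
    case VK_COMPARE_OP_NOT_EQUAL:        return (dref != d) ? 1.0f : 0.0f;
    case VK_COMPARE_OP_GREATER_OR_EQUAL: return (dref >= d) ? 1.0f : 0.0f;
    case VK_COMPARE_OP_ALWAYS:           return 1.0f;
    default:                             return 0.0f;
    }
}
```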
15.3.5. Conversion to RGBA
The texel is expanded from one, two, or three to four components based on the image base color:
Texel Aspect or Format | RGBA Color
---|---
Depth aspect | Crgba = (D,0,0,one)
Stencil aspect | Crgba = (S,0,0,one)
One component color format | Crgba = (Cr,0,0,one)
Two component color format | Crgba = (Crg,0,one)
Three component color format | Crgba = (Crgb,one)
Four component color format | Crgba = Crgba
where one = 1.0f for floating-point formats and depth aspects, and one = 1 for integer formats and stencil aspects.
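As an informal illustration, the expansion can be written as the following C sketch for a floating-point texel (names are illustrative; integer formats behave identically with one = 1):

```c
/* Expand a texel with `count` color components (1..4) to RGBA, substituting
 * zero for missing components and `one` for a missing alpha.  Depth and
 * stencil aspects place their single value in the first component in the same
 * way as a one-component color format. */
static void expandToRGBA(const float *c, int count, float rgba[4])
{
    const float one = 1.0f;   /* the integer 1 for integer formats and stencil aspects */
    rgba[0] = (count > 0) ? c[0] : 0.0f;
    rgba[1] = (count > 1) ? c[1] : 0.0f;
    rgba[2] = (count > 2) ? c[2] : 0.0f;
    rgba[3] = (count > 3) ? c[3] : one;
}
```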
15.3.6. Component Swizzle
All texel input instructions apply a swizzle based on:
-
the VkComponentSwizzle enums in the
components
member of the VkImageViewCreateInfo structure for the image being read if sampler Y’CBCR conversion is not enabled, and -
the VkComponentSwizzle enums in the
components
member of the VkSamplerYcbcrConversionCreateInfo structure for the sampler Y’CBCR conversion if sampler Y’CBCR conversion is enabled.
The swizzle can rearrange the components of the texel, or substitute zero and one for any components. It operates in the same way on each of the R, G, B, and A components, selecting for each output component either one of the input components or the constant zero or one, according to the corresponding VkComponentSwizzle value.
For each component this is applied to, the
VK_COMPONENT_SWIZZLE_IDENTITY
swizzle selects the corresponding
component from Crgba.
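A C sketch of the per-component selection using the VkComponentSwizzle enumeration is shown below; the function is illustrative only, with the identity mapping for the R output passed in as VK_COMPONENT_SWIZZLE_R (and analogously for the other outputs):

```c
#include <vulkan/vulkan.h>

/* Select the post-swizzle value of one output component from the RGBA texel.
 * `identity` names the component that VK_COMPONENT_SWIZZLE_IDENTITY maps to
 * for this output (VK_COMPONENT_SWIZZLE_R for the R output, and so on). */
static float swizzleComponent(VkComponentSwizzle swizzle, VkComponentSwizzle identity,
                              const float rgba[4])
{
    if (swizzle == VK_COMPONENT_SWIZZLE_IDENTITY)
        swizzle = identity;
    switch (swizzle) {
    case VK_COMPONENT_SWIZZLE_ZERO: return 0.0f;
    case VK_COMPONENT_SWIZZLE_ONE:  return 1.0f;
    case VK_COMPONENT_SWIZZLE_R:    return rgba[0];
    case VK_COMPONENT_SWIZZLE_G:    return rgba[1];
    case VK_COMPONENT_SWIZZLE_B:    return rgba[2];
    case VK_COMPONENT_SWIZZLE_A:    return rgba[3];
    default:                        return 0.0f;
    }
}
```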
If the border color is one of the VK_BORDER_COLOR_*_OPAQUE_BLACK
enums
and the VkComponentSwizzle is not VK_COMPONENT_SWIZZLE_IDENTITY
for all components (or the
equivalent identity mapping),
the value of the texel after swizzle is undefined.
15.3.7. Sparse Residency
OpImageSparse
* instructions return a structure which includes a
residency code indicating whether any texels accessed by the instruction
are sparse unbound texels.
This code can be interpreted by the OpImageSparseTexelsResident
instruction which converts the residency code to a boolean value.
15.3.8. Chroma Reconstruction
In some color models, the color representation is defined in terms of monochromatic light intensity (often called “luma”) and color differences relative to this intensity, often called “chroma”. It is common for color models other than RGB to represent the chroma channels at lower spatial resolution than the luma channel. This approach is used to take advantage of the eye’s lower spatial sensitivity to color compared with its sensitivity to brightness. Less commonly, the same approach is used with additive color, since the green channel dominates the eye’s sensitivity to light intensity and the spatial sensitivity to color introduced by red and blue is lower.
Lower-resolution channels are “downsampled” by resizing them to a lower spatial resolution than the channel representing luminance. The process of reconstructing a full color value for texture access involves accessing both chroma and luma values at the same location. To generate the color accurately, the values of the lower-resolution channels at the location of the luma samples must be reconstructed from the lower-resolution sample locations, an operation known here as “chroma reconstruction” irrespective of the actual color model.
The location of the chroma samples relative to the luma coordinates is
determined by the xChromaOffset
and yChromaOffset
members of the
VkSamplerYcbcrConversionCreateInfo structure used to create the
sampler Y’CBCR conversion.
The following diagrams show the relationship between unnormalized (u,v) coordinates and (i,j) integer texel positions in the luma channel (shown in black, with circles showing integer sample positions) and the texel coordinates of reduced-resolution chroma channels, shown as crosses in red.
Note
If the chroma values are reconstructed at the locations of the luma samples
by means of interpolation, chroma samples from outside the image bounds are
needed; these are determined according to Wrapping Operation.
These diagrams represent this by showing the bounds of the “chroma texel”
extending beyond the image bounds, and including additional chroma sample
positions where required for interpolation.
Reconstruction is implemented in one of two ways:
If the format of the image that is to be sampled sets
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT
,
or the VkSamplerYcbcrConversionCreateInfo
’s
forceExplicitReconstruction
is set to VK_TRUE
, reconstruction is
performed as an explicit step independent of filtering, described in the
Explicit Reconstruction section.
If the format of the image that is to be sampled does not set
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT
and if the VkSamplerYcbcrConversionCreateInfo
’s
forceExplicitReconstruction
is set to VK_FALSE
, reconstruction
is performed as an implicit part of filtering prior to color model
conversion, with no separate post-conversion texel filtering step, as
described in the Implicit Reconstruction
section.
Explicit Reconstruction
-
If the
chromaFilter
member of the VkSamplerYcbcrConversionCreateInfo structure isVK_FILTER_NEAREST
:-
If the format’s R and B channels are reduced in resolution in just width by a factor of two relative to the G channel (i.e. this is a “
_422
” format), the \(\tau_{ijk}[level]\) values accessed by texel filtering are reconstructed as follows:\[\begin{aligned} \tau_R'(i, j) & = \tau_R(\lfloor{i\times 0.5}\rfloor, j)[level] \\ \tau_B'(i, j) & = \tau_B(\lfloor{i\times 0.5}\rfloor, j)[level] \end{aligned}\] -
If the format’s R and B channels are reduced in resolution in width and height by a factor of two relative to the G channel (i.e. this is a “
_420
” format), the \(\tau_{ijk}[level]\) values accessed by texel filtering are reconstructed as follows:\[\begin{aligned} \tau_R'(i, j) & = \tau_R(\lfloor{i\times 0.5}\rfloor, \lfloor{j\times 0.5}\rfloor)[level] \\ \tau_B'(i, j) & = \tau_B(\lfloor{i\times 0.5}\rfloor, \lfloor{j\times 0.5}\rfloor)[level] \end{aligned}\]NotexChromaOffset
andyChromaOffset
have no effect ifchromaFilter
isVK_FILTER_NEAREST
for explicit reconstruction.
-
-
If the
chromaFilter
member of the VkSamplerYcbcrConversionCreateInfo structure isVK_FILTER_LINEAR
:-
If the format’s R and B channels are reduced in resolution in just width by a factor of two relative to the G channel (i.e. this is a “422” format):
-
If
xChromaOffset
isVK_CHROMA_LOCATION_COSITED_EVEN
:\[\tau_{RB}'(i,j) = \begin{cases} \tau_{RB}(\lfloor{i\times 0.5}\rfloor,j)[level], & 0.5 \times i = \lfloor{0.5 \times i}\rfloor\\ 0.5\times\tau_{RB}(\lfloor{i\times 0.5}\rfloor,j)[level] + \\ 0.5\times\tau_{RB}(\lfloor{i\times 0.5}\rfloor + 1,j)[level], & 0.5 \times i \neq \lfloor{0.5 \times i}\rfloor \end{cases}\] -
If
xChromaOffset
isVK_CHROMA_LOCATION_MIDPOINT
:\[\tau_{RB}(i,j)' = \begin{cases} 0.25 \times \tau_{RB}(\lfloor{i\times 0.5}\rfloor - 1,j)[level] + \\ 0.75 \times \tau_{RB}(\lfloor{i\times 0.5}\rfloor,j)[level], & 0.5 \times i = \lfloor{0.5 \times i}\rfloor\\ 0.75 \times \tau_{RB}(\lfloor{i\times 0.5}\rfloor,j)[level] + \\ 0.25 \times \tau_{RB}(\lfloor{i\times 0.5}\rfloor + 1,j)[level], & 0.5 \times i \neq \lfloor{0.5 \times i}\rfloor \end{cases}\]
-
-
If the format’s R and B channels are reduced in resolution in width and height by a factor of two relative to the G channel (i.e. this is a “420” format), a similar relationship applies. Due to the number of options, these formulae are expressed more concisely as follows:
\[\begin{aligned} i_{RB} & = \begin{cases} 0.5 \times (i) & \textrm{If xChromaOffset = COSITED}\_\textrm{EVEN} \\ 0.5 \times (i - 0.5) & \textrm{If xChromaOffset = MIDPOINT} \end{cases}\\ j_{RB} & = \begin{cases} 0.5 \times (j) & \textrm{If yChromaOffset = COSITED}\_\textrm{EVEN} \\ 0.5 \times (j - 0.5) & \textrm{If yChromaOffset = MIDPOINT} \end{cases}\\ \\ i_{floor} & = \lfloor i_{RB} \rfloor \\ j_{floor} & = \lfloor j_{RB} \rfloor \\ \\ i_{frac} & = i_{RB} - i_{floor} \\ j_{frac} & = j_{RB} - j_{floor} \end{aligned}\]\[\begin{aligned} \tau_{RB}'(i,j) = & \tau_{RB}( i_{floor}, j_{floor})[level] & \times & ( 1 - i_{frac} ) & & \times & ( 1 - j_{frac} ) & + \\ & \tau_{RB}( 1 + i_{floor}, j_{floor})[level] & \times & ( i_{frac} ) & & \times & ( 1 - j_{frac} ) & + \\ & \tau_{RB}( i_{floor}, 1 + j_{floor})[level] & \times & ( 1 - i_{frac} ) & & \times & ( j_{frac} ) & + \\ & \tau_{RB}( 1 + i_{floor}, 1 + j_{floor})[level] & \times & ( i_{frac} ) & & \times & ( j_{frac} ) & \end{aligned}\]
-
Implicit Reconstruction
Implicit reconstruction takes place by the samples being interpolated, as
required by the filter settings of the sampler, except that
chromaFilter
takes precedence for the chroma samples.
If chromaFilter
is VK_FILTER_NEAREST
, an implementation may
behave as if xChromaOffset
and yChromaOffset
were both
VK_CHROMA_LOCATION_MIDPOINT
, irrespective of the values set.
Note
This will not have any visible effect if the locations of the luma samples coincide with the location of the samples used for rasterization. |
The sample coordinates are adjusted by the downsample factor of the channel (such that, for example, the sample coordinates are divided by two if the channel has a downsample factor of two relative to the luma channel):
15.3.9. Sampler Y’CBCR Conversion
Sampler Y’CBCR conversion performs the following operations, which an implementation may combine into a single mathematical operation: sampler Y’CBCR range expansion followed by sampler Y’CBCR model conversion, as described in the following sections.
Sampler Y’CBCR Range Expansion
Sampler Y’CBCR range expansion is applied to color channel values after all texel input operations which are not specific to sampler Y’CBCR conversion. For example, the input values to this stage have been converted using the normal format conversion rules.
Sampler Y’CBCR range expansion is not applied if ycbcrModel
is
VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY
.
That is, the shader receives the vector C'rgba as output by the Component
Swizzle stage without further modification.
For other values of ycbcrModel
, range expansion is applied to the
texel channel values output by the Component
Swizzle defined by the components
member of
VkSamplerYcbcrConversionCreateInfo.
Range expansion applies independently to each channel of the image.
For the purposes of range expansion and Y’CBCR model conversion, the R
and B channels contain color difference (chroma) values and the G channel
contains luma.
The A channel is not modified by sampler Y’CBCR range expansion.
The range expansion to be applied is defined by the ycbcrRange
member
of the VkSamplerYcbcrConversionCreateInfo
structure:
-
If ycbcrRange is VK_SAMPLER_YCBCR_RANGE_ITU_FULL, the following transformations are applied:\[\begin{aligned} Y' &= C'_{rgba}[G] \\ C_B &= C'_{rgba}[B] - {{2^{(n-1)}}\over{(2^n) - 1}} \\ C_R &= C'_{rgba}[R] - {{2^{(n-1)}}\over{(2^n) - 1}} \end{aligned}\]
Note
These formulae correspond to the “full range” encoding in the Khronos Data Format Specification.
Should any future amendments be made to the ITU specifications from which these equations are derived, the formulae used by Vulkan may also be updated to maintain parity.
-
If ycbcrRange is VK_SAMPLER_YCBCR_RANGE_ITU_NARROW, the following transformations are applied:\[\begin{aligned} Y' &= {{C'_{rgba}[G] \times (2^n-1) - 16\times 2^{n-8}}\over{219\times 2^{n-8}}} \\ C_B &= {{C'_{rgba}[B] \times \left(2^n-1\right) - 128\times 2^{n-8}}\over{224\times 2^{n-8}}} \\ C_R &= {{C'_{rgba}[R] \times \left(2^n-1\right) - 128\times 2^{n-8}}\over{224\times 2^{n-8}}} \end{aligned}\]
Note
These formulae correspond to the “narrow range” encoding in the Khronos Data Format Specification.
-
n is the bit-depth of the channels in the format.
The precision of the operations performed during range expansion must be at least that of the source format.
An implementation may clamp the results of these range expansion operations such that Y' falls in the range [0,1], and/or such that CB and CR fall in the range [-0.5,0.5].
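As an informative illustration, the narrow-range expansion above can be written as the following C sketch (the function and parameter names are illustrative; implementations operate on sampled values with at least source-format precision):
/* Hedged sketch: ITU "narrow range" expansion for n-bit channels, following
 * the formulae above. Inputs are the post-swizzle channel values in [0,1]. */
static void narrow_range_expand(double G, double B, double R, int n,
                                double *Yprime, double *CB, double *CR)
{
    const double maxVal = (double)((1u << n) - 1u);  /* 2^n - 1   */
    const double scale  = (double)(1u << (n - 8));   /* 2^(n-8)   */

    *Yprime = (G * maxVal - 16.0  * scale) / (219.0 * scale);
    *CB     = (B * maxVal - 128.0 * scale) / (224.0 * scale);
    *CR     = (R * maxVal - 128.0 * scale) / (224.0 * scale);

    /* An implementation may clamp Y' to [0,1] and CB/CR to [-0.5,0.5]. */
}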
Sampler Y’CBCR Model Conversion
The range-expanded values are converted between color models, according to
the color model conversion specified in the ycbcrModel
member:
VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY
-
The color channels are not modified by the color model conversion since they are assumed already to represent the desired color model in which the shader is operating; Y’CBCR range expansion is also ignored.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_IDENTITY
-
The color channels are not modified by the color model conversion and are assumed to be treated as though in Y’CBCR form both in memory and in the shader; Y’CBCR range expansion is applied to the channels as for other Y’CBCR models, with the vector (CR,Y',CB,A) provided to the shader.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709
-
The color channels are transformed from a Y’CBCR representation to an R’G’B' representation as described in the “BT.709 Y’CBCR conversion” section of the Khronos Data Format Specification.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601
-
The color channels are transformed from a Y’CBCR representation to an R’G’B' representation as described in the “BT.601 Y’CBCR conversion” section of the Khronos Data Format Specification.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020
-
The color channels are transformed from a Y’CBCR representation to an R’G’B' representation as described in the “BT.2020 Y’CBCR conversion” section of the Khronos Data Format Specification.
In this operation, each output channel is dependent on each input channel.
An implementation may clamp the R’G’B' results of these conversions to the range [0,1].
The precision of the operations performed during model conversion must be at least that of the source format.
The alpha channel is not modified by these model conversions.
Note
Sampling operations in a non-linear color space can introduce color and intensity shifts at sharp transition boundaries. To avoid this issue, the technically precise color correction sequence described in the “Introduction to Color Conversions” chapter of the Khronos Data Format Specification may be performed as follows:
15.4. Texel Output Operations
Texel output instructions are SPIR-V image instructions that write to an image. Texel output operations are a set of steps that are performed on state, coordinates, and texel values while processing a texel output instruction, and which are common to some or all texel output instructions. They include the following steps, which are performed in the listed order:
15.4.1. Texel Output Validation Operations
Texel output validation operations inspect instruction/image state or coordinates, and in certain circumstances cause the write to have no effect. There are a series of validations that the texel undergoes.
Texel Format Validation
If the image format of the OpTypeImage
is not compatible with the
VkImageView
’s format
, the effect of the write on the image
view’s memory is undefined, but the write must not access memory outside of
the image view.
15.4.2. Integer Texel Coordinate Validation
The integer texel coordinates are validated according to the same rules as for texel input coordinate validation.
If the texel fails integer texel coordinate validation, then the write has no effect.
15.4.3. Sparse Texel Operation
If the texel attempts to write to an unbound region of a sparse image, the
texel is a sparse unbound texel.
In such a case, if the
VkPhysicalDeviceSparseProperties::residencyNonResidentStrict
property is VK_TRUE
, the sparse unbound texel write has no effect.
If residencyNonResidentStrict
is VK_FALSE
, the write may have a
side effect that becomes visible to other accesses to unbound texels in any
resource, but will not be visible to any device memory allocated by the
application.
15.4.4. Texel Output Format Conversion
If the image format is sRGB, a linear to sRGB conversion is applied to the R, G, and B components as described in the “sRGB EOTF” section of the Khronos Data Format Specification. The A component, if present, is unchanged.
Texels then undergo a format conversion from the floating point, signed, or unsigned integer type of the texel data to the VkFormat of the image view. Any unused components are ignored.
Each component is converted based on its type and size (as defined in the Format Definition section for each VkFormat). Floating-point outputs are converted as described in Floating-Point Format Conversions and Fixed-Point Data Conversion. Integer outputs are converted such that their value is preserved. The converted value of any integer that cannot be represented in the target format is undefined.
15.5. Derivative Operations
SPIR-V derivative instructions include OpDPdx
, OpDPdy
,
OpDPdxFine
, OpDPdyFine
, OpDPdxCoarse
, and OpDPdyCoarse
.
Derivative instructions are only available in
compute and
fragment shaders.
Derivatives are computed as if there is a 2×2 neighborhood of fragments for each fragment shader invocation. These neighboring fragments are used to compute derivatives with the assumption that the values of P in the neighborhood are piecewise linear. It is further assumed that the values of P in the neighborhood are locally continuous, therefore derivatives in non-uniform control flow are undefined.
The Fine
derivative instructions must return the values above, for a
group of fragments in a 2×2 neighborhood.
Coarse derivatives may return only two values.
In this case, the values should be:
OpDPdx
and OpDPdy
must return the same result as either
OpDPdxFine
or OpDPdxCoarse
and either OpDPdyFine
or
OpDPdyCoarse
, respectively.
Implementations must make the same choice of either coarse or fine for both
OpDPdx
and OpDPdy
, and implementations should make the choice
that is more efficient to compute.
If the subgroupSize
field of VkPhysicalDeviceSubgroupProperties
is at least 4, the 2x2 neighborhood of fragments corresponds exactly to a
subgroup quad.
The order in which the fragments appear within the quad is implementation
defined.
15.5.1. Compute Shader Derivatives
For compute shaders, derivatives are also evaluated using a 2×2
logical neighborhood of compute shader invocations.
Compute shader invocations are arranged into neighborhoods according to one
of two SPIR-V execution modes.
For the DerivativeGroupQuadsNV
execution mode, each neighborhood is
assembled from a 2×2×1 region of invocations based on the
LocalInvocationId
built-in.
For the DerivativeGroupLinearNV
execution mode, each neighborhood is
assembled from a group of four invocations based on the
LocalInvocationIndex
built-in.
The Compute shader derivative group assignments table specifies the
LocalInvocationId
or LocalInvocationIndex
values for the four
values of P in each neighborhood, where x and y are per-neighborhood
integer values.
Value | DerivativeGroupQuadsNV | DerivativeGroupLinearNV |
---|---|---|
Pi0,j0 | (2x + 0, 2y + 0, z) | 4x + 0 |
Pi1,j0 | (2x + 1, 2y + 0, z) | 4x + 1 |
Pi0,j1 | (2x + 0, 2y + 1, z) | 4x + 2 |
Pi1,j1 | (2x + 1, 2y + 1, z) | 4x + 3 |
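As an informative reading of the table, the following C sketch (all names illustrative) maps a per-neighborhood index and the within-neighborhood offsets to the corresponding invocation:
/* Hedged sketch: which invocation holds each value of P in a neighborhood.
 * di and dj are the within-neighborhood offsets (0 or 1); x, y, z are the
 * per-neighborhood integer values from the table above. */
typedef struct { unsigned x, y, z; } InvocationId;

static InvocationId quads_id(unsigned x, unsigned y, unsigned z,
                             unsigned di, unsigned dj)
{
    /* DerivativeGroupQuadsNV: LocalInvocationId of P(i:di, j:dj). */
    InvocationId id = { 2u * x + di, 2u * y + dj, z };
    return id;
}

static unsigned linear_index(unsigned x, unsigned di, unsigned dj)
{
    /* DerivativeGroupLinearNV: LocalInvocationIndex 4x+0 .. 4x+3. */
    return 4u * x + 2u * dj + di;
}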
For multi-planar formats, the derivatives are computed based on the plane with the largest dimensions.
15.6. Normalized Texel Coordinate Operations
If the image sampler instruction provides normalized texel coordinates, some of the following operations are performed.
15.6.1. Projection Operation
For Proj
image operations, the normalized texel coordinates
(s,t,r,q,a) and (if present) the Dref coordinate are
transformed as follows:
15.6.2. Derivative Image Operations
Derivatives are used for LOD selection.
These derivatives are either implicit (in an ImplicitLod
image
instruction in a fragment shader) or explicit (provided explicitly by shader
to the image instruction in any shader).
For implicit derivatives image instructions, the derivatives of texel coordinates are calculated in the same manner as derivative operations above. That is:
Partial derivatives not defined above for certain image dimensionalities are set to zero.
For explicit LOD image instructions, if the optional SPIR-V operand Grad is provided, then the operand values are used for the derivatives. The number of components present in each derivative for a given image dimensionality matches the number of partial derivatives computed above.
If the optional SPIR-V operand Lod is provided, then derivatives are set to zero, the cube map derivative transformation is skipped, and the scale factor operation is skipped. Instead, the floating point scalar coordinate is directly assigned to λbase as described in Level-of-Detail Operation.
For implicit derivative image instructions, the partial derivative values may be computed by linear approximation using a 2×2 neighborhood of shader invocations (known as a quad), as described above. If the instruction is in control flow that is not uniform across the quad, then the derivative values and hence the implicit LOD values are undefined.
If the image or sampler object used by an implicit derivative image
instruction is not uniform across the quad and
quadDivergentImplicitLod
is not supported, then the derivative and LOD values are undefined.
Implicit derivatives are well-defined when the image and sampler and control
flow are uniform across the quad, even if they diverge between different
quads.
If
quadDivergentImplicitLod
is supported, then derivatives and implicit LOD values are well-defined even
if the image or sampler object are not uniform within a quad.
The derivatives are computed as specified above, and the implicit LOD
calculation proceeds for each shader invocation using its respective image
and sampler object.
For the purposes of implicit derivatives, Flat
fragment input variables
are uniform within a quad.
15.6.3. Cube Map Face Selection and Transformations
For cube map image instructions, the (s,t,r) coordinates are treated as a direction vector (rx,ry,rz). The direction vector is used to select a cube map face. The direction vector is transformed to a per-face texel coordinate system (sface,tface). The direction vector is also used to transform the derivatives to per-face derivatives.
15.6.4. Cube Map Face Selection
The direction vector selects one of the cube map’s faces based on the largest magnitude coordinate direction (the major axis direction). Since two or more coordinates can have identical magnitude, the implementation must have rules to disambiguate this situation.
The rules should have as the first rule that rz wins over ry and rx, and the second rule that ry wins over rx. An implementation may choose other rules, but the rules must be deterministic and depend only on (rx,ry,rz).
The layer number (corresponding to a cube map face), the coordinate selections for sc, tc, rc, and the selection of derivatives, are determined by the major axis direction as specified in the following two tables.
Major Axis Direction | Layer Number | Cube Map Face | sc | tc | rc |
---|---|---|---|---|---|
+rx | 0 | Positive X | -rz | -ry | rx |
-rx | 1 | Negative X | +rz | -ry | rx |
+ry | 2 | Positive Y | +rx | +rz | ry |
-ry | 3 | Negative Y | +rx | -rz | ry |
+rz | 4 | Positive Z | +rx | -ry | rz |
-rz | 5 | Negative Z | -rx | -ry | rz |
Major Axis Direction | ∂sc / ∂x | ∂sc / ∂y | ∂tc / ∂x | ∂tc / ∂y | ∂rc / ∂x | ∂rc / ∂y |
---|---|---|---|---|---|---|
+rx | -∂rz / ∂x | -∂rz / ∂y | -∂ry / ∂x | -∂ry / ∂y | +∂rx / ∂x | +∂rx / ∂y |
-rx | +∂rz / ∂x | +∂rz / ∂y | -∂ry / ∂x | -∂ry / ∂y | -∂rx / ∂x | -∂rx / ∂y |
+ry | +∂rx / ∂x | +∂rx / ∂y | +∂rz / ∂x | +∂rz / ∂y | +∂ry / ∂x | +∂ry / ∂y |
-ry | +∂rx / ∂x | +∂rx / ∂y | -∂rz / ∂x | -∂rz / ∂y | -∂ry / ∂x | -∂ry / ∂y |
+rz | +∂rx / ∂x | +∂rx / ∂y | -∂ry / ∂x | -∂ry / ∂y | +∂rz / ∂x | +∂rz / ∂y |
-rz | -∂rx / ∂x | -∂rx / ∂y | -∂ry / ∂x | -∂ry / ∂y | -∂rz / ∂x | -∂rz / ∂y |
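As an informative illustration of the face selection table, the following C sketch (names illustrative) selects the layer and (sc, tc, rc) using the suggested tie-breaking rules in which rz wins over ry and rx, and ry wins over rx:
#include <math.h>

/* Hedged sketch: select a cube face and per-face (sc, tc, rc) from a
 * direction vector, following the table above. */
typedef struct { int layer; float sc, tc, rc; } CubeFaceCoords;

static CubeFaceCoords cube_face_select(float rx, float ry, float rz)
{
    CubeFaceCoords out;
    const float ax = fabsf(rx), ay = fabsf(ry), az = fabsf(rz);

    if (az >= ax && az >= ay) {            /* major axis is +/-rz */
        out.layer = (rz >= 0.0f) ? 4 : 5;
        out.sc    = (rz >= 0.0f) ? rx : -rx;
        out.tc    = -ry;
        out.rc    = rz;
    } else if (ay >= ax) {                 /* major axis is +/-ry */
        out.layer = (ry >= 0.0f) ? 2 : 3;
        out.sc    = rx;
        out.tc    = (ry >= 0.0f) ? rz : -rz;
        out.rc    = ry;
    } else {                               /* major axis is +/-rx */
        out.layer = (rx >= 0.0f) ? 0 : 1;
        out.sc    = (rx >= 0.0f) ? -rz : rz;
        out.tc    = -ry;
        out.rc    = rx;
    }
    return out;
}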
15.6.5. Cube Map Coordinate Transformation
15.6.6. Cube Map Derivative Transformation
editing-note
(Bill) Note that we never revisited ARB_texture_cubemap after we introduced dependent texture fetches (ARB_fragment_program and ARB_fragment_shader). The derivatives of sface and tface are only valid for non-dependent texture fetches (pre OpenGL 2.0).
15.6.7. Scale Factor Operation, Level-of-Detail Operation and Image Level(s) Selection
LOD selection can be either explicit (provided explicitly by the image
instruction) or implicit (determined from a scale factor calculated from the
derivatives).
The implicit LOD selected can be queried using the SPIR-V instruction
OpImageQueryLod
, which gives access to the λ' and
dl values, defined below.
Scale Factor Operation
The magnitude of the derivatives are calculated by:
-
mux = |∂s/∂x| × wbase
-
mvx = |∂t/∂x| × hbase
-
mwx = |∂r/∂x| × dbase
-
muy = |∂s/∂y| × wbase
-
mvy = |∂t/∂y| × hbase
-
mwy = |∂r/∂y| × dbase
where:
-
∂t/∂x = ∂t/∂y = 0 (for 1D images)
-
∂r/∂x = ∂r/∂y = 0 (for 1D, 2D or Cube images)
and
-
wbase = image.w
-
hbase = image.h
-
dbase = image.d
(for the baseMipLevel
, from the image descriptor).
For corner-sampled images, the wbase, hbase, and dbase are instead:
-
wbase = image.w - 1
-
hbase = image.h - 1
-
dbase = image.d - 1
A point sampled in screen space has an elliptical footprint in texture space. The minimum and maximum scale factors (ρmin, ρmax) should be the minor and major axes of this ellipse.
The scale factors ρx and ρy, calculated from the magnitude of the derivatives in x and y, are used to compute the minimum and maximum scale factors.
ρx and ρy may be approximated with functions fx and fy, subject to the following constraints:
editing-note
(Bill) For reviewers only - anticipating questions. We only support implicit derivatives for normalized texel coordinates. So we are documenting the derivatives in s,t,r (normalized texel coordinates) rather than u,v,w (unnormalized texel coordinates) as in OpenGL and OpenGL ES specifications. (I know, u,v,w is the way it has been documented since OpenGL V1.0.) Also there is no reason to have conditional application of wbase, hbase, dbase for rectangle textures either, since they do not support implicit derivatives.
The minimum and maximum scale factors (ρmin,ρmax) are determined by:
-
ρmax = max(ρx, ρy)
-
ρmin = min(ρx, ρy)
The ratio of anisotropy is determined by:
-
η = min(ρmax/ρmin, maxAniso)
where:
-
sampler.maxAniso =
maxAnisotropy
(from sampler descriptor) -
limits.maxAniso =
maxSamplerAnisotropy
(from physical device limits) -
maxAniso = min(sampler.maxAniso, limits.maxAniso)
If ρmax = ρmin = 0, then all the partial derivatives are
zero, the fragment’s footprint in texel space is a point, and N
should be treated as 1.
If ρmax ≠ 0 and ρmin = 0 then all partial
derivatives along one axis are zero, the fragment’s footprint in texel space
is a line segment, and η should be treated as maxAniso.
However, anytime the footprint is small in texel space the implementation
may use a smaller value of η, even when ρmin is zero
or close to zero.
If either VkPhysicalDeviceFeatures::samplerAnisotropy
or
VkSamplerCreateInfo::anisotropyEnable
are VK_FALSE
,
maxAniso is set to 1.
If η = 1, sampling is isotropic. If η > 1, sampling is anisotropic.
The sampling rate (N) is derived as:
-
N = ⌈η⌉
An implementation may round N up to the nearest supported sampling rate. An implementation may use the value of N as an approximation of η.
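As an informative illustration, the following C sketch (names illustrative) combines the definitions of maxAniso, η, and N given above, including the degenerate footprint cases:
#include <math.h>

/* Hedged sketch of the ratio of anisotropy and sampling rate defined above.
 * rho_x and rho_y are the scale factors; maxAniso combines the sampler and
 * device limits, and is 1.0 when anisotropic filtering is disabled. */
static unsigned sampling_rate(double rho_x, double rho_y,
                              double samplerMaxAniso, double limitMaxAniso,
                              int anisotropyEnabled)
{
    const double maxAniso = anisotropyEnabled
                          ? fmin(samplerMaxAniso, limitMaxAniso)
                          : 1.0;

    const double rho_max = fmax(rho_x, rho_y);
    const double rho_min = fmin(rho_x, rho_y);

    double eta;
    if (rho_max == 0.0)            /* footprint is a point        */
        eta = 1.0;
    else if (rho_min == 0.0)       /* footprint is a line segment */
        eta = maxAniso;
    else
        eta = fmin(rho_max / rho_min, maxAniso);

    /* N = ceil(eta); an implementation may round N up to a supported rate. */
    return (unsigned)ceil(eta);
}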
Level-of-Detail Operation
The LOD parameter λ is computed as follows:
where:
and maxSamplerLodBias is the value of the VkPhysicalDeviceLimits
feature maxSamplerLodBias
.
Image Level(s) Selection
The image level(s) d, dhi, and dlo from which texels are read are determined by an image-level parameter dl, which is computed based on the LOD parameter, as follows:
where:
and
-
levelbase =
baseMipLevel
-
q =
levelCount
- 1
baseMipLevel
and levelCount
are taken from the
subresourceRange
of the image view.
If the sampler’s mipmapMode
is VK_SAMPLER_MIPMAP_MODE_NEAREST
,
then the level selected is d = dl.
If the sampler’s mipmapMode
is VK_SAMPLER_MIPMAP_MODE_LINEAR
,
two neighboring levels are selected:
δ is the fractional value, quantized to the number of mipmap precision bits, used for linear filtering between levels.
15.6.8. (s,t,r,q,a) to (u,v,w,a) Transformation
The normalized texel coordinates are scaled by the image level dimensions and the array layer is selected. This transformation is performed once for each level (d or dhi and dlo) used in filtering.
where
-
widthscale = widthlevel
-
heightscale = heightlevel
-
depthscale = depthlevel
for conventional images, and
-
widthscale = widthlevel - 1
-
heightscale = heightlevel - 1
-
depthscale = depthlevel - 1
for corner-sampled images.
Operations then proceed to Unnormalized Texel Coordinate Operations.
15.7. Unnormalized Texel Coordinate Operations
15.7.1. (u,v,w,a) to (i,j,k,l,n) Transformation And Array Layer Selection
The unnormalized texel coordinates are transformed to integer texel coordinates relative to the selected mipmap level.
The layer index l is computed as:
-
l = clamp(RNE(a), 0,
layerCount
- 1) +baseArrayLayer
where layerCount
is the number of layers in the image subresource
range of the image view, baseArrayLayer
is the first layer from the
subresource range, and where:
The sample index n is assigned the value zero.
Nearest filtering (VK_FILTER_NEAREST
) computes the integer texel
coordinates that the unnormalized coordinates lie within:
where
-
shift = 0.0
for conventional images, and
-
shift = 0.5
for corner-sampled images.
Linear filtering (VK_FILTER_LINEAR
) computes a set of neighboring
coordinates which bound the unnormalized coordinates.
The integer texel coordinates are combinations of i0 or i1,
j0 or j1, k0 or k1, as well as weights
α, β, and γ.
where
-
shift = 0.5
for conventional images, and
-
shift = 0.0
for corner-sampled images.
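The normative coordinate and weight definitions are given by the equations above; as an informative illustration only, the following C sketch shows a construction consistent with the shift values listed. The formulas here are an assumption and should be checked against those equations:
#include <math.h>

/* Hedged sketch (assumed construction): linear filtering selects the two
 * texel coordinates bounding the unnormalized coordinate u and a fractional
 * weight alpha between them; nearest filtering selects a single coordinate. */
static void linear_select(double u, int cornerSampled,
                          long *i0, long *i1, double *alpha)
{
    const double shift = cornerSampled ? 0.0 : 0.5;
    *i0    = (long)floor(u - shift);
    *i1    = *i0 + 1;
    *alpha = (u - shift) - floor(u - shift);   /* frac(u - shift) */
}

static long nearest_select(double u, int cornerSampled)
{
    const double shift = cornerSampled ? 0.5 : 0.0;
    return (long)floor(u + shift);
}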
Cubic filtering (VK_FILTER_CUBIC_IMG
) computes a set of neighboring
coordinates which bound the unnormalized coordinates.
The integer texel coordinates are combinations of i0, i1,
i2 or i3, j0, j1, j2 or j3,
as well as weights α and β.
If the image instruction includes a ConstOffset operand, the constant offsets (Δi, Δj, Δk) are added to (i,j,k) components of the integer texel coordinates.
15.8. Integer Texel Coordinate Operations
Integer texel coordinate operations may supply an LOD, using the optional SPIR-V operand Lod, that selects the image level from which texels are read or to which they are written.
If Lod is provided, it must be an integer.
The image level selected is:
If d does not lie in the range [baseMipLevel
,
baseMipLevel
+ levelCount
) then any values fetched are
undefined, and any writes are discarded.
15.9. Image Sample Operations
15.9.1. Wrapping Operation
Cube
images ignore the wrap modes specified in the sampler.
Instead, if VK_FILTER_NEAREST
is used within a mip level then
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE
is used, and if
VK_FILTER_LINEAR
is used within a mip level then sampling at the edges
is performed as described earlier in the Cube map
edge handling section.
The first integer texel coordinate i is transformed based on the
addressModeU
parameter of the sampler.
where:
j (for 2D and Cube images) and k (for 3D images) are similarly
transformed based on the addressModeV
and addressModeW
parameters of the sampler, respectively.
15.9.2. Texel Gathering
SPIR-V instructions with Gather
in the name return a vector derived
from a 2×2 rectangular region of texels in the base level of the image
view.
The rules for the VK_FILTER_LINEAR
minification filter are applied to
identify the four selected texels.
Each texel is then converted to an RGBA value according to
conversion to RGBA and then
swizzled.
A four-component vector is then assembled by taking the component indicated
by the Component
value in the instruction from the swizzled color value
of the four texels:
where:
OpImage
*Gather must not be used on a sampled image with
sampler Y’CBCR conversion enabled.
15.9.3. Texel Filtering
If λ is less than or equal to zero, the texture is said to be
magnified, and the filter mode within a mip level is selected by the
magFilter
in the sampler.
If λ is greater than zero, the texture is said to be
minified, and the filter mode within a mip level is selected by the
minFilter
in the sampler.
Within a mip level, VK_FILTER_NEAREST
filtering selects a single value
using the (i, j, k) texel coordinates, with all texels taken from
layer l.
Within a mip level, VK_FILTER_LINEAR
filtering combines 8 (for 3D), 4
(for 2D or Cube), or 2 (for 1D) texel values, using the weights computed
earlier:
The function reduce() is defined to operate on pairs of weights and
texel values as follows.
When using linear or anisotropic filtering, the values of multiple texels
are combined using a weighted average to produce a filtered texture value.
However, a filtered texture value can also be produced by computing
per-component minimum and maximum values over the set of texels that would
normally be averaged.
The VkSamplerReductionModeCreateInfoEXT::reductionMode
controls
the process by which multiple texels are combined to produce a filtered
texture value.
When set to VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE_EXT
, a weighted
average is computed.
If the reduction mode is VK_SAMPLER_REDUCTION_MODE_MIN_EXT
or
VK_SAMPLER_REDUCTION_MODE_MAX_EXT
, reduce() computes a
component-wise minimum or maximum, respectively, of the components of the
set of provided texels with non-zero weights.
Within a mip level, VK_FILTER_CUBIC_IMG
filtering computes a weighted
average of 16 (for 2D), or 4 (for 1D) texel values, using the weights
computed during texel selection.
Catmull-Rom spline interpolation of four points is defined by the equation:
Using the values calculated in texel selection, this equation is applied to the four points in 1D images. For 2D images, this equation is evaluated first for each row, and the result is then fed back into the equation and interpolated again:
-
τ1D[level] = cinterp(τi0[level], τi1[level], τi2[level], τi3[level], α)
-
τj0[level] = cinterp(τi0j0[level], τi1j0[level], τi2j0[level], τi3j0[level], α)
-
τj1[level] = cinterp(τi0j1[level], τi1j1[level], τi2j1[level], τi3j1[level], α)
-
τj2[level] = cinterp(τi0j2[level], τi1j2[level], τi2j2[level], τi3j2[level], α)
-
τj3[level] = cinterp(τi0j3[level], τi1j3[level], τi2j3[level], τi3j3[level], α)
-
τ2D[level] = cinterp(τj0[level], τj1[level], τj2[level], τj3[level], β)
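As an informative illustration, cinterp in the evaluation above can be sketched in C using the standard Catmull-Rom form (assumed here; the normative definition is the equation referenced above):
/* Hedged sketch: standard Catmull-Rom interpolation of four values at
 * parameter t in [0,1], used as cinterp() in the 1D/2D evaluation above. */
static double cinterp(double p0, double p1, double p2, double p3, double t)
{
    return 0.5 * ((2.0 * p1)
                + (-p0 + p2) * t
                + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
                + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t * t * t);
}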
Finally, mipmap filtering either selects a value from one mip level or computes a weighted average between neighboring mip levels:
15.9.4. Texel Anisotropic Filtering
Anisotropic filtering is enabled by the anisotropyEnable
in the
sampler.
When enabled, the image filtering scheme accounts for a degree of
anisotropy.
The particular scheme for anisotropic texture filtering is implementation
dependent.
Implementations should consider the magFilter
, minFilter
and
mipmapMode
of the sampler to control the specifics of the anisotropic
filtering scheme used.
In addition, implementations should consider minLod
and maxLod
of the sampler.
The following describes one particular approach to implementing anisotropic filtering for the 2D image case; implementations may choose other methods:
Given a magFilter
, minFilter
of VK_FILTER_LINEAR
and a
mipmapMode
of VK_SAMPLER_MIPMAP_MODE_NEAREST
:
Instead of a single isotropic sample, N isotropic samples are sampled within the image footprint of the image level d to approximate an anisotropic filter. The sum τ2Daniso is defined using the single isotropic τ2D(u,v) at level d.
When VkSamplerReductionModeCreateInfoEXT::reductionMode
is set
to VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE_EXT
, the above summation
is used.
If the reduction mode is VK_SAMPLER_REDUCTION_MODE_MIN_EXT
or
VK_SAMPLER_REDUCTION_MODE_MAX_EXT
, then the value is instead computed
as τ2Daniso = reduce(τ1, …, τN), combining all
texel values with non-zero weights.
15.10. Texel Footprint Evaluation
The SPIR-V instruction OpImageSampleFootprintNV
evaluates the set of
texels from a single mip level that would be accessed during a
texel filtering operation.
In addition to the inputs that would be accepted by an equivalent
OpImageSample
* instruction, OpImageSampleFootprintNV
accepts two
additional inputs.
The Granularity
input is an integer identifying the size of texel
groups used to evaluate the footprint.
Each bit in the returned footprint mask corresponds to an aligned block of
texels whose size is given by the following table:
Granularity | Dim = 2D | Dim = 3D |
---|---|---|
0 | unsupported | unsupported |
1 | 2x2 | 2x2x2 |
2 | 4x2 | unsupported |
3 | 4x4 | 4x4x2 |
4 | 8x4 | unsupported |
5 | 8x8 | unsupported |
6 | 16x8 | unsupported |
7 | 16x16 | unsupported |
8 | unsupported | unsupported |
9 | unsupported | unsupported |
10 | unsupported | 16x16x16 |
11 | 64x64 | 32x16x16 |
12 | 128x64 | 32x32x16 |
13 | 128x128 | 32x32x32 |
14 | 256x128 | 64x32x32 |
15 | 256x256 | unsupported |
The Coarse
input is used to select between the two mip levels that may
be accessed during texel filtering when using a mipmapMode
of
VK_SAMPLER_MIPMAP_MODE_LINEAR
.
When filtering between two mip levels, a Coarse
value of true
requests the footprint in the lower-resolution mip level (higher level
number), while false
requests the footprint in the higher-resolution
mip level.
If texel filtering would access only a single mip level, the footprint in
that level would be returned when Coarse
is set to false
; an empty
footprint would be returned when Coarse
is set to true
.
The footprint for OpImageSampleFootprintNV
is returned in a structure
with six members:
-
The first member is a boolean value that is true if the texel filtering operation would access only a single mip level.
-
The second member is a two- or three-component integer vector holding the footprint anchor location. For two-dimensional images, the returned components are in units of eight texel groups. For three-dimensional images, the returned components are in units of four texel groups.
-
The third member is a two- or three-component integer vector holding a footprint offset relative to the anchor. All returned components are in units of texel groups.
-
The fourth member is a two-component integer vector mask, which holds a bitfield identifying the set of texel groups in an 8x8 or 4x4x4 neighborhood relative to the anchor and offset.
-
The fifth member is an integer identifying the mip level containing the footprint identified by the anchor, offset, and mask.
-
The sixth member is an integer identifying the granularity of the returned footprint.
For footprints in two-dimensional images (Dim2D
), the mask returned by
OpImageSampleFootprintNV
indicates whether each texel group in a 8x8
local neighborhood of texel groups would have one or more texels accessed
during texel filtering.
In the mask, the texel group with local group coordinates
\((lgx,lgy)\) is considered covered if and only if
where
-
\(0<=lgx<8\) and \(0<=lgy<8\); and
-
\(mask\) is the returned two-component mask.
The local group with coordinates \((lgx,lgy)\) in the mask is considered covered if and only if the texel filtering operation would access one or more texels \(\tau_{ij}\) in the returned miplevel where
and
-
\(i0<=i<=i1\) and \(j0<=j<=j1\);
-
\(gran\) is a two-component vector holding the width and height of the texel group identified by the granularity;
-
\(anchor\) is the returned two-component anchor vector; and
-
\(offset\) is the returned two-component offset vector.
For footprints in three-dimensional images (Dim3D
), the mask returned
by OpImageSampleFootprintNV
indicates whether each texel group in a
4x4x4 local neighborhood of texel groups would have one or more texels
accessed during texel filtering.
In the mask, the texel group with local group coordinates
\((lgx,lgy,lgz)\), is considered covered if and only if:
where
-
\(0<=lgx<4\), \(0<=lgy<4\), and \(0<=lgz<4\); and
-
\(mask\) is the returned two-component mask.
The local group with coordinates \((lgx,lgy,lgz)\) in the mask is considered covered if and only if the texel filtering operation would access one or more texels \(\tau_{ijk}\) in the returned miplevel where
and
-
\(i0<=i<=i1\), \(j0<=j<=j1\), \(k0<=k<=k1\);
-
\(gran\) is a three-component vector holding the width, height, and depth of the texel group identified by the granularity;
-
\(anchor\) is the returned three-component anchor vector; and
-
\(offset\) is the returned three-component offset vector.
If the sampler used by OpImageSampleFootprintNV
enables anisotropic
texel filtering via anisotropyEnable
, it is possible that the set of
texel groups accessed in a mip level may be too large to be expressed using
an 8x8 or 4x4x4 mask using the granularity requested in the instruction.
In this case, the implementation uses a texel group larger than the
requested granularity.
When a larger texel group size is used, OpImageSampleFootprintNV
returns an integer granularity value that can be interpreted in the same
manner as the granularity value provided to the instruction to determine the
texel group size used.
If anisotropic texel filtering is disabled in the sampler, or if an
anisotropic footprint can be represented as an 8x8 or 4x4x4 mask with the
requested granularity, OpImageSampleFootprintNV
will use the requested
granularity as-is and return a granularity value of zero.
OpImageSampleFootprintNV
supports only two- and three-dimensional image
accesses (Dim2D
and Dim3D
) and the footprint returned is undefined
if a sampler uses an addressing mode other than
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE
.
15.11. Image Operation Steps
Each step described in this chapter is performed by a subset of the image instructions:
-
Texel Input Validation Operations, Format Conversion, Texel Replacement, Conversion to RGBA, and Component Swizzle: Performed by all instructions except
OpImageWrite
. -
Depth Comparison: Performed by
OpImage
*Dref
instructions. -
All Texel output operations: Performed by
OpImageWrite
. -
Projection: Performed by all
OpImage
*Proj
instructions. -
Derivative Image Operations, Cube Map Operations, Scale Factor Operation, Level-of-Detail Operation and Image Level(s) Selection, and Texel Anisotropic Filtering: Performed by all
OpImageSample
* andOpImageSparseSample
* instructions. -
(s,t,r,q,a) to (u,v,w,a) Transformation, Wrapping, and (u,v,w,a) to (i,j,k,l,n) Transformation And Array Layer Selection: Performed by all
OpImageSample
,OpImageSparseSample
, andOpImage
*Gather
instructions. -
Texel Gathering: Performed by
OpImage
*Gather
instructions. -
Texel Footprint Evaluation: Performed by
OpImageSampleFootprint
instructions. -
Texel Filtering: Performed by all
OpImageSample
* andOpImageSparseSample
* instructions. -
Sparse Residency: Performed by all
OpImageSparse
* instructions.
16. Fragment Density Map Operations
16.1. Fragment Density Map Operations Overview
When a fragment is generated in a render pass that has a fragment density map attachment, its area is determined by the properties of the local framebuffer region that the fragment occupies. The framebuffer is divided into a uniform grid of these local regions, and their fragment area property is derived from the density map with the following operations:
16.2. Fetch Density Value
Each local framebuffer region at center coordinate (x,y) fetches a texel from the fragment density map at integer coordinates:
-
\(i = \lfloor{\frac{x}{fragmentDensityTexelSize_{width}}}\rfloor\)
-
\(j = \lfloor{\frac{y}{fragmentDensityTexelSize_{height}}}\rfloor\)
Where the size of each region in the framebuffer is:
-
\(fragmentDensityTexelSize'_{width} = {2^{\lceil{\log_2(\frac{framebuffer_{width}}{fragmentDensityMap_{width}})}\rceil}}\)
-
\(fragmentDensityTexelSize'_{height} = {2^{\lceil{\log_2(\frac{framebuffer_{height}}{fragmentDensityMap_{height}})}\rceil}}\)
This region is subject to the limits in
VkPhysicalDeviceFragmentDensityMapPropertiesEXT
and therefore the
final region size is clamped:
-
\(fragmentDensityTexelSize_{width} = \mathbin{clamp}(fragmentDensityTexelSize'_{width},minFragmentDensityTexelSize_{width},maxFragmentDensityTexelSize_{width})\)
-
\(fragmentDensityTexelSize_{height} = \mathbin{clamp}(fragmentDensityTexelSize'_{height},minFragmentDensityTexelSize_{height},maxFragmentDensityTexelSize_{height})\)
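As an informative illustration, the fetch coordinates and clamped region size can be computed as in the following C sketch (function and parameter names are illustrative):
#include <math.h>
#include <stdint.h>
#include <vulkan/vulkan.h>

/* 2^ceil(log2(framebufferDim / mapDim)) */
static uint32_t region_dim(uint32_t framebufferDim, uint32_t mapDim)
{
    return (uint32_t)(pow(2.0, ceil(log2((double)framebufferDim / (double)mapDim))) + 0.5);
}

static void density_texel_coords(uint32_t x, uint32_t y,
                                 VkExtent2D framebuffer, VkExtent2D densityMap,
                                 VkExtent2D minTexelSize, VkExtent2D maxTexelSize,
                                 uint32_t *i, uint32_t *j)
{
    uint32_t w = region_dim(framebuffer.width,  densityMap.width);
    uint32_t h = region_dim(framebuffer.height, densityMap.height);

    /* Clamp the region size to the implementation limits. */
    if (w < minTexelSize.width)  w = minTexelSize.width;
    if (w > maxTexelSize.width)  w = maxTexelSize.width;
    if (h < minTexelSize.height) h = minTexelSize.height;
    if (h > maxTexelSize.height) h = maxTexelSize.height;

    *i = x / w;   /* floor(x / fragmentDensityTexelSize.width)  */
    *j = y / h;   /* floor(y / fragmentDensityTexelSize.height) */
}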
When multiview is enabled for the render pass and the fragment density map
attachment view was created with layerCount
greater than 1
, the
density map layer that the texel is fetched from is:
-
\(layer = baseArrayLayer + ViewIndex\)
Otherwise:
-
\(layer = baseArrayLayer\)
The texel fetched from the density map at (i,j,layer) is next converted to density with the following operations.
16.2.1. Component Swizzle
The components
member of VkImageViewCreateInfo is applied to the
fetched texel as defined in Image component
swizzle.
16.2.2. Component Mapping
The swizzled texel’s components are mapped to a density value:
-
\(densityValue_{xy} = (C'_{r},C'_{g})\)
16.3. Fragment Area Conversion
Fragment area for the framebuffer region is undefined if the density fetched
is not a normalized floating-point value greater than 0.0
.
Otherwise, the fetched fragment area for that region is derived as:
-
\(fragmentArea_{wh} = \frac{1.0}{densityValue_{xy}}\)
16.3.1. Fragment Area Filter
Optionally, the implementation may fetch additional density map texels in an implementation defined window around (i,j). The texels follow the standard conversion steps up to and including fragment area conversion.
A single fetched fragment area for the framebuffer region is chosen by the implementation and must have an area between the min and max areas of the fetched set.
16.3.2. Fragment Area Clamp
The implementation may clamp the fetched fragment area to one that it supports. The clamped fragment area must have a size less than or equal to the original fetched value. Implementations may vary the supported set of fragment areas per framebuffer region. Fragment area (1,1) must always be in the supported set.
Note
For example, if the fetched fragment area is (1,4) but the implementation only supports areas of {(1,1),(2,2)}, it could choose to clamp the area to (2,2) since it has the same size as (1,4). While this would produce fragments that have lower quality strictly in the x-axis, the overall density is maintained.
The clamped fragment area is assigned to the corresponding framebuffer region.
17. Queries
Queries provide a mechanism to return information about the processing of a sequence of Vulkan commands. Query operations are asynchronous, and as such, their results are not returned immediately. Instead, their results, and their availability status, are stored in a Query Pool. The state of these queries can be read back on the host, or copied to a buffer object on the device.
The supported query types are Occlusion Queries, Pipeline Statistics Queries, and Timestamp Queries.
17.1. Query Pools
Queries are managed using query pool objects. Each query pool is a collection of a specific number of queries of a particular type.
Query pools are represented by VkQueryPool
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkQueryPool)
To create a query pool, call:
VkResult vkCreateQueryPool(
VkDevice device,
const VkQueryPoolCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkQueryPool* pQueryPool);
-
device
is the logical device that creates the query pool. -
pCreateInfo
is a pointer to an instance of theVkQueryPoolCreateInfo
structure containing the number and type of queries to be managed by the pool. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pQueryPool
is a pointer to a VkQueryPool handle in which the resulting query pool object is returned.
The VkQueryPoolCreateInfo
structure is defined as:
typedef struct VkQueryPoolCreateInfo {
VkStructureType sType;
const void* pNext;
VkQueryPoolCreateFlags flags;
VkQueryType queryType;
uint32_t queryCount;
VkQueryPipelineStatisticFlags pipelineStatistics;
} VkQueryPoolCreateInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
queryType
is a VkQueryType value specifying the type of queries managed by the pool. -
queryCount
is the number of queries managed by the pool. -
pipelineStatistics
is a bitmask of VkQueryPipelineStatisticFlagBits specifying which counters will be returned in queries on the new pool, as described below in Pipeline Statistics Queries.
pipelineStatistics
is ignored if queryType
is not
VK_QUERY_TYPE_PIPELINE_STATISTICS
.
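As an informative example, a pool of occlusion queries might be created as follows (the device handle and error handling are assumed to exist elsewhere):
/* Example (informative): create a pool of 16 occlusion queries. */
VkQueryPoolCreateInfo queryPoolInfo = {0};
queryPoolInfo.sType      = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO;
queryPoolInfo.pNext      = NULL;
queryPoolInfo.flags      = 0;
queryPoolInfo.queryType  = VK_QUERY_TYPE_OCCLUSION;
queryPoolInfo.queryCount = 16;
/* pipelineStatistics is ignored for non-pipeline-statistics pools. */
queryPoolInfo.pipelineStatistics = 0;

VkQueryPool queryPool = VK_NULL_HANDLE;
VkResult result = vkCreateQueryPool(device, &queryPoolInfo, NULL, &queryPool);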
typedef VkFlags VkQueryPoolCreateFlags;
VkQueryPoolCreateFlags
is a bitmask type for setting a mask, but is
currently reserved for future use.
To destroy a query pool, call:
void vkDestroyQueryPool(
VkDevice device,
VkQueryPool queryPool,
const VkAllocationCallbacks* pAllocator);
-
device
is the logical device that destroys the query pool. -
queryPool
is the query pool to destroy. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
Possible values of VkQueryPoolCreateInfo::queryType
, specifying
the type of queries managed by the pool, are:
typedef enum VkQueryType {
VK_QUERY_TYPE_OCCLUSION = 0,
VK_QUERY_TYPE_PIPELINE_STATISTICS = 1,
VK_QUERY_TYPE_TIMESTAMP = 2,
VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT = 1000028004,
VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_NV = 1000165000,
} VkQueryType;
-
VK_QUERY_TYPE_OCCLUSION
specifies an occlusion query. -
VK_QUERY_TYPE_PIPELINE_STATISTICS
specifies a pipeline statistics query. -
VK_QUERY_TYPE_TIMESTAMP
specifies a timestamp query. -
VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT
specifies a transform feedback query.
17.2. Query Operation
The operation of queries is controlled by the commands vkCmdBeginQuery, vkCmdEndQuery, vkCmdResetQueryPool, vkCmdCopyQueryPoolResults, and vkCmdWriteTimestamp.
In order for a VkCommandBuffer
to record query management commands,
the queue family for which its VkCommandPool
was created must support
the appropriate type of operations (graphics, compute) suitable for the
query type of a given query pool.
Each query in a query pool has a status that is either unavailable or available, and also has state to store the numerical results of a query operation of the type requested when the query pool was created. Resetting a query via vkCmdResetQueryPool sets the status to unavailable and makes the numerical results undefined. Performing a query operation with vkCmdBeginQuery and vkCmdEndQuery changes the status to available when the query finishes, and updates the numerical results. Both the availability status and numerical results are retrieved by calling either vkGetQueryPoolResults or vkCmdCopyQueryPoolResults.
Query commands, for the same query and submitted to the same queue, execute
in their entirety in submission order,
relative to each other.
In effect there is an implicit execution dependency from each such query
command to all query commands previously submitted to the same queue.
There is one significant exception to this; if the flags
parameter of
vkCmdCopyQueryPoolResults does not include
VK_QUERY_RESULT_WAIT_BIT
, execution of vkCmdCopyQueryPoolResults
may happen-before the results of vkCmdEndQuery are available.
After query pool creation, each query is in an undefined state and must be reset prior to use. Queries must also be reset between uses. Using a query that has not been reset will result in undefined behavior.
If a logical device includes multiple physical devices, then each command that writes a query must execute on a single physical device, and any call to vkCmdBeginQuery must execute the corresponding vkCmdEndQuery command on the same physical device.
To reset a range of queries in a query pool, call:
void vkCmdResetQueryPool(
VkCommandBuffer commandBuffer,
VkQueryPool queryPool,
uint32_t firstQuery,
uint32_t queryCount);
-
commandBuffer
is the command buffer into which this command will be recorded. -
queryPool
is the handle of the query pool managing the queries being reset. -
firstQuery
is the initial query index to reset. -
queryCount
is the number of queries to reset.
When executed on a queue, this command sets the status of query indices
[firstQuery
, firstQuery
+ queryCount
- 1] to
unavailable.
Once queries are reset and ready for use, query commands can be issued to a command buffer. Occlusion queries and pipeline statistics queries count events - drawn samples and pipeline stage invocations, respectively - resulting from commands that are recorded between a vkCmdBeginQuery command and a vkCmdEndQuery command within a specified command buffer, effectively scoping a set of drawing and/or compute commands. Timestamp queries write timestamps to a query pool.
A query must begin and end in the same command buffer, although if it is a
primary command buffer, and the
inherited queries feature is enabled,
it can execute secondary command buffers during the query operation.
For a secondary command buffer to be executed while a query is active, it
must set the occlusionQueryEnable
, queryFlags
, and/or
pipelineStatistics
members of VkCommandBufferInheritanceInfo to
conservative values, as described in the Command
Buffer Recording section.
A query must either begin and end inside the same subpass of a render pass
instance, or must both begin and end outside of a render pass instance
(i.e. contain entire render pass instances).
If queries are used while executing a render pass instance that has
multiview enabled, the query uses N consecutive query indices in the
query pool (starting at query
) where N is the number of bits set
in the view mask in the subpass the query is used in.
How the numerical results of the query are distributed among the queries is
implementation-dependent.
For example, some implementations may write each view’s results to a
distinct query, while other implementations may write the total result to
the first query and write zero to the other queries.
However, the sum of the results in all the queries must accurately reflect
the total result of the query summed over all views.
Applications can sum the results from all the queries to compute the total
result.
Queries used with multiview rendering must not span subpasses, i.e. they must begin and end in the same subpass.
To begin a query, call:
void vkCmdBeginQuery(
VkCommandBuffer commandBuffer,
VkQueryPool queryPool,
uint32_t query,
VkQueryControlFlags flags);
-
commandBuffer
is the command buffer into which this command will be recorded. -
queryPool
is the query pool that will manage the results of the query. -
query
is the query index within the query pool that will contain the results. -
flags
is a bitmask of VkQueryControlFlagBits specifying constraints on the types of queries that can be performed.
If the queryType
of the pool is VK_QUERY_TYPE_OCCLUSION
and
flags
contains VK_QUERY_CONTROL_PRECISE_BIT
, an implementation
must return a result that matches the actual number of samples passed.
This is described in more detail in Occlusion Queries.
After beginning a query, that query is considered active within the command buffer it was called in until that same query is ended. Queries active in a primary command buffer when secondary command buffers are executed are considered active for those secondary command buffers.
To begin an indexed query, call:
void vkCmdBeginQueryIndexedEXT(
VkCommandBuffer commandBuffer,
VkQueryPool queryPool,
uint32_t query,
VkQueryControlFlags flags,
uint32_t index);
-
commandBuffer
is the command buffer into which this command will be recorded. -
queryPool
is the query pool that will manage the results of the query. -
query
is the query index within the query pool that will contain the results. -
flags
is a bitmask of VkQueryControlFlagBits specifying constraints on the types of queries that can be performed. -
index
is the query type specific index. When the query type isVK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT
the index represents the vertex stream.
The vkCmdBeginQueryIndexedEXT
command operates the same as the
vkCmdBeginQuery command, except that it also accepts a query type
specific index
parameter.
Bits which can be set in vkCmdBeginQuery::flags
, specifying
constraints on the types of queries that can be performed, are:
typedef enum VkQueryControlFlagBits {
VK_QUERY_CONTROL_PRECISE_BIT = 0x00000001,
} VkQueryControlFlagBits;
-
VK_QUERY_CONTROL_PRECISE_BIT
specifies the precision of occlusion queries.
typedef VkFlags VkQueryControlFlags;
VkQueryControlFlags
is a bitmask type for setting a mask of zero or
more VkQueryControlFlagBits.
To end a query after the set of desired draw or dispatch commands is executed, call:
void vkCmdEndQuery(
VkCommandBuffer commandBuffer,
VkQueryPool queryPool,
uint32_t query);
-
commandBuffer
is the command buffer into which this command will be recorded. -
queryPool
is the query pool that is managing the results of the query. -
query
is the query index within the query pool where the result is stored.
As queries operate asynchronously, ending a query does not immediately set the query’s status to available. A query is considered finished when the final results of the query are ready to be retrieved by vkGetQueryPoolResults and vkCmdCopyQueryPoolResults, and this is when the query’s status is set to available.
Once a query is ended the query must finish in finite time, unless the state of the query is changed using other commands, e.g. by issuing a reset of the query.
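As an informative example, an occlusion query can be scoped around a draw as follows (the command buffer, query pool, and render pass setup are assumed to exist elsewhere; the reset must be recorded outside a render pass instance):
/* Example (informative): scope an occlusion query around a draw call. */
vkCmdResetQueryPool(commandBuffer, queryPool, 0 /*firstQuery*/, 1 /*queryCount*/);

/* ... vkCmdBeginRenderPass, pipeline and vertex buffer bindings ... */

vkCmdBeginQuery(commandBuffer, queryPool, 0 /*query*/,
                VK_QUERY_CONTROL_PRECISE_BIT);
vkCmdDraw(commandBuffer, 3 /*vertexCount*/, 1, 0, 0);
vkCmdEndQuery(commandBuffer, queryPool, 0 /*query*/);

/* ... vkCmdEndRenderPass ... */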
To end an indexed query after the set of desired draw or dispatch commands is recorded, call:
void vkCmdEndQueryIndexedEXT(
VkCommandBuffer commandBuffer,
VkQueryPool queryPool,
uint32_t query,
uint32_t index);
-
commandBuffer
is the command buffer into which this command will be recorded. -
queryPool
is the query pool that is managing the results of the query. -
query
is the query index within the query pool where the result is stored. -
index
is the query type specific index.
The vkCmdEndQueryIndexedEXT
command operates the same as the
vkCmdEndQuery command, except that it also accepts a query type
specific index
parameter.
An application can retrieve results either by requesting they be written
into application-provided memory, or by requesting they be copied into a
VkBuffer
.
In either case, the layout in memory is defined as follows:
-
The first query’s result is written starting at the first byte requested by the command, and each subsequent query’s result begins
stride
bytes later. -
Each query’s result is a tightly packed array of unsigned integers, either 32- or 64-bits as requested by the command, storing the numerical results and, if requested, the availability status.
-
If
VK_QUERY_RESULT_WITH_AVAILABILITY_BIT
is used, the final element of each query’s result is an integer indicating whether the query’s result is available, with any non-zero value indicating that it is available. -
Occlusion queries write one integer value - the number of samples passed. Pipeline statistics queries write one integer value for each bit that is enabled in the
pipelineStatistics
when the pool is created, and the statistics values are written in bit order starting from the least significant bit. Timestamps write one integer value. Transform feedback queries write two integers; the first integer is the number of primitives successfully written to the corresponding transform feedback buffer and the second is the number of primitives output to the vertex stream, regardless of whether they were successfully captured or not. In other words, if the transform feedback buffer was sized too small for the number of primitives output by the vertex stream, the first integer represents the number of primitives actually written and the second is the number that would have been written if all the transform feedback buffers associated with that vertex stream were large enough. -
If more than one query is retrieved and
stride
is not at least as large as the size of the array of integers corresponding to a single query, the values written to memory are undefined.
To retrieve status and results for a set of queries, call:
VkResult vkGetQueryPoolResults(
VkDevice device,
VkQueryPool queryPool,
uint32_t firstQuery,
uint32_t queryCount,
size_t dataSize,
void* pData,
VkDeviceSize stride,
VkQueryResultFlags flags);
-
device
is the logical device that owns the query pool. -
queryPool
is the query pool managing the queries containing the desired results. -
firstQuery
is the initial query index. -
queryCount
is the number of queries.firstQuery
andqueryCount
together define a range of queries. For pipeline statistics queries, each query index in the pool contains one integer value for each bit that is enabled in VkQueryPoolCreateInfo::pipelineStatistics
when the pool is created. -
dataSize
is the size in bytes of the buffer pointed to bypData
. -
pData
is a pointer to a user-allocated buffer where the results will be written -
stride
is the stride in bytes between results for individual queries withinpData
. -
flags
is a bitmask of VkQueryResultFlagBits specifying how and when results are returned.
If no bits are set in flags
, and all requested queries are in the
available state, results are written as an array of 32-bit unsigned integer
values.
The behavior when not all queries are available is described below.
If VK_QUERY_RESULT_64_BIT
is not set and the result overflows a 32-bit
value, the value may either wrap or saturate.
Similarly, if VK_QUERY_RESULT_64_BIT
is set and the result overflows a
64-bit value, the value may either wrap or saturate.
If VK_QUERY_RESULT_WAIT_BIT
is set, Vulkan will wait for each query to
be in the available state before retrieving the numerical results for that
query.
In this case, vkGetQueryPoolResults
is guaranteed to succeed and
return VK_SUCCESS
if the queries become available in a finite time
(i.e. if they have been issued and not reset).
If queries will never finish (e.g. due to being reset but not issued), then
vkGetQueryPoolResults
may not return in finite time.
If VK_QUERY_RESULT_WAIT_BIT
and VK_QUERY_RESULT_PARTIAL_BIT
are
both not set then no result values are written to pData
for queries
that are in the unavailable state at the time of the call, and
vkGetQueryPoolResults
returns VK_NOT_READY
.
However, availability state is still written to pData
for those
queries if VK_QUERY_RESULT_WITH_AVAILABILITY_BIT
is set.
Note
Applications must take care to ensure that use of VK_QUERY_RESULT_WAIT_BIT has the desired effect.
For example, if a query has been used previously and a command buffer records the commands vkCmdResetQueryPool, vkCmdBeginQuery, and vkCmdEndQuery for that query, then the query will remain in the available state until the vkCmdResetQueryPool command executes on the device; unless the application ensures (for example with a fence) that the reset has executed before results are requested, a stale value from the previous use of the query could be returned.
The above also applies when VK_QUERY_RESULT_WAIT_BIT is used in combination with VK_QUERY_RESULT_WITH_AVAILABILITY_BIT; in that case the returned availability status may likewise reflect the previous use of the query. |
Note
Applications can double-buffer query pool usage, with a pool per frame, and reset queries at the end of the frame in which they are read. |
If VK_QUERY_RESULT_PARTIAL_BIT
is set, VK_QUERY_RESULT_WAIT_BIT
is not set, and the query’s status is unavailable, an intermediate result
value between zero and the final result value is written to pData
for
that query.
VK_QUERY_RESULT_PARTIAL_BIT
must not be used if the pool’s
queryType
is VK_QUERY_TYPE_TIMESTAMP
.
If VK_QUERY_RESULT_WITH_AVAILABILITY_BIT
is set, the final integer
value written for each query is non-zero if the query’s status was available
or zero if the status was unavailable.
When VK_QUERY_RESULT_WITH_AVAILABILITY_BIT
is used, implementations
must guarantee that if they return a non-zero availability value then the
numerical results must be valid, assuming the results are not reset by a
subsequent command.
Note
Satisfying this guarantee may require careful ordering by the application, e.g. to read the availability status before reading the results. |
Bits which can be set in vkGetQueryPoolResults::flags
and
vkCmdCopyQueryPoolResults::flags
, specifying how and when
results are returned, are:
typedef enum VkQueryResultFlagBits {
VK_QUERY_RESULT_64_BIT = 0x00000001,
VK_QUERY_RESULT_WAIT_BIT = 0x00000002,
VK_QUERY_RESULT_WITH_AVAILABILITY_BIT = 0x00000004,
VK_QUERY_RESULT_PARTIAL_BIT = 0x00000008,
} VkQueryResultFlagBits;
-
VK_QUERY_RESULT_64_BIT
specifies the results will be written as an array of 64-bit unsigned integer values. If this bit is not set, the results will be written as an array of 32-bit unsigned integer values. -
VK_QUERY_RESULT_WAIT_BIT
specifies that Vulkan will wait for each query’s status to become available before retrieving its results. -
VK_QUERY_RESULT_WITH_AVAILABILITY_BIT
specifies that the availability status accompanies the results. -
VK_QUERY_RESULT_PARTIAL_BIT
specifies that returning partial results is acceptable.
typedef VkFlags VkQueryResultFlags;
VkQueryResultFlags
is a bitmask type for setting a mask of zero or
more VkQueryResultFlagBits.
To copy query statuses and numerical results directly to buffer memory, call:
void vkCmdCopyQueryPoolResults(
VkCommandBuffer commandBuffer,
VkQueryPool queryPool,
uint32_t firstQuery,
uint32_t queryCount,
VkBuffer dstBuffer,
VkDeviceSize dstOffset,
VkDeviceSize stride,
VkQueryResultFlags flags);
-
commandBuffer
is the command buffer into which this command will be recorded. -
queryPool
is the query pool managing the queries containing the desired results. -
firstQuery
is the initial query index. -
queryCount
is the number of queries.firstQuery
andqueryCount
together define a range of queries. -
dstBuffer
is a VkBuffer object that will receive the results of the copy command. -
dstOffset
is an offset intodstBuffer
. -
stride
is the stride in bytes between results for individual queries withindstBuffer
. The required size of the backing memory fordstBuffer
is determined as described above for vkGetQueryPoolResults. -
flags
is a bitmask of VkQueryResultFlagBits specifying how and when results are returned.
vkCmdCopyQueryPoolResults
is guaranteed to see the effect of previous
uses of vkCmdResetQueryPool
in the same queue, without any additional
synchronization.
Thus, the results will always reflect the most recent use of the query.
flags
has the same possible values described above for the flags
parameter of vkGetQueryPoolResults, but the different style of
execution causes some subtle behavioral differences.
Because vkCmdCopyQueryPoolResults
executes in order with respect to
other query commands, there is less ambiguity about which use of a query is
being requested.
If no bits are set in flags
, results for all requested queries in the
available state are written as 32-bit unsigned integer values, and nothing
is written for queries in the unavailable state.
If VK_QUERY_RESULT_64_BIT
is set, the results are written as an array
of 64-bit unsigned integer values as described for
vkGetQueryPoolResults.
If VK_QUERY_RESULT_WAIT_BIT
is set, the implementation will wait for
each query’s status to be in the available state before retrieving the
numerical results for that query.
This is guaranteed to reflect the most recent use of the query on the same
queue, assuming that the query is not being simultaneously used by other
queues.
If the query does not become available in a finite amount of time (e.g. due
to not issuing a query since the last reset), a VK_ERROR_DEVICE_LOST
error may occur.
Similarly, if VK_QUERY_RESULT_WITH_AVAILABILITY_BIT
is set and
VK_QUERY_RESULT_WAIT_BIT
is not set, the availability is guaranteed to
reflect the most recent use of the query on the same queue, assuming that
the query is not being simultaneously used by other queues.
As with vkGetQueryPoolResults
, implementations must guarantee that if
they return a non-zero availability value, then the numerical results are
valid.
If VK_QUERY_RESULT_PARTIAL_BIT
is set, VK_QUERY_RESULT_WAIT_BIT
is not set, and the query’s status is unavailable, an intermediate result
value between zero and the final result value is written for that query.
VK_QUERY_RESULT_PARTIAL_BIT
must not be used if the pool’s
queryType
is VK_QUERY_TYPE_TIMESTAMP
.
vkCmdCopyQueryPoolResults
is considered to be a transfer operation,
and its writes to buffer memory must be synchronized using
VK_PIPELINE_STAGE_TRANSFER_BIT
and VK_ACCESS_TRANSFER_WRITE_BIT
before using the results.
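As an informative illustration, the following minimal C sketch records such a copy for queries [0, queryCount), packing one 64-bit result and one 64-bit availability value per query, followed by the transfer-to-host barrier described above; the command buffer, query pool, and result buffer are assumed to be created elsewhere:
#include <vulkan/vulkan.h>

/* Informative sketch: copy results and availability for queries
   [0, queryCount) into resultBuffer, then make the transfer writes
   visible to a later host read. */
void copyResultsWithAvailability(VkCommandBuffer commandBuffer,
                                 VkQueryPool queryPool,
                                 uint32_t queryCount,
                                 VkBuffer resultBuffer)
{
    /* Two 64-bit values per query: the result followed by its availability. */
    const VkDeviceSize stride = 2 * sizeof(uint64_t);

    vkCmdCopyQueryPoolResults(commandBuffer, queryPool, 0, queryCount,
                              resultBuffer, 0, stride,
                              VK_QUERY_RESULT_64_BIT |
                              VK_QUERY_RESULT_WITH_AVAILABILITY_BIT);

    /* vkCmdCopyQueryPoolResults is a transfer operation, so synchronize its
       writes using VK_PIPELINE_STAGE_TRANSFER_BIT and
       VK_ACCESS_TRANSFER_WRITE_BIT before the results are consumed. */
    VkBufferMemoryBarrier barrier = {
        .sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
        .dstAccessMask = VK_ACCESS_HOST_READ_BIT,
        .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .buffer = resultBuffer,
        .offset = 0,
        .size = queryCount * stride,
    };
    vkCmdPipelineBarrier(commandBuffer,
                         VK_PIPELINE_STAGE_TRANSFER_BIT,
                         VK_PIPELINE_STAGE_HOST_BIT,
                         0, 0, NULL, 1, &barrier, 0, NULL);
}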
Rendering operations such as clears, MSAA resolves, attachment load/store operations, and blits may count towards the results of queries. This behavior is implementation-dependent and may vary depending on the path used within an implementation. For example, some implementations have several types of clears, some of which may include vertices and some not.
17.3. Occlusion Queries
Occlusion queries track the number of samples that pass the per-fragment
tests for a set of drawing commands.
As such, occlusion queries are only available on queue families supporting
graphics operations.
The application can then use these results to inform future rendering
decisions.
An occlusion query is begun and ended by calling vkCmdBeginQuery
and
vkCmdEndQuery
, respectively.
When an occlusion query begins, the count of passing samples always starts
at zero.
For each drawing command, the count is incremented as described in
Sample Counting.
If flags
does not contain VK_QUERY_CONTROL_PRECISE_BIT, an
implementation may generate any non-zero result value for the query if the
count of passing samples is non-zero.
Note
Not setting VK_QUERY_CONTROL_PRECISE_BIT mode for occlusion queries may be more efficient on some implementations, and should be used where it is sufficient to know a boolean result on whether any samples passed the per-fragment tests. In this case, some implementations may only return zero or one, indifferent to the actual number of samples passing the per-fragment tests. |
When an occlusion query finishes, the result for that query is marked as
available.
The application can then either copy the result to a buffer (via
vkCmdCopyQueryPoolResults
) or request it be put into host memory (via
vkGetQueryPoolResults
).
Note
If occluding geometry is not drawn first, samples can pass the depth test, but still not be visible in a final image. |
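The following minimal C sketch (informative only) shows the begin/end bracketing for a single occlusion query around one draw; it assumes the query pool was created with VK_QUERY_TYPE_OCCLUSION, the query was reset outside the render pass instance, and a graphics pipeline and vertex data are already bound:
#include <vulkan/vulkan.h>

/* Informative sketch: count samples passing the per-fragment tests for one
   draw. The query must have been reset with vkCmdResetQueryPool before use,
   outside of a render pass instance. */
void drawWithOcclusionQuery(VkCommandBuffer commandBuffer,
                            VkQueryPool occlusionPool,
                            uint32_t query,
                            uint32_t vertexCount)
{
    /* VK_QUERY_CONTROL_PRECISE_BIT requests an exact sample count and
       requires the occlusionQueryPrecise feature; pass 0 if a boolean
       "any samples passed" answer is sufficient. */
    vkCmdBeginQuery(commandBuffer, occlusionPool, query,
                    VK_QUERY_CONTROL_PRECISE_BIT);

    vkCmdDraw(commandBuffer, vertexCount, 1, 0, 0);

    vkCmdEndQuery(commandBuffer, occlusionPool, query);
}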
17.4. Pipeline Statistics Queries
Pipeline statistics queries allow the application to sample a specified set
of VkPipeline
counters.
These counters are accumulated by Vulkan for a set of either draw or
dispatch commands while a pipeline statistics query is active.
As such, pipeline statistics queries are available on queue families
supporting either graphics or compute operations.
Further, the availability of pipeline statistics queries is indicated by the
pipelineStatisticsQuery
member of the VkPhysicalDeviceFeatures
object (see vkGetPhysicalDeviceFeatures
and vkCreateDevice
for
detecting and requesting this query type on a VkDevice
).
A pipeline statistics query is begun and ended by calling
vkCmdBeginQuery
and vkCmdEndQuery
, respectively.
When a pipeline statistics query begins, all statistics counters are set to
zero.
While the query is active, the pipeline type determines which set of
statistics are available, but these must be configured on the query pool
when it is created.
If a statistic counter is issued on a command buffer that does not support
the corresponding operation, the value of that counter is undefined after
the query has finished.
At least one statistic counter relevant to the operations supported on the
recording command buffer must be enabled.
Bits which can be set to individually enable pipeline statistics counters
for query pools with VkQueryPoolCreateInfo::pipelineStatistics
,
and for secondary command buffers with
VkCommandBufferInheritanceInfo::pipelineStatistics
, are:
typedef enum VkQueryPipelineStatisticFlagBits {
VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_VERTICES_BIT = 0x00000001,
VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_PRIMITIVES_BIT = 0x00000002,
VK_QUERY_PIPELINE_STATISTIC_VERTEX_SHADER_INVOCATIONS_BIT = 0x00000004,
VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_INVOCATIONS_BIT = 0x00000008,
VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_PRIMITIVES_BIT = 0x00000010,
VK_QUERY_PIPELINE_STATISTIC_CLIPPING_INVOCATIONS_BIT = 0x00000020,
VK_QUERY_PIPELINE_STATISTIC_CLIPPING_PRIMITIVES_BIT = 0x00000040,
VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT = 0x00000080,
VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_CONTROL_SHADER_PATCHES_BIT = 0x00000100,
VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_EVALUATION_SHADER_INVOCATIONS_BIT = 0x00000200,
VK_QUERY_PIPELINE_STATISTIC_COMPUTE_SHADER_INVOCATIONS_BIT = 0x00000400,
} VkQueryPipelineStatisticFlagBits;
-
VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_VERTICES_BIT
specifies that queries managed by the pool will count the number of vertices processed by the input assembly stage. Vertices corresponding to incomplete primitives may contribute to the count. -
VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_PRIMITIVES_BIT
specifies that queries managed by the pool will count the number of primitives processed by the input assembly stage. If primitive restart is enabled, restarting the primitive topology has no effect on the count. Incomplete primitives may be counted. -
VK_QUERY_PIPELINE_STATISTIC_VERTEX_SHADER_INVOCATIONS_BIT
specifies that queries managed by the pool will count the number of vertex shader invocations. This counter’s value is incremented each time a vertex shader is invoked. -
VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_INVOCATIONS_BIT
specifies that queries managed by the pool will count the number of geometry shader invocations. This counter’s value is incremented each time a geometry shader is invoked. In the case of instanced geometry shaders, the geometry shader invocations count is incremented for each separate instanced invocation. -
VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_PRIMITIVES_BIT
specifies that queries managed by the pool will count the number of primitives generated by geometry shader invocations. The counter’s value is incremented each time the geometry shader emits a primitive. Restarting primitive topology using the SPIR-V instructions OpEndPrimitive
or OpEndStreamPrimitive
has no effect on the geometry shader output primitives count. -
VK_QUERY_PIPELINE_STATISTIC_CLIPPING_INVOCATIONS_BIT
specifies that queries managed by the pool will count the number of primitives processed by the Primitive Clipping stage of the pipeline. The counter’s value is incremented each time a primitive reaches the primitive clipping stage. -
VK_QUERY_PIPELINE_STATISTIC_CLIPPING_PRIMITIVES_BIT
specifies that queries managed by the pool will count the number of primitives output by the Primitive Clipping stage of the pipeline. The counter’s value is incremented each time a primitive passes the primitive clipping stage. The actual number of primitives output by the primitive clipping stage for a particular input primitive is implementation-dependent but must satisfy the following conditions:-
If at least one vertex of the input primitive lies inside the clipping volume, the counter is incremented by one or more.
-
Otherwise, the counter is incremented by zero or more.
-
-
VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT
specifies that queries managed by the pool will count the number of fragment shader invocations. The counter’s value is incremented each time the fragment shader is invoked. -
VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_CONTROL_SHADER_PATCHES_BIT
specifies that queries managed by the pool will count the number of patches processed by the tessellation control shader. The counter’s value is incremented once for each patch for which a tessellation control shader is invoked. -
VK_QUERY_PIPELINE_STATISTIC_TESSELLATION_EVALUATION_SHADER_INVOCATIONS_BIT
specifies that queries managed by the pool will count the number of invocations of the tessellation evaluation shader. The counter’s value is incremented each time the tessellation evaluation shader is invoked. -
VK_QUERY_PIPELINE_STATISTIC_COMPUTE_SHADER_INVOCATIONS_BIT
specifies that queries managed by the pool will count the number of compute shader invocations. The counter’s value is incremented every time the compute shader is invoked. Implementations may skip the execution of certain compute shader invocations or execute additional compute shader invocations for implementation-dependent reasons as long as the results of rendering otherwise remain unchanged.
These values are intended to measure relative statistics on one implementation. Various device architectures will count these values differently. Any or all counters may be affected by the issues described in Query Operation.
Note
For example, tile-based rendering devices may need to replay the scene multiple times, affecting some of the counts. |
If a pipeline has rasterizerDiscardEnable
enabled, implementations
may discard primitives after the final vertex processing stage.
As a result, if rasterizerDiscardEnable
is enabled, the clipping input
and output primitives counters may not be incremented.
When a pipeline statistics query finishes, the result for that query is
marked as available.
The application can copy the result to a buffer (via
vkCmdCopyQueryPoolResults
), or request it be put into host memory (via
vkGetQueryPoolResults
).
typedef VkFlags VkQueryPipelineStatisticFlags;
VkQueryPipelineStatisticFlags
is a bitmask type for setting a mask of
zero or more VkQueryPipelineStatisticFlagBits.
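As an informative sketch, a query pool collecting a subset of these counters might be created as follows; it assumes the device was created with the pipelineStatisticsQuery feature enabled, and the particular counter selection is illustrative only:
#include <vulkan/vulkan.h>

/* Informative sketch: create a pipeline statistics query pool. Requires the
   pipelineStatisticsQuery feature to have been enabled on the device. */
VkResult createStatisticsPool(VkDevice device, VkQueryPool* pPool)
{
    VkQueryPoolCreateInfo info = {
        .sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO,
        .queryType = VK_QUERY_TYPE_PIPELINE_STATISTICS,
        .queryCount = 1,
        /* Counters to accumulate while a query from this pool is active. */
        .pipelineStatistics =
            VK_QUERY_PIPELINE_STATISTIC_INPUT_ASSEMBLY_VERTICES_BIT |
            VK_QUERY_PIPELINE_STATISTIC_VERTEX_SHADER_INVOCATIONS_BIT |
            VK_QUERY_PIPELINE_STATISTIC_FRAGMENT_SHADER_INVOCATIONS_BIT,
    };
    return vkCreateQueryPool(device, &info, NULL, pPool);
}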
17.5. Timestamp Queries
Timestamps provide applications with a mechanism for timing the execution
of commands.
A timestamp is an integer value generated by the VkPhysicalDevice
.
Unlike other queries, timestamps do not operate over a range, and so do not
use vkCmdBeginQuery or vkCmdEndQuery.
The mechanism is built around a set of commands that allow the application
to tell the VkPhysicalDevice
to write timestamp values to a
query pool and then either read timestamp values on the
host (using vkGetQueryPoolResults) or copy timestamp values to a
VkBuffer
(using vkCmdCopyQueryPoolResults).
The application can then compute differences between timestamps to
determine execution time.
The number of valid bits in a timestamp value is determined by the
VkQueueFamilyProperties
::timestampValidBits
property of the
queue on which the timestamp is written.
Timestamps are supported on any queue which reports a non-zero value for
timestampValidBits
via vkGetPhysicalDeviceQueueFamilyProperties.
If the timestampComputeAndGraphics
limit is VK_TRUE
, timestamps are
supported by every queue family that supports either graphics or compute
operations (see VkQueueFamilyProperties).
The number of nanoseconds it takes for a timestamp value to be incremented
by 1 can be obtained from
VkPhysicalDeviceLimits
::timestampPeriod
after a call to
vkGetPhysicalDeviceProperties
.
To request a timestamp, call:
void vkCmdWriteTimestamp(
VkCommandBuffer commandBuffer,
VkPipelineStageFlagBits pipelineStage,
VkQueryPool queryPool,
uint32_t query);
-
commandBuffer
is the command buffer into which the command will be recorded. -
pipelineStage
is one of the VkPipelineStageFlagBits, specifying a stage of the pipeline. -
queryPool
is the query pool that will manage the timestamp. -
query
is the query within the query pool that will contain the timestamp.
vkCmdWriteTimestamp
latches the value of the timer when all previous
commands have completed executing as far as the specified pipeline stage,
and writes the timestamp value to memory.
When the timestamp value is written, the availability status of the query is
set to available.
Note
If an implementation is unable to detect completion and latch the timer at any specific stage of the pipeline, it may instead do so at any logically later stage. |
vkCmdCopyQueryPoolResults can then be called to copy the timestamp value from the query pool into buffer memory, with ordering and synchronization behavior equivalent to how other queries operate. Timestamp values can also be retrieved from the query pool using vkGetQueryPoolResults. As with other queries, the query must be reset using vkCmdResetQueryPool before requesting the timestamp value be written to it.
While vkCmdWriteTimestamp
can be called inside or outside of a render
pass instance, vkCmdCopyQueryPoolResults must only be called outside
of a render pass instance.
Timestamps may only be meaningfully compared if they are written by commands submitted to the same queue.
Note
An example of such a comparison is determining the execution time of a sequence of commands. |
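The following informative C sketch records two timestamps around some work and converts the difference to nanoseconds on the host; recordWork is a hypothetical helper standing in for the commands being timed, and the query pool is assumed to have type VK_QUERY_TYPE_TIMESTAMP with at least two queries on a queue whose timestampValidBits is non-zero:
#include <vulkan/vulkan.h>

void recordWork(VkCommandBuffer commandBuffer);  /* hypothetical helper */

/* Informative sketch: bracket the timed commands with two timestamp writes. */
void recordTimedWork(VkCommandBuffer commandBuffer, VkQueryPool timestampPool)
{
    vkCmdResetQueryPool(commandBuffer, timestampPool, 0, 2);

    /* Latch the timer once all prior commands reach the top of the pipe. */
    vkCmdWriteTimestamp(commandBuffer, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
                        timestampPool, 0);

    recordWork(commandBuffer);

    /* Latch again once the timed commands have fully completed. */
    vkCmdWriteTimestamp(commandBuffer, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                        timestampPool, 1);
}

/* Informative sketch: after the command buffer has executed, convert the tick
   difference to nanoseconds using timestampPeriod and timestampValidBits from
   the physical-device and queue-family properties. */
double elapsedNanoseconds(VkDevice device, VkQueryPool timestampPool,
                          float timestampPeriod, uint32_t timestampValidBits)
{
    uint64_t ticks[2];
    VkResult result = vkGetQueryPoolResults(
        device, timestampPool, 0, 2,
        sizeof(ticks), ticks, sizeof(uint64_t),
        VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT);
    if (result != VK_SUCCESS)
        return 0.0;

    /* Only the low timestampValidBits bits of each value are valid. */
    uint64_t mask = (timestampValidBits >= 64)
                        ? ~0ull
                        : ((1ull << timestampValidBits) - 1);
    return (double)((ticks[1] & mask) - (ticks[0] & mask)) * timestampPeriod;
}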
If vkCmdWriteTimestamp
is called while executing a render pass
instance that has multiview enabled, the timestamp uses N consecutive
query indices in the query pool (starting at query
) where N is
the number of bits set in the view mask of the subpass the command is
executed in.
The resulting query values are determined by an implementation-dependent
choice of one of the following behaviors:
-
The first query is a timestamp value and (if more than one bit is set in the view mask) zero is written to the remaining queries. If two timestamps are written in the same subpass, the sum of the execution time of all views between those commands is the difference between the first query written by each command.
-
All N queries are timestamp values. If two timestamps are written in the same subpass, the sum of the execution time of all views between those commands is the sum of the difference between corresponding queries written by each command. The difference between corresponding queries may be the execution time of a single view.
In either case, the application can sum the differences between all N queries to determine the total execution time.
17.6. Transform Feedback Queries
Transform feedback queries track the number of primitives attempted to be
written and actually written, by the vertex stream being captured, to a
transform feedback buffer.
This query is updated during draw commands while transform feedback is
active.
The number of primitives actually written will be less than the number
attempted to be written if the bound transform feedback buffer size was too
small for the number of primitives actually drawn.
Primitives are not written beyond the bound range of the transform feedback
buffer.
A transform feedback query is begun and ended by calling
vkCmdBeginQuery
and vkCmdEndQuery
, respectively to query for
vertex stream zero.
vkCmdBeginQueryIndexedEXT
and vkCmdEndQueryIndexedEXT
can be
used to begin and end transform feedback queries for any supported vertex
stream.
When a transform feedback query begins, the count of primitives written and
primitives needed starts from zero.
For each drawing command, the count is incremented as vertex attribute
outputs are captured to the transform feedback buffers while transform
feedback is active.
When a transform feedback query finishes, the result for that query is
marked as available.
The application can then either copy the result to a buffer (via
vkCmdCopyQueryPoolResults
) or request it be put into host memory (via
vkGetQueryPoolResults
).
18. Clear Commands
18.1. Clearing Images Outside A Render Pass Instance
Color and depth/stencil images can be cleared outside a render pass instance using vkCmdClearColorImage or vkCmdClearDepthStencilImage, respectively. These commands are only allowed outside of a render pass instance.
To clear one or more subranges of a color image, call:
void vkCmdClearColorImage(
VkCommandBuffer commandBuffer,
VkImage image,
VkImageLayout imageLayout,
const VkClearColorValue* pColor,
uint32_t rangeCount,
const VkImageSubresourceRange* pRanges);
-
commandBuffer
is the command buffer into which the command will be recorded. -
image
is the image to be cleared. -
imageLayout
specifies the current layout of the image subresource ranges to be cleared, and must be VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
, VK_IMAGE_LAYOUT_GENERAL
or VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL
. -
pColor
is a pointer to a VkClearColorValue structure that contains the values the image subresource ranges will be cleared to (see Clear Values below). -
rangeCount
is the number of image subresource range structures in pRanges
. -
pRanges
points to an array of VkImageSubresourceRange structures that describe a range of mipmap levels, array layers, and aspects to be cleared, as described in Image Views.
Each specified range in pRanges
is cleared to the value specified by
pColor
.
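As an informative sketch, clearing every mip level and array layer of a color image to opaque black might be recorded as follows, assuming the image was created with VK_IMAGE_USAGE_TRANSFER_DST_BIT and has already been transitioned to VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL:
#include <vulkan/vulkan.h>

/* Informative sketch: clear the whole color aspect of an image. */
void clearToOpaqueBlack(VkCommandBuffer commandBuffer, VkImage image)
{
    VkClearColorValue color = { .float32 = { 0.0f, 0.0f, 0.0f, 1.0f } };

    /* Clear every mip level and array layer of the color aspect. */
    VkImageSubresourceRange range = {
        .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
        .baseMipLevel = 0,
        .levelCount = VK_REMAINING_MIP_LEVELS,
        .baseArrayLayer = 0,
        .layerCount = VK_REMAINING_ARRAY_LAYERS,
    };

    vkCmdClearColorImage(commandBuffer, image,
                         VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
                         &color, 1, &range);
}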
To clear one or more subranges of a depth/stencil image, call:
void vkCmdClearDepthStencilImage(
VkCommandBuffer commandBuffer,
VkImage image,
VkImageLayout imageLayout,
const VkClearDepthStencilValue* pDepthStencil,
uint32_t rangeCount,
const VkImageSubresourceRange* pRanges);
-
commandBuffer
is the command buffer into which the command will be recorded. -
image
is the image to be cleared. -
imageLayout
specifies the current layout of the image subresource ranges to be cleared, and must be VK_IMAGE_LAYOUT_GENERAL
or VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL
. -
pDepthStencil
is a pointer to a VkClearDepthStencilValue structure that contains the values the depth and stencil image subresource ranges will be cleared to (see Clear Values below). -
rangeCount
is the number of image subresource range structures in pRanges
. -
pRanges
points to an array of VkImageSubresourceRange structures that describe a range of mipmap levels, array layers, and aspects to be cleared, as described in Image Views.
Clears outside render pass instances are treated as transfer operations for the purposes of memory barriers.
18.2. Clearing Images Inside A Render Pass Instance
To clear one or more regions of color and depth/stencil attachments inside a render pass instance, call:
void vkCmdClearAttachments(
VkCommandBuffer commandBuffer,
uint32_t attachmentCount,
const VkClearAttachment* pAttachments,
uint32_t rectCount,
const VkClearRect* pRects);
-
commandBuffer
is the command buffer into which the command will be recorded. -
attachmentCount
is the number of entries in the pAttachments
array. -
pAttachments
is a pointer to an array of VkClearAttachment structures defining the attachments to clear and the clear values to use. If any attachment to be cleared in the current subpass is VK_ATTACHMENT_UNUSED
, then the clear has no effect on that attachment. -
rectCount
is the number of entries in the pRects
array. -
pRects
points to an array of VkClearRect structures defining regions within each selected attachment to clear.
vkCmdClearAttachments
can clear multiple regions of each attachment
used in the current subpass of a render pass instance.
This command must be called only inside a render pass instance, and
implicitly selects the images to clear based on the current framebuffer
attachments and the command parameters.
If the render pass has a fragment density map attachment, clears follow the operations of fragment density maps as if each clear region was a primitive which generates fragments. The clear color is applied to all pixels inside each fragment’s area regardless if the pixels lie outside of the clear region. Clears may have a different set of supported fragment areas than draws.
Unlike other clear commands, vkCmdClearAttachments executes as a
drawing command, rather than a transfer command, with writes performed by it
executing in rasterization order.
Clears to color attachments are executed as color attachment writes, by the
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
stage.
Clears to depth/stencil attachments are executed as depth
writes and writes by the
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
and
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
stages.
The VkClearRect
structure is defined as:
typedef struct VkClearRect {
VkRect2D rect;
uint32_t baseArrayLayer;
uint32_t layerCount;
} VkClearRect;
-
rect
is the two-dimensional region to be cleared. -
baseArrayLayer
is the first layer to be cleared. -
layerCount
is the number of layers to clear.
The layers [baseArrayLayer
, baseArrayLayer
+
layerCount
) counting from the base layer of the attachment image view
are cleared.
The VkClearAttachment
structure is defined as:
typedef struct VkClearAttachment {
VkImageAspectFlags aspectMask;
uint32_t colorAttachment;
VkClearValue clearValue;
} VkClearAttachment;
-
aspectMask
is a mask selecting the color, depth and/or stencil aspects of the attachment to be cleared. -
colorAttachment
is only meaningful if VK_IMAGE_ASPECT_COLOR_BIT
is set in aspectMask
, in which case it is an index to the pColorAttachments
array in the VkSubpassDescription structure of the current subpass which selects the color attachment to clear. -
clearValue
is the color or depth/stencil value to clear the attachment to, as described in Clear Values below.
No memory barriers are needed between vkCmdClearAttachments
and
preceding or subsequent draw or attachment clear commands in the same
subpass.
The vkCmdClearAttachments
command is not affected by the bound
pipeline state.
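As an informative sketch, the following clears color attachment 0 and the depth aspect of the depth/stencil attachment over a given region; it assumes it is recorded inside a render pass instance whose current subpass uses those attachments, and that width and height describe the region to clear:
#include <vulkan/vulkan.h>

/* Informative sketch: clear regions of the current subpass attachments. */
void clearSubpassAttachments(VkCommandBuffer commandBuffer,
                             uint32_t width, uint32_t height)
{
    VkClearAttachment attachments[2] = {
        {
            .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
            .colorAttachment = 0,
            .clearValue = { .color = { .float32 = { 0.0f, 0.0f, 0.0f, 1.0f } } },
        },
        {
            .aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT,
            .clearValue = { .depthStencil = { .depth = 1.0f, .stencil = 0 } },
        },
    };

    VkClearRect rect = {
        .rect = { .offset = { 0, 0 }, .extent = { width, height } },
        .baseArrayLayer = 0,
        .layerCount = 1,
    };

    vkCmdClearAttachments(commandBuffer, 2, attachments, 1, &rect);
}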
Attachments can also be cleared at the beginning of a render pass instance
by setting loadOp
(or stencilLoadOp
) of
VkAttachmentDescription to VK_ATTACHMENT_LOAD_OP_CLEAR
, as
described for vkCreateRenderPass.
18.3. Clear Values
The VkClearColorValue
structure is defined as:
typedef union VkClearColorValue {
float float32[4];
int32_t int32[4];
uint32_t uint32[4];
} VkClearColorValue;
-
float32
are the color clear values when the format of the image or attachment is one of the formats in the Interpretation of Numeric Format table other than signed integer (SINT
) or unsigned integer (UINT
). Floating point values are automatically converted to the format of the image, with the clear value being treated as linear if the image is sRGB. -
int32
are the color clear values when the format of the image or attachment is signed integer (SINT
). Signed integer values are converted to the format of the image by casting to the smaller type (with negative 32-bit values mapping to negative values in the smaller type). If the integer clear value is not representable in the target type (e.g. would overflow in conversion to that type), the clear value is undefined. -
uint32
are the color clear values when the format of the image or attachment is unsigned integer (UINT
). Unsigned integer values are converted to the format of the image by casting to the integer type with fewer bits.
The four array elements of the clear color map to R, G, B, and A components of image formats, in order.
If the image has more than one sample, the same value is written to all samples for any pixels being cleared.
The VkClearDepthStencilValue
structure is defined as:
typedef struct VkClearDepthStencilValue {
float depth;
uint32_t stencil;
} VkClearDepthStencilValue;
-
depth
is the clear value for the depth aspect of the depth/stencil attachment. It is a floating-point value which is automatically converted to the attachment’s format. -
stencil
is the clear value for the stencil aspect of the depth/stencil attachment. It is a 32-bit integer value which is converted to the attachment’s format by taking the appropriate number of LSBs.
The VkClearValue
union is defined as:
typedef union VkClearValue {
VkClearColorValue color;
VkClearDepthStencilValue depthStencil;
} VkClearValue;
-
color
specifies the color image clear values to use when clearing a color image or attachment. -
depthStencil
specifies the depth and stencil clear values to use when clearing a depth/stencil image or attachment.
This union is used where part of the API requires either color or depth/stencil clear values, depending on the attachment, and defines the initial clear values in the VkRenderPassBeginInfo structure.
18.4. Filling Buffers
To clear buffer data, call:
void vkCmdFillBuffer(
VkCommandBuffer commandBuffer,
VkBuffer dstBuffer,
VkDeviceSize dstOffset,
VkDeviceSize size,
uint32_t data);
-
commandBuffer
is the command buffer into which the command will be recorded. -
dstBuffer
is the buffer to be filled. -
dstOffset
is the byte offset into the buffer at which to start filling, and must be a multiple of 4. -
size
is the number of bytes to fill, and must be either a multiple of 4, or VK_WHOLE_SIZE
to fill the range from dstOffset
to the end of the buffer. If VK_WHOLE_SIZE
is used and the remaining size of the buffer is not a multiple of 4, then the nearest smaller multiple is used. -
data
is the 4-byte word written repeatedly to the buffer to fill size
bytes of data. The data word is written to memory according to the host endianness.
vkCmdFillBuffer
is treated as a “transfer” operation for the purposes
of synchronization barriers.
The VK_BUFFER_USAGE_TRANSFER_DST_BIT
must be specified in usage
of VkBufferCreateInfo
in order for the buffer to be compatible with
vkCmdFillBuffer
.
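As an informative sketch, zero-filling an entire buffer created with VK_BUFFER_USAGE_TRANSFER_DST_BIT might look like:
#include <vulkan/vulkan.h>

/* Informative sketch: write zeroes over the whole buffer. */
void zeroBuffer(VkCommandBuffer commandBuffer, VkBuffer buffer)
{
    vkCmdFillBuffer(commandBuffer, buffer, 0, VK_WHOLE_SIZE, 0u);
}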
18.5. Updating Buffers
To update buffer data inline in a command buffer, call:
void vkCmdUpdateBuffer(
VkCommandBuffer commandBuffer,
VkBuffer dstBuffer,
VkDeviceSize dstOffset,
VkDeviceSize dataSize,
const void* pData);
-
commandBuffer
is the command buffer into which the command will be recorded. -
dstBuffer
is a handle to the buffer to be updated. -
dstOffset
is the byte offset into the buffer to start updating, and must be a multiple of 4. -
dataSize
is the number of bytes to update, and must be a multiple of 4. -
pData
is a pointer to the source data for the buffer update, and must be at least dataSize
bytes in size.
dataSize
must be less than or equal to 65536 bytes.
For larger updates, applications can use buffer to buffer
copies.
Note
Buffer updates performed with vkCmdUpdateBuffer first copy the data into command buffer memory when the command is recorded (which requires additional storage and may incur an additional allocation), and then copy the data from the command buffer into dstBuffer when the command is executed on a device. The additional cost of this functionality compared to buffer to buffer copies means it is only recommended for very small amounts of data, and is why it is limited to only 65536 bytes. Applications can work around this by issuing multiple vkCmdUpdateBuffer commands to different ranges of the same buffer, but it is strongly recommended that they should not. |
The source data is copied from the user pointer to the command buffer when the command is called.
vkCmdUpdateBuffer
is only allowed outside of a render pass.
This command is treated as a “transfer” operation for the purposes of
synchronization barriers.
The VK_BUFFER_USAGE_TRANSFER_DST_BIT
must be specified in usage
of VkBufferCreateInfo in order for the buffer to be compatible with
vkCmdUpdateBuffer
.
Note
The |
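As an informative sketch, a small inline update might be recorded as follows, assuming the destination buffer was created with VK_BUFFER_USAGE_TRANSFER_DST_BIT and the command buffer is outside a render pass instance; the PerFrameConstants layout is purely illustrative:
#include <vulkan/vulkan.h>

/* Illustrative data layout only; the update size must be a multiple of 4. */
typedef struct PerFrameConstants {
    float time;
    float padding[3];
} PerFrameConstants;

/* Informative sketch: update a small uniform buffer inline. */
void updateConstants(VkCommandBuffer commandBuffer, VkBuffer uniformBuffer,
                     const PerFrameConstants* constants)
{
    vkCmdUpdateBuffer(commandBuffer, uniformBuffer, 0,
                      sizeof(PerFrameConstants), constants);
}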
19. Copy Commands
An application can copy buffer and image data using several methods
depending on the type of data transfer.
Data can be copied between buffer objects with vkCmdCopyBuffer
and a
portion of an image can be copied to another image with
vkCmdCopyImage
.
Image data can also be copied to and from buffer memory using
vkCmdCopyImageToBuffer
and vkCmdCopyBufferToImage
.
Image data can be blitted (with or without scaling and filtering) with
vkCmdBlitImage
.
Multisampled images can be resolved to a non-multisampled image with
vkCmdResolveImage
.
19.1. Common Operation
The following valid usage rules apply to all copy commands:
-
Copy commands must be recorded outside of a render pass instance.
-
The set of all bytes bound to all the source regions must not overlap the set of all bytes bound to the destination regions.
-
The set of all bytes bound to each destination region must not overlap the set of all bytes bound to another destination region.
-
Copy regions must be non-empty.
-
Regions must not extend outside the bounds of the buffer or image level, except that regions of compressed images can extend as far as the dimension of the image level rounded up to a complete compressed texel block.
-
Source image subresources must be in either the
VK_IMAGE_LAYOUT_GENERAL
or VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL
layout. Destination image subresources must be in the VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
, VK_IMAGE_LAYOUT_GENERAL
or VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL
layout. As a consequence, if an image subresource is used as both source and destination of a copy, it must be in the VK_IMAGE_LAYOUT_GENERAL
layout. -
Source images must have
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT
in their format features. -
Destination images must have
VK_FORMAT_FEATURE_TRANSFER_DST_BIT
in their format features. -
Source images must have been created with the
VK_IMAGE_USAGE_TRANSFER_SRC_BIT
usage bit enabled and destination images must have been created with the VK_IMAGE_USAGE_TRANSFER_DST_BIT
usage bit enabled. -
Source buffers must have been created with the
VK_BUFFER_USAGE_TRANSFER_SRC_BIT
usage bit enabled and destination buffers must have been created with the VK_BUFFER_USAGE_TRANSFER_DST_BIT
usage bit enabled.
All copy commands are treated as “transfer” operations for the purposes of synchronization barriers.
19.2. Copying Data Between Buffers
To copy data between buffer objects, call:
void vkCmdCopyBuffer(
VkCommandBuffer commandBuffer,
VkBuffer srcBuffer,
VkBuffer dstBuffer,
uint32_t regionCount,
const VkBufferCopy* pRegions);
-
commandBuffer
is the command buffer into which the command will be recorded. -
srcBuffer
is the source buffer. -
dstBuffer
is the destination buffer. -
regionCount
is the number of regions to copy. -
pRegions
is a pointer to an array of VkBufferCopy structures specifying the regions to copy.
Each region in pRegions
is copied from the source buffer to the same
region of the destination buffer.
srcBuffer
and dstBuffer
can be the same buffer or alias the
same memory, but the resulting values are undefined if the copy regions
overlap in memory.
The VkBufferCopy
structure is defined as:
typedef struct VkBufferCopy {
VkDeviceSize srcOffset;
VkDeviceSize dstOffset;
VkDeviceSize size;
} VkBufferCopy;
-
srcOffset
is the starting offset in bytes from the start of srcBuffer
. -
dstOffset
is the starting offset in bytes from the start of dstBuffer
. -
size
is the number of bytes to copy.
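As an informative sketch, a typical staging upload copies the first size bytes of a host-visible staging buffer into a device-local buffer; both buffers are assumed to have been created with the appropriate TRANSFER usage bits and to be at least size bytes long:
#include <vulkan/vulkan.h>

/* Informative sketch: copy a staging buffer into a device-local buffer. */
void copyStagingToDevice(VkCommandBuffer commandBuffer,
                         VkBuffer stagingBuffer, VkBuffer deviceBuffer,
                         VkDeviceSize size)
{
    VkBufferCopy region = {
        .srcOffset = 0,
        .dstOffset = 0,
        .size = size,
    };
    vkCmdCopyBuffer(commandBuffer, stagingBuffer, deviceBuffer, 1, &region);
}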
19.3. Copying Data Between Images
vkCmdCopyImage
performs image copies in a similar manner to a host
memcpy.
It does not perform general-purpose conversions such as scaling, resizing,
blending, color-space conversion, or format conversions.
Rather, it simply copies raw image data.
vkCmdCopyImage
can copy between images with different formats,
provided the formats are compatible as defined below.
To copy data between image objects, call:
void vkCmdCopyImage(
VkCommandBuffer commandBuffer,
VkImage srcImage,
VkImageLayout srcImageLayout,
VkImage dstImage,
VkImageLayout dstImageLayout,
uint32_t regionCount,
const VkImageCopy* pRegions);
-
commandBuffer
is the command buffer into which the command will be recorded. -
srcImage
is the source image. -
srcImageLayout
is the current layout of the source image subresource. -
dstImage
is the destination image. -
dstImageLayout
is the current layout of the destination image subresource. -
regionCount
is the number of regions to copy. -
pRegions
is a pointer to an array of VkImageCopy structures specifying the regions to copy.
Each region in pRegions
is copied from the source image to the same
region of the destination image.
srcImage
and dstImage
can be the same image or alias the same
memory.
The formats of srcImage
and dstImage
must be compatible.
Formats are compatible if they share the same class, as shown in the
Compatible Formats table.
Depth/stencil formats must match exactly.
If the format of srcImage
or dstImage
is a
multi-planar image
format, regions of each plane to be copied must be specified separately
using the srcSubresource
and dstSubresource
members of the
VkImageCopy structure.
In this case, the aspectMask
of the srcSubresource
or
dstSubresource
that refers to the multi-planar image must be
VK_IMAGE_ASPECT_PLANE_0_BIT
, VK_IMAGE_ASPECT_PLANE_1_BIT
, or
VK_IMAGE_ASPECT_PLANE_2_BIT
.
For the purposes of vkCmdCopyImage
, each plane of a multi-planar image
is treated as having the format listed in
Compatible formats of planes of multi-planar formats for the plane identified by the
aspectMask
of the corresponding subresource.
This applies both to VkFormat and to coordinates used in the copy,
which correspond to texels in the plane rather than how these texels map
to coordinates in the image as a whole.
Note
For example, the |
vkCmdCopyImage
allows copying between size-compatible compressed and
uncompressed internal formats.
Formats are size-compatible if the texel block size of the uncompressed
format is equal to the texel block size of the compressed format.
Such a copy does not perform on-the-fly compression or decompression.
When copying from an uncompressed format to a compressed format, each texel
of uncompressed data of the source image is copied as a raw value to the
corresponding compressed texel block of the destination image.
When copying from a compressed format to an uncompressed format, each
compressed texel block of the source image is copied as a raw value to the
corresponding texel of uncompressed data in the destination image.
Thus, for example, it is legal to copy between a 128-bit uncompressed format
and a compressed format which has a 128-bit sized compressed texel block
representing 4×4 texels (using 8 bits per texel), or between a 64-bit
uncompressed format and a compressed format which has a 64-bit sized
compressed texel block representing 4×4 texels (using 4 bits per
texel).
When copying between compressed and uncompressed formats the extent
members represent the texel dimensions of the source image and not the
destination.
When copying from a compressed image to an uncompressed image the image
texel dimensions written to the uncompressed image will be source extent
divided by the compressed texel block dimensions.
When copying from an uncompressed image to a compressed image the image
texel dimensions written to the compressed image will be the source extent
multiplied by the compressed texel block dimensions.
In both cases the number of bytes read and the number of bytes written will
be identical.
Copying to or from block-compressed images is typically done in multiples of
the compressed texel block size.
For this reason the extent
must be a multiple of the compressed texel
block dimension.
There is one exception to this rule which is required to handle compressed
images created with dimensions that are not a multiple of the compressed
texel block dimensions: if the srcImage
is compressed, then:
-
If
extent.width
is not a multiple of the compressed texel block width, then (extent.width
+srcOffset.x
) must equal the image subresource width. -
If
extent.height
is not a multiple of the compressed texel block height, then (extent.height
+srcOffset.y
) must equal the image subresource height. -
If
extent.depth
is not a multiple of the compressed texel block depth, then (extent.depth
+srcOffset.z
) must equal the image subresource depth.
Similarly, if the dstImage
is compressed, then:
-
If
extent.width
is not a multiple of the compressed texel block width, then (extent.width
+dstOffset.x
) must equal the image subresource width. -
If
extent.height
is not a multiple of the compressed texel block height, then (extent.height
+dstOffset.y
) must equal the image subresource height. -
If
extent.depth
is not a multiple of the compressed texel block depth, then (extent.depth
+dstOffset.z
) must equal the image subresource depth.
This allows the last compressed texel block of the image in each non-multiple dimension to be included as a source or destination of the copy.
“_422
” image formats that are not
multi-planar are
treated as having a 2×1 compressed texel block for the purposes of
these rules.
vkCmdCopyImage
can be used to copy image data between multisample
images, but both images must have the same number of samples.
The VkImageCopy
structure is defined as:
typedef struct VkImageCopy {
VkImageSubresourceLayers srcSubresource;
VkOffset3D srcOffset;
VkImageSubresourceLayers dstSubresource;
VkOffset3D dstOffset;
VkExtent3D extent;
} VkImageCopy;
-
srcSubresource
and dstSubresource
are VkImageSubresourceLayers structures specifying the image subresources of the images used for the source and destination image data, respectively. -
srcOffset
and dstOffset
select the initial x
, y
, and z
offsets in texels of the sub-regions of the source and destination image data. -
extent
is the size in texels of the image to copy in width
, height
and depth
.
For VK_IMAGE_TYPE_3D
images, copies are performed slice by slice
starting with the z
member of the srcOffset
or dstOffset
,
and copying depth
slices.
For images with multiple layers, copies are performed layer by layer
starting with the baseArrayLayer
member of the srcSubresource
or
dstSubresource
and copying layerCount
layers.
Image data can be copied between images with different image types.
If one image is VK_IMAGE_TYPE_3D
and the other image is
VK_IMAGE_TYPE_2D
with multiple layers, then each slice is copied to or
from a different layer.
Copies involving a multi-planar image format specify the region to be copied in terms of the
plane to be copied, not the coordinates of the multi-planar image.
This means that copies accessing the R/B planes of “_422
” format
images must fit the copied region within half the width
of the parent
image, and that copies accessing the R/B planes of “_420
” format
images must fit the copied region within half the width
and
height
of the parent image.
The VkImageSubresourceLayers
structure is defined as:
typedef struct VkImageSubresourceLayers {
VkImageAspectFlags aspectMask;
uint32_t mipLevel;
uint32_t baseArrayLayer;
uint32_t layerCount;
} VkImageSubresourceLayers;
-
aspectMask
is a combination of VkImageAspectFlagBits, selecting the color, depth and/or stencil aspects to be copied. -
mipLevel
is the mipmap level to copy from. -
baseArrayLayer
and layerCount
are the starting layer and number of layers to copy.
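As an informative sketch, copying mip level 0, layer 0 between two single-sample 2D color images with compatible formats might be recorded as follows, assuming both images were created with the appropriate TRANSFER usage bits and are already in the layouts passed to the command:
#include <vulkan/vulkan.h>

/* Informative sketch: raw copy of one 2D color subresource. */
void copyColorImage(VkCommandBuffer commandBuffer,
                    VkImage srcImage, VkImage dstImage,
                    uint32_t width, uint32_t height)
{
    VkImageCopy region = {
        /* aspectMask, mipLevel, baseArrayLayer, layerCount */
        .srcSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 },
        .srcOffset = { 0, 0, 0 },
        .dstSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 },
        .dstOffset = { 0, 0, 0 },
        .extent = { width, height, 1 },
    };
    vkCmdCopyImage(commandBuffer,
                   srcImage, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
                   dstImage, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
                   1, &region);
}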
19.4. Copying Data Between Buffers and Images
To copy data from a buffer object to an image object, call:
void vkCmdCopyBufferToImage(
VkCommandBuffer commandBuffer,
VkBuffer srcBuffer,
VkImage dstImage,
VkImageLayout dstImageLayout,
uint32_t regionCount,
const VkBufferImageCopy* pRegions);
-
commandBuffer
is the command buffer into which the command will be recorded. -
srcBuffer
is the source buffer. -
dstImage
is the destination image. -
dstImageLayout
is the layout of the destination image subresources for the copy. -
regionCount
is the number of regions to copy. -
pRegions
is a pointer to an array of VkBufferImageCopy structures specifying the regions to copy.
Each region in pRegions
is copied from the specified region of the
source buffer to the specified region of the destination image.
If the format of dstImage
is a
multi-planar image
format, regions of each plane to be a target of a copy must be specified
separately using the pRegions
member of the VkBufferImageCopy
structure.
In this case, the aspectMask
of imageSubresource
must be
VK_IMAGE_ASPECT_PLANE_0_BIT
, VK_IMAGE_ASPECT_PLANE_1_BIT
, or
VK_IMAGE_ASPECT_PLANE_2_BIT
.
For the purposes of vkCmdCopyBufferToImage
, each plane of a
multi-planar image is treated as having the format listed in
Compatible formats of planes of multi-planar formats for the plane identified by the
aspectMask
of the corresponding subresource.
This applies both to VkFormat and to coordinates used in the copy,
which correspond to texels in the plane rather than how these texels map
to coordinates in the image as a whole.
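As an informative sketch, uploading tightly packed texel data from a buffer into mip level 0 of a 2D color image might be recorded as follows, assuming the image is already in VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL:
#include <vulkan/vulkan.h>

/* Informative sketch: copy tightly packed buffer data into image level 0. */
void uploadImageLevel0(VkCommandBuffer commandBuffer,
                       VkBuffer srcBuffer, VkImage dstImage,
                       uint32_t width, uint32_t height)
{
    VkBufferImageCopy region = {
        .bufferOffset = 0,
        .bufferRowLength = 0,    /* 0 = rows tightly packed to imageExtent */
        .bufferImageHeight = 0,  /* 0 = slices tightly packed to imageExtent */
        .imageSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 },
        .imageOffset = { 0, 0, 0 },
        .imageExtent = { width, height, 1 },
    };
    vkCmdCopyBufferToImage(commandBuffer, srcBuffer, dstImage,
                           VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
                           1, &region);
}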
To copy data from an image object to a buffer object, call:
void vkCmdCopyImageToBuffer(
VkCommandBuffer commandBuffer,
VkImage srcImage,
VkImageLayout srcImageLayout,
VkBuffer dstBuffer,
uint32_t regionCount,
const VkBufferImageCopy* pRegions);
-
commandBuffer
is the command buffer into which the command will be recorded. -
srcImage
is the source image. -
srcImageLayout
is the layout of the source image subresources for the copy. -
dstBuffer
is the destination buffer. -
regionCount
is the number of regions to copy. -
pRegions
is a pointer to an array of VkBufferImageCopy structures specifying the regions to copy.
Each region in pRegions
is copied from the specified region of the
source image to the specified region of the destination buffer.
If the VkFormat of srcImage
is a
multi-planar image
format, regions of each plane to be a source of a copy must be specified
separately using the pRegions
member of the VkBufferImageCopy
structure.
In this case, the aspectMask
of imageSubresource
must be
VK_IMAGE_ASPECT_PLANE_0_BIT
, VK_IMAGE_ASPECT_PLANE_1_BIT
, or
VK_IMAGE_ASPECT_PLANE_2_BIT
.
For the purposes of vkCmdCopyImageToBuffer
, each plane of a
multi-planar image is treated as having the format listed in
Compatible formats of planes of multi-planar formats for the plane identified by the
aspectMask
of the corresponding subresource.
This applies both to VkFormat and to coordinates used in the copy,
which correspond to texels in the plane rather than how these texels map
to coordinates in the image as a whole.
For both vkCmdCopyBufferToImage and vkCmdCopyImageToBuffer, each
element of pRegions
is a structure defined as:
typedef struct VkBufferImageCopy {
VkDeviceSize bufferOffset;
uint32_t bufferRowLength;
uint32_t bufferImageHeight;
VkImageSubresourceLayers imageSubresource;
VkOffset3D imageOffset;
VkExtent3D imageExtent;
} VkBufferImageCopy;
-
bufferOffset
is the offset in bytes from the start of the buffer object where the image data is copied from or to. -
bufferRowLength
and bufferImageHeight
specify in texels a subregion of a larger two- or three-dimensional image in buffer memory, and control the addressing calculations. If either of these values is zero, that aspect of the buffer memory is considered to be tightly packed according to the imageExtent
. -
imageSubresource
is a VkImageSubresourceLayers used to specify the specific image subresources of the image used for the source or destination image data. -
imageOffset
selects the initial x
, y
, z
offsets in texels of the sub-region of the source or destination image data. -
imageExtent
is the size in texels of the image to copy in width
, height
and depth
.
When copying to or from a depth or stencil aspect, the data in buffer memory uses a layout that is a (mostly) tightly packed representation of the depth or stencil data. Specifically:
-
data copied to or from the stencil aspect of any depth/stencil format is tightly packed with one
VK_FORMAT_S8_UINT
value per texel. -
data copied to or from the depth aspect of a
VK_FORMAT_D16_UNORM
orVK_FORMAT_D16_UNORM_S8_UINT
format is tightly packed with oneVK_FORMAT_D16_UNORM
value per texel. -
data copied to or from the depth aspect of a
VK_FORMAT_D32_SFLOAT
orVK_FORMAT_D32_SFLOAT_S8_UINT
format is tightly packed with oneVK_FORMAT_D32_SFLOAT
value per texel. -
data copied to or from the depth aspect of a
VK_FORMAT_X8_D24_UNORM_PACK32
orVK_FORMAT_D24_UNORM_S8_UINT
format is packed with one 32-bit word per texel with the D24 value in the LSBs of the word, and undefined values in the eight MSBs.
Note
To copy both the depth and stencil aspects of a depth/stencil format, two
entries in pRegions can be used, where one specifies the depth aspect in imageSubresource, and the other specifies the stencil aspect. |
Because depth or stencil aspect buffer to image copies may require format conversions on some implementations, they are not supported on queues that do not support graphics.
Copies are done layer by layer starting with image layer
baseArrayLayer
member of imageSubresource
.
layerCount
layers are copied from the source image or to the
destination image.
19.4.1. Buffer and Image Addressing
Pseudocode for image/buffer addressing of uncompressed formats is:
rowLength = region->bufferRowLength;
if (rowLength == 0)
rowLength = region->imageExtent.width;
imageHeight = region->bufferImageHeight;
if (imageHeight == 0)
imageHeight = region->imageExtent.height;
texelBlockSize = <texel block size of the format of the src/dstImage>;
address of (x,y,z) = region->bufferOffset + (((z * imageHeight) + y) * rowLength + x) * texelBlockSize;
where x,y,z range from (0,0,0) to region->imageExtent.{width,height,depth}.
Note that imageOffset
does not affect addressing calculations for
buffer memory.
Instead, bufferOffset
can be used to select the starting address in
buffer memory.
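As an informative sketch of the addressing rule above for uncompressed formats, the following helper computes how many bytes of buffer memory a single non-empty region touches, relative to bufferOffset; texelBlockSize is an assumed input giving the texel block size, in bytes, of the image format involved:
#include <vulkan/vulkan.h>

/* Informative sketch: bytes of buffer memory addressed by one region,
   measured from bufferOffset, for an uncompressed format. Assumes a
   non-empty region, as required by the common copy rules. */
VkDeviceSize regionBufferSize(const VkBufferImageCopy* region,
                              VkDeviceSize texelBlockSize)
{
    uint32_t rowLength = region->bufferRowLength
                             ? region->bufferRowLength
                             : region->imageExtent.width;
    uint32_t imageHeight = region->bufferImageHeight
                               ? region->bufferImageHeight
                               : region->imageExtent.height;

    /* Offset (in texel blocks) of the last texel (width-1, height-1, depth-1),
       following the addressing pseudocode, plus one block for its size. */
    VkDeviceSize lastTexel =
        ((VkDeviceSize)(region->imageExtent.depth - 1) * imageHeight +
         (region->imageExtent.height - 1)) * rowLength +
        (region->imageExtent.width - 1);
    return (lastTexel + 1) * texelBlockSize;
}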
For block-compressed formats, all parameters are still specified in texels rather than compressed texel blocks, but the addressing math operates on whole compressed texel blocks. Pseudocode for compressed copy addressing is:
rowLength = region->bufferRowLength;
if (rowLength == 0)
rowLength = region->imageExtent.width;
imageHeight = region->bufferImageHeight;
if (imageHeight == 0)
imageHeight = region->imageExtent.height;
compressedTexelBlockSizeInBytes = <compressed texel block size taken from the src/dstImage>;
rowLength /= compressedTexelBlockWidth;
imageHeight /= compressedTexelBlockHeight;
address of (x,y,z) = region->bufferOffset + (((z * imageHeight) + y) * rowLength + x) * compressedTexelBlockSizeInBytes;
where x,y,z range from (0,0,0) to region->imageExtent.{width/compressedTexelBlockWidth,height/compressedTexelBlockHeight,depth/compressedTexelBlockDepth}.
Copying to or from block-compressed images is typically done in multiples of
the compressed texel block size.
For this reason the imageExtent
must be a multiple of the compressed
texel block dimension.
There is one exception to this rule which is required to handle compressed
images created with dimensions that are not a multiple of the compressed
texel block dimensions:
-
If
imageExtent.width
is not a multiple of the compressed texel block width, then (imageExtent.width
+imageOffset.x
) must equal the image subresource width. -
If
imageExtent.height
is not a multiple of the compressed texel block height, then (imageExtent.height
+imageOffset.y
) must equal the image subresource height. -
If
imageExtent.depth
is not a multiple of the compressed texel block depth, then (imageExtent.depth
+imageOffset.z
) must equal the image subresource depth.
This allows the last compressed texel block of the image in each non-multiple dimension to be included as a source or destination of the copy.
19.5. Image Copies with Scaling
To copy regions of a source image into a destination image, potentially performing format conversion, arbitrary scaling, and filtering, call:
void vkCmdBlitImage(
VkCommandBuffer commandBuffer,
VkImage srcImage,
VkImageLayout srcImageLayout,
VkImage dstImage,
VkImageLayout dstImageLayout,
uint32_t regionCount,
const VkImageBlit* pRegions,
VkFilter filter);
-
commandBuffer
is the command buffer into which the command will be recorded. -
srcImage
is the source image. -
srcImageLayout
is the layout of the source image subresources for the blit. -
dstImage
is the destination image. -
dstImageLayout
is the layout of the destination image subresources for the blit. -
regionCount
is the number of regions to blit. -
pRegions
is a pointer to an array of VkImageBlit structures specifying the regions to blit. -
filter
is a VkFilter specifying the filter to apply if the blits require scaling.
vkCmdBlitImage
must not be used for multisampled source or
destination images.
Use vkCmdResolveImage for this purpose.
As the sizes of the source and destination extents can differ in any dimension, texels in the source extent are scaled and filtered to the destination extent. Scaling occurs via the following operations:
-
For each destination texel, the integer coordinate of that texel is converted to an unnormalized texture coordinate, using the effective inverse of the equations described in unnormalized to integer conversion:
u_base = i + ½
v_base = j + ½
w_base = k + ½
-
These base coordinates are then offset by the first destination offset:
u_offset = u_base - x_dst0
v_offset = v_base - y_dst0
w_offset = w_base - z_dst0
a_offset = a - baseArrayCount_dst
-
The scale is determined from the source and destination regions, and applied to the offset coordinates:
scale_u = (x_src1 - x_src0) / (x_dst1 - x_dst0)
scale_v = (y_src1 - y_src0) / (y_dst1 - y_dst0)
scale_w = (z_src1 - z_src0) / (z_dst1 - z_dst0)
u_scaled = u_offset * scale_u
v_scaled = v_offset * scale_v
w_scaled = w_offset * scale_w
-
Finally the source offset is added to the scaled coordinates, to determine the final unnormalized coordinates used to sample from srcImage:
u = u_scaled + x_src0
v = v_scaled + y_src0
w = w_scaled + z_src0
q = mipLevel
a = a_offset + baseArrayCount_src
These coordinates are used to sample from the source image, as described in
Image Operations chapter, with the filter mode equal to that
of filter
, a mipmap mode of VK_SAMPLER_MIPMAP_MODE_NEAREST
and
an address mode of VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE
.
Implementations must clamp at the edge of the source image, and may
additionally clamp to the edge of the source region.
Note
Due to allowable rounding errors in the generation of the source texture coordinates, it is not always possible to guarantee exactly which source texels will be sampled for a given blit. As rounding errors are implementation dependent, the exact results of a blitting operation are also implementation dependent. |
Blits are done layer by layer starting with the baseArrayLayer
member
of srcSubresource
for the source and dstSubresource
for the
destination.
layerCount
layers are blitted to the destination image.
3D textures are blitted slice by slice.
Slices in the source region bounded by srcOffsets
[0].z
and
srcOffsets
[1].z
are copied to slices in the destination region
bounded by dstOffsets
[0].z
and dstOffsets
[1].z
.
For each destination slice, a source z coordinate is linearly interpolated
between srcOffsets
[0].z
and srcOffsets
[1].z
.
If the filter
parameter is VK_FILTER_LINEAR
then the value
sampled from the source image is taken by doing linear filtering using the
interpolated z coordinate.
If the filter
parameter is VK_FILTER_NEAREST
then the value sampled
from the source image is taken from the single nearest slice, with an
implementation-dependent arithmetic rounding mode.
The following filtering and conversion rules apply:
-
Integer formats can only be converted to other integer formats with the same signedness.
-
No format conversion is supported between depth/stencil images. The formats must match.
-
Format conversions on unorm, snorm, unscaled and packed float formats of the copied aspect of the image are performed by first converting the pixels to float values.
-
For sRGB source formats, nonlinear RGB values are converted to linear representation prior to filtering.
-
After filtering, the float values are first clamped and then cast to the destination image format. In case of sRGB destination format, linear RGB values are converted to nonlinear representation before writing the pixel to the image.
Signed and unsigned integers are converted by first clamping to the representable range of the destination format, then casting the value.
The VkImageBlit
structure is defined as:
typedef struct VkImageBlit {
VkImageSubresourceLayers srcSubresource;
VkOffset3D srcOffsets[2];
VkImageSubresourceLayers dstSubresource;
VkOffset3D dstOffsets[2];
} VkImageBlit;
-
srcSubresource
is the subresource to blit from. -
srcOffsets
is an array of two VkOffset3D structures specifying the bounds of the source region within srcSubresource
. -
dstSubresource
is the subresource to blit into. -
dstOffsets
is an array of two VkOffset3D structures specifying the bounds of the destination region within dstSubresource
.
For each element of the pRegions
array, a blit operation is performed
for the specified source and destination regions.
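As an informative sketch, a common use is downscaling one mip level into the next within the same image as part of mipmap generation; this assumes a 2D color image created with both TRANSFER_SRC and TRANSFER_DST usage, level mip in VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, level mip + 1 in VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, and a format supporting linear-filtered blits:
#include <vulkan/vulkan.h>

/* Informative sketch: blit level `mip` (width x height) into level `mip + 1`
   at half the size, clamping each dimension to a minimum of 1. */
void blitNextMip(VkCommandBuffer commandBuffer, VkImage image,
                 uint32_t mip, int32_t width, int32_t height)
{
    VkImageBlit region = {
        .srcSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, mip, 0, 1 },
        .srcOffsets = { { 0, 0, 0 }, { width, height, 1 } },
        .dstSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, mip + 1, 0, 1 },
        .dstOffsets = { { 0, 0, 0 },
                        { width > 1 ? width / 2 : 1,
                          height > 1 ? height / 2 : 1, 1 } },
    };
    vkCmdBlitImage(commandBuffer,
                   image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
                   image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
                   1, &region, VK_FILTER_LINEAR);
}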
19.6. Resolving Multisample Images
To resolve a multisample image to a non-multisample image, call:
void vkCmdResolveImage(
VkCommandBuffer commandBuffer,
VkImage srcImage,
VkImageLayout srcImageLayout,
VkImage dstImage,
VkImageLayout dstImageLayout,
uint32_t regionCount,
const VkImageResolve* pRegions);
-
commandBuffer
is the command buffer into which the command will be recorded. -
srcImage
is the source image. -
srcImageLayout
is the layout of the source image subresources for the resolve. -
dstImage
is the destination image. -
dstImageLayout
is the layout of the destination image subresources for the resolve. -
regionCount
is the number of regions to resolve. -
pRegions
is a pointer to an array of VkImageResolve structures specifying the regions to resolve.
During the resolve the samples corresponding to each pixel location in the source are converted to a single sample before being written to the destination. If the source formats are floating-point or normalized types, the sample values for each pixel are resolved in an implementation-dependent manner. If the source formats are integer types, a single sample’s value is selected for each pixel.
srcOffset
and dstOffset
select the initial x
, y
, and
z
offsets in texels of the sub-regions of the source and destination
image data.
extent
is the size in texels of the source image to resolve in
width
, height
and depth
.
Resolves are done layer by layer starting with baseArrayLayer
member
of srcSubresource
for the source and dstSubresource
for the
destination.
layerCount
layers are resolved to the destination image.
The VkImageResolve
structure is defined as:
typedef struct VkImageResolve {
VkImageSubresourceLayers srcSubresource;
VkOffset3D srcOffset;
VkImageSubresourceLayers dstSubresource;
VkOffset3D dstOffset;
VkExtent3D extent;
} VkImageResolve;
-
srcSubresource
and dstSubresource
are VkImageSubresourceLayers structures specifying the image subresources of the images used for the source and destination image data, respectively. Resolve of depth/stencil images is not supported. -
srcOffset
and dstOffset
select the initial x
, y
, and z
offsets in texels of the sub-regions of the source and destination image data. -
extent
is the size in texels of the source image to resolve in width
, height
and depth
.
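As an informative sketch, resolving a multisampled 2D color image into a single-sample image of the same format might be recorded as follows, assuming both images were created with the appropriate TRANSFER usage bits and are already in the layouts passed to the command:
#include <vulkan/vulkan.h>

/* Informative sketch: resolve all samples of level 0, layer 0. */
void resolveToSingleSample(VkCommandBuffer commandBuffer,
                           VkImage msaaImage, VkImage resolveImage,
                           uint32_t width, uint32_t height)
{
    VkImageResolve region = {
        .srcSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 },
        .srcOffset = { 0, 0, 0 },
        .dstSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 },
        .dstOffset = { 0, 0, 0 },
        .extent = { width, height, 1 },
    };
    vkCmdResolveImage(commandBuffer,
                      msaaImage, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
                      resolveImage, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
                      1, &region);
}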
19.7. Buffer Markers
To write a 32-bit marker value into a buffer as a pipelined operation, call:
void vkCmdWriteBufferMarkerAMD(
VkCommandBuffer commandBuffer,
VkPipelineStageFlagBits pipelineStage,
VkBuffer dstBuffer,
VkDeviceSize dstOffset,
uint32_t marker);
-
commandBuffer
is the command buffer into which the command will be recorded. -
pipelineStage
is one of the VkPipelineStageFlagBits values, specifying the pipeline stage whose completion triggers the marker write. -
dstBuffer
is the buffer where the marker will be written to. -
dstOffset
is the byte offset into the buffer where the marker will be written to. -
marker
is the 32-bit value of the marker.
The command will write the 32-bit marker value into the buffer only after
all preceding commands have finished executing up to at least the specified
pipeline stage.
This includes the completion of other preceding
vkCmdWriteBufferMarkerAMD
commands so long as their specified pipeline
stages occur either at the same time or earlier than this command’s
specified pipelineStage
.
While consecutive buffer marker writes with the same pipelineStage
parameter are implicitly complete in submission order, memory and execution
dependencies between buffer marker writes and other operations must still be
explicitly ordered using synchronization commands.
The access scope for buffer marker writes falls under the
VK_ACCESS_TRANSFER_WRITE_BIT
, and the pipeline stages for identifying
the synchronization scope must include both pipelineStage
and
VK_PIPELINE_STAGE_TRANSFER_BIT
.
Note
Similar to vkCmdWriteTimestamp, if an implementation is unable to write a marker at any specific pipeline stage, it may instead do so at any logically later stage. |
Note
Implementations may only support a limited number of pipelined marker write operations in flight at a given time, thus excessive number of marker write operations may degrade command execution performance. |
20. Drawing Commands
Drawing commands (commands with Draw
in the name) provoke work in a
graphics pipeline.
Drawing commands are recorded into a command buffer and when executed by a
queue, will produce work which executes according to the bound graphics
pipeline.
A graphics pipeline must be bound to a command buffer before any drawing
commands are recorded in that command buffer.
Drawing can be achieved in two modes:
-
Programmable Mesh Shading, the mesh shader assembles primitives, or
-
Programmable Primitive Shading, the input primitives are assembled
as follows.
Each draw is made up of zero or more vertices and zero or more instances,
which are processed by the device and result in the assembly of primitives.
Primitives are assembled according to the pInputAssemblyState
member
of the VkGraphicsPipelineCreateInfo
structure, which is of type
VkPipelineInputAssemblyStateCreateInfo
:
typedef struct VkPipelineInputAssemblyStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineInputAssemblyStateCreateFlags flags;
VkPrimitiveTopology topology;
VkBool32 primitiveRestartEnable;
} VkPipelineInputAssemblyStateCreateInfo;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
topology
is a VkPrimitiveTopology defining the primitive topology, as described below. -
primitiveRestartEnable
controls whether a special vertex index value is treated as restarting the assembly of primitives. This enable only applies to indexed draws (vkCmdDrawIndexed and vkCmdDrawIndexedIndirect), and the special index value is either 0xFFFFFFFF when the indexType
parameter of vkCmdBindIndexBuffer
is equal to VK_INDEX_TYPE_UINT32
, or 0xFFFF when indexType
is equal to VK_INDEX_TYPE_UINT16
. Primitive restart is not allowed for “list” topologies.
Restarting the assembly of primitives discards the most recent index values
if those elements formed an incomplete primitive, and restarts the primitive
assembly using the subsequent indices, but only assembling the immediately
following element through the end of the originally specified elements.
The primitive restart index value comparison is performed before adding the
vertexOffset
value to the index value.
typedef VkFlags VkPipelineInputAssemblyStateCreateFlags;
VkPipelineInputAssemblyStateCreateFlags
is a bitmask type for setting
a mask, but is currently reserved for future use.
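As an informal illustration, a pipeline drawing indexed triangle strips with primitive restart enabled might fill the structure as follows; the variable name is an assumption.
const VkPipelineInputAssemblyStateCreateInfo inputAssembly =
{
    VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO, // sType
    NULL,                                 // pNext
    0,                                    // flags
    VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP, // topology
    VK_TRUE                               // primitiveRestartEnable
};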
20.1. Primitive Topologies
Primitive topology determines how consecutive vertices are organized into primitives, and determines the type of primitive that is used at the beginning of the graphics pipeline. The effective topology for later stages of the pipeline is altered by tessellation or geometry shading (if either is in use) and depends on the execution modes of those shaders. In the case of mesh shading the only effective topology is defined by the execution mode of the mesh shader.
Supported topologies are defined by VkPrimitiveTopology and include:
typedef enum VkPrimitiveTopology {
VK_PRIMITIVE_TOPOLOGY_POINT_LIST = 0,
VK_PRIMITIVE_TOPOLOGY_LINE_LIST = 1,
VK_PRIMITIVE_TOPOLOGY_LINE_STRIP = 2,
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST = 3,
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP = 4,
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN = 5,
VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY = 6,
VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY = 7,
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY = 8,
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY = 9,
VK_PRIMITIVE_TOPOLOGY_PATCH_LIST = 10,
} VkPrimitiveTopology;
Each primitive topology, and its construction from a list of vertices, is summarized below with a supporting diagram. In each diagram, the numbered points show the sequencing of vertices in order within the vertex arrays; however the positions chosen are arbitrary and for illustration only. Vertices connected with solid lines belong to the main primitives. In the primitive types with adjacency, the vertices connected by dashed lines are the adjacent vertices that are accessible in a geometry shader.
Note
The terminology “vertex i” means “the vertex with index i in the ordered list of vertices defining this primitive”.
20.1.1. Point Lists
A series of individual points are specified with topology
VK_PRIMITIVE_TOPOLOGY_POINT_LIST
.
Each vertex defines a separate point.
20.1.2. Line Lists
Lists of line segments, with each segment defined by a pair of vertices, are
specified with topology
VK_PRIMITIVE_TOPOLOGY_LINE_LIST
.
The first two vertices define the first segment, with subsequent pairs of
vertices each defining one more segment.
If the number of vertices is odd, then the last vertex is ignored.
20.1.3. Line Strips
A series of one or more connected line segments are specified with
topology
VK_PRIMITIVE_TOPOLOGY_LINE_STRIP
.
In this case, the first vertex specifies the first segment’s start point
while the second vertex specifies the first segment’s endpoint and the
second segment’s start point.
In general, vertex i (for i > 0) specifies the beginning of the
ith segment and the end of the previous segment.
The last vertex specifies the end of the last segment.
If only one vertex is specified, then no primitive is generated.
20.1.4. Triangle Lists
Lists of separate triangles are specified with topology
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST
.
In this case, vertices 3 i, 3 i + 1, and 3 i + 2
(in that order) determine a triangle for each i = 0, 1, …, n-1,
where there are 3 n + k vertices drawn.
k is either 0, 1, or 2; if k is not zero, the final k
vertices are ignored.
20.1.5. Triangle Strips
A triangle strip is a series of triangles connected along shared edges, and
is specified with topology
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP
.
In this case, the first three vertices define the first triangle, and their
order is significant.
Each subsequent vertex defines a new triangle using that point along with
the last two vertices from the previous triangle.
If fewer than three vertices are specified, no primitive is produced.
The order of vertices in successive triangles changes as shown in the figure
below, so that all triangle faces have the same orientation.
20.1.6. Triangle Fans
A triangle fan is specified with topology
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN
.
It is similar to a triangle strip, but changes the vertex replaced from the
previous triangle so that all triangles in the fan share a common vertex.
20.1.7. Line Lists With Adjacency
Lines with adjacency are specified with topology
VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY
, and are independent
line segments where each endpoint has a corresponding adjacent vertex that
is accessible in a geometry shader.
If a geometry shader is not active, the adjacent vertices are ignored.
A line segment is drawn from vertex 4 i + 1 to vertex 4 i + 2 for each i = 0, 1, …, n-1, where there are 4 n + k vertices. k is either 0, 1, 2, or 3; if k is not zero, the final k vertices are ignored. For line segment i, vertices 4 i and 4 i + 3 are considered adjacent to vertices 4 i + 1 and 4 i + 2, respectively.
20.1.8. Line Strips With Adjacency
Line strips with adjacency are specified with topology
VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY
and are similar to
line strips, except that each line segment has a pair of adjacent vertices
that are accessible in a geometry shader.
If a geometry shader is not active, the adjacent vertices are ignored.
A line segment is drawn from vertex i + 1 to vertex i + 2 for each i = 0, 1, …, n-1, where there are n + 3 vertices. If there are fewer than four vertices, all vertices are ignored. For line segment i, vertices i and i + 3 are considered adjacent to vertices i + 1 and i + 2, respectively.
20.1.9. Triangle Lists With Adjacency
Triangles with adjacency are specified with topology
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY
, and are similar to
separate triangles except that each triangle edge has an adjacent vertex
that is accessible in a geometry shader.
If a geometry shader is not active, the adjacent vertices are ignored.
Vertices 6 i, 6 i + 2, and 6 i + 4 (in that order) determine a triangle for each i = 0, 1, …, n-1, where there are 6 n + k vertices. k is either 0, 1, 2, 3, 4, or 5; if k is non-zero, the final k vertices are ignored. For triangle i, vertices 6 i + 1, 6 i + 3, and 6 i + 5 are considered adjacent to the edges from vertex 6 i to 6 i + 2, from 6 i + 2 to 6 i + 4, and from 6 i + 4 to 6 i, respectively.
20.1.10. Triangle Strips With Adjacency
Triangle strips with adjacency are specified with topology
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY
, and are similar
to triangle strips except that each triangle edge has an adjacent vertex
that is accessible in a geometry shader.
If a geometry shader is not active, the adjacent vertices are ignored.
In triangle strips with adjacency, n triangles are drawn where there are 2 (n + 2) + k vertices. k is either 0 or 1; if k is 1, the final vertex is ignored. If there are fewer than 6 vertices, the entire primitive is ignored.
The table below illustrates the vertices and order used to draw each triangle, and which vertices are considered adjacent to each edge of those triangles. Each triangle is drawn using the vertices whose numbers are in the 1st, 2nd, and 3rd columns under Primitive Vertices, in that order. The vertices in the 1/2, 2/3, and 3/1 columns under Adjacent Vertices are considered adjacent to the edges from the first to the second, from the second to the third, and from the third to the first vertex of the triangle, respectively. The six rows correspond to six cases: the first and only triangle (i = 0, n = 1), the first triangle of several (i = 0, n > 1), odd middle triangles (i = 1, 3, 5 …), even middle triangles (i = 2, 4, 6, …), and special cases for the last triangle, when i is either even or odd. For the purposes of this table, both the first vertex and first triangle are numbered 0.
Primitive | 1st | 2nd | 3rd | 1/2 | 2/3 | 3/1
---|---|---|---|---|---|---
only (i = 0, n = 1) | 0 | 2 | 4 | 1 | 5 | 3
first (i = 0) | 0 | 2 | 4 | 1 | 6 | 3
middle (i odd) | 2 i + 2 | 2 i | 2 i + 4 | 2 i - 2 | 2 i + 3 | 2 i + 6
middle (i even) | 2 i | 2 i + 2 | 2 i + 4 | 2 i - 2 | 2 i + 6 | 2 i + 3
last (i = n - 1, i odd) | 2 i + 2 | 2 i | 2 i + 4 | 2 i - 2 | 2 i + 3 | 2 i + 5
last (i = n - 1, i even) | 2 i | 2 i + 2 | 2 i + 4 | 2 i - 2 | 2 i + 5 | 2 i + 3
20.1.11. Separate Patches
Separate patches are specified with topology
VK_PRIMITIVE_TOPOLOGY_PATCH_LIST
.
A patch is an ordered collection of vertices used for
primitive tessellation.
The vertices comprising a patch have no implied geometric ordering, and are
used by tessellation shaders and the fixed-function tessellator to generate
new point, line, or triangle primitives.
Each patch in the series has a fixed number of vertices, specified by the
patchControlPoints
member of the
VkPipelineTessellationStateCreateInfo structure passed to
vkCreateGraphicsPipelines.
Once assembled and vertex shaded, these patches are provided as input to the
tessellation control shader stage.
If the number of vertices in a patch is given by v, vertices v × i through v × i + v - 1 (in that order) determine a patch for each i = 0, 1, …, n-1, where there are v × n + k vertices. k is in the range [0, v - 1]; if k is not zero, the final k vertices are ignored.
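As an informal illustration, a pipeline consuming 16-control-point patches (the count is an assumption, chosen for bicubic patches) would use VK_PRIMITIVE_TOPOLOGY_PATCH_LIST in its input assembly state and describe the patch size as follows.
const VkPipelineTessellationStateCreateInfo tessState =
{
    VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO, // sType
    NULL, // pNext
    0,    // flags
    16    // patchControlPoints: vertices consumed per patch
};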
20.2. Primitive Order
Primitives generated by drawing commands progress through the stages of the graphics pipeline in primitive order. Primitive order is initially determined in the following way:
-
Submission order determines the initial ordering
-
For indirect draw commands, the order in which accessed instances of the VkDrawIndirectCommand are stored in
buffer
, from lower indirect buffer addresses to higher addresses. -
If a draw command includes multiple instances, the order in which instances are executed, from lower numbered instances to higher.
-
The order in which primitives are specified by a draw command:
-
For non-indexed draws, from vertices with a lower numbered
vertexIndex
to a higher numberedvertexIndex
. -
For indexed draws, vertices sourced from lower index buffer addresses to higher addresses.
-
For draws using mesh shaders, the order is provided by mesh shading.
-
Within this order implementations further sort primitives:
-
If tessellation shading is active, by an implementation-dependent order of new primitives generated by tessellation.
-
If geometry shading is active, by the order new primitives are generated by geometry shading.
-
If the polygon mode is not
VK_POLYGON_MODE_FILL
, orVK_POLYGON_MODE_FILL_RECTANGLE_NV
, by an implementation-dependent ordering of the new primitives generated within the original primitive.
Primitive order is later used to define rasterization order, which determines the order in which fragments output results to a framebuffer.
20.3. Programmable Primitive Shading
Once primitives are assembled, they proceed to the vertex shading stage of the pipeline. If the draw includes multiple instances, then the set of primitives is sent to the vertex shading stage multiple times, once for each instance.
It is implementation-dependent whether vertex shading occurs on vertices that are discarded as part of incomplete primitives, but if it does occur then it operates as if they were vertices in complete primitives and such invocations can have side effects.
Vertex shading receives two per-vertex inputs from the primitive assembly
stage - the vertexIndex
and the instanceIndex
.
How these values are generated is defined below, with each command.
Drawing commands fall roughly into two categories:
-
Non-indexed drawing commands present a sequential
vertexIndex
to the vertex shader. The sequential index is generated automatically by the device (see Fixed-Function Vertex Processing for details on both specifying the vertex attributes indexed byvertexIndex
, as well as binding vertex buffers containing those attributes to a command buffer). These commands are: -
Indexed drawing commands read index values from an index buffer and use this to compute the
vertexIndex
value for the vertex shader. These commands are:
To bind an index buffer to a command buffer, call:
void vkCmdBindIndexBuffer(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
VkIndexType indexType);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer being bound. -
offset
is the starting offset in bytes within buffer
used in index buffer address calculations. -
indexType
is a VkIndexType value specifying whether indices are treated as 16 bits or 32 bits.
Possible values of vkCmdBindIndexBuffer::indexType
, specifying
the size of indices, are:
typedef enum VkIndexType {
VK_INDEX_TYPE_UINT16 = 0,
VK_INDEX_TYPE_UINT32 = 1,
VK_INDEX_TYPE_NONE_NV = 1000165000,
} VkIndexType;
-
VK_INDEX_TYPE_UINT16
specifies that indices are 16-bit unsigned integer values. -
VK_INDEX_TYPE_UINT32
specifies that indices are 32-bit unsigned integer values. -
VK_INDEX_TYPE_NONE_NV
specifies that no indices are provided.
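For example, a buffer holding 16-bit indices could be bound starting at byte offset zero as shown below; the handle indexBuffer is an assumption and is presumed to have been created with VK_BUFFER_USAGE_INDEX_BUFFER_BIT.
// Subsequent indexed draws read uint16_t indices from indexBuffer.
vkCmdBindIndexBuffer(commandBuffer, indexBuffer, 0, VK_INDEX_TYPE_UINT16);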
The parameters for each drawing command are specified directly in the command or read from buffer memory, depending on the command. Drawing commands that source their parameters from buffer memory are known as indirect drawing commands.
All drawing commands interact with the Robust Buffer Access feature.
To record a non-indexed draw, call:
void vkCmdDraw(
VkCommandBuffer commandBuffer,
uint32_t vertexCount,
uint32_t instanceCount,
uint32_t firstVertex,
uint32_t firstInstance);
-
commandBuffer
is the command buffer into which the command is recorded. -
vertexCount
is the number of vertices to draw. -
instanceCount
is the number of instances to draw. -
firstVertex
is the index of the first vertex to draw. -
firstInstance
is the instance ID of the first instance to draw.
When the command is executed, primitives are assembled using the current
primitive topology and vertexCount
consecutive vertex indices with the
first vertexIndex
value equal to firstVertex
.
The primitives are drawn instanceCount
times with instanceIndex
starting with firstInstance
and increasing sequentially for each
instance.
The assembled primitives execute the bound graphics pipeline.
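A minimal sketch: drawing a single non-indexed triangle (three vertices, one instance) with vertex buffers already bound would be recorded as follows.
vkCmdDraw(commandBuffer,
          3,   // vertexCount: vertices 0, 1 and 2 form one triangle
          1,   // instanceCount
          0,   // firstVertex: the first vertexIndex is 0
          0);  // firstInstance: instanceIndex starts at 0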
To record an indexed draw, call:
void vkCmdDrawIndexed(
VkCommandBuffer commandBuffer,
uint32_t indexCount,
uint32_t instanceCount,
uint32_t firstIndex,
int32_t vertexOffset,
uint32_t firstInstance);
-
commandBuffer
is the command buffer into which the command is recorded. -
indexCount
is the number of vertices to draw. -
instanceCount
is the number of instances to draw. -
firstIndex
is the base index within the index buffer. -
vertexOffset
is the value added to the vertex index before indexing into the vertex buffer. -
firstInstance
is the instance ID of the first instance to draw.
When the command is executed, primitives are assembled using the current
primitive topology and indexCount
vertices whose indices are retrieved
from the index buffer.
The index buffer is treated as an array of tightly packed unsigned integers
of size defined by the vkCmdBindIndexBuffer::indexType
parameter
with which the buffer was bound.
The first vertex index is at an offset of firstIndex
* indexSize
+ offset
within the bound index buffer, where offset
is the
offset specified by vkCmdBindIndexBuffer
and indexSize
is the
byte size of the type specified by indexType
.
Subsequent index values are retrieved from consecutive locations in the
index buffer.
Indices are first compared to the primitive restart value, then zero
extended to 32 bits (if the indexType
is VK_INDEX_TYPE_UINT16
)
and have vertexOffset
added to them, before being supplied as the
vertexIndex
value.
The primitives are drawn instanceCount
times with instanceIndex
starting with firstInstance
and increasing sequentially for each
instance.
The assembled primitives execute the bound graphics pipeline.
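As a sketch of the address calculation above, drawing 36 indexed vertices (for example, a twelve-triangle cube; the counts and the indexBuffer handle are assumptions) from an index buffer bound with VK_INDEX_TYPE_UINT16 at offset zero reads its first index at byte 0 of the buffer.
vkCmdBindIndexBuffer(commandBuffer, indexBuffer, 0, VK_INDEX_TYPE_UINT16);
vkCmdDrawIndexed(commandBuffer,
                 36,  // indexCount
                 1,   // instanceCount
                 0,   // firstIndex: first index read at offset + 0 * sizeof(uint16_t)
                 0,   // vertexOffset: added to each index before the vertex fetch
                 0);  // firstInstance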
To record a non-indexed indirect draw, call:
void vkCmdDrawIndirect(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
uint32_t drawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
drawCount
is the number of draws to execute, and can be zero. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawIndirect
behaves similarly to vkCmdDraw except that the
parameters are read by the device from a buffer during execution.
drawCount
draws are executed by the command, with parameters taken
from buffer
starting at offset
and increasing by stride
bytes for each successive draw.
The parameters of each draw are encoded in an array of
VkDrawIndirectCommand structures.
If drawCount
is less than or equal to one, stride
is ignored.
The VkDrawIndirectCommand
structure is defined as:
typedef struct VkDrawIndirectCommand {
uint32_t vertexCount;
uint32_t instanceCount;
uint32_t firstVertex;
uint32_t firstInstance;
} VkDrawIndirectCommand;
-
vertexCount
is the number of vertices to draw. -
instanceCount
is the number of instances to draw. -
firstVertex
is the index of the first vertex to draw. -
firstInstance
is the instance ID of the first instance to draw.
The members of VkDrawIndirectCommand
have the same meaning as the
similarly named parameters of vkCmdDraw.
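For illustration, an application might write one VkDrawIndirectCommand into a mapped, host-visible buffer and then record a single indirect draw referencing it. The names mappedIndirectMemory and indirectBuffer are assumptions; the buffer is presumed to have been created with VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT.
// Fill the draw parameters on the host.
VkDrawIndirectCommand *drawParams = (VkDrawIndirectCommand *)mappedIndirectMemory;
drawParams[0].vertexCount   = 3;
drawParams[0].instanceCount = 1;
drawParams[0].firstVertex   = 0;
drawParams[0].firstInstance = 0;
// Record the indirect draw; the device reads the parameters at execution time.
vkCmdDrawIndirect(commandBuffer, indirectBuffer,
                  0,                              // offset of drawParams[0]
                  1,                              // drawCount
                  sizeof(VkDrawIndirectCommand)); // stride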
To record a non-indexed draw call with a draw call count sourced from a buffer, call:
void vkCmdDrawIndirectCountKHR(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
VkBuffer countBuffer,
VkDeviceSize countBufferOffset,
uint32_t maxDrawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
countBuffer
is the buffer containing the draw count. -
countBufferOffset
is the byte offset into countBuffer
where the draw count begins. -
maxDrawCount
specifies the maximum number of draws that will be executed. The actual number of executed draw calls is the minimum of the count specified in countBuffer and maxDrawCount
. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawIndirectCountKHR
behaves similarly to vkCmdDrawIndirect
except that the draw count is read by the device from a buffer during
execution.
The command will read an unsigned 32-bit integer from countBuffer
located at countBufferOffset
and use this as the draw count.
To record a non-indexed draw call with a draw call count sourced from a buffer, call:
void vkCmdDrawIndirectCountAMD(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
VkBuffer countBuffer,
VkDeviceSize countBufferOffset,
uint32_t maxDrawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
countBuffer
is the buffer containing the draw count. -
countBufferOffset
is the byte offset into countBuffer
where the draw count begins. -
maxDrawCount
specifies the maximum number of draws that will be executed. The actual number of executed draw calls is the minimum of the count specified in countBuffer and maxDrawCount
. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawIndirectCountAMD
behaves similarly to vkCmdDrawIndirect
except that the draw count is read by the device from a buffer during
execution.
The command will read an unsigned 32-bit integer from countBuffer
located at countBufferOffset
and use this as the draw count.
To record an indexed indirect draw, call:
void vkCmdDrawIndexedIndirect(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
uint32_t drawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
drawCount
is the number of draws to execute, and can be zero. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawIndexedIndirect
behaves similarly to vkCmdDrawIndexed
except that the parameters are read by the device from a buffer during
execution.
drawCount
draws are executed by the command, with parameters taken
from buffer
starting at offset
and increasing by stride
bytes for each successive draw.
The parameters of each draw are encoded in an array of
VkDrawIndexedIndirectCommand structures.
If drawCount
is less than or equal to one, stride
is ignored.
The VkDrawIndexedIndirectCommand
structure is defined as:
typedef struct VkDrawIndexedIndirectCommand {
uint32_t indexCount;
uint32_t instanceCount;
uint32_t firstIndex;
int32_t vertexOffset;
uint32_t firstInstance;
} VkDrawIndexedIndirectCommand;
-
indexCount
is the number of vertices to draw. -
instanceCount
is the number of instances to draw. -
firstIndex
is the base index within the index buffer. -
vertexOffset
is the value added to the vertex index before indexing into the vertex buffer. -
firstInstance
is the instance ID of the first instance to draw.
The members of VkDrawIndexedIndirectCommand
have the same meaning as
the similarly named parameters of vkCmdDrawIndexed.
To record an indexed draw call with a draw call count sourced from a buffer, call:
void vkCmdDrawIndexedIndirectCountKHR(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
VkBuffer countBuffer,
VkDeviceSize countBufferOffset,
uint32_t maxDrawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
countBuffer
is the buffer containing the draw count. -
countBufferOffset
is the byte offset into countBuffer
where the draw count begins. -
maxDrawCount
specifies the maximum number of draws that will be executed. The actual number of executed draw calls is the minimum of the count specified in countBuffer and maxDrawCount
. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawIndexedIndirectCountKHR
behaves similarly to
vkCmdDrawIndexedIndirect except that the draw count is read by the
device from a buffer during execution.
The command will read an unsigned 32-bit integer from countBuffer
located at countBufferOffset
and use this as the draw count.
To record an indexed draw call with a draw call count sourced from a buffer, call:
void vkCmdDrawIndexedIndirectCountAMD(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
VkBuffer countBuffer,
VkDeviceSize countBufferOffset,
uint32_t maxDrawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
countBuffer
is the buffer containing the draw count. -
countBufferOffset
is the byte offset into countBuffer
where the draw count begins. -
maxDrawCount
specifies the maximum number of draws that will be executed. The actual number of executed draw calls is the minimum of the count specified in countBuffer and maxDrawCount
. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawIndexedIndirectCountAMD
behaves similarly to
vkCmdDrawIndexedIndirect except that the draw count is read by the
device from a buffer during execution.
The command will read an unsigned 32-bit integer from countBuffer
located at countBufferOffset
and use this as the draw count.
20.3.1. Drawing Transform Feedback
It is possible to draw vertex data that was previously captured during
active transform feedback by binding
one or more of the transform feedback buffers as vertex buffers.
A pipeline barrier is required between using the buffers as transform
feedback buffers and vertex buffers to ensure all writes to the transform
feedback buffers are visible when the data is read as vertex attributes.
The source access is VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT
and
the destination access is VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT
for the
pipeline stages VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT
and
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT
respectively.
The value written to the counter buffer by
vkCmdEndTransformFeedbackEXT can be used to determine the vertex
count for the draw.
A pipeline barrier is required between using the counter buffer for
vkCmdEndTransformFeedbackEXT
and vkCmdDrawIndirectByteCountEXT
where the source access is
VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT
and the destination
access is VK_ACCESS_INDIRECT_COMMAND_READ_BIT
for the pipeline stages
VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT
and
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
respectively.
To record a non-indexed draw call, where the vertex count is based on a byte count read from a buffer and the passed in vertex stride parameter, call:
void vkCmdDrawIndirectByteCountEXT(
VkCommandBuffer commandBuffer,
uint32_t instanceCount,
uint32_t firstInstance,
VkBuffer counterBuffer,
VkDeviceSize counterBufferOffset,
uint32_t counterOffset,
uint32_t vertexStride);
-
commandBuffer
is the command buffer into which the command is recorded. -
instanceCount
is the number of instances to draw. -
firstInstance
is the instance ID of the first instance to draw. -
counterBuffer
is the buffer handle from where the byte count is read. -
counterBufferOffset
is the offset into the buffer used to read the byte count, which is used to calculate the vertex count for this draw call. -
counterOffset
is subtracted from the byte count read from the counterBuffer at the counterBufferOffset.
-
vertexStride
is the stride in bytes between each element of the vertex data that is used to calculate the vertex count from the counter value. This value is typically the same value that was used in the graphics pipeline state when the transform feedback was captured as the XfbStride
.
When the command is executed, primitives are assembled in the same way as
done with vkCmdDraw except the vertexCount
is calculated based
on the byte count read from counterBuffer
at offset
counterBufferOffset
.
The assembled primitives execute the bound graphics pipeline.
The effective vertexCount
is calculated as follows:
const uint32_t * counterBufferPtr = (const uint32_t *)((const uint8_t *)counterBuffer.address + counterBufferOffset);
vertexCount = floor(max(0, (*counterBufferPtr - counterOffset)) / vertexStride);
The effective firstVertex
is zero.
20.4. Conditional Rendering
Certain rendering commands can be executed conditionally based on a value in buffer memory. These rendering commands are limited to drawing commands, dispatching commands, and clearing attachments with vkCmdClearAttachments within a conditional rendering block which is defined by commands vkCmdBeginConditionalRenderingEXT and vkCmdEndConditionalRenderingEXT. Other rendering commands remain unaffected by conditional rendering.
After beginning conditional rendering, it is considered active within the command buffer in which it was begun until it is ended with vkCmdEndConditionalRenderingEXT.
Conditional rendering must begin and end in the same command buffer.
When conditional rendering is active, a primary command buffer can execute
secondary command buffers if the
inherited conditional
rendering feature is enabled.
For a secondary command buffer to be executed while conditional rendering is
active in the primary command buffer, it must set the
conditionalRenderingEnable
flag of
VkCommandBufferInheritanceConditionalRenderingInfoEXT, as described in
the Command Buffer Recording section.
Conditional rendering must also either begin and end inside the same subpass of a render pass instance, or must both begin and end outside of a render pass instance (i.e. contain entire render pass instances).
To begin conditional rendering, call:
void vkCmdBeginConditionalRenderingEXT(
VkCommandBuffer commandBuffer,
const VkConditionalRenderingBeginInfoEXT* pConditionalRenderingBegin);
-
commandBuffer
is the command buffer into which this command will be recorded. -
pConditionalRenderingBegin
is a pointer to an instance of the VkConditionalRenderingBeginInfoEXT structure specifying the parameters of conditional rendering.
The VkConditionalRenderingBeginInfoEXT
structure is defined as:
typedef struct VkConditionalRenderingBeginInfoEXT {
VkStructureType sType;
const void* pNext;
VkBuffer buffer;
VkDeviceSize offset;
VkConditionalRenderingFlagsEXT flags;
} VkConditionalRenderingBeginInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
buffer
is a buffer containing the predicate for conditional rendering. -
offset
is the byte offset into buffer
where the predicate is located. -
flags
is a bitmask of VkConditionalRenderingFlagsEXT specifying the behavior of conditional rendering.
If the 32-bit value at offset
in buffer
memory is zero, then the
rendering commands are discarded, otherwise they are executed as normal.
If the value of the predicate in buffer memory changes while conditional
rendering is active, the rendering commands may be discarded in an
implementation-dependent way.
Some implementations may latch the value of the predicate upon beginning
conditional rendering while others may read it before every rendering
command.
Bits which can be set in
vkCmdBeginConditionalRenderingEXT::flags
specifying the behavior
of conditional rendering are:
typedef enum VkConditionalRenderingFlagBitsEXT {
VK_CONDITIONAL_RENDERING_INVERTED_BIT_EXT = 0x00000001,
} VkConditionalRenderingFlagBitsEXT;
-
VK_CONDITIONAL_RENDERING_INVERTED_BIT_EXT
specifies the condition used to determine whether to discard rendering commands or not. That is, if the 32-bit predicate read from buffer memory at offset is zero, the rendering commands are not discarded, and if it is non-zero, they are discarded.
typedef VkFlags VkConditionalRenderingFlagsEXT;
VkConditionalRenderingFlagsEXT
is a bitmask type for setting a mask of
zero or more VkConditionalRenderingFlagBitsEXT.
To end conditional rendering, call:
void vkCmdEndConditionalRenderingEXT(
VkCommandBuffer commandBuffer);
-
commandBuffer
is the command buffer into which this command will be recorded.
Once ended, conditional rendering becomes inactive.
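A sketch of a predicated draw, assuming predicateBuffer holds a 32-bit value written earlier (for example, by an occlusion test); the handle is an assumption.
const VkConditionalRenderingBeginInfoEXT condInfo =
{
    VK_STRUCTURE_TYPE_CONDITIONAL_RENDERING_BEGIN_INFO_EXT, // sType
    NULL,            // pNext
    predicateBuffer, // buffer holding the 32-bit predicate
    0,               // offset
    0                // flags: the draw executes when the predicate is non-zero
};
vkCmdBeginConditionalRenderingEXT(commandBuffer, &condInfo);
vkCmdDraw(commandBuffer, 3, 1, 0, 0); // discarded if the predicate is zero
vkCmdEndConditionalRenderingEXT(commandBuffer);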
20.5. Programmable Mesh Shading
In this drawing approach, primitives are assembled by the mesh shader stage. Mesh shading operates similarly to dispatching compute as the shaders make use of workgroups.
To record a draw that uses the mesh pipeline, call:
void vkCmdDrawMeshTasksNV(
VkCommandBuffer commandBuffer,
uint32_t taskCount,
uint32_t firstTask);
-
commandBuffer
is the command buffer into which the command will be recorded. -
taskCount
is the number of local workgroups to dispatch in the X dimension. Y and Z dimension are implicitly set to one. -
firstTask
is the X component of the first workgroup ID.
When the command is executed, a global workgroup consisting of
taskCount
local workgroups is assembled.
To record an indirect mesh tasks draw, call:
void vkCmdDrawMeshTasksIndirectNV(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
uint32_t drawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
drawCount
is the number of draws to execute, and can be zero. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawMeshTasksIndirectNV
behaves similarly to
vkCmdDrawMeshTasksNV except that the parameters are read by the device
from a buffer during execution.
drawCount
draws are executed by the command, with parameters taken
from buffer
starting at offset
and increasing by stride
bytes for each successive draw.
The parameters of each draw are encoded in an array of
VkDrawMeshTasksIndirectCommandNV structures.
If drawCount
is less than or equal to one, stride
is ignored.
The VkDrawMeshTasksIndirectCommandNV
structure is defined as:
typedef struct VkDrawMeshTasksIndirectCommandNV {
uint32_t taskCount;
uint32_t firstTask;
} VkDrawMeshTasksIndirectCommandNV;
-
taskCount
is the number of local workgroups to dispatch in the X dimension. Y and Z dimension are implicitly set to one. -
firstTask
is the X component of the first workgroup ID.
The members of VkDrawMeshTasksIndirectCommandNV
have the same meaning
as the similarly named parameters of vkCmdDrawMeshTasksNV.
To record an indirect mesh tasks draw with the draw count sourced from a buffer, call:
void vkCmdDrawMeshTasksIndirectCountNV(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset,
VkBuffer countBuffer,
VkDeviceSize countBufferOffset,
uint32_t maxDrawCount,
uint32_t stride);
-
commandBuffer
is the command buffer into which the command is recorded. -
buffer
is the buffer containing draw parameters. -
offset
is the byte offset into buffer
where parameters begin. -
countBuffer
is the buffer containing the draw count. -
countBufferOffset
is the byte offset into countBuffer
where the draw count begins. -
maxDrawCount
specifies the maximum number of draws that will be executed. The actual number of executed draw calls is the minimum of the count specified in countBuffer and maxDrawCount
. -
stride
is the byte stride between successive sets of draw parameters.
vkCmdDrawMeshTasksIndirectCountNV
behaves similarly to
vkCmdDrawMeshTasksIndirectNV except that the draw count is read by the
device from a buffer during execution.
The command will read an unsigned 32-bit integer from countBuffer
located at countBufferOffset
and use this as the draw count.
21. Fixed-Function Vertex Processing
Vertex fetching is controlled via configurable state, as a logically distinct graphics pipeline stage.
21.1. Vertex Attributes
Vertex shaders can define input variables, which receive vertex attribute
data transferred from one or more VkBuffer
(s) by drawing commands.
Vertex shader input variables are bound to buffers via an indirect binding
where the vertex shader associates a vertex input attribute number with
each variable, vertex input attributes are associated to vertex input
bindings on a per-pipeline basis, and vertex input bindings are associated
with specific buffers on a per-draw basis via the
vkCmdBindVertexBuffers
command.
Vertex input attribute and vertex input binding descriptions also contain
format information controlling how data is extracted from buffer memory and
converted to the format expected by the vertex shader.
There are VkPhysicalDeviceLimits
::maxVertexInputAttributes
number of vertex input attributes and
VkPhysicalDeviceLimits
::maxVertexInputBindings
number of vertex
input bindings (each referred to by zero-based indices), where there are at
least as many vertex input attributes as there are vertex input bindings.
Applications can store multiple vertex input attributes interleaved in a
single buffer, and use a single vertex input binding to access those
attributes.
In GLSL, vertex shaders associate input variables with a vertex input
attribute number using the location
layout qualifier.
The component
layout qualifier associates components of a vertex shader
input variable with components of a vertex input attribute.
// Assign location M to variableName
layout (location=M, component=2) in vec2 variableName;
// Assign locations [N,N+L) to the array elements of variableNameArray
layout (location=N) in vec4 variableNameArray[L];
In SPIR-V, vertex shaders associate input variables with a vertex input
attribute number using the Location
decoration.
The Component
decoration associates components of a vertex shader input
variable with components of a vertex input attribute.
The Location
and Component
decorations are specified via the
OpDecorate
instruction.
...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "variableName"
OpName %15 "variableNameArray"
OpDecorate %18 BuiltIn VertexIndex
OpDecorate %19 BuiltIn InstanceIndex
OpDecorate %9 Location M
OpDecorate %9 Component 2
OpDecorate %15 Location N
...
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeVector %6 2
%8 = OpTypePointer Input %7
%9 = OpVariable %8 Input
%10 = OpTypeVector %6 4
%11 = OpTypeInt 32 0
%12 = OpConstant %11 L
%13 = OpTypeArray %10 %12
%14 = OpTypePointer Input %13
%15 = OpVariable %14 Input
...
21.1.1. Attribute Location and Component Assignment
Vertex shaders allow Location
and Component
decorations on input
variable declarations.
The Location
decoration specifies which vertex input attribute is used
to read and interpret the data that a variable will consume.
The Component
decoration allows the location to be more finely
specified for scalars and vectors, down to the individual components within
a location that are consumed.
The components within a location are 0, 1, 2, and 3.
A variable starting at component N will consume components N, N+1, N+2, …
up through its size.
For single precision types, it is invalid if the sequence of components gets
larger than 3.
When a vertex shader input variable declared using a scalar or vector 32-bit
data type is assigned a location, its value(s) are taken from the components
of the input attribute specified with the corresponding
VkVertexInputAttributeDescription
::location
.
The components used depend on the type of variable and the Component
decoration specified in the variable declaration, as identified in
Input attribute components accessed by 32-bit input variables.
Any 32-bit scalar or vector input will consume a single location.
For 32-bit data types, missing components are filled in with default values
as described below.
32-bit data type | Component decoration | Components consumed
---|---|---
scalar | 0 or unspecified | (x, o, o, o)
scalar | 1 | (o, y, o, o)
scalar | 2 | (o, o, z, o)
scalar | 3 | (o, o, o, w)
two-component vector | 0 or unspecified | (x, y, o, o)
two-component vector | 1 | (o, y, z, o)
two-component vector | 2 | (o, o, z, w)
three-component vector | 0 or unspecified | (x, y, z, o)
three-component vector | 1 | (o, y, z, w)
four-component vector | 0 or unspecified | (x, y, z, w)
Components indicated by “o” are available for use by other input variables which are sourced from the same attribute, and if used, are either filled with the corresponding component from the input format (if present), or the default value.
When a vertex shader input variable declared using a 32-bit floating point
matrix type is assigned a location i, its values are taken from
consecutive input attributes starting with the corresponding
VkVertexInputAttributeDescription
::location
.
Such matrices are treated as an array of column vectors with values taken
from the input attributes identified in Input attributes accessed by 32-bit input matrix variables.
The VkVertexInputAttributeDescription
::format
must be specified
with a VkFormat that corresponds to the appropriate type of column
vector.
The Component
decoration must not be used with matrix types.
Data type | Column vector type | Locations consumed | Components consumed
---|---|---|---
mat2 | two-component vector | i, i+1 | (x, y, o, o), (x, y, o, o)
mat2x3 | three-component vector | i, i+1 | (x, y, z, o), (x, y, z, o)
mat2x4 | four-component vector | i, i+1 | (x, y, z, w), (x, y, z, w)
mat3x2 | two-component vector | i, i+1, i+2 | (x, y, o, o), (x, y, o, o), (x, y, o, o)
mat3 | three-component vector | i, i+1, i+2 | (x, y, z, o), (x, y, z, o), (x, y, z, o)
mat3x4 | four-component vector | i, i+1, i+2 | (x, y, z, w), (x, y, z, w), (x, y, z, w)
mat4x2 | two-component vector | i, i+1, i+2, i+3 | (x, y, o, o), (x, y, o, o), (x, y, o, o), (x, y, o, o)
mat4x3 | three-component vector | i, i+1, i+2, i+3 | (x, y, z, o), (x, y, z, o), (x, y, z, o), (x, y, z, o)
mat4 | four-component vector | i, i+1, i+2, i+3 | (x, y, z, w), (x, y, z, w), (x, y, z, w), (x, y, z, w)
Components indicated by “o” are available for use by other input variables which are sourced from the same attribute, and if used, are either filled with the corresponding component from the input (if present), or the default value.
When a vertex shader input variable declared using a scalar or vector 64-bit
data type is assigned a location i, its values are taken from consecutive
input attributes starting with the corresponding
VkVertexInputAttributeDescription
::location
.
The locations and components used depend on the type of variable and the
Component
decoration specified in the variable declaration, as
identified in Input attribute locations and components accessed by 64-bit input variables.
For 64-bit data types, no default attribute values are provided.
Input variables must not use more components than provided by the
attribute.
Input attributes which have one- or two-component 64-bit formats will
consume a single location.
Input attributes which have three- or four-component 64-bit formats will
consume two consecutive locations.
A 64-bit scalar data type will consume two components, and a 64-bit
two-component vector data type will consume all four components available
within a location.
A three- or four-component 64-bit data type must not specify a component.
A three-component 64-bit data type will consume all four components of the
first location and components 0 and 1 of the second location.
This leaves components 2 and 3 available for other component-qualified
declarations.
A four-component 64-bit data type will consume all four components of the
first location and all four components of the second location.
It is invalid for a scalar or two-component 64-bit data type to specify a
component of 1 or 3.
Input format | Locations consumed | 64-bit data type | Location decoration | Component decoration | 32-bit components consumed
---|---|---|---|---|---
R64 | i | scalar | i | 0 or unspecified | (x, y, -, -)
R64G64 | i | scalar | i | 0 or unspecified | (x, y, o, o)
 | | scalar | i | 2 | (o, o, z, w)
 | | two-component vector | i | 0 or unspecified | (x, y, z, w)
R64G64B64 | i, i+1 | scalar | i | 0 or unspecified | (x, y, o, o), (o, o, -, -)
 | | scalar | i | 2 | (o, o, z, w), (o, o, -, -)
 | | scalar | i+1 | 0 or unspecified | (o, o, o, o), (x, y, -, -)
 | | two-component vector | i | 0 or unspecified | (x, y, z, w), (o, o, -, -)
 | | three-component vector | i | unspecified | (x, y, z, w), (x, y, -, -)
R64G64B64A64 | i, i+1 | scalar | i | 0 or unspecified | (x, y, o, o), (o, o, o, o)
 | | scalar | i | 2 | (o, o, z, w), (o, o, o, o)
 | | scalar | i+1 | 0 or unspecified | (o, o, o, o), (x, y, o, o)
 | | scalar | i+1 | 2 | (o, o, o, o), (o, o, z, w)
 | | two-component vector | i | 0 or unspecified | (x, y, z, w), (o, o, o, o)
 | | two-component vector | i+1 | 0 or unspecified | (o, o, o, o), (x, y, z, w)
 | | three-component vector | i | unspecified | (x, y, z, w), (x, y, o, o)
 | | four-component vector | i | unspecified | (x, y, z, w), (x, y, z, w)
Components indicated by “o” are available for use by other input variables which are sourced from the same attribute. Components indicated by “-” are not available for input variables as there are no default values provided for 64-bit data types, and there is no data provided by the input format.
When a vertex shader input variable declared using a 64-bit floating-point matrix type is assigned a location i, its values are taken from consecutive input attribute locations. Such matrices are treated as an array of column vectors with values taken from the input attributes as shown in Input attribute locations and components accessed by 64-bit input variables. Each column vector starts at the location immediately following the last location of the previous column vector. The number of attributes and components assigned to each matrix is determined by the matrix dimensions and ranges from two to eight locations.
When a vertex shader input variable declared using an array type is assigned
a location, its values are taken from consecutive input attributes starting
with the corresponding
VkVertexInputAttributeDescription
::location
.
The number of attributes and components assigned to each element are
determined according to the data type of the array elements and
Component
decoration (if any) specified in the declaration of the
array, as described above.
Each element of the array, in order, is assigned to consecutive locations,
but all at the same specified component within each location.
Only input variables declared with the data types and component decorations as specified above are supported. Location aliasing occurs when two variables have the same location number; component aliasing occurs when two location aliases are assigned the same (or overlapping) component number. Location aliasing is allowed only if it does not cause component aliasing. Further, when location aliasing, the aliases sharing the location must all have the same SPIR-V floating-point component type or all have the same width integer-type components.
21.2. Vertex Input Description
Applications specify vertex input attribute and vertex input binding
descriptions as part of graphics pipeline creation.
The VkGraphicsPipelineCreateInfo::pVertexInputState
points to a
structure of type VkPipelineVertexInputStateCreateInfo
.
The VkPipelineVertexInputStateCreateInfo
structure is defined as:
typedef struct VkPipelineVertexInputStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineVertexInputStateCreateFlags flags;
uint32_t vertexBindingDescriptionCount;
const VkVertexInputBindingDescription* pVertexBindingDescriptions;
uint32_t vertexAttributeDescriptionCount;
const VkVertexInputAttributeDescription* pVertexAttributeDescriptions;
} VkPipelineVertexInputStateCreateInfo;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
vertexBindingDescriptionCount
is the number of vertex binding descriptions provided in pVertexBindingDescriptions
. -
pVertexBindingDescriptions
is a pointer to an array of VkVertexInputBindingDescription
structures. -
vertexAttributeDescriptionCount
is the number of vertex attribute descriptions provided in pVertexAttributeDescriptions
. -
pVertexAttributeDescriptions
is a pointer to an array of VkVertexInputAttributeDescription
structures.
typedef VkFlags VkPipelineVertexInputStateCreateFlags;
VkPipelineVertexInputStateCreateFlags
is a bitmask type for setting a
mask, but is currently reserved for future use.
Each vertex input binding is specified by an instance of the
VkVertexInputBindingDescription
structure.
The VkVertexInputBindingDescription
structure is defined as:
typedef struct VkVertexInputBindingDescription {
uint32_t binding;
uint32_t stride;
VkVertexInputRate inputRate;
} VkVertexInputBindingDescription;
-
binding
is the binding number that this structure describes. -
stride
is the distance in bytes between two consecutive elements within the buffer. -
inputRate
is a VkVertexInputRate value specifying whether vertex attribute addressing is a function of the vertex index or of the instance index.
Possible values of VkVertexInputBindingDescription::inputRate
,
specifying the rate at which vertex attributes are pulled from buffers, are:
typedef enum VkVertexInputRate {
VK_VERTEX_INPUT_RATE_VERTEX = 0,
VK_VERTEX_INPUT_RATE_INSTANCE = 1,
} VkVertexInputRate;
-
VK_VERTEX_INPUT_RATE_VERTEX
specifies that vertex attribute addressing is a function of the vertex index. -
VK_VERTEX_INPUT_RATE_INSTANCE
specifies that vertex attribute addressing is a function of the instance index.
Each vertex input attribute is specified by an instance of the
VkVertexInputAttributeDescription
structure.
The VkVertexInputAttributeDescription
structure is defined as:
typedef struct VkVertexInputAttributeDescription {
uint32_t location;
uint32_t binding;
VkFormat format;
uint32_t offset;
} VkVertexInputAttributeDescription;
-
location
is the shader binding location number for this attribute. -
binding
is the binding number which this attribute takes its data from. -
format
is the size and type of the vertex attribute data. -
offset
is a byte offset of this attribute relative to the start of an element in the vertex input binding.
To bind vertex buffers to a command buffer for use in subsequent draw commands, call:
void vkCmdBindVertexBuffers(
VkCommandBuffer commandBuffer,
uint32_t firstBinding,
uint32_t bindingCount,
const VkBuffer* pBuffers,
const VkDeviceSize* pOffsets);
-
commandBuffer
is the command buffer into which the command is recorded. -
firstBinding
is the index of the first vertex input binding whose state is updated by the command. -
bindingCount
is the number of vertex input bindings whose state is updated by the command. -
pBuffers
is a pointer to an array of buffer handles. -
pOffsets
is a pointer to an array of buffer offsets.
The values taken from elements i of pBuffers
and pOffsets
replace the current state for the vertex input binding
firstBinding
+ i, for i in [0,
bindingCount
).
The vertex input binding is updated to start at the offset indicated by
pOffsets
[i] from the start of the buffer pBuffers
[i].
All vertex input attributes that use each of these bindings will use these
updated addresses in their address calculations for subsequent draw
commands.
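For example, a hypothetical pipeline that reads per-vertex positions from binding 0 and per-instance data from binding 1 could bind both buffers with a single call; the handles positionBuffer and instanceBuffer are assumptions.
const VkBuffer     buffers[2] = { positionBuffer, instanceBuffer };
const VkDeviceSize offsets[2] = { 0, 0 };
vkCmdBindVertexBuffers(commandBuffer,
                       0,        // firstBinding
                       2,        // bindingCount
                       buffers,  // pBuffers
                       offsets); // pOffsets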
21.3. Vertex Attribute Divisor in Instanced Rendering
If the vertexAttributeInstanceRateDivisor feature is enabled and the pNext
chain of
VkPipelineVertexInputStateCreateInfo includes a
VkPipelineVertexInputDivisorStateCreateInfoEXT
structure, then that
structure controls how vertex attributes are assigned to an instance when
instanced rendering is enabled.
The VkPipelineVertexInputDivisorStateCreateInfoEXT
structure is
defined as:
typedef struct VkPipelineVertexInputDivisorStateCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t vertexBindingDivisorCount;
const VkVertexInputBindingDivisorDescriptionEXT* pVertexBindingDivisors;
} VkPipelineVertexInputDivisorStateCreateInfoEXT;
-
sType
is the type of this structure -
pNext
is NULL
or a pointer to an extension-specific structure -
vertexBindingDivisorCount
is the number of elements in the pVertexBindingDivisors
array. -
pVertexBindingDivisors
is a pointer to an array of VkVertexInputBindingDivisorDescriptionEXT
structures, which specifies the divisor value for each binding.
The individual divisor values per binding are specified using the
VkVertexInputBindingDivisorDescriptionEXT
structure which is defined
as:
typedef struct VkVertexInputBindingDivisorDescriptionEXT {
uint32_t binding;
uint32_t divisor;
} VkVertexInputBindingDivisorDescriptionEXT;
-
binding
is the binding number for which the divisor is specified. -
divisor
is the number of successive instances that will use the same value of the vertex attribute when instanced rendering is enabled. For example, if the divisor is N, the same vertex attribute will be applied to N successive instances before moving on to the next vertex attribute. The maximum value of divisor is implementation-dependent and can be queried using VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT::maxVertexAttribDivisor. A value of 0 can be used for the divisor if the vertexAttributeInstanceRateZeroDivisor
feature is enabled. In this case, the same vertex attribute will be applied to all instances.
If this structure is not used to define a divisor value for an attribute then the divisor has a logical default value of 1.
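As a sketch, making four successive instances share each value of a per-instance attribute on binding 1 could be expressed with a divisor of 4; the binding number and divisor value are assumptions chosen for illustration.
const VkVertexInputBindingDivisorDescriptionEXT divisorDesc =
{
    1,  // binding: the per-instance vertex input binding
    4   // divisor: four successive instances share each attribute value
};
const VkPipelineVertexInputDivisorStateCreateInfoEXT divisorState =
{
    VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_DIVISOR_STATE_CREATE_INFO_EXT, // sType
    NULL,         // pNext
    1,            // vertexBindingDivisorCount
    &divisorDesc  // pVertexBindingDivisors
};
// divisorState would then be chained into
// VkPipelineVertexInputStateCreateInfo::pNext at pipeline creation time.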
The address of each attribute for each vertexIndex
and
instanceIndex
is calculated as follows:
- Let attribDesc be the member of VkPipelineVertexInputStateCreateInfo::pVertexAttributeDescriptions with VkVertexInputAttributeDescription::location equal to the vertex input attribute number.
- Let bindingDesc be the member of VkPipelineVertexInputStateCreateInfo::pVertexBindingDescriptions with VkVertexInputBindingDescription::binding equal to attribDesc.binding.
- Let vertexIndex be the index of the vertex within the draw (a value between firstVertex and firstVertex + vertexCount for vkCmdDraw, or a value taken from the index buffer for vkCmdDrawIndexed), and let instanceIndex be the instance number of the draw (a value between firstInstance and firstInstance + instanceCount).
- Let divisor be the member of VkPipelineVertexInputDivisorStateCreateInfoEXT::pVertexBindingDivisors with VkVertexInputBindingDivisorDescriptionEXT::binding equal to attribDesc.binding.
bufferBindingAddress = buffer[binding].baseAddress + offset[binding];
if (bindingDesc.inputRate == VK_VERTEX_INPUT_RATE_VERTEX)
vertexOffset = vertexIndex * bindingDesc.stride;
else
if (divisor == 0)
vertexOffset = firstInstance * bindingDesc.stride;
else
vertexOffset = (firstInstance + ((instanceIndex - firstInstance) / divisor)) * bindingDesc.stride;
attribAddress = bufferBindingAddress + vertexOffset + attribDesc.offset;
For each attribute, raw data is extracted starting at attribAddress
and is
converted from the VkVertexInputAttributeDescription
’s format
to
either to floating-point, unsigned integer, or signed integer based on the
base type of the format; the base type of the format must match the base
type of the input variable in the shader.
If format
is a packed format, attribAddress
must be a multiple of
the size in bytes of the whole attribute data type as described in
Packed Formats.
Otherwise, attribAddress
must be a multiple of the size in bytes of the
component type indicated by format
(see Formats).
If the format does not include G, B, or A components, then those are filled
with (0,0,1) as needed (using either 1.0f or integer 1 based on the
format) for attributes that are not 64-bit data types.
The number of components in the vertex shader input variable need not
exactly match the number of components in the format.
If the vertex shader has fewer components, the extra components are
discarded.
21.4. Example
To create a graphics pipeline that uses the following vertex description:
struct Vertex
{
float x, y, z, w;
uint8_t u, v;
};
The application could use the following set of structures:
const VkVertexInputBindingDescription binding =
{
0, // binding
sizeof(Vertex), // stride
VK_VERTEX_INPUT_RATE_VERTEX // inputRate
};
const VkVertexInputAttributeDescription attributes[] =
{
{
0, // location
binding.binding, // binding
VK_FORMAT_R32G32B32A32_SFLOAT, // format
0 // offset
},
{
1, // location
binding.binding, // binding
VK_FORMAT_R8G8_UNORM, // format
4 * sizeof(float) // offset
}
};
const VkPipelineVertexInputStateCreateInfo viInfo =
{
VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO, // sType
NULL, // pNext
0, // flags
1, // vertexBindingDescriptionCount
&binding, // pVertexBindingDescriptions
2, // vertexAttributeDescriptionCount
&attributes[0] // pVertexAttributeDescriptions
};
22. Tessellation
Tessellation involves three pipeline stages. First, a tessellation control shader transforms control points of a patch and can produce per-patch data. Second, a fixed-function tessellator generates multiple primitives corresponding to a tessellation of the patch in (u,v) or (u,v,w) parameter space. Third, a tessellation evaluation shader transforms the vertices of the tessellated patch, for example to compute their positions and attributes as part of the tessellated surface. The tessellator is enabled when the pipeline contains both a tessellation control shader and a tessellation evaluation shader.
22.1. Tessellator
If a pipeline includes both tessellation shaders (control and evaluation),
the tessellator consumes each input patch (after vertex shading) and
produces a new set of independent primitives (points, lines, or triangles).
These primitives are logically produced by subdividing a geometric primitive
(rectangle or triangle) according to the per-patch outer and inner
tessellation levels written by the tessellation control shader.
These levels are specified using the built-in
variables TessLevelOuter
and TessLevelInner
, respectively.
This subdivision is performed in an implementation-dependent manner.
If no tessellation shaders are present in the pipeline, the tessellator is
disabled and incoming primitives are passed through without modification.
The type of subdivision performed by the tessellator is specified by an
OpExecutionMode
instruction in the tessellation evaluation or
tessellation control shader using one of execution modes Triangles
,
Quads
, and IsoLines
.
Other tessellation-related execution modes can also be specified in either
the tessellation control or tessellation evaluation shaders, and if they are
specified in both then the modes must be the same.
Tessellation execution modes include:
- Triangles, Quads, and IsoLines. These control the type of subdivision and topology of the output primitives. One mode must be set in at least one of the tessellation shader stages.
- VertexOrderCw and VertexOrderCcw. These control the orientation of triangles generated by the tessellator. One mode must be set in at least one of the tessellation shader stages.
- PointMode. Controls generation of points rather than triangles or lines. This functionality defaults to disabled, and is enabled if either shader stage includes the execution mode.
- SpacingEqual, SpacingFractionalEven, and SpacingFractionalOdd. Controls the spacing of segments on the edges of tessellated primitives. One mode must be set in at least one of the tessellation shader stages.
- OutputVertices. Controls the size of the output patch of the tessellation control shader. One value must be set in at least one of the tessellation shader stages.
For triangles, the tessellator subdivides a triangle primitive into smaller
triangles.
For quads, the tessellator subdivides a rectangle primitive into smaller
triangles.
For isolines, the tessellator subdivides a rectangle primitive into a
collection of line segments arranged in strips stretching across the
rectangle in the u dimension (i.e. the coordinates in TessCoord
are of the form (0,x) through (1,x) for all tessellation evaluation shader
invocations that share a line).
Each vertex produced by the tessellator has an associated (u,v,w) or (u,v)
position in a normalized parameter space, with parameter values in the range
[0,1], as illustrated
in figures Domain parameterization for tessellation primitive modes (upper-left origin) and
Domain parameterization for tessellation primitive modes (lower-left origin).
The domain space can have either an upper-left or lower-left origin,
selected by the domainOrigin
member of
VkPipelineTessellationDomainOriginStateCreateInfo.
For triangles, the vertex’s position is a barycentric coordinate (u,v,w), where u + v + w = 1.0, and indicates the relative influence of the three vertices of the triangle on the position of the vertex. For quads and isolines, the position is a (u,v) coordinate indicating the relative horizontal and vertical position of the vertex relative to the subdivided rectangle. The subdivision process is explained in more detail in subsequent sections.
22.2. Tessellator Patch Discard
A patch is discarded by the tessellator if any relevant outer tessellation level is less than or equal to zero.
Patches will also be discarded if any relevant outer tessellation level corresponds to a floating-point NaN (not a number) in implementations supporting NaN.
No new primitives are generated and the tessellation evaluation shader is
not executed for patches that are discarded.
For Quads
, all four outer levels are relevant.
For Triangles
and IsoLines
, only the first three or two outer
levels, respectively, are relevant.
Negative inner levels will not cause a patch to be discarded; they will be
clamped as described below.
22.3. Tessellator Spacing
Each of the tessellation levels is used to determine the number and spacing
of segments used to subdivide a corresponding edge.
The method used to derive the number and spacing of segments is specified by
an OpExecutionMode
in the tessellation control or tessellation
evaluation shader using one of the identifiers SpacingEqual
,
SpacingFractionalEven
, or SpacingFractionalOdd
.
If SpacingEqual
is used, the floating-point tessellation level is first
clamped to [1, maxLevel
], where maxLevel
is the
implementation-dependent maximum tessellation level
(VkPhysicalDeviceLimits
::maxTessellationGenerationLevel
).
The result is rounded up to the nearest integer n, and the
corresponding edge is divided into n segments of equal length in (u,v)
space.
If SpacingFractionalEven
is used, the tessellation level is first
clamped to [2, maxLevel
] and then rounded up to the nearest even
integer n.
If SpacingFractionalOdd
is used, the tessellation level is clamped to
[1, maxLevel
- 1] and then rounded up to the nearest odd integer
n.
If n is one, the edge will not be subdivided.
Otherwise, the corresponding edge will be divided into n - 2 segments
of equal length, and two additional segments of equal length that are
typically shorter than the other segments.
The length of the two additional segments relative to the others will
decrease monotonically with n - f, where f is the clamped
floating-point tessellation level.
When n - f is zero, the additional segments will have equal length to
the other segments.
As n - f approaches 2.0, the relative length of the additional
segments approaches zero.
The two additional segments must be placed symmetrically on opposite sides
of the subdivided edge.
The relative location of these two segments is implementation-dependent, but
must be identical for any pair of subdivided edges with identical values of
f.
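As an informative sketch (not part of the Specification), the number of segments n produced for a single edge under each spacing mode can be computed as follows. maxLevel stands for VkPhysicalDeviceLimits::maxTessellationGenerationLevel, and the enum and function names are illustrative only:
#include <math.h>

typedef enum { SPACING_EQUAL, SPACING_FRACTIONAL_EVEN, SPACING_FRACTIONAL_ODD } Spacing;

// Number of segments generated for one edge from a floating-point
// tessellation level and the spacing execution mode (informative sketch).
static int segmentCount(float level, float maxLevel, Spacing spacing)
{
    float f;
    switch (spacing) {
    case SPACING_EQUAL:
        f = fminf(fmaxf(level, 1.0f), maxLevel);          // clamp to [1, maxLevel]
        return (int)ceilf(f);                             // round up to integer n
    case SPACING_FRACTIONAL_EVEN:
        f = fminf(fmaxf(level, 2.0f), maxLevel);          // clamp to [2, maxLevel]
        return 2 * (int)ceilf(f / 2.0f);                  // round up to even n
    case SPACING_FRACTIONAL_ODD:
        f = fminf(fmaxf(level, 1.0f), maxLevel - 1.0f);   // clamp to [1, maxLevel - 1]
        return 2 * (int)ceilf((f + 1.0f) / 2.0f) - 1;     // round up to odd n
    }
    return 1;
}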
When the tessellator produces triangles (in the Triangles
or Quads
modes), the orientation of all triangles is specified with an
OpExecutionMode
of VertexOrderCw
or VertexOrderCcw
in the
tessellation control or tessellation evaluation shaders.
If the order is VertexOrderCw
, the vertices of all generated triangles
will have clockwise ordering in (u,v) or (u,v,w) space.
If the order is VertexOrderCcw
, the vertices will have
counter-clockwise ordering.
If the tessellation domain has an upper-left origin, the vertices of a triangle have counter-clockwise ordering if
-
a = u0 v1 - u1 v0 + u1 v2 - u2 v1 + u2 v0 - u0 v2
is negative, and clockwise ordering if a is positive. ui and vi are the u and v coordinates in normalized parameter space of the ith vertex of the triangle. If the tessellation domain has a lower-left origin, the vertices of a triangle have counter-clockwise ordering if a is positive, and clockwise ordering if a is negative.
Note
The value a is proportional (with a positive factor) to the signed area of the triangle.
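As an informative sketch, the winding classification above can be written directly in C, where u[i] and v[i] are the parameter-space coordinates of the i-th vertex of the generated triangle:
typedef enum { WINDING_CW, WINDING_CCW } Winding;

// Classify the winding of a tessellated triangle in (u,v) parameter space
// from the signed-area term a defined above (informative sketch).
static Winding classifyWinding(const float u[3], const float v[3], int upperLeftOrigin)
{
    float a = u[0] * v[1] - u[1] * v[0]
            + u[1] * v[2] - u[2] * v[1]
            + u[2] * v[0] - u[0] * v[2];
    if (upperLeftOrigin)
        return (a < 0.0f) ? WINDING_CCW : WINDING_CW;  // upper-left origin
    else
        return (a > 0.0f) ? WINDING_CCW : WINDING_CW;  // lower-left origin
}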
For all primitive modes, the tessellator is capable of generating points
instead of lines or triangles.
If the tessellation control or tessellation evaluation shader specifies the
OpExecutionMode
PointMode
, the primitive generator will generate
one point for each distinct vertex produced by tessellation.
Otherwise, the tessellator will produce a collection of line segments or
triangles according to the primitive mode.
When tessellating triangles or quads in point mode with fractional odd
spacing, the tessellator may produce interior vertices that are
positioned on the edge of the patch if an inner tessellation level is less
than or equal to one.
Such vertices are considered distinct from vertices produced by subdividing
the outer edge of the patch, even if there are pairs of vertices with
identical coordinates.
22.4. Tessellation Primitive Ordering
Few guarantees are provided for the relative ordering of primitives produced by tessellation, as they pertain to primitive order.
-
The output primitives generated from each input primitive are passed to subsequent pipeline stages in an implementation-dependent order.
-
All output primitives generated from a given input primitive are passed to subsequent pipeline stages before any output primitives generated from subsequent input primitives.
22.5. Triangle Tessellation
If the tessellation primitive mode is Triangles
, an equilateral
triangle is subdivided into a collection of triangles covering the area of
the original triangle.
First, the original triangle is subdivided into a collection of concentric
equilateral triangles.
The edges of each of these triangles are subdivided, and the area between
each triangle pair is filled by triangles produced by joining the vertices
on the subdivided edges.
The number of concentric triangles and the number of subdivisions along each
triangle except the outermost is derived from the first inner tessellation
level.
The edges of the outermost triangle are subdivided independently, using the
first, second, and third outer tessellation levels to control the number of
subdivisions of the u = 0 (left), v = 0 (bottom), and w =
0 (right) edges, respectively.
The second inner tessellation level and the fourth outer tessellation level
have no effect in this mode.
If the first inner tessellation level and all three outer tessellation levels are exactly one after clamping and rounding, only a single triangle with (u,v,w) coordinates of (0,0,1), (1,0,0), and (0,1,0) is generated. If the inner tessellation level is one and any of the outer tessellation levels is greater than one, the inner tessellation level is treated as though it were originally specified as 1 + ε and will result in a two- or three-segment subdivision depending on the tessellation spacing. When used with fractional odd spacing, the three-segment subdivision may produce inner vertices positioned on the edge of the triangle.
If any tessellation level is greater than one, tessellation begins by producing a set of concentric inner triangles and subdividing their edges. First, the three outer edges are temporarily subdivided using the clamped and rounded first inner tessellation level and the specified tessellation spacing, generating n segments. If n is two, the inner triangle is degenerate: a single point at the center of the triangle. Otherwise, for each corner of the outer triangle, an inner triangle corner is produced at the intersection of two lines extended perpendicular to the corner’s two adjacent edges running through the vertex of the subdivided outer edge nearest that corner. If n is three, the edges of the inner triangle are not subdivided and it is the final triangle in the set of concentric triangles. Otherwise, each edge of the inner triangle is divided into n - 2 segments, with the n - 3 vertices of this subdivision produced by intersecting the inner edge with lines perpendicular to the edge running through the n - 3 innermost vertices of the subdivision of the outer edge. Once the outermost inner triangle is subdivided, the previous subdivision process repeats itself, using the generated triangle as an outer triangle. This subdivision process is illustrated in Inner Triangle Tessellation.
Once all the concentric triangles are produced and their edges are subdivided, the area between each pair of adjacent inner triangles is filled completely with a set of non-overlapping triangles. In this subdivision, two of the three vertices of each triangle are taken from adjacent vertices on a subdivided edge of one triangle; the third is one of the vertices on the corresponding edge of the other triangle. If the innermost triangle is degenerate (i.e., a point), the triangle containing it is subdivided into six triangles by connecting each of the six vertices on that triangle with the center point. If the innermost triangle is not degenerate, that triangle is added to the set of generated triangles as-is.
After the area corresponding to any inner triangles is filled, the tessellator generates triangles to cover the area between the outermost triangle and the outermost inner triangle. To do this, the temporary subdivision of the outer triangle edge above is discarded. Instead, the u = 0, v = 0, and w = 0 edges are subdivided according to the first, second, and third outer tessellation levels, respectively, and the tessellation spacing. The original subdivision of the first inner triangle is retained. The area between the outer and first inner triangles is completely filled by non-overlapping triangles as described above. If the first (and only) inner triangle is degenerate, a set of triangles is produced by connecting each vertex on the outer triangle edges with the center point.
After all triangles are generated, each vertex in the subdivided triangle is assigned a barycentric (u,v,w) coordinate based on its location relative to the three vertices of the outer triangle.
The algorithm used to subdivide the triangular domain in (u,v,w) space into individual triangles is implementation-dependent. However, the set of triangles produced will completely cover the domain, and no portion of the domain will be covered by multiple triangles.
The order in which the vertices for a given output triangle is generated is implementation-dependent. However, when depicted in a manner similar to Inner Triangle Tessellation, the order of the vertices in each generated triangle will be either all clockwise or all counter-clockwise, according to the vertex order layout declaration.
22.6. Quad Tessellation
If the tessellation primitive mode is Quads
, a rectangle is subdivided
into a collection of triangles covering the area of the original rectangle.
First, the original rectangle is subdivided into a regular mesh of
rectangles, where the number of rectangles along the u = 0 and u
= 1 (vertical) and v = 0 and v = 1 (horizontal) edges are
derived from the first and second inner tessellation levels, respectively.
All rectangles, except those adjacent to one of the outer rectangle edges,
are decomposed into triangle pairs.
The outermost rectangle edges are subdivided independently, using the first,
second, third, and fourth outer tessellation levels to control the number of
subdivisions of the u = 0 (left), v = 0 (bottom), u = 1
(right), and v = 1 (top) edges, respectively.
The area between the inner rectangles of the mesh and the outer rectangle
edges are filled by triangles produced by joining the vertices on the
subdivided outer edges to the vertices on the edge of the inner rectangle
mesh.
If both clamped inner tessellation levels and all four clamped outer tessellation levels are exactly one, only a single triangle pair covering the outer rectangle is generated. Otherwise, if either clamped inner tessellation level is one, that tessellation level is treated as though it were originally specified as 1 + ε and will result in a two- or three-segment subdivision depending on the tessellation spacing. When used with fractional odd spacing, the three-segment subdivision may produce inner vertices positioned on the edge of the rectangle.
If any tessellation level is greater than one, tessellation begins by subdividing the u = 0 and u = 1 edges of the outer rectangle into m segments using the clamped and rounded first inner tessellation level and the tessellation spacing. The v = 0 and v = 1 edges are subdivided into n segments using the second inner tessellation level. Each vertex on the u = 0 and v = 0 edges is joined with the corresponding vertex on the u = 1 and v = 1 edges to produce a set of vertical and horizontal lines that divide the rectangle into a grid of smaller rectangles. The primitive generator emits a pair of non-overlapping triangles covering each such rectangle not adjacent to an edge of the outer rectangle. The boundary of the region covered by these triangles forms an inner rectangle, the edges of which are subdivided by the grid vertices that lie on the edge. If either m or n is two, the inner rectangle is degenerate, and one or both of the rectangle’s edges consist of a single point. This subdivision is illustrated in Figure Inner Quad Tessellation.
After the area corresponding to the inner rectangle is filled, the tessellator must produce triangles to cover the area between the inner and outer rectangles. To do this, the subdivision of the outer rectangle edge above is discarded. Instead, the u = 0, v = 0, u = 1, and v = 1 edges are subdivided according to the first, second, third, and fourth outer tessellation levels, respectively, and the tessellation spacing. The original subdivision of the inner rectangle is retained. The area between the outer and inner rectangles is completely filled by non-overlapping triangles. Two of the three vertices of each triangle are adjacent vertices on a subdivided edge of one rectangle; the third is one of the vertices on the corresponding edge of the other rectangle. If either edge of the innermost rectangle is degenerate, the area near the corresponding outer edges is filled by connecting each vertex on the outer edge with the single vertex making up the inner edge.
The algorithm used to subdivide the rectangular domain in (u,v) space into individual triangles is implementation-dependent. However, the set of triangles produced will completely cover the domain, and no portion of the domain will be covered by multiple triangles.
The order in which the vertices for a given output triangle is generated is implementation-dependent. However, when depicted in a manner similar to Inner Quad Tessellation, the order of the vertices in each generated triangle will be either all clockwise or all counter-clockwise, according to the vertex order layout declaration.
22.7. Isoline Tessellation
If the tessellation primitive mode is IsoLines
, a set of independent
horizontal line segments is drawn.
The segments are arranged into connected strips called isolines, where the
vertices of each isoline have a constant v coordinate and u coordinates
covering the full range [0,1].
The number of isolines generated is derived from the first outer
tessellation level; the number of segments in each isoline is derived from
the second outer tessellation level.
Both inner tessellation levels and the third and fourth outer tessellation
levels have no effect in this mode.
As with quad tessellation above, isoline tessellation begins with a rectangle. The u = 0 and u = 1 edges of the rectangle are subdivided according to the first outer tessellation level. For the purposes of this subdivision, the tessellation spacing mode is ignored and treated as equal_spacing. An isoline is drawn connecting each vertex on the u = 0 rectangle edge to the corresponding vertex on the u = 1 rectangle edge, except that no line is drawn between (0,1) and (1,1). If the number of isolines on the subdivided u = 0 and u = 1 edges is n, this process will result in n equally spaced lines with constant v coordinates of 0, \(\frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}\).
Each of the n isolines is then subdivided according to the second outer tessellation level and the tessellation spacing, resulting in m line segments. Each segment of each line is emitted by the tessellator.
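As an informative sketch, the isoline layout can be computed as follows. Equal spacing is assumed for the per-line subdivision for brevity, maxLevel stands for VkPhysicalDeviceLimits::maxTessellationGenerationLevel, and vCoords is assumed to be large enough to hold one entry per isoline:
#include <math.h>

// Informative sketch: v coordinates of the isolines and the number of
// segments per isoline, derived from the first and second outer levels.
static void isolineLayout(float outer0, float outer1, float maxLevel,
                          float *vCoords, int *lineCount, int *segmentsPerLine)
{
    // First outer level: number of isolines; spacing is always treated as equal.
    int n = (int)ceilf(fminf(fmaxf(outer0, 1.0f), maxLevel));
    *lineCount = n;
    for (int i = 0; i < n; ++i)
        vCoords[i] = (float)i / (float)n;   // v = 0, 1/n, ..., (n-1)/n

    // Second outer level: segments per isoline (equal spacing shown here;
    // fractional modes would follow the rules in Tessellator Spacing).
    *segmentsPerLine = (int)ceilf(fminf(fmaxf(outer1, 1.0f), maxLevel));
}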
The order in which the vertices for a given output line is generated is implementation-dependent.
22.8. Tessellation Pipeline State
The pTessellationState
member of VkGraphicsPipelineCreateInfo
points to a structure of type VkPipelineTessellationStateCreateInfo
.
The VkPipelineTessellationStateCreateInfo
structure is defined as:
typedef struct VkPipelineTessellationStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineTessellationStateCreateFlags flags;
uint32_t patchControlPoints;
} VkPipelineTessellationStateCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- patchControlPoints is the number of control points per patch.
typedef VkFlags VkPipelineTessellationStateCreateFlags;
VkPipelineTessellationStateCreateFlags
is a bitmask type for setting a
mask, but is currently reserved for future use.
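As an informative example (not part of the Specification), a pipeline that tessellates patches of four control points could fill this structure as follows; the value 4 is an arbitrary choice for illustration:
const VkPipelineTessellationStateCreateInfo tessState =
{
    VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO, // sType
    NULL,                                                      // pNext
    0,                                                         // flags
    4                                                          // patchControlPoints
};
This structure would then be pointed to by the pTessellationState member of VkGraphicsPipelineCreateInfo.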
The VkPipelineTessellationDomainOriginStateCreateInfo
structure is
defined as:
typedef struct VkPipelineTessellationDomainOriginStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkTessellationDomainOrigin domainOrigin;
} VkPipelineTessellationDomainOriginStateCreateInfo;
or the equivalent
typedef VkPipelineTessellationDomainOriginStateCreateInfo VkPipelineTessellationDomainOriginStateCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- domainOrigin controls the origin of the tessellation domain space, and is of type VkTessellationDomainOrigin.
If the VkPipelineTessellationDomainOriginStateCreateInfo
structure is
included in the pNext
chain of
VkPipelineTessellationStateCreateInfo, it controls the origin of the
tessellation domain.
If this structure is not present, it is as if domainOrigin
were
VK_TESSELLATION_DOMAIN_ORIGIN_UPPER_LEFT
.
The possible tessellation domain origins are specified by the VkTessellationDomainOrigin enumeration:
typedef enum VkTessellationDomainOrigin {
VK_TESSELLATION_DOMAIN_ORIGIN_UPPER_LEFT = 0,
VK_TESSELLATION_DOMAIN_ORIGIN_LOWER_LEFT = 1,
VK_TESSELLATION_DOMAIN_ORIGIN_UPPER_LEFT_KHR = VK_TESSELLATION_DOMAIN_ORIGIN_UPPER_LEFT,
VK_TESSELLATION_DOMAIN_ORIGIN_LOWER_LEFT_KHR = VK_TESSELLATION_DOMAIN_ORIGIN_LOWER_LEFT,
} VkTessellationDomainOrigin;
or the equivalent
typedef VkTessellationDomainOrigin VkTessellationDomainOriginKHR;
- VK_TESSELLATION_DOMAIN_ORIGIN_UPPER_LEFT specifies that the origin of the domain space is in the upper left corner, as shown in figure Domain parameterization for tessellation primitive modes (upper-left origin).
- VK_TESSELLATION_DOMAIN_ORIGIN_LOWER_LEFT specifies that the origin of the domain space is in the lower left corner, as shown in figure Domain parameterization for tessellation primitive modes (lower-left origin).
This enum affects how the VertexOrderCw
and VertexOrderCcw
tessellation execution modes are interpreted, since the winding is defined
relative to the orientation of the domain.
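As an informative example, the domain origin structure can be chained into the tessellation state as sketched below; the patchControlPoints value of 3 is an arbitrary choice for illustration:
const VkPipelineTessellationDomainOriginStateCreateInfo domainOriginInfo =
{
    VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_DOMAIN_ORIGIN_STATE_CREATE_INFO, // sType
    NULL,                                                                    // pNext
    VK_TESSELLATION_DOMAIN_ORIGIN_LOWER_LEFT                                 // domainOrigin
};

const VkPipelineTessellationStateCreateInfo tessStateWithOrigin =
{
    VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO, // sType
    &domainOriginInfo,                                         // pNext
    0,                                                         // flags
    3                                                          // patchControlPoints
};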
23. Geometry Shading
The geometry shader operates on a group of vertices and their associated data assembled from a single input primitive, and emits zero or more output primitives and the group of vertices and their associated data required for each output primitive. Geometry shading is enabled when a geometry shader is included in the pipeline.
23.1. Geometry Shader Input Primitives
Each geometry shader invocation has access to all vertices in the primitive
(and their associated data), which are presented to the shader as an array
of inputs.
The input primitive type expected by the geometry shader is specified with
an OpExecutionMode
instruction in the geometry shader, and must be
compatible with the primitive topology used by primitive assembly (if
tessellation is not in use) or must match the type of primitive generated
by the tessellation primitive generator (if tessellation is in use).
Compatibility is defined below, with each input primitive type.
The input primitive types accepted by a geometry shader are:
- Points
Geometry shaders that operate on points use an OpExecutionMode instruction specifying the InputPoints input mode. Such a shader is valid only when the pipeline primitive topology is VK_PRIMITIVE_TOPOLOGY_POINT_LIST (if tessellation is not in use) or if tessellation is in use and the tessellation evaluation shader uses PointMode. There is only a single input vertex available for each geometry shader invocation. However, inputs to the geometry shader are still presented as an array, but this array has a length of one.
- Lines
Geometry shaders that operate on line segments are generated by including an OpExecutionMode instruction with the InputLines mode. Such a shader is valid only for the VK_PRIMITIVE_TOPOLOGY_LINE_LIST and VK_PRIMITIVE_TOPOLOGY_LINE_STRIP primitive topologies (if tessellation is not in use) or if tessellation is in use and the tessellation mode is IsoLines. There are two input vertices available for each geometry shader invocation. The first vertex refers to the vertex at the beginning of the line segment and the second vertex refers to the vertex at the end of the line segment.
- Lines with Adjacency
Geometry shaders that operate on line segments with adjacent vertices are generated by including an OpExecutionMode instruction with the InputLinesAdjacency mode. Such a shader is valid only for the VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY and VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY primitive topologies and must not be used when tessellation is in use. In this mode, there are four vertices available for each geometry shader invocation. The second vertex refers to attributes of the vertex at the beginning of the line segment and the third vertex refers to the vertex at the end of the line segment. The first and fourth vertices refer to the vertices adjacent to the beginning and end of the line segment, respectively.
- Triangles
Geometry shaders that operate on triangles are created by including an OpExecutionMode instruction with the Triangles mode. Such a shader is valid when the pipeline topology is VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST, VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP, or VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN (if tessellation is not in use) or when tessellation is in use and the tessellation mode is Triangles or Quads. In this mode, there are three vertices available for each geometry shader invocation. The first, second, and third vertices refer to attributes of the first, second, and third vertex of the triangle, respectively.
- Triangles with Adjacency
Geometry shaders that operate on triangles with adjacent vertices are created by including an OpExecutionMode instruction with the InputTrianglesAdjacency mode. Such a shader is valid when the pipeline topology is VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY or VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY, and must not be used when tessellation is in use. In this mode, there are six vertices available for each geometry shader invocation. The first, third and fifth vertices refer to attributes of the first, second and third vertex of the triangle, respectively. The second, fourth and sixth vertices refer to attributes of the vertices adjacent to the edges from the first to the second vertex, from the second to the third vertex, and from the third to the first vertex, respectively.
23.2. Geometry Shader Output Primitives
A geometry shader generates primitives in one of three output modes: points,
line strips, or triangle strips.
The primitive mode is specified in the shader using an OpExecutionMode
instruction with the OutputPoints
, OutputLineStrip
or
OutputTriangleStrip
modes, respectively.
Each geometry shader must include exactly one output primitive mode.
The vertices output by the geometry shader are assembled into points, lines, or triangles based on the output primitive type and the resulting primitives are then further processed as described in Rasterization. If the number of vertices emitted by the geometry shader is not sufficient to produce a single primitive, vertices corresponding to incomplete primitives are not processed by subsequent pipeline stages. The number of vertices output by the geometry shader is limited to a maximum count specified in the shader.
The maximum output vertex count is specified in the shader using an
OpExecutionMode
instruction with the mode set to OutputVertices
and the maximum number of vertices that will be produced by the geometry
shader specified as a literal.
Each geometry shader must specify a maximum output vertex count.
23.3. Multiple Invocations of Geometry Shaders
Geometry shaders can be invoked more than one time for each input
primitive.
This is known as geometry shader instancing and is requested by including
an OpExecutionMode
instruction with mode
specified as
Invocations
and the number of invocations specified as an integer
literal.
In this mode, the geometry shader will execute at least n times for
each input primitive, where n is the number of invocations specified
in the OpExecutionMode
instruction.
The instance number is available to each invocation as a built-in input
using InvocationId
.
23.4. Geometry Shader Primitive Ordering
Limited guarantees are provided for the relative ordering of primitives produced by a geometry shader, as they pertain to primitive order.
-
For instanced geometry shaders, the output primitives generated from each input primitive are passed to subsequent pipeline stages using the invocation number to order the primitives, from least to greatest.
-
All output primitives generated from a given input primitive are passed to subsequent pipeline stages before any output primitives generated from subsequent input primitives.
23.5. Geometry Shader Passthrough
A geometry shader that uses the PassthroughNV
decoration on a variable
in its input interface is considered a passthrough geometry shader.
Output primitives in a passthrough geometry shader must have the same
topology as the input primitive and are not produced by emitting vertices.
The vertices of the output primitive have two different types of attributes,
per-vertex and per-primitive.
Geometry shader input variables with PassthroughNV
decoration are
considered to produce per-vertex outputs, where values for each output
vertex are copied from the corresponding input vertex.
Any built-in or user-defined geometry shader outputs are considered
per-primitive in a passthrough geometry shader, where a single output value
is copied to all output vertices.
The remainder of this section details the usage of the PassthroughNV
decoration and modifications to the interface matching rules when using
passthrough geometry shaders.
23.5.1. PassthroughNV
Decoration
Decorating a geometry shader input variable with the PassthroughNV
decoration indicates that values of this input are copied through to the
corresponding vertex of the output primitive.
Input variables and block members which do not have the PassthroughNV
decoration are consumed by the geometry shader without being passed through
to subsequent stages.
The PassthroughNV
decoration must only be used within a geometry
shader.
Any variable decorated with PassthroughNV
must be declared using the
Input
storage class.
The PassthroughNV
decoration must not be used with any of:
- an input primitive type other than InputPoints, InputLines, or Triangles, as specified by the mode for OpExecutionMode.
- an invocation count other than one, as specified by the Invocations mode for OpExecutionMode.
- an OpEntryPoint which statically uses the OpEmitVertex or OpEndPrimitive instructions.
- a variable decorated with the InvocationId built-in decoration.
- a variable decorated with the PrimitiveId built-in decoration that is declared using the Input storage class.
23.5.2. Passthrough Interface Matching
When a passthrough geometry shader is in use, the Interface Matching rules involving the geometry shader input and output interfaces operate as described in this section.
For the purposes of matching passthrough geometry shader inputs with outputs
of the previous pipeline stages, the PassthroughNV
decoration is
ignored.
For the purposes of matching the outputs of the geometry shader with
subsequent pipeline stages, each input variable with the PassthroughNV
decoration is considered to add an equivalent output variable with the same
type, decoration (other than PassthroughNV
), number, and declaration
order on the output interface.
The output variable declaration corresponding to an input variable decorated
with PassthroughNV
will be identical to the input declaration, except
that the outermost array dimension of such variables is removed.
The output block declaration corresponding to an input block decorated with
PassthroughNV
or having members decorated with PassthroughNV
will
be identical to the input declaration, except that the outermost array
dimension of such declaration is removed.
If an input block is decorated with PassthroughNV
, the equivalent
output block contains all the members of the input block.
Otherwise, the equivalent output block contains only those input block
members decorated with PassthroughNV
.
All members of the corresponding output block are assigned Location
and
Component
decorations identical to those assigned to the corresponding
input block members.
Output variables and blocks generated from inputs decorated with
PassthroughNV
will only exist for the purposes of interface matching;
these declarations are not available to geometry shader code or listed in
the module interface.
For the purposes of component counting, passthrough geometry shaders count
all statically used input variable components declared with the
PassthroughNV
decoration as output components as well, since their
values will be copied to the output primitive produced by the geometry
shader.
24. Mesh Shading
Task and mesh shaders operate in workgroups to produce a collection of primitives that will be processed by subsequent stages of the graphics pipeline.
Work on the mesh pipeline is initiated by the application drawing a set of mesh tasks organized in global workgroups. If the optional task shader is active, each workgroup triggers the execution of task shader invocations that will create a new set of mesh workgroups upon completion. Each of these created workgroups, or each of the original workgroups if no task shader is present, triggers the execution of mesh shader invocations.
Each mesh shader workgroup emits zero or more output primitives along with the group of vertices and their associated data required for each output primitive.
24.1. Task Shader Input
For every workgroup issued via the drawing commands, a group of task shader invocations is executed. There are no inputs other than the built-in workgroup identifiers.
24.2. Task Shader Output
The task shader can emit zero or more mesh workgroups to be generated using
the built-in variable TaskCountNV
.
This value must be less than or equal to
VkPhysicalDeviceMeshShaderPropertiesNV
::maxTaskOutputCount
.
It can also output user-defined data that is passed as input to all mesh
shader invocations that the task creates.
These outputs are decorated as PerTaskNV
.
24.3. Mesh Generation
If a task shader exists, the mesh assembler creates a variable amount of mesh workgroups depending on each task’s output. If there is no task shader, the drawing commands emit the mesh shader invocations directly.
24.4. Mesh Shader Input
The only inputs available to the mesh shader are variables identifying the
specific workgroup and invocation and, if applicable, any outputs written as
PerTaskNV
by the task shader that spawned the mesh shader’s workgroup.
The mesh shader can operate without a task shader as well.
24.5. Mesh Shader Output Primitives
A mesh shader generates primitives in one of three output modes: points,
lines, or triangles.
The primitive mode is specified in the shader using an OpExecutionMode
instruction with the OutputPoints
, OutputLinesNV
, or
OutputTrianglesNV
modes, respectively.
Each mesh shader must include exactly one output primitive mode.
The maximum output vertex count is specified as a literal in the shader
using an OpExecutionMode
instruction with the mode set to
OutputVertices
and must be less than or equal to
VkPhysicalDeviceMeshShaderPropertiesNV
::maxMeshOutputVertices
.
The maximum output primitive count is specified as a literal in the shader
using an OpExecutionMode
instruction with the mode set to
OutputPrimitivesNV
and must be less than or equal to
VkPhysicalDeviceMeshShaderPropertiesNV
::maxMeshOutputPrimitives
.
The number of primitives output by the mesh shader is provided via writing
to the built-in variable
PrimitiveCountNV
and must be less than or equal to the maximum output
primitive count specified in the shader.
A variable decorated with PrimitiveIndicesNV
is an output array of
local index values into the vertex output arrays from which primitives are
assembled according to the output primitive type.
These resulting primitives are then further processed as described in
Rasterization.
24.6. Mesh Shader Per-View Outputs
The mesh shader outputs decorated with the PositionPerViewNV
,
ClipDistancePerViewNV
, CullDistancePerViewNV
, LayerPerViewNV
,
and ViewportMaskPerViewNV
built-in decorations are the per-view
versions of the single-view variables with equivalent names (that is
Position
, ClipDistance
, CullDistance
, Layer
, and
ViewportMaskNV
, respectively).
If a shader statically assigns a value to any element of a per-view array it
must not statically assign a value to the equivalent single-view variable.
Each of these outputs is considered arrayed, with separate values for each view. The view number is used to index the first dimension of these arrays.
The second dimension of the ClipDistancePerViewNV and CullDistancePerViewNV arrays has the same requirements as the ClipDistance and CullDistance arrays.
If a mesh shader output is per-view, the corresponding fragment shader input is taken from the element of the per-view output array that corresponds to the view that is currently being processed by the fragment shader.
24.7. Mesh Shader Primitive Ordering
The following guarantees are provided for the relative ordering of primitives produced by a mesh shader, as they pertain to primitive order.
- When a task shader is used, mesh workgroups spawned from lower tasks will be ordered prior to those workgroups from subsequent tasks.
- All output primitives generated from a given mesh workgroup are passed to subsequent pipeline stages before any output primitives generated from subsequent input workgroups.
- All output primitives within a mesh workgroup will be generated in the ordering provided by the built-in primitive index buffer (from low address to high address).
25. Fixed-Function Vertex Post-Processing
After programmable vertex processing, the following fixed-function operations are applied to vertices of the resulting primitives:
-
Transform feedback (see Transform Feedback)
-
Viewport swizzle (see Viewport Swizzle)
-
Flat shading (see Flat Shading).
-
Primitive clipping, including client-defined half-spaces (see Primitive Clipping).
-
Shader output attribute clipping (see Clipping Shader Outputs).
-
Clip space W scaling (see Controlling Viewport W Scaling).
-
Perspective division on clip coordinates (see Coordinate Transformations).
-
Viewport mapping, including depth range scaling (see Controlling the Viewport).
-
Front face determination for polygon primitives (see Basic Polygon Rasterization).
Next, rasterization is performed on primitives as described in chapter Rasterization.
25.1. Transform Feedback
Before any other fixed-function vertex post-processing, vertex outputs from
the last shader in the vertex processing stage can be written out to one or
more transform feedback buffers bound to the command buffer.
To capture vertex outputs the last vertex processing stage shader must be
declared with the Xfb
execution mode.
Outputs decorated with XfbBuffer
will be written out to the
corresponding transform feedback buffers bound to the command buffer when
transform feedback is active.
Transform feedback buffers are bound to the command buffer by using
vkCmdBindTransformFeedbackBuffersEXT.
Transform feedback is made active by calling
vkCmdBeginTransformFeedbackEXT and made inactive by calling
vkCmdEndTransformFeedbackEXT.
After vertex data is written it is possible to use
vkCmdDrawIndirectByteCountEXT to start a new draw where the
vertexCount
is derived from the number of bytes written by a previous
transform feedback.
When an individual point, line, or triangle primitive reaches the transform
feedback stage while transform feedback is active, the values of the
specified output variables are assembled into primitives and appended to the
bound transform feedback buffers.
After activating transform feedback, the values of the first assembled
primitive are written at the starting offsets of the bound transform
feedback buffers, and subsequent primitives are appended to the buffer.
If the optional pCounterBuffers
and pCounterBufferOffsets
parameters are specified, the starting points within the transform feedback
buffers are adjusted so data is appended to the previously written values
indicated by the value stored by the implementation in the counter buffer.
When capturing line and triangle primitives, all values from the first
vertex output are written first, followed by values of the subsequent vertex
outputs.
When capturing vertices, the stride associated with each transform feedback
buffer, as indicated by the XfbStride
decoration, indicates the number
of bytes of storage reserved for each vertex in the transform feedback
buffer.
For every vertex captured, each output attribute with an Offset
decoration will be written to the storage reserved for the vertex at the
associated transform feedback buffer.
When writing output variables that are arrays or structures, individual
array elements or structure members are written tightly packed in order.
For vector types, individual components are written in order.
For matrix types, outputs are written as an array of column vectors.
If any component of an output with an assigned transform feedback offset was not written to by its shader, the value recorded for that component is undefined. The results of writing an output variable to a transform feedback buffer are undefined if any component of that variable would be written at an offset not aligned to the size of the component, or the component is less than 4 bytes in size. When capturing a vertex, any portion of the reserved storage not associated with an output variable with an assigned transform feedback offset will be unmodified.
When transform feedback is inactive, no vertices are recorded.
If there is a valid counter buffer handle and counter buffer offset in the
pCounterBuffers
and pCounterBufferOffsets
arrays, writes to the
corresponding transform feedback buffer will start at the byte offset
represented by the value stored in the counter buffer location.
Individual lines or triangles of a strip or fan primitive will be extracted and recorded separately. Incomplete primitives are not recorded.
When using a geometry shader that emits vertices to multiple vertex streams, a primitive will be assembled and output for each stream when there are enough vertices emitted for the output primitive type. All outputs assigned to a given transform feedback buffer are required to come from a single vertex stream.
The sizes of the transform feedback buffers are defined by the
vkCmdBindTransformFeedbackBuffersEXT pSizes
parameter for each
of the bound buffers, or the size of the bound buffer, whichever is the
lesser.
If there is less space remaining in any of the transform feedback buffers
than the size of all the vertex data for that primitive based on the
XfbStride
for that XfbBuffer
then no vertex data of that primitive
is recorded in any transform feedback buffer, and the value for the number
of primitives written in the corresponding
VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT
query for all transform
feedback buffers is no longer incremented.
Any outputs made to an XfbBuffer that is not bound to a transform feedback buffer are ignored.
To bind transform feedback buffers to a command buffer for use in subsequent draw commands, call:
void vkCmdBindTransformFeedbackBuffersEXT(
VkCommandBuffer commandBuffer,
uint32_t firstBinding,
uint32_t bindingCount,
const VkBuffer* pBuffers,
const VkDeviceSize* pOffsets,
const VkDeviceSize* pSizes);
- commandBuffer is the command buffer into which the command is recorded.
- firstBinding is the index of the first transform feedback binding whose state is updated by the command.
- bindingCount is the number of transform feedback bindings whose state is updated by the command.
- pBuffers is a pointer to an array of buffer handles.
- pOffsets is a pointer to an array of buffer offsets.
- pSizes is an optional array of buffer sizes, which specifies the maximum number of bytes to capture to the corresponding transform feedback buffer. If pSizes is NULL, or the value of the pSizes array element is VK_WHOLE_SIZE, then the maximum bytes captured will be the size of the corresponding buffer minus the buffer offset.
The values taken from elements i of pBuffers
, pOffsets
and
pSizes
replace the current state for the transform feedback binding
firstBinding
+ i, for i in [0,
bindingCount
).
The transform feedback binding is updated to start at the offset indicated
by pOffsets
[i] from the start of the buffer pBuffers
[i].
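As an informative sketch, the following binds a single transform feedback buffer starting at offset zero. commandBuffer and xfbBuffer are assumed to be an existing command buffer and a buffer created with VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_BUFFER_BIT_EXT:
VkBuffer     buffers[1] = { xfbBuffer };
VkDeviceSize offsets[1] = { 0 };
VkDeviceSize sizes[1]   = { VK_WHOLE_SIZE };   // capture up to the end of the buffer

vkCmdBindTransformFeedbackBuffersEXT(commandBuffer,
                                     0,         // firstBinding
                                     1,         // bindingCount
                                     buffers,
                                     offsets,
                                     sizes);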
Transform feedback for specific transform feedback buffers is made active by calling:
void vkCmdBeginTransformFeedbackEXT(
VkCommandBuffer commandBuffer,
uint32_t firstCounterBuffer,
uint32_t counterBufferCount,
const VkBuffer* pCounterBuffers,
const VkDeviceSize* pCounterBufferOffsets);
- commandBuffer is the command buffer into which the command is recorded.
- firstCounterBuffer is the index of the first transform feedback buffer corresponding to pCounterBuffers[0] and pCounterBufferOffsets[0].
- counterBufferCount is the size of the pCounterBuffers and pCounterBufferOffsets arrays.
- pCounterBuffers is an optional array of buffer handles to the counter buffers which contain a 4 byte integer value representing the byte offset from the start of the corresponding transform feedback buffer from where to start capturing vertex data. If the byte offset stored to the counter buffer location was done using vkCmdEndTransformFeedbackEXT it can be used to resume transform feedback from the previous location. If pCounterBuffers is NULL, then transform feedback will start capturing vertex data to byte offset zero in all bound transform feedback buffers. For each element of pCounterBuffers that is VK_NULL_HANDLE, transform feedback will start capturing vertex data to byte zero in the corresponding bound transform feedback buffer.
- pCounterBufferOffsets is an optional array of offsets within each of the pCounterBuffers where the counter values were previously written. The location in each counter buffer at these offsets must be large enough to contain 4 bytes of data. This data is the number of bytes captured by the previous transform feedback to this buffer. If pCounterBufferOffsets is NULL, then it is assumed the offsets are zero.
The active transform feedback buffers will capture primitives emitted from
the corresponding XfbBuffer
in the bound graphics pipeline.
Any XfbBuffer
emitted that does not output to an active transform
feedback buffer will not be captured.
Transform feedback for specific transform feedback buffers is made inactive by calling:
void vkCmdEndTransformFeedbackEXT(
VkCommandBuffer commandBuffer,
uint32_t firstCounterBuffer,
uint32_t counterBufferCount,
const VkBuffer* pCounterBuffers,
const VkDeviceSize* pCounterBufferOffsets);
- commandBuffer is the command buffer into which the command is recorded.
- firstCounterBuffer is the index of the first transform feedback buffer corresponding to pCounterBuffers[0] and pCounterBufferOffsets[0].
- counterBufferCount is the size of the pCounterBuffers and pCounterBufferOffsets arrays.
- pCounterBuffers is an optional array of buffer handles to the counter buffers used to record the current byte positions of each transform feedback buffer where the next vertex output data would be captured. This can be used by a subsequent vkCmdBeginTransformFeedbackEXT call to resume transform feedback capture from this position. It can also be used by vkCmdDrawIndirectByteCountEXT to determine the vertex count of the draw call.
- pCounterBufferOffsets is an optional array of offsets within each of the pCounterBuffers where the counter values can be written. The location in each counter buffer at these offsets must be large enough to contain 4 bytes of data. The data stored at this location is the byte offset from the start of the transform feedback buffer binding where the next vertex data would be written. If pCounterBufferOffsets is NULL, then it is assumed the offsets are zero.
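As an informative sketch, a minimal capture sequence (no counter buffers, so capture starts at byte offset zero in every bound transform feedback buffer) might look like the following; commandBuffer and vertexCount are assumed to be defined elsewhere, with a suitable graphics pipeline bound and a render pass instance active:
vkCmdBeginTransformFeedbackEXT(commandBuffer, 0, 0, NULL, NULL);
vkCmdDraw(commandBuffer, vertexCount, 1, 0, 0);   // vertex outputs are captured
vkCmdEndTransformFeedbackEXT(commandBuffer, 0, 0, NULL, NULL);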
25.2. Viewport Swizzle
Each primitive sent to a given viewport has a swizzle and optional negation
applied to its clip coordinates.
The swizzle that is applied depends on the viewport index, and is controlled
by the VkPipelineViewportSwizzleStateCreateInfoNV
pipeline state:
typedef struct VkPipelineViewportSwizzleStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkPipelineViewportSwizzleStateCreateFlagsNV flags;
uint32_t viewportCount;
const VkViewportSwizzleNV* pViewportSwizzles;
} VkPipelineViewportSwizzleStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- viewportCount is the number of viewport swizzles used by the pipeline.
- pViewportSwizzles is a pointer to an array of VkViewportSwizzleNV structures, defining the viewport swizzles.
typedef VkFlags VkPipelineViewportSwizzleStateCreateFlagsNV;
VkPipelineViewportSwizzleStateCreateFlagsNV
is a bitmask type for
setting a mask, but is currently reserved for future use.
The VkPipelineViewportSwizzleStateCreateInfoNV
state is set by adding
an instance of this structure to the pNext
chain of an instance of the
VkPipelineViewportStateCreateInfo
structure and setting the graphics
pipeline state with vkCreateGraphicsPipelines.
Each viewport specified from 0 to viewportCount
- 1 has its x,y,z,w
swizzle state set to the corresponding x
, y
, z
and w
in the VkViewportSwizzleNV structure.
Each component is of type VkViewportCoordinateSwizzleNV, which
determines the type of swizzle for that component.
The value of x
computes the new x component of the position as:
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_X_NV) x' = x;
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_X_NV) x' = -x;
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Y_NV) x' = y;
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_Y_NV) x' = -y;
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Z_NV) x' = z;
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_Z_NV) x' = -z;
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_W_NV) x' = w;
if (x == VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_W_NV) x' = -w;
Similar selections are performed for the y
, z
, and w
coordinates.
This swizzling is applied before clipping and perspective divide.
If the swizzle for an active viewport index is not specified, the swizzle
for x
is VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_X_NV
, y
is VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Y_NV
, z
is
VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Z_NV
and w
is
VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_W_NV
.
Viewport swizzle parameters are specified by setting the pNext
pointer
of VkGraphicsPipelineCreateInfo
to point to an instance of
VkPipelineViewportSwizzleStateCreateInfoNV
.
VkPipelineViewportSwizzleStateCreateInfoNV uses
VkViewportSwizzleNV
to set the viewport swizzle parameters.
The VkViewportSwizzleNV
structure is defined as:
typedef struct VkViewportSwizzleNV {
VkViewportCoordinateSwizzleNV x;
VkViewportCoordinateSwizzleNV y;
VkViewportCoordinateSwizzleNV z;
VkViewportCoordinateSwizzleNV w;
} VkViewportSwizzleNV;
-
x
is a VkViewportCoordinateSwizzleNV value specifying the swizzle operation to apply to the x component of the primitive -
y
is a VkViewportCoordinateSwizzleNV value specifying the swizzle operation to apply to the y component of the primitive -
z
is a VkViewportCoordinateSwizzleNV value specifying the swizzle operation to apply to the z component of the primitive -
w
is a VkViewportCoordinateSwizzleNV value specifying the swizzle operation to apply to the w component of the primitive
Possible values of the VkViewportSwizzleNV::x
, y
, z
,
and w
members, specifying swizzling of the corresponding components of
primitives, are:
typedef enum VkViewportCoordinateSwizzleNV {
VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_X_NV = 0,
VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_X_NV = 1,
VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Y_NV = 2,
VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_Y_NV = 3,
VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Z_NV = 4,
VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_Z_NV = 5,
VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_W_NV = 6,
VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_W_NV = 7,
} VkViewportCoordinateSwizzleNV;
These values are described in detail in Viewport Swizzle.
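As an informative example (not part of the Specification), a single-viewport swizzle that negates the z coordinate could be specified as follows and chained into VkPipelineViewportStateCreateInfo:
const VkViewportSwizzleNV swizzle =
{
    VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_X_NV, // x
    VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Y_NV, // y
    VK_VIEWPORT_COORDINATE_SWIZZLE_NEGATIVE_Z_NV, // z
    VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_W_NV  // w
};

const VkPipelineViewportSwizzleStateCreateInfoNV swizzleState =
{
    VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_SWIZZLE_STATE_CREATE_INFO_NV, // sType
    NULL,      // pNext (or the next structure in the viewport state chain)
    0,         // flags
    1,         // viewportCount
    &swizzle   // pViewportSwizzles
};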
25.3. Flat Shading
Flat shading a vertex output attribute means to assign all vertices of the primitive the same value for that output.
The output values assigned are those of the provoking vertex of the primitive. The provoking vertex depends on the primitive topology, and is generally the “first” vertex of the primitive. For primitives not processed by tessellation or geometry shaders, the provoking vertex is selected from the input vertices according to the following table.
Primitive type of primitive i |
Provoking vertex number |
|
i |
|
2 i |
|
i |
|
3 i |
|
i |
|
i + 1 |
|
4 i + 1 |
|
i + 1 |
|
6 i |
|
2 i |
Flat shading is applied to those vertex attributes that
match fragment input attributes which
are decorated as Flat
.
If a geometry shader is active, the output primitive topology is either points, line strips, or triangle strips, and the selection of the provoking vertex behaves according to the corresponding row of the table. If a tessellation evaluation shader is active and a geometry shader is not active, the provoking vertex is undefined but must be one of the vertices of the primitive.
25.4. Primitive Clipping
Primitives are culled against the cull volume and then clipped to the clip volume. In clip coordinates, the view volume is defined by:
-
-wc ≤ xc ≤ wc
-
-wc ≤ yc ≤ wc
-
0 ≤ zc ≤ wc
This view volume can be further restricted by as many as
VkPhysicalDeviceLimits
::maxClipDistances
client-defined
half-spaces.
The cull volume is the intersection of up to
VkPhysicalDeviceLimits
::maxCullDistances
client-defined
half-spaces (if no client-defined cull half-spaces are enabled, culling
against the cull volume is skipped).
A shader must write a single cull distance for each enabled cull half-space
to elements of the CullDistance
array.
If the cull distance for any enabled cull half-space is negative for all of
the vertices of the primitive under consideration, the primitive is
discarded.
Otherwise the primitive is clipped against the clip volume as defined below.
The clip volume is the intersection of up to
VkPhysicalDeviceLimits
::maxClipDistances
client-defined
half-spaces with the view volume (if no client-defined clip half-spaces are
enabled, the clip volume is the view volume).
A shader must write a single clip distance for each enabled clip half-space
to elements of the ClipDistance
array.
Clip half-space i is then given by the set of points satisfying the
inequality
-
ci(P) ≥ 0
where ci(P) is the clip distance i at point P. For point primitives, ci(P) is simply the clip distance for the vertex in question. For line and triangle primitives, per-vertex clip distances are interpolated using a weighted mean, with weights derived according to the algorithms described in sections Basic Line Segment Rasterization and Basic Polygon Rasterization, using the perspective interpolation equations.
The number of client-defined clip and cull half-spaces that are enabled is
determined by the explicit size of the built-in arrays ClipDistance
and
CullDistance
, respectively, declared as an output in the interface of
the entry point of the final shader stage before clipping.
Depth clamping is enabled or disabled via the depthClampEnable
enable
of the VkPipelineRasterizationStateCreateInfo
structure.
If depth clamping is enabled, the plane equation
-
0 ≤ zc ≤ wc
(see the clip volume definition above) is ignored by view volume clipping (effectively, there is no near or far plane clipping).
If the primitive under consideration is a point or line segment, then clipping passes it unchanged if its vertices lie entirely within the clip volume.
Possible values of
VkPhysicalDevicePointClippingProperties::pointClippingBehavior
,
specifying clipping behavior of a point primitive whose vertex lies outside
the clip volume, are:
typedef enum VkPointClippingBehavior {
VK_POINT_CLIPPING_BEHAVIOR_ALL_CLIP_PLANES = 0,
VK_POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY = 1,
VK_POINT_CLIPPING_BEHAVIOR_ALL_CLIP_PLANES_KHR = VK_POINT_CLIPPING_BEHAVIOR_ALL_CLIP_PLANES,
VK_POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY_KHR = VK_POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY,
} VkPointClippingBehavior;
or the equivalent
typedef VkPointClippingBehavior VkPointClippingBehaviorKHR;
- VK_POINT_CLIPPING_BEHAVIOR_ALL_CLIP_PLANES specifies that the primitive is discarded if the vertex lies outside any clip plane, including the planes bounding the view volume.
- VK_POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY specifies that the primitive is discarded only if the vertex lies outside any user clip plane.
If either of a line segment’s vertices lie outside of the clip volume, the line segment may be clipped, with new vertex coordinates computed for each vertex that lies outside the clip volume. A clipped line segment endpoint lies on both the original line segment and the boundary of the clip volume.
This clipping produces a value, 0 ≤ t ≤ 1, for each clipped vertex. If the coordinates of a clipped vertex are P and the original vertices’ coordinates are P1 and P2, then t is the value satisfying
-
P = t P1 + (1-t) P2.
t is used to clip vertex output attributes as described in Clipping Shader Outputs.
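As an informative sketch, for a single half-space whose signed distance varies linearly along the edge, with distances d1 and d2 at P1 and P2 and exactly one of them negative, t can be computed as follows:
// Informative sketch: parameter t of the clipped point for one half-space.
// The clip distance is linear along the edge, so t*d1 + (1-t)*d2 = 0.
static float clipParameter(float d1, float d2)
{
    return d2 / (d2 - d1);
}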
If the primitive is a polygon, it passes unchanged if every one of its edges lie entirely inside the clip volume, and it is discarded if every one of its edges lie entirely outside the clip volume. If the edges of the polygon intersect the boundary of the clip volume, the intersecting edges are reconnected by new edges that lie along the boundary of the clip volume - in some cases requiring the introduction of new vertices into a polygon.
If a polygon intersects an edge of the clip volume’s boundary, the clipped polygon must include a point on this boundary edge.
Primitives rendered with user-defined half-spaces must satisfy a complementarity criterion. Suppose a series of primitives is drawn where each vertex i has a single specified clip distance di (or a number of similarly specified clip distances, if multiple half-spaces are enabled). Next, suppose that the same series of primitives are drawn again with each such clip distance replaced by -di (and the graphics pipeline is otherwise the same). In this case, primitives must not be missing any pixels, and pixels must not be drawn twice in regions where those primitives are cut by the clip planes.
25.5. Clipping Shader Outputs
Next, vertex output attributes are clipped. The output values associated with a vertex that lies within the clip volume are unaffected by clipping. If a primitive is clipped, however, the output values assigned to vertices produced by clipping are clipped.
Let the output values assigned to the two vertices P1 and P2 of an unclipped edge be c1 and c2. The value of t (see Primitive Clipping) for a clipped point P is used to obtain the output value associated with P as
c = t c1 + (1-t) c2.
(Multiplying an output value by a scalar means multiplying each of x, y, z, and w by the scalar.)
Since this computation is performed in clip space before division by wc, clipped output values are perspective-correct.
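As an informative illustration, the following C sketch applies this interpolation to a four-component output value at a clipped vertex; the Vec4 type and function name are hypothetical, and the computation operates on clip-space values so the result is perspective-correct.

/* Hypothetical helper: compute the output value at a clipped vertex from
 * the clip parameter t (see Primitive Clipping) and the values c1 and c2
 * at the original vertices P1 and P2. */
typedef struct Vec4 { float x, y, z, w; } Vec4;

static Vec4 clipInterpolate(float t, Vec4 c1, Vec4 c2)
{
    Vec4 c;
    c.x = t * c1.x + (1.0f - t) * c2.x;
    c.y = t * c1.y + (1.0f - t) * c2.y;
    c.z = t * c1.z + (1.0f - t) * c2.z;
    c.w = t * c1.w + (1.0f - t) * c2.w;
    return c;
}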
Polygon clipping creates a clipped vertex along an edge of the clip volume’s boundary. This situation is handled by noting that polygon clipping proceeds by clipping against one half-space at a time. Output value clipping is done in the same way, so that clipped points always occur at the intersection of polygon edges (possibly already clipped) with the clip volume’s boundary.
For vertex output attributes whose matching fragment input attributes are
decorated with NoPerspective
, the value of t used to obtain the
output value associated with P will be adjusted to produce results
that vary linearly in framebuffer space.
Output attributes of integer or unsigned integer type must always be flat shaded. Flat shaded attributes are constant over the primitive being rasterized (see Basic Line Segment Rasterization and Basic Polygon Rasterization), and no interpolation is performed. The output value c is taken from either c1 or c2, since flat shading has already occurred and the two values are identical.
25.6. Controlling Viewport W Scaling
If viewport W scaling is enabled, the W component of the clip coordinate is modified by the provided coefficients from the corresponding viewport as follows.
wc' = xcoeff × xc + ycoeff × yc + wc
The VkPipelineViewportWScalingStateCreateInfoNV
structure is defined
as:
typedef struct VkPipelineViewportWScalingStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkBool32 viewportWScalingEnable;
uint32_t viewportCount;
const VkViewportWScalingNV* pViewportWScalings;
} VkPipelineViewportWScalingStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- viewportWScalingEnable controls whether viewport W scaling is enabled.
- viewportCount is the number of viewports used by W scaling, and must match the number of viewports in the pipeline if viewport W scaling is enabled.
- pViewportWScalings is a pointer to an array of VkViewportWScalingNV structures, which define the W scaling parameters for the corresponding viewport. If the viewport W scaling state is dynamic, this member is ignored.
The VkPipelineViewportWScalingStateCreateInfoNV
state is set by adding
an instance of this structure to the pNext
chain of an instance of the
VkPipelineViewportStateCreateInfo
structure and setting the graphics
pipeline state with vkCreateGraphicsPipelines.
If the bound pipeline state object was not created with the
VK_DYNAMIC_STATE_VIEWPORT_W_SCALING_NV
dynamic state enabled, viewport
W scaling parameters are specified using the pViewportWScalings
member of VkPipelineViewportWScalingStateCreateInfoNV in the pipeline
state object.
If the pipeline state object was created with the
VK_DYNAMIC_STATE_VIEWPORT_W_SCALING_NV
dynamic state enabled, the
viewport transformation parameters are dynamically set and changed with the
command:
void vkCmdSetViewportWScalingNV(
VkCommandBuffer commandBuffer,
uint32_t firstViewport,
uint32_t viewportCount,
const VkViewportWScalingNV* pViewportWScalings);
- commandBuffer is the command buffer into which the command will be recorded.
- firstViewport is the index of the first viewport whose parameters are updated by the command.
- viewportCount is the number of viewports whose parameters are updated by the command.
- pViewportWScalings is a pointer to an array of VkViewportWScalingNV structures specifying viewport parameters.
The viewport parameters taken from element i of
pViewportWScalings
replace the current state for the viewport index
firstViewport
+ i, for i in [0,
viewportCount
).
Both VkPipelineViewportWScalingStateCreateInfoNV and
vkCmdSetViewportWScalingNV use VkViewportWScalingNV
to set the
viewport transformation parameters.
The VkViewportWScalingNV
structure is defined as:
typedef struct VkViewportWScalingNV {
float xcoeff;
float ycoeff;
} VkViewportWScalingNV;
- xcoeff and ycoeff are the viewport’s W scaling factors for x and y, respectively.
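As an informative example, the following sketch sets W scaling coefficients for two viewports dynamically; it assumes the VK_NV_clip_space_w_scaling extension is enabled, the bound pipeline was created with VK_DYNAMIC_STATE_VIEWPORT_W_SCALING_NV, and commandBuffer is a command buffer in the recording state. The coefficient values are placeholders.

VkViewportWScalingNV wScalings[2] = {
    {  0.5f, 0.5f },   /* xcoeff, ycoeff for viewport 0 */
    { -0.5f, 0.5f },   /* xcoeff, ycoeff for viewport 1 */
};

vkCmdSetViewportWScalingNV(commandBuffer, 0, 2, wScalings);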
25.7. Coordinate Transformations
Clip coordinates for a vertex result from shader execution, which yields a
vertex coordinate Position
.
Perspective division on clip coordinates yields normalized device coordinates, followed by a viewport transformation (see Controlling the Viewport) to convert these coordinates into framebuffer coordinates.
If a vertex in clip coordinates has a position given by

(xc, yc, zc, wc),

then the vertex’s normalized device coordinates (xd, yd, zd) are

(xc / wc, yc / wc, zc / wc).
25.8. Controlling the Viewport
The viewport transformation is determined by the selected viewport’s width and height in pixels, px and py, respectively, and its center (ox, oy) (also in pixels), as well as its depth range min and max determining a depth range scale value pz and a depth range bias value oz (defined below). The vertex’s framebuffer coordinates (xf, yf, zf) are given by
- xf = (px / 2) xd + ox
- yf = (py / 2) yd + oy
- zf = pz × zd + oz
Multiple viewports are available, numbered zero up to
VkPhysicalDeviceLimits
::maxViewports
minus one.
The number of viewports used by a pipeline is controlled by the
viewportCount
member of the VkPipelineViewportStateCreateInfo
structure used in pipeline creation.
The VkPipelineViewportStateCreateInfo
structure is defined as:
typedef struct VkPipelineViewportStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineViewportStateCreateFlags flags;
uint32_t viewportCount;
const VkViewport* pViewports;
uint32_t scissorCount;
const VkRect2D* pScissors;
} VkPipelineViewportStateCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- viewportCount is the number of viewports used by the pipeline.
- pViewports is a pointer to an array of VkViewport structures, defining the viewport transforms. If the viewport state is dynamic, this member is ignored.
- scissorCount is the number of scissors and must match the number of viewports.
- pScissors is a pointer to an array of VkRect2D structures which define the rectangular bounds of the scissor for the corresponding viewport. If the scissor state is dynamic, this member is ignored.
typedef VkFlags VkPipelineViewportStateCreateFlags;
VkPipelineViewportStateCreateFlags
is a bitmask type for setting a
mask, but is currently reserved for future use.
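As an informative example, the following sketch fills VkPipelineViewportStateCreateInfo with one static viewport and scissor covering a hypothetical 1280×720 framebuffer.

VkViewport viewport = {
    .x = 0.0f, .y = 0.0f,
    .width = 1280.0f, .height = 720.0f,
    .minDepth = 0.0f, .maxDepth = 1.0f,
};

VkRect2D scissor = {
    .offset = { 0, 0 },
    .extent = { 1280, 720 },
};

VkPipelineViewportStateCreateInfo viewportState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
    .pNext = NULL,
    .flags = 0,
    .viewportCount = 1,
    .pViewports = &viewport,
    .scissorCount = 1,
    .pScissors = &scissor,
};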
A vertex processing stage may direct each primitive to zero or more
viewports.
The destination viewports for a primitive are selected by the last active
vertex processing stage that has an output variable decorated with
ViewportIndex
(selecting a single viewport) or ViewportMaskNV
(selecting multiple viewports).
The viewport transform uses the viewport corresponding to either the value
assigned to ViewportIndex
or one of the bits set in
ViewportMaskNV
, and taken from an implementation-dependent vertex of
each primitive.
If ViewportIndex
or any of the bits in ViewportMaskNV
are outside
the range zero to viewportCount
minus one for a primitive, or if the
last active vertex processing stage did not assign a value to either
ViewportIndex
or ViewportMaskNV
for all vertices of a primitive
due to flow control, the values resulting from the viewport transformation
of the vertices of such primitives are undefined.
If the last vertex processing stage does not have an output decorated with
ViewportIndex
or ViewportMaskNV
, the viewport numbered zero is
used by the viewport transformation.
A single vertex can be used in more than one individual primitive, in
primitives such as VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP
.
In this case, the viewport transformation is applied separately for each
primitive.
If the bound pipeline state object was not created with the
VK_DYNAMIC_STATE_VIEWPORT
dynamic state enabled, viewport
transformation parameters are specified using the pViewports
member of
VkPipelineViewportStateCreateInfo
in the pipeline state object.
If the pipeline state object was created with the
VK_DYNAMIC_STATE_VIEWPORT
dynamic state enabled, the viewport
transformation parameters are dynamically set and changed with the command:
void vkCmdSetViewport(
VkCommandBuffer commandBuffer,
uint32_t firstViewport,
uint32_t viewportCount,
const VkViewport* pViewports);
- commandBuffer is the command buffer into which the command will be recorded.
- firstViewport is the index of the first viewport whose parameters are updated by the command.
- viewportCount is the number of viewports whose parameters are updated by the command.
- pViewports is a pointer to an array of VkViewport structures specifying viewport parameters.
The viewport parameters taken from element i of pViewports
replace the current state for the viewport index firstViewport
+ i, for i in [0, viewportCount
).
Both VkPipelineViewportStateCreateInfo and vkCmdSetViewport use
VkViewport
to set the viewport transformation parameters.
The VkViewport
structure is defined as:
typedef struct VkViewport {
float x;
float y;
float width;
float height;
float minDepth;
float maxDepth;
} VkViewport;
- x and y are the viewport’s upper left corner (x,y).
- width and height are the viewport’s width and height, respectively.
- minDepth and maxDepth are the depth range for the viewport. It is valid for minDepth to be greater than or equal to maxDepth.
The framebuffer depth coordinate zf may be represented using either a fixed-point or floating-point representation.
However, a floating-point representation must be used if the depth/stencil
attachment has a floating-point depth component.
If an m-bit fixed-point representation is used, we assume that it represents each value \(\frac{k}{2^m - 1}\), where k ∈ { 0, 1, …, 2^m - 1 }, as k (e.g. 1.0 is represented in binary as a string of all ones).
The viewport parameters shown in the above equations are found from these values as

- ox = x + width / 2
- oy = y + height / 2
- oz = minDepth
- px = width
- py = height
- pz = maxDepth - minDepth
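As an informative illustration, the following C sketch derives these parameters from a VkViewport and applies the viewport transformation above to a normalized device coordinate (xd, yd, zd); the struct and function names are hypothetical.

typedef struct FramebufferCoord { float xf, yf, zf; } FramebufferCoord;

static FramebufferCoord viewportTransform(VkViewport vp,
                                          float xd, float yd, float zd)
{
    const float ox = vp.x + vp.width / 2.0f;
    const float oy = vp.y + vp.height / 2.0f;
    const float oz = vp.minDepth;
    const float px = vp.width;
    const float py = vp.height;
    const float pz = vp.maxDepth - vp.minDepth;

    FramebufferCoord f;
    f.xf = (px / 2.0f) * xd + ox;
    f.yf = (py / 2.0f) * yd + oy;
    f.zf = pz * zd + oz;
    return f;
}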
The application can specify a negative term for height
, which has the
effect of negating the y coordinate in clip space before performing the
transform.
When using a negative height
, the application should also adjust the
y
value to point to the lower left corner of the viewport instead of
the upper left corner.
Using the negative height
allows the application to avoid having to
negate the y component of the Position
output from the last vertex
processing stage in shaders that also target other graphics APIs.
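As an informative example, the following sketch shows a viewport with a negative height for a hypothetical 1280×720 framebuffer; y is adjusted to the lower left corner so the transform flips the y axis.

VkViewport flippedViewport = {
    .x = 0.0f,
    .y = 720.0f,         /* lower left corner of the viewport */
    .width = 1280.0f,
    .height = -720.0f,   /* negative height negates y in clip space */
    .minDepth = 0.0f,
    .maxDepth = 1.0f,
};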
The width and height of the implementation-dependent maximum viewport dimensions must be greater than or equal to the width and height of the largest image which can be created and attached to a framebuffer.
The floating-point viewport bounds are represented with an implementation-dependent precision.
26. Rasterization
Rasterization is the process by which a primitive is converted to a two-dimensional image. Each point of this image contains associated data such as depth, color, or other attributes.
Rasterizing a primitive begins by determining which squares of an integer grid in framebuffer coordinates are occupied by the primitive, and assigning one or more depth values to each such square. This process is described below for points, lines, and polygons.
A grid square, including its (x,y) framebuffer coordinates, z (depth), and associated data added by fragment shaders, is called a fragment. A fragment is located by its upper left corner, which lies on integer grid coordinates.
Rasterization operations also refer to a fragment’s sample locations, which are offset by fractional values from its upper left corner. The rasterization rules for points, lines, and triangles involve testing whether each sample location is inside the primitive. Fragments need not actually be square, and rasterization rules are not affected by the aspect ratio of fragments. Display of non-square grids, however, will cause rasterized points and line segments to appear fatter in one direction than the other.
We assume that fragments are square, since it simplifies antialiasing and texturing. After rasterization, fragments are processed by the early per-fragment tests, if enabled.
Several factors affect rasterization, including the members of
VkPipelineRasterizationStateCreateInfo
and
VkPipelineMultisampleStateCreateInfo
.
The VkPipelineRasterizationStateCreateInfo
structure is defined as:
typedef struct VkPipelineRasterizationStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineRasterizationStateCreateFlags flags;
VkBool32 depthClampEnable;
VkBool32 rasterizerDiscardEnable;
VkPolygonMode polygonMode;
VkCullModeFlags cullMode;
VkFrontFace frontFace;
VkBool32 depthBiasEnable;
float depthBiasConstantFactor;
float depthBiasClamp;
float depthBiasSlopeFactor;
float lineWidth;
} VkPipelineRasterizationStateCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- depthClampEnable controls whether to clamp the fragment’s depth values instead of clipping primitives to the z planes of the frustum, as described in Primitive Clipping.
- rasterizerDiscardEnable controls whether primitives are discarded immediately before the rasterization stage.
- polygonMode is the triangle rendering mode. See VkPolygonMode.
- cullMode is the triangle facing direction used for primitive culling. See VkCullModeFlagBits.
- frontFace is a VkFrontFace value specifying the front-facing triangle orientation to be used for culling.
- depthBiasEnable controls whether to bias fragment depth values.
- depthBiasConstantFactor is a scalar factor controlling the constant depth value added to each fragment.
- depthBiasClamp is the maximum (or minimum) depth bias of a fragment.
- depthBiasSlopeFactor is a scalar factor applied to a fragment’s slope in depth bias calculations.
- lineWidth is the width of rasterized line segments.
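As an informative example, the following sketch fills VkPipelineRasterizationStateCreateInfo for common fill-mode rendering with back-face culling and no depth bias; the chosen values are placeholders, not required settings.

VkPipelineRasterizationStateCreateInfo rasterizationState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
    .pNext = NULL,
    .flags = 0,
    .depthClampEnable = VK_FALSE,
    .rasterizerDiscardEnable = VK_FALSE,
    .polygonMode = VK_POLYGON_MODE_FILL,
    .cullMode = VK_CULL_MODE_BACK_BIT,
    .frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE,
    .depthBiasEnable = VK_FALSE,
    .depthBiasConstantFactor = 0.0f,
    .depthBiasClamp = 0.0f,
    .depthBiasSlopeFactor = 0.0f,
    .lineWidth = 1.0f,
};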
The application can also add a
VkPipelineRasterizationStateRasterizationOrderAMD
structure to the
pNext
chain of a VkPipelineRasterizationStateCreateInfo
structure.
This structure enables selecting the rasterization order to use when
rendering with the corresponding graphics pipeline as described in
Rasterization Order.
typedef VkFlags VkPipelineRasterizationStateCreateFlags;
VkPipelineRasterizationStateCreateFlags
is a bitmask type for setting
a mask, but is currently reserved for future use.
The VkPipelineMultisampleStateCreateInfo
structure is defined as:
typedef struct VkPipelineMultisampleStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineMultisampleStateCreateFlags flags;
VkSampleCountFlagBits rasterizationSamples;
VkBool32 sampleShadingEnable;
float minSampleShading;
const VkSampleMask* pSampleMask;
VkBool32 alphaToCoverageEnable;
VkBool32 alphaToOneEnable;
} VkPipelineMultisampleStateCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- rasterizationSamples is a VkSampleCountFlagBits specifying the number of samples used in rasterization.
- sampleShadingEnable can be used to enable Sample Shading.
- minSampleShading specifies a minimum fraction of sample shading if sampleShadingEnable is set to VK_TRUE.
- pSampleMask is a bitmask of static coverage information that is ANDed with the coverage information generated during rasterization, as described in Sample Mask.
- alphaToCoverageEnable controls whether a temporary coverage value is generated based on the alpha component of the fragment’s first color output as specified in the Multisample Coverage section.
- alphaToOneEnable controls whether the alpha component of the fragment’s first color output is replaced with one as described in Multisample Coverage.
typedef VkFlags VkPipelineMultisampleStateCreateFlags;
VkPipelineMultisampleStateCreateFlags
is a bitmask type for setting a
mask, but is currently reserved for future use.
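As an informative example, the following sketch fills VkPipelineMultisampleStateCreateInfo for 4× multisampling with sample shading disabled; a NULL pSampleMask leaves all coverage bits enabled.

VkPipelineMultisampleStateCreateInfo multisampleState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
    .pNext = NULL,
    .flags = 0,
    .rasterizationSamples = VK_SAMPLE_COUNT_4_BIT,
    .sampleShadingEnable = VK_FALSE,
    .minSampleShading = 0.0f,
    .pSampleMask = NULL,          /* all samples enabled */
    .alphaToCoverageEnable = VK_FALSE,
    .alphaToOneEnable = VK_FALSE,
};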
Rasterization only generates fragments which cover one or more pixels inside the framebuffer. Pixels outside the framebuffer are never considered covered in the fragment. Fragments which would be produced by application of any of the primitive rasterization rules described below but which lie outside the framebuffer are not produced, nor are they processed by any later stage of the pipeline, including any of the early per-fragment tests described in Early Per-Fragment Tests.
Surviving fragments are processed by fragment shaders. Fragment shaders determine associated data for fragments, and can also modify or replace their assigned depth values.
When the VK_AMD_mixed_attachment_samples
and
VK_NV_framebuffer_mixed_samples
extensions are not enabled, if the subpass
for which this pipeline is being created uses color and/or depth/stencil
attachments, then rasterizationSamples
must be the same as the sample
count for those subpass attachments.
When the VK_AMD_mixed_attachment_samples
extension is enabled, if the
subpass for which this pipeline is being created uses color and/or
depth/stencil attachments, then rasterizationSamples
must be the same
as the maximum of the sample counts of those subpass attachments.
When the VK_NV_framebuffer_mixed_samples
extension is enabled,
rasterizationSamples
must match the sample count of the depth/stencil
attachment if present, otherwise must be greater than or equal to the
sample count of the color attachments, if present.
If the subpass for which this pipeline is being created does not use color
or depth/stencil attachments, rasterizationSamples
must follow the
rules for a zero-attachment subpass.
26.1. Discarding Primitives Before Rasterization
Primitives are discarded before rasterization if the
rasterizerDiscardEnable
member of
VkPipelineRasterizationStateCreateInfo is enabled.
When enabled, primitives are discarded after they are processed by the last
active shader stage in the pipeline before rasterization.
26.2. Controlling the Vertex Stream Used for Rasterization
By default vertex data output from the last vertex processing stage are
directed to vertex stream zero.
Geometry shaders can emit primitives to multiple independent vertex
streams.
Each vertex emitted by the geometry shader is directed at one of the vertex
streams.
As vertices are received on each vertex stream, they are arranged into
primitives of the type specified by the geometry shader output primitive
type.
The shading language instructions OpEndPrimitive
and
OpEndStreamPrimitive
can be used to end the primitive being assembled
on a given vertex stream and start a new empty primitive of the same type.
An implementation supports up to
VkPhysicalDeviceTransformFeedbackPropertiesEXT
::maxTransformFeedbackStreams
streams, which is at least 1.
The individual streams are numbered 0 through
maxTransformFeedbackStreams
minus 1.
There is no requirement on the order of the streams to which vertices are
emitted, and the number of vertices emitted to each vertex stream can be
completely independent, subject only to the
VkPhysicalDeviceTransformFeedbackPropertiesEXT
::maxTransformFeedbackStreamDataSize
and
VkPhysicalDeviceTransformFeedbackPropertiesEXT
::maxTransformFeedbackBufferDataSize
limits.
The primitives output from all vertex streams are passed to the transform
feedback stage to be captured to transform feedback buffers in the manner
specified by the last vertex processing stage shader’s XfbBuffer
,
XfbStride
, and Offsets
decorations on the output interface
variables in the graphics pipeline.
To use a vertex stream other than zero, or to use multiple streams, the
GeometryStreams
capability must be specified.
By default, the primitives output from vertex stream zero are rasterized.
If the implementation supports the
VkPhysicalDeviceTransformFeedbackPropertiesEXT::transformFeedbackRasterizationStreamSelect
property it is possible to rasterize a vertex stream other than zero.
By default, geometry shaders that emit vertices to multiple vertex streams
are limited to using only the OutputPoints
output primitive type.
If the implementation supports the
VkPhysicalDeviceTransformFeedbackPropertiesEXT::transformFeedbackStreamsLinesTriangles
property it is possible to emit OutputLineStrip
or
OutputTriangleStrip
in addition to OutputPoints
.
The vertex stream used for rasterization is specified by adding a
VkPipelineRasterizationStateStreamCreateInfoEXT
structure to the
pNext
chain of a VkPipelineRasterizationStateCreateInfo
structure.
The VkPipelineRasterizationStateStreamCreateInfoEXT
structure is
defined as:
typedef struct VkPipelineRasterizationStateStreamCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkPipelineRasterizationStateStreamCreateFlagsEXT flags;
uint32_t rasterizationStream;
} VkPipelineRasterizationStateStreamCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- rasterizationStream is the vertex stream selected for rasterization.
If this structure is not present, rasterizationStream
is assumed to be
zero.
typedef VkFlags VkPipelineRasterizationStateStreamCreateFlagsEXT;
VkPipelineRasterizationStateStreamCreateFlagsEXT
is a bitmask type for
setting a mask, but is currently reserved for future use.
26.3. Rasterization Order
Within a subpass of a render pass instance, for a given (x,y,layer,sample) sample location, the following operations are guaranteed to execute in rasterization order, for each separate primitive that includes that sample location:
Each of these operations is atomically executed for each primitive and sample location.
Execution of these operations for each primitive in a subpass occurs in an order determined by the application.
The rasterization order to use for a graphics pipeline is specified by
adding a VkPipelineRasterizationStateRasterizationOrderAMD
structure
to the pNext
chain of a VkPipelineRasterizationStateCreateInfo
structure.
The VkPipelineRasterizationStateRasterizationOrderAMD
structure is
defined as:
typedef struct VkPipelineRasterizationStateRasterizationOrderAMD {
VkStructureType sType;
const void* pNext;
VkRasterizationOrderAMD rasterizationOrder;
} VkPipelineRasterizationStateRasterizationOrderAMD;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- rasterizationOrder is a VkRasterizationOrderAMD value specifying the primitive rasterization order to use.
If the VK_AMD_rasterization_order
device extension is not enabled or
the application does not request a particular rasterization order through
specifying a VkPipelineRasterizationStateRasterizationOrderAMD
structure then the rasterization order used by the graphics pipeline
defaults to VK_RASTERIZATION_ORDER_STRICT_AMD
.
Possible values of
VkPipelineRasterizationStateRasterizationOrderAMD::rasterizationOrder
,
specifying the primitive rasterization order, are:
typedef enum VkRasterizationOrderAMD {
VK_RASTERIZATION_ORDER_STRICT_AMD = 0,
VK_RASTERIZATION_ORDER_RELAXED_AMD = 1,
} VkRasterizationOrderAMD;
- VK_RASTERIZATION_ORDER_STRICT_AMD specifies that operations for each primitive in a subpass must occur in primitive order.
- VK_RASTERIZATION_ORDER_RELAXED_AMD specifies that operations for each primitive in a subpass may not occur in primitive order.
26.4. Multisampling
Multisampling is a mechanism to antialias all Vulkan primitives: points, lines, and polygons. The technique is to sample all primitives multiple times at each pixel. Each sample in each framebuffer attachment has storage for a color, depth, and/or stencil value, such that per-fragment operations apply to each sample independently. The color sample values can be later resolved to a single color (see Resolving Multisample Images and the Render Pass chapter for more details on how to resolve multisample images to non-multisample images).
Vulkan defines rasterization rules for single-sample modes in a way that is equivalent to a multisample mode with a single sample in the center of each fragment.
Each fragment includes a coverage value with rasterizationSamples
bits
(see Sample Mask).
Each fragment includes rasterizationSamples
depth values and sets of
associated data.
An implementation may choose to assign the same associated data to more
than one sample.
The location for evaluating such associated data may be anywhere within the
fragment area including the fragment’s center location (xf,yf) or
any of the sample locations.
When rasterizationSamples
is VK_SAMPLE_COUNT_1_BIT
, the
fragment’s center location must be used.
The different associated data values need not all be evaluated at the same
location.
Each fragment thus consists of integer x and y grid coordinates,
rasterizationSamples
depth values and sets of associated data, and a
coverage value with rasterizationSamples
bits.
It is understood that each pixel has rasterizationSamples
locations
associated with it.
These locations are exact positions, rather than regions or areas, and each
is referred to as a sample point.
The sample points associated with a pixel must be located inside or on the
boundary of the unit square that is considered to bound the pixel.
Furthermore, the relative locations of sample points may be identical for
each pixel in the framebuffer, or they may differ.
If the render pass has a fragment density map attachment, each fragment only
has rasterizationSamples
locations associated with it regardless of
how many pixels are covered in the fragment area.
Fragment sample locations are defined as if the fragment had an area of
(1,1) and its sample points must be located within these bounds.
Their actual location in the framebuffer is calculated by scaling the sample
location by the fragment area.
Attachments with storage for multiple samples per pixel are located at the
pixel sample locations.
Otherwise, the fragment’s sample locations are generally used for evaluation
of associated data and fragment operations.
If the current pipeline includes a fragment shader with one or more
variables in its interface decorated with Sample
and Input
, the
data associated with those variables will be assigned independently for each
sample.
The values for each sample must be evaluated at the location of the sample.
The data associated with any other variables not decorated with Sample
and Input
need not be evaluated independently for each sample.
If the standardSampleLocations
member of VkPhysicalDeviceLimits
is VK_TRUE
, then the sample counts VK_SAMPLE_COUNT_1_BIT
,
VK_SAMPLE_COUNT_2_BIT
, VK_SAMPLE_COUNT_4_BIT
,
VK_SAMPLE_COUNT_8_BIT
, and VK_SAMPLE_COUNT_16_BIT
have sample
locations as listed in the following table, with the ith entry in
the table corresponding to bit i in the sample masks.
VK_SAMPLE_COUNT_32_BIT
and VK_SAMPLE_COUNT_64_BIT
do not have
standard sample locations.
Locations are defined relative to an origin in the upper left corner of the
fragment.
Sample Count | Sample 0 Location |
---|---|
VK_SAMPLE_COUNT_1_BIT | (0.5, 0.5) |
VK_SAMPLE_COUNT_2_BIT | (0.75, 0.75) |
VK_SAMPLE_COUNT_4_BIT | (0.375, 0.125) |
VK_SAMPLE_COUNT_8_BIT | (0.5625, 0.3125) |
VK_SAMPLE_COUNT_16_BIT | (0.5625, 0.5625) |
Color images created with multiple samples per pixel use a compression
technique where there are two arrays of data associated with each pixel.
The first array contains one element per sample where each element stores an
index to the second array defining the fragment mask of the pixel.
The second array contains one element per color fragment and each element
stores a unique color value in the format of the image.
With this compression technique it is not always necessary to actually use
unique storage locations for each color sample: when multiple samples share
the same color value the fragment mask may have two samples referring to
the same color fragment.
The number of color fragments is determined by the samples
member of
the VkImageCreateInfo structure used to create the image.
The VK_AMD_shader_fragment_mask
device extension provides shader
instructions enabling the application to get direct access to the fragment
mask and the individual color fragment values.
26.5. Custom Sample Locations
Applications can also control the sample locations used for rasterization.
If the pNext
chain of the VkPipelineMultisampleStateCreateInfo
structure specified at pipeline creation time includes an instance of the
VkPipelineSampleLocationsStateCreateInfoEXT
structure, then that
structure controls the sample locations used when rasterizing primitives
with the pipeline.
The VkPipelineSampleLocationsStateCreateInfoEXT
structure is defined
as:
typedef struct VkPipelineSampleLocationsStateCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkBool32 sampleLocationsEnable;
VkSampleLocationsInfoEXT sampleLocationsInfo;
} VkPipelineSampleLocationsStateCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- sampleLocationsEnable controls whether custom sample locations are used. If sampleLocationsEnable is VK_FALSE, the default sample locations are used and the values specified in sampleLocationsInfo are ignored.
- sampleLocationsInfo is the sample locations to use during rasterization if sampleLocationsEnable is VK_TRUE and the graphics pipeline is not created with VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT.
The VkSampleLocationsInfoEXT
structure is defined as:
typedef struct VkSampleLocationsInfoEXT {
VkStructureType sType;
const void* pNext;
VkSampleCountFlagBits sampleLocationsPerPixel;
VkExtent2D sampleLocationGridSize;
uint32_t sampleLocationsCount;
const VkSampleLocationEXT* pSampleLocations;
} VkSampleLocationsInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- sampleLocationsPerPixel is a VkSampleCountFlagBits specifying the number of sample locations per pixel.
- sampleLocationGridSize is the size of the sample location grid to select custom sample locations for.
- sampleLocationsCount is the number of sample locations in pSampleLocations.
- pSampleLocations is an array of sampleLocationsCount VkSampleLocationEXT structures.
This structure can be used either to specify the sample locations to be
used for rendering or to specify the set of sample locations an image
subresource has been last rendered with for the purposes of layout
transitions of depth/stencil images created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
.
The sample locations in pSampleLocations
specify
sampleLocationsPerPixel
number of sample locations for each pixel in
the grid of the size specified in sampleLocationGridSize
.
The sample location for sample i at the pixel grid location
(x,y) is taken from pSampleLocations
[(x + y *
sampleLocationGridSize.width
) * sampleLocationsPerPixel
+ i].
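As an informative illustration, the following C sketch evaluates this indexing rule; the function name is hypothetical and info is assumed to point to a fully populated VkSampleLocationsInfoEXT.

static VkSampleLocationEXT customSampleLocation(
    const VkSampleLocationsInfoEXT* info,
    uint32_t x, uint32_t y, uint32_t i)
{
    /* The single-bit VkSampleCountFlagBits values used here are numerically
     * equal to the per-pixel sample count. */
    const uint32_t samplesPerPixel = (uint32_t)info->sampleLocationsPerPixel;
    const uint32_t index =
        (x + y * info->sampleLocationGridSize.width) * samplesPerPixel + i;
    return info->pSampleLocations[index];
}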
If the render pass has a fragment density map, the implementation will
choose the sample locations for the fragment and the contents of
pSampleLocations
may be ignored.
The VkSampleLocationEXT
structure is defined as:
typedef struct VkSampleLocationEXT {
float x;
float y;
} VkSampleLocationEXT;
- x is the horizontal coordinate of the sample’s location.
- y is the vertical coordinate of the sample’s location.
The domain space of the sample location coordinates has an upper-left origin within the pixel in framebuffer space.
The values specified in a VkSampleLocationEXT
structure are always
clamped to the implementation-dependent sample location coordinate range
[sampleLocationCoordinateRange
[0],sampleLocationCoordinateRange
[1]]
that can be queried by chaining the
VkPhysicalDeviceSampleLocationsPropertiesEXT structure to the
pNext
chain of VkPhysicalDeviceProperties2.
The custom sample locations used for rasterization when
VkPipelineSampleLocationsStateCreateInfoEXT
::sampleLocationsEnable
is VK_TRUE
are specified by the
VkPipelineSampleLocationsStateCreateInfoEXT
::sampleLocationsInfo
property of the bound graphics pipeline, if the pipeline was not created
with VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT
enabled.
Otherwise, the sample locations used for rasterization are set by calling
vkCmdSetSampleLocationsEXT
:
void vkCmdSetSampleLocationsEXT(
VkCommandBuffer commandBuffer,
const VkSampleLocationsInfoEXT* pSampleLocationsInfo);
- commandBuffer is the command buffer into which the command will be recorded.
- pSampleLocationsInfo is the sample locations state to set.
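As an informative example, the following sketch sets a two-sample custom pattern dynamically; it assumes the VK_EXT_sample_locations extension is enabled, the bound pipeline was created with VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT and VK_SAMPLE_COUNT_2_BIT rasterization samples, and the extension entry point has been loaded. The location values are placeholders.

VkSampleLocationEXT locations[2] = {
    { 0.25f, 0.25f },
    { 0.75f, 0.75f },
};

VkSampleLocationsInfoEXT sampleLocationsInfo = {
    .sType = VK_STRUCTURE_TYPE_SAMPLE_LOCATIONS_INFO_EXT,
    .pNext = NULL,
    .sampleLocationsPerPixel = VK_SAMPLE_COUNT_2_BIT,
    .sampleLocationGridSize = { 1, 1 },   /* one pattern for every pixel */
    .sampleLocationsCount = 2,
    .pSampleLocations = locations,
};

vkCmdSetSampleLocationsEXT(commandBuffer, &sampleLocationsInfo);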
26.6. Shading Rate Image
The shading rate image feature allows pipelines to use a shading rate image to control the fragment area and the minimum number of fragment shader invocations launched for each fragment. When the shading rate image is enabled, the rasterizer determines a base shading rate for each region of the framebuffer covered by a primitive by fetching a value from the shading rate image and translating it to a shading rate using a per-viewport shading rate palette. This base shading rate is then adjusted to derive a final shading rate, which specifies the fragment area and fragment shader invocation count to use for fragments generated in the region.
If the pNext
chain of VkPipelineViewportStateCreateInfo includes
a VkPipelineViewportShadingRateImageStateCreateInfoNV
structure, then
that structure includes parameters that control the shading rate.
The VkPipelineViewportShadingRateImageStateCreateInfoNV
structure is
defined as:
typedef struct VkPipelineViewportShadingRateImageStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkBool32 shadingRateImageEnable;
uint32_t viewportCount;
const VkShadingRatePaletteNV* pShadingRatePalettes;
} VkPipelineViewportShadingRateImageStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- shadingRateImageEnable specifies whether shading rate image and palettes are used during rasterization.
- viewportCount specifies the number of per-viewport palettes used to translate values stored in shading rate images.
- pShadingRatePalettes is a pointer to an array of VkShadingRatePaletteNV structures defining the palette for each viewport. If the shading rate palette state is dynamic, this member is ignored.
If this structure is not present, shadingRateImageEnable
is considered
to be VK_FALSE
, and the shading rate image and palettes are not used.
When shading rate image usage is enabled in the bound pipeline, the pipeline uses a shading rate image specified by the command:
void vkCmdBindShadingRateImageNV(
VkCommandBuffer commandBuffer,
VkImageView imageView,
VkImageLayout imageLayout);
- commandBuffer is the command buffer into which the command will be recorded.
- imageView is an image view handle that specifies the shading rate image. imageView may be set to VK_NULL_HANDLE, which is equivalent to specifying a view of an image filled with zero values.
- imageLayout is the layout that the image subresources accessible from imageView will be in when the shading rate image is accessed.
When the shading rate image is enabled in the current pipeline, rasterizing
a primitive covering the pixel with coordinates (x,y) will fetch a
shading rate index value from the shading rate image bound by
vkCmdBindShadingRateImageNV
.
If the shading rate image view has a type of VK_IMAGE_VIEW_TYPE_2D, the lookup will use texel coordinates (u,v) where \(u = \lfloor \frac{x}{t_{width}} \rfloor\), \(v = \lfloor \frac{y}{t_{height}} \rfloor\), and \(t_{width}\) and \(t_{height}\) are the width and height of the implementation-dependent shading rate texel size.
If the shading rate image view has a type of
VK_IMAGE_VIEW_TYPE_2D_ARRAY
, the lookup will use texel coordinates
(u,v) to extract a texel from the layer l, where l is the layer of
the framebuffer being rendered to.
If l is greater than or equal to the number of layers in the image view,
layer zero will be used.
If the bound shading rate image view is not VK_NULL_HANDLE and
contains a texel with coordinates (u,v) in layer l (if applicable),
the single unsigned integer component for that texel will be used as the
shading rate index.
If the (u,v) coordinate is outside the extents of the subresource used
by the shading rate image view, or if the image view is
VK_NULL_HANDLE, the shading rate index is zero.
If the shading rate image view has multiple mipmap levels, the base level
identified by VkImageSubresourceRange
::baseMipLevel
will be
used.
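As an informative illustration, the following C sketch computes the texel coordinate fetched for a given framebuffer pixel; the shadingRateTexelSize value is assumed to have been queried from the implementation (for example, through VkPhysicalDeviceShadingRateImagePropertiesNV).

static void shadingRateTexelCoord(uint32_t x, uint32_t y,
                                  VkExtent2D shadingRateTexelSize,
                                  uint32_t* u, uint32_t* v)
{
    *u = x / shadingRateTexelSize.width;    /* floor(x / t_width)  */
    *v = y / shadingRateTexelSize.height;   /* floor(y / t_height) */
}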
A shading rate index is mapped to a base shading rate using a lookup table called the shading rate image palette. There is a separate palette for each viewport. The number of entries in each palette is given by the implementation-dependent shading rate image palette size.
If a pipeline state object is created with
VK_DYNAMIC_STATE_VIEWPORT_SHADING_RATE_PALETTE_NV
enabled, the
per-viewport shading rate image palettes are set by the command:
void vkCmdSetViewportShadingRatePaletteNV(
VkCommandBuffer commandBuffer,
uint32_t firstViewport,
uint32_t viewportCount,
const VkShadingRatePaletteNV* pShadingRatePalettes);
- commandBuffer is the command buffer into which the command will be recorded.
- firstViewport is the index of the first viewport whose shading rate palette is updated by the command.
- viewportCount is the number of viewports whose shading rate palettes are updated by the command.
- pShadingRatePalettes is a pointer to an array of VkShadingRatePaletteNV structures defining the palette for each viewport.
The VkShadingRatePaletteNV structure specifies the contents of a single shading rate image palette and is defined as:
typedef struct VkShadingRatePaletteNV {
uint32_t shadingRatePaletteEntryCount;
const VkShadingRatePaletteEntryNV* pShadingRatePaletteEntries;
} VkShadingRatePaletteNV;
- shadingRatePaletteEntryCount specifies the number of entries in the shading rate image palette.
- pShadingRatePaletteEntries is a pointer to an array of VkShadingRatePaletteEntryNV enums defining the shading rate for each palette entry.
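As an informative example, the following sketch defines a two-entry palette (index 0 selects one invocation per pixel, index 1 selects one invocation per 2×2 pixels, using the palette entries defined below) and sets it for viewport zero; it assumes the bound pipeline enabled VK_DYNAMIC_STATE_VIEWPORT_SHADING_RATE_PALETTE_NV.

VkShadingRatePaletteEntryNV paletteEntries[2] = {
    VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_PIXEL_NV,
    VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X2_PIXELS_NV,
};

VkShadingRatePaletteNV palette = {
    .shadingRatePaletteEntryCount = 2,
    .pShadingRatePaletteEntries = paletteEntries,
};

vkCmdSetViewportShadingRatePaletteNV(commandBuffer, 0, 1, &palette);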
To determine the base shading rate, a shading rate index i is mapped
to array element i in the array pShadingRatePaletteEntries
for the
palette corresponding to the viewport used for the fragment.
If i is greater than or equal to the palette size
shadingRatePaletteEntryCount
, the base shading rate is undefined.
The supported shading rate image palette entries are defined by VkShadingRatePaletteEntryNV:
typedef enum VkShadingRatePaletteEntryNV {
VK_SHADING_RATE_PALETTE_ENTRY_NO_INVOCATIONS_NV = 0,
VK_SHADING_RATE_PALETTE_ENTRY_16_INVOCATIONS_PER_PIXEL_NV = 1,
VK_SHADING_RATE_PALETTE_ENTRY_8_INVOCATIONS_PER_PIXEL_NV = 2,
VK_SHADING_RATE_PALETTE_ENTRY_4_INVOCATIONS_PER_PIXEL_NV = 3,
VK_SHADING_RATE_PALETTE_ENTRY_2_INVOCATIONS_PER_PIXEL_NV = 4,
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_PIXEL_NV = 5,
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X1_PIXELS_NV = 6,
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_1X2_PIXELS_NV = 7,
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X2_PIXELS_NV = 8,
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X2_PIXELS_NV = 9,
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X4_PIXELS_NV = 10,
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X4_PIXELS_NV = 11,
} VkShadingRatePaletteEntryNV;
The following table indicates the width and height (in pixels) of each
fragment generated using the indicated shading rate, as well as the maximum
number of fragment shader invocations launched for each fragment.
When processing regions of a primitive that have a shading rate of
VK_SHADING_RATE_PALETTE_ENTRY_NO_INVOCATIONS_NV
, no fragments will be
generated in that region.
Shading Rate | Width | Height | Invocations |
---|---|---|---|
VK_SHADING_RATE_PALETTE_ENTRY_NO_INVOCATIONS_NV | 0 | 0 | 0 |
VK_SHADING_RATE_PALETTE_ENTRY_16_INVOCATIONS_PER_PIXEL_NV | 1 | 1 | 16 |
VK_SHADING_RATE_PALETTE_ENTRY_8_INVOCATIONS_PER_PIXEL_NV | 1 | 1 | 8 |
VK_SHADING_RATE_PALETTE_ENTRY_4_INVOCATIONS_PER_PIXEL_NV | 1 | 1 | 4 |
VK_SHADING_RATE_PALETTE_ENTRY_2_INVOCATIONS_PER_PIXEL_NV | 1 | 1 | 2 |
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_PIXEL_NV | 1 | 1 | 1 |
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X1_PIXELS_NV | 2 | 1 | 1 |
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_1X2_PIXELS_NV | 1 | 2 | 1 |
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X2_PIXELS_NV | 2 | 2 | 1 |
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X2_PIXELS_NV | 4 | 2 | 1 |
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X4_PIXELS_NV | 2 | 4 | 1 |
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X4_PIXELS_NV | 4 | 4 | 1 |
When the shading rate image is disabled, a shading rate of
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_PIXEL_NV
will be used
as the base shading rate.
Once a base shading rate has been established, it is adjusted to produce a final shading rate. First, if the base shading rate uses multiple pixels for each fragment, the implementation may reduce the fragment area to ensure that the total number of coverage samples for all pixels in a fragment does not exceed an implementation-dependent maximum.
If sample shading is active in the current pipeline and would result in processing n (n > 1) unique samples per fragment when the shading rate image is disabled, the shading rate is adjusted in an implementation-dependent manner to increase the number of fragment shader invocations spawned by the primitive. If the shading rate indicates fs pixels per fragment and fs is greater than n, the fragment area is adjusted so each fragment has approximately \(fs \over n\) pixels. Otherwise, if the shading rate indicates ipf invocations per fragment, the fragment area will be adjusted to a single pixel with approximately \(ipf \times n \over fs\) invocations per fragment.
If sample shading occurs due to the use of a fragment shader input variable
decorated with SampleId
or SamplePosition
, the shading rate is
ignored.
Each fragment will have a single pixel and will spawn up to
totalSamples
fragment shader invocations, as when using
sample shading without a shading rate image.
Finally, if the shading rate specifies multiple fragment shader invocations
per fragment, the total number of invocations in the shading rate is clamped
to be no larger than the value of totalSamples
used for
sample shading.
When the final shading rate for a primitive covering pixel (x,y) has a fragment area of \(fw \times fh\), the fragment for that pixel will cover all pixels with coordinates (x',y') that satisfy the equations:
This combined fragment is considered to have multiple coverage samples; the
total number of samples in this fragment is given by \(samples = fw
\times fh \times rs\) where rs indicates the value of
VkPipelineMultisampleStateCreateInfo
::rasterizationSamples
specified at pipeline creation time.
The set of coverage samples in the fragment is the union of the per-pixel coverage samples in each of the fragment’s pixels. The location and order of coverage samples within each pixel in the combined fragment are assigned as described in Multisampling and Custom Sample Locations.
Each coverage sample in the set of pixels belonging to the combined fragment
is assigned a unique sample number in the range [0,samples-1].
If the
shadingRateCoarseSampleOrder
feature is supported, the order of coverage samples can be specified for
each combination of fragment area and coverage sample count.
If this feature is not supported, the sample order is
implementation-dependent.
If the pNext
chain of VkPipelineViewportStateCreateInfo includes
a VkPipelineViewportCoarseSampleOrderStateCreateInfoNV
structure, then
that structure includes parameters that control the order of coverage
samples in fragments larger than one pixel.
The VkPipelineViewportCoarseSampleOrderStateCreateInfoNV
structure is
defined as:
typedef struct VkPipelineViewportCoarseSampleOrderStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkCoarseSampleOrderTypeNV sampleOrderType;
uint32_t customSampleOrderCount;
const VkCoarseSampleOrderCustomNV* pCustomSampleOrders;
} VkPipelineViewportCoarseSampleOrderStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- sampleOrderType specifies the mechanism used to order coverage samples in fragments larger than one pixel.
- customSampleOrderCount specifies the number of custom sample orderings to use when ordering coverage samples.
- pCustomSampleOrders is a pointer to an array of VkCoarseSampleOrderCustomNV structures, each of which specifies the coverage sample order for a single combination of fragment area and coverage sample count.
If this structure is not present, sampleOrderType
is considered to be
VK_COARSE_SAMPLE_ORDER_TYPE_DEFAULT_NV
.
If sampleOrderType
is VK_COARSE_SAMPLE_ORDER_TYPE_CUSTOM_NV
, the
coverage sample order used for any combination of fragment area and coverage
sample count not enumerated in pCustomSampleOrders
will be identical
to that used for VK_COARSE_SAMPLE_ORDER_TYPE_DEFAULT_NV
.
If the pipeline was created with
VK_DYNAMIC_STATE_VIEWPORT_COARSE_SAMPLE_ORDER_NV
, the contents of this
structure (if present) are ignored, and the coverage sample order is instead
specified by vkCmdSetCoarseSampleOrderNV.
The type VkCoarseSampleOrderTypeNV specifies the technique used to order coverage samples in fragments larger than one pixel, and is defined as:
typedef enum VkCoarseSampleOrderTypeNV {
VK_COARSE_SAMPLE_ORDER_TYPE_DEFAULT_NV = 0,
VK_COARSE_SAMPLE_ORDER_TYPE_CUSTOM_NV = 1,
VK_COARSE_SAMPLE_ORDER_TYPE_PIXEL_MAJOR_NV = 2,
VK_COARSE_SAMPLE_ORDER_TYPE_SAMPLE_MAJOR_NV = 3,
} VkCoarseSampleOrderTypeNV;
- VK_COARSE_SAMPLE_ORDER_TYPE_DEFAULT_NV specifies that coverage samples will be ordered in an implementation-dependent manner.
- VK_COARSE_SAMPLE_ORDER_TYPE_CUSTOM_NV specifies that coverage samples will be ordered according to the array of custom orderings provided in either the pCustomSampleOrders member of VkPipelineViewportCoarseSampleOrderStateCreateInfoNV or the pCustomSampleOrders member of vkCmdSetCoarseSampleOrderNV.
- VK_COARSE_SAMPLE_ORDER_TYPE_PIXEL_MAJOR_NV specifies that coverage samples will be ordered sequentially, sorted first by pixel coordinate (in row-major order) and then by coverage sample number.
- VK_COARSE_SAMPLE_ORDER_TYPE_SAMPLE_MAJOR_NV specifies that coverage samples will be ordered sequentially, sorted first by coverage sample number and then by pixel coordinate (in row-major order).
When using a coarse sample order of VK_COARSE_SAMPLE_ORDER_TYPE_PIXEL_MAJOR_NV for a fragment with an upper-left corner of \((fx,fy)\), a size of \(fw \times fh\) pixels, and \(fsc\) coverage samples per pixel, sample \(cs\) of the fragment will be assigned to sample \(fs\) of pixel \((px,py)\) as follows:

- \(px = fx + (\lfloor cs / fsc \rfloor \bmod fw)\)
- \(py = fy + \lfloor cs / (fsc \times fw) \rfloor\)
- \(fs = cs \bmod fsc\)

When using a coarse sample order of VK_COARSE_SAMPLE_ORDER_TYPE_SAMPLE_MAJOR_NV, sample \(cs\) will be assigned as follows:

- \(px = fx + (cs \bmod fw)\)
- \(py = fy + (\lfloor cs / fw \rfloor \bmod fh)\)
- \(fs = \lfloor cs / (fw \times fh) \rfloor\)
The VkCoarseSampleOrderCustomNV
structure is used with a coverage
sample ordering type of VK_COARSE_SAMPLE_ORDER_TYPE_CUSTOM_NV
to
specify the order of coverage samples for one combination of fragment width,
fragment height, and coverage sample count.
The structure is defined as:
typedef struct VkCoarseSampleOrderCustomNV {
VkShadingRatePaletteEntryNV shadingRate;
uint32_t sampleCount;
uint32_t sampleLocationCount;
const VkCoarseSampleLocationNV* pSampleLocations;
} VkCoarseSampleOrderCustomNV;
- shadingRate is a shading rate palette entry that identifies the fragment width and height for the combination of fragment area and per-pixel coverage sample count to control.
- sampleCount identifies the per-pixel coverage sample count for the combination of fragment area and coverage sample count to control.
- sampleLocationCount specifies the number of sample locations in the custom ordering.
- pSampleLocations is a pointer to an array of VkCoarseSampleLocationNV structures that specifies the location of each sample in the custom ordering.
When using a custom sample ordering, element i in pSampleLocations
specifies a specific pixel and per-pixel coverage sample number that
corresponds to the coverage sample numbered i in the multi-pixel fragment.
The VkCoarseSampleLocationNV
structure identifies a specific pixel and
sample number for one of the coverage samples in a fragment that is larger
than one pixel.
This structure is defined as:
typedef struct VkCoarseSampleLocationNV {
uint32_t pixelX;
uint32_t pixelY;
uint32_t sample;
} VkCoarseSampleLocationNV;
- pixelX is added to the x coordinate of the upper-leftmost pixel of each fragment to identify the pixel containing the coverage sample.
- pixelY is added to the y coordinate of the upper-leftmost pixel of each fragment to identify the pixel containing the coverage sample.
- sample is the number of the coverage sample in the pixel identified by pixelX and pixelY.
If a pipeline state object is created with
VK_DYNAMIC_STATE_VIEWPORT_COARSE_SAMPLE_ORDER_NV
enabled, the order of
coverage samples in fragments larger than one pixel is set by the command:
void vkCmdSetCoarseSampleOrderNV(
VkCommandBuffer commandBuffer,
VkCoarseSampleOrderTypeNV sampleOrderType,
uint32_t customSampleOrderCount,
const VkCoarseSampleOrderCustomNV* pCustomSampleOrders);
- commandBuffer is the command buffer into which the command will be recorded.
- sampleOrderType specifies the mechanism used to order coverage samples in fragments larger than one pixel.
- customSampleOrderCount specifies the number of custom sample orderings to use when ordering coverage samples.
- pCustomSampleOrders is a pointer to an array of VkCoarseSampleOrderCustomNV structures, each of which specifies the coverage sample order for a single combination of fragment area and coverage sample count.
If sampleOrderType
is VK_COARSE_SAMPLE_ORDER_TYPE_CUSTOM_NV
, the
coverage sample order used for any combination of fragment area and coverage
sample count not enumerated in pCustomSampleOrders
will be identical
to that used for VK_COARSE_SAMPLE_ORDER_TYPE_DEFAULT_NV
.
If the final shading rate for a primitive covering pixel (x,y) results in n invocations per pixel (n > 1), n separate fragment shader invocations will be generated for the fragment. Each coverage sample in the fragment will be assigned to one of the n fragment shader invocations in an implementation-dependent manner. The outputs from the fragment output interface of each shader invocation will be broadcast to all of the framebuffer samples associated with the invocation. If none of the coverage samples associated with a fragment shader invocation is covered by a primitive, the implementation may discard the fragment shader invocation for those samples.
If the final shading rate for a primitive covering pixel (x,y) results in a fragment containing multiple pixels, a single set of fragment shader invocations will be generated for all pixels in the combined fragment. Outputs from the fragment output interface will be broadcast to all covered framebuffer samples belonging to the fragment. If the fragment shader executes code discarding the fragment, none of the samples of the fragment will be updated.
26.7. Sample Shading
Sample shading can be used to specify a minimum number of unique samples to
process for each fragment.
If sample shading is enabled an implementation must provide a minimum of
max(⌈ minSampleShadingFactor
× totalSamples
⌉, 1) unique associated data for each fragment, where
minSampleShadingFactor
is the minimum fraction of sample shading.
If the VK_AMD_mixed_attachment_samples
extension is enabled and the
subpass uses color attachments, totalSamples
is the number of samples
of the color attachments.
Otherwise,
totalSamples
is the value of
VkPipelineMultisampleStateCreateInfo::rasterizationSamples
specified at pipeline creation time.
These are associated with the samples in an implementation-dependent manner.
When minSampleShadingFactor
is 1.0
, a separate set of associated
data are evaluated for each sample, and each set of values is evaluated at
the sample location.
Sample shading is enabled for a graphics pipeline:
- If the interface of the fragment shader entry point of the graphics pipeline includes an input variable decorated with SampleId or SamplePosition. In this case minSampleShadingFactor takes the value 1.0.
- Else if the sampleShadingEnable member of the VkPipelineMultisampleStateCreateInfo structure specified when creating the graphics pipeline is set to VK_TRUE. In this case minSampleShadingFactor takes the value of VkPipelineMultisampleStateCreateInfo::minSampleShading.
Otherwise, sample shading is considered disabled.
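As an informative illustration, the following C sketch computes the minimum number of unique sets of associated data per fragment implied by the parameters above; the function name is hypothetical.

#include <math.h>
#include <stdint.h>

static uint32_t minUniqueSamples(float minSampleShadingFactor,
                                 uint32_t totalSamples)
{
    /* max(ceil(minSampleShadingFactor x totalSamples), 1) */
    uint32_t n = (uint32_t)ceilf(minSampleShadingFactor * (float)totalSamples);
    return n > 1u ? n : 1u;
}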
26.8. Barycentric Interpolation
When the fragmentShaderBarycentric
feature is enabled, the
PerVertexNV
interpolation
decoration can be used with fragment shader inputs to indicate that the
decorated inputs do not have associated data in the fragment.
Such inputs can only be accessed in a fragment shader using an array index
whose value (0, 1, or 2) identifies one of the vertices of the primitive
that produced the fragment.
When tessellation, geometry shading, and
mesh shading
are not active, fragment shader inputs decorated with PerVertexNV
will
take values from one of the vertices of the primitive that produced the
fragment, identified by the extra index provided in SPIR-V code accessing
the input.
If the n vertices passed to a draw call are numbered 0 through n-1, and
the point, line, and triangle primitives produced by the draw call are
numbered with consecutive integers beginning with zero, the following table
indicates the original vertex numbers used for index values of 0, 1, and 2.
If an input decorated with PerVertexNV
is accessed with any other
vertex index value, the value obtained is undefined.
Primitive Topology | Vertex 0 | Vertex 1 | Vertex 2 |
---|---|---|---|
VK_PRIMITIVE_TOPOLOGY_POINT_LIST | i | - | - |
VK_PRIMITIVE_TOPOLOGY_LINE_LIST | 2i | 2i+1 | - |
VK_PRIMITIVE_TOPOLOGY_LINE_STRIP | i | i+1 | - |
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST | 3i | 3i+1 | 3i+2 |
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP (even) | i | i+1 | i+2 |
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP (odd) | i | i+2 | i+1 |
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN | i+1 | i+2 | 0 |
VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY | 4i+1 | 4i+2 | - |
VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY | i+1 | i+2 | - |
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY | 6i | 6i+2 | 6i+4 |
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY (even) | 2i | 2i+2 | 2i+4 |
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY (odd) | 2i | 2i+4 | 2i+2 |
When geometry
or mesh
shading is active, primitives processed by fragment shaders are assembled
from the vertices emitted by the geometry
or mesh
shader.
In this case, the vertices used for fragment shader inputs decorated with
PerVertexNV
are derived by treating the primitives produced by the
shader as though they were specified by a draw call and consulting
the table above.
When using tessellation without geometry shading, the tessellator produces
primitives in an implementation-dependent manner.
While there is no defined vertex ordering for inputs decorated with PerVertexNV, the vertex ordering used in this case will be consistent with the ordering used to derive the values of inputs decorated with BaryCoordNV or BaryCoordNoPerspNV.
Fragment shader inputs decorated with BaryCoordNV
or
BaryCoordNoPerspNV
hold three-component vectors with barycentric
weights that indicate the location of the fragment relative to the
screen-space locations of vertices of its primitive.
For point primitives, such variables are always assigned the value (1,0,0).
For line primitives, the built-ins are obtained by
interpolating an attribute whose values for the vertices numbered 0 and 1
are (1,0,0) and (0,1,0), respectively.
For polygon primitives, the built-ins are
obtained by interpolating an attribute whose values for the vertices
numbered 0, 1, and 2 are (1,0,0), (0,1,0), and (0,0,1), respectively.
For BaryCoordNV
, the values are obtained using perspective
interpolation.
For BaryCoordNoPerspNV
, the values are obtained using linear
interpolation.
26.9. Points
A point is drawn by generating a set of fragments in the shape of a square
centered around the vertex of the point.
Each vertex has an associated point size that controls the width/height of
that square.
The point size is taken from the (potentially clipped) shader built-in
PointSize
written by:
- the geometry shader, if active;
- the tessellation evaluation shader, if active and no geometry shader is active;
- the vertex shader, otherwise,

and clamped to the implementation-dependent point size range [pointSizeRange[0], pointSizeRange[1]].
The value written to PointSize
must be greater than zero.
Not all point sizes need be supported, but the size 1.0 must be supported.
The range of supported sizes and the size of evenly-spaced gradations within
that range are implementation-dependent.
The range and gradations are obtained from the pointSizeRange
and
pointSizeGranularity
members of VkPhysicalDeviceLimits.
If, for instance, the size range is from 0.1 to 2.0 and the gradation size
is 0.1, then the sizes 0.1, 0.2, …, 1.9, 2.0 are supported.
Additional point sizes may also be supported.
There is no requirement that these sizes be equally spaced.
If an unsupported size is requested, the nearest supported size is used
instead.
Further, if the render pass has a fragment density map attachment, point size may be rounded by the implementation to a multiple of the fragment’s width or height.
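As an illustration of the clamping and granularity rules above, the following C sketch selects a point size that an implementation is guaranteed to support, under the non-normative assumption that the guaranteed sizes are exactly the evenly spaced gradations starting at pointSizeRange[0]; real implementations may support additional sizes.

#include <vulkan/vulkan.h>
#include <math.h>

/* Non-normative sketch: snap a requested point size to a size the
 * implementation is guaranteed to support, per pointSizeRange and
 * pointSizeGranularity. Real devices may support additional sizes. */
static float clampPointSize(const VkPhysicalDeviceLimits *limits, float requested)
{
    const float lo   = limits->pointSizeRange[0];
    const float hi   = limits->pointSizeRange[1];
    const float step = limits->pointSizeGranularity;

    if (requested < lo) requested = lo;
    if (requested > hi) requested = hi;
    if (step > 0.0f) {
        /* round to the nearest gradation measured from the range start */
        requested = lo + roundf((requested - lo) / step) * step;
        if (requested > hi) requested = hi;
    }
    return requested;
}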
26.9.1. Basic Point Rasterization
Point rasterization produces a fragment for each fragment area group of
framebuffer pixels with one or more sample points that intersect a region
centered at the point’s (xf,yf).
This region is a square with side equal to the current point size.
Coverage bits that correspond to sample points that intersect the region are
1, other coverage bits are 0.
All fragments produced in rasterizing a point are assigned the same
associated data, which are those of the vertex corresponding to the point.
However, the fragment shader built-in PointCoord
contains point sprite
texture coordinates.
The s and t point sprite texture coordinates vary from zero to
one across the point horizontally left-to-right and top-to-bottom,
respectively.
The following formulas are used to evaluate s and t:
\(s = \frac{1}{2} + \frac{x_p - x_f}{\mathrm{size}}\)
\(t = \frac{1}{2} + \frac{y_p - y_f}{\mathrm{size}}\)
where size is the point’s size; (xp,yp) is the location at which the point sprite coordinates are evaluated - this may be the framebuffer coordinates of the fragment center, or the location of a sample; and (xf,yf) is the exact, unrounded framebuffer coordinate of the vertex for the point.
26.10. Line Segments
A line is drawn by generating a set of fragments overlapping a rectangle centered on the line segment. Each line segment has an associated width that controls the width of that rectangle.
The line width is specified by the
VkPipelineRasterizationStateCreateInfo::lineWidth
property of
the currently active pipeline, if the pipeline was not created with
VK_DYNAMIC_STATE_LINE_WIDTH
enabled.
Otherwise, the line width is set by calling vkCmdSetLineWidth
:
void vkCmdSetLineWidth(
VkCommandBuffer commandBuffer,
float lineWidth);
- commandBuffer is the command buffer into which the command will be recorded.
- lineWidth is the width of rasterized line segments.
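For example, when the pipeline was created with VK_DYNAMIC_STATE_LINE_WIDTH enabled, the width can be recorded per command buffer; the check against the wideLines feature below is an illustrative precaution, since widths other than 1.0 require that feature.

#include <vulkan/vulkan.h>

/* Illustrative: set a dynamic line width, falling back to 1.0 when the
 * wideLines feature is not enabled on the device. */
static void recordLineWidth(VkCommandBuffer commandBuffer,
                            const VkPhysicalDeviceFeatures *enabledFeatures,
                            float desiredWidth)
{
    float width = enabledFeatures->wideLines ? desiredWidth : 1.0f;
    vkCmdSetLineWidth(commandBuffer, width);
}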
Not all line widths need be supported for line segment rasterization, but
width 1.0 antialiased segments must be provided.
The range and gradations are obtained from the lineWidthRange
and
lineWidthGranularity
members of VkPhysicalDeviceLimits.
If, for instance, the width range is from 0.1 to 2.0 and the gradation size
is 0.1, then the widths 0.1, 0.2, …, 1.9, 2.0 are supported.
Additional line widths may also be supported.
There is no requirement that these widths be equally spaced.
If an unsupported width is requested, the nearest supported width is used
instead.
Further, if the render pass has a fragment density map attachment, line width may be rounded by the implementation to a multiple of the fragment’s width or height.
26.10.1. Basic Line Segment Rasterization
Rasterized line segments produce fragments which intersect a rectangle centered on the line segment. Two of the edges are parallel to the specified line segment; each is at a distance of one-half the current width from that segment in directions perpendicular to the direction of the line. The other two edges pass through the line endpoints and are perpendicular to the direction of the specified line segment. Coverage bits that correspond to sample points that intersect the rectangle are 1, other coverage bits are 0.
Next we specify how the data associated with each rasterized fragment are
obtained.
Let pr = (xd, yd) be the framebuffer coordinates at which
associated data are evaluated.
This may be the center of a fragment or the location of a sample within the
fragment.
When rasterizationSamples
is VK_SAMPLE_COUNT_1_BIT
, the fragment
center must be used.
Let pa = (xa, ya) and pb = (xb,yb) be
initial and final endpoints of the line segment, respectively.
Set
\(t = \frac{(\mathbf{p}_r - \mathbf{p}_a) \cdot (\mathbf{p}_b - \mathbf{p}_a)}{\lVert \mathbf{p}_b - \mathbf{p}_a \rVert^2}\)
(Note that t = 0 at pa and t = 1 at pb. Also note that this calculation projects the vector from pa to pr onto the line, and thus computes the normalized distance of the fragment along the line.)
The value of an associated datum f for the fragment, whether it be a shader output or the clip w coordinate, must be determined using perspective interpolation:
\(f = \frac{(1-t)\,f_a/w_a + t\,f_b/w_b}{(1-t)/w_a + t/w_b}\)
where fa and fb are the data associated with the starting and ending endpoints of the segment, respectively; wa and wb are the clip w coordinates of the starting and ending endpoints of the segment, respectively.
Depth values for lines must be determined using linear interpolation:
\(z = (1 - t)\, z_a + t\, z_b\)
where za and zb are the depth values of the starting and ending endpoints of the segment, respectively.
The NoPerspective
and Flat
interpolation decorations can be used
with fragment shader inputs to declare how they are interpolated.
When neither decoration is applied, perspective interpolation is performed as described above.
When the NoPerspective
decoration is used, linear interpolation is performed in the same fashion as for depth values,
as described above.
When the Flat
decoration is used, no interpolation is performed, and
outputs are taken from the corresponding input value of the
provoking vertex corresponding to that
primitive.
When the fragmentShaderBarycentric
feature is enabled, the
PerVertexNV
interpolation
decoration can also be used with fragment shader inputs which indicate
that the decorated inputs are not interpolated and can only be accessed
using an extra array dimension, where the extra index identifies one of the
vertices of the primitive that produced the fragment.
The above description documents the preferred method of line rasterization,
and must be used when the implementation advertises the strictLines
limit in VkPhysicalDeviceLimits as VK_TRUE
.
When strictLines
is VK_FALSE
, the edges of the lines are
generated as a parallelogram surrounding the original line.
The major axis is chosen by noting the axis in which there is the greatest
distance between the line start and end points.
If the difference is equal in both directions then the X axis is chosen as
the major axis.
Edges 2 and 3 are aligned to the minor axis and are centered on the
endpoints of the line as in Non strict lines, and each is
lineWidth
long.
Edges 0 and 1 are parallel to the line and connect the endpoints of edges 2
and 3.
Coverage bits that correspond to sample points that intersect the
parallelogram are 1, other coverage bits are 0.
Samples that fall exactly on the edge of the parallelogram follow the polygon rasterization rules.
Interpolation occurs as if the parallelogram was decomposed into two triangles where each pair of vertices at each end of the line has identical attributes.
26.11. Polygons
A polygon results from the decomposition of a triangle strip, triangle fan or a series of independent triangles. Like points and line segments, polygon rasterization is controlled by several variables in the VkPipelineRasterizationStateCreateInfo structure.
26.11.1. Basic Polygon Rasterization
The first step of polygon rasterization is to determine whether the triangle is back-facing or front-facing. This determination is made based on the sign of the (clipped or unclipped) polygon’s area computed in framebuffer coordinates. One way to compute this area is:
\(a = \frac{1}{2} \sum_{i=0}^{n-1} \left( x_f^i\, y_f^{i \oplus 1} - x_f^{i \oplus 1}\, y_f^i \right)\)
where \(x_f^i\) and \(y_f^i\) are the x and y framebuffer coordinates of the ith vertex of the n-vertex polygon (vertices are numbered starting at zero for the purposes of this computation) and i ⊕ 1 is (i + 1) mod n.
The interpretation of the sign of a is determined by the
VkPipelineRasterizationStateCreateInfo::frontFace
property of
the currently active pipeline.
Possible values are:
typedef enum VkFrontFace {
VK_FRONT_FACE_COUNTER_CLOCKWISE = 0,
VK_FRONT_FACE_CLOCKWISE = 1,
} VkFrontFace;
- VK_FRONT_FACE_COUNTER_CLOCKWISE specifies that a triangle with positive area is considered front-facing.
- VK_FRONT_FACE_CLOCKWISE specifies that a triangle with negative area is considered front-facing.
Any triangle which is not front-facing is back-facing, including zero-area triangles.
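The area computation and facing determination above can be sketched in C as follows; the vertex array layout and the helper name are illustrative assumptions, not part of the API.

#include <vulkan/vulkan.h>
#include <stdbool.h>

/* Illustrative: compute the signed framebuffer-space area of an n-vertex
 * polygon and decide front-facing per the pipeline's frontFace setting.
 * Zero-area polygons are back-facing under either setting. */
static bool isFrontFacing(const float (*xy)[2], uint32_t n, VkFrontFace frontFace)
{
    float area = 0.0f;
    for (uint32_t i = 0; i < n; ++i) {
        uint32_t j = (i + 1) % n;                    /* i ⊕ 1 */
        area += xy[i][0] * xy[j][1] - xy[j][0] * xy[i][1];
    }
    area *= 0.5f;

    if (frontFace == VK_FRONT_FACE_COUNTER_CLOCKWISE)
        return area > 0.0f;                          /* positive area is front-facing */
    return area < 0.0f;                              /* VK_FRONT_FACE_CLOCKWISE */
}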
Once the orientation of triangles is determined, they are culled according
to the VkPipelineRasterizationStateCreateInfo::cullMode
property
of the currently active pipeline.
Possible values are:
typedef enum VkCullModeFlagBits {
VK_CULL_MODE_NONE = 0,
VK_CULL_MODE_FRONT_BIT = 0x00000001,
VK_CULL_MODE_BACK_BIT = 0x00000002,
VK_CULL_MODE_FRONT_AND_BACK = 0x00000003,
} VkCullModeFlagBits;
- VK_CULL_MODE_NONE specifies that no triangles are discarded.
- VK_CULL_MODE_FRONT_BIT specifies that front-facing triangles are discarded.
- VK_CULL_MODE_BACK_BIT specifies that back-facing triangles are discarded.
- VK_CULL_MODE_FRONT_AND_BACK specifies that all triangles are discarded.
Following culling, fragments are produced for any triangles which have not been discarded.
typedef VkFlags VkCullModeFlags;
VkCullModeFlags
is a bitmask type for setting a mask of zero or more
VkCullModeFlagBits.
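A typical way to configure these controls is through the pipeline's rasterization state; the values below (back-face culling with counter-clockwise front faces) are merely example choices.

#include <vulkan/vulkan.h>

/* Example fill of VkPipelineRasterizationStateCreateInfo: cull back faces,
 * treat counter-clockwise triangles as front-facing, no depth bias. */
static VkPipelineRasterizationStateCreateInfo exampleRasterizationState(void)
{
    VkPipelineRasterizationStateCreateInfo rs = {0};
    rs.sType                   = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
    rs.depthClampEnable        = VK_FALSE;
    rs.rasterizerDiscardEnable = VK_FALSE;
    rs.polygonMode             = VK_POLYGON_MODE_FILL;
    rs.cullMode                = VK_CULL_MODE_BACK_BIT;
    rs.frontFace               = VK_FRONT_FACE_COUNTER_CLOCKWISE;
    rs.depthBiasEnable         = VK_FALSE;
    rs.lineWidth               = 1.0f;
    return rs;
}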
The rule for determining which fragments are produced by polygon rasterization is called point sampling. The two-dimensional projection obtained by taking the x and y framebuffer coordinates of the polygon’s vertices is formed. Fragments are produced for any fragment area groups of pixels for which any sample points lie inside of this polygon. Coverage bits that correspond to sample points that satisfy the point sampling criteria are 1, other coverage bits are 0. Special treatment is given to a sample whose sample location lies on a polygon edge. In such a case, if two polygons lie on either side of a common edge (with identical endpoints) on which a sample point lies, then exactly one of the polygons must result in a covered sample for that fragment during rasterization.
As for the data associated with each fragment produced by rasterizing a polygon, we begin by specifying how these values are produced for fragments in a triangle. Define barycentric coordinates for a triangle. Barycentric coordinates are a set of three numbers, a, b, and c, each in the range [0,1], with a + b + c = 1. These coordinates uniquely specify any point p within the triangle or on the triangle’s boundary as
\(p = a\, p_a + b\, p_b + c\, p_c\)
where pa, pb, and pc are the vertices of the triangle. a, b, and c are determined by:
\(a = \frac{\mathrm{A}(p\, p_b\, p_c)}{\mathrm{A}(p_a\, p_b\, p_c)}, \quad b = \frac{\mathrm{A}(p\, p_a\, p_c)}{\mathrm{A}(p_a\, p_b\, p_c)}, \quad c = \frac{\mathrm{A}(p\, p_a\, p_b)}{\mathrm{A}(p_a\, p_b\, p_c)}\)
where A(lmn) denotes the area in framebuffer coordinates of the triangle with vertices l, m, and n.
Denote an associated datum at pa, pb, or pc as fa, fb, or fc, respectively.
The value of an associated datum f for a fragment produced by rasterizing a triangle, whether it be a shader output or the clip w coordinate, must be determined using perspective interpolation:
\(f = \frac{a\,f_a/w_a + b\,f_b/w_b + c\,f_c/w_c}{a/w_a + b/w_b + c/w_c}\)
where wa, wb, and wc are the clip w
coordinates of pa, pb, and pc, respectively.
a, b, and c are the barycentric coordinates of the
location at which the data are produced - this must be the location of the
fragment center or the location of a sample.
When rasterizationSamples
is VK_SAMPLE_COUNT_1_BIT
, the fragment
center must be used.
Depth values for triangles must be determined using linear interpolation:
\(z = a\, z_a + b\, z_b + c\, z_c\)
where za, zb, and zc are the depth values of pa, pb, and pc, respectively.
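The two interpolation rules just given can be written out directly in code; the helper below is a non-normative sketch of perspective-correct attribute interpolation and linear depth interpolation from the barycentric weights a, b, and c. The struct and function names are illustrative.

/* Non-normative sketch of the interpolation equations above. */
typedef struct {
    float fa, fb, fc;   /* associated data at the three vertices */
    float wa, wb, wc;   /* clip w coordinates at the three vertices */
    float za, zb, zc;   /* depth values at the three vertices */
} TriangleAttribs;

/* Perspective-correct interpolation of an associated datum f. */
static float interpolatePerspective(const TriangleAttribs *t, float a, float b, float c)
{
    float num = a * t->fa / t->wa + b * t->fb / t->wb + c * t->fc / t->wc;
    float den = a / t->wa + b / t->wb + c / t->wc;
    return num / den;
}

/* Linear (NoPerspective / depth) interpolation. */
static float interpolateLinearDepth(const TriangleAttribs *t, float a, float b, float c)
{
    return a * t->za + b * t->zb + c * t->zc;
}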
The NoPerspective
and Flat
interpolation decorations can be used
with fragment shader inputs to declare how they are interpolated.
When neither decoration is applied, perspective interpolation is performed as described above.
When the NoPerspective
decoration is used,
linear interpolation is performed in the
same fashion as for depth values, as described above.
When the Flat
decoration is used, no interpolation is performed, and
outputs are taken from the corresponding input value of the
provoking vertex corresponding to that
primitive.
When the VK_AMD_shader_explicit_vertex_parameter
device extension is
enabled the CustomInterpAMD
interpolation decoration can also be used with fragment shader inputs
which indicate that the decorated inputs can only be accessed by the
extended instruction InterpolateAtVertexAMD
and allows accessing the
value of the inputs for individual vertices of the primitive.
When the fragmentShaderBarycentric
feature is enabled, the
PerVertexNV
interpolation
decoration can also be used with fragment shader inputs which indicate
that the decorated inputs are not interpolated and can only be accessed
using an extra array dimension, where the extra index identifies one of the
vertices of the primitive that produced the fragment.
For a polygon with more than three edges, such as are produced by clipping a triangle, a convex combination of the values of the datum at the polygon’s vertices must be used to obtain the value assigned to each fragment produced by the rasterization algorithm. That is, it must be the case that at every fragment
\(f = \sum_{i=1}^{n} a_i f_i\)
where n is the number of vertices in the polygon and fi is the value of f at vertex i. For each i, 0 ≤ ai ≤ 1 and \(\sum_{i=1}^{n}a_i = 1\). The values of ai may differ from fragment to fragment, but at vertex i, ai = 1 and aj = 0 for j ≠ i.
Note
One algorithm that achieves the required behavior is to triangulate a polygon (without adding any vertices) and then treat each triangle individually as already discussed. A scan-line rasterizer that linearly interpolates data along each edge and then linearly interpolates data across each horizontal span from edge to edge also satisfies the restrictions (in this case, the numerator and denominator of the perspective interpolation equation above are iterated independently and a division performed for each fragment).
26.11.2. Polygon Mode
Possible values of the
VkPipelineRasterizationStateCreateInfo::polygonMode
property of
the currently active pipeline, specifying the method of rasterization for
polygons, are:
typedef enum VkPolygonMode {
VK_POLYGON_MODE_FILL = 0,
VK_POLYGON_MODE_LINE = 1,
VK_POLYGON_MODE_POINT = 2,
VK_POLYGON_MODE_FILL_RECTANGLE_NV = 1000153000,
} VkPolygonMode;
- VK_POLYGON_MODE_POINT specifies that polygon vertices are drawn as points.
- VK_POLYGON_MODE_LINE specifies that polygon edges are drawn as line segments.
- VK_POLYGON_MODE_FILL specifies that polygons are rendered using the polygon rasterization rules in this section.
- VK_POLYGON_MODE_FILL_RECTANGLE_NV specifies that polygons are rendered using polygon rasterization rules, modified to consider a sample within the primitive if the sample location is inside the axis-aligned bounding box of the triangle after projection. Note that the barycentric weights used in attribute interpolation can extend outside the range [0,1] when these primitives are shaded. Special treatment is given to a sample position on the boundary edge of the bounding box. In such a case, if two rectangles lie on either side of a common edge (with identical endpoints) on which a sample position lies, then exactly one of the triangles must produce a fragment that covers that sample during rasterization. Polygons rendered in VK_POLYGON_MODE_FILL_RECTANGLE_NV mode may be clipped by the frustum or by user clip planes. If clipping is applied, the triangle is culled rather than clipped. Area calculation and facingness are determined for VK_POLYGON_MODE_FILL_RECTANGLE_NV mode using the triangle’s vertices.
These modes affect only the final rasterization of polygons: in particular, a polygon’s vertices are shaded and the polygon is clipped and possibly culled before these modes are applied.
26.11.3. Depth Bias
The depth values of all fragments generated by the rasterization of a
polygon can be offset by a single value that is computed for that polygon.
This behavior is controlled by the depthBiasEnable
,
depthBiasConstantFactor
, depthBiasClamp
, and
depthBiasSlopeFactor
members of
VkPipelineRasterizationStateCreateInfo, or by the corresponding
parameters to the vkCmdSetDepthBias
command if depth bias state is
dynamic.
void vkCmdSetDepthBias(
VkCommandBuffer commandBuffer,
float depthBiasConstantFactor,
float depthBiasClamp,
float depthBiasSlopeFactor);
- commandBuffer is the command buffer into which the command will be recorded.
- depthBiasConstantFactor is a scalar factor controlling the constant depth value added to each fragment.
- depthBiasClamp is the maximum (or minimum) depth bias of a fragment.
- depthBiasSlopeFactor is a scalar factor applied to a fragment’s slope in depth bias calculations.
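When the pipeline enables VK_DYNAMIC_STATE_DEPTH_BIAS, the bias parameters can be recorded per draw; the constant and slope factors below are commonly used example values (for instance in shadow-map rendering), not prescribed ones.

#include <vulkan/vulkan.h>

/* Illustrative: record a dynamic depth bias for a polygon-offset style pass.
 * The numeric values are example choices only. */
static void recordExampleDepthBias(VkCommandBuffer commandBuffer)
{
    const float constantFactor = 1.25f;  /* scaled by the implementation-dependent r */
    const float clamp          = 0.0f;   /* 0 disables clamping of the bias */
    const float slopeFactor    = 1.75f;  /* scaled by the polygon's maximum depth slope m */
    vkCmdSetDepthBias(commandBuffer, constantFactor, clamp, slopeFactor);
}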
If depthBiasEnable
is VK_FALSE
, no depth bias is applied and the
fragment’s depth values are unchanged.
depthBiasSlopeFactor
scales the maximum depth slope of the polygon,
and depthBiasConstantFactor
scales an implementation-dependent
constant that relates to the usable resolution of the depth buffer.
The resulting values are summed to produce the depth bias value which is
then clamped to a minimum or maximum value specified by
depthBiasClamp
.
depthBiasSlopeFactor
, depthBiasConstantFactor
, and
depthBiasClamp
can each be positive, negative, or zero.
The maximum depth slope m of a triangle is
\(m = \sqrt{ \left( \frac{\partial z_f}{\partial x_f} \right)^2 + \left( \frac{\partial z_f}{\partial y_f} \right)^2 }\)
where (xf, yf, zf) is a point on the triangle. m may be approximated as
\(m = \max\left( \left| \frac{\partial z_f}{\partial x_f} \right|, \left| \frac{\partial z_f}{\partial y_f} \right| \right)\)
The minimum resolvable difference r is an implementation-dependent
parameter that depends on the depth buffer representation.
It is the smallest difference in framebuffer coordinate z values that
is guaranteed to remain distinct throughout polygon rasterization and in the
depth buffer.
All pairs of fragments generated by the rasterization of two polygons with
otherwise identical vertices, but z
f values that differ by
r, will have distinct depth values.
For fixed-point depth buffer representations, r is constant throughout the range of the entire depth buffer. For floating-point depth buffers, there is no single minimum resolvable difference. In this case, the minimum resolvable difference for a given polygon is dependent on the maximum exponent, e, in the range of z values spanned by the primitive. If n is the number of bits in the floating-point mantissa, the minimum resolvable difference, r, for the given primitive is defined as
\(r = 2^{e-n}\)
If a triangle is rasterized using the
VK_POLYGON_MODE_FILL_RECTANGLE_NV
polygon mode, then this minimum
resolvable difference may not be resolvable for samples outside of the
triangle, where the depth is extrapolated.
If no depth buffer is present, r is undefined.
The bias value o for a polygon is
\(o = \begin{cases} m \times \mathtt{depthBiasSlopeFactor} + r \times \mathtt{depthBiasConstantFactor} & \text{if } \mathtt{depthBiasClamp} = 0 \text{ or NaN} \\ \min(m \times \mathtt{depthBiasSlopeFactor} + r \times \mathtt{depthBiasConstantFactor},\ \mathtt{depthBiasClamp}) & \text{if } \mathtt{depthBiasClamp} > 0 \\ \max(m \times \mathtt{depthBiasSlopeFactor} + r \times \mathtt{depthBiasConstantFactor},\ \mathtt{depthBiasClamp}) & \text{if } \mathtt{depthBiasClamp} < 0 \end{cases}\)
m is computed as described above. If the depth buffer uses a fixed-point representation, m is a function of depth values in the range [0,1], and o is applied to depth values in the same range.
For fixed-point depth buffers, fragment depth values are always limited to
the range [0,1] by clamping after depth bias addition is performed.
Unless the VK_EXT_depth_range_unrestricted
extension is enabled,
fragment depth values are clamped even when the depth buffer uses a
floating-point representation.
26.11.4. Conservative Rasterization
Polygon rasterization can be made conservative by setting
conservativeRasterizationMode
to
VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT
or
VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT
in
VkPipelineRasterizationConservativeStateCreateInfoEXT
.
The VkPipelineRasterizationConservativeStateCreateInfoEXT
state is set
by adding an instance of this structure to the pNext
chain of an
instance of the VkPipelineRasterizationStateCreateInfo
structure when
creating the graphics pipeline.
Enabling these modes also affects line and point rasterization if the
implementation sets
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::conservativePointAndLineRasterization
to VK_TRUE
.
VkPipelineRasterizationConservativeStateCreateInfoEXT
is defined as:
typedef struct VkPipelineRasterizationConservativeStateCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkPipelineRasterizationConservativeStateCreateFlagsEXT flags;
VkConservativeRasterizationModeEXT conservativeRasterizationMode;
float extraPrimitiveOverestimationSize;
} VkPipelineRasterizationConservativeStateCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- conservativeRasterizationMode is the conservative rasterization mode to use.
- extraPrimitiveOverestimationSize is the extra size in pixels to increase the generating primitive during conservative rasterization at each of its edges in X and Y equally in screen space beyond the base overestimation specified in VkPhysicalDeviceConservativeRasterizationPropertiesEXT::primitiveOverestimationSize.
typedef VkFlags VkPipelineRasterizationConservativeStateCreateFlagsEXT;
VkPipelineRasterizationConservativeStateCreateFlagsEXT
is a bitmask
type for setting a mask, but is currently reserved for future use.
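The structure is consumed by chaining it into the rasterization state at pipeline creation; the sketch below shows only that chaining (the mode and size are example choices, and the VK_EXT_conservative_rasterization extension must be enabled on the device).

#include <vulkan/vulkan.h>
#include <stddef.h>

/* Illustrative: enable overestimation conservative rasterization by chaining
 * the EXT structure into the pipeline's rasterization state. */
static void chainConservativeState(
    VkPipelineRasterizationStateCreateInfo *rasterizationState,
    VkPipelineRasterizationConservativeStateCreateInfoEXT *conservativeState)
{
    conservativeState->sType =
        VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_CONSERVATIVE_STATE_CREATE_INFO_EXT;
    conservativeState->flags = 0;
    conservativeState->conservativeRasterizationMode =
        VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT;
    conservativeState->extraPrimitiveOverestimationSize = 0.0f;

    /* preserve any existing pNext chain on the rasterization state */
    conservativeState->pNext = rasterizationState->pNext;
    rasterizationState->pNext = conservativeState;
}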
Possible values of
VkPipelineRasterizationConservativeStateCreateInfoEXT::conservativeRasterizationMode
,
specifying the conservative rasterization mode are:
typedef enum VkConservativeRasterizationModeEXT {
VK_CONSERVATIVE_RASTERIZATION_MODE_DISABLED_EXT = 0,
VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT = 1,
VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT = 2,
} VkConservativeRasterizationModeEXT;
- VK_CONSERVATIVE_RASTERIZATION_MODE_DISABLED_EXT specifies that conservative rasterization is disabled and rasterization proceeds as normal.
- VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT specifies that conservative rasterization is enabled in overestimation mode.
- VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT specifies that conservative rasterization is enabled in underestimation mode.
When overestimate conservative rasterization is enabled, rather than evaluating coverage at individual sample locations, a determination is made of whether any portion of the pixel (including its edges and corners) is covered by the primitive. If any portion of the pixel is covered, then all bits of the coverage sample mask for the fragment corresponding to that pixel are enabled. If the render pass has a fragment density map attachment and any bit of the coverage sample mask for the fragment is enabled, then all bits of the coverage sample mask for the fragment are enabled.
If the implementation supports
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::conservativeRasterizationPostDepthCoverage
and the
PostDepthCoverage
execution mode is specified the SampleMask
built-in input variable will
reflect the coverage after the early per-fragment depth and stencil tests
are applied.
For the purposes of evaluating which pixels are covered by the primitive,
implementations can increase the size of the primitive by up to
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::primitiveOverestimationSize
pixels at each of the primitive edges.
This may increase the number of fragments generated by this primitive and
represents an overestimation of the pixel coverage.
This overestimation size can be increased further by setting the
extraPrimitiveOverestimationSize
value above 0.0
in steps of
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::extraPrimitiveOverestimationSizeGranularity
up to and including
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::extraPrimitiveOverestimationSize
.
This will further increase the number of fragments generated by this
primitive.
The actual precision of the overestimation size used for conservative
rasterization may vary between implementations and produce results that
only approximate the primitiveOverestimationSize
and
extraPrimitiveOverestimationSizeGranularity
properties.
Implementations may especially vary these approximations when the render
pass has a fragment density map and the fragment area covers multiple
pixels.
For triangles if VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT
is enabled, fragments will be generated if the primitive area covers any
portion of any pixel inside the fragment area, including their edges or
corners.
The tie-breaking rule described in Basic Polygon
Rasterization does not apply during conservative rasterization and
coverage is set for all fragments generated from shared edges of polygons.
Degenerate triangles that evaluate to zero area after rasterization, even
for pixels that contain a vertex or edge of the zero-area polygon, will be
culled if
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::degenerateTrianglesRasterized
is VK_FALSE
or will generate fragments if
degenerateTrianglesRasterized
is VK_TRUE
.
The fragment input values for these degenerate triangles take their
attribute and depth values from the provoking vertex.
Degenerate triangles are considered backfacing and the application can
enable backface culling if desired.
Triangles that are zero area before rasterization may be culled regardless.
For lines if VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT
is
enabled, and the implementation sets
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::conservativePointAndLineRasterization
to VK_TRUE
, fragments will be generated if the line covers any portion
of any pixel inside the fragment area, including their edges or corners.
Degenerate lines that evaluate to zero length after rasterization will be
culled if
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::degenerateLinesRasterized
is VK_FALSE
or will generate fragments if
degenerateLinesRasterized
is VK_TRUE
.
The fragment input values for these degenerate lines take their attribute
and depth values from the provoking vertex.
Lines that are zero length before rasterization may be culled regardless.
For points if VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT
is
enabled, and the implementation sets
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::conservativePointAndLineRasterization
to VK_TRUE
, fragments will be generated if the point square covers any
portion of any pixel inside the fragment area, including their edges or
corners.
When underestimate conservative rasterization is enabled, rather than evaluating coverage at individual sample locations, a determination is made of whether all of the pixel (including its edges and corners) is covered by the primitive. If the entire pixel is covered, then a fragment is generated with all bits of its coverage sample mask corresponding to the pixel enabled, otherwise the pixel is not considered covered even if some portion of the pixel is covered. The fragment is discarded if no pixels inside the fragment area are considered covered. If the render pass has a fragment density map attachment and any pixel inside the fragment area is not considered covered, then the fragment is discarded even if some pixels are considered covered.
If the implementation supports
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::conservativeRasterizationPostDepthCoverage
and the
PostDepthCoverage
execution mode is specified the SampleMask
built-in input variable will
reflect the coverage after the early per-fragment depth and stencil tests
are applied.
For triangles, if VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT
is enabled, fragments will only be generated if any pixel inside the
fragment area is fully covered by the generating primitive, including its
edges and corners.
For lines, if VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT
is
enabled, fragments will be generated if any pixel inside the fragment area,
including its edges and corners, is entirely covered by the line.
For points, if VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT
is
enabled, fragments will only be generated if the point square covers the
entirety of any pixel square inside the fragment area, including its edges
or corners.
If the render pass has a fragment density map and
VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT
is enabled,
fragments will only be generated if the entirety of all pixels inside the
fragment area are covered by the generating primitive, line, or point.
For both overestimate and underestimate conservative rasterization modes, a
fragment that has all of its pixel squares fully covered by the generating
primitive must set FullyCoveredEXT
to VK_TRUE
if the
implementation enables the
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
::fullyCoveredFragmentShaderInputVariable
feature.
When the use of a shading rate image
results in fragments covering multiple pixels, coverage for conservative
rasterization is still evaluated on a per-pixel basis and may result in
fragments with partial coverage.
For fragment shader inputs decorated with FullyCoveredEXT
, a fragment
is considered fully covered if and only if all pixels in the fragment are
fully covered by the generating primitive.
27. Fragment Operations
Fragment operations execute on a per-fragment or per-sample basis, affecting whether or how a fragment or sample is written to the framebuffer. Some operations execute before fragment shading, and others after. Fragment operations always adhere to rasterization order.
27.1. Early Per-Fragment Tests
Once fragments are produced by rasterization, a number of per-fragment operations are performed prior to fragment shader execution. If a fragment is discarded during any of these operations, it will not be processed by any subsequent stage, including fragment shader execution.
The scissor test, exclusive scissor test, and sample mask generation are always performed during early fragment tests.
Fragment operations are performed in the following order:
- the discard rectangles test (see Discard Rectangles Test)
- the scissor test (see Scissor Test)
- the exclusive scissor test (see Exclusive Scissor Test)
- multisample fragment operations (see Sample Mask)
If early per-fragment operations are enabled by the fragment shader, these operations are also performed:
- the depth bounds test (see Depth Bounds Test)
- the stencil test (see Stencil Test)
- the depth test (see Depth Test)
- the representative fragment test (see Representative Fragment Test)
- sample counting (see Sample Counting)
If post-depth coverage operation is
enabled by the fragment
shader, the SampleMask
coverage is determined after the early stencil and depth tests.
27.2. Discard Rectangles Test
The discard rectangles test determines if fragment’s framebuffer coordinates
(xf,yf) are inclusive or exclusive to a set of discard-space
rectangles.
The discard rectangles are set with the
VkPipelineDiscardRectangleStateCreateInfoEXT
pipeline state, which is
defined as:
typedef struct VkPipelineDiscardRectangleStateCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkPipelineDiscardRectangleStateCreateFlagsEXT flags;
VkDiscardRectangleModeEXT discardRectangleMode;
uint32_t discardRectangleCount;
const VkRect2D* pDiscardRectangles;
} VkPipelineDiscardRectangleStateCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- discardRectangleMode is the mode used to determine whether fragments that lie within the discard rectangle are discarded or not.
- discardRectangleCount is the number of discard rectangles used by the pipeline.
- pDiscardRectangles is a pointer to an array of VkRect2D structures, defining the discard rectangles. If the discard rectangle state is dynamic, this member is ignored.
typedef VkFlags VkPipelineDiscardRectangleStateCreateFlagsEXT;
VkPipelineDiscardRectangleStateCreateFlagsEXT
is a bitmask type for
setting a mask, but is currently reserved for future use.
The VkPipelineDiscardRectangleStateCreateInfoEXT
state is set by
adding an instance of this structure to the pNext
chain of an instance
of the VkGraphicsPipelineCreateInfo
structure and setting the graphics
pipeline state with vkCreateGraphicsPipelines.
If the bound pipeline state object was not created with the
VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT
dynamic state enabled, discard
rectangles are specified using the pDiscardRectangles
member of
VkPipelineDiscardRectangleStateCreateInfoEXT
linked to the pipeline
state object.
If the pipeline state object was created with the
VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT
dynamic state enabled, the
discard rectangles are dynamically set and changed with the command:
void vkCmdSetDiscardRectangleEXT(
VkCommandBuffer commandBuffer,
uint32_t firstDiscardRectangle,
uint32_t discardRectangleCount,
const VkRect2D* pDiscardRectangles);
- commandBuffer is the command buffer into which the command will be recorded.
- firstDiscardRectangle is the index of the first discard rectangle whose state is updated by the command.
- discardRectangleCount is the number of discard rectangles whose state are updated by the command.
- pDiscardRectangles is a pointer to an array of VkRect2D structures specifying discard rectangles.
The discard rectangle taken from element i of pDiscardRectangles
replaces the current state for the discard rectangle index
firstDiscardRectangle
+ i, for i in [0,
discardRectangleCount
).
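With VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT enabled on the pipeline, a single discard rectangle can be updated as follows; the rectangle values are arbitrary examples.

#include <vulkan/vulkan.h>

/* Illustrative: set discard rectangle index 0 dynamically. */
static void recordDiscardRectangle(VkCommandBuffer commandBuffer)
{
    VkRect2D rect;
    rect.offset.x      = 0;
    rect.offset.y      = 0;
    rect.extent.width  = 256;   /* example values */
    rect.extent.height = 256;
    vkCmdSetDiscardRectangleEXT(commandBuffer, 0, 1, &rect);
}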
The VkOffset2D
::x
and VkOffset2D
::y
values of the
discard rectangle VkRect2D
specify the upper-left origin of the
discard rectangle box.
The lower-right corner of the discard rectangle box is specified as the
VkExtent2D
::width
and VkExtent2D
::height
from the
upper-left origin.
If offset.x
≤ xf < offset.x
+
extent.width
and offset.y
≤ yf < offset.y
+ extent.height
for the selected discard rectangle, then the
fragment is within the discard rectangle box.
When the discard rectangle mode is
VK_DISCARD_RECTANGLE_MODE_INCLUSIVE_EXT
a fragment within at least one
of the active discard rectangle boxes passes the discard rectangle test;
otherwise the fragment fails the discard rectangle test and is discarded.
When the discard rectangle mode is
VK_DISCARD_RECTANGLE_MODE_EXCLUSIVE_EXT
a fragment within at least one
of the active discard rectangle boxes fails the discard rectangle test, and
the fragment is discarded; otherwise the fragment passes the discard
rectangles test.
The discard rectangles test only applies to drawing commands,
not to other commands like clears or copies.
Possible values of
VkPipelineDiscardRectangleStateCreateInfoEXT::discardRectangleMode
,
specifying the behavior of the discard rectangle test, are:
typedef enum VkDiscardRectangleModeEXT {
VK_DISCARD_RECTANGLE_MODE_INCLUSIVE_EXT = 0,
VK_DISCARD_RECTANGLE_MODE_EXCLUSIVE_EXT = 1,
} VkDiscardRectangleModeEXT;
- VK_DISCARD_RECTANGLE_MODE_INCLUSIVE_EXT specifies that a fragment within any discard rectangle satisfies the test.
- VK_DISCARD_RECTANGLE_MODE_EXCLUSIVE_EXT specifies that a fragment not within any of the discard rectangles satisfies the test.
When the use of a shading rate image results in a fragment covering multiple pixels, the discard rectangle test is performed independently for each pixel in the fragment. If a pixel covered by a fragment fails the discard rectangle test, all samples in the fragment associated with that pixel are treated as not covered. If the discard rectangle test results in a fragment with no samples covered, that fragment is discarded.
27.3. Scissor Test
The scissor test determines if a fragment’s framebuffer coordinates
(xf,yf) lie within the scissor rectangle corresponding to the
viewport index (see Controlling the Viewport)
used by the primitive that generated the fragment.
If the pipeline state object is created without
VK_DYNAMIC_STATE_SCISSOR
enabled then the scissor rectangles are set
by the VkPipelineViewportStateCreateInfo state of the pipeline state
object.
Otherwise, to dynamically set the scissor rectangles call:
void vkCmdSetScissor(
VkCommandBuffer commandBuffer,
uint32_t firstScissor,
uint32_t scissorCount,
const VkRect2D* pScissors);
- commandBuffer is the command buffer into which the command will be recorded.
- firstScissor is the index of the first scissor whose state is updated by the command.
- scissorCount is the number of scissors whose rectangles are updated by the command.
- pScissors is a pointer to an array of VkRect2D structures defining scissor rectangles.
The scissor rectangles taken from element i of pScissors
replace
the current state for the scissor index firstScissor
+ i,
for i in [0, scissorCount
).
Each scissor rectangle is described by a VkRect2D structure, with the
offset.x
and offset.y
values determining the upper left corner
of the scissor rectangle, and the extent.width
and extent.height
values determining the size in pixels.
If offset.x
≤ xf < offset.x
+
extent.width
and offset.y
≤ yf < offset.y
+ extent.height
for the selected scissor rectangle, then the
scissor test passes.
Otherwise, the test fails and the fragment is discarded.
For points, lines, and polygons, the scissor rectangle for a primitive is
selected in the same manner as the viewport (see
Controlling the Viewport).
The scissor rectangles test only applies to drawing commands,
not to other commands like clears or copies.
It is legal for offset.x
+ extent.width
or
offset.y
+ extent.height
to exceed the dimensions of
the framebuffer - the scissor test still applies as defined above.
Rasterization does not produce fragments outside of the framebuffer, so such
fragments never have the scissor test performed on them.
The scissor test is always performed. Applications can effectively disable the scissor test by specifying a scissor rectangle that encompasses the entire framebuffer.
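For example, the "always pass" behavior described above can be obtained by recording a scissor rectangle that covers the whole framebuffer, assuming the pipeline uses dynamic scissor state.

#include <vulkan/vulkan.h>

/* Illustrative: effectively disable the scissor test by covering the
 * entire framebuffer with scissor rectangle 0. */
static void recordFullFramebufferScissor(VkCommandBuffer commandBuffer,
                                         uint32_t framebufferWidth,
                                         uint32_t framebufferHeight)
{
    VkRect2D scissor;
    scissor.offset.x      = 0;
    scissor.offset.y      = 0;
    scissor.extent.width  = framebufferWidth;
    scissor.extent.height = framebufferHeight;
    vkCmdSetScissor(commandBuffer, 0, 1, &scissor);
}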
When the use of a shading rate image results in a fragment covering multiple pixels, the scissor test is performed independently for each pixel in the fragment. If a pixel covered by a fragment fails the scissor test, all samples in the fragment associated with that pixel are treated as not covered. If the scissor test results in a fragment with no samples covered, that fragment is discarded.
27.4. Exclusive Scissor Test
The exclusive scissor test determines if a pixel’s framebuffer coordinates (xf,yf) lie outside the exclusive scissor rectangle corresponding to the viewport index (see Controlling the Viewport) used by the primitive that generated the fragment. The exclusive scissor test behaves identically to the scissor test, except that it passes only if the pixel is outside the rectangle instead of passing if the pixel is inside the rectangle.
If the pNext
chain of VkPipelineViewportStateCreateInfo
includes
a VkPipelineViewportExclusiveScissorStateCreateInfoNV
structure, then
that structure includes parameters that affect the exclusive scissor test.
The VkPipelineViewportExclusiveScissorStateCreateInfoNV
structure is
defined as:
typedef struct VkPipelineViewportExclusiveScissorStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
uint32_t exclusiveScissorCount;
const VkRect2D* pExclusiveScissors;
} VkPipelineViewportExclusiveScissorStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- exclusiveScissorCount is the number of exclusive scissor rectangles used by the pipeline.
- pExclusiveScissors is a pointer to an array of VkRect2D structures defining exclusive scissor rectangles. If the exclusive scissor state is dynamic, this member is ignored.
If this structure is not present, exclusiveScissorCount
is considered
to be 0
and the exclusive scissor test is disabled.
If the pipeline state object is created with
VK_DYNAMIC_STATE_EXCLUSIVE_SCISSOR_NV
enabled, then the exclusive
scissor rectangles are set by:
void vkCmdSetExclusiveScissorNV(
VkCommandBuffer commandBuffer,
uint32_t firstExclusiveScissor,
uint32_t exclusiveScissorCount,
const VkRect2D* pExclusiveScissors);
- commandBuffer is the command buffer into which the command will be recorded.
- firstExclusiveScissor is the index of the first exclusive scissor rectangle whose state is updated by the command.
- exclusiveScissorCount is the number of exclusive scissor rectangles updated by the command.
- pExclusiveScissors is a pointer to an array of VkRect2D structures defining exclusive scissor rectangles.
The scissor rectangles taken from element i of
pExclusiveScissors
replace the current state for the scissor index
firstExclusiveScissor
+ i, for i in [0,
exclusiveScissorCount
).
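When the pipeline enables VK_DYNAMIC_STATE_EXCLUSIVE_SCISSOR_NV, an exclusive scissor rectangle is recorded in the same way; as noted later in this section, a zero-sized rectangle effectively disables the test for that viewport. The coordinates below are example values.

#include <vulkan/vulkan.h>

/* Illustrative: exclude a rectangular region from rasterization output for
 * viewport/scissor index 0. */
static void recordExclusiveScissor(VkCommandBuffer commandBuffer)
{
    VkRect2D exclusiveRect;
    exclusiveRect.offset.x      = 64;
    exclusiveRect.offset.y      = 64;
    exclusiveRect.extent.width  = 128;
    exclusiveRect.extent.height = 128;
    vkCmdSetExclusiveScissorNV(commandBuffer, 0, 1, &exclusiveRect);
}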
Each scissor rectangle is described by a VkRect2D structure, with the
offset.x
and offset.y
values determining the upper left corner
of the scissor rectangle, and the extent.width
and extent.height
values determining the size in pixels.
If offset.x
≤ xf < offset.x
+
extent.width
and offset.y
≤ yf < offset.y
+ extent.height
for the selected exclusive scissor rectangle,
then the exclusive scissor test fails and the fragment is discarded.
Otherwise, the exclusive scissor test passes.
For points, lines, and polygons, the exclusive scissor rectangle for a
primitive is selected in the same manner as the viewport (see
Controlling the Viewport).
The exclusive scissor test only applies to drawing commands,
not to other commands like clears or copies.
It is legal for offset.x
+ extent.width
or
offset.y
+ extent.height
to exceed the dimensions of
the framebuffer - the exclusive scissor test still applies as defined above.
Rasterization does not produce fragments outside of the framebuffer, so such
fragments never have the exclusive scissor test performed on them.
The exclusive scissor test is performed if and only if the current pipeline
was created with a non-zero exclusiveScissorCount
.
Applications can effectively disable the exclusive scissor test for
specific viewports by specifying a scissor rectangle with a width or height
of zero.
When the use of a shading rate image results in a fragment covering multiple pixels, the exclusive scissor test is performed independently for each pixel in the fragment. If a pixel covered by a fragment fails the exclusive scissor test, all samples in the fragment associated with that pixel are treated as not covered. If the exclusive scissor test results in a fragment with no samples covered, that fragment is discarded.
27.5. Sample Mask
This step modifies fragment coverage values based on the values in the
pSampleMask
array member of
VkPipelineMultisampleStateCreateInfo, as described previously in
section Graphics Pipelines.
pSampleMask
contains an array of static coverage information that is
ANDed
with the coverage information generated during rasterization.
Bits that are zero disable coverage for the corresponding sample.
Bit B of mask word M corresponds to sample 32 × M
+ B.
The array is sized to a length of ⌈ rasterizationSamples
/
32 ⌉ words.
If pSampleMask
is NULL
, it is treated as if the mask has all bits
enabled, i.e. no coverage is removed from fragments.
The elements of the sample mask array are of type VkSampleMask
,
each representing 32 bits of coverage information:
typedef uint32_t VkSampleMask;
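The word/bit layout described above can be illustrated with a small helper that builds a pSampleMask array enabling the first enabledSamples samples; the helper name and parameters are illustrative.

#include <vulkan/vulkan.h>

/* Illustrative: build a sample mask array for pSampleMask.
 * Bit B of word M corresponds to sample 32*M + B. */
static void buildSampleMask(VkSampleMask *words,        /* ceil(samples/32) words */
                            uint32_t rasterizationSamples,
                            uint32_t enabledSamples)    /* enable samples [0, enabledSamples) */
{
    uint32_t wordCount = (rasterizationSamples + 31u) / 32u;
    for (uint32_t m = 0; m < wordCount; ++m)
        words[m] = 0u;
    for (uint32_t s = 0; s < enabledSamples && s < rasterizationSamples; ++s)
        words[s / 32u] |= 1u << (s % 32u);
}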
27.6. Early Fragment Test Mode
The depth bounds test, stencil test, depth test, representative fragment test, and occlusion query sample counting are performed before fragment shading if and only if early fragment tests are enabled by the fragment shader (see Early Fragment Tests). When early per-fragment operations are enabled, these operations are performed prior to fragment shader execution, and the stencil buffer, depth buffer, and occlusion query sample counts will be updated accordingly; these operations will not be performed again after fragment shader execution.
If a pipeline’s fragment shader has early fragment tests disabled, these operations are performed only after fragment program execution, in the order described below. If a pipeline does not contain a fragment shader, these operations are performed only once.
If early fragment tests are enabled, any depth value computed by the fragment shader has no effect. Additionally, the depth test (including depth writes), stencil test (including stencil writes) and sample counting operations are performed even for fragments or samples that would be discarded after fragment shader execution due to per-fragment operations such as alpha-to-coverage tests, or due to the fragment being discarded by the shader itself.
27.7. Late Per-Fragment Tests
After programmable fragment processing, per-fragment operations are performed before blending and color output to the framebuffer.
A fragment is produced by rasterization with framebuffer coordinates of (xf,yf) and depth z, as described in Rasterization. The fragment is then modified by programmable fragment processing, which adds associated data as described in Shaders. The fragment is then further modified, and possibly discarded by the late per-fragment operations described in this chapter. Finally, if the fragment was not discarded, it is used to update the framebuffer at the fragment’s framebuffer coordinates for any samples that remain covered.
The depth bounds test, stencil test, and depth test are performed for each sample, rather than just once for each fragment. Stencil and depth operations are performed for a sample only if that sample’s fragment coverage bit is a value of 1 when the fragment executes the corresponding stage of the graphics pipeline. If the corresponding coverage bit is 0, no operations are performed for that sample. Failure of the depth bounds, stencil, or depth test results in termination of the processing of that sample by means of disabling coverage for that sample, rather than discarding of the fragment. If, at any point, a fragment’s coverage becomes zero for all samples, then the fragment is discarded. All operations are performed on the depth and stencil values stored in the depth/stencil attachment of the framebuffer. The contents of the color attachments are not modified at this point.
The depth bounds test, stencil test, depth test, and occlusion query operations described in Depth Bounds Test, Stencil Test, Depth Test, Sample Counting are instead performed prior to fragment processing, as described in Early Fragment Test Mode, if requested by the fragment shader.
27.8. Mixed attachment samples
When the VK_AMD_mixed_attachment_samples
extension is enabled, special
rules apply to per-fragment operations when the number of samples of the
color attachments differs from the number of samples of the depth/stencil
attachment used in a subpass.
Let C be the number of color attachment samples and D be the number of depth/stencil attachment samples used by a given subpass.
If C < D then only the first C samples are guaranteed
to have a corresponding fragment shader invocation and thus a corresponding
color output value, unless the fragment shaders produce inputs to the late
per-fragment tests (e.g. by outputting to a variable decorated with the
FragDepth
built-in decoration).
Implementations are allowed to produce fragment shader invocations for
samples with indices greater than or equal to C but (other than
potential side effects) the color outputs of fragment shader invocations
corresponding to such samples are discarded.
27.9. Multisample Coverage
If a fragment shader is active and its entry point’s interface includes a
built-in output variable decorated with SampleMask
and also decorated
with OverrideCoverageNV
the fragment coverage is replaced with the
sample mask bits set in the shader.
Otherwise if the built-in output variable decorated with SampleMask
is
not also decorated with OverrideCoverageNV
then the fragment coverage
is ANDed
with the bits of the sample mask to generate a new fragment
coverage value.
If such a fragment shader did not assign a value to SampleMask
due to
flow of control, the value ANDed
with the fragment coverage is
undefined.
If no fragment shader is active, or if the active fragment shader does not
include SampleMask
in its interface, the fragment coverage is not
modified.
Next, the fragment alpha and coverage values are modified based on the
alphaToCoverageEnable
and alphaToOneEnable
members of the
VkPipelineMultisampleStateCreateInfo structure.
All alpha values in this section refer only to the alpha component of the
fragment shader output that has a Location
and Index
decoration of
zero (see the Fragment Output Interface
section).
If that shader output has an integer or unsigned integer type, then these
operations are skipped.
If alphaToCoverageEnable
is enabled, a temporary coverage value with
rasterizationSamples
bits is generated where each bit is determined by
the fragment’s alpha value.
The temporary coverage value is then ANDed with the fragment coverage value
to generate a new fragment coverage value.
No specific algorithm is specified for converting the alpha value to a temporary coverage mask. It is intended that the number of 1’s in this value be proportional to the alpha value (clamped to [0,1]), with all 1’s corresponding to a value of 1.0 and all 0’s corresponding to 0.0. The algorithm may be different at different framebuffer coordinates.
Note
Using different algorithms at different framebuffer coordinates may help to avoid artifacts caused by regular coverage sample locations.
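Since no particular conversion is mandated, the following is only one possible, non-normative mapping from an alpha value to a temporary coverage mask, with the number of enabled bits proportional to the clamped alpha; real implementations may use different and position-dependent patterns.

#include <stdint.h>

/* Non-normative sketch: one possible alpha-to-coverage mapping. The number of
 * set bits is proportional to alpha clamped to [0,1]. */
static uint32_t alphaToCoverageMask(float alpha, uint32_t rasterizationSamples)
{
    if (alpha < 0.0f) alpha = 0.0f;
    if (alpha > 1.0f) alpha = 1.0f;
    uint32_t bits = (uint32_t)(alpha * (float)rasterizationSamples + 0.5f);
    if (bits >= 32u) return ~0u;
    return (1u << bits) - 1u;     /* enable the lowest 'bits' samples */
}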
Next, if alphaToOneEnable
is enabled, each alpha value is replaced by
the maximum representable alpha value for fixed-point color buffers, or by
1.0 for floating-point buffers.
Otherwise, the alpha values are not changed.
27.10. Depth and Stencil Operations
Pipeline state controlling the depth bounds tests,
stencil test, and depth test is
specified through the members of the
VkPipelineDepthStencilStateCreateInfo
structure.
The VkPipelineDepthStencilStateCreateInfo
structure is defined as:
typedef struct VkPipelineDepthStencilStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineDepthStencilStateCreateFlags flags;
VkBool32 depthTestEnable;
VkBool32 depthWriteEnable;
VkCompareOp depthCompareOp;
VkBool32 depthBoundsTestEnable;
VkBool32 stencilTestEnable;
VkStencilOpState front;
VkStencilOpState back;
float minDepthBounds;
float maxDepthBounds;
} VkPipelineDepthStencilStateCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- depthTestEnable controls whether depth testing is enabled.
- depthWriteEnable controls whether depth writes are enabled when depthTestEnable is VK_TRUE. Depth writes are always disabled when depthTestEnable is VK_FALSE.
- depthCompareOp is the comparison operator used in the depth test.
- depthBoundsTestEnable controls whether depth bounds testing is enabled.
- stencilTestEnable controls whether stencil testing is enabled.
- front and back control the parameters of the stencil test.
- minDepthBounds and maxDepthBounds define the range of values used in the depth bounds test.
typedef VkFlags VkPipelineDepthStencilStateCreateFlags;
VkPipelineDepthStencilStateCreateFlags
is a bitmask type for setting a
mask, but is currently reserved for future use.
27.11. Depth Bounds Test
The depth bounds test conditionally disables coverage of a sample based on
the outcome of a comparison between the value za in the depth
attachment at location (xf,yf) (for the appropriate sample) and a
range of values.
The test is enabled or disabled by the depthBoundsTestEnable
member of
VkPipelineDepthStencilStateCreateInfo: If the pipeline state object is
created without the VK_DYNAMIC_STATE_DEPTH_BOUNDS
dynamic state
enabled then the range of values used in the depth bounds test are defined
by the minDepthBounds
and maxDepthBounds
members of the
VkPipelineDepthStencilStateCreateInfo structure.
Otherwise, to dynamically set the depth bounds range values call:
void vkCmdSetDepthBounds(
VkCommandBuffer commandBuffer,
float minDepthBounds,
float maxDepthBounds);
- commandBuffer is the command buffer into which the command will be recorded.
- minDepthBounds is the lower bound of the range of depth values used in the depth bounds test.
- maxDepthBounds is the upper bound of the range.
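With VK_DYNAMIC_STATE_DEPTH_BOUNDS enabled, the bounds can be recorded per draw; passing the full [0.0, 1.0] range, as below, makes the test pass for any in-range depth value.

#include <vulkan/vulkan.h>

/* Illustrative: set a depth bounds range that accepts all depth values. */
static void recordFullDepthBounds(VkCommandBuffer commandBuffer)
{
    vkCmdSetDepthBounds(commandBuffer, 0.0f, 1.0f);
}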
If minDepthBounds ≤ za ≤ maxDepthBounds, then
the depth bounds test passes.
Otherwise, the test fails and the sample’s coverage bit is cleared in the
fragment.
If there is no depth framebuffer attachment or if the depth bounds test is
disabled, it is as if the depth bounds test always passes.
27.12. Stencil Test
The stencil test conditionally disables coverage of a sample based on the
outcome of a comparison between the stencil value in the depth/stencil
attachment at location (xf,yf) (for the appropriate sample) and a
reference value.
The stencil test also updates the value in the stencil attachment, depending
on the test state, the stencil value and the stencil write masks.
The test is enabled or disabled by the stencilTestEnable
member of
VkPipelineDepthStencilStateCreateInfo.
When disabled, the stencil test and associated modifications are not made, and the sample’s coverage is not modified.
The stencil test is controlled with the front
and back
members
of VkPipelineDepthStencilStateCreateInfo
which are of type
VkStencilOpState
.
The VkStencilOpState
structure is defined as:
typedef struct VkStencilOpState {
VkStencilOp failOp;
VkStencilOp passOp;
VkStencilOp depthFailOp;
VkCompareOp compareOp;
uint32_t compareMask;
uint32_t writeMask;
uint32_t reference;
} VkStencilOpState;
- failOp is a VkStencilOp value specifying the action performed on samples that fail the stencil test.
- passOp is a VkStencilOp value specifying the action performed on samples that pass both the depth and stencil tests.
- depthFailOp is a VkStencilOp value specifying the action performed on samples that pass the stencil test and fail the depth test.
- compareOp is a VkCompareOp value specifying the comparison operator used in the stencil test.
- compareMask selects the bits of the unsigned integer stencil values participating in the stencil test.
- writeMask selects the bits of the unsigned integer stencil values updated by the stencil test in the stencil framebuffer attachment.
- reference is an integer reference value that is used in the unsigned stencil comparison.
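As a usage illustration, the following fills a VkStencilOpState that writes the reference value into the stencil attachment wherever a sample passes both tests, a common way of marking covered pixels; the specific values are example choices, not requirements.

#include <vulkan/vulkan.h>

/* Example VkStencilOpState: always pass the stencil test and replace the
 * stored stencil value with the reference wherever the depth test also passes. */
static VkStencilOpState exampleMarkStencilState(void)
{
    VkStencilOpState s = {0};
    s.failOp      = VK_STENCIL_OP_KEEP;
    s.passOp      = VK_STENCIL_OP_REPLACE;
    s.depthFailOp = VK_STENCIL_OP_KEEP;
    s.compareOp   = VK_COMPARE_OP_ALWAYS;
    s.compareMask = 0xFF;
    s.writeMask   = 0xFF;
    s.reference   = 1;
    return s;
}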
There are two sets of stencil-related state, the front stencil state set and the back stencil state set. Stencil tests and writes use the front set of stencil state when processing front-facing fragments and use the back set of stencil state when processing back-facing fragments. Fragments rasterized from non-polygon primitives (points and lines) are always considered front-facing. Fragments rasterized from polygon primitives inherit their facingness from the polygon, even if the polygon is rasterized as points or lines due to the current VkPolygonMode. Whether a polygon is front- or back-facing is determined in the same manner used for face culling (see Basic Polygon Rasterization).
The operation of the stencil test is also affected by the compareMask
,
writeMask
, and reference
members of VkStencilOpState
set
in the pipeline state object if the pipeline state object is created without
the VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK
,
VK_DYNAMIC_STATE_STENCIL_WRITE_MASK
, and
VK_DYNAMIC_STATE_STENCIL_REFERENCE
dynamic states enabled,
respectively.
If the pipeline state object is created with the
VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK
dynamic state enabled, then to
dynamically set the stencil compare mask call:
void vkCmdSetStencilCompareMask(
VkCommandBuffer commandBuffer,
VkStencilFaceFlags faceMask,
uint32_t compareMask);
- commandBuffer is the command buffer into which the command will be recorded.
- faceMask is a bitmask of VkStencilFaceFlagBits specifying the set of stencil state for which to update the compare mask.
- compareMask is the new value to use as the stencil compare mask.
Bits which can be set in the
vkCmdSetStencilCompareMask::faceMask
parameter, and similar
parameters of other commands specifying which stencil state to update
stencil masks for, are:
typedef enum VkStencilFaceFlagBits {
VK_STENCIL_FACE_FRONT_BIT = 0x00000001,
VK_STENCIL_FACE_BACK_BIT = 0x00000002,
VK_STENCIL_FRONT_AND_BACK = 0x00000003,
} VkStencilFaceFlagBits;
- VK_STENCIL_FACE_FRONT_BIT specifies that only the front set of stencil state is updated.
- VK_STENCIL_FACE_BACK_BIT specifies that only the back set of stencil state is updated.
- VK_STENCIL_FRONT_AND_BACK is the combination of VK_STENCIL_FACE_FRONT_BIT and VK_STENCIL_FACE_BACK_BIT, and specifies that both sets of stencil state are updated.
typedef VkFlags VkStencilFaceFlags;
VkStencilFaceFlags
is a bitmask type for setting a mask of zero or
more VkStencilFaceFlagBits.
If the pipeline state object is created with the
VK_DYNAMIC_STATE_STENCIL_WRITE_MASK
dynamic state enabled, then to
dynamically set the stencil write mask call:
void vkCmdSetStencilWriteMask(
VkCommandBuffer commandBuffer,
VkStencilFaceFlags faceMask,
uint32_t writeMask);
- commandBuffer is the command buffer into which the command will be recorded.
- faceMask is a bitmask of VkStencilFaceFlagBits specifying the set of stencil state for which to update the write mask, as described above for vkCmdSetStencilCompareMask.
- writeMask is the new value to use as the stencil write mask.
If the pipeline state object is created with the
VK_DYNAMIC_STATE_STENCIL_REFERENCE
dynamic state enabled, then to
dynamically set the stencil reference value call:
void vkCmdSetStencilReference(
VkCommandBuffer commandBuffer,
VkStencilFaceFlags faceMask,
uint32_t reference);
- commandBuffer is the command buffer into which the command will be recorded.
- faceMask is a bitmask of VkStencilFaceFlagBits specifying the set of stencil state for which to update the reference value, as described above for vkCmdSetStencilCompareMask.
- reference is the new value to use as the stencil reference value.
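When the corresponding dynamic states are enabled on the pipeline, the three stencil parameters can be updated together at record time; the values below are example choices.

#include <vulkan/vulkan.h>

/* Illustrative: update compare mask, write mask and reference for both faces. */
static void recordStencilDynamicState(VkCommandBuffer commandBuffer)
{
    vkCmdSetStencilCompareMask(commandBuffer, VK_STENCIL_FRONT_AND_BACK, 0xFF);
    vkCmdSetStencilWriteMask(commandBuffer, VK_STENCIL_FRONT_AND_BACK, 0xFF);
    vkCmdSetStencilReference(commandBuffer, VK_STENCIL_FRONT_AND_BACK, 0x01);
}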
reference
is an integer reference value that is used in the unsigned
stencil comparison.
The reference value used by stencil comparison must be within the range [0, 2^s - 1], where s is the number of bits in the stencil framebuffer attachment; otherwise the reference value is considered undefined.
The s least significant bits of compareMask are bitwise ANDed with both the reference and the stored stencil value, and the resulting masked values are those that participate in the comparison controlled by compareOp.
Let R be the masked reference value and S be the masked stored stencil value.
Possible values of VkStencilOpState::compareOp, specifying the stencil comparison function, are:
typedef enum VkCompareOp {
VK_COMPARE_OP_NEVER = 0,
VK_COMPARE_OP_LESS = 1,
VK_COMPARE_OP_EQUAL = 2,
VK_COMPARE_OP_LESS_OR_EQUAL = 3,
VK_COMPARE_OP_GREATER = 4,
VK_COMPARE_OP_NOT_EQUAL = 5,
VK_COMPARE_OP_GREATER_OR_EQUAL = 6,
VK_COMPARE_OP_ALWAYS = 7,
} VkCompareOp;
- VK_COMPARE_OP_NEVER specifies that the test never passes.
- VK_COMPARE_OP_LESS specifies that the test passes when R < S.
- VK_COMPARE_OP_EQUAL specifies that the test passes when R = S.
- VK_COMPARE_OP_LESS_OR_EQUAL specifies that the test passes when R ≤ S.
- VK_COMPARE_OP_GREATER specifies that the test passes when R > S.
- VK_COMPARE_OP_NOT_EQUAL specifies that the test passes when R ≠ S.
- VK_COMPARE_OP_GREATER_OR_EQUAL specifies that the test passes when R ≥ S.
- VK_COMPARE_OP_ALWAYS specifies that the test always passes.
Possible values of the failOp
, passOp
, and depthFailOp
members of VkStencilOpState, specifying what happens to the stored
stencil value if this or certain subsequent tests fail or pass, are:
typedef enum VkStencilOp {
VK_STENCIL_OP_KEEP = 0,
VK_STENCIL_OP_ZERO = 1,
VK_STENCIL_OP_REPLACE = 2,
VK_STENCIL_OP_INCREMENT_AND_CLAMP = 3,
VK_STENCIL_OP_DECREMENT_AND_CLAMP = 4,
VK_STENCIL_OP_INVERT = 5,
VK_STENCIL_OP_INCREMENT_AND_WRAP = 6,
VK_STENCIL_OP_DECREMENT_AND_WRAP = 7,
} VkStencilOp;
- VK_STENCIL_OP_KEEP keeps the current value.
- VK_STENCIL_OP_ZERO sets the value to 0.
- VK_STENCIL_OP_REPLACE sets the value to reference.
- VK_STENCIL_OP_INCREMENT_AND_CLAMP increments the current value and clamps to the maximum representable unsigned value.
- VK_STENCIL_OP_DECREMENT_AND_CLAMP decrements the current value and clamps to 0.
- VK_STENCIL_OP_INVERT bitwise-inverts the current value.
- VK_STENCIL_OP_INCREMENT_AND_WRAP increments the current value and wraps to 0 when the maximum value would have been exceeded.
- VK_STENCIL_OP_DECREMENT_AND_WRAP decrements the current value and wraps to the maximum possible value when the value would go below 0.
For purposes of increment and decrement, the stencil bits are considered as an unsigned integer.
If the stencil test fails, the sample’s coverage bit is cleared in the fragment. If there is no stencil framebuffer attachment, stencil modification cannot occur, and it is as if the stencil tests always pass.
If the stencil test passes, the writeMask
member of the
VkStencilOpState structures controls how the updated stencil value is
written to the stencil framebuffer attachment.
The least significant s bits of writeMask
, where s is the
number of bits in the stencil framebuffer attachment, specify an integer
mask.
Where a 1 appears in this mask, the corresponding bit in the stencil
value in the depth/stencil attachment is written; where a 0 appears,
the bit is not written.
The writeMask
value uses either the front-facing or back-facing state
based on the facingness of the fragment.
Fragments generated by front-facing primitives use the front mask and
fragments generated by back-facing primitives use the back mask.
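As an informative, non-normative illustration, the following C sketch configures a VkStencilOpState that counts fragments passing both the stencil and depth tests, and shows how the masks and reference would instead be supplied at record time when the corresponding dynamic states are enabled. The variable and helper names are illustrative only.

#include <vulkan/vulkan.h>

/* Informative sketch only: static stencil state that increments the stored
 * value for every fragment that passes both the stencil and depth tests. */
static const VkStencilOpState frontStencil = {
    .failOp      = VK_STENCIL_OP_KEEP,                /* stencil test fails: keep stored value */
    .passOp      = VK_STENCIL_OP_INCREMENT_AND_CLAMP, /* stencil and depth pass: count it      */
    .depthFailOp = VK_STENCIL_OP_KEEP,                /* stencil passes, depth fails: keep     */
    .compareOp   = VK_COMPARE_OP_ALWAYS,              /* compares masked R against masked S    */
    .compareMask = 0xFF,
    .writeMask   = 0xFF,
    .reference   = 0,
};
/* frontStencil (and a matching back state) would be assigned to
 * VkPipelineDepthStencilStateCreateInfo::front and ::back at pipeline creation. */

/* If the pipeline enabled VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK, _WRITE_MASK,
 * and _REFERENCE, the same values are instead set while recording: */
static void setDynamicStencilState(VkCommandBuffer commandBuffer)
{
    vkCmdSetStencilCompareMask(commandBuffer, VK_STENCIL_FRONT_AND_BACK, 0x0F);
    vkCmdSetStencilWriteMask(commandBuffer, VK_STENCIL_FRONT_AND_BACK, 0x0F);
    vkCmdSetStencilReference(commandBuffer, VK_STENCIL_FACE_FRONT_BIT, 1);
}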
27.13. Depth Test
The depth test conditionally disables coverage of a sample based on the
outcome of a comparison between the fragment’s depth value at the sample
location and the sample’s depth value in the depth/stencil attachment at
location (xf,yf).
The comparison is enabled or disabled with the depthTestEnable
member
of the VkPipelineDepthStencilStateCreateInfo structure.
When disabled, the depth comparison and subsequent possible updates to the
value of the depth component of the depth/stencil attachment are bypassed
and the fragment is passed to the next operation.
The stencil value, however, can be modified as indicated above as if the
depth test passed.
If enabled, the comparison takes place and the depth/stencil attachment
value can subsequently be modified.
The comparison is specified with the depthCompareOp
member of
VkPipelineDepthStencilStateCreateInfo.
Let zf be the incoming fragment's depth value for a sample, and let za be the depth/stencil attachment value in memory for that sample.
The depth test passes under the following conditions:
- VK_COMPARE_OP_NEVER: the test never passes.
- VK_COMPARE_OP_LESS: the test passes when zf < za.
- VK_COMPARE_OP_EQUAL: the test passes when zf = za.
- VK_COMPARE_OP_LESS_OR_EQUAL: the test passes when zf ≤ za.
- VK_COMPARE_OP_GREATER: the test passes when zf > za.
- VK_COMPARE_OP_NOT_EQUAL: the test passes when zf ≠ za.
- VK_COMPARE_OP_GREATER_OR_EQUAL: the test passes when zf ≥ za.
- VK_COMPARE_OP_ALWAYS: the test always passes.
If VkPipelineRasterizationStateCreateInfo::depthClampEnable is enabled, before the incoming fragment's zf is compared to za, zf is clamped to [min(n,f), max(n,f)], where n and f are the minDepth and maxDepth depth range values of the viewport used by this fragment, respectively.
If the depth test fails, the sample’s coverage bit is cleared in the fragment. The stencil value at the sample’s location is updated according to the function currently in effect for depth test failure.
If the depth test passes, the sample's (possibly clamped) zf value is conditionally written to the depth framebuffer attachment based on the depthWriteEnable member of VkPipelineDepthStencilStateCreateInfo.
If depthWriteEnable is VK_TRUE the value is written, and if it is VK_FALSE the value is not written.
If the depth framebuffer attachment is a fixed-point format and the depth value is outside of the 0.0 to 1.0 range, the depth value is clamped between 0.0 and 1.0 inclusive before writing.
The stencil value at the sample’s location is updated according to the
function currently in effect for depth test success.
If there is no depth framebuffer attachment, it is as if the depth test always passes.
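As a non-normative example, a conventional depth-buffered configuration might fill the depth/stencil state as follows; the variable name and chosen values are illustrative only.

#include <vulkan/vulkan.h>

/* Informative sketch: enable the depth test with a "closer or equal" comparison
 * and allow passing samples to write their zf value to the depth attachment. */
static const VkPipelineDepthStencilStateCreateInfo depthStencilState = {
    .sType                 = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO,
    .depthTestEnable       = VK_TRUE,                     /* perform the zf / za comparison */
    .depthWriteEnable      = VK_TRUE,                     /* write zf when the test passes  */
    .depthCompareOp        = VK_COMPARE_OP_LESS_OR_EQUAL, /* pass when zf <= za             */
    .depthBoundsTestEnable = VK_FALSE,
    .stencilTestEnable     = VK_FALSE,
    /* front, back, minDepthBounds and maxDepthBounds are zero-initialized and
     * are ignored while their respective tests are disabled. */
};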
27.14. Representative Fragment Test
The representative fragment test allows implementations to reduce the amount of rasterization and fragment processing work performed for each point, line, or triangle primitive. For any primitive that produces one or more fragments that pass all prior early fragment tests, the implementation may choose one or more “representative” fragments for processing and discard all other fragments. For draw calls rendering multiple points, lines, or triangles arranged in lists, strips, or fans, the representative fragment test is performed independently for each of those primitives. The set of fragments discarded by the representative fragment test is implementation-dependent. In some cases, the representative fragment test may not discard any fragments for a given primitive.
If the pNext
chain of VkGraphicsPipelineCreateInfo includes a
VkPipelineRepresentativeFragmentTestStateCreateInfoNV
structure, then
that structure includes parameters that control the representative fragment
test.
The VkPipelineRepresentativeFragmentTestStateCreateInfoNV
structure is
defined as:
typedef struct VkPipelineRepresentativeFragmentTestStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkBool32 representativeFragmentTestEnable;
} VkPipelineRepresentativeFragmentTestStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- representativeFragmentTestEnable controls whether the representative fragment test is enabled.
If this structure is not present, representativeFragmentTestEnable
is
considered to be VK_FALSE
, and the representative fragment test is
disabled.
If early fragment tests are not enabled in the active fragment shader, the representative fragment shader test has no effect, even if enabled.
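A non-normative sketch of enabling the test follows; it assumes the VK_NV_representative_fragment_test extension is enabled and that pipelineInfo points at an otherwise complete VkGraphicsPipelineCreateInfo. The helper name is illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: chain the representative fragment test state into a
 * graphics pipeline's creation info, preserving any existing pNext chain. */
static VkPipelineRepresentativeFragmentTestStateCreateInfoNV repFragmentTest = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_REPRESENTATIVE_FRAGMENT_TEST_STATE_CREATE_INFO_NV,
    .representativeFragmentTestEnable = VK_TRUE,
};

static void enableRepresentativeFragmentTest(VkGraphicsPipelineCreateInfo *pipelineInfo)
{
    repFragmentTest.pNext = pipelineInfo->pNext; /* keep whatever was already chained */
    pipelineInfo->pNext   = &repFragmentTest;
}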
27.15. Sample Counting
Occlusion queries use query pool entries to track the number of samples that pass all the per-fragment tests. The mechanism of collecting an occlusion query value is described in Occlusion Queries.
The occlusion query sample counter increments by one for each sample with a coverage value of 1 in each fragment that survives all the per-fragment tests, including scissor, exclusive scissor, sample mask, alpha to coverage, stencil, and depth tests.
27.16. Fragment Coverage To Color
If the pNext
chain of VkPipelineMultisampleStateCreateInfo
includes a VkPipelineCoverageToColorStateCreateInfoNV
structure, then
that structure controls whether the fragment coverage is substituted for a
fragment color output and, if so, which output is replaced.
The VkPipelineCoverageToColorStateCreateInfoNV
structure is defined
as:
typedef struct VkPipelineCoverageToColorStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkPipelineCoverageToColorStateCreateFlagsNV flags;
VkBool32 coverageToColorEnable;
uint32_t coverageToColorLocation;
} VkPipelineCoverageToColorStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- coverageToColorEnable controls whether the fragment coverage value replaces a fragment color output.
- coverageToColorLocation controls which fragment shader color output value is replaced.
If coverageToColorEnable
is VK_TRUE
, the fragment coverage
information is treated as a bitmask with one bit for each sample (as in the
Sample Mask section), and this bitmask replaces the
first component of the color value corresponding to the fragment shader
output location with Location
equal to coverageToColorLocation
and Index
equal to zero.
If the color attachment format has fewer bits than the sample coverage, the
low bits of the sample coverage bitmask are taken without any clamping.
If the color attachment format has more bits than the sample coverage, the
high bits of the sample coverage bitmask are filled with zeros.
If Sample Shading is in use, the coverage bitmask only has bits set for samples that correspond to the fragment shader invocation that shades those samples.
This pipeline stage occurs after sample counting and before blending, and is
always performed after fragment shading regardless of the setting of
EarlyFragmentTests
.
If coverageToColorEnable
is VK_FALSE
, these operations are
skipped.
If this structure is not present, it is as if coverageToColorEnable
is
VK_FALSE
.
typedef VkFlags VkPipelineCoverageToColorStateCreateFlagsNV;
VkPipelineCoverageToColorStateCreateFlagsNV
is a bitmask type for
setting a mask, but is currently reserved for future use.
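As a non-normative illustration, and assuming the VK_NV_fragment_coverage_to_color extension is enabled, the structure might be filled as follows and then chained into VkPipelineMultisampleStateCreateInfo::pNext; the variable name is illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: replace the R component of the color output at
 * Location=1, Index=0 with the fragment's coverage bitmask. */
static const VkPipelineCoverageToColorStateCreateInfoNV coverageToColor = {
    .sType                   = VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_TO_COLOR_STATE_CREATE_INFO_NV,
    .coverageToColorEnable   = VK_TRUE,
    .coverageToColorLocation = 1,
};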
27.17. Coverage Reduction
Coverage reduction generates a color sample mask from the coverage mask, with one bit for each sample in the color attachment(s) for the subpass. If a bit in the color sample mask is 0, then blending and writing to the framebuffer are not performed for that sample.
When the VK_NV_framebuffer_mixed_samples
extension is not enabled, each
color sample is associated with a unique rasterization sample, and the value
of the coverage mask is assigned to the color sample mask.
If the render pass has a fragment density map attachment, rasterizationSamples is greater than 1, and the fragment area covers multiple pixels, then there is an implementation-dependent association of rasterization samples to color attachment samples within the fragment.
Each color sample's mask bit is assigned the union of the coverage bits of its associated raster samples.
When the VK_NV_framebuffer_mixed_samples
extension is enabled, if the
pipeline’s
VkPipelineMultisampleStateCreateInfo::rasterizationSamples
is
greater than one and the VkAttachmentDescription::samples
of the
color attachments is one, then the fragment’s coverage is reduced from
rasterizationSamples
bits to a single bit, where the color sample mask
is 1 if any bit in the fragment’s coverage is on, and 0 otherwise.
If the pipeline’s
VkPipelineMultisampleStateCreateInfo::rasterizationSamples
is
greater than the VkAttachmentDescription::samples
of the color
attachments in the subpass, then the fragment’s coverage is reduced from
rasterizationSamples
bits to a color sample mask with
VkAttachmentDescription::samples
bits.
There is an implementation-dependent association of raster samples to color
samples.
The reduced color sample mask is computed such that the bit for each color
sample is 1 if any of the associated bits in the fragment’s coverage is on,
and 0 otherwise.
27.17.1. Coverage Modulation
As part of coverage reduction, fragment color values can also be modulated (multiplied) by a value that is a function of the fraction of covered rasterization samples associated with that color sample.
Pipeline state controlling coverage reduction is specified through the
members of the VkPipelineCoverageModulationStateCreateInfoNV
structure.
The VkPipelineCoverageModulationStateCreateInfoNV
structure is defined
as:
typedef struct VkPipelineCoverageModulationStateCreateInfoNV {
VkStructureType sType;
const void* pNext;
VkPipelineCoverageModulationStateCreateFlagsNV flags;
VkCoverageModulationModeNV coverageModulationMode;
VkBool32 coverageModulationTableEnable;
uint32_t coverageModulationTableCount;
const float* pCoverageModulationTable;
} VkPipelineCoverageModulationStateCreateInfoNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- coverageModulationMode controls which color components are modulated and is of type VkCoverageModulationModeNV.
- coverageModulationTableEnable controls whether the modulation factor is looked up from a table in pCoverageModulationTable.
- coverageModulationTableCount is the number of elements in pCoverageModulationTable.
- pCoverageModulationTable is a table of modulation factors containing a value for each number of covered samples.
If coverageModulationTableEnable
is VK_FALSE
, then for each
color sample the associated bits of the fragment’s coverage are counted and
divided by the number of associated bits to produce a modulation factor
R in the range (0,1] (a value of zero would have been killed due
to a color coverage of 0).
Specifically:
- N = value of rasterizationSamples
- M = value of VkAttachmentDescription::samples for any color attachments
- R = popcount(associated coverage bits) / (N / M)
If coverageModulationTableEnable is VK_TRUE, the value R is computed using a programmable lookup table.
The lookup table has N / M elements, and the element of the table is selected by:
- R = pCoverageModulationTable[popcount(associated coverage bits) - 1]
Note that the table does not have an entry for popcount(associated coverage bits) = 0, because such samples would have been killed.
The values of pCoverageModulationTable
may be rounded to an
implementation-dependent precision, which is at least as fine as 1 /
N, and clamped to [0,1].
For each color attachment with a floating point or normalized color format,
each fragment output color value is replicated to M values which can
each be modulated (multiplied) by that color sample’s associated value of
R.
Which components are modulated is controlled by
coverageModulationMode
.
If this structure is not present, it is as if coverageModulationMode is
VK_COVERAGE_MODULATION_MODE_NONE_NV
.
typedef VkFlags VkPipelineCoverageModulationStateCreateFlagsNV;
VkPipelineCoverageModulationStateCreateFlagsNV
is a bitmask type for
setting a mask, but is currently reserved for future use.
Possible values of
VkPipelineCoverageModulationStateCreateInfoNV::coverageModulationMode
,
specifying which color components are modulated, are:
typedef enum VkCoverageModulationModeNV {
VK_COVERAGE_MODULATION_MODE_NONE_NV = 0,
VK_COVERAGE_MODULATION_MODE_RGB_NV = 1,
VK_COVERAGE_MODULATION_MODE_ALPHA_NV = 2,
VK_COVERAGE_MODULATION_MODE_RGBA_NV = 3,
} VkCoverageModulationModeNV;
- VK_COVERAGE_MODULATION_MODE_NONE_NV specifies that no components are multiplied by the modulation factor.
- VK_COVERAGE_MODULATION_MODE_RGB_NV specifies that the red, green, and blue components are multiplied by the modulation factor.
- VK_COVERAGE_MODULATION_MODE_ALPHA_NV specifies that the alpha component is multiplied by the modulation factor.
- VK_COVERAGE_MODULATION_MODE_RGBA_NV specifies that all components are multiplied by the modulation factor.
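As a non-normative example of the table-driven path, assuming VK_NV_framebuffer_mixed_samples with rasterizationSamples = 8 and 2-sample color attachments (so N / M = 4 table entries), the modulation state might be filled as follows; the chosen factors are illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: explicit modulation table indexed by
 * popcount(associated coverage bits) - 1. */
static const float coverageModulationTable[4] = { 0.25f, 0.5f, 0.75f, 1.0f };

static const VkPipelineCoverageModulationStateCreateInfoNV coverageModulation = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_MODULATION_STATE_CREATE_INFO_NV,
    .coverageModulationMode        = VK_COVERAGE_MODULATION_MODE_RGB_NV, /* modulate RGB, leave A alone */
    .coverageModulationTableEnable = VK_TRUE,
    .coverageModulationTableCount  = 4,
    .pCoverageModulationTable      = coverageModulationTable,
};
/* This structure would be chained into VkPipelineMultisampleStateCreateInfo::pNext. */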
28. The Framebuffer
28.1. Blending
Blending combines the incoming source fragment’s R, G, B, and A values with the destination R, G, B, and A values of each sample stored in the framebuffer at the fragment’s (xf,yf) location. Blending is performed for each color sample covered by the fragment, rather than just once for each fragment.
Source and destination values are combined according to the blend operation, quadruplets of source and destination weighting factors determined by the blend factors, and a blend constant, to obtain a new set of R, G, B, and A values, as described below.
Blending is computed and applied separately to each color attachment used by the subpass, with separate controls for each attachment.
Prior to performing the blend operation, signed and unsigned normalized fixed-point color components undergo an implied conversion to floating-point as specified by Conversion from Normalized Fixed-Point to Floating-Point. Blending computations are treated as if carried out in floating-point, and basic blend operations are performed with a precision and dynamic range no lower than that used to represent destination components. Advanced blending operations are performed with a precision and dynamic range no lower than the smaller of that used to represent destination components or that used to represent 16-bit floating-point values.
Blending applies only to fixed-point and floating-point color attachments. If the color attachment has an integer format, blending is not applied.
The pipeline blend state is included in the
VkPipelineColorBlendStateCreateInfo
structure during graphics pipeline
creation:
The VkPipelineColorBlendStateCreateInfo
structure is defined as:
typedef struct VkPipelineColorBlendStateCreateInfo {
VkStructureType sType;
const void* pNext;
VkPipelineColorBlendStateCreateFlags flags;
VkBool32 logicOpEnable;
VkLogicOp logicOp;
uint32_t attachmentCount;
const VkPipelineColorBlendAttachmentState* pAttachments;
float blendConstants[4];
} VkPipelineColorBlendStateCreateInfo;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use.
- logicOpEnable controls whether to apply Logical Operations.
- logicOp selects which logical operation to apply.
- attachmentCount is the number of VkPipelineColorBlendAttachmentState elements in pAttachments. This value must equal the colorAttachmentCount for the subpass in which this pipeline is used.
- pAttachments is a pointer to an array of per-target attachment states.
- blendConstants is an array of four values used as the R, G, B, and A components of the blend constant that are used in blending, depending on the blend factor.
Each element of the pAttachments
array is a
VkPipelineColorBlendAttachmentState structure specifying per-target
blending state for each individual color attachment.
If the independent blending feature
is not enabled on the device, all VkPipelineColorBlendAttachmentState
elements in the pAttachments
array must be identical.
typedef VkFlags VkPipelineColorBlendStateCreateFlags;
VkPipelineColorBlendStateCreateFlags
is a bitmask type for setting a
mask, but is currently reserved for future use.
The VkPipelineColorBlendAttachmentState
structure is defined as:
typedef struct VkPipelineColorBlendAttachmentState {
VkBool32 blendEnable;
VkBlendFactor srcColorBlendFactor;
VkBlendFactor dstColorBlendFactor;
VkBlendOp colorBlendOp;
VkBlendFactor srcAlphaBlendFactor;
VkBlendFactor dstAlphaBlendFactor;
VkBlendOp alphaBlendOp;
VkColorComponentFlags colorWriteMask;
} VkPipelineColorBlendAttachmentState;
- blendEnable controls whether blending is enabled for the corresponding color attachment. If blending is not enabled, the source fragment's color for that attachment is passed through unmodified.
- srcColorBlendFactor selects which blend factor is used to determine the source factors (Sr,Sg,Sb).
- dstColorBlendFactor selects which blend factor is used to determine the destination factors (Dr,Dg,Db).
- colorBlendOp selects which blend operation is used to calculate the RGB values to write to the color attachment.
- srcAlphaBlendFactor selects which blend factor is used to determine the source factor Sa.
- dstAlphaBlendFactor selects which blend factor is used to determine the destination factor Da.
- alphaBlendOp selects which blend operation is used to calculate the alpha values to write to the color attachment.
- colorWriteMask is a bitmask of VkColorComponentFlagBits specifying which of the R, G, B, and/or A components are enabled for writing, as described for the Color Write Mask.
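As a non-normative example, classic non-premultiplied "source over" alpha blending for a single color attachment could be expressed as follows; the variable names are illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: C = Cs*As + Cd*(1-As) for color, A = As + Ad*(1-As) for alpha. */
static const VkPipelineColorBlendAttachmentState attachmentBlend = {
    .blendEnable         = VK_TRUE,
    .srcColorBlendFactor = VK_BLEND_FACTOR_SRC_ALPHA,
    .dstColorBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA,
    .colorBlendOp        = VK_BLEND_OP_ADD,
    .srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE,
    .dstAlphaBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA,
    .alphaBlendOp        = VK_BLEND_OP_ADD,
    .colorWriteMask      = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                           VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT,
};

static const VkPipelineColorBlendStateCreateInfo colorBlendState = {
    .sType           = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
    .logicOpEnable   = VK_FALSE,
    .attachmentCount = 1,                /* must equal the subpass' colorAttachmentCount */
    .pAttachments    = &attachmentBlend,
    .blendConstants  = { 0.0f, 0.0f, 0.0f, 0.0f },
};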
28.1.1. Blend Factors
The source and destination color and alpha blending factors are selected from the enum:
typedef enum VkBlendFactor {
VK_BLEND_FACTOR_ZERO = 0,
VK_BLEND_FACTOR_ONE = 1,
VK_BLEND_FACTOR_SRC_COLOR = 2,
VK_BLEND_FACTOR_ONE_MINUS_SRC_COLOR = 3,
VK_BLEND_FACTOR_DST_COLOR = 4,
VK_BLEND_FACTOR_ONE_MINUS_DST_COLOR = 5,
VK_BLEND_FACTOR_SRC_ALPHA = 6,
VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA = 7,
VK_BLEND_FACTOR_DST_ALPHA = 8,
VK_BLEND_FACTOR_ONE_MINUS_DST_ALPHA = 9,
VK_BLEND_FACTOR_CONSTANT_COLOR = 10,
VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_COLOR = 11,
VK_BLEND_FACTOR_CONSTANT_ALPHA = 12,
VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_ALPHA = 13,
VK_BLEND_FACTOR_SRC_ALPHA_SATURATE = 14,
VK_BLEND_FACTOR_SRC1_COLOR = 15,
VK_BLEND_FACTOR_ONE_MINUS_SRC1_COLOR = 16,
VK_BLEND_FACTOR_SRC1_ALPHA = 17,
VK_BLEND_FACTOR_ONE_MINUS_SRC1_ALPHA = 18,
} VkBlendFactor;
The semantics of each enum value is described in the table below:
VkBlendFactor | RGB Blend Factors (Sr,Sg,Sb) or (Dr,Dg,Db) | Alpha Blend Factor (Sa or Da) |
---|---|---|
VK_BLEND_FACTOR_ZERO | (0,0,0) | 0 |
VK_BLEND_FACTOR_ONE | (1,1,1) | 1 |
VK_BLEND_FACTOR_SRC_COLOR | (Rs0,Gs0,Bs0) | As0 |
VK_BLEND_FACTOR_ONE_MINUS_SRC_COLOR | (1-Rs0,1-Gs0,1-Bs0) | 1-As0 |
VK_BLEND_FACTOR_DST_COLOR | (Rd,Gd,Bd) | Ad |
VK_BLEND_FACTOR_ONE_MINUS_DST_COLOR | (1-Rd,1-Gd,1-Bd) | 1-Ad |
VK_BLEND_FACTOR_SRC_ALPHA | (As0,As0,As0) | As0 |
VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA | (1-As0,1-As0,1-As0) | 1-As0 |
VK_BLEND_FACTOR_DST_ALPHA | (Ad,Ad,Ad) | Ad |
VK_BLEND_FACTOR_ONE_MINUS_DST_ALPHA | (1-Ad,1-Ad,1-Ad) | 1-Ad |
VK_BLEND_FACTOR_CONSTANT_COLOR | (Rc,Gc,Bc) | Ac |
VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_COLOR | (1-Rc,1-Gc,1-Bc) | 1-Ac |
VK_BLEND_FACTOR_CONSTANT_ALPHA | (Ac,Ac,Ac) | Ac |
VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_ALPHA | (1-Ac,1-Ac,1-Ac) | 1-Ac |
VK_BLEND_FACTOR_SRC_ALPHA_SATURATE | (f,f,f); f = min(As0,1-Ad) | 1 |
VK_BLEND_FACTOR_SRC1_COLOR | (Rs1,Gs1,Bs1) | As1 |
VK_BLEND_FACTOR_ONE_MINUS_SRC1_COLOR | (1-Rs1,1-Gs1,1-Bs1) | 1-As1 |
VK_BLEND_FACTOR_SRC1_ALPHA | (As1,As1,As1) | As1 |
VK_BLEND_FACTOR_ONE_MINUS_SRC1_ALPHA | (1-As1,1-As1,1-As1) | 1-As1 |
In this table, the following conventions are used:
- Rs0, Gs0, Bs0 and As0 represent the first source color R, G, B, and A components, respectively, for the fragment output location corresponding to the color attachment being blended.
- Rs1, Gs1, Bs1 and As1 represent the second source color R, G, B, and A components, respectively, used in dual source blending modes, for the fragment output location corresponding to the color attachment being blended.
- Rd, Gd, Bd and Ad represent the R, G, B, and A components of the destination color. That is, the color currently in the corresponding color attachment for this fragment/sample.
- Rc, Gc, Bc and Ac represent the blend constant R, G, B, and A components, respectively.
If the pipeline state object is created without the
VK_DYNAMIC_STATE_BLEND_CONSTANTS
dynamic state enabled then the blend
constant (Rc,Gc,Bc,Ac) is specified via the
blendConstants
member of VkPipelineColorBlendStateCreateInfo.
Otherwise, to dynamically set and change the blend constant, call:
void vkCmdSetBlendConstants(
VkCommandBuffer commandBuffer,
const float blendConstants[4]);
- commandBuffer is the command buffer into which the command will be recorded.
- blendConstants is an array of four values specifying the R, G, B, and A components of the blend constant color used in blending, depending on the blend factor.
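A brief non-normative example of the dynamic path, assuming the bound pipeline was created with VK_DYNAMIC_STATE_BLEND_CONSTANTS enabled and uses one of the *_CONSTANT_* blend factors; the helper name is illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: set a half-intensity blend constant (Rc,Gc,Bc,Ac) at record time. */
static void setBlendConstants(VkCommandBuffer commandBuffer)
{
    const float constants[4] = { 0.5f, 0.5f, 0.5f, 1.0f };
    vkCmdSetBlendConstants(commandBuffer, constants);
}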
28.1.2. Dual-Source Blending
Blend factors that use the secondary color input
(Rs1,Gs1,Bs1,As1) (VK_BLEND_FACTOR_SRC1_COLOR
,
VK_BLEND_FACTOR_ONE_MINUS_SRC1_COLOR
,
VK_BLEND_FACTOR_SRC1_ALPHA
, and
VK_BLEND_FACTOR_ONE_MINUS_SRC1_ALPHA
) may consume implementation
resources that could otherwise be used for rendering to multiple color
attachments.
Therefore, the number of color attachments that can be used in a
framebuffer may be lower when using dual-source blending.
Dual-source blending is only supported if the
dualSrcBlend
feature is enabled.
The maximum number of color attachments that can be used in a subpass when
using dual-source blending functions is implementation-dependent and is
reported as the maxFragmentDualSrcAttachments
member of
VkPhysicalDeviceLimits
.
When using a fragment shader with dual-source blending functions, the color
outputs are bound to the first and second inputs of the blender using the
Index
decoration, as described in Fragment
Output Interface.
If the second color input to the blender is not written in the shader, or if
no output is bound to the second input of a blender, the result of the
blending operation is not defined.
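As a non-normative sketch, a dual-source configuration that modulates the destination by the second source color might look as follows; it assumes the dualSrcBlend feature is supported and that the fragment shader declares outputs with Index 0 and Index 1 for this Location.

#include <vulkan/vulkan.h>

/* Informative sketch: result = Csrc0 + Cdst * Csrc1, i.e. a per-channel "filter"
 * where the second source color scales the destination. */
static const VkPipelineColorBlendAttachmentState dualSourceBlend = {
    .blendEnable         = VK_TRUE,
    .srcColorBlendFactor = VK_BLEND_FACTOR_ONE,
    .dstColorBlendFactor = VK_BLEND_FACTOR_SRC1_COLOR,  /* second source color (Rs1,Gs1,Bs1) */
    .colorBlendOp        = VK_BLEND_OP_ADD,
    .srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE,
    .dstAlphaBlendFactor = VK_BLEND_FACTOR_SRC1_ALPHA,  /* second source alpha As1 */
    .alphaBlendOp        = VK_BLEND_OP_ADD,
    .colorWriteMask      = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                           VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT,
};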
28.1.3. Blend Operations
Once the source and destination blend factors have been selected, they along with the source and destination components are passed to the blending operations. RGB and alpha components can use different operations. Possible values of VkBlendOp, specifying the operations, are:
typedef enum VkBlendOp {
VK_BLEND_OP_ADD = 0,
VK_BLEND_OP_SUBTRACT = 1,
VK_BLEND_OP_REVERSE_SUBTRACT = 2,
VK_BLEND_OP_MIN = 3,
VK_BLEND_OP_MAX = 4,
VK_BLEND_OP_ZERO_EXT = 1000148000,
VK_BLEND_OP_SRC_EXT = 1000148001,
VK_BLEND_OP_DST_EXT = 1000148002,
VK_BLEND_OP_SRC_OVER_EXT = 1000148003,
VK_BLEND_OP_DST_OVER_EXT = 1000148004,
VK_BLEND_OP_SRC_IN_EXT = 1000148005,
VK_BLEND_OP_DST_IN_EXT = 1000148006,
VK_BLEND_OP_SRC_OUT_EXT = 1000148007,
VK_BLEND_OP_DST_OUT_EXT = 1000148008,
VK_BLEND_OP_SRC_ATOP_EXT = 1000148009,
VK_BLEND_OP_DST_ATOP_EXT = 1000148010,
VK_BLEND_OP_XOR_EXT = 1000148011,
VK_BLEND_OP_MULTIPLY_EXT = 1000148012,
VK_BLEND_OP_SCREEN_EXT = 1000148013,
VK_BLEND_OP_OVERLAY_EXT = 1000148014,
VK_BLEND_OP_DARKEN_EXT = 1000148015,
VK_BLEND_OP_LIGHTEN_EXT = 1000148016,
VK_BLEND_OP_COLORDODGE_EXT = 1000148017,
VK_BLEND_OP_COLORBURN_EXT = 1000148018,
VK_BLEND_OP_HARDLIGHT_EXT = 1000148019,
VK_BLEND_OP_SOFTLIGHT_EXT = 1000148020,
VK_BLEND_OP_DIFFERENCE_EXT = 1000148021,
VK_BLEND_OP_EXCLUSION_EXT = 1000148022,
VK_BLEND_OP_INVERT_EXT = 1000148023,
VK_BLEND_OP_INVERT_RGB_EXT = 1000148024,
VK_BLEND_OP_LINEARDODGE_EXT = 1000148025,
VK_BLEND_OP_LINEARBURN_EXT = 1000148026,
VK_BLEND_OP_VIVIDLIGHT_EXT = 1000148027,
VK_BLEND_OP_LINEARLIGHT_EXT = 1000148028,
VK_BLEND_OP_PINLIGHT_EXT = 1000148029,
VK_BLEND_OP_HARDMIX_EXT = 1000148030,
VK_BLEND_OP_HSL_HUE_EXT = 1000148031,
VK_BLEND_OP_HSL_SATURATION_EXT = 1000148032,
VK_BLEND_OP_HSL_COLOR_EXT = 1000148033,
VK_BLEND_OP_HSL_LUMINOSITY_EXT = 1000148034,
VK_BLEND_OP_PLUS_EXT = 1000148035,
VK_BLEND_OP_PLUS_CLAMPED_EXT = 1000148036,
VK_BLEND_OP_PLUS_CLAMPED_ALPHA_EXT = 1000148037,
VK_BLEND_OP_PLUS_DARKER_EXT = 1000148038,
VK_BLEND_OP_MINUS_EXT = 1000148039,
VK_BLEND_OP_MINUS_CLAMPED_EXT = 1000148040,
VK_BLEND_OP_CONTRAST_EXT = 1000148041,
VK_BLEND_OP_INVERT_OVG_EXT = 1000148042,
VK_BLEND_OP_RED_EXT = 1000148043,
VK_BLEND_OP_GREEN_EXT = 1000148044,
VK_BLEND_OP_BLUE_EXT = 1000148045,
} VkBlendOp;
The semantics of each basic blend operation are described in the table below; the G and B components follow the same pattern as the R component shown:
VkBlendOp | RGB Components | Alpha Component |
---|---|---|
VK_BLEND_OP_ADD | R = Rs0 × Sr + Rd × Dr | A = As0 × Sa + Ad × Da |
VK_BLEND_OP_SUBTRACT | R = Rs0 × Sr - Rd × Dr | A = As0 × Sa - Ad × Da |
VK_BLEND_OP_REVERSE_SUBTRACT | R = Rd × Dr - Rs0 × Sr | A = Ad × Da - As0 × Sa |
VK_BLEND_OP_MIN | R = min(Rs0,Rd) | A = min(As0,Ad) |
VK_BLEND_OP_MAX | R = max(Rs0,Rd) | A = max(As0,Ad) |
In this table, the following conventions are used:
- Rs0, Gs0, Bs0 and As0 represent the first source color R, G, B, and A components, respectively.
- Rd, Gd, Bd and Ad represent the R, G, B, and A components of the destination color. That is, the color currently in the corresponding color attachment for this fragment/sample.
- Sr, Sg, Sb and Sa represent the source blend factor R, G, B, and A components, respectively.
- Dr, Dg, Db and Da represent the destination blend factor R, G, B, and A components, respectively.
The blending operation produces a new set of values R, G, B and A, which are written to the framebuffer attachment. If blending is not enabled for this attachment, then R, G, B and A are assigned Rs0, Gs0, Bs0 and As0, respectively.
If the color attachment is fixed-point, the components of the source and destination values and blend factors are each clamped to [0,1] or [-1,1] respectively for an unsigned normalized or signed normalized color attachment prior to evaluating the blend operations. If the color attachment is floating-point, no clamping occurs.
If the numeric format of a framebuffer attachment uses sRGB encoding, the R, G, and B destination color values (after conversion from fixed-point to floating-point) are considered to be encoded for the sRGB color space and hence are linearized prior to their use in blending. Each R, G, and B component is converted from nonlinear to linear as described in the “sRGB EOTF” section of the Khronos Data Format Specification. If the format is not sRGB, no linearization is performed.
If the numeric format of a framebuffer attachment uses sRGB encoding, then the final R, G and B values are converted into the nonlinear sRGB representation before being written to the framebuffer attachment as described in the “sRGB EOTF -1” section of the Khronos Data Format Specification.
If the framebuffer color attachment numeric format is not sRGB encoded then the resulting cs values for R, G and B are unmodified. The value of A is never sRGB encoded. That is, the alpha component is always stored in memory as linear.
If the framebuffer color attachment is VK_ATTACHMENT_UNUSED
, no writes
are performed through that attachment.
Framebuffer color attachments greater than or equal to VkSubpassDescription::colorAttachmentCount perform no writes.
28.1.4. Advanced Blend Operations
The advanced blend operations are those listed in tables f/X/Y/Z Advanced Blend Operations, Hue-Saturation-Luminosity Advanced Blend Operations, and Additional RGB Blend Operations.
If the pNext
chain of VkPipelineColorBlendStateCreateInfo
includes a VkPipelineColorBlendAdvancedStateCreateInfoEXT
structure,
then that structure includes parameters that affect advanced blend
operations.
The VkPipelineColorBlendAdvancedStateCreateInfoEXT
structure is
defined as:
typedef struct VkPipelineColorBlendAdvancedStateCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkBool32 srcPremultiplied;
VkBool32 dstPremultiplied;
VkBlendOverlapEXT blendOverlap;
} VkPipelineColorBlendAdvancedStateCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- srcPremultiplied specifies whether the source color of the blend operation is treated as premultiplied.
- dstPremultiplied specifies whether the destination color of the blend operation is treated as premultiplied.
- blendOverlap is a VkBlendOverlapEXT value specifying how the source and destination sample's coverage is correlated.
If this structure is not present, srcPremultiplied
and
dstPremultiplied
are both considered to be VK_TRUE
, and
blendOverlap
is considered to be
VK_BLEND_OVERLAP_UNCORRELATED_EXT
.
When using one of the operations in table f/X/Y/Z Advanced Blend Operations or Hue-Saturation-Luminosity Advanced Blend Operations, blending is performed according to the following equations:
\[ \begin{aligned}
R & = f(R_s',R_d') p_0(A_s,A_d) + Y R_s' p_1(A_s,A_d) + Z R_d' p_2(A_s,A_d) \\
G & = f(G_s',G_d') p_0(A_s,A_d) + Y G_s' p_1(A_s,A_d) + Z G_d' p_2(A_s,A_d) \\
B & = f(B_s',B_d') p_0(A_s,A_d) + Y B_s' p_1(A_s,A_d) + Z B_d' p_2(A_s,A_d) \\
A & = X p_0(A_s,A_d) + Y p_1(A_s,A_d) + Z p_2(A_s,A_d)
\end{aligned}\]
where the function f and terms X, Y, and Z are specified in the table.
The R, G, and B components of the source color used for blending are derived
according to srcPremultiplied
.
If srcPremultiplied
is set to VK_TRUE
, the fragment color
components are considered to have been premultiplied by the A component
prior to blending.
The base source color (Rs',Gs',Bs') is obtained by dividing through by the A component:
\[ (R_s',G_s',B_s') = (\frac{R_s}{A_s}, \frac{G_s}{A_s}, \frac{B_s}{A_s}) \]
If srcPremultiplied is VK_FALSE, the fragment color components are used as the base color:
\[ (R_s',G_s',B_s') = (R_s, G_s, B_s) \]
The R, G, and B components of the destination color used for blending are
derived according to dstPremultiplied
.
If dstPremultiplied
is set to VK_TRUE
, the destination
components are considered to have been premultiplied by the A component
prior to blending.
The base destination color (Rd',Gd',Bd') is obtained by dividing through by the A component:
\[ (R_d',G_d',B_d') = (\frac{R_d}{A_d}, \frac{G_d}{A_d}, \frac{B_d}{A_d}) \]
If dstPremultiplied is VK_FALSE, the destination color components are used as the base color:
\[ (R_d',G_d',B_d') = (R_d, G_d, B_d) \]
When blending using advanced blend operations, we expect that the R, G, and B components of premultiplied source and destination color inputs be stored as the product of non-premultiplied R, G, and B component values and the A component of the color. If any R, G, or B component of a premultiplied input color is non-zero and the A component is zero, the color is considered ill-formed, and the corresponding component of the blend result is undefined.
The weighting functions p0, p1, and p2 are defined in table Advanced Blend Overlap Modes. In these functions, the A components of the source and destination colors are taken to indicate the portion of the pixel covered by the fragment (source) and the fragments previously accumulated in the pixel (destination). The functions p0, p1, and p2 approximate the relative portion of the pixel covered by the intersection of the source and destination, covered only by the source, and covered only by the destination, respectively.
Possible values of
VkPipelineColorBlendAdvancedStateCreateInfoEXT::blendOverlap
,
specifying the blend overlap functions, are:
typedef enum VkBlendOverlapEXT {
VK_BLEND_OVERLAP_UNCORRELATED_EXT = 0,
VK_BLEND_OVERLAP_DISJOINT_EXT = 1,
VK_BLEND_OVERLAP_CONJOINT_EXT = 2,
} VkBlendOverlapEXT;
- VK_BLEND_OVERLAP_UNCORRELATED_EXT specifies that there is no correlation between the source and destination coverage.
- VK_BLEND_OVERLAP_CONJOINT_EXT specifies that the source and destination coverage are considered to have maximal overlap.
- VK_BLEND_OVERLAP_DISJOINT_EXT specifies that the source and destination coverage are considered to have minimal overlap.
Overlap Mode | Weighting Equations |
---|---|
VK_BLEND_OVERLAP_UNCORRELATED_EXT |
\[ \begin{aligned}
p_0(A_s,A_d) & = A_sA_d \\
p_1(A_s,A_d) & = A_s(1-A_d) \\
p_2(A_s,A_d) & = A_d(1-A_s) \\
\end{aligned}\]
|
VK_BLEND_OVERLAP_CONJOINT_EXT |
\[ \begin{aligned}
p_0(A_s,A_d) & = min(A_s,A_d) \\
p_1(A_s,A_d) & = max(A_s-A_d,0) \\
p_2(A_s,A_d) & = max(A_d-A_s,0) \\
\end{aligned}\]
|
VK_BLEND_OVERLAP_DISJOINT_EXT |
\[ \begin{aligned}
p_0(A_s,A_d) & = max(A_s+A_d-1,0) \\
p_1(A_s,A_d) & = min(A_s,1-A_d) \\
p_2(A_s,A_d) & = min(A_d,1-A_s) \\
\end{aligned}\]
|
Mode | Blend Coefficients |
---|---|
|
\[ \begin{aligned}
(X,Y,Z) & = (0,0,0) \\
f(C_s,C_d) & = 0
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,0) \\
f(C_s,C_d) & = C_s
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,0,1) \\
f(C_s,C_d) & = C_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = C_s
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = C_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,0,0) \\
f(C_s,C_d) & = C_s
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,0,0) \\
f(C_s,C_d) & = C_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (0,1,0) \\
f(C_s,C_d) & = 0
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (0,0,1) \\
f(C_s,C_d) & = 0
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,0,1) \\
f(C_s,C_d) & = C_s
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,0) \\
f(C_s,C_d) & = C_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (0,1,1) \\
f(C_s,C_d) & = 0
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = C_sC_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = C_s+C_d-C_sC_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
2 C_sC_d & C_d \leq 0.5 \\
1-2 (1-C_s)(1-C_d) & \text{otherwise}
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = min(C_s,C_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = max(C_s,C_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
0 & C_d \leq 0 \\
min(1,\frac{C_d}{1-C_s}) & C_d \gt 0 \text{ and } C_s \lt 1 \\
1 & C_d \gt 0 \text{ and } C_s \geq 1
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
1 & C_d \geq 1 \\
1 - min(1,\frac{1-C_d}{C_s}) & C_d \lt 1 \text{ and } C_s \gt 0 \\
0 & C_d \lt 1 \text{ and } C_s \leq 0
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
2 C_sC_d & C_s \leq 0.5 \\
1-2 (1-C_s)(1-C_d) & \text{otherwise}
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
C_d-(1-2 C_s)C_d(1-C_d) & C_s \leq 0.5 \\
C_d+(2 C_s-1)C_d((16 C_d-12)C_d+3) & C_s \gt 0.5 \text{ and } C_d \leq 0.25 \\
C_d+(2 C_s-1)(\sqrt{C_d}-C_d) & C_s \gt 0.5 \text{ and } C_d \gt 0.25
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = \lvert C_d-C_s \rvert
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = C_s+C_d-2C_sC_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,0,1) \\
f(C_s,C_d) & = 1-C_d
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,0,1) \\
f(C_s,C_d) & = C_s(1-C_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
C_s+C_d & C_s+C_d \leq 1 \\
1 & \text{otherwise}
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
C_s+C_d-1 & C_s+C_d \gt 1 \\
0 & \text{otherwise}
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
1-min(1,\frac{1-C_d}{2C_s}) & 0 \lt C_s \lt 0.5 \\
0 & C_s \leq 0 \\
min(1,\frac{C_d}{2(1-C_s)}) & 0.5 \leq C_s \lt 1 \\
1 & C_s \geq 1
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
1 & 2C_s+C_d \gt 2 \\
2C_s+C_d-1 & 1 \lt 2C_s+C_d \leq 2 \\
0 & 2C_s+C_d \leq 1
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
0 & 2C_s-1 \gt C_d \text{ and } C_s \lt 0.5 \\
2C_s-1 & 2C_s-1 \gt C_d \text{ and } C_s \geq 0.5 \\
2C_s & 2C_s-1 \leq C_d \text{ and } C_s \lt 0.5C_d \\
C_d & 2C_s-1 \leq C_d \text{ and } C_s \geq 0.5C_d
\end{cases}
\end{aligned}\]
|
|
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & =
\begin{cases}
0 & C_s+C_d \lt 1 \\
1 & \text{otherwise}
\end{cases}
\end{aligned}\]
|
When using one of the HSL blend operations in table Hue-Saturation-Luminosity Advanced Blend Operations as the blend operation, the RGB color components produced by the function f are effectively obtained by converting both the non-premultiplied source and destination colors to the HSL (hue, saturation, luminosity) color space, generating a new HSL color by selecting H, S, and L components from the source or destination according to the blend operation, and then converting the result back to RGB. In the equations below, a blended RGB color is produced according to the following pseudocode:
float minv3(vec3 c) {
return min(min(c.r, c.g), c.b);
}
float maxv3(vec3 c) {
return max(max(c.r, c.g), c.b);
}
float lumv3(vec3 c) {
return dot(c, vec3(0.30, 0.59, 0.11));
}
float satv3(vec3 c) {
return maxv3(c) - minv3(c);
}
// If any color components are outside [0,1], adjust the color to
// get the components in range.
vec3 ClipColor(vec3 color) {
float lum = lumv3(color);
float mincol = minv3(color);
float maxcol = maxv3(color);
if (mincol < 0.0) {
color = lum + ((color-lum)*lum) / (lum-mincol);
}
if (maxcol > 1.0) {
color = lum + ((color-lum)*lum) / (maxcol-lum);
}
return color;
}
// Take the base RGB color <cbase> and override its luminosity
// with that of the RGB color <clum>.
vec3 SetLum(vec3 cbase, vec3 clum) {
float lbase = lumv3(cbase);
float llum = lumv3(clum);
float ldiff = llum - lbase;
vec3 color = cbase + vec3(ldiff);
return ClipColor(color);
}
// Take the base RGB color <cbase> and override its saturation with
// that of the RGB color <csat>. Then override the luminosity of the
// result with that of the RGB color <clum>.
vec3 SetLumSat(vec3 cbase, vec3 csat, vec3 clum)
{
float minbase = minv3(cbase);
float sbase = satv3(cbase);
float ssat = satv3(csat);
vec3 color;
if (sbase > 0) {
// Equivalent (modulo rounding errors) to setting the
// smallest (R,G,B) component to 0, the largest to <ssat>,
// and interpolating the "middle" component based on its
// original value relative to the smallest/largest.
color = (cbase - minbase) * ssat / sbase;
} else {
color = vec3(0.0);
}
return SetLum(color, clum);
}
Mode | Result |
---|---|
VK_BLEND_OP_HSL_HUE_EXT |
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = SetLumSat(C_s,C_d,C_d)
\end{aligned}\]
|
VK_BLEND_OP_HSL_SATURATION_EXT |
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = SetLumSat(C_d,C_s,C_d)
\end{aligned}\]
|
VK_BLEND_OP_HSL_COLOR_EXT |
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = SetLum(C_s,C_d)
\end{aligned}\]
|
VK_BLEND_OP_HSL_LUMINOSITY_EXT |
\[ \begin{aligned}
(X,Y,Z) & = (1,1,1) \\
f(C_s,C_d) & = SetLum(C_d,C_s)
\end{aligned}\]
|
When using one of the operations in table
Additional RGB Blend
Operations as the blend operation, the source and destination colors used
by these blending operations are interpreted according to
srcPremultiplied
and dstPremultiplied
.
The blending operations below are evaluated where the RGB source and
destination color components are both considered to have been premultiplied
by the corresponding A component.
Mode | Result |
---|---|
|
\[ \begin{aligned}
(R,G,B,A) = ( & R_s'+R_d', \\
& G_s'+G_d', \\
& B_s'+B_d', \\
& A_s+A_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) = ( & min(1,R_s'+R_d'), \\
& min(1,G_s'+G_d'), \\
& min(1,B_s'+B_d'), \\
& min(1,A_s+A_d))
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) = ( & min(min(1,A_s+A_d),R_s'+R_d'), \\
& min(min(1,A_s+A_d),G_s'+G_d'), \\
& min(min(1,A_s+A_d),B_s'+B_d'), \\
& min(1,A_s+A_d))
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) = ( & max(0,min(1,A_s+A_d)-((A_s-R_s')+(A_d-R_d'))), \\
& max(0,min(1,A_s+A_d)-((A_s-G_s')+(A_d-G_d'))), \\
& max(0,min(1,A_s+A_d)-((A_s-B_s')+(A_d-B_d'))), \\
& min(1,A_s+A_d))
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) = ( & R_d'-R_s', \\
& G_d'-G_s', \\
& B_d'-B_s', \\
& A_d-A_s)
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) = ( & max(0,R_d'-R_s'), \\
& max(0,G_d'-G_s'), \\
& max(0,B_d'-B_s'), \\
& max(0,A_d-A_s))
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) = ( & \frac{A_d}{2} + 2(R_d'-\frac{A_d}{2})(R_s'-\frac{A_s}{2}), \\
& \frac{A_d}{2} + 2(G_d'-\frac{A_d}{2})(G_s'-\frac{A_s}{2}), \\
& \frac{A_d}{2} + 2(B_d'-\frac{A_d}{2})(B_s'-\frac{A_s}{2}), \\
& A_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) = ( & A_s(1-R_d') + (1-A_s)R_d', \\
& A_s(1-G_d') + (1-A_s)G_d', \\
& A_s(1-B_d') + (1-A_s)B_d', \\
& A_s+A_d-A_sA_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) & = (R_s', G_d', B_d', A_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) & = (R_d', G_s', B_d', A_d)
\end{aligned}\]
|
|
\[ \begin{aligned}
(R,G,B,A) & = (R_d', G_d', B_s', A_d)
\end{aligned}\]
|
28.2. Logical Operations
The application can enable a logical operation between the fragment’s color values and the existing value in the framebuffer attachment. This logical operation is applied prior to updating the framebuffer attachment. Logical operations are applied only for signed and unsigned integer and normalized integer framebuffers. Logical operations are not applied to floating-point or sRGB format color attachments.
Logical operations are controlled by the logicOpEnable
and
logicOp
members of VkPipelineColorBlendStateCreateInfo.
If logicOpEnable
is VK_TRUE
, then a logical operation selected
by logicOp
is applied between each color attachment and the fragment’s
corresponding output value, and blending of all attachments is treated as if
it were disabled.
Any attachments using color formats for which logical operations are not
supported simply pass through the color values unmodified.
The logical operation is applied independently for each of the red, green,
blue, and alpha components.
The logicOp
is selected from the following operations:
typedef enum VkLogicOp {
VK_LOGIC_OP_CLEAR = 0,
VK_LOGIC_OP_AND = 1,
VK_LOGIC_OP_AND_REVERSE = 2,
VK_LOGIC_OP_COPY = 3,
VK_LOGIC_OP_AND_INVERTED = 4,
VK_LOGIC_OP_NO_OP = 5,
VK_LOGIC_OP_XOR = 6,
VK_LOGIC_OP_OR = 7,
VK_LOGIC_OP_NOR = 8,
VK_LOGIC_OP_EQUIVALENT = 9,
VK_LOGIC_OP_INVERT = 10,
VK_LOGIC_OP_OR_REVERSE = 11,
VK_LOGIC_OP_COPY_INVERTED = 12,
VK_LOGIC_OP_OR_INVERTED = 13,
VK_LOGIC_OP_NAND = 14,
VK_LOGIC_OP_SET = 15,
} VkLogicOp;
The logical operations supported by Vulkan are summarized in the following table in which
- ¬ is bitwise invert,
- ∧ is bitwise and,
- ∨ is bitwise or,
- ⊕ is bitwise exclusive or,
- s is the fragment's Rs0, Gs0, Bs0 or As0 component value for the fragment output corresponding to the color attachment being updated, and
- d is the color attachment's R, G, B or A component value:
Mode | Operation |
---|---|
VK_LOGIC_OP_CLEAR | 0 |
VK_LOGIC_OP_AND | s ∧ d |
VK_LOGIC_OP_AND_REVERSE | s ∧ ¬ d |
VK_LOGIC_OP_COPY | s |
VK_LOGIC_OP_AND_INVERTED | ¬ s ∧ d |
VK_LOGIC_OP_NO_OP | d |
VK_LOGIC_OP_XOR | s ⊕ d |
VK_LOGIC_OP_OR | s ∨ d |
VK_LOGIC_OP_NOR | ¬ (s ∨ d) |
VK_LOGIC_OP_EQUIVALENT | ¬ (s ⊕ d) |
VK_LOGIC_OP_INVERT | ¬ d |
VK_LOGIC_OP_OR_REVERSE | s ∨ ¬ d |
VK_LOGIC_OP_COPY_INVERTED | ¬ s |
VK_LOGIC_OP_OR_INVERTED | ¬ s ∨ d |
VK_LOGIC_OP_NAND | ¬ (s ∧ d) |
VK_LOGIC_OP_SET | all 1s |
The result of the logical operation is then written to the color attachment as controlled by the component write mask, described in Blend Operations.
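As a non-normative example, the following color blend state applies a bitwise XOR between the fragment output and an integer color attachment; it assumes the logicOp feature is supported, and the variable names are illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: when logicOpEnable is VK_TRUE, blending is treated as
 * disabled, so the attachment state only supplies the write mask. */
static const VkPipelineColorBlendAttachmentState xorAttachment = {
    .blendEnable    = VK_FALSE,
    .colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                      VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT,
};

static const VkPipelineColorBlendStateCreateInfo xorBlendState = {
    .sType           = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
    .logicOpEnable   = VK_TRUE,
    .logicOp         = VK_LOGIC_OP_XOR,   /* s ^ d, applied per component */
    .attachmentCount = 1,
    .pAttachments    = &xorAttachment,
};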
28.3. Color Write Mask
Bits which can be set in
VkPipelineColorBlendAttachmentState::colorWriteMask
to determine
whether the final color values R, G, B and A are written to the
framebuffer attachment are:
typedef enum VkColorComponentFlagBits {
VK_COLOR_COMPONENT_R_BIT = 0x00000001,
VK_COLOR_COMPONENT_G_BIT = 0x00000002,
VK_COLOR_COMPONENT_B_BIT = 0x00000004,
VK_COLOR_COMPONENT_A_BIT = 0x00000008,
} VkColorComponentFlagBits;
- VK_COLOR_COMPONENT_R_BIT specifies that the R value is written to the color attachment for the appropriate sample. Otherwise, the value in memory is unmodified.
- VK_COLOR_COMPONENT_G_BIT specifies that the G value is written to the color attachment for the appropriate sample. Otherwise, the value in memory is unmodified.
- VK_COLOR_COMPONENT_B_BIT specifies that the B value is written to the color attachment for the appropriate sample. Otherwise, the value in memory is unmodified.
- VK_COLOR_COMPONENT_A_BIT specifies that the A value is written to the color attachment for the appropriate sample. Otherwise, the value in memory is unmodified.
The color write mask operation is applied regardless of whether blending is enabled.
typedef VkFlags VkColorComponentFlags;
VkColorComponentFlags
is a bitmask type for setting a mask of zero or
more VkColorComponentFlagBits.
29. Dispatching Commands
Dispatching commands (commands with Dispatch
in the name) provoke
work in a compute pipeline.
Dispatching commands are recorded into a command buffer and when executed by
a queue, will produce work which executes according to the bound compute
pipeline.
A compute pipeline must be bound to a command buffer before any dispatch
commands are recorded in that command buffer.
To record a dispatch, call:
void vkCmdDispatch(
VkCommandBuffer commandBuffer,
uint32_t groupCountX,
uint32_t groupCountY,
uint32_t groupCountZ);
- commandBuffer is the command buffer into which the command will be recorded.
- groupCountX is the number of local workgroups to dispatch in the X dimension.
- groupCountY is the number of local workgroups to dispatch in the Y dimension.
- groupCountZ is the number of local workgroups to dispatch in the Z dimension.
When the command is executed, a global workgroup consisting of groupCountX × groupCountY × groupCountZ local workgroups is assembled.
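For example (non-normative), an application dispatching a compute shader with a 16×16 local workgroup size over an image typically rounds the group counts up so that every pixel is covered; the helper name is illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: cover a width x height image with 16x16 workgroups,
 * rounding up so partially covered tiles are still dispatched. */
static void dispatchForImage(VkCommandBuffer commandBuffer, uint32_t width, uint32_t height)
{
    const uint32_t localSizeX = 16;  /* must match the shader's local_size_x */
    const uint32_t localSizeY = 16;  /* must match the shader's local_size_y */
    vkCmdDispatch(commandBuffer,
                  (width  + localSizeX - 1) / localSizeX,
                  (height + localSizeY - 1) / localSizeY,
                  1);
}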
To record an indirect command dispatch, call:
void vkCmdDispatchIndirect(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
VkDeviceSize offset);
- commandBuffer is the command buffer into which the command will be recorded.
- buffer is the buffer containing dispatch parameters.
- offset is the byte offset into buffer where parameters begin.
vkCmdDispatchIndirect
behaves similarly to vkCmdDispatch except
that the parameters are read by the device from a buffer during execution.
The parameters of the dispatch are encoded in a
VkDispatchIndirectCommand structure taken from buffer
starting
at offset
.
The VkDispatchIndirectCommand
structure is defined as:
typedef struct VkDispatchIndirectCommand {
uint32_t x;
uint32_t y;
uint32_t z;
} VkDispatchIndirectCommand;
- x is the number of local workgroups to dispatch in the X dimension.
- y is the number of local workgroups to dispatch in the Y dimension.
- z is the number of local workgroups to dispatch in the Z dimension.
The members of VkDispatchIndirectCommand
have the same meaning as the
corresponding parameters of vkCmdDispatch.
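A non-normative sketch of the indirect path follows; it assumes buffer was created with VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, that pMapped points at its host-visible memory at offset, and that the helper name is illustrative.

#include <string.h>
#include <vulkan/vulkan.h>

/* Informative sketch: write the dispatch parameters into a mapped indirect
 * buffer, then record a dispatch that reads them at execution time. */
static void recordIndirectDispatch(VkCommandBuffer commandBuffer,
                                   VkBuffer buffer, VkDeviceSize offset, void *pMapped)
{
    const VkDispatchIndirectCommand cmd = { .x = 64, .y = 64, .z = 1 };
    memcpy(pMapped, &cmd, sizeof(cmd));
    vkCmdDispatchIndirect(commandBuffer, buffer, offset);
}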
To record a dispatch using non-zero base values for the components of
WorkgroupId
, call:
void vkCmdDispatchBase(
VkCommandBuffer commandBuffer,
uint32_t baseGroupX,
uint32_t baseGroupY,
uint32_t baseGroupZ,
uint32_t groupCountX,
uint32_t groupCountY,
uint32_t groupCountZ);
or the equivalent command
void vkCmdDispatchBaseKHR(
VkCommandBuffer commandBuffer,
uint32_t baseGroupX,
uint32_t baseGroupY,
uint32_t baseGroupZ,
uint32_t groupCountX,
uint32_t groupCountY,
uint32_t groupCountZ);
- commandBuffer is the command buffer into which the command will be recorded.
- baseGroupX is the start value for the X component of WorkgroupId.
- baseGroupY is the start value for the Y component of WorkgroupId.
- baseGroupZ is the start value for the Z component of WorkgroupId.
- groupCountX is the number of local workgroups to dispatch in the X dimension.
- groupCountY is the number of local workgroups to dispatch in the Y dimension.
- groupCountZ is the number of local workgroups to dispatch in the Z dimension.
When the command is executed, a global workgroup consisting of
groupCountX × groupCountY × groupCountZ local workgroups
is assembled, with WorkgroupId
values ranging from [baseGroup,
baseGroup + groupCount) in each component.
vkCmdDispatch is equivalent to
vkCmdDispatchBase(0,0,0,groupCountX,groupCountY,groupCountZ)
.
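As a non-normative illustration, a 64×64-workgroup dispatch could be split into four quadrants whose WorkgroupId values continue from non-zero bases; this assumes the bound compute pipeline was created with the VK_PIPELINE_CREATE_DISPATCH_BASE flag, and the helper name is illustrative.

#include <vulkan/vulkan.h>

/* Informative sketch: each quadrant dispatches 32x32x1 workgroups with
 * WorkgroupId ranging over [base, base + 32) in X and Y. */
static void dispatchQuadrants(VkCommandBuffer commandBuffer)
{
    const uint32_t half = 32;
    vkCmdDispatchBase(commandBuffer, 0,    0,    0, half, half, 1);
    vkCmdDispatchBase(commandBuffer, half, 0,    0, half, half, 1);
    vkCmdDispatchBase(commandBuffer, 0,    half, 0, half, half, 1);
    vkCmdDispatchBase(commandBuffer, half, half, 0, half, half, 1);
}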
30. Device-Generated Commands
This chapter discusses the generation of command buffer content on the device. The following principal steps are taken to generate commands on the device:
- Make resource bindings accessible for the device by registering them in a VkObjectTableNVX.
- Define via VkIndirectCommandsLayoutNVX the sequence of commands which should be generated.
- Fill one or more VkBuffer objects with the appropriate content that gets interpreted by VkIndirectCommandsLayoutNVX.
- Reserve command space via vkCmdReserveSpaceForCommandsNVX in a secondary VkCommandBuffer where the generated commands should be recorded.
- Generate the actual commands via vkCmdProcessCommandsNVX, passing all required data.
Execution of such generated commands can either be triggered directly with
the generation process, or by executing the secondary VkCommandBuffer
that was chosen as optional target.
The latter allows re-using generated commands as well.
Similar to VkDescriptorSet
, special care should be taken for the
lifetime of resources referenced in VkObjectTableNVX
, which may be
accessed at either generation or execution time.
vkCmdProcessCommandsNVX executes in a separate logical pipeline from either graphics or compute. When generating commands into a secondary command buffer, the command generation must be explicitly synchronized against the secondary command buffer’s execution. When not using a secondary command buffer, the command generation is automatically synchronized against the command execution.
30.1. Features and Limitations
To query the support of related features and limitations, call:
void vkGetPhysicalDeviceGeneratedCommandsPropertiesNVX(
VkPhysicalDevice physicalDevice,
VkDeviceGeneratedCommandsFeaturesNVX* pFeatures,
VkDeviceGeneratedCommandsLimitsNVX* pLimits);
- physicalDevice is the handle to the physical device whose properties will be queried.
- pFeatures points to an instance of the VkDeviceGeneratedCommandsFeaturesNVX structure that will be filled with the returned information.
- pLimits points to an instance of the VkDeviceGeneratedCommandsLimitsNVX structure that will be filled with the returned information.
The VkDeviceGeneratedCommandsFeaturesNVX
structure is defined as:
typedef struct VkDeviceGeneratedCommandsFeaturesNVX {
VkStructureType sType;
const void* pNext;
VkBool32 computeBindingPointSupport;
} VkDeviceGeneratedCommandsFeaturesNVX;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- computeBindingPointSupport specifies whether the VkObjectTableNVX supports entries with the VK_OBJECT_ENTRY_USAGE_GRAPHICS_BIT_NVX bit set and VkIndirectCommandsLayoutNVX supports VK_PIPELINE_BIND_POINT_COMPUTE.
The VkDeviceGeneratedCommandsLimitsNVX
structure is defined as:
typedef struct VkDeviceGeneratedCommandsLimitsNVX {
VkStructureType sType;
const void* pNext;
uint32_t maxIndirectCommandsLayoutTokenCount;
uint32_t maxObjectEntryCounts;
uint32_t minSequenceCountBufferOffsetAlignment;
uint32_t minSequenceIndexBufferOffsetAlignment;
uint32_t minCommandsTokenBufferOffsetAlignment;
} VkDeviceGeneratedCommandsLimitsNVX;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- maxIndirectCommandsLayoutTokenCount is the maximum number of tokens in VkIndirectCommandsLayoutNVX.
- maxObjectEntryCounts is the maximum number of entries per resource type in VkObjectTableNVX.
- minSequenceCountBufferOffsetAlignment is the minimum alignment for memory addresses optionally used in vkCmdProcessCommandsNVX.
- minSequenceIndexBufferOffsetAlignment is the minimum alignment for memory addresses optionally used in vkCmdProcessCommandsNVX.
- minCommandsTokenBufferOffsetAlignment is the minimum alignment for memory addresses optionally used in vkCmdProcessCommandsNVX.
30.2. Binding Object Table
The device-side bindings are registered inside a table:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkObjectTableNVX)
This is required as the CPU-side object pointers, for example when binding a
VkPipeline
or VkDescriptorSet
, cannot be used by the device.
The combination of VkObjectTableNVX
and uint32_t
table indices
stored inside a VkBuffer
serve that purpose during device command
generation.
At creation time the table is defined with a fixed amount of registration
slots for the individual resource types.
A detailed resource binding can then later be registered via
vkRegisterObjectsNVX at any uint32_t
index below the allocated
maximum.
30.2.1. Table Creation
To create object tables, call:
VkResult vkCreateObjectTableNVX(
VkDevice device,
const VkObjectTableCreateInfoNVX* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkObjectTableNVX* pObjectTable);
- device is the logical device that creates the object table.
- pCreateInfo is a pointer to an instance of the VkObjectTableCreateInfoNVX structure containing parameters affecting creation of the table.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pObjectTable points to a VkObjectTableNVX handle in which the resulting object table is returned.
The VkObjectTableCreateInfoNVX
structure is defined as:
typedef struct VkObjectTableCreateInfoNVX {
VkStructureType sType;
const void* pNext;
uint32_t objectCount;
const VkObjectEntryTypeNVX* pObjectEntryTypes;
const uint32_t* pObjectEntryCounts;
const VkObjectEntryUsageFlagsNVX* pObjectEntryUsageFlags;
uint32_t maxUniformBuffersPerDescriptor;
uint32_t maxStorageBuffersPerDescriptor;
uint32_t maxStorageImagesPerDescriptor;
uint32_t maxSampledImagesPerDescriptor;
uint32_t maxPipelineLayouts;
} VkObjectTableCreateInfoNVX;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- objectCount is the number of entry configurations that the object table supports.
- pObjectEntryTypes is an array of VkObjectEntryTypeNVX values providing the entry type of a given configuration.
- pObjectEntryCounts is an array of counts of how many objects can be registered in the table.
- pObjectEntryUsageFlags is an array of bitmasks of VkObjectEntryUsageFlagBitsNVX specifying the binding usage of the entry.
- maxUniformBuffersPerDescriptor is the maximum number of VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC used by any single registered VkDescriptorSet in this table.
- maxStorageBuffersPerDescriptor is the maximum number of VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC used by any single registered VkDescriptorSet in this table.
- maxStorageImagesPerDescriptor is the maximum number of VK_DESCRIPTOR_TYPE_STORAGE_IMAGE or VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER used by any single registered VkDescriptorSet in this table.
- maxSampledImagesPerDescriptor is the maximum number of VK_DESCRIPTOR_TYPE_SAMPLER, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT used by any single registered VkDescriptorSet in this table.
- maxPipelineLayouts is the maximum number of unique VkPipelineLayout used by any registered VkDescriptorSet or VkPipeline in this table.
Possible values of elements of the
VkObjectTableCreateInfoNVX::pObjectEntryTypes
array, specifying
the entry type of a configuration, are:
typedef enum VkObjectEntryTypeNVX {
VK_OBJECT_ENTRY_TYPE_DESCRIPTOR_SET_NVX = 0,
VK_OBJECT_ENTRY_TYPE_PIPELINE_NVX = 1,
VK_OBJECT_ENTRY_TYPE_INDEX_BUFFER_NVX = 2,
VK_OBJECT_ENTRY_TYPE_VERTEX_BUFFER_NVX = 3,
VK_OBJECT_ENTRY_TYPE_PUSH_CONSTANT_NVX = 4,
} VkObjectEntryTypeNVX;
- VK_OBJECT_ENTRY_TYPE_DESCRIPTOR_SET_NVX specifies a VkDescriptorSet resource entry that is registered via VkObjectTableDescriptorSetEntryNVX.
- VK_OBJECT_ENTRY_TYPE_PIPELINE_NVX specifies a VkPipeline resource entry that is registered via VkObjectTablePipelineEntryNVX.
- VK_OBJECT_ENTRY_TYPE_INDEX_BUFFER_NVX specifies a VkBuffer resource entry that is registered via VkObjectTableIndexBufferEntryNVX.
- VK_OBJECT_ENTRY_TYPE_VERTEX_BUFFER_NVX specifies a VkBuffer resource entry that is registered via VkObjectTableVertexBufferEntryNVX.
- VK_OBJECT_ENTRY_TYPE_PUSH_CONSTANT_NVX specifies a push constant resource entry that is registered via VkObjectTablePushConstantEntryNVX.
Bits which can be set in elements of the
VkObjectTableCreateInfoNVX::pObjectEntryUsageFlags
array,
specifying binding usage of an entry, are:
typedef enum VkObjectEntryUsageFlagBitsNVX {
VK_OBJECT_ENTRY_USAGE_GRAPHICS_BIT_NVX = 0x00000001,
VK_OBJECT_ENTRY_USAGE_COMPUTE_BIT_NVX = 0x00000002,
} VkObjectEntryUsageFlagBitsNVX;
- VK_OBJECT_ENTRY_USAGE_GRAPHICS_BIT_NVX specifies that the resource is bound to VK_PIPELINE_BIND_POINT_GRAPHICS.
- VK_OBJECT_ENTRY_USAGE_COMPUTE_BIT_NVX specifies that the resource is bound to VK_PIPELINE_BIND_POINT_COMPUTE.
typedef VkFlags VkObjectEntryUsageFlagsNVX;
VkObjectEntryUsageFlagsNVX
is a bitmask type for setting a mask of
zero or more VkObjectEntryUsageFlagBitsNVX.
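For illustration only, the following sketch shows how an application might create an object table holding pipelines and descriptor sets. The entry counts, per-descriptor limits, and the assumption that device is a VkDevice with this extension enabled are illustrative, not normative:
/* Hypothetical values; adjust the counts and limits to the application's needs. */
VkObjectEntryTypeNVX entryTypes[2] = {
    VK_OBJECT_ENTRY_TYPE_PIPELINE_NVX,
    VK_OBJECT_ENTRY_TYPE_DESCRIPTOR_SET_NVX,
};
uint32_t entryCounts[2] = { 64, 256 };   /* assumed table capacities */
VkObjectEntryUsageFlagsNVX entryUsage[2] = {
    VK_OBJECT_ENTRY_USAGE_GRAPHICS_BIT_NVX,
    VK_OBJECT_ENTRY_USAGE_GRAPHICS_BIT_NVX,
};

VkObjectTableCreateInfoNVX tableInfo = {
    .sType = VK_STRUCTURE_TYPE_OBJECT_TABLE_CREATE_INFO_NVX,
    .pNext = NULL,
    .objectCount = 2,
    .pObjectEntryTypes = entryTypes,
    .pObjectEntryCounts = entryCounts,
    .pObjectEntryUsageFlags = entryUsage,
    .maxUniformBuffersPerDescriptor = 1,
    .maxStorageBuffersPerDescriptor = 1,
    .maxStorageImagesPerDescriptor = 0,
    .maxSampledImagesPerDescriptor = 4,
    .maxPipelineLayouts = 1,
};

VkObjectTableNVX objectTable;
VkResult result = vkCreateObjectTableNVX(device, &tableInfo, NULL, &objectTable);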
To destroy an object table, call:
void vkDestroyObjectTableNVX(
VkDevice device,
VkObjectTableNVX objectTable,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the table.
- objectTable is the table to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
30.2.2. Registering Objects
Resource bindings of Vulkan objects are registered at an arbitrary
uint32_t
index within an object table.
As long as the object table references such objects, they must not be
deleted.
VkResult vkRegisterObjectsNVX(
VkDevice device,
VkObjectTableNVX objectTable,
uint32_t objectCount,
const VkObjectTableEntryNVX* const* ppObjectTableEntries,
const uint32_t* pObjectIndices);
- device is the logical device that creates the object table.
- objectTable is the table for which the resources are registered.
- objectCount is the number of resources to register.
- ppObjectTableEntries provides an array of detailed binding information; each array element is a pointer to a struct of type VkObjectTablePipelineEntryNVX, VkObjectTableDescriptorSetEntryNVX, VkObjectTableVertexBufferEntryNVX, VkObjectTableIndexBufferEntryNVX or VkObjectTablePushConstantEntryNVX (see below for details).
- pObjectIndices are the indices at which each resource is registered.
Common to all resource entries are:
typedef struct VkObjectTableEntryNVX {
VkObjectEntryTypeNVX type;
VkObjectEntryUsageFlagsNVX flags;
} VkObjectTableEntryNVX;
- type defines the entry type.
- flags defines which VkPipelineBindPoint the resource can be used with. Some entry types allow only a single flag to be set.
typedef struct VkObjectTablePipelineEntryNVX {
VkObjectEntryTypeNVX type;
VkObjectEntryUsageFlagsNVX flags;
VkPipeline pipeline;
} VkObjectTablePipelineEntryNVX;
- pipeline specifies the VkPipeline that this resource entry references.
typedef struct VkObjectTableDescriptorSetEntryNVX {
VkObjectEntryTypeNVX type;
VkObjectEntryUsageFlagsNVX flags;
VkPipelineLayout pipelineLayout;
VkDescriptorSet descriptorSet;
} VkObjectTableDescriptorSetEntryNVX;
- pipelineLayout specifies the VkPipelineLayout that the descriptorSet is used with.
- descriptorSet specifies the VkDescriptorSet that can be bound with this entry.
typedef struct VkObjectTableVertexBufferEntryNVX {
VkObjectEntryTypeNVX type;
VkObjectEntryUsageFlagsNVX flags;
VkBuffer buffer;
} VkObjectTableVertexBufferEntryNVX;
- buffer specifies the VkBuffer that can be bound as a vertex buffer.
typedef struct VkObjectTableIndexBufferEntryNVX {
VkObjectEntryTypeNVX type;
VkObjectEntryUsageFlagsNVX flags;
VkBuffer buffer;
VkIndexType indexType;
} VkObjectTableIndexBufferEntryNVX;
- buffer specifies the VkBuffer that can be bound as an index buffer.
- indexType specifies the VkIndexType used with this index buffer.
typedef struct VkObjectTablePushConstantEntryNVX {
VkObjectEntryTypeNVX type;
VkObjectEntryUsageFlagsNVX flags;
VkPipelineLayout pipelineLayout;
VkShaderStageFlags stageFlags;
} VkObjectTablePushConstantEntryNVX;
- pipelineLayout specifies the VkPipelineLayout that the push constants are used with.
- stageFlags specifies the VkShaderStageFlags that the push constants are used with.
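As an illustrative, non-normative sketch, registering a previously created graphics pipeline at table index 0 could look as follows; pipeline, objectTable, and device are assumed to be valid handles created earlier:
/* Register a single pipeline entry. The chosen index 0 must be below the
 * count allocated for pipelines at object table creation time. */
VkObjectTablePipelineEntryNVX pipelineEntry = {
    .type = VK_OBJECT_ENTRY_TYPE_PIPELINE_NVX,
    .flags = VK_OBJECT_ENTRY_USAGE_GRAPHICS_BIT_NVX,
    .pipeline = pipeline,
};

const VkObjectTableEntryNVX* entries[1] = {
    (const VkObjectTableEntryNVX*)&pipelineEntry,
};
uint32_t indices[1] = { 0 };

VkResult result = vkRegisterObjectsNVX(device, objectTable, 1, entries, indices);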
Use the following command to unregister resources from an object table:
VkResult vkUnregisterObjectsNVX(
VkDevice device,
VkObjectTableNVX objectTable,
uint32_t objectCount,
const VkObjectEntryTypeNVX* pObjectEntryTypes,
const uint32_t* pObjectIndices);
- device is the logical device that creates the object table.
- objectTable is the table from which the resources are unregistered.
- objectCount is the number of resources being removed from the object table.
- pObjectEntryTypes provides an array of VkObjectEntryTypeNVX for the resources being removed.
- pObjectIndices provides the array of object indices to be removed.
30.3. Indirect Commands Layout
The device-side command generation happens through iterative processing of an atomic sequence composed of command tokens, which are represented by:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkIndirectCommandsLayoutNVX)
30.3.1. Tokenized Command Processing
The processing is in principle illustrated below:
void cmdProcessSequence(cmd, objectTable, indirectCommandsLayout, pIndirectCommandsTokens, s)
{
for (c = 0; c < indirectCommandsLayout.tokenCount; c++)
{
indirectCommandsLayout.pTokens[c].command (cmd, objectTable, pIndirectCommandsTokens[c], s);
}
}
void cmdProcessAllSequences(cmd, objectTable, indirectCommandsLayout, pIndirectCommandsTokens, sequencesCount)
{
for (s = 0; s < sequencesCount; s++)
{
cmdProcessSequence(cmd, objectTable, indirectCommandsLayout, pIndirectCommandsTokens, s);
}
}
The processing of each sequence is considered stateless; therefore, all state changes must occur prior to the work-provoking commands within the sequence.
A single sequence strictly targets either VK_PIPELINE_BIND_POINT_GRAPHICS or VK_PIPELINE_BIND_POINT_COMPUTE.
The primary input data for each token is provided through VkBuffer content at command generation time using vkCmdProcessCommandsNVX; however, some functional arguments, for example binding sets, are specified at layout creation time.
The input size is different for each token.
Possible values of those elements of the
VkIndirectCommandsLayoutCreateInfoNVX::pTokens
array which
specify command tokens (other elements of the array specify command
parameters) are:
typedef enum VkIndirectCommandsTokenTypeNVX {
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PIPELINE_NVX = 0,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DESCRIPTOR_SET_NVX = 1,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NVX = 2,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NVX = 3,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NVX = 4,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NVX = 5,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NVX = 6,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_NVX = 7,
} VkIndirectCommandsTokenTypeNVX;
Token type | Equivalent command
---|---
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PIPELINE_NVX | vkCmdBindPipeline
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DESCRIPTOR_SET_NVX | vkCmdBindDescriptorSets
VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NVX | vkCmdBindIndexBuffer
VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NVX | vkCmdBindVertexBuffers
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NVX | vkCmdPushConstants
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NVX | vkCmdDrawIndexedIndirect
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NVX | vkCmdDrawIndirect
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_NVX | vkCmdDispatchIndirect
The VkIndirectCommandsLayoutTokenNVX structure specifies details of the function arguments that need to be known at layout creation time:
typedef struct VkIndirectCommandsLayoutTokenNVX {
VkIndirectCommandsTokenTypeNVX tokenType;
uint32_t bindingUnit;
uint32_t dynamicCount;
uint32_t divisor;
} VkIndirectCommandsLayoutTokenNVX;
- tokenType specifies the token command type.
- bindingUnit has a different meaning depending on the token type; refer to the pseudocode further down for details.
- dynamicCount has a different meaning depending on the token type; refer to the pseudocode further down for details.
- divisor defines the rate at which the input data buffers are accessed.
The VkIndirectCommandsTokenNVX
structure specifies the input data for
a token at processing time.
typedef struct VkIndirectCommandsTokenNVX {
VkIndirectCommandsTokenTypeNVX tokenType;
VkBuffer buffer;
VkDeviceSize offset;
} VkIndirectCommandsTokenNVX;
- tokenType specifies the token command type.
- buffer specifies the VkBuffer storing the functional arguments for each sequence. These arguments can be written by the device.
- offset specifies an offset into buffer where the arguments start.
The following code provides detailed information on how an individual sequence is processed:
void cmdProcessSequence(cmd, objectTable, indirectCommandsLayout, pIndirectCommandsTokens, s)
{
for (uint32_t c = 0; c < indirectCommandsLayout.tokenCount; c++){
input = pIndirectCommandsTokens[c];
i = s / indirectCommandsLayout.pTokens[c].divisor;
switch(input.type){
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PIPELINE_NVX:
size_t stride = sizeof(uint32_t);
uint32_t* data = input.buffer.pointer( input.offset + stride * i );
uint32_t object = data[0];
vkCmdBindPipeline(cmd, indirectCommandsLayout.pipelineBindPoint,
objectTable.pipelines[ object ].pipeline);
break;
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DESCRIPTOR_SET_NVX:
size_t stride = sizeof(uint32_t) + sizeof(uint32_t) * indirectCommandsLayout.pTokens[c].dynamicCount;
uint32_t* data = input.buffer.pointer( input.offset + stride * i);
uint32_t object = data[0];
vkCmdBindDescriptorSets(cmd, indirectCommandsLayout.pipelineBindPoint,
objectTable.descriptorsets[ object ].layout,
indirectCommandsLayout.pTokens[ c ].bindingUnit,
1, &objectTable.descriptorsets[ object ].descriptorSet,
indirectCommandsLayout.pTokens[ c ].dynamicCount, data + 1);
break;
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NVX:
size_t stride = sizeof(uint32_t) + indirectCommandsLayout.pTokens[c].dynamicCount;
uint32_t* data = input.buffer.pointer( input.offset + stride * i );
uint32_t object = data[0];
vkCmdPushConstants(cmd,
objectTable.pushconstants[ object ].layout,
objectTable.pushconstants[ object ].stageFlags,
indirectCommandsLayout.pTokens[ c ].bindingUnit, indirectCommandsLayout.pTokens[c].dynamicCount, data + 1);
break;
VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NVX:
size_t stride = sizeof(uint32_t) + sizeof(uint32_t) * indirectCommandsLayout.pTokens[c].dynamicCount;
uint32_t* data = input.buffer.pointer( input.offset + stride * i );
uint32_t object = data[0];
vkCmdBindIndexBuffer(cmd,
objectTable.indexbuffers[ object ].buffer,
indirectCommandsLayout.pTokens[ c ].dynamicCount ? data[1] : 0,
objectTable.indexbuffers[ object ].indexType);
break;
VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NVX:
size_t stride = sizeof(uint32_t) + sizeof(uint32_t) * indirectCommandsLayout.pTokens[c].dynamicCount;
uint32_t* data = input.buffer.pointer( input.offset + stride * i );
uint32_t object = data[0];
vkCmdBindVertexBuffers(cmd,
indirectCommandsLayout.pTokens[ c ].bindingUnit, 1,
&objectTable.vertexbuffers[ object ].buffer,
indirectCommandsLayout.pTokens[ c ].dynamicCount ? data + 1 : {0}); // device size handled as uint32_t
break;
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NVX:
vkCmdDrawIndexedIndirect(cmd,
input.buffer,
sizeof(VkDrawIndexedIndirectCommand) * i + input.offset, 1, 0);
break;
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NVX:
vkCmdDrawIndirect(cmd,
input.buffer,
sizeof(VkDrawIndirectCommand) * i + input.offset, 1, 0);
break;
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_NVX:
vkCmdDispatchIndirect(cmd,
input.buffer,
sizeof(VkDispatchIndirectCommand) * i + input.offset);
break;
}
}
}
30.3.2. Creation and Deletion
Indirect command layouts are created by:
VkResult vkCreateIndirectCommandsLayoutNVX(
VkDevice device,
const VkIndirectCommandsLayoutCreateInfoNVX* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkIndirectCommandsLayoutNVX* pIndirectCommandsLayout);
- device is the logical device that creates the indirect command layout.
- pCreateInfo is a pointer to an instance of the VkIndirectCommandsLayoutCreateInfoNVX structure containing parameters affecting creation of the indirect command layout.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pIndirectCommandsLayout points to a VkIndirectCommandsLayoutNVX handle in which the resulting indirect command layout is returned.
The VkIndirectCommandsLayoutCreateInfoNVX
structure is defined as:
typedef struct VkIndirectCommandsLayoutCreateInfoNVX {
VkStructureType sType;
const void* pNext;
VkPipelineBindPoint pipelineBindPoint;
VkIndirectCommandsLayoutUsageFlagsNVX flags;
uint32_t tokenCount;
const VkIndirectCommandsLayoutTokenNVX* pTokens;
} VkIndirectCommandsLayoutCreateInfoNVX;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- pipelineBindPoint is the VkPipelineBindPoint that this layout targets.
- flags is a bitmask of VkIndirectCommandsLayoutUsageFlagBitsNVX specifying usage hints of this layout.
- tokenCount is the length of the individual command sequence.
- pTokens is an array describing each command token in detail. See VkIndirectCommandsTokenTypeNVX and VkIndirectCommandsLayoutTokenNVX below for details.
The following code illustrates some of the key flags:
void cmdProcessAllSequences(cmd, objectTable, indirectCommandsLayout, pIndirectCommandsTokens, sequencesCount, indexbuffer, indexbufferoffset)
{
for (s = 0; s < sequencesCount; s++)
{
sequence = s;
if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NVX) {
sequence = incoherent_implementation_dependent_permutation[ sequence ];
}
if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NVX) {
sequence = indexbuffer.load_uint32( sequence * sizeof(uint32_t) + indexbufferoffset);
}
cmdProcessSequence( cmd, objectTable, indirectCommandsLayout, pIndirectCommandsTokens, sequence );
}
}
Bits which can be set in
VkIndirectCommandsLayoutCreateInfoNVX::flags
, specifying usage
hints of an indirect command layout, are:
typedef enum VkIndirectCommandsLayoutUsageFlagBitsNVX {
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NVX = 0x00000001,
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_SPARSE_SEQUENCES_BIT_NVX = 0x00000002,
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EMPTY_EXECUTIONS_BIT_NVX = 0x00000004,
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NVX = 0x00000008,
} VkIndirectCommandsLayoutUsageFlagBitsNVX;
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NVX specifies that the processing of sequences can happen in an implementation-dependent order, which is not guaranteed to be coherent across multiple invocations.
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_SPARSE_SEQUENCES_BIT_NVX specifies that the number of sequences actually used is likely to be much lower than the allocated number of sequences.
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EMPTY_EXECUTIONS_BIT_NVX specifies that there are likely many draw or dispatch calls that are zero-sized (zero grid dimension, no primitives to render).
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NVX specifies that the input data for the sequences is not implicitly indexed from 0..sequencesUsed, but is instead indexed by a user-provided VkBuffer encoding the sequence indices.
typedef VkFlags VkIndirectCommandsLayoutUsageFlagsNVX;
VkIndirectCommandsLayoutUsageFlagsNVX
is a bitmask type for setting a
mask of zero or more VkIndirectCommandsLayoutUsageFlagBitsNVX.
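As an illustrative, non-normative sketch, a minimal layout that binds a pipeline and then issues one indexed indirect draw per sequence could be created as follows; device is assumed to be a VkDevice with this extension enabled:
/* Two tokens per sequence: bind a pipeline, then draw. A divisor of 1 means
 * every sequence reads its own entry from the input buffers. */
VkIndirectCommandsLayoutTokenNVX tokens[2] = {
    { VK_INDIRECT_COMMANDS_TOKEN_TYPE_PIPELINE_NVX,     0, 0, 1 },
    { VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NVX, 0, 0, 1 },
};

VkIndirectCommandsLayoutCreateInfoNVX layoutInfo = {
    .sType = VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_CREATE_INFO_NVX,
    .pNext = NULL,
    .pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .flags = 0,
    .tokenCount = 2,
    .pTokens = tokens,
};

VkIndirectCommandsLayoutNVX indirectCommandsLayout;
VkResult result = vkCreateIndirectCommandsLayoutNVX(device, &layoutInfo, NULL,
                                                    &indirectCommandsLayout);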
Indirect command layouts are destroyed by:
void vkDestroyIndirectCommandsLayoutNVX(
VkDevice device,
VkIndirectCommandsLayoutNVX indirectCommandsLayout,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that destroys the layout.
- indirectCommandsLayout is the layout to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
30.4. Indirect Commands Generation
Command space for generated commands recorded into a secondary command buffer must be reserved by calling:
void vkCmdReserveSpaceForCommandsNVX(
VkCommandBuffer commandBuffer,
const VkCmdReserveSpaceForCommandsInfoNVX* pReserveSpaceInfo);
- commandBuffer is the secondary command buffer in which the space for device-generated commands is reserved.
- pReserveSpaceInfo is a pointer to an instance of the VkCmdReserveSpaceForCommandsInfoNVX structure containing parameters affecting the reservation of command buffer space.
typedef struct VkCmdReserveSpaceForCommandsInfoNVX {
VkStructureType sType;
const void* pNext;
VkObjectTableNVX objectTable;
VkIndirectCommandsLayoutNVX indirectCommandsLayout;
uint32_t maxSequencesCount;
} VkCmdReserveSpaceForCommandsInfoNVX;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- objectTable is the VkObjectTableNVX to be used for the generation process. Only objects registered at the time vkCmdReserveSpaceForCommandsNVX is called will be taken into account for the reservation.
- indirectCommandsLayout is the VkIndirectCommandsLayoutNVX that must also be used at generation time.
- maxSequencesCount is the maximum number of sequences for which command buffer space will be reserved.
The generated commands will behave as if they were recorded within the call to vkCmdReserveSpaceForCommandsNVX, meaning they can inherit state defined in the command buffer prior to this call.
However, given the stateless nature of the generated sequences, they will not affect commands after the reserved space.
Treat the state that can be affected by the provided VkIndirectCommandsLayoutNVX as undefined.
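For illustration only, reserving space for up to 1024 generated sequences inside a secondary command buffer could look as follows; secondaryCmd is assumed to be a secondary VkCommandBuffer in the recording state with suitable inheritance information, and objectTable and indirectCommandsLayout are the handles created earlier:
VkCmdReserveSpaceForCommandsInfoNVX reserveInfo = {
    .sType = VK_STRUCTURE_TYPE_CMD_RESERVE_SPACE_FOR_COMMANDS_INFO_NVX,
    .pNext = NULL,
    .objectTable = objectTable,
    .indirectCommandsLayout = indirectCommandsLayout,
    .maxSequencesCount = 1024,   /* assumed upper bound for this sketch */
};
vkCmdReserveSpaceForCommandsNVX(secondaryCmd, &reserveInfo);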
The actual generation on the device is handled with:
void vkCmdProcessCommandsNVX(
VkCommandBuffer commandBuffer,
const VkCmdProcessCommandsInfoNVX* pProcessCommandsInfo);
- commandBuffer is the primary command buffer in which the generation process takes place.
- pProcessCommandsInfo is a pointer to an instance of the VkCmdProcessCommandsInfoNVX structure containing parameters affecting the processing of commands.
typedef struct VkCmdProcessCommandsInfoNVX {
VkStructureType sType;
const void* pNext;
VkObjectTableNVX objectTable;
VkIndirectCommandsLayoutNVX indirectCommandsLayout;
uint32_t indirectCommandsTokenCount;
const VkIndirectCommandsTokenNVX* pIndirectCommandsTokens;
uint32_t maxSequencesCount;
VkCommandBuffer targetCommandBuffer;
VkBuffer sequencesCountBuffer;
VkDeviceSize sequencesCountOffset;
VkBuffer sequencesIndexBuffer;
VkDeviceSize sequencesIndexOffset;
} VkCmdProcessCommandsInfoNVX;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- objectTable is the VkObjectTableNVX to be used for the generation process. Only objects registered at the time vkCmdReserveSpaceForCommandsNVX is called will be taken into account for the reservation.
- indirectCommandsLayout is the VkIndirectCommandsLayoutNVX that provides the command sequence to generate.
- indirectCommandsTokenCount defines the number of input tokens used.
- pIndirectCommandsTokens provides an array of VkIndirectCommandsTokenNVX that reference the input data for each token command.
- maxSequencesCount is the maximum number of sequences for which command buffer space will be reserved. If sequencesCountBuffer is VK_NULL_HANDLE, this is also the actual number of sequences generated.
- targetCommandBuffer can be the secondary VkCommandBuffer in which the commands should be recorded. If targetCommandBuffer is NULL, an implicit reservation as well as execution takes place on the processing VkCommandBuffer.
- sequencesCountBuffer can be a VkBuffer from which the actual number of sequences is sourced as a uint32_t value.
- sequencesCountOffset is the byte offset into sequencesCountBuffer where the count value is stored.
- sequencesIndexBuffer must be set if indirectCommandsLayout's VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NVX is set, and provides the used sequence indices as a uint32_t array. Otherwise it must be VK_NULL_HANDLE.
- sequencesIndexOffset is the byte offset into sequencesIndexBuffer where the index values start.
Referencing the functions defined in Indirect Commands Layout,
vkCmdProcessCommandsNVX
behaves as:
// For targetCommandBuffers the existing reservedSpace is reset & overwritten.
VkCommandBuffer cmd = targetCommandBuffer ?
targetCommandBuffer.reservedSpace :
commandBuffer;
uint32_t sequencesCount = sequencesCountBuffer ?
min(maxSequencesCount, sequencesCountBuffer.load_uint32(sequencesCountOffset)) :
maxSequencesCount;
cmdProcessAllSequences(cmd, objectTable,
indirectCommandsLayout, pIndirectCommandsTokens,
sequencesCount,
sequencesIndexBuffer, sequencesIndexOffset);
// The stateful commands within indirectCommandsLayout will not
// affect the state of subsequent commands in the target
// command buffer (cmd)
Note
It is important to note that any state that may be affected by the generated commands must be considered undefined for the commands following them. It is not possible to set up state with generated commands and then issue work that consumes this state outside of the generated sequence.
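As an illustrative, non-normative sketch, generating and executing the commands directly on a primary command buffer (no separate target command buffer) could look as follows; primaryCmd, tokenInputs, tokenCount, and maxSequences are assumed to have been prepared earlier, with the token input buffers written, for example, by a preceding compute pass:
VkCmdProcessCommandsInfoNVX processInfo = {
    .sType = VK_STRUCTURE_TYPE_CMD_PROCESS_COMMANDS_INFO_NVX,
    .pNext = NULL,
    .objectTable = objectTable,
    .indirectCommandsLayout = indirectCommandsLayout,
    .indirectCommandsTokenCount = tokenCount,
    .pIndirectCommandsTokens = tokenInputs,
    .maxSequencesCount = maxSequences,
    .targetCommandBuffer = VK_NULL_HANDLE,   /* generate and execute in place */
    .sequencesCountBuffer = VK_NULL_HANDLE,  /* use maxSequences directly */
    .sequencesCountOffset = 0,
    .sequencesIndexBuffer = VK_NULL_HANDLE,
    .sequencesIndexOffset = 0,
};
vkCmdProcessCommandsNVX(primaryCmd, &processInfo);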
31. Sparse Resources
As documented in Resource Memory Association,
VkBuffer
and VkImage
resources in Vulkan must be bound
completely and contiguously to a single VkDeviceMemory
object.
This binding must be done before the resource is used, and the binding is
immutable for the lifetime of the resource.
Sparse resources relax these restrictions and provide these additional features:
- Sparse resources can be bound non-contiguously to one or more VkDeviceMemory allocations.
- Sparse resources can be re-bound to different memory allocations over the lifetime of the resource.
- Sparse resources can have descriptors generated and used orthogonally with memory binding commands.
31.1. Sparse Resource Features
Sparse resources have several features that must be enabled explicitly at
resource creation time.
The features are enabled by including bits in the flags
parameter of
VkImageCreateInfo or VkBufferCreateInfo.
Each feature also has one or more corresponding feature enables specified in
VkPhysicalDeviceFeatures.
- Sparse binding is the base feature, and provides the following capabilities:
  - Resources can be bound at some defined (sparse block) granularity.
  - The entire resource must be bound to memory before use regardless of regions actually accessed.
  - No specific mapping of image region to memory offset is defined, i.e. the location that each texel corresponds to in memory is implementation-dependent.
  - Sparse buffers have a well-defined mapping of buffer range to memory range, where an offset into a range of the buffer that is bound to a single contiguous range of memory corresponds to an identical offset within that range of memory.
  - Requested via the VK_IMAGE_CREATE_SPARSE_BINDING_BIT and VK_BUFFER_CREATE_SPARSE_BINDING_BIT bits.
  - A sparse image created using VK_IMAGE_CREATE_SPARSE_BINDING_BIT (but not VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT) supports all formats that non-sparse usage supports, and supports both VK_IMAGE_TILING_OPTIMAL and VK_IMAGE_TILING_LINEAR tiling.
- Sparse residency builds on (and requires) the sparseBinding feature. It includes the following capabilities:
  - Resources do not have to be completely bound to memory before use on the device.
  - Images have a prescribed sparse image block layout, allowing specific rectangular regions of the image to be bound to specific offsets in memory allocations.
  - Consistency of access to unbound regions of the resource is defined by the absence or presence of VkPhysicalDeviceSparseProperties::residencyNonResidentStrict. If this property is present, accesses to unbound regions of the resource are well defined and behave as if the data bound is populated with all zeros; writes are discarded. When this property is absent, accesses are considered safe, but reads will return undefined values.
  - Requested via the VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT and VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT bits.
  - Sparse residency support is advertised on a finer grain via the following features:
    - sparseResidencyBuffer: Support for creating VkBuffer objects with the VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT.
    - sparseResidencyImage2D: Support for creating 2D single-sampled VkImage objects with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
    - sparseResidencyImage3D: Support for creating 3D VkImage objects with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
    - sparseResidency2Samples: Support for creating 2D VkImage objects with 2 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
    - sparseResidency4Samples: Support for creating 2D VkImage objects with 4 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
    - sparseResidency8Samples: Support for creating 2D VkImage objects with 8 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
    - sparseResidency16Samples: Support for creating 2D VkImage objects with 16 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
    Implementations supporting sparseResidencyImage2D are only required to support sparse 2D, single-sampled images. Support for sparse 3D and MSAA images is optional and can be enabled via sparseResidencyImage3D, sparseResidency2Samples, sparseResidency4Samples, sparseResidency8Samples, and sparseResidency16Samples.
  - A sparse image created using VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT supports all non-compressed color formats with power-of-two element size that non-sparse usage supports. Additional formats may also be supported and can be queried via vkGetPhysicalDeviceSparseImageFormatProperties. VK_IMAGE_TILING_LINEAR tiling is not supported.
- Sparse aliasing provides the following capability that can be enabled per resource: it allows physical memory ranges to be shared between multiple locations in the same sparse resource or between multiple sparse resources, with each binding of a memory location observing a consistent interpretation of the memory contents. See Sparse Memory Aliasing for more information.
31.2. Sparse Buffers and Fully-Resident Images
Both VkBuffer
and VkImage
objects created with the
VK_IMAGE_CREATE_SPARSE_BINDING_BIT
or
VK_BUFFER_CREATE_SPARSE_BINDING_BIT
bits can be thought of as a
linear region of address space.
In the VkImage
case if VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
is
not used, this linear region is entirely opaque, meaning that there is no
application-visible mapping between texel location and memory offset.
Unless VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
or
VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT
are also used, the entire
resource must be bound to one or more VkDeviceMemory
objects before
use.
31.2.1. Sparse Buffer and Fully-Resident Image Block Size
The sparse block size in bytes for sparse buffers and fully-resident images
is reported as VkMemoryRequirements
::alignment
.
alignment
represents both the memory alignment requirement and the
binding granularity (in bytes) for sparse resources.
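For illustration only, the sparse block size of a sparse buffer can be obtained from the standard memory requirements query; sparseBuffer is assumed to be a buffer created with VK_BUFFER_CREATE_SPARSE_BINDING_BIT:
VkMemoryRequirements memReqs;
vkGetBufferMemoryRequirements(device, sparseBuffer, &memReqs);
/* memReqs.alignment is both the required VkDeviceMemory alignment and the
 * sparse block size in bytes (the binding granularity). */
VkDeviceSize sparseBlockSize = memReqs.alignment;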
31.3. Sparse Partially-Resident Buffers
VkBuffer
objects created with the
VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT
bit allow the buffer to be made
only partially resident.
Partially resident VkBuffer
objects are allocated and bound
identically to VkBuffer
objects using only the
VK_BUFFER_CREATE_SPARSE_BINDING_BIT
feature.
The only difference is the ability for some regions of the buffer to be
unbound during device use.
31.4. Sparse Partially-Resident Images
VkImage
objects created with the
VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
bit allow specific rectangular
regions of the image called sparse image blocks to be bound to specific
ranges of memory.
This allows the application to manage residency at either image subresource
or sparse image block granularity.
Each image subresource (outside of the mip tail)
starts on a sparse block boundary and has dimensions that are integer
multiples of the corresponding dimensions of the sparse image block.
Note
Applications can use these types of images to control LOD based on total memory consumption. If memory pressure becomes an issue the application can unbind and disable specific mipmap levels of images without having to recreate resources or modify texel data of unaffected levels. The application can also use this functionality to access subregions of the image in a “megatexture” fashion. The application can create a large image and only populate the region of the image that is currently being used in the scene.
31.4.1. Accessing Unbound Regions
The following member of VkPhysicalDeviceSparseProperties
affects how
data in unbound regions of sparse resources are handled by the
implementation:
- residencyNonResidentStrict
If this property is not present, reads of unbound regions of the image will return undefined values. Both reads and writes are still considered safe and will not affect other resources or populated regions of the image.
If this property is present, all reads of unbound regions of the image will behave as if the region was bound to memory populated with all zeros; writes will be discarded.
Formatted accesses to unbound memory may still alter some component values in the natural way for those accesses, e.g. substituting a value of one for alpha in formats that do not have an alpha component.
Example: Reading the alpha component of an unbacked VK_FORMAT_R8_UNORM
image will return a value of 1.0f.
See Physical Device Enumeration for instructions for retrieving physical device properties.
31.4.2. Mip Tail Regions
Sparse images created using VK_IMAGE_CREATE_SPARSE_BINDING_BIT
(without also using VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
) have no
specific mapping of image region or image subresource to memory offset
defined, so the entire image can be thought of as a linear opaque address
region.
However, images created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
do
have a prescribed sparse image block layout, and hence each image
subresource must start on a sparse block boundary.
Within each array layer, the set of mip levels that have a smaller size than
the sparse block size in bytes are grouped together into a mip tail
region.
If the VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT
flag is present in
the flags
member of VkSparseImageFormatProperties
, for the
image’s format
, then any mip level which has dimensions that are not
integer multiples of the corresponding dimensions of the sparse image block,
and all subsequent mip levels, are also included in the mip tail region.
The following member of VkPhysicalDeviceSparseProperties
may affect
how the implementation places mip levels in the mip tail region:
- residencyAlignedMipSize
Each mip tail region is bound to memory as an opaque region (i.e. must be bound using a VkSparseImageOpaqueMemoryBindInfo structure) and may be of a size greater than or equal to the sparse block size in bytes. This size is guaranteed to be an integer multiple of the sparse block size in bytes.
An implementation may choose to allow each array-layer’s mip tail region to
be bound to memory independently or require that all array-layer’s mip tail
regions be treated as one.
This is dictated by VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
in
VkSparseImageMemoryRequirements
::flags
.
The following diagrams depict how
VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT
and
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
alter memory usage and
requirements.
In the absence of VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT
and
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
, each array layer contains a
mip tail region containing texel data for all mip levels smaller than the
sparse image block in any dimension.
Mip levels that are as large or larger than a sparse image block in all dimensions can be bound individually. Right-edges and bottom-edges of each level are allowed to have partially used sparse blocks. Any bound partially-used-sparse-blocks must still have their full sparse block size in bytes allocated in memory.
When VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
is present all array
layers will share a single mip tail region.
Note
The mip tail regions are presented here in 2D arrays simply for figure size reasons. Each mip tail is logically a single array of sparse blocks with an implementation-dependent mapping of texels or compressed texel blocks to sparse blocks.
When VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT
is present the first
mip level that would contain partially used sparse blocks begins the mip
tail region.
This level and all subsequent levels are placed in the mip tail.
Only the first N mip levels whose dimensions are an exact multiple of
the sparse image block dimensions can be bound and unbound on a sparse
block basis.
Note
The mip tail region is presented here in a 2D array simply for figure size reasons. It is logically a single array of sparse blocks with an implementation-dependent mapping of texels or compressed texel blocks to sparse blocks.
When both VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT
and
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
are present the constraints
from each of these flags are in effect.
31.4.3. Standard Sparse Image Block Shapes
Standard sparse image block shapes define a standard set of dimensions for sparse image blocks that depend on the format of the image. Layout of texels or compressed texel blocks within a sparse image block is implementation dependent. All currently defined standard sparse image block shapes are 64 KB in size.
For block-compressed formats (e.g. VK_FORMAT_BC5_UNORM_BLOCK
), the
texel size is the size of the compressed texel block (e.g. 128-bit for
BC5
) thus the dimensions of the standard sparse image block shapes
apply in terms of compressed texel blocks.
Note
For block-compressed formats, the dimensions of a sparse image block in terms of texels can be calculated by multiplying the sparse image block dimensions by the compressed texel block dimensions.
Standard Sparse Image Block Shapes (Single Sample)
TEXEL SIZE (bits) | Block Shape (2D) | Block Shape (3D)
---|---|---
8-Bit | 256 × 256 × 1 | 64 × 32 × 32
16-Bit | 256 × 128 × 1 | 32 × 32 × 32
32-Bit | 128 × 128 × 1 | 32 × 32 × 16
64-Bit | 128 × 64 × 1 | 32 × 16 × 16
128-Bit | 64 × 64 × 1 | 16 × 16 × 16
Standard Sparse Image Block Shapes (MSAA)
TEXEL SIZE (bits) | Block Shape (2X) | Block Shape (4X) | Block Shape (8X) | Block Shape (16X)
---|---|---|---|---
8-Bit | 128 × 256 × 1 | 128 × 128 × 1 | 64 × 128 × 1 | 64 × 64 × 1
16-Bit | 128 × 128 × 1 | 128 × 64 × 1 | 64 × 64 × 1 | 64 × 32 × 1
32-Bit | 64 × 128 × 1 | 64 × 64 × 1 | 32 × 64 × 1 | 32 × 32 × 1
64-Bit | 64 × 64 × 1 | 64 × 32 × 1 | 32 × 32 × 1 | 32 × 16 × 1
128-Bit | 32 × 64 × 1 | 32 × 32 × 1 | 16 × 32 × 1 | 16 × 16 × 1
Implementations that support the standard sparse image block shape for all
formats listed in the Standard Sparse Image Block Shapes (Single Sample) and
Standard Sparse Image Block Shapes (MSAA) tables may advertise the following
VkPhysicalDeviceSparseProperties:
- residencyStandard2DBlockShape
- residencyStandard2DMultisampleBlockShape
- residencyStandard3DBlockShape
Reporting each of these features does not imply that all possible image types are supported as sparse. Instead, this indicates that no supported sparse image of the corresponding type will use custom sparse image block dimensions for any formats that have a corresponding standard sparse image block shape.
31.4.4. Custom Sparse Image Block Shapes
An implementation that does not support a standard image block shape for a
particular sparse partially-resident image may choose to support a custom
sparse image block shape for it instead.
The dimensions of such a custom sparse image block shape are reported in
VkSparseImageFormatProperties
::imageGranularity
.
As with standard sparse image block shapes, the size in bytes of the custom
sparse image block shape will be reported in
VkMemoryRequirements
::alignment
.
Custom sparse image block dimensions are reported through
vkGetPhysicalDeviceSparseImageFormatProperties
and
vkGetImageSparseMemoryRequirements
.
An implementation must not support both the standard sparse image block shape and a custom sparse image block shape for the same image. The standard sparse image block shape must be used if it is supported.
31.4.5. Multiple Aspects
Partially resident images are allowed to report separate sparse properties for different aspects of the image. One example is for depth/stencil images where the implementation separates the depth and stencil data into separate planes. Another reason for multiple aspects is to allow the application to manage memory allocation for implementation-private metadata associated with the image. See the figure below:
Note
The mip tail regions are presented here in 2D arrays simply for figure size reasons. Each mip tail is logically a single array of sparse blocks with an implementation-dependent mapping of texels or compressed texel blocks to sparse blocks.
In the figure above the depth, stencil, and metadata aspects all have unique
sparse properties.
The per-texel stencil data is ¼ the size of the depth data,
hence the stencil sparse blocks include 4 × the number of
texels.
The sparse block size in bytes for all of the aspects is identical and
defined by VkMemoryRequirements
::alignment
.
Metadata
The metadata aspect of an image has the following constraints:
- All metadata is reported in the mip tail region of the metadata aspect.
- All metadata must be bound prior to device use of the sparse image.
31.5. Sparse Memory Aliasing
By default sparse resources have the same aliasing rules as non-sparse resources. See Memory Aliasing for more information.
VkDevice
objects that have the
sparseResidencyAliased feature
enabled are able to use the VK_BUFFER_CREATE_SPARSE_ALIASED_BIT
and
VK_IMAGE_CREATE_SPARSE_ALIASED_BIT
flags for resource creation.
These flags allow resources to access physical memory bound into multiple
locations within one or more sparse resources in a data consistent
fashion.
This means that reading physical memory from multiple aliased locations will
return the same value.
Care must be taken when performing a write operation to aliased physical memory. Memory dependencies must be used to separate writes to one alias from reads or writes to another alias. Writes to aliased memory that are not properly guarded against accesses to different aliases will have undefined results for all accesses to the aliased memory.
Applications that wish to make use of data consistent sparse memory aliasing must abide by the following guidelines:
- All sparse resources that are bound to aliased physical memory must be created with the VK_BUFFER_CREATE_SPARSE_ALIASED_BIT / VK_IMAGE_CREATE_SPARSE_ALIASED_BIT flag.
- All resources that access aliased physical memory must interpret the memory in the same way. This implies the following:
  - Buffers and images cannot alias the same physical memory in a data consistent fashion. The physical memory ranges must be used exclusively by buffers or used exclusively by images for data consistency to be guaranteed.
  - Memory in sparse image mip tail regions cannot access aliased memory in a data consistent fashion.
  - Sparse images that alias the same physical memory must have compatible formats and be using the same sparse image block shape in order to access aliased memory in a data consistent fashion.
- Failure to follow any of the above guidelines will require the application to abide by the normal, non-sparse resource aliasing rules. In this case memory cannot be accessed in a data consistent fashion.
Note
Enabling sparse resource memory aliasing can be a way to lower physical memory use, but it may reduce performance on some implementations. An application developer can test on their target HW and balance the memory / performance trade-offs measured.
31.6. Sparse Resource Implementation Guidelines
31.7. Sparse Resource API
The APIs related to sparse resources are grouped into the following categories:
31.7.1. Physical Device Features
Some sparse-resource related features are reported and enabled in
VkPhysicalDeviceFeatures
.
These features must be supported and enabled on the VkDevice
object
before applications can use them.
See Physical Device Features for information on how to
get and set enabled device features, and for more detailed explanations of
these features.
Sparse Physical Device Features
- sparseBinding: Support for creating VkBuffer and VkImage objects with the VK_BUFFER_CREATE_SPARSE_BINDING_BIT and VK_IMAGE_CREATE_SPARSE_BINDING_BIT flags, respectively.
- sparseResidencyBuffer: Support for creating VkBuffer objects with the VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT flag.
- sparseResidencyImage2D: Support for creating 2D single-sampled VkImage objects with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
- sparseResidencyImage3D: Support for creating 3D VkImage objects with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
- sparseResidency2Samples: Support for creating 2D VkImage objects with 2 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
- sparseResidency4Samples: Support for creating 2D VkImage objects with 4 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
- sparseResidency8Samples: Support for creating 2D VkImage objects with 8 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
- sparseResidency16Samples: Support for creating 2D VkImage objects with 16 samples and VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT.
- sparseResidencyAliased: Support for creating VkBuffer and VkImage objects with the VK_BUFFER_CREATE_SPARSE_ALIASED_BIT and VK_IMAGE_CREATE_SPARSE_ALIASED_BIT flags, respectively.
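As a non-normative sketch, an application would typically query the supported features and request only the sparse features it needs when creating the device; physicalDevice is assumed to be a previously selected VkPhysicalDevice:
VkPhysicalDeviceFeatures supported;
vkGetPhysicalDeviceFeatures(physicalDevice, &supported);

VkPhysicalDeviceFeatures enabled = {0};
if (supported.sparseBinding && supported.sparseResidencyBuffer) {
    /* Request only the sparse features this application relies on. */
    enabled.sparseBinding = VK_TRUE;
    enabled.sparseResidencyBuffer = VK_TRUE;
}
/* 'enabled' is then passed as VkDeviceCreateInfo::pEnabledFeatures when the
 * VkDevice is created (queue and extension setup omitted from this sketch). */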
31.7.2. Physical Device Sparse Properties
Some features of the implementation are not possible to disable, and are
reported to allow applications to alter their sparse resource usage
accordingly.
These read-only capabilities are reported in the
VkPhysicalDeviceProperties::sparseProperties
member, which is a
structure of type VkPhysicalDeviceSparseProperties
.
The VkPhysicalDeviceSparseProperties
structure is defined as:
typedef struct VkPhysicalDeviceSparseProperties {
VkBool32 residencyStandard2DBlockShape;
VkBool32 residencyStandard2DMultisampleBlockShape;
VkBool32 residencyStandard3DBlockShape;
VkBool32 residencyAlignedMipSize;
VkBool32 residencyNonResidentStrict;
} VkPhysicalDeviceSparseProperties;
- residencyStandard2DBlockShape is VK_TRUE if the physical device will access all single-sample 2D sparse resources using the standard sparse image block shapes (based on image format), as described in the Standard Sparse Image Block Shapes (Single Sample) table. If this property is not supported, the value returned in the imageGranularity member of the VkSparseImageFormatProperties structure for single-sample 2D images is not required to match the standard sparse image block dimensions listed in the table.
- residencyStandard2DMultisampleBlockShape is VK_TRUE if the physical device will access all multisample 2D sparse resources using the standard sparse image block shapes (based on image format), as described in the Standard Sparse Image Block Shapes (MSAA) table. If this property is not supported, the value returned in the imageGranularity member of the VkSparseImageFormatProperties structure for multisample 2D images is not required to match the standard sparse image block dimensions listed in the table.
- residencyStandard3DBlockShape is VK_TRUE if the physical device will access all 3D sparse resources using the standard sparse image block shapes (based on image format), as described in the Standard Sparse Image Block Shapes (Single Sample) table. If this property is not supported, the value returned in the imageGranularity member of the VkSparseImageFormatProperties structure for 3D images is not required to match the standard sparse image block dimensions listed in the table.
- residencyAlignedMipSize is VK_TRUE if images with mip level dimensions that are not integer multiples of the corresponding dimensions of the sparse image block may be placed in the mip tail. If this property is not reported, only mip levels with dimensions smaller than the imageGranularity member of the VkSparseImageFormatProperties structure will be placed in the mip tail. If this property is reported, the implementation is allowed to return VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT in the flags member of VkSparseImageFormatProperties, indicating that mip level dimensions that are not integer multiples of the corresponding dimensions of the sparse image block will be placed in the mip tail.
- residencyNonResidentStrict specifies whether the physical device can consistently access non-resident regions of a resource. If this property is VK_TRUE, access to non-resident regions of resources will be guaranteed to return values as if the resource were populated with 0; writes to non-resident regions will be discarded.
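For illustration only, these read-only properties can be inspected through the standard physical device query; for example, checking whether reads from unbound regions are strictly defined:
VkPhysicalDeviceProperties props;
vkGetPhysicalDeviceProperties(physicalDevice, &props);

if (props.sparseProperties.residencyNonResidentStrict) {
    /* Reads of unbound regions return zeros and writes are discarded. */
} else {
    /* Reads of unbound regions return undefined values. */
}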
31.7.3. Sparse Image Format Properties
Given that certain aspects of sparse image support, including the sparse image block dimensions, may be implementation-dependent, vkGetPhysicalDeviceSparseImageFormatProperties can be used to query for sparse image format properties prior to resource creation. This command is used to check whether a given set of sparse image parameters is supported and what the sparse image block shape will be.
Sparse Image Format Properties API
The VkSparseImageFormatProperties
structure is defined as:
typedef struct VkSparseImageFormatProperties {
VkImageAspectFlags aspectMask;
VkExtent3D imageGranularity;
VkSparseImageFormatFlags flags;
} VkSparseImageFormatProperties;
- aspectMask is a bitmask of VkImageAspectFlagBits specifying which aspects of the image the properties apply to.
- imageGranularity is the width, height, and depth of the sparse image block in texels or compressed texel blocks.
- flags is a bitmask of VkSparseImageFormatFlagBits specifying additional information about the sparse resource.
Bits which may be set in VkSparseImageFormatProperties::flags
,
specifying additional information about the sparse resource, are:
typedef enum VkSparseImageFormatFlagBits {
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT = 0x00000001,
VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT = 0x00000002,
VK_SPARSE_IMAGE_FORMAT_NONSTANDARD_BLOCK_SIZE_BIT = 0x00000004,
} VkSparseImageFormatFlagBits;
- VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT specifies that the image uses a single mip tail region for all array layers.
- VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT specifies that the first mip level whose dimensions are not integer multiples of the corresponding dimensions of the sparse image block begins the mip tail region.
- VK_SPARSE_IMAGE_FORMAT_NONSTANDARD_BLOCK_SIZE_BIT specifies that the image uses non-standard sparse image block dimensions, and the imageGranularity values do not match the standard sparse image block dimensions for the given format.
typedef VkFlags VkSparseImageFormatFlags;
VkSparseImageFormatFlags
is a bitmask type for setting a mask of zero
or more VkSparseImageFormatFlagBits.
vkGetPhysicalDeviceSparseImageFormatProperties
returns an array of
VkSparseImageFormatProperties.
Each element will describe properties for one set of image aspects that are
bound simultaneously in the image.
This is usually one element for each aspect in the image, but for
interleaved depth/stencil images there is only one element describing the
combined aspects.
void vkGetPhysicalDeviceSparseImageFormatProperties(
VkPhysicalDevice physicalDevice,
VkFormat format,
VkImageType type,
VkSampleCountFlagBits samples,
VkImageUsageFlags usage,
VkImageTiling tiling,
uint32_t* pPropertyCount,
VkSparseImageFormatProperties* pProperties);
- physicalDevice is the physical device from which to query the sparse image capabilities.
- format is the image format.
- type is the dimensionality of image.
- samples is the number of samples per texel as defined in VkSampleCountFlagBits.
- usage is a bitmask describing the intended usage of the image.
- tiling is the tiling arrangement of the texel blocks in memory.
- pPropertyCount is a pointer to an integer related to the number of sparse format properties available or queried, as described below.
- pProperties is either NULL or a pointer to an array of VkSparseImageFormatProperties structures.
If pProperties
is NULL
, then the number of sparse format properties
available is returned in pPropertyCount
.
Otherwise, pPropertyCount
must point to a variable set by the user to
the number of elements in the pProperties
array, and on return the
variable is overwritten with the number of structures actually written to
pProperties
.
If pPropertyCount
is less than the number of sparse format properties
available, at most pPropertyCount
structures will be written.
If VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
is not supported for the given
arguments, pPropertyCount
will be set to zero upon return, and no data
will be written to pProperties
.
Multiple aspects are returned for depth/stencil images that are implemented
as separate planes by the implementation.
The depth and stencil data planes each have unique
VkSparseImageFormatProperties
data.
Depth/stencil images with depth and stencil data interleaved into a single plane will return a single VkSparseImageFormatProperties structure with the aspectMask set to VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT.
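As a non-normative illustration of the query pattern described above, an application could enumerate the sparse format properties for a 2D, single-sampled color image as follows; the format and usage flags are assumptions chosen for the example:
uint32_t count = 0;
vkGetPhysicalDeviceSparseImageFormatProperties(
    physicalDevice, VK_FORMAT_R8G8B8A8_UNORM, VK_IMAGE_TYPE_2D,
    VK_SAMPLE_COUNT_1_BIT,
    VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT,
    VK_IMAGE_TILING_OPTIMAL, &count, NULL);

/* A count of zero means sparse residency is not supported for these parameters. */
VkSparseImageFormatProperties properties[8];   /* assumed upper bound for this sketch */
if (count > 8) count = 8;
vkGetPhysicalDeviceSparseImageFormatProperties(
    physicalDevice, VK_FORMAT_R8G8B8A8_UNORM, VK_IMAGE_TYPE_2D,
    VK_SAMPLE_COUNT_1_BIT,
    VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT,
    VK_IMAGE_TILING_OPTIMAL, &count, properties);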
vkGetPhysicalDeviceSparseImageFormatProperties2
returns an array of
VkSparseImageFormatProperties2.
Each element will describe properties for one set of image aspects that are
bound simultaneously in the image.
This is usually one element for each aspect in the image, but for
interleaved depth/stencil images there is only one element describing the
combined aspects.
void vkGetPhysicalDeviceSparseImageFormatProperties2(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceSparseImageFormatInfo2* pFormatInfo,
uint32_t* pPropertyCount,
VkSparseImageFormatProperties2* pProperties);
or the equivalent command
void vkGetPhysicalDeviceSparseImageFormatProperties2KHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceSparseImageFormatInfo2* pFormatInfo,
uint32_t* pPropertyCount,
VkSparseImageFormatProperties2* pProperties);
- physicalDevice is the physical device from which to query the sparse image capabilities.
- pFormatInfo is a pointer to a structure of type VkPhysicalDeviceSparseImageFormatInfo2 containing input parameters to the command.
- pPropertyCount is a pointer to an integer related to the number of sparse format properties available or queried, as described below.
- pProperties is either NULL or a pointer to an array of VkSparseImageFormatProperties2 structures.
vkGetPhysicalDeviceSparseImageFormatProperties2
behaves identically to
vkGetPhysicalDeviceSparseImageFormatProperties, with the ability to
return extended information by adding extension structures to the
pNext
chain of its pProperties
parameter.
The VkPhysicalDeviceSparseImageFormatInfo2
structure is defined as:
typedef struct VkPhysicalDeviceSparseImageFormatInfo2 {
VkStructureType sType;
const void* pNext;
VkFormat format;
VkImageType type;
VkSampleCountFlagBits samples;
VkImageUsageFlags usage;
VkImageTiling tiling;
} VkPhysicalDeviceSparseImageFormatInfo2;
or the equivalent
typedef VkPhysicalDeviceSparseImageFormatInfo2 VkPhysicalDeviceSparseImageFormatInfo2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- format is the image format.
- type is the dimensionality of image.
- samples is the number of samples per texel as defined in VkSampleCountFlagBits.
- usage is a bitmask describing the intended usage of the image.
- tiling is the tiling arrangement of the texel blocks in memory.
The VkSparseImageFormatProperties2
structure is defined as:
typedef struct VkSparseImageFormatProperties2 {
VkStructureType sType;
void* pNext;
VkSparseImageFormatProperties properties;
} VkSparseImageFormatProperties2;
or the equivalent
typedef VkSparseImageFormatProperties2 VkSparseImageFormatProperties2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- properties is a structure of type VkSparseImageFormatProperties which is populated with the same values as in vkGetPhysicalDeviceSparseImageFormatProperties.
31.7.4. Sparse Resource Creation
Sparse resources require that one or more sparse feature flags be specified
(as part of the VkPhysicalDeviceFeatures
structure described
previously in the Physical Device Features
section) at CreateDevice time.
When the appropriate device features are enabled, the
VK_BUFFER_CREATE_SPARSE_*
and VK_IMAGE_CREATE_SPARSE_*
flags
can be used.
See vkCreateBuffer and vkCreateImage for details of the resource
creation APIs.
Note
Specifying |
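As a non-normative sketch, creating a partially resident buffer once the relevant features have been enabled could look as follows; the size and usage values are illustrative:
VkBufferCreateInfo bufferInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = NULL,
    .flags = VK_BUFFER_CREATE_SPARSE_BINDING_BIT |
             VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT,
    .size = 64ull * 1024 * 1024,                 /* assumed 64 MiB virtual size */
    .usage = VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
};

VkBuffer sparseBuffer;
VkResult result = vkCreateBuffer(device, &bufferInfo, NULL, &sparseBuffer);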
31.7.5. Sparse Resource Memory Requirements
Sparse resources have specific memory requirements related to binding sparse
memory.
These memory requirements are reported differently for VkBuffer
objects and VkImage
objects.
Buffer and Fully-Resident Images
Buffers (both fully and partially resident) and fully-resident images can
be bound to memory using only the data from VkMemoryRequirements
.
For all sparse resources the VkMemoryRequirements
::alignment
member specifies both the bindable sparse block size in bytes and required
alignment of VkDeviceMemory
.
Partially Resident Images
Partially resident images have a different method for binding memory.
As with buffers and fully resident images, the
VkMemoryRequirements
::alignment
field specifies the bindable
sparse block size in bytes for the image.
Requesting sparse memory requirements for VkImage
objects using
vkGetImageSparseMemoryRequirements
will return an array of one or more
VkSparseImageMemoryRequirements
structures.
Each structure describes the sparse memory requirements for a group of
aspects of the image.
The sparse image must have been created using the
VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
flag to retrieve valid sparse
image memory requirements.
Sparse Image Memory Requirements
The VkSparseImageMemoryRequirements
structure is defined as:
typedef struct VkSparseImageMemoryRequirements {
VkSparseImageFormatProperties formatProperties;
uint32_t imageMipTailFirstLod;
VkDeviceSize imageMipTailSize;
VkDeviceSize imageMipTailOffset;
VkDeviceSize imageMipTailStride;
} VkSparseImageMemoryRequirements;
- formatProperties.aspectMask is the set of aspects of the image that this sparse memory requirement applies to. This will usually have a single aspect specified. However, depth/stencil images may have depth and stencil data interleaved in the same sparse block, in which case both VK_IMAGE_ASPECT_DEPTH_BIT and VK_IMAGE_ASPECT_STENCIL_BIT would be present.
- formatProperties.imageGranularity describes the dimensions of a single bindable sparse image block in texel units. For aspect VK_IMAGE_ASPECT_METADATA_BIT, all dimensions will be zero. All metadata is located in the mip tail region.
- formatProperties.flags is a bitmask of VkSparseImageFormatFlagBits:
  - If VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT is set, the image uses a single mip tail region for all array layers.
  - If VK_SPARSE_IMAGE_FORMAT_ALIGNED_MIP_SIZE_BIT is set, the dimensions of mip levels must be integer multiples of the corresponding dimensions of the sparse image block for levels not located in the mip tail.
  - If VK_SPARSE_IMAGE_FORMAT_NONSTANDARD_BLOCK_SIZE_BIT is set, the image uses non-standard sparse image block dimensions. The formatProperties.imageGranularity values do not match the standard sparse image block dimensions corresponding to the image's format.
- imageMipTailFirstLod is the first mip level at which image subresources are included in the mip tail region.
- imageMipTailSize is the memory size (in bytes) of the mip tail region. If formatProperties.flags contains VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT, this is the size of the whole mip tail, otherwise this is the size of the mip tail of a single array layer. This value is guaranteed to be a multiple of the sparse block size in bytes.
- imageMipTailOffset is the opaque memory offset used with VkSparseImageOpaqueMemoryBindInfo to bind the mip tail region(s).
- imageMipTailStride is the offset stride between each array-layer's mip tail, if formatProperties.flags does not contain VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT (otherwise the value is undefined).
To query sparse memory requirements for an image, call:
void vkGetImageSparseMemoryRequirements(
VkDevice device,
VkImage image,
uint32_t* pSparseMemoryRequirementCount,
VkSparseImageMemoryRequirements* pSparseMemoryRequirements);
-
device
is the logical device that owns the image. -
image
is the VkImage object to get the memory requirements for. -
pSparseMemoryRequirementCount
is a pointer to an integer related to the number of sparse memory requirements available or queried, as described below. -
pSparseMemoryRequirements
is either NULL
or a pointer to an array of VkSparseImageMemoryRequirements
structures.
If pSparseMemoryRequirements
is NULL
, then the number of sparse
memory requirements available is returned in
pSparseMemoryRequirementCount
.
Otherwise, pSparseMemoryRequirementCount
must point to a variable set
by the user to the number of elements in the pSparseMemoryRequirements
array, and on return the variable is overwritten with the number of
structures actually written to pSparseMemoryRequirements
.
If pSparseMemoryRequirementCount
is less than the number of sparse
memory requirements available, at most pSparseMemoryRequirementCount
structures will be written.
If the image was not created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
then pSparseMemoryRequirementCount
will be set to zero and
pSparseMemoryRequirements
will not be written to.
Note
It is legal for an implementation to report a larger value in
VkMemoryRequirements::size than would be obtained by adding together memory
sizes for all VkSparseImageMemoryRequirements returned by
vkGetImageSparseMemoryRequirements.
This may occur when the implementation requires unused padding in the
address range describing the resource.
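As a hedged illustration of the enumeration pattern described above (following the style of the examples in Section 31.8, with headers and error handling omitted), the following sketch retrieves the per-aspect requirements in two calls; the pReqs name is illustrative only:
uint32_t count = 0;
vkGetImageSparseMemoryRequirements(device, sparseImage, &count, NULL);
VkSparseImageMemoryRequirements* pReqs = (VkSparseImageMemoryRequirements*)
    malloc(count * sizeof(VkSparseImageMemoryRequirements));
vkGetImageSparseMemoryRequirements(device, sparseImage, &count, pReqs);
for (uint32_t i = 0; i < count; ++i)
{
    // Each element covers one group of aspects (e.g. color, depth/stencil,
    // or metadata); pReqs[i].imageMipTailFirstLod is the first level in the
    // mip tail and pReqs[i].imageMipTailSize its size in bytes.
}
free(pReqs);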
To query sparse memory requirements for an image, call:
void vkGetImageSparseMemoryRequirements2(
VkDevice device,
const VkImageSparseMemoryRequirementsInfo2* pInfo,
uint32_t* pSparseMemoryRequirementCount,
VkSparseImageMemoryRequirements2* pSparseMemoryRequirements);
or the equivalent command
void vkGetImageSparseMemoryRequirements2KHR(
VkDevice device,
const VkImageSparseMemoryRequirementsInfo2* pInfo,
uint32_t* pSparseMemoryRequirementCount,
VkSparseImageMemoryRequirements2* pSparseMemoryRequirements);
-
device
is the logical device that owns the image. -
pInfo
is a pointer to an instance of theVkImageSparseMemoryRequirementsInfo2
structure containing parameters required for the memory requirements query. -
pSparseMemoryRequirementCount
is a pointer to an integer related to the number of sparse memory requirements available or queried, as described below. -
pSparseMemoryRequirements
is eitherNULL
or a pointer to an array ofVkSparseImageMemoryRequirements2
structures.
The VkImageSparseMemoryRequirementsInfo2
structure is defined as:
typedef struct VkImageSparseMemoryRequirementsInfo2 {
VkStructureType sType;
const void* pNext;
VkImage image;
} VkImageSparseMemoryRequirementsInfo2;
or the equivalent
typedef VkImageSparseMemoryRequirementsInfo2 VkImageSparseMemoryRequirementsInfo2KHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
image
is the image to query.
The VkSparseImageMemoryRequirements2
structure is defined as:
typedef struct VkSparseImageMemoryRequirements2 {
VkStructureType sType;
void* pNext;
VkSparseImageMemoryRequirements memoryRequirements;
} VkSparseImageMemoryRequirements2;
or the equivalent
typedef VkSparseImageMemoryRequirements2 VkSparseImageMemoryRequirements2KHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
memoryRequirements
is a structure of type VkSparseImageMemoryRequirements describing the memory requirements of the sparse image.
31.7.6. Binding Resource Memory
Non-sparse resources are backed by a single physical allocation prior to
device use (via vkBindImageMemory
or vkBindBufferMemory
), and
their backing must not be changed.
On the other hand, sparse resources can be bound to memory non-contiguously
and these bindings can be altered during the lifetime of the resource.
Note
It is important to note that freeing a VkDeviceMemory object with
vkFreeMemory will not cause resources (or resource regions) bound to the
memory object to become unbound.
Applications must not access resources bound to memory that has been freed.
Implementations must ensure that no access to physical memory owned by the system or another process will occur in this scenario. In other words, accessing resources bound to freed memory may result in application termination, but must not result in system termination or in reading non-process-accessible memory.
Sparse memory bindings execute on a queue that includes the
VK_QUEUE_SPARSE_BINDING_BIT
bit.
Applications must use synchronization primitives to
guarantee that other queues do not access ranges of memory concurrently with
a binding change.
Accessing memory in a range while it is being rebound results in undefined
behavior.
It is valid to access other ranges of the same resource while a bind
operation is executing.
Note
Implementations must provide a guarantee that simultaneously binding sparse blocks while another queue accesses those same sparse blocks via a sparse resource must not access memory owned by another process or otherwise corrupt the system.
While some implementations may include VK_QUEUE_SPARSE_BINDING_BIT
support in queue families that also include graphics and compute support,
other implementations may only expose a
VK_QUEUE_SPARSE_BINDING_BIT
-only queue family.
In either case, applications must use synchronization
primitives to explicitly request any ordering dependencies between sparse
memory binding operations and other graphics/compute/transfer operations, as
sparse binding operations are not automatically ordered against command
buffer execution, even within a single queue.
When binding memory explicitly for the VK_IMAGE_ASPECT_METADATA_BIT
the application must use the VK_SPARSE_MEMORY_BIND_METADATA_BIT
in
the VkSparseMemoryBind
::flags
field when binding memory.
Binding memory for metadata is done the same way as binding memory for the
mip tail, with the addition of the VK_SPARSE_MEMORY_BIND_METADATA_BIT
flag.
Binding the mip tail for any aspect must only be performed using
VkSparseImageOpaqueMemoryBindInfo.
If formatProperties.flags
contains
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
, then it can be bound with
a single VkSparseMemoryBind structure, with resourceOffset
=
imageMipTailOffset
and size
= imageMipTailSize
.
If formatProperties.flags
does not contain
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
then the offset for the mip
tail in each array layer is given as:
arrayMipTailOffset = imageMipTailOffset + arrayLayer * imageMipTailStride;
and the mip tail can be bound with layerCount
VkSparseMemoryBind
structures, each using size
= imageMipTailSize
and
resourceOffset
= arrayMipTailOffset
as defined above.
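Expressed as code, a minimal hedged sketch of the per-layer mip tail binds follows (the mipTailReq, layerCount, pMipTailBinds, bindMemory, and bindMemoryOffset names are assumptions, and allocation of the backing memory is elided):
for (uint32_t layer = 0; layer < layerCount; ++layer)
{
    VkSparseMemoryBind* pBind = &pMipTailBinds[layer];
    // Offset of this layer's mip tail, as given by the formula above
    pBind->resourceOffset = mipTailReq.imageMipTailOffset +
                            layer * mipTailReq.imageMipTailStride;
    pBind->size           = mipTailReq.imageMipTailSize;
    pBind->memory         = bindMemory[layer];       // backing allocation
    pBind->memoryOffset   = bindMemoryOffset[layer]; // offset into that allocation
    pBind->flags          = 0;                       // not the metadata aspect
}
// pMipTailBinds is then referenced from a VkSparseImageOpaqueMemoryBindInfo.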
Sparse memory binding is handled by the following APIs and related data structures.
Sparse Memory Binding Functions
The VkSparseMemoryBind
structure is defined as:
typedef struct VkSparseMemoryBind {
VkDeviceSize resourceOffset;
VkDeviceSize size;
VkDeviceMemory memory;
VkDeviceSize memoryOffset;
VkSparseMemoryBindFlags flags;
} VkSparseMemoryBind;
-
resourceOffset
is the offset into the resource. -
size
is the size of the memory region to be bound. -
memory
is the VkDeviceMemory object that the range of the resource is bound to. If memory
is VK_NULL_HANDLE, the range is unbound. -
memoryOffset
is the offset into the VkDeviceMemory object to bind the resource range to. If memory
is VK_NULL_HANDLE, this value is ignored. -
flags
is a bitmask of VkSparseMemoryBindFlagBits specifying usage of the binding operation.
The binding range [resourceOffset
, resourceOffset
+
size
) has different constraints based on flags
.
If flags
contains VK_SPARSE_MEMORY_BIND_METADATA_BIT
, the
binding range must be within the mip tail region of the metadata aspect.
This metadata region is defined by:
-
metadataRegion = [base, base +
imageMipTailSize
) -
base =
imageMipTailOffset
+imageMipTailStride
× n
and imageMipTailOffset
, imageMipTailSize
, and
imageMipTailStride
values are from the
VkSparseImageMemoryRequirements corresponding to the metadata aspect
of the image, and n is a valid array layer index for the image,
imageMipTailStride
is considered to be zero for aspects where
VkSparseImageMemoryRequirements
::formatProperties.flags
contains
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
.
If flags
does not contain VK_SPARSE_MEMORY_BIND_METADATA_BIT
,
the binding range must be within the range
[0,VkMemoryRequirements::size
).
Bits which can be set in VkSparseMemoryBind::flags
, specifying
usage of a sparse memory binding operation, are:
typedef enum VkSparseMemoryBindFlagBits {
VK_SPARSE_MEMORY_BIND_METADATA_BIT = 0x00000001,
} VkSparseMemoryBindFlagBits;
-
VK_SPARSE_MEMORY_BIND_METADATA_BIT
specifies that the memory being bound is only for the metadata aspect.
typedef VkFlags VkSparseMemoryBindFlags;
VkSparseMemoryBindFlags
is a bitmask type for setting a mask of zero
or more VkSparseMemoryBindFlagBits.
Memory is bound to VkBuffer
objects created with the
VK_BUFFER_CREATE_SPARSE_BINDING_BIT
flag using the following
structure:
typedef struct VkSparseBufferMemoryBindInfo {
VkBuffer buffer;
uint32_t bindCount;
const VkSparseMemoryBind* pBinds;
} VkSparseBufferMemoryBindInfo;
-
buffer
is the VkBuffer object to be bound. -
bindCount
is the number of VkSparseMemoryBind structures in thepBinds
array. -
pBinds
is a pointer to array of VkSparseMemoryBind structures.
Memory is bound to opaque regions of VkImage
objects created with the
VK_IMAGE_CREATE_SPARSE_BINDING_BIT
flag using the following structure:
typedef struct VkSparseImageOpaqueMemoryBindInfo {
VkImage image;
uint32_t bindCount;
const VkSparseMemoryBind* pBinds;
} VkSparseImageOpaqueMemoryBindInfo;
-
image
is the VkImage object to be bound. -
bindCount
is the number of VkSparseMemoryBind structures in thepBinds
array. -
pBinds
is a pointer to array of VkSparseMemoryBind structures.
Note
This operation is normally used to bind memory to fully-resident sparse images or for mip tail regions of partially resident images. However, it can also be used to bind memory for the entire binding range of partially resident images.
Memory can be bound to sparse image blocks of VkImage
objects created
with the VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
flag using the following
structure:
typedef struct VkSparseImageMemoryBindInfo {
VkImage image;
uint32_t bindCount;
const VkSparseImageMemoryBind* pBinds;
} VkSparseImageMemoryBindInfo;
-
image
is the VkImage object to be bound. -
bindCount
is the number of VkSparseImageMemoryBind structures in the pBinds array. -
pBinds
is a pointer to an array of VkSparseImageMemoryBind structures.
The VkSparseImageMemoryBind
structure is defined as:
typedef struct VkSparseImageMemoryBind {
VkImageSubresource subresource;
VkOffset3D offset;
VkExtent3D extent;
VkDeviceMemory memory;
VkDeviceSize memoryOffset;
VkSparseMemoryBindFlags flags;
} VkSparseImageMemoryBind;
-
subresource
is the aspectMask and region of interest in the image. -
offset
are the coordinates of the first texel within the image subresource to bind. -
extent
is the size in texels of the region within the image subresource to bind. The extent must be a multiple of the sparse image block dimensions, except when binding sparse image blocks along the edge of an image subresource it can instead be such that any coordinate of offset
+ extent
equals the corresponding dimensions of the image subresource. -
memory
is the VkDeviceMemory object that the sparse image blocks of the image are bound to. If memory
is VK_NULL_HANDLE, the sparse image blocks are unbound. -
memoryOffset
is an offset into the VkDeviceMemory object. If memory
is VK_NULL_HANDLE, this value is ignored. -
flags
are sparse memory binding flags.
To submit sparse binding operations to a queue, call:
VkResult vkQueueBindSparse(
VkQueue queue,
uint32_t bindInfoCount,
const VkBindSparseInfo* pBindInfo,
VkFence fence);
-
queue
is the queue that the sparse binding operations will be submitted to. -
bindInfoCount
is the number of elements in thepBindInfo
array. -
pBindInfo
is an array of VkBindSparseInfo structures, each specifying a sparse binding submission batch. -
fence
is an optional handle to a fence to be signaled. Iffence
is not VK_NULL_HANDLE, it defines a fence signal operation.
vkQueueBindSparse
is a queue submission
command, with each batch defined by an element of pBindInfo
as an
instance of the VkBindSparseInfo structure.
Batches begin execution in the order they appear in pBindInfo
, but
may complete out of order.
Within a batch, a given range of a resource must not be bound more than once. Across batches, if a range is to be bound to one allocation and offset and then to another allocation and offset, then the application must guarantee (usually using semaphores) that the binding operations are executed in the correct order, as well as to order binding operations against the execution of command buffer submissions.
As no operation to vkQueueBindSparse causes any pipeline stage to access memory, synchronization primitives used in this command effectively only define execution dependencies.
Additional information about fence and semaphore operation is described in the synchronization chapter.
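As a hedged sketch of these ordering rules (using the VkBindSparseInfo structure defined below; the device, sparseQueue, renderDoneSemaphore, bindDoneSemaphore, bindFence, and bufferBindInfo objects are assumptions created elsewhere), a rebind can be ordered against other queue work with semaphores and observed on the host with a fence:
VkBindSparseInfo bindInfo = {0};
bindInfo.sType                = VK_STRUCTURE_TYPE_BIND_SPARSE_INFO;
bindInfo.waitSemaphoreCount   = 1;
bindInfo.pWaitSemaphores      = &renderDoneSemaphore; // prior accesses complete
bindInfo.bufferBindCount      = 1;
bindInfo.pBufferBinds         = &bufferBindInfo;      // the rebind to perform
bindInfo.signalSemaphoreCount = 1;
bindInfo.pSignalSemaphores    = &bindDoneSemaphore;   // future accesses wait on this
vkQueueBindSparse(sparseQueue, 1, &bindInfo, bindFence);
// The host can wait on the fence before recycling the previously bound memory.
vkWaitForFences(device, 1, &bindFence, VK_TRUE, UINT64_MAX);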
The VkBindSparseInfo
structure is defined as:
typedef struct VkBindSparseInfo {
VkStructureType sType;
const void* pNext;
uint32_t waitSemaphoreCount;
const VkSemaphore* pWaitSemaphores;
uint32_t bufferBindCount;
const VkSparseBufferMemoryBindInfo* pBufferBinds;
uint32_t imageOpaqueBindCount;
const VkSparseImageOpaqueMemoryBindInfo* pImageOpaqueBinds;
uint32_t imageBindCount;
const VkSparseImageMemoryBindInfo* pImageBinds;
uint32_t signalSemaphoreCount;
const VkSemaphore* pSignalSemaphores;
} VkBindSparseInfo;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
waitSemaphoreCount
is the number of semaphores upon which to wait before executing the sparse binding operations for the batch. -
pWaitSemaphores
is a pointer to an array of semaphores upon which to wait on before the sparse binding operations for this batch begin execution. If semaphores to wait on are provided, they define a semaphore wait operation. -
bufferBindCount
is the number of sparse buffer bindings to perform in the batch. -
pBufferBinds
is a pointer to an array of VkSparseBufferMemoryBindInfo structures. -
imageOpaqueBindCount
is the number of opaque sparse image bindings to perform. -
pImageOpaqueBinds
is a pointer to an array of VkSparseImageOpaqueMemoryBindInfo structures, indicating opaque sparse image bindings to perform. -
imageBindCount
is the number of sparse image bindings to perform. -
pImageBinds
is a pointer to an array of VkSparseImageMemoryBindInfo structures, indicating sparse image bindings to perform. -
signalSemaphoreCount
is the number of semaphores to be signaled once the sparse binding operations specified by the structure have completed execution. -
pSignalSemaphores
is a pointer to an array of semaphores which will be signaled when the sparse binding operations for this batch have completed execution. If semaphores to be signaled are provided, they define a semaphore signal operation.
If the pNext
chain of VkBindSparseInfo includes a
VkDeviceGroupBindSparseInfo
structure, then that structure includes
device indices specifying which instance of the resources and memory are
bound.
The VkDeviceGroupBindSparseInfo
structure is defined as:
typedef struct VkDeviceGroupBindSparseInfo {
VkStructureType sType;
const void* pNext;
uint32_t resourceDeviceIndex;
uint32_t memoryDeviceIndex;
} VkDeviceGroupBindSparseInfo;
or the equivalent
typedef VkDeviceGroupBindSparseInfo VkDeviceGroupBindSparseInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
resourceDeviceIndex
is a device index indicating which instance of the resource is bound. -
memoryDeviceIndex
is a device index indicating which instance of the memory the resource instance is bound to.
These device indices apply to all buffer and image memory binds included in
the batch that points to this structure.
The semaphore waits and signals for the batch are executed only by the
physical device specified by the resourceDeviceIndex
.
If this structure is not present, resourceDeviceIndex
and
memoryDeviceIndex
are assumed to be zero.
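A minimal hedged sketch of chaining this structure into a bind, assuming a device group of at least two physical devices and a bufferBindInfo and sparseQueue prepared elsewhere:
// Bind the instance of the resource on device index 0 to the memory
// instance residing on device index 1.
VkDeviceGroupBindSparseInfo deviceGroupInfo = {0};
deviceGroupInfo.sType               = VK_STRUCTURE_TYPE_DEVICE_GROUP_BIND_SPARSE_INFO;
deviceGroupInfo.resourceDeviceIndex = 0;
deviceGroupInfo.memoryDeviceIndex   = 1;
VkBindSparseInfo bindInfo = {0};
bindInfo.sType           = VK_STRUCTURE_TYPE_BIND_SPARSE_INFO;
bindInfo.pNext           = &deviceGroupInfo;  // applies to every bind in this batch
bindInfo.bufferBindCount = 1;
bindInfo.pBufferBinds    = &bufferBindInfo;
vkQueueBindSparse(sparseQueue, 1, &bindInfo, VK_NULL_HANDLE);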
31.8. Examples
The following examples illustrate basic creation of sparse images and binding them to physical memory.
31.8.1. Basic Sparse Resources
This basic example creates a normal VkImage
object but uses
fine-grained memory allocation to back the resource with multiple memory
ranges.
VkDevice device;
VkQueue queue;
VkImage sparseImage;
VkAllocationCallbacks* pAllocator = NULL;
VkMemoryRequirements memoryRequirements = {};
VkDeviceSize offset = 0;
VkSparseMemoryBind binds[MAX_CHUNKS] = {}; // MAX_CHUNKS is NOT part of Vulkan
uint32_t bindCount = 0;
// ...
// Allocate image object
const VkImageCreateInfo sparseImageInfo =
{
VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, // sType
NULL, // pNext
VK_IMAGE_CREATE_SPARSE_BINDING_BIT | ..., // flags
...
};
vkCreateImage(device, &sparseImageInfo, pAllocator, &sparseImage);
// Get memory requirements
vkGetImageMemoryRequirements(
device,
sparseImage,
&memoryRequirements);
// Bind memory in fine-grained fashion, find available memory ranges
// from potentially multiple VkDeviceMemory pools.
// (Illustration purposes only, can be optimized for perf)
while (memoryRequirements.size && bindCount < MAX_CHUNKS)
{
VkSparseMemoryBind* pBind = &binds[bindCount];
pBind->resourceOffset = offset;
AllocateOrGetMemoryRange(
device,
&memoryRequirements,
&pBind->memory,
&pBind->memoryOffset,
&pBind->size);
// memory ranges must be sized as multiples of the alignment
assert(IsMultiple(pBind->size, memoryRequirements.alignment));
assert(IsMultiple(pBind->memoryOffset, memoryRequirements.alignment));
memoryRequirements.size -= pBind->size;
offset += pBind->size;
bindCount++;
}
// Ensure the entire image has backing
if (memoryRequirements.size)
{
// Error condition - too many chunks
}
const VkSparseImageOpaqueMemoryBindInfo opaqueBindInfo =
{
sparseImage, // image
bindCount, // bindCount
binds // pBinds
};
const VkBindSparseInfo bindSparseInfo =
{
VK_STRUCTURE_TYPE_BIND_SPARSE_INFO, // sType
NULL, // pNext
...
1, // imageOpaqueBindCount
&opaqueBindInfo, // pImageOpaqueBinds
...
};
// vkQueueBindSparse is externally synchronized per queue object.
AcquireQueueOwnership(queue);
// Actually bind memory
vkQueueBindSparse(queue, 1, &bindSparseInfo, VK_NULL_HANDLE);
ReleaseQueueOwnership(queue);
31.8.2. Advanced Sparse Resources
This more advanced example creates an arrayed color attachment / texture image and binds only LOD zero and the required metadata to physical memory.
VkDevice device;
VkQueue queue;
VkImage sparseImage;
VkAllocationCallbacks* pAllocator = NULL;
VkMemoryRequirements memoryRequirements = {};
uint32_t sparseRequirementsCount = 0;
VkSparseImageMemoryRequirements* pSparseReqs = NULL;
VkSparseMemoryBind binds[MY_IMAGE_ARRAY_SIZE] = {};
VkSparseImageMemoryBind imageBinds[MY_IMAGE_ARRAY_SIZE] = {};
uint32_t bindCount = 0;
// Allocate image object (both renderable and sampleable)
const VkImageCreateInfo sparseImageInfo =
{
VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, // sType
NULL, // pNext
VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT | ..., // flags
...
VK_FORMAT_R8G8B8A8_UNORM, // format
...
MY_IMAGE_ARRAY_SIZE, // arrayLayers
...
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
VK_IMAGE_USAGE_SAMPLED_BIT, // usage
...
};
vkCreateImage(device, &sparseImageInfo, pAllocator, &sparseImage);
// Get memory requirements
vkGetImageMemoryRequirements(
device,
sparseImage,
&memoryRequirements);
// Get sparse image aspect properties
vkGetImageSparseMemoryRequirements(
device,
sparseImage,
&sparseRequirementsCount,
NULL);
pSparseReqs = (VkSparseImageMemoryRequirements*)
malloc(sparseRequirementsCount * sizeof(VkSparseImageMemoryRequirements));
vkGetImageSparseMemoryRequirements(
device,
sparseImage,
&sparseRequirementsCount,
pSparseReqs);
// Bind LOD level 0 and any required metadata to memory
for (uint32_t i = 0; i < sparseRequirementsCount; ++i)
{
if (pSparseReqs[i].formatProperties.aspectMask &
VK_IMAGE_ASPECT_METADATA_BIT)
{
// Metadata must not be combined with other aspects
assert(pSparseReqs[i].formatProperties.aspectMask ==
VK_IMAGE_ASPECT_METADATA_BIT);
if (pSparseReqs[i].formatProperties.flags &
VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT)
{
VkSparseMemoryBind* pBind = &binds[bindCount];
pBind->size = pSparseReqs[i].imageMipTailSize;
bindCount++;
// ... Allocate memory range
pBind->resourceOffset = pSparseReqs[i].imageMipTailOffset;
pBind->memoryOffset = /* allocated memoryOffset */;
pBind->memory = /* allocated memory */;
pBind->flags = VK_SPARSE_MEMORY_BIND_METADATA_BIT;
}
else
{
// Need a mip tail region per array layer.
for (uint32_t a = 0; a < sparseImageInfo.arrayLayers; ++a)
{
VkSparseMemoryBind* pBind = &binds[bindCount];
pBind->size = pSparseReqs[i].imageMipTailSize;
bindCount++;
// ... Allocate memory range
pBind->resourceOffset = pSparseReqs[i].imageMipTailOffset +
(a * pSparseReqs[i].imageMipTailStride);
pBind->memoryOffset = /* allocated memoryOffset */;
pBind->memory = /* allocated memory */;
pBind->flags = VK_SPARSE_MEMORY_BIND_METADATA_BIT;
}
}
}
else
{
// resource data
VkExtent3D lod0BlockSize =
{
AlignedDivide(
sparseImageInfo.extent.width,
pSparseReqs[i].formatProperties.imageGranularity.width),
AlignedDivide(
sparseImageInfo.extent.height,
pSparseReqs[i].formatProperties.imageGranularity.height),
AlignedDivide(
sparseImageInfo.extent.depth,
pSparseReqs[i].formatProperties.imageGranularity.depth)
};
size_t totalBlocks =
lod0BlockSize.width *
lod0BlockSize.height *
lod0BlockSize.depth;
// Each block is the same size as the alignment requirement,
// calculate total memory size for level 0
VkDeviceSize lod0MemSize = totalBlocks * memoryRequirements.alignment;
// Allocate memory for each array layer
for (uint32_t a = 0; a < sparseImageInfo.arrayLayers; ++a)
{
// ... Allocate memory range
VkSparseImageMemoryBind* pBind = &imageBinds[a];
pBind->subresource.aspectMask = pSparseReqs[i].formatProperties.aspectMask;
pBind->subresource.mipLevel = 0;
pBind->subresource.arrayLayer = a;
pBind->offset = (VkOffset3D){0, 0, 0};
pBind->extent = sparseImageInfo.extent;
pBind->memoryOffset = /* allocated memoryOffset */;
pBind->memory = /* allocated memory */;
pBind->flags = 0;
}
}
}
free(pSparseReqs);
const VkSparseImageOpaqueMemoryBindInfo opaqueBindInfo =
{
sparseImage, // image
bindCount, // bindCount
binds // pBinds
};
const VkSparseImageMemoryBindInfo imageBindInfo =
{
sparseImage, // image
sparseImageInfo.arrayLayers, // bindCount
imageBinds // pBinds
};
const VkBindSparseInfo bindSparseInfo =
{
VK_STRUCTURE_TYPE_BIND_SPARSE_INFO, // sType
NULL, // pNext
...
1, // imageOpaqueBindCount
&opaqueBindInfo, // pImageOpaqueBinds
1, // imageBindCount
&imageBindInfo, // pImageBinds
...
};
// vkQueueBindSparse is externally synchronized per queue object.
AcquireQueueOwnership(queue);
// Actually bind memory
vkQueueBindSparse(queue, 1, &bindSparseInfo, VK_NULL_HANDLE);
ReleaseQueueOwnership(queue);
32. Window System Integration (WSI)
This chapter discusses the window system integration (WSI) between the Vulkan API and the various forms of displaying the results of rendering to a user. Since the Vulkan API can be used without displaying results, WSI is provided through the use of optional Vulkan extensions. This chapter provides an overview of WSI. See the appendix for additional details of each WSI extension, including which extensions must be enabled in order to use each of the functions described in this chapter.
32.1. WSI Platform
A platform is an abstraction for a window system, OS, etc. Some examples include MS Windows, Android, and Wayland. The Vulkan API may be integrated in a unique manner for each platform.
The Vulkan API does not define any type of platform object. Platform-specific WSI extensions are defined, which contain platform-specific functions for using WSI. Use of these extensions is guarded by preprocessor symbols as defined in the Window System-Specific Header Control appendix.
In order for an application to be compiled to use WSI with a given platform, it must either:
-
#define the appropriate preprocessor symbol prior to including the
vulkan.h
header file, or -
include
vulkan_core.h
and any native platform headers, followed by the appropriate platform-specific header.
The preprocessor symbols and platform-specific headers are defined in the Window System Extensions and Headers table.
Each platform-specific extension is an instance extension.
The application must enable instance extensions with vkCreateInstance
before using them.
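For illustration, a hedged sketch of enabling the surface extension together with one platform extension at instance creation (Win32 is chosen arbitrarily here; substitute the extension for the target platform, and note that other creation parameters are elided):
const char* instanceExtensions[] = {
    VK_KHR_SURFACE_EXTENSION_NAME,
    VK_KHR_WIN32_SURFACE_EXTENSION_NAME,  // pick the extension for your platform
};
VkInstanceCreateInfo instanceInfo = {0};
instanceInfo.sType                   = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
instanceInfo.enabledExtensionCount   = 2;
instanceInfo.ppEnabledExtensionNames = instanceExtensions;
VkInstance instance;
vkCreateInstance(&instanceInfo, NULL, &instance);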
32.2. WSI Surface
Native platform surface or window objects are abstracted by surface objects,
which are represented by VkSurfaceKHR
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSurfaceKHR)
The VK_KHR_surface
extension declares the VkSurfaceKHR
object, and
provides a function for destroying VkSurfaceKHR
objects.
Separate platform-specific extensions each provide a function for creating a
VkSurfaceKHR
object for the respective platform.
From the application’s perspective this is an opaque handle, just like the
handles of other Vulkan objects.
Note
On certain platforms, the Vulkan loader and ICDs may have conventions that
treat the handle as a pointer to a struct that contains the
platform-specific information about the surface.
This will be described in the documentation for the loader-ICD interface.
32.2.1. Android Platform
To create a VkSurfaceKHR
object for an Android native window, call:
VkResult vkCreateAndroidSurfaceKHR(
VkInstance instance,
const VkAndroidSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance to associate the surface with. -
pCreateInfo
is a pointer to an instance of theVkAndroidSurfaceCreateInfoKHR
structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
During the lifetime of a surface created using a particular
ANativeWindow
handle any attempts to create another surface for the
same ANativeWindow
and any attempts to connect to the same
ANativeWindow
through other platform mechanisms will fail.
Note
In particular, only one VkSurfaceKHR can exist at a time for a given window. Similarly, a native window cannot be used by both a VkSurfaceKHR and EGLSurface simultaneously.
If successful, vkCreateAndroidSurfaceKHR
increments the
ANativeWindow
’s reference count, and vkDestroySurfaceKHR
will
decrement it.
On Android, when a swapchain’s imageExtent
does not match the
surface’s currentExtent
, the presentable images will be scaled to the
surface’s dimensions during presentation.
minImageExtent
is (1,1), and maxImageExtent
is the maximum
image size supported by the consumer.
For the system compositor, currentExtent
is the window size (i.e. the
consumer’s preferred size).
The VkAndroidSurfaceCreateInfoKHR
structure is defined as:
typedef struct VkAndroidSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkAndroidSurfaceCreateFlagsKHR flags;
struct ANativeWindow* window;
} VkAndroidSurfaceCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
window
is a pointer to theANativeWindow
to associate the surface with.
To remove an unnecessary compile-time dependency, an incomplete type
definition of ANativeWindow
is provided in the Vulkan headers:
struct ANativeWindow;
The actual ANativeWindow
type is defined in Android NDK headers.
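A hedged sketch of surface creation on this platform, assuming an instance and an ANativeWindow* named nativeWindow (an assumed name) already exist:
VkAndroidSurfaceCreateInfoKHR surfaceInfo = {0};
surfaceInfo.sType  = VK_STRUCTURE_TYPE_ANDROID_SURFACE_CREATE_INFO_KHR;
surfaceInfo.window = nativeWindow;
VkSurfaceKHR surface;
VkResult result = vkCreateAndroidSurfaceKHR(instance, &surfaceInfo, NULL, &surface);
if (result != VK_SUCCESS)
{
    // e.g. the window is already connected to another surface or API
}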
32.2.2. Wayland Platform
To create a VkSurfaceKHR
object for a Wayland surface, call:
VkResult vkCreateWaylandSurfaceKHR(
VkInstance instance,
const VkWaylandSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance to associate the surface with. -
pCreateInfo
is a pointer to an instance of the VkWaylandSurfaceCreateInfoKHR structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
The VkWaylandSurfaceCreateInfoKHR
structure is defined as:
typedef struct VkWaylandSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkWaylandSurfaceCreateFlagsKHR flags;
struct wl_display* display;
struct wl_surface* surface;
} VkWaylandSurfaceCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
display
andsurface
are pointers to the Waylandwl_display
andwl_surface
to associate the surface with.
On Wayland, currentExtent
is the special value (0xFFFFFFFF,
0xFFFFFFFF), indicating that the surface size will be determined by the
extent of a swapchain targeting the surface.
Whatever the application sets a swapchain’s imageExtent
to will be the
size of the window, after the first image is presented.
minImageExtent
is (1,1), and maxImageExtent
is the maximum
supported surface size.
Any calls to vkGetPhysicalDeviceSurfacePresentModesKHR on a surface
created with vkCreateWaylandSurfaceKHR
are required to return
VK_PRESENT_MODE_MAILBOX_KHR
as one of the valid present modes.
Some Vulkan functions may send protocol over the specified wl_display
connection when using a swapchain or presentable images created from a
VkSurfaceKHR
referring to a wl_surface
.
Applications must therefore ensure that both the wl_display
and the
wl_surface
remain valid for the lifetime of any VkSwapchainKHR
objects created from a particular wl_display
and wl_surface
.
Also, calling vkQueuePresentKHR will result in Vulkan sending
wl_surface
.commit requests to the underlying wl_surface
of each
VkSwapchainKHR
object referenced by pPresentInfo
.
If the swapchain is created with a present mode of
VK_PRESENT_MODE_MAILBOX_KHR
or VK_PRESENT_MODE_IMMEDIATE_KHR
,
then the corresponding wl_surface
.attach, wl_surface
.damage, and
wl_surface
.commit request must be issued by the implementation during
the call to vkQueuePresentKHR and must not be issued by the
implementation outside of vkQueuePresentKHR.
This ensures that any Wayland requests sent by the client after the call to
vkQueuePresentKHR returns will be received by the compositor after the
wl_surface
.commit.
Regardless of the mode of swapchain creation, a new wl_event_queue
must be created for each successful vkCreateWaylandSurfaceKHR call,
and every Wayland object created by the implementation must be assigned to
this event queue.
If the platform provides Wayland 1.11 or greater, this must be implemented
by the use of Wayland proxy object wrappers, to avoid race conditions.
If the application wishes to synchronize any window changes with a
particular frame, such requests must be sent to the Wayland display server
prior to calling vkQueuePresentKHR.
For full control over interactions between Vulkan rendering and other
Wayland protocol requests and events, a present mode of
VK_PRESENT_MODE_MAILBOX_KHR
should be used.
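A hedged sketch of surface creation on this platform, assuming an instance, a wl_display* named waylandDisplay, and a wl_surface* named waylandSurface (assumed names) already exist and will outlive any swapchain created from the resulting surface:
VkWaylandSurfaceCreateInfoKHR surfaceInfo = {0};
surfaceInfo.sType   = VK_STRUCTURE_TYPE_WAYLAND_SURFACE_CREATE_INFO_KHR;
surfaceInfo.display = waylandDisplay;
surfaceInfo.surface = waylandSurface;
VkSurfaceKHR surface;
vkCreateWaylandSurfaceKHR(instance, &surfaceInfo, NULL, &surface);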
32.2.3. Win32 Platform
To create a VkSurfaceKHR
object for a Win32 window, call:
VkResult vkCreateWin32SurfaceKHR(
VkInstance instance,
const VkWin32SurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance to associate the surface with. -
pCreateInfo
is a pointer to an instance of theVkWin32SurfaceCreateInfoKHR
structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
The VkWin32SurfaceCreateInfoKHR
structure is defined as:
typedef struct VkWin32SurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkWin32SurfaceCreateFlagsKHR flags;
HINSTANCE hinstance;
HWND hwnd;
} VkWin32SurfaceCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
hinstance
andhwnd
are the Win32HINSTANCE
andHWND
for the window to associate the surface with.
With Win32, minImageExtent
, maxImageExtent
, and
currentExtent
must always equal the window size.
The currentExtent
of a Win32 surface must have both width
and
height
greater than 0, or both of them 0.
Note
Due to above restrictions, it is only possible to create a new swapchain on
this platform with imageExtent being equal to the current size of the window.
The window size may become (0, 0) on this platform (e.g. when the window is minimized), and so a swapchain cannot be created until the size changes.
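A hedged sketch of surface creation on this platform, assuming an instance and the window’s HINSTANCE and HWND (named hInstance and hWnd here as assumptions) already exist:
VkWin32SurfaceCreateInfoKHR surfaceInfo = {0};
surfaceInfo.sType     = VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR;
surfaceInfo.hinstance = hInstance;
surfaceInfo.hwnd      = hWnd;
VkSurfaceKHR surface;
vkCreateWin32SurfaceKHR(instance, &surfaceInfo, NULL, &surface);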
32.2.4. XCB Platform
To create a VkSurfaceKHR
object for an X11 window, using the XCB
client-side library, call:
VkResult vkCreateXcbSurfaceKHR(
VkInstance instance,
const VkXcbSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance to associate the surface with. -
pCreateInfo
is a pointer to an instance of theVkXcbSurfaceCreateInfoKHR
structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
The VkXcbSurfaceCreateInfoKHR
structure is defined as:
typedef struct VkXcbSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXcbSurfaceCreateFlagsKHR flags;
xcb_connection_t* connection;
xcb_window_t window;
} VkXcbSurfaceCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
connection
is a pointer to anxcb_connection_t
to the X server. -
window
is thexcb_window_t
for the X11 window to associate the surface with.
With Xcb, minImageExtent
, maxImageExtent
, and
currentExtent
must always equal the window size.
The currentExtent
of an Xcb surface must have both width
and
height
greater than 0, or both of them 0.
Note
Due to above restrictions, it is only possible to create a new swapchain on
this platform with imageExtent being equal to the current size of the window.
The window size may become (0, 0) on this platform (e.g. when the window is minimized), and so a swapchain cannot be created until the size changes.
Some Vulkan functions may send protocol over the specified xcb connection when using a swapchain or presentable images created from a VkSurfaceKHR referring to an xcb window. Applications must therefore ensure the xcb connection is available to Vulkan for the duration of any functions that manipulate such swapchains or their presentable images, and any functions that build or queue command buffers that operate on such presentable images. Specifically, applications using Vulkan with xcb-based swapchains must
-
Avoid holding a server grab on an xcb connection while waiting for Vulkan operations to complete using a swapchain derived from a different xcb connection referring to the same X server instance. Failing to do so may result in deadlock.
32.2.5. Xlib Platform
To create a VkSurfaceKHR
object for an X11 window, using the Xlib
client-side library, call:
VkResult vkCreateXlibSurfaceKHR(
VkInstance instance,
const VkXlibSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance to associate the surface with. -
pCreateInfo
is a pointer to an instance of theVkXlibSurfaceCreateInfoKHR
structure containing the parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
The VkXlibSurfaceCreateInfoKHR
structure is defined as:
typedef struct VkXlibSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXlibSurfaceCreateFlagsKHR flags;
Display* dpy;
Window window;
} VkXlibSurfaceCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
dpy
is a pointer to an XlibDisplay
connection to the X server. -
window
is an XlibWindow
to associate the surface with.
With Xlib, minImageExtent
, maxImageExtent
, and
currentExtent
must always equal the window size.
The currentExtent
of an Xlib surface must have both width
and
height
greater than 0, or both of them 0.
Note
Due to above restrictions, it is only possible to create a new swapchain on
this platform with imageExtent being equal to the current size of the window.
The window size may become (0, 0) on this platform (e.g. when the window is minimized), and so a swapchain cannot be created until the size changes.
Some Vulkan functions may send protocol over the specified Xlib
Display
connection when using a swapchain or presentable images created
from a VkSurfaceKHR referring to an Xlib window.
Applications must therefore ensure the display connection is available to
Vulkan for the duration of any functions that manipulate such swapchains or
their presentable images, and any functions that build or queue command
buffers that operate on such presentable images.
Specifically, applications using Vulkan with Xlib-based swapchains must
-
Avoid holding a server grab on a display connection while waiting for Vulkan operations to complete using a swapchain derived from a different display connection referring to the same X server instance. Failing to do so may result in deadlock.
Some implementations may require threads to implement some presentation
modes so applications must call XInitThreads
() before calling any
other Xlib functions.
32.2.6. Fuchsia Platform
To create a VkSurfaceKHR
object for a Fuchsia ImagePipe, call:
VkResult vkCreateImagePipeSurfaceFUCHSIA(
VkInstance instance,
const VkImagePipeSurfaceCreateInfoFUCHSIA* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance to associate with the surface. -
pCreateInfo
is a pointer to an instance of the VkImagePipeSurfaceCreateInfoFUCHSIA structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
The VkImagePipeSurfaceCreateInfoFUCHSIA
structure is defined as:
typedef struct VkImagePipeSurfaceCreateInfoFUCHSIA {
VkStructureType sType;
const void* pNext;
VkImagePipeSurfaceCreateFlagsFUCHSIA flags;
zx_handle_t imagePipeHandle;
} VkImagePipeSurfaceCreateInfoFUCHSIA;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
imagePipeHandle
is azx_handle_t
referring to the ImagePipe to associate with the surface.
On Fuchsia, the surface currentExtent
is the special value
(0xFFFFFFFF, 0xFFFFFFFF), indicating that the surface size will be
determined by the extent of a swapchain targeting the surface.
32.2.7. iOS Platform
To create a VkSurfaceKHR
object for an iOS UIView
, call:
VkResult vkCreateIOSSurfaceMVK(
VkInstance instance,
const VkIOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance with which to associate the surface. -
pCreateInfo
is a pointer to an instance of the VkIOSSurfaceCreateInfoMVK structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
The VkIOSSurfaceCreateInfoMVK structure is defined as:
typedef struct VkIOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkIOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkIOSSurfaceCreateInfoMVK;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
pView
is a reference to aUIView
object which will display this surface. ThisUIView
must be backed by aCALayer
instance of typeCAMetalLayer
.
32.2.8. macOS Platform
To create a VkSurfaceKHR
object for a macOS NSView
, call:
VkResult vkCreateMacOSSurfaceMVK(
VkInstance instance,
const VkMacOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance with which to associate the surface. -
pCreateInfo
is a pointer to an instance of the VkMacOSSurfaceCreateInfoMVK structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
The VkMacOSSurfaceCreateInfoMVK structure is defined as:
typedef struct VkMacOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkMacOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkMacOSSurfaceCreateInfoMVK;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
pView
is a reference to aNSView
object which will display this surface. ThisNSView
must be backed by aCALayer
instance of typeCAMetalLayer
.
32.2.9. VI Platform
To create a VkSurfaceKHR
object for an nn
::vi
::Layer
,
query the layer’s native handle using
nn
::vi
::GetNativeWindow
, and then call:
VkResult vkCreateViSurfaceNN(
VkInstance instance,
const VkViSurfaceCreateInfoNN* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
-
instance
is the instance with which to associate the surface. -
pCreateInfo
is a pointer to an instance of theVkViSurfaceCreateInfoNN
structure containing parameters affecting the creation of the surface object. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation). -
pSurface
points to a VkSurfaceKHR handle in which the created surface object is returned.
During the lifetime of a surface created using a particular
nn
::vi
::NativeWindowHandle
any attempts to create another
surface for the same nn
::vi
::Layer
and any attempts to
connect to the same nn
::vi
::Layer
through other platform
mechanisms will have undefined results.
If the native window is created with a specified size, currentExtent
will reflect that size.
In this case, applications should use the same size for the swapchain’s
imageExtent
.
Otherwise, the currentExtent
will have the special value
(0xFFFFFFFF, 0xFFFFFFFF), indicating that applications are expected to
choose an appropriate size for the swapchain’s imageExtent
(e.g., by
matching the result of a call to
nn
::vi
::GetDisplayResolution
).
The VkViSurfaceCreateInfoNN
structure is defined as:
typedef struct VkViSurfaceCreateInfoNN {
VkStructureType sType;
const void* pNext;
VkViSurfaceCreateFlagsNN flags;
void* window;
} VkViSurfaceCreateInfoNN;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
flags
is reserved for future use. -
window
is thenn
::vi
::NativeWindowHandle
for thenn
::vi
::Layer
with which to associate the surface.
32.2.10. Platform-Independent Information
Once created, VkSurfaceKHR
objects can be used in this and other
extensions, in particular the VK_KHR_swapchain
extension.
Several WSI functions return VK_ERROR_SURFACE_LOST_KHR
if the surface
becomes no longer available.
After such an error, the surface (and any child swapchain, if one exists)
should be destroyed, as there is no way to restore them to a not-lost
state.
Applications may attempt to create a new VkSurfaceKHR
using the same
native platform window object, but whether such re-creation will succeed is
platform-dependent and may depend on the reason the surface became
unavailable.
A lost surface does not otherwise cause devices to be
lost.
To destroy a VkSurfaceKHR
object, call:
void vkDestroySurfaceKHR(
VkInstance instance,
VkSurfaceKHR surface,
const VkAllocationCallbacks* pAllocator);
-
instance
is the instance used to create the surface. -
surface
is the surface to destroy. -
pAllocator
is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation).
Destroying a VkSurfaceKHR
merely severs the connection between Vulkan
and the native surface, and does not imply destroying the native surface,
closing a window, or similar behavior.
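A hedged sketch of the resulting teardown order, assuming a swapchain was created from the surface (child objects first, then the surface; the native window is cleaned up separately by the application or window system):
vkDestroySwapchainKHR(device, swapchain, NULL);  // assumption: created from 'surface'
vkDestroySurfaceKHR(instance, surface, NULL);
// The native window (HWND, wl_surface, ANativeWindow, ...) remains owned by
// the application or windowing system and is destroyed through that system.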
32.3. Presenting Directly to Display Devices
In some environments applications can also present Vulkan rendering
directly to display devices without using an intermediate windowing system.
This can be useful for embedded applications, or implementing the
rendering/presentation backend of a windowing system using Vulkan.
The VK_KHR_display
extension provides the functionality necessary to
enumerate display devices and create VkSurfaceKHR
objects that target
displays.
32.3.1. Display Enumeration
Displays are represented by VkDisplayKHR
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDisplayKHR)
Various functions are provided for enumerating the available display devices present on a Vulkan physical device. To query information about the available displays, call:
VkResult vkGetPhysicalDeviceDisplayPropertiesKHR(
VkPhysicalDevice physicalDevice,
uint32_t* pPropertyCount,
VkDisplayPropertiesKHR* pProperties);
-
physicalDevice
is a physical device. -
pPropertyCount
is a pointer to an integer related to the number of display devices available or queried, as described below. -
pProperties
is eitherNULL
or a pointer to an array ofVkDisplayPropertiesKHR
structures.
If pProperties
is NULL
, then the number of display devices available
for physicalDevice
is returned in pPropertyCount
.
Otherwise, pPropertyCount
must point to a variable set by the user to
the number of elements in the pProperties
array, and on return the
variable is overwritten with the number of structures actually written to
pProperties
.
If the value of pPropertyCount
is less than the number of display
devices for physicalDevice
, at most pPropertyCount
structures
will be written.
If pPropertyCount
is smaller than the number of display devices
available for physicalDevice
, VK_INCOMPLETE
will be returned
instead of VK_SUCCESS
to indicate that not all the available values
were returned.
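A hedged sketch of this enumeration pattern, following the allocation style of the examples in Section 31.8 (error handling is omitted; pDisplayProps is an illustrative name):
uint32_t displayCount = 0;
vkGetPhysicalDeviceDisplayPropertiesKHR(physicalDevice, &displayCount, NULL);
VkDisplayPropertiesKHR* pDisplayProps = (VkDisplayPropertiesKHR*)
    malloc(displayCount * sizeof(VkDisplayPropertiesKHR));
VkResult result = vkGetPhysicalDeviceDisplayPropertiesKHR(
    physicalDevice, &displayCount, pDisplayProps);
// result is VK_INCOMPLETE if more displays are available than the array can
// hold; displayCount holds the number of structures actually written.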
The VkDisplayPropertiesKHR
structure is defined as:
typedef struct VkDisplayPropertiesKHR {
VkDisplayKHR display;
const char* displayName;
VkExtent2D physicalDimensions;
VkExtent2D physicalResolution;
VkSurfaceTransformFlagsKHR supportedTransforms;
VkBool32 planeReorderPossible;
VkBool32 persistentContent;
} VkDisplayPropertiesKHR;
-
display
is a handle that is used to refer to the display described here. This handle will be valid for the lifetime of the Vulkan instance. -
displayName
is a pointer to a NULL-terminated string containing the name of the display. Generally, this will be the name provided by the display’s EDID. It can beNULL
if no suitable name is available. If notNULL
, the memory it points to must remain accessible as long asdisplay
is valid. -
physicalDimensions
describes the physical width and height of the visible portion of the display, in millimeters. -
physicalResolution
describes the physical, native, or preferred resolution of the display.
Note
For devices which have no natural value to return here, implementations should return the maximum resolution supported.
-
supportedTransforms
is a bitmask of VkSurfaceTransformFlagBitsKHR describing which transforms are supported by this display. -
planeReorderPossible
tells whether the planes on this display can have their z order changed. If this isVK_TRUE
, the application can re-arrange the planes on this display in any order relative to each other. -
persistentContent
tells whether the display supports self-refresh/internal buffering. If this is true, the application can submit persistent present operations on swapchains created against this display.
Note
Persistent presents may have higher latency, and may use less power when the screen content is updated infrequently, or when only a portion of the screen needs to be updated in most frames.
To query information about the available displays, call:
VkResult vkGetPhysicalDeviceDisplayProperties2KHR(
VkPhysicalDevice physicalDevice,
uint32_t* pPropertyCount,
VkDisplayProperties2KHR* pProperties);
-
physicalDevice
is a physical device. -
pPropertyCount
is a pointer to an integer related to the number of display devices available or queried, as described below. -
pProperties
is eitherNULL
or a pointer to an array ofVkDisplayProperties2KHR
structures.
vkGetPhysicalDeviceDisplayProperties2KHR
behaves similarly to
vkGetPhysicalDeviceDisplayPropertiesKHR, with the ability to return
extended information via chained output structures.
The VkDisplayProperties2KHR
structure is defined as:
typedef struct VkDisplayProperties2KHR {
VkStructureType sType;
void* pNext;
VkDisplayPropertiesKHR displayProperties;
} VkDisplayProperties2KHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
displayProperties
is an instance of the VkDisplayPropertiesKHR structure.
Acquiring and Releasing Displays
On some platforms, access to displays is limited to a single process or native driver instance. On such platforms, some or all of the displays may not be available to Vulkan if they are already in use by a native windowing system or other application.
To acquire permission to directly access a display in Vulkan from an X11 server, call:
VkResult vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
-
physicalDevice
The physical device the display is on. -
dpy
A connection to the X11 server that currently ownsdisplay
. -
display
The display the caller wishes to control in Vulkan.
All permissions necessary to control the display are granted to the Vulkan
instance associated with physicalDevice
until the display is released
or the X11 connection specified by dpy
is terminated.
Permission to access the display may be temporarily revoked during periods
when the X11 server from which control was acquired itself loses access to
display
.
During such periods, operations which require access to the display must
fail with an appropriate error code.
If the X11 server associated with dpy
does not own display
, or
if permission to access it has already been acquired by another entity, the
call must return the error code VK_ERROR_INITIALIZATION_FAILED
.
Note
One example of when an X11 server loses access to a display is when it loses ownership of its virtual terminal.
When acquiring displays from an X11 server, an application may also wish to
enumerate and identify them using a native handle rather than a
VkDisplayKHR
handle.
To determine the VkDisplayKHR
handle corresponding to an X11 RandR
Output, call:
VkResult vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
-
physicalDevice
The physical device to query the display handle on. -
dpy
A connection to the X11 server from whichrrOutput
was queried. -
rrOutput
An X11 RandR output ID. -
pDisplay
The corresponding VkDisplayKHR handle will be returned here.
If there is no VkDisplayKHR corresponding to rrOutput
on
physicalDevice
, VK_NULL_HANDLE must be returned in
pDisplay
.
To release a previously acquired display, call:
VkResult vkReleaseDisplayEXT(
VkPhysicalDevice physicalDevice,
VkDisplayKHR display);
-
physicalDevice
The physical device the display is on. -
display
The display to release control of.
Display Planes
Images are presented to individual planes on a display. Devices must support at least one plane on each display. Planes can be stacked and blended to composite multiple images on one display. Devices may support only a fixed stacking order and fixed mapping between planes and displays, or they may allow arbitrary application specified stacking orders and mappings between planes and displays. To query the properties of device display planes, call:
VkResult vkGetPhysicalDeviceDisplayPlanePropertiesKHR(
VkPhysicalDevice physicalDevice,
uint32_t* pPropertyCount,
VkDisplayPlanePropertiesKHR* pProperties);
-
physicalDevice
is a physical device. -
pPropertyCount
is a pointer to an integer related to the number of display planes available or queried, as described below. -
pProperties
is eitherNULL
or a pointer to an array ofVkDisplayPlanePropertiesKHR
structures.
If pProperties
is NULL
, then the number of display planes available
for physicalDevice
is returned in pPropertyCount
.
Otherwise, pPropertyCount
must point to a variable set by the user to
the number of elements in the pProperties
array, and on return the
variable is overwritten with the number of structures actually written to
pProperties
.
If the value of pPropertyCount
is less than the number of display
planes for physicalDevice
, at most pPropertyCount
structures
will be written.
The VkDisplayPlanePropertiesKHR
structure is defined as:
typedef struct VkDisplayPlanePropertiesKHR {
VkDisplayKHR currentDisplay;
uint32_t currentStackIndex;
} VkDisplayPlanePropertiesKHR;
-
currentDisplay
is the handle of the display the plane is currently associated with. If the plane is not currently attached to any displays, this will be VK_NULL_HANDLE. -
currentStackIndex
is the current z-order of the plane. This will be between 0 and the value returned by vkGetPhysicalDeviceDisplayPlanePropertiesKHR
in pPropertyCount
.
To query the properties of a device’s display planes, call:
VkResult vkGetPhysicalDeviceDisplayPlaneProperties2KHR(
VkPhysicalDevice physicalDevice,
uint32_t* pPropertyCount,
VkDisplayPlaneProperties2KHR* pProperties);
-
physicalDevice
is a physical device. -
pPropertyCount
is a pointer to an integer related to the number of display planes available or queried, as described below. -
pProperties
is eitherNULL
or a pointer to an array ofVkDisplayPlaneProperties2KHR
structures.
vkGetPhysicalDeviceDisplayPlaneProperties2KHR
behaves similarly to
vkGetPhysicalDeviceDisplayPlanePropertiesKHR, with the ability to
return extended information via chained output structures.
The VkDisplayPlaneProperties2KHR
structure is defined as:
typedef struct VkDisplayPlaneProperties2KHR {
VkStructureType sType;
void* pNext;
VkDisplayPlanePropertiesKHR displayPlaneProperties;
} VkDisplayPlaneProperties2KHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
displayPlaneProperties
is an instance of the VkDisplayPlanePropertiesKHR structure.
To determine which displays a plane is usable with, call
VkResult vkGetDisplayPlaneSupportedDisplaysKHR(
VkPhysicalDevice physicalDevice,
uint32_t planeIndex,
uint32_t* pDisplayCount,
VkDisplayKHR* pDisplays);
-
physicalDevice
is a physical device. -
planeIndex
is the plane which the application wishes to use, and must be in the range [0, physical device plane count - 1]. -
pDisplayCount
is a pointer to an integer related to the number of displays available or queried, as described below. -
pDisplays
is eitherNULL
or a pointer to an array ofVkDisplayKHR
handles.
If pDisplays
is NULL
, then the number of displays usable with the
specified planeIndex
for physicalDevice
is returned in
pDisplayCount
.
Otherwise, pDisplayCount
must point to a variable set by the user to
the number of elements in the pDisplays
array, and on return the
variable is overwritten with the number of handles actually written to
pDisplays
.
If the value of pDisplayCount
is less than the number of display
planes for physicalDevice
, at most pDisplayCount
handles will be
written.
If pDisplayCount
is smaller than the number of displays usable with
the specified planeIndex
for physicalDevice
, VK_INCOMPLETE
will be returned instead of VK_SUCCESS
to indicate that not all the
available values were returned.
Additional properties of displays are queried using specialized query functions.
Display Modes
Display modes are represented by VkDisplayModeKHR
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDisplayModeKHR)
Each display has one or more supported modes associated with it by default. These built-in modes are queried by calling:
VkResult vkGetDisplayModePropertiesKHR(
VkPhysicalDevice physicalDevice,
VkDisplayKHR display,
uint32_t* pPropertyCount,
VkDisplayModePropertiesKHR* pProperties);
- physicalDevice is the physical device associated with display.
- display is the display to query.
- pPropertyCount is a pointer to an integer related to the number of display modes available or queried, as described below.
- pProperties is either NULL or a pointer to an array of VkDisplayModePropertiesKHR structures.
If pProperties
is NULL
, then the number of display modes available
on the specified display
for physicalDevice
is returned in
pPropertyCount
.
Otherwise, pPropertyCount
must point to a variable set by the user to
the number of elements in the pProperties
array, and on return the
variable is overwritten with the number of structures actually written to
pProperties
.
If the value of pPropertyCount
is less than the number of display
modes for physicalDevice
, at most pPropertyCount
structures will
be written.
If pPropertyCount
is smaller than the number of display modes
available on the specified display
for physicalDevice
,
VK_INCOMPLETE
will be returned instead of VK_SUCCESS
to indicate
that not all the available values were returned.
The VkDisplayModePropertiesKHR
structure is defined as:
typedef struct VkDisplayModePropertiesKHR {
VkDisplayModeKHR displayMode;
VkDisplayModeParametersKHR parameters;
} VkDisplayModePropertiesKHR;
- displayMode is a handle to the display mode described in this structure. This handle will be valid for the lifetime of the Vulkan instance.
- parameters is a VkDisplayModeParametersKHR structure describing the display parameters associated with displayMode.
To query the properties of a device’s built-in display modes, call:
VkResult vkGetDisplayModeProperties2KHR(
VkPhysicalDevice physicalDevice,
VkDisplayKHR display,
uint32_t* pPropertyCount,
VkDisplayModeProperties2KHR* pProperties);
- physicalDevice is the physical device associated with display.
- display is the display to query.
- pPropertyCount is a pointer to an integer related to the number of display modes available or queried, as described below.
- pProperties is either NULL or a pointer to an array of VkDisplayModeProperties2KHR structures.
vkGetDisplayModeProperties2KHR
behaves similarly to
vkGetDisplayModePropertiesKHR, with the ability to return extended
information via chained output structures.
The VkDisplayModeProperties2KHR
structure is defined as:
typedef struct VkDisplayModeProperties2KHR {
VkStructureType sType;
void* pNext;
VkDisplayModePropertiesKHR displayModeProperties;
} VkDisplayModeProperties2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- displayModeProperties is an instance of the VkDisplayModePropertiesKHR structure.
The VkDisplayModeParametersKHR
structure is defined as:
typedef struct VkDisplayModeParametersKHR {
VkExtent2D visibleRegion;
uint32_t refreshRate;
} VkDisplayModeParametersKHR;
- visibleRegion is the 2D extents of the visible region.
- refreshRate is a uint32_t that is the number of times the display is refreshed each second multiplied by 1000.
Note
For example, a 60Hz display mode would report a refreshRate of 60,000.
Additional modes may also be created by calling:
VkResult vkCreateDisplayModeKHR(
VkPhysicalDevice physicalDevice,
VkDisplayKHR display,
const VkDisplayModeCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDisplayModeKHR* pMode);
- physicalDevice is the physical device associated with display.
- display is the display to create an additional mode for.
- pCreateInfo is a VkDisplayModeCreateInfoKHR structure describing the new mode to create.
- pAllocator is the allocator used for host memory allocated for the display mode object when there is no more specific allocator available (see Memory Allocation).
- pMode returns the handle of the mode created.
The VkDisplayModeCreateInfoKHR
structure is defined as:
typedef struct VkDisplayModeCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkDisplayModeCreateFlagsKHR flags;
VkDisplayModeParametersKHR parameters;
} VkDisplayModeCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use, and must be zero.
- parameters is a VkDisplayModeParametersKHR structure describing the display parameters to use in creating the new mode. If the parameters are not compatible with the specified display, the implementation must return VK_ERROR_INITIALIZATION_FAILED.
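Note
An informal sketch of creating an additional mode (not normative text). It assumes physicalDevice and display are valid; the 1920x1080, 60Hz parameters are purely illustrative and error handling is omitted.

VkDisplayModeParametersKHR parameters = {
    .visibleRegion = { 1920, 1080 },   /* illustrative resolution */
    .refreshRate   = 60000,            /* 60Hz, expressed in Hz * 1000 */
};

VkDisplayModeCreateInfoKHR createInfo = {
    .sType      = VK_STRUCTURE_TYPE_DISPLAY_MODE_CREATE_INFO_KHR,
    .pNext      = NULL,
    .flags      = 0,
    .parameters = parameters,
};

VkDisplayModeKHR mode;
VkResult result = vkCreateDisplayModeKHR(physicalDevice, display, &createInfo,
                                         NULL, &mode);
/* VK_ERROR_INITIALIZATION_FAILED indicates the parameters are not compatible
   with the specified display. */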
Applications that wish to present directly to a display must select which layer, or “plane” of the display they wish to target, and a mode to use with the display. Each display supports at least one plane. The capabilities of a given mode and plane combination are determined by calling:
VkResult vkGetDisplayPlaneCapabilitiesKHR(
VkPhysicalDevice physicalDevice,
VkDisplayModeKHR mode,
uint32_t planeIndex,
VkDisplayPlaneCapabilitiesKHR* pCapabilities);
- physicalDevice is the physical device associated with display.
- mode is the display mode the application intends to program when using the specified plane. Note this parameter also implicitly specifies a display.
- planeIndex is the plane which the application intends to use with the display, and is less than the number of display planes supported by the device.
- pCapabilities is a pointer to a VkDisplayPlaneCapabilitiesKHR structure in which the capabilities are returned.
The VkDisplayPlaneCapabilitiesKHR
structure is defined as:
typedef struct VkDisplayPlaneCapabilitiesKHR {
VkDisplayPlaneAlphaFlagsKHR supportedAlpha;
VkOffset2D minSrcPosition;
VkOffset2D maxSrcPosition;
VkExtent2D minSrcExtent;
VkExtent2D maxSrcExtent;
VkOffset2D minDstPosition;
VkOffset2D maxDstPosition;
VkExtent2D minDstExtent;
VkExtent2D maxDstExtent;
} VkDisplayPlaneCapabilitiesKHR;
- supportedAlpha is a bitmask of VkDisplayPlaneAlphaFlagBitsKHR describing the supported alpha blending modes.
- minSrcPosition is the minimum source rectangle offset supported by this plane using the specified mode.
- maxSrcPosition is the maximum source rectangle offset supported by this plane using the specified mode. The x and y components of maxSrcPosition must each be greater than or equal to the x and y components of minSrcPosition, respectively.
- minSrcExtent is the minimum source rectangle size supported by this plane using the specified mode.
- maxSrcExtent is the maximum source rectangle size supported by this plane using the specified mode.
- minDstPosition, maxDstPosition, minDstExtent, and maxDstExtent all have similar semantics to their corresponding *Src* equivalents, but apply to the output region within the mode rather than the input region within the source image. Unlike the *Src* offsets, minDstPosition and maxDstPosition may contain negative values.
The minimum and maximum position and extent fields describe the
implementation limits, if any, as they apply to the specified display mode
and plane.
Vendors may support displaying a subset of a swapchain’s presentable images
on the specified display plane.
This is expressed by returning minSrcPosition, maxSrcPosition,
minSrcExtent, and maxSrcExtent values that indicate a range of
possible positions and sizes that may be used to specify the region within the
presentable images that source pixels will be read from when creating a
swapchain on the specified display mode and plane.
Vendors may also support mapping the presentable images’ content to a
subset or superset of the visible region in the specified display mode.
This is expressed by returning minDstPosition, maxDstPosition,
minDstExtent, and maxDstExtent values that indicate a range of
possible positions and sizes that may be used to describe the region within the
display mode that the source pixels will be mapped to.
Other vendors may support only a 1-1 mapping between pixels in the
presentable images and the display mode.
This may be indicated by returning (0,0) for minSrcPosition
,
maxSrcPosition
, minDstPosition
, and maxDstPosition
, and
(display mode width, display mode height) for minSrcExtent
,
maxSrcExtent
, minDstExtent
, and maxDstExtent
.
These values indicate the limits of the implementation’s individual fields.
Not all combinations of values within the offset and extent ranges returned
in VkDisplayPlaneCapabilitiesKHR
are guaranteed to be supported.
Vendors may still fail presentation requests that specify unsupported
combinations.
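Note
An informal sketch of checking a mode/plane combination against these capabilities (not normative text). It assumes physicalDevice, mode, and planeIndex are valid; desiredExtent is a hypothetical application-chosen output size, and error handling is omitted.

VkDisplayPlaneCapabilitiesKHR caps;
vkGetDisplayPlaneCapabilitiesKHR(physicalDevice, mode, planeIndex, &caps);

/* Reject the plane if it cannot blend with per-pixel alpha, or if the desired
   output size falls outside the supported destination extents. */
int usable =
    (caps.supportedAlpha & VK_DISPLAY_PLANE_ALPHA_PER_PIXEL_BIT_KHR) != 0 &&
    desiredExtent.width  >= caps.minDstExtent.width  &&
    desiredExtent.width  <= caps.maxDstExtent.width  &&
    desiredExtent.height >= caps.minDstExtent.height &&
    desiredExtent.height <= caps.maxDstExtent.height;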
To query the capabilities of a given mode and plane combination, call:
VkResult vkGetDisplayPlaneCapabilities2KHR(
VkPhysicalDevice physicalDevice,
const VkDisplayPlaneInfo2KHR* pDisplayPlaneInfo,
VkDisplayPlaneCapabilities2KHR* pCapabilities);
- physicalDevice is the physical device associated with pDisplayPlaneInfo.
- pDisplayPlaneInfo is a pointer to an instance of the VkDisplayPlaneInfo2KHR structure describing the plane and mode.
- pCapabilities is a pointer to a VkDisplayPlaneCapabilities2KHR structure in which the capabilities are returned.
vkGetDisplayPlaneCapabilities2KHR
behaves similarly to
vkGetDisplayPlaneCapabilitiesKHR, with the ability to specify extended
inputs via chained input structures, and to return extended information via
chained output structures.
The VkDisplayPlaneInfo2KHR
structure is defined as:
typedef struct VkDisplayPlaneInfo2KHR {
VkStructureType sType;
const void* pNext;
VkDisplayModeKHR mode;
uint32_t planeIndex;
} VkDisplayPlaneInfo2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- mode is the display mode the application intends to program when using the specified plane.

Note
This parameter also implicitly specifies a display.

- planeIndex is the plane which the application intends to use with the display.
The members of VkDisplayPlaneInfo2KHR
correspond to the arguments to
vkGetDisplayPlaneCapabilitiesKHR, with sType
and pNext
added for extensibility.
The VkDisplayPlaneCapabilities2KHR
structure is defined as:
typedef struct VkDisplayPlaneCapabilities2KHR {
VkStructureType sType;
void* pNext;
VkDisplayPlaneCapabilitiesKHR capabilities;
} VkDisplayPlaneCapabilities2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- capabilities is an instance of the VkDisplayPlaneCapabilitiesKHR structure.
32.3.2. Display Control
To set the power state of a display, call:
VkResult vkDisplayPowerControlEXT(
VkDevice device,
VkDisplayKHR display,
const VkDisplayPowerInfoEXT* pDisplayPowerInfo);
- device is a logical device associated with display.
- display is the display whose power state is modified.
- pDisplayPowerInfo is an instance of VkDisplayPowerInfoEXT specifying the new power state of display.
The VkDisplayPowerInfoEXT
structure is defined as:
typedef struct VkDisplayPowerInfoEXT {
VkStructureType sType;
const void* pNext;
VkDisplayPowerStateEXT powerState;
} VkDisplayPowerInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- powerState is a VkDisplayPowerStateEXT value specifying the new power state of the display.
Possible values of VkDisplayPowerInfoEXT::powerState
, specifying
the new power state of a display, are:
typedef enum VkDisplayPowerStateEXT {
VK_DISPLAY_POWER_STATE_OFF_EXT = 0,
VK_DISPLAY_POWER_STATE_SUSPEND_EXT = 1,
VK_DISPLAY_POWER_STATE_ON_EXT = 2,
} VkDisplayPowerStateEXT;
- VK_DISPLAY_POWER_STATE_OFF_EXT specifies that the display is powered down.
- VK_DISPLAY_POWER_STATE_SUSPEND_EXT specifies that the display is put into a low power mode, from which it may be able to transition back to VK_DISPLAY_POWER_STATE_ON_EXT more quickly than if it were in VK_DISPLAY_POWER_STATE_OFF_EXT. This state may be the same as VK_DISPLAY_POWER_STATE_OFF_EXT.
- VK_DISPLAY_POWER_STATE_ON_EXT specifies that the display is powered on.
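Note
An informal sketch of powering a display on (not normative text), assuming device and display are valid and the VK_EXT_display_control extension is enabled.

VkDisplayPowerInfoEXT powerInfo = {
    .sType      = VK_STRUCTURE_TYPE_DISPLAY_POWER_INFO_EXT,
    .pNext      = NULL,
    .powerState = VK_DISPLAY_POWER_STATE_ON_EXT,
};
vkDisplayPowerControlEXT(device, display, &powerInfo);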
32.3.3. Display Surfaces
A complete display configuration includes a mode, one or more display planes
and any parameters describing their behavior, and parameters describing some
aspects of the images associated with those planes.
Display surfaces describe the configuration of a single plane within a
complete display configuration.
To create a VkSurfaceKHR
structure for a display surface, call:
VkResult vkCreateDisplayPlaneSurfaceKHR(
VkInstance instance,
const VkDisplaySurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
- instance is the instance corresponding to the physical device the targeted display is on.
- pCreateInfo is a pointer to an instance of the VkDisplaySurfaceCreateInfoKHR structure specifying which mode, plane, and other parameters to use, as described below.
- pAllocator is the allocator used for host memory allocated for the surface object when there is no more specific allocator available (see Memory Allocation).
- pSurface points to a VkSurfaceKHR handle in which the created surface is returned.
The VkDisplaySurfaceCreateInfoKHR
structure is defined as:
typedef struct VkDisplaySurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkDisplaySurfaceCreateFlagsKHR flags;
VkDisplayModeKHR displayMode;
uint32_t planeIndex;
uint32_t planeStackIndex;
VkSurfaceTransformFlagBitsKHR transform;
float globalAlpha;
VkDisplayPlaneAlphaFlagBitsKHR alphaMode;
VkExtent2D imageExtent;
} VkDisplaySurfaceCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is reserved for future use, and must be zero.
- displayMode is a VkDisplayModeKHR handle specifying the mode to use when displaying this surface.
- planeIndex is the plane on which this surface appears.
- planeStackIndex is the z-order of the plane.
- transform is a VkSurfaceTransformFlagBitsKHR value specifying the transformation to apply to images as part of the scanout operation.
- globalAlpha is the global alpha value. This value is ignored if alphaMode is not VK_DISPLAY_PLANE_ALPHA_GLOBAL_BIT_KHR.
- alphaMode is a VkDisplayPlaneAlphaFlagBitsKHR value specifying the type of alpha blending to use.
- imageExtent is the size of the presentable images to use with the surface.
Note
Creating a display surface must not modify the state of the displays, planes, or other resources it names. For example, it must not cause the specified mode to be set on the associated display. Application of display configuration occurs as a side effect of presenting to a display surface.
Possible values of VkDisplaySurfaceCreateInfoKHR::alphaMode
,
specifying the type of alpha blending to use on a display, are:
typedef enum VkDisplayPlaneAlphaFlagBitsKHR {
VK_DISPLAY_PLANE_ALPHA_OPAQUE_BIT_KHR = 0x00000001,
VK_DISPLAY_PLANE_ALPHA_GLOBAL_BIT_KHR = 0x00000002,
VK_DISPLAY_PLANE_ALPHA_PER_PIXEL_BIT_KHR = 0x00000004,
VK_DISPLAY_PLANE_ALPHA_PER_PIXEL_PREMULTIPLIED_BIT_KHR = 0x00000008,
} VkDisplayPlaneAlphaFlagBitsKHR;
- VK_DISPLAY_PLANE_ALPHA_OPAQUE_BIT_KHR specifies that the source image will be treated as opaque.
- VK_DISPLAY_PLANE_ALPHA_GLOBAL_BIT_KHR specifies that a global alpha value must be specified that will be applied to all pixels in the source image.
- VK_DISPLAY_PLANE_ALPHA_PER_PIXEL_BIT_KHR specifies that the alpha value will be determined by the alpha channel of the source image’s pixels. If the source format contains no alpha values, no blending will be applied. The source alpha values are not premultiplied into the source image’s other color channels.
- VK_DISPLAY_PLANE_ALPHA_PER_PIXEL_PREMULTIPLIED_BIT_KHR is equivalent to VK_DISPLAY_PLANE_ALPHA_PER_PIXEL_BIT_KHR, except the source alpha values are assumed to be premultiplied into the source image’s other color channels.
typedef VkFlags VkDisplayPlaneAlphaFlagsKHR;
VkDisplayPlaneAlphaFlagsKHR
is a bitmask type for setting a mask of
zero or more VkDisplayPlaneAlphaFlagBitsKHR.
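Note
An informal sketch of creating a display plane surface (not normative text). It assumes instance, mode, and planeIndex were obtained as described above; the image extent is illustrative and error handling is omitted.

VkDisplaySurfaceCreateInfoKHR surfaceInfo = {
    .sType           = VK_STRUCTURE_TYPE_DISPLAY_SURFACE_CREATE_INFO_KHR,
    .pNext           = NULL,
    .flags           = 0,
    .displayMode     = mode,
    .planeIndex      = planeIndex,
    .planeStackIndex = 0,
    .transform       = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR,
    .globalAlpha     = 1.0f,            /* ignored unless alphaMode is GLOBAL */
    .alphaMode       = VK_DISPLAY_PLANE_ALPHA_OPAQUE_BIT_KHR,
    .imageExtent     = { 1920, 1080 },  /* illustrative */
};

VkSurfaceKHR surface;
vkCreateDisplayPlaneSurfaceKHR(instance, &surfaceInfo, NULL, &surface);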
32.4. Querying for WSI Support
Not all physical devices will include WSI support. Within a physical device, not all queue families will support presentation. WSI support and compatibility can be determined in a platform-neutral manner (which determines support for presentation to a particular surface object) and additionally may be determined in platform-specific manners (which determine support for presentation on the specified physical device but do not guarantee support for presentation to a particular surface object).
To determine whether a queue family of a physical device supports presentation to a given surface, call:
VkResult vkGetPhysicalDeviceSurfaceSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
VkSurfaceKHR surface,
VkBool32* pSupported);
- physicalDevice is the physical device.
- queueFamilyIndex is the queue family.
- surface is the surface.
- pSupported is a pointer to a VkBool32, which is set to VK_TRUE to indicate support, and VK_FALSE otherwise.
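Note
An informal sketch of selecting a queue family that supports both graphics and presentation to a given surface (not normative text). It assumes physicalDevice and surface are valid, the usual headers are included, and error handling is omitted.

uint32_t familyCount = 0;
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &familyCount, NULL);

VkQueueFamilyProperties* families =
    malloc(familyCount * sizeof(VkQueueFamilyProperties));
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &familyCount, families);

uint32_t presentFamily = UINT32_MAX;  /* UINT32_MAX marks "not found" */
for (uint32_t i = 0; i < familyCount; ++i) {
    VkBool32 supported = VK_FALSE;
    vkGetPhysicalDeviceSurfaceSupportKHR(physicalDevice, i, surface, &supported);
    if (supported && (families[i].queueFlags & VK_QUEUE_GRAPHICS_BIT)) {
        presentFamily = i;
        break;
    }
}
free(families);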
32.4.1. Android Platform
On Android, all physical devices and queue families must be capable of presentation with any native window. As a result there is no Android-specific query for these capabilities.
32.4.2. Wayland Platform
To determine whether a queue family of a physical device supports presentation to a Wayland compositor, call:
VkBool32 vkGetPhysicalDeviceWaylandPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
struct wl_display* display);
- physicalDevice is the physical device.
- queueFamilyIndex is the queue family index.
- display is a pointer to the wl_display associated with a Wayland compositor.
This platform-specific function can be called prior to creating a surface.
32.4.3. Win32 Platform
To determine whether a queue family of a physical device supports presentation to the Microsoft Windows desktop, call:
VkBool32 vkGetPhysicalDeviceWin32PresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex);
- physicalDevice is the physical device.
- queueFamilyIndex is the queue family index.
This platform-specific function can be called prior to creating a surface.
32.4.4. XCB Platform
To determine whether a queue family of a physical device supports presentation to an X11 server, using the XCB client-side library, call:
VkBool32 vkGetPhysicalDeviceXcbPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
xcb_connection_t* connection,
xcb_visualid_t visual_id);
- physicalDevice is the physical device.
- queueFamilyIndex is the queue family index.
- connection is a pointer to an xcb_connection_t to the X server.
- visual_id is an X11 visual (xcb_visualid_t).
This platform-specific function can be called prior to creating a surface.
32.4.5. Xlib Platform
To determine whether a queue family of a physical device supports presentation to an X11 server, using the Xlib client-side library, call:
VkBool32 vkGetPhysicalDeviceXlibPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
Display* dpy,
VisualID visualID);
- physicalDevice is the physical device.
- queueFamilyIndex is the queue family index.
- dpy is a pointer to an Xlib Display connection to the server.
- visualID is an X11 visual (VisualID).
This platform-specific function can be called prior to creating a surface.
32.4.6. Fuchsia Platform
On Fuchsia, all physical devices and queue families must be capable of presentation with any ImagePipe. As a result there is no Fuchsia-specific query for these capabilities.
32.4.7. iOS Platform
On iOS, all physical devices and queue families must be capable of presentation with any layer. As a result there is no iOS-specific query for these capabilities.
32.4.8. macOS Platform
On macOS, all physical devices and queue families must be capable of presentation with any layer. As a result there is no macOS-specific query for these capabilities.
32.4.9. VI Platform
On VI, all physical devices and queue families must be capable of presentation with any layer. As a result there is no VI-specific query for these capabilities.
32.5. Surface Queries
The capabilities of a swapchain targeting a surface are the intersection of the capabilities of the WSI platform, the native window or display, and the physical device. The resulting capabilities can be obtained with the queries listed below in this section. Capabilities that correspond to image creation parameters are not independent of each other: combinations of parameters that are not supported as reported by vkGetPhysicalDeviceImageFormatProperties are not supported by the surface on that physical device, even if the capabilities taken individually are supported as part of some other parameter combinations.
To query the basic capabilities of a surface, needed in order to create a swapchain, call:
VkResult vkGetPhysicalDeviceSurfaceCapabilitiesKHR(
VkPhysicalDevice physicalDevice,
VkSurfaceKHR surface,
VkSurfaceCapabilitiesKHR* pSurfaceCapabilities);
- physicalDevice is the physical device that will be associated with the swapchain to be created, as described for vkCreateSwapchainKHR.
- surface is the surface that will be associated with the swapchain.
- pSurfaceCapabilities is a pointer to an instance of the VkSurfaceCapabilitiesKHR structure in which the capabilities are returned.
The VkSurfaceCapabilitiesKHR
structure is defined as:
typedef struct VkSurfaceCapabilitiesKHR {
uint32_t minImageCount;
uint32_t maxImageCount;
VkExtent2D currentExtent;
VkExtent2D minImageExtent;
VkExtent2D maxImageExtent;
uint32_t maxImageArrayLayers;
VkSurfaceTransformFlagsKHR supportedTransforms;
VkSurfaceTransformFlagBitsKHR currentTransform;
VkCompositeAlphaFlagsKHR supportedCompositeAlpha;
VkImageUsageFlags supportedUsageFlags;
} VkSurfaceCapabilitiesKHR;
- minImageCount is the minimum number of images the specified device supports for a swapchain created for the surface, and will be at least one.
- maxImageCount is the maximum number of images the specified device supports for a swapchain created for the surface, and will be either 0, or greater than or equal to minImageCount. A value of 0 means that there is no limit on the number of images, though there may be limits related to the total amount of memory used by presentable images.
- currentExtent is the current width and height of the surface, or the special value (0xFFFFFFFF, 0xFFFFFFFF) indicating that the surface size will be determined by the extent of a swapchain targeting the surface.
- minImageExtent contains the smallest valid swapchain extent for the surface on the specified device. The width and height of the extent will each be less than or equal to the corresponding width and height of currentExtent, unless currentExtent has the special value described above.
- maxImageExtent contains the largest valid swapchain extent for the surface on the specified device. The width and height of the extent will each be greater than or equal to the corresponding width and height of minImageExtent. The width and height of the extent will each be greater than or equal to the corresponding width and height of currentExtent, unless currentExtent has the special value described above.
- maxImageArrayLayers is the maximum number of layers presentable images can have for a swapchain created for this device and surface, and will be at least one.
- supportedTransforms is a bitmask of VkSurfaceTransformFlagBitsKHR indicating the presentation transforms supported for the surface on the specified device. At least one bit will be set.
- currentTransform is a VkSurfaceTransformFlagBitsKHR value indicating the surface’s current transform relative to the presentation engine’s natural orientation.
- supportedCompositeAlpha is a bitmask of VkCompositeAlphaFlagBitsKHR, representing the alpha compositing modes supported by the presentation engine for the surface on the specified device, and at least one bit will be set. Opaque composition can be achieved in any alpha compositing mode by either using an image format that has no alpha component, or by ensuring that all pixels in the presentable images have an alpha value of 1.0.
- supportedUsageFlags is a bitmask of VkImageUsageFlagBits representing the ways the application can use the presentable images of a swapchain created with VkPresentModeKHR set to VK_PRESENT_MODE_IMMEDIATE_KHR, VK_PRESENT_MODE_MAILBOX_KHR, VK_PRESENT_MODE_FIFO_KHR or VK_PRESENT_MODE_FIFO_RELAXED_KHR for the surface on the specified device. VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT must be included in the set but implementations may support additional usages.
Note
Supported usage flags of a presentable image when using VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR or VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR presentation modes are provided by VkSharedPresentSurfaceCapabilitiesKHR::sharedPresentSupportedUsageFlags.
Note
Formulas such as min(N, maxImageCount) are not correct, since maxImageCount may be zero.
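Note
An informal sketch of consuming these capabilities when sizing a swapchain (not normative text). It assumes physicalDevice and surface are valid; desiredExtent is a hypothetical application-chosen size, and error handling is omitted.

VkSurfaceCapabilitiesKHR caps;
vkGetPhysicalDeviceSurfaceCapabilitiesKHR(physicalDevice, surface, &caps);

/* Request one image more than the minimum, respecting maxImageCount
   (a value of 0 means there is no upper limit). */
uint32_t imageCount = caps.minImageCount + 1;
if (caps.maxImageCount != 0 && imageCount > caps.maxImageCount) {
    imageCount = caps.maxImageCount;
}

/* Use the surface's current extent unless the surface size is determined by
   the swapchain (the 0xFFFFFFFF special value), in which case clamp the
   desired size to the supported range. */
VkExtent2D extent = caps.currentExtent;
if (extent.width == 0xFFFFFFFF) {
    extent = desiredExtent;
    if (extent.width  < caps.minImageExtent.width)  extent.width  = caps.minImageExtent.width;
    if (extent.width  > caps.maxImageExtent.width)  extent.width  = caps.maxImageExtent.width;
    if (extent.height < caps.minImageExtent.height) extent.height = caps.minImageExtent.height;
    if (extent.height > caps.maxImageExtent.height) extent.height = caps.maxImageExtent.height;
}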
To query the basic capabilities of a surface defined by the core or extensions, call:
VkResult vkGetPhysicalDeviceSurfaceCapabilities2KHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceSurfaceInfo2KHR* pSurfaceInfo,
VkSurfaceCapabilities2KHR* pSurfaceCapabilities);
- physicalDevice is the physical device that will be associated with the swapchain to be created, as described for vkCreateSwapchainKHR.
- pSurfaceInfo points to an instance of the VkPhysicalDeviceSurfaceInfo2KHR structure, describing the surface and other fixed parameters that would be consumed by vkCreateSwapchainKHR.
- pSurfaceCapabilities points to an instance of the VkSurfaceCapabilities2KHR structure in which the capabilities are returned.
vkGetPhysicalDeviceSurfaceCapabilities2KHR
behaves similarly to
vkGetPhysicalDeviceSurfaceCapabilitiesKHR, with the ability to specify
extended inputs via chained input structures, and to return extended
information via chained output structures.
The VkPhysicalDeviceSurfaceInfo2KHR
structure is defined as:
typedef struct VkPhysicalDeviceSurfaceInfo2KHR {
VkStructureType sType;
const void* pNext;
VkSurfaceKHR surface;
} VkPhysicalDeviceSurfaceInfo2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- surface is the surface that will be associated with the swapchain.
The members of VkPhysicalDeviceSurfaceInfo2KHR
correspond to the
arguments to vkGetPhysicalDeviceSurfaceCapabilitiesKHR, with
sType
and pNext
added for extensibility.
The VkSurfaceCapabilities2KHR
structure is defined as:
typedef struct VkSurfaceCapabilities2KHR {
VkStructureType sType;
void* pNext;
VkSurfaceCapabilitiesKHR surfaceCapabilities;
} VkSurfaceCapabilities2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- surfaceCapabilities is a structure of type VkSurfaceCapabilitiesKHR describing the capabilities of the specified surface.
The VkSharedPresentSurfaceCapabilitiesKHR
structure is defined as:
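typedef struct VkSharedPresentSurfaceCapabilitiesKHR {
    VkStructureType      sType;
    void*                pNext;
    VkImageUsageFlags    sharedPresentSupportedUsageFlags;
} VkSharedPresentSurfaceCapabilitiesKHR;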
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- sharedPresentSupportedUsageFlags is a bitmask of VkImageUsageFlagBits representing the ways the application can use the shared presentable image from a swapchain created with VkPresentModeKHR set to VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR or VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR for the surface on the specified device. VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT must be included in the set but implementations may support additional usages.
To query the basic capabilities of a surface, needed in order to create a swapchain, call:
VkResult vkGetPhysicalDeviceSurfaceCapabilities2EXT(
VkPhysicalDevice physicalDevice,
VkSurfaceKHR surface,
VkSurfaceCapabilities2EXT* pSurfaceCapabilities);
- physicalDevice is the physical device that will be associated with the swapchain to be created, as described for vkCreateSwapchainKHR.
- surface is the surface that will be associated with the swapchain.
- pSurfaceCapabilities is a pointer to an instance of the VkSurfaceCapabilities2EXT structure in which the capabilities are returned.
vkGetPhysicalDeviceSurfaceCapabilities2EXT
behaves similarly to
vkGetPhysicalDeviceSurfaceCapabilitiesKHR, with the ability to return
extended information by adding extension structures to the pNext
chain
of its pSurfaceCapabilities
parameter.
The VkSurfaceCapabilities2EXT
structure is defined as:
typedef struct VkSurfaceCapabilities2EXT {
VkStructureType sType;
void* pNext;
uint32_t minImageCount;
uint32_t maxImageCount;
VkExtent2D currentExtent;
VkExtent2D minImageExtent;
VkExtent2D maxImageExtent;
uint32_t maxImageArrayLayers;
VkSurfaceTransformFlagsKHR supportedTransforms;
VkSurfaceTransformFlagBitsKHR currentTransform;
VkCompositeAlphaFlagsKHR supportedCompositeAlpha;
VkImageUsageFlags supportedUsageFlags;
VkSurfaceCounterFlagsEXT supportedSurfaceCounters;
} VkSurfaceCapabilities2EXT;
All members of VkSurfaceCapabilities2EXT
are identical to the
corresponding members of VkSurfaceCapabilitiesKHR where one exists.
The remaining members are:
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- supportedSurfaceCounters is a bitmask of VkSurfaceCounterFlagBitsEXT indicating the supported surface counter types.
Bits which can be set in
VkSurfaceCapabilities2EXT::supportedSurfaceCounters
, indicating
supported surface counter types, are:
typedef enum VkSurfaceCounterFlagBitsEXT {
VK_SURFACE_COUNTER_VBLANK_EXT = 0x00000001,
} VkSurfaceCounterFlagBitsEXT;
- VK_SURFACE_COUNTER_VBLANK_EXT specifies a counter incrementing once every time a vertical blanking period occurs on the display associated with the surface.
typedef VkFlags VkSurfaceCounterFlagsEXT;
VkSurfaceCounterFlagsEXT
is a bitmask type for setting a mask of zero
or more VkSurfaceCounterFlagBitsEXT.
Bits which may be set in
VkSurfaceCapabilitiesKHR::supportedTransforms, indicating the
presentation transforms supported for the surface on the specified device,
and possible values of
VkSurfaceCapabilitiesKHR::currentTransform, indicating the
surface’s current transform relative to the presentation engine’s natural
orientation, are:
typedef enum VkSurfaceTransformFlagBitsKHR {
VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR = 0x00000001,
VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR = 0x00000002,
VK_SURFACE_TRANSFORM_ROTATE_180_BIT_KHR = 0x00000004,
VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR = 0x00000008,
VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_BIT_KHR = 0x00000010,
VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_ROTATE_90_BIT_KHR = 0x00000020,
VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_ROTATE_180_BIT_KHR = 0x00000040,
VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_ROTATE_270_BIT_KHR = 0x00000080,
VK_SURFACE_TRANSFORM_INHERIT_BIT_KHR = 0x00000100,
} VkSurfaceTransformFlagBitsKHR;
- VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR specifies that image content is presented without being transformed.
- VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR specifies that image content is rotated 90 degrees clockwise.
- VK_SURFACE_TRANSFORM_ROTATE_180_BIT_KHR specifies that image content is rotated 180 degrees clockwise.
- VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR specifies that image content is rotated 270 degrees clockwise.
- VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_BIT_KHR specifies that image content is mirrored horizontally.
- VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_ROTATE_90_BIT_KHR specifies that image content is mirrored horizontally, then rotated 90 degrees clockwise.
- VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_ROTATE_180_BIT_KHR specifies that image content is mirrored horizontally, then rotated 180 degrees clockwise.
- VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_ROTATE_270_BIT_KHR specifies that image content is mirrored horizontally, then rotated 270 degrees clockwise.
- VK_SURFACE_TRANSFORM_INHERIT_BIT_KHR specifies that the presentation transform is not specified, and is instead determined by platform-specific considerations and mechanisms outside Vulkan.
typedef VkFlags VkSurfaceTransformFlagsKHR;
VkSurfaceTransformFlagsKHR
is a bitmask type for setting a mask of
zero or more VkSurfaceTransformFlagBitsKHR.
The supportedCompositeAlpha
member is of type
VkCompositeAlphaFlagBitsKHR, which contains the following values:
typedef enum VkCompositeAlphaFlagBitsKHR {
VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR = 0x00000001,
VK_COMPOSITE_ALPHA_PRE_MULTIPLIED_BIT_KHR = 0x00000002,
VK_COMPOSITE_ALPHA_POST_MULTIPLIED_BIT_KHR = 0x00000004,
VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR = 0x00000008,
} VkCompositeAlphaFlagBitsKHR;
These values are described as follows:
- VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR: The alpha channel, if it exists, of the images is ignored in the compositing process. Instead, the image is treated as if it has a constant alpha of 1.0.
- VK_COMPOSITE_ALPHA_PRE_MULTIPLIED_BIT_KHR: The alpha channel, if it exists, of the images is respected in the compositing process. The non-alpha channels of the image are expected to already be multiplied by the alpha channel by the application.
- VK_COMPOSITE_ALPHA_POST_MULTIPLIED_BIT_KHR: The alpha channel, if it exists, of the images is respected in the compositing process. The non-alpha channels of the image are not expected to already be multiplied by the alpha channel by the application; instead, the compositor will multiply the non-alpha channels of the image by the alpha channel during compositing.
- VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR: The way in which the presentation engine treats the alpha channel in the images is unknown to the Vulkan API. Instead, the application is responsible for setting the composite alpha blending mode using native window system commands. If the application does not set the blending mode using native window system commands, then a platform-specific default will be used.
typedef VkFlags VkCompositeAlphaFlagsKHR;
VkCompositeAlphaFlagsKHR
is a bitmask type for setting a mask of zero
or more VkCompositeAlphaFlagBitsKHR.
To query the supported swapchain format-color space pairs for a surface, call:
VkResult vkGetPhysicalDeviceSurfaceFormatsKHR(
VkPhysicalDevice physicalDevice,
VkSurfaceKHR surface,
uint32_t* pSurfaceFormatCount,
VkSurfaceFormatKHR* pSurfaceFormats);
- physicalDevice is the physical device that will be associated with the swapchain to be created, as described for vkCreateSwapchainKHR.
- surface is the surface that will be associated with the swapchain.
- pSurfaceFormatCount is a pointer to an integer related to the number of format pairs available or queried, as described below.
- pSurfaceFormats is either NULL or a pointer to an array of VkSurfaceFormatKHR structures.
If pSurfaceFormats
is NULL
, then the number of format pairs
supported for the given surface
is returned in
pSurfaceFormatCount
.
The number of format pairs supported will be greater than or equal to 1.
Otherwise, pSurfaceFormatCount
must point to a variable set by the
user to the number of elements in the pSurfaceFormats
array, and on
return the variable is overwritten with the number of structures actually
written to pSurfaceFormats
.
If the value of pSurfaceFormatCount
is less than the number of format
pairs supported, at most pSurfaceFormatCount
structures will be
written.
If pSurfaceFormatCount
is smaller than the number of format pairs
supported for the given surface
, VK_INCOMPLETE
will be returned
instead of VK_SUCCESS
to indicate that not all the available values
were returned.
The VkSurfaceFormatKHR
structure is defined as:
typedef struct VkSurfaceFormatKHR {
VkFormat format;
VkColorSpaceKHR colorSpace;
} VkSurfaceFormatKHR;
- format is a VkFormat that is compatible with the specified surface.
- colorSpace is a presentation VkColorSpaceKHR that is compatible with the surface.
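Note
An informal sketch of selecting a surface format (not normative text). It assumes physicalDevice and surface are valid and the usual headers are included; the preference for VK_FORMAT_B8G8R8A8_SRGB is purely illustrative, and error handling is omitted.

uint32_t formatCount = 0;
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &formatCount, NULL);

VkSurfaceFormatKHR* formats = malloc(formatCount * sizeof(VkSurfaceFormatKHR));
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &formatCount, formats);

/* Prefer an 8-bit BGRA sRGB format in the sRGB non-linear color space, and
   otherwise fall back to the first reported pair. */
VkSurfaceFormatKHR chosen = formats[0];
for (uint32_t i = 0; i < formatCount; ++i) {
    if (formats[i].format == VK_FORMAT_B8G8R8A8_SRGB &&
        formats[i].colorSpace == VK_COLOR_SPACE_SRGB_NONLINEAR_KHR) {
        chosen = formats[i];
        break;
    }
}
free(formats);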
To query the supported swapchain format tuples for a surface, call:
VkResult vkGetPhysicalDeviceSurfaceFormats2KHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceSurfaceInfo2KHR* pSurfaceInfo,
uint32_t* pSurfaceFormatCount,
VkSurfaceFormat2KHR* pSurfaceFormats);
- physicalDevice is the physical device that will be associated with the swapchain to be created, as described for vkCreateSwapchainKHR.
- pSurfaceInfo points to an instance of the VkPhysicalDeviceSurfaceInfo2KHR structure, describing the surface and other fixed parameters that would be consumed by vkCreateSwapchainKHR.
- pSurfaceFormatCount is a pointer to an integer related to the number of format tuples available or queried, as described below.
- pSurfaceFormats is either NULL or a pointer to an array of VkSurfaceFormat2KHR structures.
If pSurfaceFormats
is NULL
, then the number of format tuples
supported for the given surface
is returned in
pSurfaceFormatCount
.
The number of format tuples supported will be greater than or equal to 1.
Otherwise, pSurfaceFormatCount
must point to a variable set by the
user to the number of elements in the pSurfaceFormats
array, and on
return the variable is overwritten with the number of structures actually
written to pSurfaceFormats
.
If the value of pSurfaceFormatCount
is less than the number of format
tuples supported, at most pSurfaceFormatCount
structures will be
written.
If pSurfaceFormatCount
is smaller than the number of format tuples
supported for the surface parameters described in pSurfaceInfo
,
VK_INCOMPLETE
will be returned instead of VK_SUCCESS
to indicate
that not all the available values were returned.
The VkSurfaceFormat2KHR
structure is defined as:
typedef struct VkSurfaceFormat2KHR {
VkStructureType sType;
void* pNext;
VkSurfaceFormatKHR surfaceFormat;
} VkSurfaceFormat2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- surfaceFormat is an instance of VkSurfaceFormatKHR describing a format-color space pair that is compatible with the specified surface.
While the format
of a presentable image refers to the encoding of each
pixel, the colorSpace
determines how the presentation engine
interprets the pixel values.
A color space in this document refers to a specific color space (defined by
the chromaticities of its primaries and a white point in CIE Lab), and a
transfer function that is applied before storing or transmitting color data
in the given color space.
Possible values of VkSurfaceFormatKHR::colorSpace
, specifying
supported color spaces of a presentation engine, are:
typedef enum VkColorSpaceKHR {
VK_COLOR_SPACE_SRGB_NONLINEAR_KHR = 0,
VK_COLOR_SPACE_DISPLAY_P3_NONLINEAR_EXT = 1000104001,
VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT = 1000104002,
VK_COLOR_SPACE_DCI_P3_LINEAR_EXT = 1000104003,
VK_COLOR_SPACE_DCI_P3_NONLINEAR_EXT = 1000104004,
VK_COLOR_SPACE_BT709_LINEAR_EXT = 1000104005,
VK_COLOR_SPACE_BT709_NONLINEAR_EXT = 1000104006,
VK_COLOR_SPACE_BT2020_LINEAR_EXT = 1000104007,
VK_COLOR_SPACE_HDR10_ST2084_EXT = 1000104008,
VK_COLOR_SPACE_DOLBYVISION_EXT = 1000104009,
VK_COLOR_SPACE_HDR10_HLG_EXT = 1000104010,
VK_COLOR_SPACE_ADOBERGB_LINEAR_EXT = 1000104011,
VK_COLOR_SPACE_ADOBERGB_NONLINEAR_EXT = 1000104012,
VK_COLOR_SPACE_PASS_THROUGH_EXT = 1000104013,
VK_COLOR_SPACE_EXTENDED_SRGB_NONLINEAR_EXT = 1000104014,
VK_COLORSPACE_SRGB_NONLINEAR_KHR = VK_COLOR_SPACE_SRGB_NONLINEAR_KHR,
} VkColorSpaceKHR;
- VK_COLOR_SPACE_SRGB_NONLINEAR_KHR specifies support for the sRGB color space.
- VK_COLOR_SPACE_DISPLAY_P3_NONLINEAR_EXT specifies support for the Display-P3 color space and applies an sRGB-like transfer function (defined below).
- VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT specifies support for the extended sRGB color space and applies a linear transfer function.
- VK_COLOR_SPACE_EXTENDED_SRGB_NONLINEAR_EXT specifies support for the extended sRGB color space and applies an sRGB transfer function.
- VK_COLOR_SPACE_DCI_P3_LINEAR_EXT specifies support for the DCI-P3 color space and applies a linear OETF.
- VK_COLOR_SPACE_DCI_P3_NONLINEAR_EXT specifies support for the DCI-P3 color space and applies the Gamma 2.6 OETF.
- VK_COLOR_SPACE_BT709_LINEAR_EXT specifies support for the BT709 color space and applies a linear OETF.
- VK_COLOR_SPACE_BT709_NONLINEAR_EXT specifies support for the BT709 color space and applies the SMPTE 170M OETF.
- VK_COLOR_SPACE_BT2020_LINEAR_EXT specifies support for the BT2020 color space and applies a linear OETF.
- VK_COLOR_SPACE_HDR10_ST2084_EXT specifies support for the HDR10 (BT2020 color) space and applies the SMPTE ST2084 Perceptual Quantizer (PQ) OETF.
- VK_COLOR_SPACE_DOLBYVISION_EXT specifies support for the Dolby Vision (BT2020 color space), proprietary encoding, and applies the SMPTE ST2084 OETF.
- VK_COLOR_SPACE_HDR10_HLG_EXT specifies support for the HDR10 (BT2020 color space) and applies the Hybrid Log Gamma (HLG) OETF.
- VK_COLOR_SPACE_ADOBERGB_LINEAR_EXT specifies support for the AdobeRGB color space and applies a linear OETF.
- VK_COLOR_SPACE_ADOBERGB_NONLINEAR_EXT specifies support for the AdobeRGB color space and applies the Gamma 2.2 OETF.
- VK_COLOR_SPACE_PASS_THROUGH_EXT specifies that color components are used “as is”. This is intended to allow applications to supply data for color spaces not described here.
The color components of non-linear color space swapchain images have had the appropriate transfer function applied. Vulkan requires that all implementations support the sRGB transfer function when using an SRGB pixel format. Other transfer functions, such as SMPTE 170M or SMPTE ST2084, must not be performed by the implementation, but can be performed by the application shader. This extension defines enums for VkColorSpaceKHR that correspond to the following color spaces:
Name | Red Primary | Green Primary | Blue Primary | White-point | Transfer function |
---|---|---|---|---|---|
DCI-P3 | 0.680, 0.320 | 0.265, 0.690 | 0.150, 0.060 | 0.3127, 0.3290 (D65) | Gamma 2.6 |
Display-P3 | 0.680, 0.320 | 0.265, 0.690 | 0.150, 0.060 | 0.3127, 0.3290 (D65) | Display-P3 |
BT709 | 0.640, 0.330 | 0.300, 0.600 | 0.150, 0.060 | 0.3127, 0.3290 (D65) | SMPTE 170M |
sRGB | 0.640, 0.330 | 0.300, 0.600 | 0.150, 0.060 | 0.3127, 0.3290 (D65) | sRGB |
extended sRGB | 0.640, 0.330 | 0.300, 0.600 | 0.150, 0.060 | 0.3127, 0.3290 (D65) | extended sRGB |
HDR10_ST2084 | 0.708, 0.292 | 0.170, 0.797 | 0.131, 0.046 | 0.3127, 0.3290 (D65) | ST2084 |
DOLBYVISION | 0.708, 0.292 | 0.170, 0.797 | 0.131, 0.046 | 0.3127, 0.3290 (D65) | ST2084 |
HDR10_HLG | 0.708, 0.292 | 0.170, 0.797 | 0.131, 0.046 | 0.3127, 0.3290 (D65) | HLG |
AdobeRGB | 0.640, 0.330 | 0.210, 0.710 | 0.150, 0.060 | 0.3127, 0.3290 (D65) | AdobeRGB |
For Opto-Electrical Transfer Function (OETF), unless otherwise specified, the values of L and E are defined as:
L - linear luminance of image \(0 \leq L \leq 1\) for conventional colorimetry
E - corresponding electrical signal (value stored in memory)
32.5.1. sRGB transfer function
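For reference, the sRGB OETF is commonly given as:

\[E = \begin{cases} 1.055 \times L^{\frac{1}{2.4}} - 0.055 & \text{for } 0.0031308 \leq L \leq 1 \\ 12.92 \times L & \text{for } 0 \leq L < 0.0031308 \end{cases}\]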
32.5.2. Display-P3 EOTF
\(a = 0.948\)
\(b = 0.052\)
\(c = 0.077\)
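Using the constants above, the EOTF takes the sRGB-style piecewise form (shown here as an informative reconstruction; the breakpoint is the sRGB value of approximately 0.04045):

\[L = \begin{cases} (a \times E + b)^{2.4} & \text{for } 0.04045 \leq E \leq 1 \\ c \times E & \text{for } 0 \leq E < 0.04045 \end{cases}\]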
32.5.3. Display-P3 OETF
Note
For most uses, the sRGB OETF is equivalent. |
32.5.4. Extended sRGB OETF
L - luminance of image is within [-0.6038, 7.5913].
E can be negative and/or > 1. That is how extended sRGB specifies colors outside the standard sRGB gamut. This means extended sRGB needs a floating point pixel format to cover the intended color range.
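An informative reconstruction of the extended sRGB OETF, which applies the sRGB curve over the extended range and mirrors it for negative values:

\[E = \begin{cases} 1.055 \times L^{\frac{1}{2.4}} - 0.055 & \text{for } 0.0031308 \leq L \leq 7.5913 \\ 12.92 \times L & \text{for } -0.0031308 < L < 0.0031308 \\ -\left(1.055 \times (-L)^{\frac{1}{2.4}} - 0.055\right) & \text{for } -0.6038 \leq L \leq -0.0031308 \end{cases}\]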
32.5.5. SMPTE 170M OETF
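For reference, in terms of the constants \(\alpha\) and \(\beta\) given below, the SMPTE 170M (Rec. 709) OETF is commonly given as:

\[E = \begin{cases} \alpha \times L^{0.45} - (\alpha - 1) & \text{for } \beta \leq L \leq 1 \\ 4.500 \times L & \text{for } 0 \leq L < \beta \end{cases}\]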
\(\alpha = 1.099\) and \(\beta = 0.018\) for systems with 10 or fewer bits per sample (the values given in Rec. 709)
\(\alpha = 1.0993\) and \(\beta = 0.0181\) for 12-bit per sample systems
32.5.6. SMPTE ST2084 OETF (Inverse-EOTF)
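For reference, the ST2084 (PQ) inverse EOTF is commonly given as:

\[E = \left( \frac{c_1 + c_2 \times L^{m_1}}{1 + c_3 \times L^{m_1}} \right)^{m_2}\]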
where:
\(m_1 = 2610 / 4096 \times \frac{1}{4} = 0.1593017578125\)
\(m_2 = 2523 / 4096 \times 128 = 78.84375\)
\(c_1 = 3424 / 4096 = 0.8359375 = c_3 - c_2 + 1\)
\(c_2 = 2413 / 4096 \times 32 = 18.8515625\)
\(c_3 = 2392 / 4096 \times 32 = 18.6875\)
32.5.7. Hybrid Log Gamma (HLG)
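For reference, in terms of the symbols and constants defined below, the HLG OETF is commonly given as:

\[E = \begin{cases} r \sqrt{L} & \text{for } 0 \leq L \leq 1 \\ a \times \ln(L - b) + c & \text{for } 1 < L \end{cases}\]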
L — is the signal normalized by the reference white level
r — is the reference white level and has a signal value of 0.5
a = 0.17883277 and b = 0.28466892 and c = 0.55991073
32.5.8. Adobe RGB (1998) OETF
\(E = L^\frac{1}{2.19921875}\)
32.5.9. Gamma 2.6 OETF
\(E = L^\frac{1}{2.6}\)
An implementation supporting this extension indicates support for these color spaces via VkSurfaceFormatKHR structures returned from vkGetPhysicalDeviceSurfaceFormatsKHR.
Specifying the supported surface color space when calling vkCreateSwapchainKHR will create a swapchain using that color space.
Vulkan requires that all implementations support the sRGB Opto-Electrical Transfer Function (OETF) and Electro-optical transfer function (EOTF) when using an SRGB pixel format. Other transfer functions, such as SMPTE 170M, must not be performed by the implementation, but can be performed by the application shader.
If pSurfaceFormats
includes an entry whose value for colorSpace
is VK_COLOR_SPACE_SRGB_NONLINEAR_KHR
and whose value for format
is a UNORM (or SRGB) format and the corresponding SRGB (or UNORM) format is
a color renderable format for VK_IMAGE_TILING_OPTIMAL
, then
pSurfaceFormats
must also contain an entry with the same value for
colorSpace
and format
equal to the corresponding SRGB (or UNORM)
format.
Note
If pSurfaceFormats includes just one entry, whose value for format is VK_FORMAT_UNDEFINED, surface has no preferred format. In this case, the application can use any valid VkFormat value.
Note
In the initial release of the VK_KHR_surface and VK_KHR_swapchain extensions, the token VK_COLORSPACE_SRGB_NONLINEAR_KHR was used. It remains available as an alias of VK_COLOR_SPACE_SRGB_NONLINEAR_KHR for backwards compatibility.
To query the supported presentation modes for a surface, call:
VkResult vkGetPhysicalDeviceSurfacePresentModesKHR(
VkPhysicalDevice physicalDevice,
VkSurfaceKHR surface,
uint32_t* pPresentModeCount,
VkPresentModeKHR* pPresentModes);
- physicalDevice is the physical device that will be associated with the swapchain to be created, as described for vkCreateSwapchainKHR.
- surface is the surface that will be associated with the swapchain.
- pPresentModeCount is a pointer to an integer related to the number of presentation modes available or queried, as described below.
- pPresentModes is either NULL or a pointer to an array of VkPresentModeKHR values, indicating the supported presentation modes.
If pPresentModes
is NULL
, then the number of presentation modes
supported for the given surface
is returned in
pPresentModeCount
.
Otherwise, pPresentModeCount
must point to a variable set by the user
to the number of elements in the pPresentModes
array, and on return
the variable is overwritten with the number of values actually written to
pPresentModes
.
If the value of pPresentModeCount
is less than the number of
presentation modes supported, at most pPresentModeCount
values will be
written.
If pPresentModeCount
is smaller than the number of presentation modes
supported for the given surface
, VK_INCOMPLETE
will be returned
instead of VK_SUCCESS
to indicate that not all the available values
were returned.
Possible values of elements of the
vkGetPhysicalDeviceSurfacePresentModesKHR::pPresentModes
array,
indicating the supported presentation modes for a surface, are:
typedef enum VkPresentModeKHR {
VK_PRESENT_MODE_IMMEDIATE_KHR = 0,
VK_PRESENT_MODE_MAILBOX_KHR = 1,
VK_PRESENT_MODE_FIFO_KHR = 2,
VK_PRESENT_MODE_FIFO_RELAXED_KHR = 3,
VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR = 1000111000,
VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR = 1000111001,
} VkPresentModeKHR;
- VK_PRESENT_MODE_IMMEDIATE_KHR specifies that the presentation engine does not wait for a vertical blanking period to update the current image, meaning this mode may result in visible tearing. No internal queuing of presentation requests is needed, as the requests are applied immediately.
- VK_PRESENT_MODE_MAILBOX_KHR specifies that the presentation engine waits for the next vertical blanking period to update the current image. Tearing cannot be observed. An internal single-entry queue is used to hold pending presentation requests. If the queue is full when a new presentation request is received, the new request replaces the existing entry, and any images associated with the prior entry become available for re-use by the application. One request is removed from the queue and processed during each vertical blanking period in which the queue is non-empty.
- VK_PRESENT_MODE_FIFO_KHR specifies that the presentation engine waits for the next vertical blanking period to update the current image. Tearing cannot be observed. An internal queue is used to hold pending presentation requests. New requests are appended to the end of the queue, and one request is removed from the beginning of the queue and processed during each vertical blanking period in which the queue is non-empty. This is the only value of presentMode that is required to be supported.
- VK_PRESENT_MODE_FIFO_RELAXED_KHR specifies that the presentation engine generally waits for the next vertical blanking period to update the current image. If a vertical blanking period has already passed since the last update of the current image then the presentation engine does not wait for another vertical blanking period for the update, meaning this mode may result in visible tearing in this case. This mode is useful for reducing visual stutter with an application that will mostly present a new image before the next vertical blanking period, but may occasionally be late, and present a new image just after the next vertical blanking period. An internal queue is used to hold pending presentation requests. New requests are appended to the end of the queue, and one request is removed from the beginning of the queue and processed during or after each vertical blanking period in which the queue is non-empty.
- VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR specifies that the presentation engine and application have concurrent access to a single image, which is referred to as a shared presentable image. The presentation engine is only required to update the current image after a new presentation request is received. Therefore the application must make a presentation request whenever an update is required. However, the presentation engine may update the current image at any point, meaning this mode may result in visible tearing.
- VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR specifies that the presentation engine and application have concurrent access to a single image, which is referred to as a shared presentable image. The presentation engine periodically updates the current image on its regular refresh cycle. The application is only required to make one initial presentation request, after which the presentation engine must update the current image without any need for further presentation requests. The application can indicate the image contents have been updated by making a presentation request, but this does not guarantee the timing of when it will be updated. This mode may result in visible tearing if rendering to the image is not timed correctly.
The supported VkImageUsageFlagBits of the presentable images of a swapchain created for a surface may differ depending on the presentation mode, and can be determined as per the table below:
Presentation mode | Image usage flags |
---|---|
VK_PRESENT_MODE_IMMEDIATE_KHR | VkSurfaceCapabilitiesKHR::supportedUsageFlags |
VK_PRESENT_MODE_MAILBOX_KHR | VkSurfaceCapabilitiesKHR::supportedUsageFlags |
VK_PRESENT_MODE_FIFO_KHR | VkSurfaceCapabilitiesKHR::supportedUsageFlags |
VK_PRESENT_MODE_FIFO_RELAXED_KHR | VkSurfaceCapabilitiesKHR::supportedUsageFlags |
VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR | VkSharedPresentSurfaceCapabilitiesKHR::sharedPresentSupportedUsageFlags |
VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR | VkSharedPresentSurfaceCapabilitiesKHR::sharedPresentSupportedUsageFlags |
Note
For reference, the mode indicated by VK_PRESENT_MODE_FIFO_KHR is equivalent to the behavior of {wgl|glX|egl}SwapBuffers with a swap interval of 1, while the mode indicated by VK_PRESENT_MODE_FIFO_RELAXED_KHR is equivalent to the behavior of {wgl|glX}SwapBuffers with a swap interval of -1 (from the {WGL|GLX}_EXT_swap_control_tear extensions).
32.6. Device Group Queries
A logical device that represents multiple physical devices may support presenting from images on more than one physical device, or combining images from multiple physical devices.
To query these capabilities, call:
VkResult vkGetDeviceGroupPresentCapabilitiesKHR(
VkDevice device,
VkDeviceGroupPresentCapabilitiesKHR* pDeviceGroupPresentCapabilities);
- device is the logical device.
- pDeviceGroupPresentCapabilities is a pointer to a structure of type VkDeviceGroupPresentCapabilitiesKHR that is filled with the logical device’s capabilities.
The VkDeviceGroupPresentCapabilitiesKHR
structure is defined as:
typedef struct VkDeviceGroupPresentCapabilitiesKHR {
VkStructureType sType;
const void* pNext;
uint32_t presentMask[VK_MAX_DEVICE_GROUP_SIZE];
VkDeviceGroupPresentModeFlagsKHR modes;
} VkDeviceGroupPresentCapabilitiesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- presentMask is an array of masks, where the mask at element i is non-zero if physical device i has a presentation engine, and where bit j is set in element i if physical device i can present swapchain images from physical device j. If element i is non-zero, then bit i must be set.
- modes is a bitmask of VkDeviceGroupPresentModeFlagBitsKHR indicating which device group presentation modes are supported.
modes
always has VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
set.
The present mode flags are also used when presenting an image, in
VkDeviceGroupPresentInfoKHR::mode
.
If a device group only includes a single physical device, then modes
must equal VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
.
Bits which may be set in
VkDeviceGroupPresentCapabilitiesKHR::modes
to indicate which
device group presentation modes are supported are:
typedef enum VkDeviceGroupPresentModeFlagBitsKHR {
VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR = 0x00000001,
VK_DEVICE_GROUP_PRESENT_MODE_REMOTE_BIT_KHR = 0x00000002,
VK_DEVICE_GROUP_PRESENT_MODE_SUM_BIT_KHR = 0x00000004,
VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_MULTI_DEVICE_BIT_KHR = 0x00000008,
} VkDeviceGroupPresentModeFlagBitsKHR;
- VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR specifies that any physical device with a presentation engine can present its own swapchain images.
- VK_DEVICE_GROUP_PRESENT_MODE_REMOTE_BIT_KHR specifies that any physical device with a presentation engine can present swapchain images from any physical device in its presentMask.
- VK_DEVICE_GROUP_PRESENT_MODE_SUM_BIT_KHR specifies that any physical device with a presentation engine can present the sum of swapchain images from any physical devices in its presentMask.
- VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_MULTI_DEVICE_BIT_KHR specifies that multiple physical devices with a presentation engine can each present their own swapchain images.
typedef VkFlags VkDeviceGroupPresentModeFlagsKHR;
VkDeviceGroupPresentModeFlagsKHR
is a bitmask type for setting a mask
of zero or more VkDeviceGroupPresentModeFlagBitsKHR.
Some surfaces may not be capable of using all the device group present modes.
To query the supported device group present modes for a particular surface, call:
VkResult vkGetDeviceGroupSurfacePresentModesKHR(
VkDevice device,
VkSurfaceKHR surface,
VkDeviceGroupPresentModeFlagsKHR* pModes);
- device is the logical device.
- surface is the surface.
- pModes is a pointer to a value of type VkDeviceGroupPresentModeFlagsKHR that is filled with the supported device group present modes for the surface.
The modes returned by this command are not invariant, and may change in response to the surface being moved, resized, or occluded. These modes must be a subset of the modes returned by vkGetDeviceGroupPresentCapabilitiesKHR.
When using VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_MULTI_DEVICE_BIT_KHR
,
the application may need to know which regions of the surface are used when
presenting locally on each physical device.
Presentation of swapchain images to this surface need only have valid
contents in the regions returned by this command.
To query a set of rectangles used in presentation on the physical device, call:
VkResult vkGetPhysicalDevicePresentRectanglesKHR(
VkPhysicalDevice physicalDevice,
VkSurfaceKHR surface,
uint32_t* pRectCount,
VkRect2D* pRects);
- physicalDevice is the physical device.
- surface is the surface.
- pRectCount is a pointer to an integer related to the number of rectangles available or queried, as described below.
- pRects is either NULL or a pointer to an array of VkRect2D structures.
If pRects
is NULL
, then the number of rectangles used when
presenting the given surface
is returned in pRectCount
.
Otherwise, pRectCount
must point to a variable set by the user to the
number of elements in the pRects
array, and on return the variable is
overwritten with the number of structures actually written to pRects
.
If the value of pRectCount
is less than the number of rectangles, at
most pRectCount
structures will be written.
If pRectCount
is smaller than the number of rectangles used for the
given surface
, VK_INCOMPLETE
will be returned instead of
VK_SUCCESS
to indicate that not all the available values were
returned.
The values returned by this command are not invariant, and may change in response to the surface being moved, resized, or occluded.
The rectangles returned by this command must not overlap.
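Note
An informal sketch of this query (not normative text), assuming physicalDevice and surface are valid, the usual headers are included, and error handling is omitted.

uint32_t rectCount = 0;
vkGetPhysicalDevicePresentRectanglesKHR(physicalDevice, surface, &rectCount, NULL);

VkRect2D* rects = malloc(rectCount * sizeof(VkRect2D));
vkGetPhysicalDevicePresentRectanglesKHR(physicalDevice, surface, &rectCount, rects);
/* When presenting with LOCAL_MULTI_DEVICE, only the regions in rects need
   valid contents for this physical device. */
free(rects);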
32.7. Display Timing Queries
Traditional game and real-time-animation applications frequently use
VK_PRESENT_MODE_FIFO_KHR
so that presentable images are updated during
the vertical blanking period of a given refresh cycle (RC) of the
presentation engine’s display.
This avoids the visual anomaly known as tearing.
However, synchronizing the presentation of images with the RC does not prevent all forms of visual anomalies. Stuttering occurs when the geometry for each presentable image is not accurately positioned for when that image will be displayed. The geometry may appear to move too little for some RCs, and too much for others. Sometimes the animation appears to freeze, when the same image is used for more than one RC.
In order to minimize stuttering, an application needs to correctly position
their geometry for when the presentable image will be displayed to the user.
To accomplish this, applications need various timing information about the
presentation engine’s display.
They need to know when presentable images were actually presented, and when
they could have been presented.
Applications also need to tell the presentation engine to display an image
no sooner than a given time.
This can allow the application’s animation to look smooth to the user, with
no stuttering.
The VK_GOOGLE_display_timing
extension allows an application to satisfy
these needs.
The presentation engine’s display typically refreshes the pixels that are displayed to the user on a periodic basis. The period may be fixed or variable. In many cases, the presentation engine is associated with fixed refresh rate (FRR) display technology, with a fixed refresh rate (RR, e.g. 60Hz). In some cases, the presentation engine is associated with variable refresh rate (VRR) display technology, where each refresh cycle (RC) can vary in length. This extension treats VRR displays as if they are FRR.
To query the duration of a refresh cycle (RC) for the presentation engine’s display, call:
VkResult vkGetRefreshCycleDurationGOOGLE(
VkDevice device,
VkSwapchainKHR swapchain,
VkRefreshCycleDurationGOOGLE* pDisplayTimingProperties);
- device is the device associated with swapchain.
- swapchain is the swapchain to obtain the refresh duration for.
- pDisplayTimingProperties is a pointer to an instance of the VkRefreshCycleDurationGOOGLE structure.
The VkRefreshCycleDurationGOOGLE
structure is defined as:
typedef struct VkRefreshCycleDurationGOOGLE {
uint64_t refreshDuration;
} VkRefreshCycleDurationGOOGLE;
- refreshDuration is the number of nanoseconds from the start of one refresh cycle to the next.
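As an informal example, an application might obtain refreshDuration as follows; the helper name is hypothetical and VK_GOOGLE_display_timing is assumed to be enabled.

#include <vulkan/vulkan.h>

/* Hypothetical helper: returns the refresh cycle duration in nanoseconds,
 * or 0 if it could not be queried. A target IPD is normally chosen as a
 * multiple of this value. */
static uint64_t query_refresh_duration_ns(VkDevice device, VkSwapchainKHR swapchain)
{
    VkRefreshCycleDurationGOOGLE props;
    if (vkGetRefreshCycleDurationGOOGLE(device, swapchain, &props) != VK_SUCCESS)
        return 0;
    return props.refreshDuration;
}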
Note
The rate at which an application renders and presents new images is known as
the image present rate (IPR, aka frame rate).
The inverse of IPR, or the duration between each image present, is the image
present duration (IPD).
In order to provide a smooth, stutter-free animation, an application will
want its IPD to be a multiple of refreshDuration.
In order to determine a target IPD for a display (i.e. a multiple of
refreshDuration), the application can use vkGetRefreshCycleDurationGOOGLE to obtain refreshDuration.
Adjustments to an application’s IPD may be needed because different views of
an application’s geometry can take different amounts of time to render.
For example, looking at the sky may take less time to render than looking at
multiple, complex items in a room.
In general, it is good to not frequently change IPD, as that can cause
visual anomalies.
Adjustments to a larger IPD because of late images should happen quickly,
but adjustments to a smaller IPD should only happen if the timing
information consistently indicates that the smaller IPD can be sustained.
The implementation will maintain a limited amount of history of timing
information about previous presents.
Because of the asynchronous nature of the presentation engine, the timing
information for a given vkQueuePresentKHR command will become
available some time later.
These time values can be asynchronously queried, and will be returned if
available.
All time values are in nanoseconds, relative to a monotonically-increasing
clock (e.g. CLOCK_MONOTONIC
(see clock_gettime(2)) on Android and Linux).
To asynchronously query the presentation engine, for newly-available timing information about one or more previous presents to a given swapchain, call:
VkResult vkGetPastPresentationTimingGOOGLE(
VkDevice device,
VkSwapchainKHR swapchain,
uint32_t* pPresentationTimingCount,
VkPastPresentationTimingGOOGLE* pPresentationTimings);
- device is the device associated with swapchain.
- swapchain is the swapchain to obtain presentation timing information for.
- pPresentationTimingCount is a pointer to an integer related to the number of VkPastPresentationTimingGOOGLE structures to query, as described below.
- pPresentationTimings is either NULL or a pointer to an array of VkPastPresentationTimingGOOGLE structures.
If pPresentationTimings
is NULL
, then the number of newly-available
timing records for the given swapchain
is returned in
pPresentationTimingCount
.
Otherwise, pPresentationTimingCount
must point to a variable set by
the user to the number of elements in the pPresentationTimings
array,
and on return the variable is overwritten with the number of structures
actually written to pPresentationTimings
.
If the value of pPresentationTimingCount
is less than the number of
newly-available timing records, at most pPresentationTimingCount
structures will be written.
If pPresentationTimingCount
is smaller than the number of
newly-available timing records for the given swapchain
,
VK_INCOMPLETE
will be returned instead of VK_SUCCESS
to indicate
that not all the available values were returned.
The VkPastPresentationTimingGOOGLE
structure is defined as:
typedef struct VkPastPresentationTimingGOOGLE {
uint32_t presentID;
uint64_t desiredPresentTime;
uint64_t actualPresentTime;
uint64_t earliestPresentTime;
uint64_t presentMargin;
} VkPastPresentationTimingGOOGLE;
- presentID is an application-provided value that was given to a previous vkQueuePresentKHR command via VkPresentTimeGOOGLE::presentID (see below). It can be used to uniquely identify a previous present with the vkQueuePresentKHR command.
- desiredPresentTime is an application-provided value that was given to a previous vkQueuePresentKHR command via VkPresentTimeGOOGLE::desiredPresentTime. If non-zero, it was used by the application to indicate that an image not be presented any sooner than desiredPresentTime.
- actualPresentTime is the time when the image of the swapchain was actually displayed.
- earliestPresentTime is the time when the image of the swapchain could have been displayed. This may differ from actualPresentTime if the application requested that the image be presented no sooner than VkPresentTimeGOOGLE::desiredPresentTime.
- presentMargin is an indication of how early the vkQueuePresentKHR command was processed compared to how soon it needed to be processed, and still be presented at earliestPresentTime.
The results for a given swapchain
and presentID
are only
returned once from vkGetPastPresentationTimingGOOGLE
.
The application can use the VkPastPresentationTimingGOOGLE
values to
occasionally adjust its timing.
For example, if actualPresentTime
is later than expected (e.g. one
refreshDuration
late), the application may increase its target IPD to
a higher multiple of refreshDuration
(e.g. decrease its frame rate
from 60Hz to 30Hz).
If actualPresentTime
and earliestPresentTime
are consistently
different, and if presentMargin
is consistently large enough, the
application may decrease its target IPD to a smaller multiple of
refreshDuration
(e.g. increase its frame rate from 30Hz to 60Hz).
If actualPresentTime
and earliestPresentTime
are the same, and if
presentMargin
is consistently high, the application may delay the
start of its input-render-present loop in order to decrease the latency
between user input and the corresponding present (always leaving some margin
in case a new image takes longer to render than the previous image).
An application that desires its target IPD to always be the same as refreshDuration can also adjust features until actualPresentTime is never late and presentMargin is satisfactory.
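The following sketch is informative only and shows one possible way to consume the timing records with the two-call idiom and apply the kind of heuristic described above; the helper name, the target_ipd variable, and the specific adjustment policy are illustrative assumptions, not part of the extension.

#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Hypothetical helper: poll newly-available timing records and raise the
 * application's target image present duration (IPD) when images are at
 * least one refresh cycle late. */
static void update_target_ipd(VkDevice device, VkSwapchainKHR swapchain,
                              uint64_t refresh_duration, uint64_t *target_ipd)
{
    uint32_t count = 0;
    if (vkGetPastPresentationTimingGOOGLE(device, swapchain, &count, NULL) != VK_SUCCESS ||
        count == 0)
        return;
    VkPastPresentationTimingGOOGLE *timings =
        malloc(count * sizeof(VkPastPresentationTimingGOOGLE));
    if (timings == NULL)
        return;
    if (vkGetPastPresentationTimingGOOGLE(device, swapchain, &count, timings) >= VK_SUCCESS) {
        for (uint32_t i = 0; i < count; ++i) {
            /* Only records with a non-zero desiredPresentTime carry a
             * requested time to compare against. */
            if (timings[i].desiredPresentTime != 0 &&
                timings[i].actualPresentTime >=
                    timings[i].desiredPresentTime + refresh_duration)
                *target_ipd += refresh_duration; /* back off to a larger multiple */
        }
    }
    free(timings);
}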
The full VK_GOOGLE_display_timing
extension semantics are described for
swapchains created with VK_PRESENT_MODE_FIFO_KHR
.
For example, non-zero values of
VkPresentTimeGOOGLE
::desiredPresentTime
must be honored, and
vkGetPastPresentationTimingGOOGLE
should return a
VkPastPresentationTimingGOOGLE
structure with valid values for all
images presented with vkQueuePresentKHR
.
The semantics for other present modes are as follows:
- VK_PRESENT_MODE_IMMEDIATE_KHR. The presentation engine may ignore non-zero values of VkPresentTimeGOOGLE::desiredPresentTime in favor of presenting immediately. The value of VkPastPresentationTimingGOOGLE::earliestPresentTime must be the same as VkPastPresentationTimingGOOGLE::actualPresentTime, which should be when the presentation engine displayed the image.
- VK_PRESENT_MODE_MAILBOX_KHR. The intention of using this present mode with this extension is to handle cases where an image is presented late, and the next image is presented soon enough to replace it at the next vertical blanking period. For images that are displayed to the user, the value of VkPastPresentationTimingGOOGLE::actualPresentTime must be when the image was displayed. For images that are not displayed to the user, vkGetPastPresentationTimingGOOGLE may not return a VkPastPresentationTimingGOOGLE structure, or it may return a VkPastPresentationTimingGOOGLE structure with the value of zero for both VkPastPresentationTimingGOOGLE::actualPresentTime and VkPastPresentationTimingGOOGLE::earliestPresentTime. It is possible that an application can submit images with VkPresentTimeGOOGLE::desiredPresentTime values such that new images may not be displayed. For example, if VkPresentTimeGOOGLE::desiredPresentTime is far enough in the future that an image is not presented before vkQueuePresentKHR is called to present another image, the first image will not be displayed to the user. If the application continues to do that, the presentation engine may not display new images.
- VK_PRESENT_MODE_FIFO_RELAXED_KHR. For images that are presented in time to be displayed at the next vertical blanking period, the semantics are identical to those for VK_PRESENT_MODE_FIFO_KHR. For images that are presented late, and are displayed after the start of the vertical blanking period (i.e. with tearing), the values of VkPastPresentationTimingGOOGLE may be treated as if the image was displayed at the start of the vertical blanking period, or may be treated the same as for VK_PRESENT_MODE_IMMEDIATE_KHR.
32.8. WSI Swapchain
A swapchain object (a.k.a.
swapchain) provides the ability to present rendering results to a surface.
Swapchain objects are represented by VkSwapchainKHR
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSwapchainKHR)
A swapchain is an abstraction for an array of presentable images that are
associated with a surface.
The presentable images are represented by VkImage
objects created by
the platform.
One image (which can be an array image for multiview/stereoscopic-3D
surfaces) is displayed at a time, but multiple images can be queued for
presentation.
An application renders to the image, and then queues the image for
presentation to the surface.
A native window cannot be associated with more than one non-retired swapchain at a time. Further, swapchains cannot be created for native windows that have a non-Vulkan graphics API surface associated with them.
Note
The presentation engine is an abstraction for the platform’s compositor or display engine. The presentation engine may be synchronous or asynchronous with respect to the application and/or logical device. Some implementations may use the device’s graphics queue or dedicated presentation hardware to perform presentation.
The presentable images of a swapchain are owned by the presentation engine.
An application can acquire use of a presentable image from the presentation
engine.
Use of a presentable image must occur only after the image is returned by
vkAcquireNextImageKHR
, and before it is presented by
vkQueuePresentKHR
.
This includes transitioning the image layout and rendering commands.
An application can acquire use of a presentable image with
vkAcquireNextImageKHR
.
After acquiring a presentable image and before modifying it, the application
must use a synchronization primitive to ensure that the presentation engine
has finished reading from the image.
The application can then transition the image’s layout, queue rendering
commands to it, etc.
Finally, the application presents the image with vkQueuePresentKHR
,
which releases the acquisition of the image.
The presentation engine controls the order in which presentable images are acquired for use by the application.
Note
This allows the platform to handle situations which require out-of-order return of images after presentation. At the same time, it allows the application to generate command buffers referencing all of the images in the swapchain at initialization time, rather than in its main loop.
How this all works is described below.
If a swapchain is created with presentMode
set to either
VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR
or
VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR
, a single presentable
image can be acquired, referred to as a shared presentable image.
A shared presentable image may be concurrently accessed by the application
and the presentation engine, without transitioning the image’s layout after
it is initially presented.
- With VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR, the presentation engine is only required to update to the latest contents of a shared presentable image after a present. The application must call vkQueuePresentKHR to guarantee an update. However, the presentation engine may update from it at any time.
- With VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR, the presentation engine will automatically present the latest contents of a shared presentable image during every refresh cycle. The application is only required to make one initial call to vkQueuePresentKHR, after which the presentation engine will update from it without any need for further present calls. The application can indicate the image contents have been updated by calling vkQueuePresentKHR, but this does not guarantee the timing of when updates will occur.
The presentation engine may access a shared presentable image at any time after it is first presented. To avoid tearing, an application should coordinate access with the presentation engine. This requires presentation engine timing information through platform-specific mechanisms and ensuring that color attachment writes are made available during the portion of the presentation engine’s refresh cycle they are intended for.
In order to query a swapchain’s status when rendering to a shared presentable image, call:
VkResult vkGetSwapchainStatusKHR(
VkDevice device,
VkSwapchainKHR swapchain);
- device is the device associated with swapchain.
- swapchain is the swapchain to query.
The possible return values for vkGetSwapchainStatusKHR
should be
interpreted as follows:
- VK_SUCCESS specifies that the presentation engine is presenting the contents of the shared presentable image, as per the swapchain’s VkPresentModeKHR.
- VK_SUBOPTIMAL_KHR specifies that the swapchain no longer matches the surface properties exactly, but the presentation engine is presenting the contents of the shared presentable image, as per the swapchain’s VkPresentModeKHR.
- VK_ERROR_OUT_OF_DATE_KHR specifies that the surface has changed in such a way that it is no longer compatible with the swapchain.
- VK_ERROR_SURFACE_LOST_KHR specifies that the surface is no longer available.
Note
The swapchain state may be cached by implementations, so applications
should regularly call vkGetSwapchainStatusKHR.
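An informal polling sketch follows; the helper name and return convention are illustrative, and VK_KHR_shared_presentable_image is assumed to be enabled.

#include <vulkan/vulkan.h>

/* Hypothetical helper: when rendering to a shared presentable image, poll the
 * swapchain status and report whether the swapchain should be re-created. */
static int swapchain_needs_recreate(VkDevice device, VkSwapchainKHR swapchain)
{
    switch (vkGetSwapchainStatusKHR(device, swapchain)) {
    case VK_SUCCESS:
        return 0;   /* still presenting as requested */
    case VK_SUBOPTIMAL_KHR:
        return 1;   /* usable, but re-creation is advisable */
    case VK_ERROR_OUT_OF_DATE_KHR:
    case VK_ERROR_SURFACE_LOST_KHR:
    default:
        return 1;   /* must re-create the swapchain (or the surface) */
    }
}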
To create a swapchain, call:
VkResult vkCreateSwapchainKHR(
VkDevice device,
const VkSwapchainCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSwapchainKHR* pSwapchain);
- device is the device to create the swapchain for.
- pCreateInfo is a pointer to an instance of the VkSwapchainCreateInfoKHR structure specifying the parameters of the created swapchain.
- pAllocator is the allocator used for host memory allocated for the swapchain object when there is no more specific allocator available (see Memory Allocation).
- pSwapchain is a pointer to a VkSwapchainKHR handle in which the created swapchain object will be returned.
The VkSwapchainCreateInfoKHR
structure is defined as:
typedef struct VkSwapchainCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkSwapchainCreateFlagsKHR flags;
VkSurfaceKHR surface;
uint32_t minImageCount;
VkFormat imageFormat;
VkColorSpaceKHR imageColorSpace;
VkExtent2D imageExtent;
uint32_t imageArrayLayers;
VkImageUsageFlags imageUsage;
VkSharingMode imageSharingMode;
uint32_t queueFamilyIndexCount;
const uint32_t* pQueueFamilyIndices;
VkSurfaceTransformFlagBitsKHR preTransform;
VkCompositeAlphaFlagBitsKHR compositeAlpha;
VkPresentModeKHR presentMode;
VkBool32 clipped;
VkSwapchainKHR oldSwapchain;
} VkSwapchainCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkSwapchainCreateFlagBitsKHR indicating parameters of the swapchain creation.
- surface is the surface onto which the swapchain will present images. If the creation succeeds, the swapchain becomes associated with surface.
- minImageCount is the minimum number of presentable images that the application needs. The implementation will either create the swapchain with at least that many images, or it will fail to create the swapchain.
- imageFormat is a VkFormat value specifying the format the swapchain image(s) will be created with.
- imageColorSpace is a VkColorSpaceKHR value specifying the way the swapchain interprets image data.
- imageExtent is the size (in pixels) of the swapchain image(s). The behavior is platform-dependent if the image extent does not match the surface’s currentExtent as returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR.
Note
On some platforms, it is normal that maxImageExtent may become (0, 0), for example when the window is minimized. In such a case, it is not possible to create a swapchain due to the Valid Usage requirements.
- imageArrayLayers is the number of views in a multiview/stereo surface. For non-stereoscopic-3D applications, this value is 1.
- imageUsage is a bitmask of VkImageUsageFlagBits describing the intended usage of the (acquired) swapchain images.
- imageSharingMode is the sharing mode used for the image(s) of the swapchain.
- queueFamilyIndexCount is the number of queue families having access to the image(s) of the swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT.
- pQueueFamilyIndices is an array of queue family indices having access to the image(s) of the swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT.
- preTransform is a VkSurfaceTransformFlagBitsKHR value describing the transform, relative to the presentation engine’s natural orientation, applied to the image content prior to presentation. If it does not match the currentTransform value returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR, the presentation engine will transform the image content as part of the presentation operation.
- compositeAlpha is a VkCompositeAlphaFlagBitsKHR value indicating the alpha compositing mode to use when this surface is composited together with other surfaces on certain window systems.
- presentMode is the presentation mode the swapchain will use. A swapchain’s present mode determines how incoming present requests will be processed and queued internally.
- clipped specifies whether the Vulkan implementation is allowed to discard rendering operations that affect regions of the surface that are not visible.
  - If set to VK_TRUE, the presentable images associated with the swapchain may not own all of their pixels. Pixels in the presentable images that correspond to regions of the target surface obscured by another window on the desktop, or subject to some other clipping mechanism, will have undefined content when read back. Pixel shaders may not execute for these pixels, and thus any side effects they would have had will not occur. A value of VK_TRUE does not guarantee any clipping will occur, but allows more optimal presentation methods to be used on some platforms.
  - If set to VK_FALSE, presentable images associated with the swapchain will own all of the pixels they contain.
Note
Applications should set this value to VK_TRUE if they do not expect to read back the content of presentable images before presenting them or after reacquiring them, and if their pixel shaders do not have any side effects that require them to run for all pixels in the presentable image.
- oldSwapchain is VK_NULL_HANDLE, or the existing non-retired swapchain currently associated with surface. Providing a valid oldSwapchain may aid in resource reuse, and also allows the application to still present any images that are already acquired from it.
Upon calling vkCreateSwapchainKHR
with an oldSwapchain
that is
not VK_NULL_HANDLE, oldSwapchain
is retired — even if creation
of the new swapchain fails.
The new swapchain is created in the non-retired state whether or not
oldSwapchain
is VK_NULL_HANDLE.
Upon calling vkCreateSwapchainKHR
with an oldSwapchain
that is
not VK_NULL_HANDLE, any images from oldSwapchain
that are not
acquired by the application may be freed by the implementation, which may
occur even if creation of the new swapchain fails.
The application can destroy oldSwapchain
to free all memory
associated with oldSwapchain
.
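The following informative sketch creates a swapchain from values previously obtained with the surface query commands; the helper name is hypothetical, and the usage, sharing mode, and composite alpha choices are illustrative assumptions rather than requirements.

#include <vulkan/vulkan.h>

/* Hypothetical helper: create a swapchain from previously queried surface
 * capabilities. Assumes caps->currentExtent holds a usable extent. */
static VkResult create_swapchain(VkDevice device, VkSurfaceKHR surface,
                                 const VkSurfaceCapabilitiesKHR *caps,
                                 VkSurfaceFormatKHR fmt, VkPresentModeKHR mode,
                                 VkSwapchainKHR oldSwapchain,
                                 VkSwapchainKHR *pSwapchain)
{
    /* Request one image more than the minimum, clamped to the maximum
     * (a maxImageCount of 0 means "no limit"). */
    uint32_t imageCount = caps->minImageCount + 1;
    if (caps->maxImageCount != 0 && imageCount > caps->maxImageCount)
        imageCount = caps->maxImageCount;

    VkSwapchainCreateInfoKHR info = {
        .sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR,
        .surface = surface,
        .minImageCount = imageCount,
        .imageFormat = fmt.format,
        .imageColorSpace = fmt.colorSpace,
        .imageExtent = caps->currentExtent,
        .imageArrayLayers = 1,                          /* non-stereoscopic */
        .imageUsage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
        .imageSharingMode = VK_SHARING_MODE_EXCLUSIVE,
        .preTransform = caps->currentTransform,
        .compositeAlpha = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR,
        .presentMode = mode,
        .clipped = VK_TRUE,
        .oldSwapchain = oldSwapchain,   /* VK_NULL_HANDLE on first creation */
    };
    return vkCreateSwapchainKHR(device, &info, NULL, pSwapchain);
}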
Note
Multiple retired swapchains can be associated with the same VkSurfaceKHR.
After oldSwapchain is retired, the application can continue to use a shared presentable image obtained from oldSwapchain until a presentable image is acquired from the new swapchain.
Bits which can be set in VkSwapchainCreateInfoKHR::flags
,
specifying parameters of swapchain creation, are:
typedef enum VkSwapchainCreateFlagBitsKHR {
VK_SWAPCHAIN_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT_KHR = 0x00000001,
VK_SWAPCHAIN_CREATE_PROTECTED_BIT_KHR = 0x00000002,
VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR = 0x00000004,
} VkSwapchainCreateFlagBitsKHR;
- VK_SWAPCHAIN_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT_KHR specifies that images created from the swapchain (i.e. with the swapchain member of VkImageSwapchainCreateInfoKHR set to this swapchain’s handle) must use VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT.
- VK_SWAPCHAIN_CREATE_PROTECTED_BIT_KHR specifies that images created from the swapchain are protected images.
- VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR specifies that the images of the swapchain can be used to create a VkImageView with a different format than what the swapchain was created with. The list of allowed image view formats is specified by chaining an instance of the VkImageFormatListCreateInfoKHR structure to the pNext chain of VkSwapchainCreateInfoKHR. In addition, this flag also specifies that the swapchain can be created with usage flags that are not supported for the format the swapchain is created with but are supported for at least one of the allowed image view formats.
typedef VkFlags VkSwapchainCreateFlagsKHR;
VkSwapchainCreateFlagsKHR
is a bitmask type for setting a mask of zero
or more VkSwapchainCreateFlagBitsKHR.
If the pNext
chain of VkSwapchainCreateInfoKHR includes a
VkDeviceGroupSwapchainCreateInfoKHR
structure, then that structure
includes a set of device group present modes that the swapchain can be used
with.
The VkDeviceGroupSwapchainCreateInfoKHR
structure is defined as:
typedef struct VkDeviceGroupSwapchainCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkDeviceGroupPresentModeFlagsKHR modes;
} VkDeviceGroupSwapchainCreateInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- modes is a bitfield of modes that the swapchain can be used with.
If this structure is not present, modes
is considered to be
VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
.
To enable surface counters when creating a swapchain, add
VkSwapchainCounterCreateInfoEXT
to the pNext
chain of
VkSwapchainCreateInfoKHR.
VkSwapchainCounterCreateInfoEXT
is defined as:
typedef struct VkSwapchainCounterCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkSurfaceCounterFlagsEXT surfaceCounters;
} VkSwapchainCounterCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- surfaceCounters is a bitmask of VkSurfaceCounterFlagBitsEXT specifying surface counters to enable for the swapchain.
The requested counters become active when the first presentation command for the associated swapchain is processed by the presentation engine. To query the value of an active counter, use:
VkResult vkGetSwapchainCounterEXT(
VkDevice device,
VkSwapchainKHR swapchain,
VkSurfaceCounterFlagBitsEXT counter,
uint64_t* pCounterValue);
- device is the VkDevice associated with swapchain.
- swapchain is the swapchain from which to query the counter value.
- counter is the counter to query.
- pCounterValue will return the current value of the counter.
If a counter is not available because the swapchain is out of date, the
implementation may return VK_ERROR_OUT_OF_DATE_KHR
.
As mentioned above, if vkCreateSwapchainKHR
succeeds, it will return a
handle to a swapchain that contains an array of at least minImageCount
presentable images.
While acquired by the application, presentable images can be used in any
way that equivalent non-presentable images can be used.
A presentable image is equivalent to a non-presentable image created with
the following VkImageCreateInfo
parameters:
VkImageCreateInfo Field | Value
---|---
flags | VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT is set if VK_SWAPCHAIN_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT_KHR is set; VK_IMAGE_CREATE_PROTECTED_BIT is set if VK_SWAPCHAIN_CREATE_PROTECTED_BIT_KHR is set; VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT and VK_IMAGE_CREATE_EXTENDED_USAGE_BIT are set if VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR is set; all other bits are unset
imageType | VK_IMAGE_TYPE_2D
format | pCreateInfo->imageFormat
extent | {pCreateInfo->imageExtent.width, pCreateInfo->imageExtent.height, 1}
mipLevels | 1
arrayLayers | pCreateInfo->imageArrayLayers
samples | VK_SAMPLE_COUNT_1_BIT
tiling | VK_IMAGE_TILING_OPTIMAL
usage | pCreateInfo->imageUsage
sharingMode | pCreateInfo->imageSharingMode
queueFamilyIndexCount | pCreateInfo->queueFamilyIndexCount
pQueueFamilyIndices | pCreateInfo->pQueueFamilyIndices
initialLayout | VK_IMAGE_LAYOUT_UNDEFINED
The surface
must not be destroyed until after the swapchain is
destroyed.
If oldSwapchain
is VK_NULL_HANDLE, and the native window
referred to by surface
is already associated with a Vulkan swapchain,
VK_ERROR_NATIVE_WINDOW_IN_USE_KHR
must be returned.
If the native window referred to by surface
is already associated with
a non-Vulkan graphics API surface, VK_ERROR_NATIVE_WINDOW_IN_USE_KHR
must be returned.
The native window referred to by surface
must not become associated
with a non-Vulkan graphics API surface before all associated Vulkan
swapchains have been destroyed.
Like core functions, several WSI functions, including
vkCreateSwapchainKHR
return VK_ERROR_DEVICE_LOST
if the logical
device was lost.
See Lost Device.
As with most core objects, VkSwapchainKHR
is a child of the device and
is affected by the lost state; it must be destroyed before destroying the
VkDevice
.
However, VkSurfaceKHR
is not a child of any VkDevice
and is not
otherwise affected by the lost device.
After successfully recreating a VkDevice
, the same VkSurfaceKHR
can be used to create a new VkSwapchainKHR
, provided the previous one
was destroyed.
Note
As mentioned in Lost Device, after a lost device event, the VkPhysicalDevice may also be lost.
To destroy a swapchain object call:
void vkDestroySwapchainKHR(
VkDevice device,
VkSwapchainKHR swapchain,
const VkAllocationCallbacks* pAllocator);
- device is the VkDevice associated with swapchain.
- swapchain is the swapchain to destroy.
- pAllocator is the allocator used for host memory allocated for the swapchain object when there is no more specific allocator available (see Memory Allocation).
The application must not destroy a swapchain until after completion of all
outstanding operations on images that were acquired from the swapchain.
swapchain
and all associated VkImage
handles are destroyed, and
must not be acquired or used any more by the application.
The memory of each VkImage
will only be freed after that image is no
longer used by the presentation engine.
For example, if one image of the swapchain is being displayed in a window,
the memory for that image may not be freed until the window is destroyed,
or another swapchain is created for the window.
Destroying the swapchain does not invalidate the parent VkSurfaceKHR
,
and a new swapchain can be created with it.
When a swapchain associated with a display surface is destroyed, if the image most recently presented to the display surface is from the swapchain being destroyed, then either any display resources modified by presenting images from any swapchain associated with the display surface must be reverted by the implementation to their state prior to the first present performed on one of these swapchains, or such resources must be left in their current state.
To obtain the array of presentable images associated with a swapchain, call:
VkResult vkGetSwapchainImagesKHR(
VkDevice device,
VkSwapchainKHR swapchain,
uint32_t* pSwapchainImageCount,
VkImage* pSwapchainImages);
- device is the device associated with swapchain.
- swapchain is the swapchain to query.
- pSwapchainImageCount is a pointer to an integer related to the number of presentable images available or queried, as described below.
- pSwapchainImages is either NULL or a pointer to an array of VkImage handles.
If pSwapchainImages
is NULL
, then the number of presentable images
for swapchain
is returned in pSwapchainImageCount
.
Otherwise, pSwapchainImageCount
must point to a variable set by the
user to the number of elements in the pSwapchainImages
array, and on
return the variable is overwritten with the number of structures actually
written to pSwapchainImages
.
If the value of pSwapchainImageCount
is less than the number of
presentable images for swapchain
, at most pSwapchainImageCount
structures will be written.
If pSwapchainImageCount
is smaller than the number of presentable
images for swapchain
, VK_INCOMPLETE
will be returned instead of
VK_SUCCESS
to indicate that not all the available values were
returned.
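An informal sketch of this two-call pattern follows; the helper name is illustrative, and the returned VkImage handles remain owned by the swapchain.

#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Hypothetical helper: retrieve the presentable images. The caller owns the
 * returned array; the images themselves must not be destroyed with
 * vkDestroyImage. */
static VkImage *get_swapchain_images(VkDevice device, VkSwapchainKHR swapchain,
                                     uint32_t *pCount)
{
    *pCount = 0;
    if (vkGetSwapchainImagesKHR(device, swapchain, pCount, NULL) != VK_SUCCESS)
        return NULL;
    VkImage *images = malloc(*pCount * sizeof(VkImage));
    if (images == NULL)
        return NULL;
    if (vkGetSwapchainImagesKHR(device, swapchain, pCount, images) < VK_SUCCESS) {
        free(images);
        return NULL;
    }
    return images;
}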
Note
By knowing all presentable images used in the swapchain, the application can create command buffers that reference these images prior to entering its main rendering loop.
The implementation will have already allocated and bound the memory backing
the VkImage
objects returned by vkGetSwapchainImagesKHR
.
The memory for each image will not alias with the memory for other images or
with any VkDeviceMemory
object.
As such, performing any operation affecting the binding of memory to a
presentable image results in undefined behavior.
All presentable images are initially in the VK_IMAGE_LAYOUT_UNDEFINED
layout, thus before using presentable images, the application must
transition them to a valid layout for the intended use.
Further, the lifetime of presentable images is controlled by the implementation so destroying a presentable image with vkDestroyImage results in undefined behavior. See vkDestroySwapchainKHR for further details on the lifetime of presentable images.
Images can also be created by using vkCreateImage with
VkImageSwapchainCreateInfoKHR and bound to swapchain memory using
vkBindImageMemory2KHR with VkBindImageMemorySwapchainInfoKHR.
These images can be used anywhere swapchain images are used, and are useful
in logical devices with multiple physical devices to create peer memory
bindings of swapchain memory.
These images and bindings have no effect on what memory is presented.
Unlike images retrieved from vkGetSwapchainImagesKHR
, these images
must be destroyed with vkDestroyImage.
To acquire an available presentable image to use, and retrieve the index of that image, call:
VkResult vkAcquireNextImageKHR(
VkDevice device,
VkSwapchainKHR swapchain,
uint64_t timeout,
VkSemaphore semaphore,
VkFence fence,
uint32_t* pImageIndex);
- device is the device associated with swapchain.
- swapchain is the non-retired swapchain from which an image is being acquired.
- timeout specifies how long the function waits, in nanoseconds, if no image is available.
- semaphore is VK_NULL_HANDLE or a semaphore to signal.
- fence is VK_NULL_HANDLE or a fence to signal.
- pImageIndex is a pointer to a uint32_t that is set to the index of the next image to use (i.e. an index into the array of images returned by vkGetSwapchainImagesKHR).
When successful, vkAcquireNextImageKHR
acquires a presentable image
from swapchain
that an application can use, and sets
pImageIndex
to the index of that image within the swapchain.
The presentation engine may not have finished reading from the image at the
time it is acquired, so the application must use semaphore
and/or
fence
to ensure that the image layout and contents are not modified
until the presentation engine reads have completed.
The order in which images are acquired is implementation-dependent, and may
be different than the order the images were presented.
If timeout
is zero, then vkAcquireNextImageKHR
does not wait,
and will either successfully acquire an image, or fail and return
VK_NOT_READY
if no image is available.
If the specified timeout period expires before an image is acquired,
vkAcquireNextImageKHR
returns VK_TIMEOUT
.
If timeout
is UINT64_MAX
, the timeout period is treated as
infinite, and vkAcquireNextImageKHR
will block until an image is
acquired or an error occurs.
An image will eventually be acquired if the number of images that the
application has currently acquired (but not yet presented) is less than or
equal to the difference between the number of images in swapchain
and
the value of VkSurfaceCapabilitiesKHR::minImageCount
.
If the number of currently acquired images is greater than this,
vkAcquireNextImageKHR
should not be called; if it is, timeout
must not be UINT64_MAX
.
If an image is acquired successfully, vkAcquireNextImageKHR
must
either return VK_SUCCESS
, or VK_SUBOPTIMAL_KHR
if the swapchain
no longer matches the surface properties exactly, but can still be used for
presentation.
Note
This may happen, for example, if the platform surface has been resized but the platform is able to scale the presented images to the new size to produce valid surface updates. It is up to the application to decide whether it prefers to continue using the current swapchain in this state, or to re-create the swapchain to better match the platform surface properties.
If the swapchain images no longer match native surface properties, either
VK_SUBOPTIMAL_KHR
or VK_ERROR_OUT_OF_DATE_KHR
must be returned.
If VK_ERROR_OUT_OF_DATE_KHR
is returned, no image is acquired and
attempts to present previously acquired images to the swapchain will also
fail with VK_ERROR_OUT_OF_DATE_KHR
.
Applications need to create a new swapchain for the surface to continue
presenting if VK_ERROR_OUT_OF_DATE_KHR
is returned.
If device loss occurs (see Lost Device) before
the timeout has expired, vkAcquireNextImageKHR
must return in finite
time with either one of the allowed success codes, or
VK_ERROR_DEVICE_LOST
.
If semaphore
is not VK_NULL_HANDLE, the semaphore must be
unsignaled, with no signal or wait operations pending.
It will become signaled when the application can use the image.
Note
Use of semaphore and fence allows rendering operations to be recorded and submitted before the presentation engine has completed its use of the image.
If fence
is not equal to VK_NULL_HANDLE, the fence must be
unsignaled, with no signal operations pending.
It will become signaled when the application can use the image.
Note
Applications should not rely on vkAcquireNextImageKHR blocking in order to meter their rendering speed.
An application must wait until either the semaphore
or fence
is
signaled before accessing the image’s data.
Note
When the presentable image will be accessed by some stage S, the recommended idiom for ensuring correct synchronization is:
- The VkSubmitInfo used to submit the image layout transition for execution includes vkAcquireNextImageKHR::semaphore in its pWaitSemaphores member, with the corresponding element of pWaitDstStageMask including S.
- The synchronization command that performs any necessary image layout transition includes S in both the srcStageMask and dstStageMask.
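A minimal, informative sketch of the submission half of this idiom, with S chosen as VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT; the helper name and semaphore roles are illustrative assumptions.

#include <vulkan/vulkan.h>

/* Hypothetical helper: submit the command buffer that contains the image
 * layout transition and rendering, waiting on the acquire semaphore at
 * stage S and signaling a semaphore for the subsequent present. */
static VkResult submit_first_use(VkQueue queue, VkCommandBuffer cmd,
                                 VkSemaphore acquireSemaphore,
                                 VkSemaphore renderDoneSemaphore, VkFence fence)
{
    VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    VkSubmitInfo submit = {
        .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
        .waitSemaphoreCount = 1,
        .pWaitSemaphores = &acquireSemaphore,      /* from vkAcquireNextImageKHR */
        .pWaitDstStageMask = &waitStage,           /* the stage S used above     */
        .commandBufferCount = 1,
        .pCommandBuffers = &cmd,
        .signalSemaphoreCount = 1,
        .pSignalSemaphores = &renderDoneSemaphore, /* waited on by the present   */
    };
    return vkQueueSubmit(queue, 1, &submit, fence);
}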
After a successful return, the image indicated by pImageIndex
and its
data will be unmodified compared to when it was presented.
Note
Exclusive ownership of presentable images corresponding to a swapchain
created with VK_SHARING_MODE_EXCLUSIVE, as defined in Resource Sharing, is not altered by a call to vkAcquireNextImageKHR.
The possible return values for vkAcquireNextImageKHR
depend on the
timeout
provided:
- VK_SUCCESS is returned if an image became available.
- VK_ERROR_SURFACE_LOST_KHR is returned if the surface is no longer available.
- VK_NOT_READY is returned if timeout is zero and no image was available.
- VK_TIMEOUT is returned if timeout is greater than zero and less than UINT64_MAX, and no image became available within the time allowed.
- VK_SUBOPTIMAL_KHR is returned if an image became available, and the swapchain no longer matches the surface properties exactly, but can still be used to present to the surface successfully.
Note
This may happen, for example, if the platform surface has been resized but the platform is able to scale the presented images to the new size to produce valid surface updates. It is up to the application to decide whether it prefers to continue using the current swapchain indefinitely or temporarily in this state, or to re-create the swapchain to better match the platform surface properties.
- VK_ERROR_OUT_OF_DATE_KHR is returned if the surface has changed in such a way that it is no longer compatible with the swapchain, and further presentation requests using the swapchain will fail. Applications must query the new surface properties and recreate their swapchain if they wish to continue presenting to the surface.
If the native surface and presented image sizes no longer match,
presentation may fail.
If presentation does succeed, the mapping from the presented image to the
native surface is implementation-defined.
It is the application’s responsibility to detect surface size changes and
react appropriately.
If presentation fails because of a mismatch in the surface and presented
image sizes, a VK_ERROR_OUT_OF_DATE_KHR
error will be returned.
Note
For example, consider a 4x3 window/surface that gets resized to be 3x4 (taller than wider). On some window systems, the portion of the window/surface that was previously and still is visible (the 3x3 part) will contain the same contents as before, while the remaining parts of the window will have undefined contents. Other window systems may squash/stretch the image to fill the new window size without any undefined contents, or apply some other mapping.
To acquire an available presentable image to use, and retrieve the index of that image, call:
VkResult vkAcquireNextImage2KHR(
VkDevice device,
const VkAcquireNextImageInfoKHR* pAcquireInfo,
uint32_t* pImageIndex);
- device is the device associated with swapchain.
- pAcquireInfo is a pointer to a structure of type VkAcquireNextImageInfoKHR containing parameters of the acquire.
- pImageIndex is a pointer to a uint32_t that is set to the index of the next image to use.
The VkAcquireNextImageInfoKHR
structure is defined as:
typedef struct VkAcquireNextImageInfoKHR {
VkStructureType sType;
const void* pNext;
VkSwapchainKHR swapchain;
uint64_t timeout;
VkSemaphore semaphore;
VkFence fence;
uint32_t deviceMask;
} VkAcquireNextImageInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- swapchain is a non-retired swapchain from which an image is acquired.
- timeout specifies how long the function waits, in nanoseconds, if no image is available.
- semaphore is VK_NULL_HANDLE or a semaphore to signal.
- fence is VK_NULL_HANDLE or a fence to signal.
- deviceMask is a mask of physical devices for which the swapchain image will be ready to use when the semaphore or fence is signaled.
If vkAcquireNextImageKHR is used, the device mask is considered to include all physical devices in the logical device.
Note
vkAcquireNextImage2KHR signals at most one semaphore, even if the
application requests waiting for multiple physical devices to be ready via
the deviceMask.
After queueing all rendering commands and transitioning the image to the correct layout, to queue an image for presentation, call:
VkResult vkQueuePresentKHR(
VkQueue queue,
const VkPresentInfoKHR* pPresentInfo);
- queue is a queue that is capable of presentation to the target surface’s platform on the same device as the image’s swapchain.
- pPresentInfo is a pointer to an instance of the VkPresentInfoKHR structure specifying the parameters of the presentation.
Note
There is no requirement for an application to present images in the same order that they were acquired - applications can arbitrarily present any image that is currently acquired.
Any writes to memory backing the images referenced by the
pImageIndices
and pSwapchains
members of pPresentInfo
,
that are available before vkQueuePresentKHR is executed, are
automatically made visible to the read access performed by the presentation
engine.
This automatic visibility operation for an image happens-after the semaphore
signal operation, and happens-before the presentation engine accesses the
image.
Queueing an image for presentation defines a set of queue operations, including waiting on the semaphores and submitting a presentation request to the presentation engine. However, the scope of this set of queue operations does not include the actual processing of the image by the presentation engine.
If vkQueuePresentKHR
fails to enqueue the corresponding set of queue
operations, it may return VK_ERROR_OUT_OF_HOST_MEMORY
or
VK_ERROR_OUT_OF_DEVICE_MEMORY
.
If it does, the implementation must ensure that the state and contents of
any resources or synchronization primitives referenced is unaffected by the
call or its failure.
If vkQueuePresentKHR
fails in such a way that the implementation is
unable to make that guarantee, the implementation must return
VK_ERROR_DEVICE_LOST
.
However, if the presentation request is rejected by the presentation engine
with an error VK_ERROR_OUT_OF_DATE_KHR
or
VK_ERROR_SURFACE_LOST_KHR
, the set of queue operations are still
considered to be enqueued and thus any semaphore to be waited on gets
unsignaled when the corresponding queue operation is complete.
The VkPresentInfoKHR
structure is defined as:
typedef struct VkPresentInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t waitSemaphoreCount;
const VkSemaphore* pWaitSemaphores;
uint32_t swapchainCount;
const VkSwapchainKHR* pSwapchains;
const uint32_t* pImageIndices;
VkResult* pResults;
} VkPresentInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- waitSemaphoreCount is the number of semaphores to wait for before issuing the present request. The number may be zero.
- pWaitSemaphores, if not NULL, is an array of VkSemaphore objects with waitSemaphoreCount entries, and specifies the semaphores to wait for before issuing the present request.
- swapchainCount is the number of swapchains being presented to by this command.
- pSwapchains is an array of VkSwapchainKHR objects with swapchainCount entries. A given swapchain must not appear in this list more than once.
- pImageIndices is an array of indices into the array of each swapchain’s presentable images, with swapchainCount entries. Each entry in this array identifies the image to present on the corresponding entry in the pSwapchains array.
- pResults is an array of VkResult typed elements with swapchainCount entries. Applications that do not need per-swapchain results can use NULL for pResults. If non-NULL, each entry in pResults will be set to the VkResult for presenting the swapchain corresponding to the same index in pSwapchains.
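An informal single-swapchain example of filling VkPresentInfoKHR; the helper name and the role of the wait semaphore (signaled by the rendering submission) are illustrative.

#include <vulkan/vulkan.h>

/* Hypothetical helper: present one image of one swapchain, waiting on the
 * semaphore signaled by the rendering submission. */
static VkResult present_image(VkQueue queue, VkSwapchainKHR swapchain,
                              uint32_t imageIndex, VkSemaphore waitSemaphore)
{
    VkPresentInfoKHR present = {
        .sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
        .waitSemaphoreCount = 1,
        .pWaitSemaphores = &waitSemaphore,
        .swapchainCount = 1,
        .pSwapchains = &swapchain,
        .pImageIndices = &imageIndex,
        .pResults = NULL,   /* single swapchain: the return value suffices */
    };
    return vkQueuePresentKHR(queue, &present);
}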
Before an application can present an image, the image’s layout must be
transitioned to the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR
layout, or for a shared presentable image the
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
layout.
Note
When transitioning the image to VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR or VK_IMAGE_LAYOUT_PRESENT_SRC_KHR, there is no need to delay subsequent processing, or perform any visibility operations (as vkQueuePresentKHR performs automatic visibility operations). To achieve this, the dstAccessMask member of the VkImageMemoryBarrier should be set to 0, and the dstStageMask parameter should be set to VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT.
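An informative sketch of such a pre-present transition, assuming the image was last written as a color attachment; the helper name and the source access and stage masks are illustrative assumptions.

#include <vulkan/vulkan.h>

/* Hypothetical helper: record the layout transition to the present source
 * layout before queuing the image for presentation. */
static void transition_to_present(VkCommandBuffer cmd, VkImage image)
{
    VkImageMemoryBarrier barrier = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
        .dstAccessMask = 0,  /* vkQueuePresentKHR performs the visibility operation */
        .oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
        .newLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
        .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
        .image = image,
        .subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 },
    };
    vkCmdPipelineBarrier(cmd,
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                         VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                         0, 0, NULL, 0, NULL, 1, &barrier);
}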
When the VK_KHR_incremental_present
extension is enabled, additional
fields can be specified that allow an application to specify that only
certain rectangular regions of the presentable images of a swapchain are
changed.
This is an optimization hint that a presentation engine may use to only
update the region of a surface that is actually changing.
The application still must ensure that all pixels of a presented image
contain the desired values, in case the presentation engine ignores this
hint.
An application can provide this hint by including the
VkPresentRegionsKHR
structure in the pNext
chain of the
VkPresentInfoKHR
structure.
The VkPresentRegionsKHR
structure is defined as:
typedef struct VkPresentRegionsKHR {
VkStructureType sType;
const void* pNext;
uint32_t swapchainCount;
const VkPresentRegionKHR* pRegions;
} VkPresentRegionsKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- swapchainCount is the number of swapchains being presented to by this command.
- pRegions is NULL or a pointer to an array of VkPresentRegionKHR elements with swapchainCount entries. If not NULL, each element of pRegions contains the region that has changed since the last present to the swapchain in the corresponding entry in the VkPresentInfoKHR::pSwapchains array.
For a given image and swapchain, the region to present is specified by the
VkPresentRegionKHR
structure, which is defined as:
typedef struct VkPresentRegionKHR {
uint32_t rectangleCount;
const VkRectLayerKHR* pRectangles;
} VkPresentRegionKHR;
- rectangleCount is the number of rectangles in pRectangles, or zero if the entire image has changed and should be presented.
- pRectangles is either NULL or a pointer to an array of VkRectLayerKHR structures. The VkRectLayerKHR structure is the framebuffer coordinates, plus layer, of a portion of a presentable image that has changed and must be presented. If non-NULL, each entry in pRectangles is a rectangle of the given image that has changed since the last image was presented to the given swapchain.
The VkRectLayerKHR
structure is defined as:
typedef struct VkRectLayerKHR {
VkOffset2D offset;
VkExtent2D extent;
uint32_t layer;
} VkRectLayerKHR;
- offset is the origin of the rectangle, in pixels.
- extent is the size of the rectangle, in pixels.
- layer is the layer of the image. For images with only one layer, the value of layer must be 0.
Some platforms allow the size of a surface to change, and then scale the
pixels of the image to fit the surface.
VkRectLayerKHR
specifies pixels of the swapchain’s image(s), which
will be constant for the life of the swapchain.
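An informal sketch of providing this hint for a single changed rectangle; the helper name is hypothetical and VK_KHR_incremental_present is assumed to be enabled.

#include <vulkan/vulkan.h>

/* Hypothetical helper: present one image while hinting that only "dirty"
 * changed since the previous present of this swapchain. */
static VkResult present_dirty_rect(VkQueue queue, VkSwapchainKHR swapchain,
                                   uint32_t imageIndex, VkSemaphore waitSemaphore,
                                   VkRect2D dirty)
{
    VkRectLayerKHR rect = { .offset = dirty.offset, .extent = dirty.extent, .layer = 0 };
    VkPresentRegionKHR region = { .rectangleCount = 1, .pRectangles = &rect };
    VkPresentRegionsKHR regions = {
        .sType = VK_STRUCTURE_TYPE_PRESENT_REGIONS_KHR,
        .swapchainCount = 1,
        .pRegions = &region,
    };
    VkPresentInfoKHR present = {
        .sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
        .pNext = &regions,                 /* chained optimization hint */
        .waitSemaphoreCount = 1,
        .pWaitSemaphores = &waitSemaphore,
        .swapchainCount = 1,
        .pSwapchains = &swapchain,
        .pImageIndices = &imageIndex,
    };
    return vkQueuePresentKHR(queue, &present);
}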
When the VK_KHR_display_swapchain
extension is enabled additional fields
can be specified when presenting an image to a swapchain by setting
VkPresentInfoKHR::pNext
to point to an instance of the
VkDisplayPresentInfoKHR structure.
The VkDisplayPresentInfoKHR
structure is defined as:
typedef struct VkDisplayPresentInfoKHR {
VkStructureType sType;
const void* pNext;
VkRect2D srcRect;
VkRect2D dstRect;
VkBool32 persistent;
} VkDisplayPresentInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- srcRect is a rectangular region of pixels to present. It must be a subset of the image being presented. If VkDisplayPresentInfoKHR is not specified, this region will be assumed to be the entire presentable image.
- dstRect is a rectangular region within the visible region of the swapchain’s display mode. If VkDisplayPresentInfoKHR is not specified, this region will be assumed to be the entire visible region of the swapchain’s mode. If the specified rectangle is a subset of the display mode’s visible region, content from display planes below the swapchain’s plane will be visible outside the rectangle. If there are no planes below the swapchain’s, the area outside the specified rectangle will be black. If portions of the specified rectangle are outside of the display’s visible region, pixels mapping only to those portions of the rectangle will be discarded.
- persistent: If this is VK_TRUE, the display engine will enable buffered mode on displays that support it. This allows the display engine to stop sending content to the display until a new image is presented. The display will instead maintain a copy of the last presented image. This allows less power to be used, but may increase presentation latency. If VkDisplayPresentInfoKHR is not specified, persistent mode will not be used.
If the extent of the srcRect
and dstRect
are not equal, the
presented pixels will be scaled accordingly.
If the pNext
chain of VkPresentInfoKHR includes a
VkDeviceGroupPresentInfoKHR
structure, then that structure includes an
array of device masks and a device group present mode.
The VkDeviceGroupPresentInfoKHR
structure is defined as:
typedef struct VkDeviceGroupPresentInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t swapchainCount;
const uint32_t* pDeviceMasks;
VkDeviceGroupPresentModeFlagBitsKHR mode;
} VkDeviceGroupPresentInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- swapchainCount is zero or the number of elements in pDeviceMasks.
- pDeviceMasks is an array of device masks, one for each element of VkPresentInfoKHR::pSwapchains.
- mode is the device group present mode that will be used for this present.
If mode
is VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
, then each
element of pDeviceMasks
selects which instance of the swapchain image
is presented.
Each element of pDeviceMasks
must have exactly one bit set, and the
corresponding physical device must have a presentation engine as reported
by VkDeviceGroupPresentCapabilitiesKHR.
If mode
is VK_DEVICE_GROUP_PRESENT_MODE_REMOTE_BIT_KHR
, then
each element of pDeviceMasks
selects which instance of the swapchain
image is presented.
Each element of pDeviceMasks
must have exactly one bit set, and some
physical device in the logical device must include that bit in its
VkDeviceGroupPresentCapabilitiesKHR::presentMask
.
If mode
is VK_DEVICE_GROUP_PRESENT_MODE_SUM_BIT_KHR
, then each
element of pDeviceMasks
selects which instances of the swapchain image
are component-wise summed and the sum of those images is presented.
If the sum in any component is outside the representable range, the value of
that component is undefined.
Each element of pDeviceMasks
must have a value for which all set bits
are set in one of the elements of
VkDeviceGroupPresentCapabilitiesKHR::presentMask
.
If mode
is
VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_MULTI_DEVICE_BIT_KHR
, then each
element of pDeviceMasks
selects which instance(s) of the swapchain
images are presented.
For each bit set in each element of pDeviceMasks
, the corresponding
physical device must have a presentation engine as reported by
VkDeviceGroupPresentCapabilitiesKHR.
If VkDeviceGroupPresentInfoKHR
is not provided or swapchainCount
is zero then the masks are considered to be 1
.
If VkDeviceGroupPresentInfoKHR
is not provided, mode
is
considered to be VK_DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
.
When the VK_GOOGLE_display_timing
extension is enabled, additional
fields can be specified that allow an application to specify the earliest
time that an image should be displayed.
This allows an application to avoid stutter that is caused by an image being
displayed earlier than planned.
Such stuttering can occur with both fixed and variable-refresh-rate
displays, because stuttering occurs when the geometry is not correctly
positioned for when the image is displayed.
An application can instruct the presentation engine that an image should
not be displayed earlier than a specified time by including the
VkPresentTimesInfoGOOGLE
structure in the pNext
chain of the
VkPresentInfoKHR
structure.
The VkPresentTimesInfoGOOGLE
structure is defined as:
typedef struct VkPresentTimesInfoGOOGLE {
VkStructureType sType;
const void* pNext;
uint32_t swapchainCount;
const VkPresentTimeGOOGLE* pTimes;
} VkPresentTimesInfoGOOGLE;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- swapchainCount is the number of swapchains being presented to by this command.
- pTimes is NULL or a pointer to an array of VkPresentTimeGOOGLE elements with swapchainCount entries. If not NULL, each element of pTimes contains the earliest time to present the image corresponding to the entry in the VkPresentInfoKHR::pImageIndices array.
The VkPresentTimeGOOGLE
structure is defined as:
typedef struct VkPresentTimeGOOGLE {
uint32_t presentID;
uint64_t desiredPresentTime;
} VkPresentTimeGOOGLE;
- presentID is an application-provided identification value, that can be used with the results of vkGetPastPresentationTimingGOOGLE, in order to uniquely identify this present. In order to be useful to the application, it should be unique within some period of time that is meaningful to the application.
- desiredPresentTime specifies that the image given should not be displayed to the user any earlier than this time. desiredPresentTime is a time in nanoseconds, relative to a monotonically-increasing clock (e.g. CLOCK_MONOTONIC (see clock_gettime(2)) on Android and Linux). A value of zero specifies that the presentation engine may display the image at any time. This is useful when the application desires to provide presentID, but doesn’t need a specific desiredPresentTime.
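An informal sketch of chaining VkPresentTimesInfoGOOGLE into a present request; the helper name is hypothetical and VK_GOOGLE_display_timing is assumed to be enabled.

#include <vulkan/vulkan.h>

/* Hypothetical helper: ask the presentation engine not to display the image
 * before desiredPresentTime (CLOCK_MONOTONIC nanoseconds), tagging the
 * request with presentID for later correlation. */
static VkResult present_no_sooner_than(VkQueue queue, VkSwapchainKHR swapchain,
                                       uint32_t imageIndex, VkSemaphore waitSemaphore,
                                       uint32_t presentID, uint64_t desiredPresentTime)
{
    VkPresentTimeGOOGLE time = {
        .presentID = presentID,
        .desiredPresentTime = desiredPresentTime, /* 0 = display at any time */
    };
    VkPresentTimesInfoGOOGLE times = {
        .sType = VK_STRUCTURE_TYPE_PRESENT_TIMES_INFO_GOOGLE,
        .swapchainCount = 1,
        .pTimes = &time,
    };
    VkPresentInfoKHR present = {
        .sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
        .pNext = &times,
        .waitSemaphoreCount = 1,
        .pWaitSemaphores = &waitSemaphore,
        .swapchainCount = 1,
        .pSwapchains = &swapchain,
        .pImageIndices = &imageIndex,
    };
    return vkQueuePresentKHR(queue, &present);
}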
vkQueuePresentKHR releases the acquisition of the images referenced by imageIndices.
The queue family corresponding to the queue vkQueuePresentKHR
is
executed on must have ownership of the presented images as defined in
Resource Sharing.
vkQueuePresentKHR
does not alter the queue family ownership, but the
presented images must not be used again before they have been reacquired
using vkAcquireNextImageKHR
.
The processing of the presentation happens in issue order with other queue operations, but semaphores have to be used to ensure that prior rendering and other commands in the specified queue complete before the presentation begins. The presentation command itself does not delay processing of subsequent commands on the queue, however, presentation requests sent to a particular queue are always performed in order. Exact presentation timing is controlled by the semantics of the presentation engine and native platform in use.
If an image is presented to a swapchain created from a display surface, the mode of the associated display will be updated, if necessary, to match the mode specified when creating the display surface. The mode switch and presentation of the specified image will be performed as one atomic operation.
The result codes VK_ERROR_OUT_OF_DATE_KHR
and VK_SUBOPTIMAL_KHR
have the same meaning when returned by vkQueuePresentKHR
as they do
when returned by vkAcquireNextImageKHR
.
If multiple swapchains are presented, the result code is determined applying
the following rules in order:
- If the device is lost, VK_ERROR_DEVICE_LOST is returned.
- If any of the target surfaces are no longer available the error VK_ERROR_SURFACE_LOST_KHR is returned.
- If any of the presents would have a result of VK_ERROR_OUT_OF_DATE_KHR if issued separately then VK_ERROR_OUT_OF_DATE_KHR is returned.
- If any of the presents would have a result of VK_SUBOPTIMAL_KHR if issued separately then VK_SUBOPTIMAL_KHR is returned.
- Otherwise VK_SUCCESS is returned.
Presentation is a read-only operation that will not affect the content of
the presentable images.
Upon reacquiring the image and transitioning it away from the
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR
layout, the contents will be the same
as they were prior to transitioning the image to the present source layout
and presenting it.
However, if a mechanism other than Vulkan is used to modify the platform
window associated with the swapchain, the content of all presentable images
in the swapchain becomes undefined.
Note
The application can continue to present any acquired images from a retired
swapchain as long as the swapchain has not entered a state that causes
vkQueuePresentKHR to return VK_ERROR_OUT_OF_DATE_KHR.
32.9. Hdr Metadata
To improve color reproduction of content it is useful to have information
that can be used to better reproduce the colors as seen on the mastering
display.
That information can be provided to an implementation by calling
vkSetHdrMetadataEXT
.
The metadata will be applied to the specified VkSwapchainKHR
objects
at the next vkQueuePresentKHR
call using that VkSwapchainKHR
object.
The metadata will persist until a subsequent vkSetHdrMetadataEXT
changes it.
The definitions below are from the associated SMPTE 2086, CTA 861.3 and CIE
15:2004 specifications.
The definition of vkSetHdrMetadataEXT
is:
void vkSetHdrMetadataEXT(
VkDevice device,
uint32_t swapchainCount,
const VkSwapchainKHR* pSwapchains,
const VkHdrMetadataEXT* pMetadata);
- device is the logical device where the swapchain(s) were created.
- swapchainCount is the number of swapchains included in pSwapchains.
- pSwapchains is a pointer to the array of swapchainCount VkSwapchainKHR handles.
- pMetadata is a pointer to the array of swapchainCount VkHdrMetadataEXT structures.
typedef struct VkXYColorEXT {
float x;
float y;
} VkXYColorEXT;
Chromaticity coordinates x and y are as specified in CIE 15:2004 “Calculation of chromaticity coordinates” (Section 7.3) and are limited to between 0 and 1 for real colors for the mastering display.
typedef struct VkHdrMetadataEXT {
VkStructureType sType;
const void* pNext;
VkXYColorEXT displayPrimaryRed;
VkXYColorEXT displayPrimaryGreen;
VkXYColorEXT displayPrimaryBlue;
VkXYColorEXT whitePoint;
float maxLuminance;
float minLuminance;
float maxContentLightLevel;
float maxFrameAverageLightLevel;
} VkHdrMetadataEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- displayPrimaryRed is the mastering display’s red primary in chromaticity coordinates.
- displayPrimaryGreen is the mastering display’s green primary in chromaticity coordinates.
- displayPrimaryBlue is the mastering display’s blue primary in chromaticity coordinates.
- whitePoint is the mastering display’s white-point in chromaticity coordinates.
- maxLuminance is the maximum luminance of the mastering display in nits.
- minLuminance is the minimum luminance of the mastering display in nits.
- maxContentLightLevel is content’s maximum luminance in nits.
- maxFrameAverageLightLevel is the maximum frame average light level in nits.
Note
The validity and use of this data is outside the scope of Vulkan and thus no Valid Usage is given.
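An informative usage sketch follows; the helper name is hypothetical and the numeric values are placeholder BT.2020-style mastering values, not recommendations.

#include <vulkan/vulkan.h>

/* Hypothetical helper: apply SMPTE 2086 / CTA 861.3 style mastering metadata
 * to one swapchain. Assumes VK_EXT_hdr_metadata is enabled. */
static void set_hdr_metadata(VkDevice device, VkSwapchainKHR swapchain)
{
    VkHdrMetadataEXT metadata = {
        .sType = VK_STRUCTURE_TYPE_HDR_METADATA_EXT,
        .displayPrimaryRed   = { 0.708f, 0.292f },   /* illustrative BT.2020 red   */
        .displayPrimaryGreen = { 0.170f, 0.797f },   /* illustrative BT.2020 green */
        .displayPrimaryBlue  = { 0.131f, 0.046f },   /* illustrative BT.2020 blue  */
        .whitePoint          = { 0.3127f, 0.3290f }, /* D65 white point            */
        .maxLuminance = 1000.0f,                     /* nits */
        .minLuminance = 0.001f,                      /* nits */
        .maxContentLightLevel = 1000.0f,             /* nits */
        .maxFrameAverageLightLevel = 400.0f,         /* nits */
    };
    vkSetHdrMetadataEXT(device, 1, &swapchain, &metadata);
}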
33. Ray Tracing
Unlike draw commands, which use rasterization, ray tracing is a rendering method that generates an image by tracing the path of rays that have a single origin and using shaders to determine the final color of the image plane.
Ray tracing uses a separate rendering pipeline from both the graphics and compute pipelines (see Ray tracing Pipeline). It has a unique set of programmable and fixed function stages.
33.1. Ray Tracing Commands
Ray tracing commands provoke work in the ray tracing pipeline. Ray tracing commands are recorded into a command buffer and when executed by a queue will produce work that executes according to the currently bound ray tracing pipeline. A ray tracing pipeline must be bound to a command buffer before any ray tracing commands are recorded in that command buffer.
Each ray tracing call operates on a set of shader stages that are specific
to the ray tracing pipeline as well as a set of
VkAccelerationStructureNV
objects, which describe the scene geometry
in an implementation-specific way.
The relationship between the ray tracing pipeline object and the
acceleration structures is passed into the ray tracing command in a
VkBuffer object known as a shader binding table.
During execution, control alternates between scheduling and other operations. The scheduling functionality is implementation-specific and is responsible for workload execution. The shader stages are programmable. Traversal, which refers to the process of traversing acceleration structures to find potential intersections of rays with geometry, is fixed function.
The programmable portions of the pipeline are exposed in a single-ray programming model. Each GPU thread handles one ray at a time. Memory operations can be synchronized using standard memory barriers. However, communication and synchronization between threads is not allowed. In particular, the use of compute pipeline synchronization functions is not supported in the ray tracing pipeline.
To dispatch a ray tracing call use:
void vkCmdTraceRaysNV(
VkCommandBuffer commandBuffer,
VkBuffer raygenShaderBindingTableBuffer,
VkDeviceSize raygenShaderBindingOffset,
VkBuffer missShaderBindingTableBuffer,
VkDeviceSize missShaderBindingOffset,
VkDeviceSize missShaderBindingStride,
VkBuffer hitShaderBindingTableBuffer,
VkDeviceSize hitShaderBindingOffset,
VkDeviceSize hitShaderBindingStride,
VkBuffer callableShaderBindingTableBuffer,
VkDeviceSize callableShaderBindingOffset,
VkDeviceSize callableShaderBindingStride,
uint32_t width,
uint32_t height,
uint32_t depth);
- commandBuffer is the command buffer into which the command will be recorded.
- raygenShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the ray generation shader stage.
- raygenShaderBindingOffset is the offset in bytes (relative to raygenShaderBindingTableBuffer) of the ray generation shader being used for the trace.
- missShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the miss shader stage.
- missShaderBindingOffset is the offset in bytes (relative to missShaderBindingTableBuffer) of the miss shader being used for the trace.
- missShaderBindingStride is the size in bytes of each shader binding table record in missShaderBindingTableBuffer.
- hitShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the hit shader stages.
- hitShaderBindingOffset is the offset in bytes (relative to hitShaderBindingTableBuffer) of the hit shader group being used for the trace.
- hitShaderBindingStride is the size in bytes of each shader binding table record in hitShaderBindingTableBuffer.
- callableShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the callable shader stage.
- callableShaderBindingOffset is the offset in bytes (relative to callableShaderBindingTableBuffer) of the callable shader being used for the trace.
- callableShaderBindingStride is the size in bytes of each shader binding table record in callableShaderBindingTableBuffer.
- width is the width of the ray trace query dimensions.
- height is the height of the ray trace query dimensions.
- depth is the depth of the ray trace query dimensions.
When the command is executed, a ray generation group of width × height × depth rays is assembled.
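A minimal sketch of dispatching a trace is shown below. It assumes a single buffer sbtBuffer laid out as consecutive raygen, miss, and hit records of sbtStride bytes each; the buffer, offsets, stride, and dimensions are illustrative, not mandated by this Specification.
// Offsets into the shader binding table buffer; the layout is chosen by the application.
VkDeviceSize raygenOffset = 0;
VkDeviceSize missOffset   = 1 * sbtStride;  // one record after the raygen record
VkDeviceSize hitOffset    = 2 * sbtStride;  // one record after the miss record

vkCmdTraceRaysNV(commandBuffer,
    sbtBuffer, raygenOffset,                // ray generation record
    sbtBuffer, missOffset, sbtStride,       // miss records
    sbtBuffer, hitOffset,  sbtStride,       // hit group records
    VK_NULL_HANDLE, 0, 0,                   // no callable shaders used in this sketch
    renderWidth, renderHeight, 1);          // one ray per pixel of the image plane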
33.2. Shader Binding Table
A shader binding table is a resource which establishes the relationship between the ray tracing pipeline and the acceleration structures that were built for the ray tracing query. It indicates the shaders that operate on each geometry in an acceleration structure. In addition, it contains the resources accessed by each shader, including indices of textures and constants. The application allocates and manages shader binding tables as VkBuffer objects.
Each entry in the shader binding table consists of
shaderGroupHandleSize
bytes of data as queried by
vkGetRayTracingShaderGroupHandlesNV to refer to the shader that it
invokes.
The remainder of the data specified by the stride is application-visible
data that can be referenced by a shaderRecordNV
block in the shader.
The shader binding tables to use in a ray tracing query are passed to vkCmdTraceRaysNV. Shader binding tables are read-only in shaders that are executing on the ray tracing pipeline.
33.2.1. Indexing Rules
In order to execute the correct shaders and access the correct resources during a ray tracing dispatch, the implementation must be able to locate shader binding table entries at various stages of execution. This is accomplished by defining a set of indexing rules that compute shader binding table record positions relative to the buffer’s base address in memory. The application must organize the contents of the shader binding table’s memory in a way that application of the indexing rules will lead to correct records.
Ray Generation Shaders
Only one ray generation shader is executed per ray tracing dispatch.
Its location is passed into vkCmdTraceRaysNV using the raygenShaderBindingTableBuffer and raygenShaderBindingOffset parameters; there is no indexing.
Hit Shaders
The base for the computation of intersection, any-hit and closest hit shader
locations is the instanceShaderBindingTableRecordOffset
value stored
with each instance of a top-level acceleration structure.
This value determines the beginning of the shader binding table records for
a given instance.
Each geometry in the instance must have at least one hit program record.
In the following rule, geometryIndex refers to the location of the geometry within the instance.
The sbtRecordStride and sbtRecordOffset values are passed in as parameters to traceNV() calls made in the shaders. See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language Specification for more details.
The result of this computation is then added to hitProgramShaderBindingTableBaseIndex, a base index passed to vkCmdTraceRaysNV.
The complete rule to compute a hit shader binding table record index is:
- instanceShaderBindingTableRecordOffset + hitProgramShaderBindingTableBaseIndex + geometryIndex × sbtRecordStride + sbtRecordOffset
Miss Shaders
A miss shader is executed whenever a ray query fails to find an intersection for the given scene geometry. Multiple miss shaders may be executed throughout a ray tracing dispatch.
The base for the computation of miss shader locations is missProgramShaderBindingTableBaseIndex, a base index passed into vkCmdTraceRaysNV.
The sbtRecordOffset value is passed in as a parameter to traceNV() calls made in the shaders. See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language Specification for more details.
The complete rule to compute a miss shader binding table record address is:
- missProgramShaderBindingTableBaseIndex × missShaderBindingStride + sbtRecordOffset
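Written out as a sketch (the variable names mirror the terms used in the rules above and are purely illustrative; the rules themselves remain authoritative):
// Hit group record index for a given geometry in a given instance.
uint32_t hitRecordIndex =
    instanceShaderBindingTableRecordOffset   // from the instance data
  + hitProgramShaderBindingTableBaseIndex    // base index for the dispatch
  + geometryIndex * sbtRecordStride          // per-geometry step, stride from traceNV()
  + sbtRecordOffset;                         // offset passed to traceNV()

// Miss shader record location, following the rule above.
uint32_t missRecordLocation =
    missProgramShaderBindingTableBaseIndex * missShaderBindingStride
  + sbtRecordOffset;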
33.3. Acceleration Structures
Acceleration structures are data structures used by the implementation to efficiently manage the scene geometry as it is traversed during a ray tracing query. The application is responsible for managing acceleration structure objects (see Acceleration Structures), including allocation, destruction, executing builds or updates, and synchronizing resources used during ray tracing queries.
There are two types of acceleration structures, top level acceleration structures and bottom level acceleration structures.
33.3.1. Instances
Instances are found in top level acceleration structures and contain data that refer to a single bottom-level acceleration structure, a transform matrix, and shading information. Multiple instances can point to a single bottom level acceleration structure.
An instance is defined in a VkBuffer by a structure consisting of 64 bytes of data.
- transform is 12 floats representing a 4x3 transform matrix in row-major order.
- instanceCustomIndex is the low 24 bits of a 32-bit integer after the transform. This value appears in the built-in gl_InstanceCustomIndexNV.
- mask is the high 8 bits of the same integer as instanceCustomIndex. This is the visibility mask. The instance may only be hit if rayMask & instance.mask != 0.
- instanceOffset is the low 24 bits of the next 32-bit integer. This is the value contributed by this instance to the hit shader binding table index computation as instanceShaderBindingTableRecordOffset.
- flags is the high 8 bits of the same integer as instanceOffset, holding the VkGeometryInstanceFlagBitsNV values that apply to this instance.
- accelerationStructure is the 8-byte value returned by vkGetAccelerationStructureHandleNV for the bottom level acceleration structure referred to by this instance.
Note
The C language specification does not define the ordering of bit-fields, but in practice this struct produces the layout described above:
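The declaration below is a reconstruction sketching that 64-byte layout; it is informative only and assumes a compiler that packs these bit-fields contiguously.
struct VkGeometryInstanceNV {
    float    transform[12];                // 4x3 row-major transform matrix
    uint32_t instanceCustomIndex : 24;     // appears as gl_InstanceCustomIndexNV
    uint32_t mask : 8;                     // visibility mask
    uint32_t instanceOffset : 24;          // hit shader binding table record offset
    uint32_t flags : 8;                    // VkGeometryInstanceFlagBitsNV
    uint64_t accelerationStructureHandle;  // from vkGetAccelerationStructureHandleNV
};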
Possible values of flags in the instance, modifying the behavior of that instance, are:
typedef enum VkGeometryInstanceFlagBitsNV {
VK_GEOMETRY_INSTANCE_TRIANGLE_CULL_DISABLE_BIT_NV = 0x00000001,
VK_GEOMETRY_INSTANCE_TRIANGLE_FRONT_COUNTERCLOCKWISE_BIT_NV = 0x00000002,
VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV = 0x00000004,
VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV = 0x00000008,
} VkGeometryInstanceFlagBitsNV;
- VK_GEOMETRY_INSTANCE_TRIANGLE_CULL_DISABLE_BIT_NV disables face culling for this instance.
- VK_GEOMETRY_INSTANCE_TRIANGLE_FRONT_COUNTERCLOCKWISE_BIT_NV indicates that the front face of the triangle for culling purposes is the face that is counter clockwise in object space relative to the ray origin. Because the facing is determined in object space, an instance transform matrix does not change the winding, but a geometry transform does.
- VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV causes this instance to act as though VK_GEOMETRY_OPAQUE_BIT_NV were specified on all geometries referenced by this instance. This behavior can be overridden by the ray flag gl_RayFlagsNoOpaqueNV.
- VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV causes this instance to act as though VK_GEOMETRY_OPAQUE_BIT_NV were not specified on all geometries referenced by this instance. This behavior can be overridden by the ray flag gl_RayFlagsOpaqueNV.
VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV and VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV must not be set in the same flags value.
typedef VkFlags VkGeometryInstanceFlagsNV;
VkGeometryInstanceFlagsNV
is a bitmask type for setting a mask of zero
or more VkGeometryInstanceFlagBitsNV.
33.3.2. Geometry
Geometries refer to a triangle or axis-aligned bounding box.
33.3.3. Top Level Acceleration Structures
Opaque acceleration structure for an array of instances. The descriptor referencing this is the starting point for tracing.
33.3.4. Bottom Level Acceleration Structures
Opaque acceleration structure for an array of geometries.
33.3.5. Building Acceleration Structures
To build an acceleration structure call:
void vkCmdBuildAccelerationStructureNV(
VkCommandBuffer commandBuffer,
const VkAccelerationStructureInfoNV* pInfo,
VkBuffer instanceData,
VkDeviceSize instanceOffset,
VkBool32 update,
VkAccelerationStructureNV dst,
VkAccelerationStructureNV src,
VkBuffer scratch,
VkDeviceSize scratchOffset);
- commandBuffer is the command buffer into which the command will be recorded.
- pInfo contains the shared information for the acceleration structure's structure.
- instanceData is the buffer containing instance data that will be used to build the acceleration structure as described in Acceleration structure instances. This parameter must be NULL for bottom level acceleration structures.
- instanceOffset is the offset in bytes (relative to the start of instanceData) at which the instance data is located.
- update specifies whether to update the dst acceleration structure with the data in src.
- dst points to the target acceleration structure for the build.
- src points to an existing acceleration structure that is to be used to update the dst acceleration structure.
- scratch is the VkBuffer that will be used as scratch memory for the build.
- scratchOffset is the offset in bytes relative to the start of scratch that will be used as scratch memory.
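A hedged sketch of recording a full (non-update) top level build using these parameters; the buildInfo structure, buffers, and acceleration structure handle are assumed to have been created and sized beforehand (for example via vkGetAccelerationStructureMemoryRequirementsNV) and are illustrative:
// buildInfo is a VkAccelerationStructureInfoNV describing a top level build.
vkCmdBuildAccelerationStructureNV(commandBuffer,
    &buildInfo,
    instanceBuffer, 0,       // instance data and its byte offset
    VK_FALSE,                // full build rather than an update
    topLevelAS,              // dst
    VK_NULL_HANDLE,          // src is not used for a full build
    scratchBuffer, 0);       // scratch memory and its byte offset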
33.3.6. Copying Acceleration Structures
An additional command exists for copying acceleration structures without updating their contents. The acceleration structure object can be compacted in order to improve performance. Before copying, an application must query the size of the resulting acceleration structure.
To query acceleration structure size parameters call:
void vkCmdWriteAccelerationStructuresPropertiesNV(
VkCommandBuffer commandBuffer,
uint32_t accelerationStructureCount,
const VkAccelerationStructureNV* pAccelerationStructures,
VkQueryType queryType,
VkQueryPool queryPool,
uint32_t firstQuery);
- commandBuffer is the command buffer into which the command will be recorded.
- accelerationStructureCount is the count of acceleration structures for which to query the property.
- pAccelerationStructures points to an array of existing previously built acceleration structures.
- queryType is a VkQueryType value specifying the type of queries managed by the pool.
- queryPool is the query pool that will manage the results of the query.
- firstQuery is the first query index within the query pool that will contain the accelerationStructureCount number of results.
To copy an acceleration structure call:
void vkCmdCopyAccelerationStructureNV(
VkCommandBuffer commandBuffer,
VkAccelerationStructureNV dst,
VkAccelerationStructureNV src,
VkCopyAccelerationStructureModeNV mode);
- commandBuffer is the command buffer into which the command will be recorded.
- dst points to the target acceleration structure for the copy.
- src points to the source acceleration structure for the copy.
- mode is a VkCopyAccelerationStructureModeNV value that specifies additional operations to perform during the copy.
Possible values of vkCmdCopyAccelerationStructureNV::mode, specifying additional operations to perform during the copy, are:
typedef enum VkCopyAccelerationStructureModeNV {
VK_COPY_ACCELERATION_STRUCTURE_MODE_CLONE_NV = 0,
VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_NV = 1,
} VkCopyAccelerationStructureModeNV;
- VK_COPY_ACCELERATION_STRUCTURE_MODE_CLONE_NV creates a direct copy of the acceleration structure specified in src into the one specified by dst. The dst acceleration structure must have been created with the same parameters as src.
- VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_NV creates a more compact version of an acceleration structure src into dst. The acceleration structure dst must have been created with a compactedSize corresponding to the one returned by vkCmdWriteAccelerationStructuresPropertiesNV after the build of the acceleration structure specified by src.
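A sketch of the compaction flow implied above, assuming a query pool created with VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_NV and illustrative handles; the host readback and the creation of dstAS with the returned compactedSize are elided:
// 1. After building srcAS, query its compacted size into the query pool.
vkCmdWriteAccelerationStructuresPropertiesNV(commandBuffer,
    1, &srcAS,
    VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_NV,
    queryPool, 0);

// 2. Once dstAS has been created with that compactedSize, record the compacting copy.
vkCmdCopyAccelerationStructureNV(commandBuffer,
    dstAS, srcAS,
    VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_NV);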
34. Extended Functionality
Additional functionality may be provided by layers or extensions. A layer cannot add or modify Vulkan commands, while an extension may do so.
The set of layers to enable is specified when creating an instance, and those layers are able to intercept any Vulkan command dispatched to that instance or any of its child objects.
Extensions can operate at either the instance or device extension scope. Enabled instance extensions are able to affect the operation of the instance and any of its child objects, while device extensions may only be available on a subset of physical devices, must be individually enabled per-device, and only affect the operation of the devices where they are enabled.
34.1. Layers
When a layer is enabled, it inserts itself into the call chain for Vulkan commands the layer is interested in. A common use of layers is to validate application behavior during development. For example, the implementation will not check that Vulkan enums used by the application fall within allowed ranges. Instead, a validation layer would do those checks and flag issues. This avoids a performance penalty during production use of the application because those layers would not be enabled in production.
Vulkan layers may wrap object handles (i.e. return a different handle value to the application than that generated by the implementation). This is generally discouraged, as it increases the probability of incompatibilities with new extensions. The validation layers wrap handles in order to track the proper use and destruction of each object. See the “Vulkan Loader Specification and Architecture Overview” document for additional information.
To query the available layers, call:
VkResult vkEnumerateInstanceLayerProperties(
uint32_t* pPropertyCount,
VkLayerProperties* pProperties);
- pPropertyCount is a pointer to an integer related to the number of layer properties available or queried, as described below.
- pProperties is either NULL or a pointer to an array of VkLayerProperties structures.
If pProperties
is NULL
, then the number of layer properties
available is returned in pPropertyCount
.
Otherwise, pPropertyCount
must point to a variable set by the user to
the number of elements in the pProperties
array, and on return the
variable is overwritten with the number of structures actually written to
pProperties
.
If pPropertyCount
is less than the number of layer properties
available, at most pPropertyCount
structures will be written.
If pPropertyCount
is smaller than the number of layers available,
VK_INCOMPLETE
will be returned instead of VK_SUCCESS
, to
indicate that not all the available layer properties were returned.
The list of available layers may change at any time due to actions outside
of the Vulkan implementation, so two calls to
vkEnumerateInstanceLayerProperties
with the same parameters may
return different results, or retrieve different pPropertyCount
values
or pProperties
contents.
Once an instance has been created, the layers enabled for that instance will
continue to be enabled and valid for the lifetime of that instance, even if
some of them become unavailable for future instances.
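The two-call idiom described above might be written as the following sketch (error handling and the VK_INCOMPLETE case are abbreviated; malloc is assumed to succeed):
uint32_t count = 0;
vkEnumerateInstanceLayerProperties(&count, NULL);        // first call: query the count

VkLayerProperties* props = malloc(count * sizeof(VkLayerProperties));
VkResult result = vkEnumerateInstanceLayerProperties(&count, props);  // second call: fill the array
// result is VK_INCOMPLETE if more layers became available between the two calls.

free(props);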
The VkLayerProperties
structure is defined as:
typedef struct VkLayerProperties {
char layerName[VK_MAX_EXTENSION_NAME_SIZE];
uint32_t specVersion;
uint32_t implementationVersion;
char description[VK_MAX_DESCRIPTION_SIZE];
} VkLayerProperties;
- layerName is a null-terminated UTF-8 string specifying the name of the layer. Use this name in the ppEnabledLayerNames array passed in the VkInstanceCreateInfo structure to enable this layer for an instance.
- specVersion is the Vulkan version the layer was written to, encoded as described in the API Version Numbers and Semantics section.
- implementationVersion is the version of this layer. It is an integer, increasing with backward compatible changes.
- description is a null-terminated UTF-8 string providing additional details that can be used by the application to identify the layer.
To enable a layer, the name of the layer should be added to the
ppEnabledLayerNames
member of VkInstanceCreateInfo when creating
a VkInstance
.
Loader implementations may provide mechanisms outside the Vulkan API for
enabling specific layers.
Layers enabled through such a mechanism are implicitly enabled, while
layers enabled by including the layer name in the ppEnabledLayerNames
member of VkInstanceCreateInfo are explicitly enabled.
Except where otherwise specified, implicitly enabled and explicitly enabled
layers differ only in the way they are enabled.
Explicitly enabling a layer that is implicitly enabled has no additional
effect.
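For example, explicitly enabling a single layer at instance creation time might look like this sketch (the layer name is only an example and may not be present on a given system):
const char* enabledLayers[] = { "VK_LAYER_KHRONOS_validation" };  // example layer name

VkInstanceCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
    .enabledLayerCount = 1,
    .ppEnabledLayerNames = enabledLayers,
};

VkInstance instance;
vkCreateInstance(&createInfo, NULL, &instance);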
34.1.1. Device Layer Deprecation
Previous versions of this specification distinguished between instance and
device layers.
Instance layers were only able to intercept commands that operate on
VkInstance
and VkPhysicalDevice
, except they were not able to
intercept vkCreateDevice.
Device layers were enabled for individual devices when they were created,
and could only intercept commands operating on that device or its child
objects.
Device-only layers are now deprecated, and this specification no longer distinguishes between instance and device layers. Layers are enabled during instance creation, and are able to intercept all commands operating on that instance or any of its child objects. At the time of deprecation there were no known device-only layers and no compelling reason to create one.
In order to maintain compatibility with implementations released prior to
device-layer deprecation, applications should still enumerate and enable
device layers.
The behavior of vkEnumerateDeviceLayerProperties
and valid usage of
the ppEnabledLayerNames
member of VkDeviceCreateInfo
maximizes
compatibility with applications written to work with the previous
requirements.
To enumerate device layers, call:
VkResult vkEnumerateDeviceLayerProperties(
VkPhysicalDevice physicalDevice,
uint32_t* pPropertyCount,
VkLayerProperties* pProperties);
- physicalDevice is the physical device that will be queried.
- pPropertyCount is a pointer to an integer related to the number of layer properties available or queried.
- pProperties is either NULL or a pointer to an array of VkLayerProperties structures.
If pProperties
is NULL
, then the number of layer properties
available is returned in pPropertyCount
.
Otherwise, pPropertyCount
must point to a variable set by the user to
the number of elements in the pProperties
array, and on return the
variable is overwritten with the number of structures actually written to
pProperties
.
If pPropertyCount
is less than the number of layer properties
available, at most pPropertyCount
structures will be written.
If pPropertyCount
is smaller than the number of layers available,
VK_INCOMPLETE
will be returned instead of VK_SUCCESS
, to
indicate that not all the available layer properties were returned.
The list of layers enumerated by vkEnumerateDeviceLayerProperties
must be exactly the sequence of layers enabled for the instance.
The members of VkLayerProperties
for each enumerated layer must be
the same as the properties when the layer was enumerated by
vkEnumerateInstanceLayerProperties
.
The ppEnabledLayerNames
and enabledLayerCount
members of
VkDeviceCreateInfo
are deprecated and their values must be ignored by
implementations.
However, for compatibility, only an empty list of layers or a list that
exactly matches the sequence enabled at instance creation time are valid,
and validation layers should issue diagnostics for other cases.
Regardless of the enabled layer list provided in VkDeviceCreateInfo
,
the sequence of layers active for a device will be exactly the sequence of
layers enabled when the parent instance was created.
34.2. Extensions
Extensions may define new Vulkan commands, structures, and enumerants.
For compilation purposes, the interfaces defined by registered extensions,
including new structures and enumerants as well as function pointer types
for new commands, are defined in the Khronos-supplied vulkan_core.h
together with the core API.
However, commands defined by extensions may not be available for static
linking - in which case function pointers to these commands should be
queried at runtime as described in Command Function Pointers.
Extensions may be provided by layers as well as by a Vulkan implementation.
Because extensions may extend or change the behavior of the Vulkan API, extension authors should add support for their extensions to the Khronos validation layers. This is especially important for new commands whose parameters have been wrapped by the validation layers. See the “Vulkan Loader Specification and Architecture Overview” document for additional information.
Note
Valid Usage sections for individual commands and structures do not currently list which extensions must be enabled in order to make their use valid, although they might do so in the future. This is defined only in the Valid Usage for Extensions section.
To query the available instance extensions, call:
VkResult vkEnumerateInstanceExtensionProperties(
const char* pLayerName,
uint32_t* pPropertyCount,
VkExtensionProperties* pProperties);
- pLayerName is either NULL or a pointer to a null-terminated UTF-8 string naming the layer to retrieve extensions from.
- pPropertyCount is a pointer to an integer related to the number of extension properties available or queried, as described below.
- pProperties is either NULL or a pointer to an array of VkExtensionProperties structures.
When the pLayerName parameter is NULL, only extensions provided by the Vulkan implementation or by implicitly enabled layers are returned.
When pLayerName
is the name of a layer, the instance extensions
provided by that layer are returned.
If pProperties
is NULL
, then the number of extension properties
available is returned in pPropertyCount
.
Otherwise, pPropertyCount
must point to a variable set by the user to
the number of elements in the pProperties
array, and on return the
variable is overwritten with the number of structures actually written to
pProperties
.
If pPropertyCount
is less than the number of extension properties
available, at most pPropertyCount
structures will be written.
If pPropertyCount
is smaller than the number of extensions available,
VK_INCOMPLETE
will be returned instead of VK_SUCCESS
, to
indicate that not all the available properties were returned.
Because the list of available layers may change externally between calls to
vkEnumerateInstanceExtensionProperties, two calls may retrieve
different results if a pLayerName
is available in one call but not in
another.
The extensions supported by a layer may also change between two calls, e.g.
if the layer implementation is replaced by a different version between those
calls.
To enable an instance extension, the name of the extension should be added
to the ppEnabledExtensionNames
member of VkInstanceCreateInfo
when creating a VkInstance
.
Enabling an extension does not change behavior of functionality exposed by the core Vulkan API or any other extension, other than making valid the use of the commands, enums and structures defined by that extension.
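A sketch of querying the available instance extensions and then requesting one of them at instance creation (VK_KHR_surface is used purely as an example, and the check that it actually appears in the enumerated list is elided):
// Query the extensions exposed by the implementation and implicitly enabled layers.
uint32_t count = 0;
vkEnumerateInstanceExtensionProperties(NULL, &count, NULL);
VkExtensionProperties* exts = malloc(count * sizeof(VkExtensionProperties));
vkEnumerateInstanceExtensionProperties(NULL, &count, exts);
// ... confirm the desired extension is present in exts, then free(exts).

const char* enabledExts[] = { "VK_KHR_surface" };        // example extension name
VkInstanceCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
    .enabledExtensionCount = 1,
    .ppEnabledExtensionNames = enabledExts,
};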
To query the extensions available to a given physical device, call:
VkResult vkEnumerateDeviceExtensionProperties(
VkPhysicalDevice physicalDevice,
const char* pLayerName,
uint32_t* pPropertyCount,
VkExtensionProperties* pProperties);
- physicalDevice is the physical device that will be queried.
- pLayerName is either NULL or a pointer to a null-terminated UTF-8 string naming the layer to retrieve extensions from.
- pPropertyCount is a pointer to an integer related to the number of extension properties available or queried, and is treated in the same fashion as the vkEnumerateInstanceExtensionProperties::pPropertyCount parameter.
- pProperties is either NULL or a pointer to an array of VkExtensionProperties structures.
When the pLayerName parameter is NULL, only extensions provided by the Vulkan implementation or by implicitly enabled layers are returned.
When pLayerName
is the name of a layer, the device extensions provided
by that layer are returned.
The VkExtensionProperties
structure is defined as:
typedef struct VkExtensionProperties {
char extensionName[VK_MAX_EXTENSION_NAME_SIZE];
uint32_t specVersion;
} VkExtensionProperties;
-
extensionName
is a null-terminated string specifying the name of the extension. -
specVersion
is the version of this extension. It is an integer, incremented with backward compatible changes.
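A sketch of checking whether a physical device advertises a particular device extension (VK_KHR_swapchain is used only as an example; <string.h> and <stdlib.h> are assumed):
uint32_t count = 0;
vkEnumerateDeviceExtensionProperties(physicalDevice, NULL, &count, NULL);
VkExtensionProperties* exts = malloc(count * sizeof(VkExtensionProperties));
vkEnumerateDeviceExtensionProperties(physicalDevice, NULL, &count, exts);

VkBool32 hasSwapchain = VK_FALSE;
for (uint32_t i = 0; i < count; ++i)
    if (strcmp(exts[i].extensionName, "VK_KHR_swapchain") == 0)
        hasSwapchain = VK_TRUE;
free(exts);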
34.2.1. Instance Extensions and Device Extensions
This section provides some guidelines and rules for when to expose new functionality as an instance extension, as a device extension, or as both. The decision depends on the scope of the new functionality, such as whether it extends instance-level or device-level functionality. All Vulkan commands, structures, and enumerants are considered either instance-level, physical-device-level, or device-level.
New instance-level extension functionality must be structured within an
instance extension.
New device-level extension functionality may be structured within a device
extension.
Vulkan 1.0 initially required all new physical-device-level extension
functionality to be structured within an instance extension.
In order to avoid using an instance extension, which often requires loader
support, physical-device-level extension functionality may be implemented
within device extensions (which must depend on the
VK_KHR_get_physical_device_properties2
extension, or on Vulkan 1.1 or
later).
34.3. Extension Dependencies
Some extensions are dependent on other extensions to function. To enable extensions with dependencies, such required extensions must also be enabled through the same API mechanisms when creating an instance with vkCreateInstance or a device with vkCreateDevice. Each extension which has such dependencies documents them in the appendix summarizing that extension.
If an extension is supported (as queried by vkEnumerateInstanceExtensionProperties or vkEnumerateDeviceExtensionProperties), then required extensions of that extension must also be supported for the same instance or physical device.
Any device extension that has an instance extension dependency that is not enabled by vkCreateInstance is considered to be unsupported, hence it must not be returned by vkEnumerateDeviceExtensionProperties for any VkPhysicalDevice child of the instance.
34.4. Extension Compatibility
By default, all extensions are considered compatible with each other and any core API version, unless otherwise stated. Thus enabling such extensions does not otherwise alter the behavior of the application.
Each extension that is mutually exclusive or otherwise incompatible with
another extension or set of extensions documents them in the appendix summarizing that extension and has a corresponding Valid Usage
statement disallowing enabling such an incompatible combination of
extensions at VkInstance
creation time or VkDevice
creation
time, depending on the type of extensions participating in the interaction.
35. Features, Limits, and Formats
Vulkan is designed to support a wide variety of implementations, and as such there are a number of features, limits, and formats which are not supported on all implementations. Features describe functionality which is optional and which must be explicitly enabled before use. Limits describe implementation-dependent minimums, maximums, and other device characteristics that an application may need to be aware of. Supported buffer and image formats may vary across implementations. A minimum set of format features are guaranteed, but others must be explicitly queried before use to ensure they are supported by the implementation.
Note
The features and limits are reported via basic structures (that is, VkPhysicalDeviceFeatures and VkPhysicalDeviceLimits), as well as extensible structures (that is, VkPhysicalDeviceFeatures2 and VkPhysicalDeviceProperties2).
35.1. Features
The Specification defines a set of optional features that may be supported by a Vulkan implementation. Support for features is reported and enabled on a per-feature basis. Features are properties of the physical device.
To query supported features, call:
void vkGetPhysicalDeviceFeatures(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceFeatures* pFeatures);
- physicalDevice is the physical device from which to query the supported features.
- pFeatures is a pointer to a VkPhysicalDeviceFeatures structure in which the physical device features are returned. For each feature, a value of VK_TRUE specifies that the feature is supported on this physical device, and VK_FALSE specifies that the feature is not supported.
Fine-grained features used by a logical device must be enabled at
VkDevice
creation time.
If a feature is enabled that the physical device does not support,
VkDevice
creation will fail.
If an application uses a feature without enabling it at VkDevice
creation time, the device behavior is undefined.
The validation layer will warn if features are used without being enabled.
The fine-grained features are enabled by passing a pointer to the
VkPhysicalDeviceFeatures
structure via the pEnabledFeatures
member of the VkDeviceCreateInfo
structure that is passed into the
vkCreateDevice
call.
If a member of pEnabledFeatures
is set to VK_TRUE
or
VK_FALSE
, then the device will be created with the indicated feature
enabled or disabled, respectively.
Features can also be enabled by using the VkPhysicalDeviceFeatures2
structure.
If an application wishes to enable all features supported by a device, it
can simply pass in the VkPhysicalDeviceFeatures
structure that was
previously returned by vkGetPhysicalDeviceFeatures
.
To disable an individual feature, the application can set the desired
member to VK_FALSE
in the same structure.
Setting pEnabledFeatures
to NULL
and not including a VkPhysicalDeviceFeatures2 in the pNext
member of VkDeviceCreateInfo
is equivalent to setting all members of the structure to VK_FALSE
.
Note
Some features, such as robustBufferAccess, may incur a run-time performance cost. Application writers should carefully consider the implications of enabling all supported features.
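A sketch of the pattern described above: query the supported features, enable only what the application needs, and pass the result to vkCreateDevice. The specific feature chosen and the omitted queue setup are illustrative only.
VkPhysicalDeviceFeatures supported;
vkGetPhysicalDeviceFeatures(physicalDevice, &supported);

VkPhysicalDeviceFeatures enabled = {0};       // all features disabled by default
if (supported.samplerAnisotropy)
    enabled.samplerAnisotropy = VK_TRUE;      // enable only if supported

VkDeviceCreateInfo deviceInfo = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    // queue create info omitted in this sketch
    .pEnabledFeatures = &enabled,
};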
To query supported features defined by the core or extensions, call:
void vkGetPhysicalDeviceFeatures2(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceFeatures2* pFeatures);
or the equivalent command
void vkGetPhysicalDeviceFeatures2KHR(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceFeatures2* pFeatures);
-
physicalDevice
is the physical device from which to query the supported features. -
pFeatures
is a pointer to a VkPhysicalDeviceFeatures2 structure in which the physical device features are returned.
Each structure in pFeatures
and its pNext
chain contain members
corresponding to fine-grained features.
vkGetPhysicalDeviceFeatures2
writes each member to a boolean value
indicating whether that feature is supported.
The VkPhysicalDeviceFeatures2
structure is defined as:
typedef struct VkPhysicalDeviceFeatures2 {
VkStructureType sType;
void* pNext;
VkPhysicalDeviceFeatures features;
} VkPhysicalDeviceFeatures2;
or the equivalent
typedef VkPhysicalDeviceFeatures2 VkPhysicalDeviceFeatures2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- features is a structure of type VkPhysicalDeviceFeatures describing the fine-grained features of the Vulkan 1.0 API.
The pNext
chain of this structure is used to extend the structure with
features defined by extensions.
This structure can be used in vkGetPhysicalDeviceFeatures2 or can be
in the pNext
chain of a VkDeviceCreateInfo structure, in which
case it controls which features are enabled in the device in lieu of
pEnabledFeatures
.
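As an illustration of that usage (a sketch; chaining extension feature structures through pNext is possible but not shown):
VkPhysicalDeviceFeatures2 features2 = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2,
    .pNext = NULL,                   // extension feature structures could be chained here
};
vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);

VkDeviceCreateInfo deviceInfo = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = &features2,             // enables these features in lieu of pEnabledFeatures
    // queue create info omitted in this sketch
    .pEnabledFeatures = NULL,        // must be NULL when VkPhysicalDeviceFeatures2 is chained
};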
The VkPhysicalDeviceFeatures
structure is defined as:
typedef struct VkPhysicalDeviceFeatures {
VkBool32 robustBufferAccess;
VkBool32 fullDrawIndexUint32;
VkBool32 imageCubeArray;
VkBool32 independentBlend;
VkBool32 geometryShader;
VkBool32 tessellationShader;
VkBool32 sampleRateShading;
VkBool32 dualSrcBlend;
VkBool32 logicOp;
VkBool32 multiDrawIndirect;
VkBool32 drawIndirectFirstInstance;
VkBool32 depthClamp;
VkBool32 depthBiasClamp;
VkBool32 fillModeNonSolid;
VkBool32 depthBounds;
VkBool32 wideLines;
VkBool32 largePoints;
VkBool32 alphaToOne;
VkBool32 multiViewport;
VkBool32 samplerAnisotropy;
VkBool32 textureCompressionETC2;
VkBool32 textureCompressionASTC_LDR;
VkBool32 textureCompressionBC;
VkBool32 occlusionQueryPrecise;
VkBool32 pipelineStatisticsQuery;
VkBool32 vertexPipelineStoresAndAtomics;
VkBool32 fragmentStoresAndAtomics;
VkBool32 shaderTessellationAndGeometryPointSize;
VkBool32 shaderImageGatherExtended;
VkBool32 shaderStorageImageExtendedFormats;
VkBool32 shaderStorageImageMultisample;
VkBool32 shaderStorageImageReadWithoutFormat;
VkBool32 shaderStorageImageWriteWithoutFormat;
VkBool32 shaderUniformBufferArrayDynamicIndexing;
VkBool32 shaderSampledImageArrayDynamicIndexing;
VkBool32 shaderStorageBufferArrayDynamicIndexing;
VkBool32 shaderStorageImageArrayDynamicIndexing;
VkBool32 shaderClipDistance;
VkBool32 shaderCullDistance;
VkBool32 shaderFloat64;
VkBool32 shaderInt64;
VkBool32 shaderInt16;
VkBool32 shaderResourceResidency;
VkBool32 shaderResourceMinLod;
VkBool32 sparseBinding;
VkBool32 sparseResidencyBuffer;
VkBool32 sparseResidencyImage2D;
VkBool32 sparseResidencyImage3D;
VkBool32 sparseResidency2Samples;
VkBool32 sparseResidency4Samples;
VkBool32 sparseResidency8Samples;
VkBool32 sparseResidency16Samples;
VkBool32 sparseResidencyAliased;
VkBool32 variableMultisampleRate;
VkBool32 inheritedQueries;
} VkPhysicalDeviceFeatures;
The members of the VkPhysicalDeviceFeatures
structure describe the
following features:
- robustBufferAccess specifies that accesses to buffers are bounds-checked against the range of the buffer descriptor (as determined by VkDescriptorBufferInfo::range, VkBufferViewCreateInfo::range, or the size of the buffer). Out of bounds accesses must not cause application termination, and the effects of shader loads, stores, and atomics must conform to an implementation-dependent behavior as described below.
  - A buffer access is considered to be out of bounds if any of the following are true:
    - The pointer was formed by OpImageTexelPointer and the coordinate is less than zero or greater than or equal to the number of whole elements in the bound range.
    - The pointer was not formed by OpImageTexelPointer and the object pointed to is not wholly contained within the bound range. This includes accesses performed via variable pointers where the buffer descriptor being accessed cannot be statically determined. Uninitialized pointers and pointers equal to OpConstantNull are treated as pointing to a zero-sized object, so all accesses through such pointers are considered to be out of bounds. Note: If a SPIR-V OpLoad instruction loads a structure and the tail end of the structure is out of bounds, then all members of the structure are considered out of bounds even if the members at the end are not statically used.
    - If any buffer access in a given SPIR-V block is determined to be out of bounds, then any other access of the same type (load, store, or atomic) in the same SPIR-V block that accesses an address less than 16 bytes away from the out of bounds address may also be considered out of bounds.
  - Out-of-bounds buffer loads will return any of the following values:
    - Values from anywhere within the memory range(s) bound to the buffer (possibly including bytes of memory past the end of the buffer, up to the end of the bound range).
    - Zero values, or (0,0,0,x) vectors for vector reads where x is a valid value represented in the type of the vector components and may be any of:
      - 0, 1, or the maximum representable positive integer value, for signed or unsigned integer components
      - 0.0 or 1.0, for floating-point components
  - Out-of-bounds writes may modify values within the memory range(s) bound to the buffer, but must not modify any other memory.
  - Out-of-bounds atomics may modify values within the memory range(s) bound to the buffer, but must not modify any other memory, and return an undefined value.
  - Vertex input attributes are considered out of bounds if the offset of the attribute in the bound vertex buffer range plus the size of the attribute is greater than either:
    - vertexBufferRangeSize, if bindingStride == 0; or
    - (vertexBufferRangeSize - (vertexBufferRangeSize % bindingStride))
    where vertexBufferRangeSize is the byte size of the memory range bound to the vertex buffer binding and bindingStride is the byte stride of the corresponding vertex input binding. Further, if any vertex input attribute using a specific vertex input binding is out of bounds, then all vertex input attributes using that vertex input binding for that vertex shader invocation are considered out of bounds.
    - If a vertex input attribute is out of bounds, it will be assigned one of the following values:
      - Values from anywhere within the memory range(s) bound to the buffer, converted according to the format of the attribute.
      - Zero values, format converted according to the format of the attribute.
      - Zero values, or (0,0,0,x) vectors, as described above.
  - If robustBufferAccess is not enabled, out of bounds accesses may corrupt any memory within the process and cause undefined behavior up to and including application termination.
- fullDrawIndexUint32 specifies the full 32-bit range of indices is supported for indexed draw calls when using a VkIndexType of VK_INDEX_TYPE_UINT32. maxDrawIndexedIndexValue is the maximum index value that may be used (aside from the primitive restart index, which is always 2^32-1 when the VkIndexType is VK_INDEX_TYPE_UINT32). If this feature is supported, maxDrawIndexedIndexValue must be 2^32-1; otherwise it must be no smaller than 2^24-1. See maxDrawIndexedIndexValue.
-
imageCubeArray
specifies whether image views with a VkImageViewType ofVK_IMAGE_VIEW_TYPE_CUBE_ARRAY
can be created, and that the correspondingSampledCubeArray
andImageCubeArray
SPIR-V capabilities can be used in shader code. -
independentBlend
specifies whether theVkPipelineColorBlendAttachmentState
settings are controlled independently per-attachment. If this feature is not enabled, theVkPipelineColorBlendAttachmentState
settings for all color attachments must be identical. Otherwise, a differentVkPipelineColorBlendAttachmentState
can be provided for each bound color attachment. -
geometryShader
specifies whether geometry shaders are supported. If this feature is not enabled, theVK_SHADER_STAGE_GEOMETRY_BIT
andVK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT
enum values must not be used. This also specifies whether shader modules can declare theGeometry
capability. -
tessellationShader
specifies whether tessellation control and evaluation shaders are supported. If this feature is not enabled, theVK_SHADER_STAGE_TESSELLATION_CONTROL_BIT
,VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT
,VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT
,VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT
, andVK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO
enum values must not be used. This also specifies whether shader modules can declare theTessellation
capability. -
sampleRateShading
specifies whether Sample Shading and multisample interpolation are supported. If this feature is not enabled, thesampleShadingEnable
member of theVkPipelineMultisampleStateCreateInfo
structure must be set toVK_FALSE
and theminSampleShading
member is ignored. This also specifies whether shader modules can declare theSampleRateShading
capability. -
dualSrcBlend
specifies whether blend operations which take two sources are supported. If this feature is not enabled, theVK_BLEND_FACTOR_SRC1_COLOR
,VK_BLEND_FACTOR_ONE_MINUS_SRC1_COLOR
,VK_BLEND_FACTOR_SRC1_ALPHA
, andVK_BLEND_FACTOR_ONE_MINUS_SRC1_ALPHA
enum values must not be used as source or destination blending factors. See Dual-Source Blending. -
logicOp
specifies whether logic operations are supported. If this feature is not enabled, thelogicOpEnable
member of theVkPipelineColorBlendStateCreateInfo
structure must be set toVK_FALSE
, and thelogicOp
member is ignored. -
multiDrawIndirect
specifies whether multiple draw indirect is supported. If this feature is not enabled, thedrawCount
parameter to thevkCmdDrawIndirect
andvkCmdDrawIndexedIndirect
commands must be 0 or 1. ThemaxDrawIndirectCount
member of theVkPhysicalDeviceLimits
structure must also be 1 if this feature is not supported. See maxDrawIndirectCount. -
drawIndirectFirstInstance
specifies whether indirect draw calls support thefirstInstance
parameter. If this feature is not enabled, thefirstInstance
member of allVkDrawIndirectCommand
andVkDrawIndexedIndirectCommand
structures that are provided to thevkCmdDrawIndirect
andvkCmdDrawIndexedIndirect
commands must be 0. -
depthClamp
specifies whether depth clamping is supported. If this feature is not enabled, thedepthClampEnable
member of theVkPipelineRasterizationStateCreateInfo
structure must be set toVK_FALSE
. Otherwise, settingdepthClampEnable
toVK_TRUE
will enable depth clamping. -
depthBiasClamp
specifies whether depth bias clamping is supported. If this feature is not enabled, thedepthBiasClamp
member of theVkPipelineRasterizationStateCreateInfo
structure must be set to 0.0 unless theVK_DYNAMIC_STATE_DEPTH_BIAS
dynamic state is enabled, and thedepthBiasClamp
parameter tovkCmdSetDepthBias
must be set to 0.0. -
fillModeNonSolid
specifies whether point and wireframe fill modes are supported. If this feature is not enabled, theVK_POLYGON_MODE_POINT
andVK_POLYGON_MODE_LINE
enum values must not be used. -
depthBounds
specifies whether depth bounds tests are supported. If this feature is not enabled, thedepthBoundsTestEnable
member of theVkPipelineDepthStencilStateCreateInfo
structure must be set toVK_FALSE
. WhendepthBoundsTestEnable
is set toVK_FALSE
, theminDepthBounds
andmaxDepthBounds
members of theVkPipelineDepthStencilStateCreateInfo
structure are ignored. -
wideLines
specifies whether lines with width other than 1.0 are supported. If this feature is not enabled, thelineWidth
member of theVkPipelineRasterizationStateCreateInfo
structure must be set to 1.0 unless theVK_DYNAMIC_STATE_LINE_WIDTH
dynamic state is enabled, and thelineWidth
parameter tovkCmdSetLineWidth
must be set to 1.0. When this feature is supported, the range and granularity of supported line widths are indicated by thelineWidthRange
andlineWidthGranularity
members of theVkPhysicalDeviceLimits
structure, respectively. -
largePoints
specifies whether points with size greater than 1.0 are supported. If this feature is not enabled, only a point size of 1.0 written by a shader is supported. The range and granularity of supported point sizes are indicated by thepointSizeRange
andpointSizeGranularity
members of theVkPhysicalDeviceLimits
structure, respectively. -
alphaToOne
specifies whether the implementation is able to replace the alpha value of the color fragment output from the fragment shader with the maximum representable alpha value for fixed-point colors or 1.0 for floating-point colors. If this feature is not enabled, then thealphaToOneEnable
member of theVkPipelineMultisampleStateCreateInfo
structure must be set toVK_FALSE
. Otherwise settingalphaToOneEnable
toVK_TRUE
will enable alpha-to-one behavior. -
multiViewport specifies whether more than one viewport is supported. If this feature is not enabled:
  - The viewportCount and scissorCount members of the VkPipelineViewportStateCreateInfo structure must be set to 1.
  - The firstViewport and viewportCount parameters to the vkCmdSetViewport command must be set to 0 and 1, respectively.
  - The firstScissor and scissorCount parameters to the vkCmdSetScissor command must be set to 0 and 1, respectively.
  - The exclusiveScissorCount member of the VkPipelineViewportExclusiveScissorStateCreateInfoNV structure must be set to 0 or 1.
  - The firstExclusiveScissor and exclusiveScissorCount parameters to the vkCmdSetExclusiveScissorNV command must be set to 0 and 1, respectively.
-
samplerAnisotropy
specifies whether anisotropic filtering is supported. If this feature is not enabled, theanisotropyEnable
member of theVkSamplerCreateInfo
structure must beVK_FALSE
. -
textureCompressionETC2
specifies whether all of the ETC2 and EAC compressed texture formats are supported. If this feature is enabled, then theVK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT
,VK_FORMAT_FEATURE_BLIT_SRC_BIT
andVK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT
features must be supported inoptimalTilingFeatures
for the following formats:-
VK_FORMAT_ETC2_R8G8B8_UNORM_BLOCK
-
VK_FORMAT_ETC2_R8G8B8_SRGB_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A1_UNORM_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A1_SRGB_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A8_UNORM_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A8_SRGB_BLOCK
-
VK_FORMAT_EAC_R11_UNORM_BLOCK
-
VK_FORMAT_EAC_R11_SNORM_BLOCK
-
VK_FORMAT_EAC_R11G11_UNORM_BLOCK
-
VK_FORMAT_EAC_R11G11_SNORM_BLOCK
To query for additional properties, or if the feature is not enabled, vkGetPhysicalDeviceFormatProperties and vkGetPhysicalDeviceImageFormatProperties can be used to check for supported properties of individual formats as normal.
-
-
textureCompressionASTC_LDR
specifies whether all of the ASTC LDR compressed texture formats are supported. If this feature is enabled, then theVK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT
,VK_FORMAT_FEATURE_BLIT_SRC_BIT
andVK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT
features must be supported inoptimalTilingFeatures
for the following formats:-
VK_FORMAT_ASTC_4x4_UNORM_BLOCK
-
VK_FORMAT_ASTC_4x4_SRGB_BLOCK
-
VK_FORMAT_ASTC_5x4_UNORM_BLOCK
-
VK_FORMAT_ASTC_5x4_SRGB_BLOCK
-
VK_FORMAT_ASTC_5x5_UNORM_BLOCK
-
VK_FORMAT_ASTC_5x5_SRGB_BLOCK
-
VK_FORMAT_ASTC_6x5_UNORM_BLOCK
-
VK_FORMAT_ASTC_6x5_SRGB_BLOCK
-
VK_FORMAT_ASTC_6x6_UNORM_BLOCK
-
VK_FORMAT_ASTC_6x6_SRGB_BLOCK
-
VK_FORMAT_ASTC_8x5_UNORM_BLOCK
-
VK_FORMAT_ASTC_8x5_SRGB_BLOCK
-
VK_FORMAT_ASTC_8x6_UNORM_BLOCK
-
VK_FORMAT_ASTC_8x6_SRGB_BLOCK
-
VK_FORMAT_ASTC_8x8_UNORM_BLOCK
-
VK_FORMAT_ASTC_8x8_SRGB_BLOCK
-
VK_FORMAT_ASTC_10x5_UNORM_BLOCK
-
VK_FORMAT_ASTC_10x5_SRGB_BLOCK
-
VK_FORMAT_ASTC_10x6_UNORM_BLOCK
-
VK_FORMAT_ASTC_10x6_SRGB_BLOCK
-
VK_FORMAT_ASTC_10x8_UNORM_BLOCK
-
VK_FORMAT_ASTC_10x8_SRGB_BLOCK
-
VK_FORMAT_ASTC_10x10_UNORM_BLOCK
-
VK_FORMAT_ASTC_10x10_SRGB_BLOCK
-
VK_FORMAT_ASTC_12x10_UNORM_BLOCK
-
VK_FORMAT_ASTC_12x10_SRGB_BLOCK
-
VK_FORMAT_ASTC_12x12_UNORM_BLOCK
-
VK_FORMAT_ASTC_12x12_SRGB_BLOCK
To query for additional properties, or if the feature is not enabled, vkGetPhysicalDeviceFormatProperties and vkGetPhysicalDeviceImageFormatProperties can be used to check for supported properties of individual formats as normal.
-
-
textureCompressionBC
specifies whether all of the BC compressed texture formats are supported. If this feature is enabled, then theVK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT
,VK_FORMAT_FEATURE_BLIT_SRC_BIT
andVK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT
features must be supported inoptimalTilingFeatures
for the following formats:-
VK_FORMAT_BC1_RGB_UNORM_BLOCK
-
VK_FORMAT_BC1_RGB_SRGB_BLOCK
-
VK_FORMAT_BC1_RGBA_UNORM_BLOCK
-
VK_FORMAT_BC1_RGBA_SRGB_BLOCK
-
VK_FORMAT_BC2_UNORM_BLOCK
-
VK_FORMAT_BC2_SRGB_BLOCK
-
VK_FORMAT_BC3_UNORM_BLOCK
-
VK_FORMAT_BC3_SRGB_BLOCK
-
VK_FORMAT_BC4_UNORM_BLOCK
-
VK_FORMAT_BC4_SNORM_BLOCK
-
VK_FORMAT_BC5_UNORM_BLOCK
-
VK_FORMAT_BC5_SNORM_BLOCK
-
VK_FORMAT_BC6H_UFLOAT_BLOCK
-
VK_FORMAT_BC6H_SFLOAT_BLOCK
-
VK_FORMAT_BC7_UNORM_BLOCK
-
VK_FORMAT_BC7_SRGB_BLOCK
To query for additional properties, or if the feature is not enabled, vkGetPhysicalDeviceFormatProperties and vkGetPhysicalDeviceImageFormatProperties can be used to check for supported properties of individual formats as normal.
-
-
occlusionQueryPrecise
specifies whether occlusion queries returning actual sample counts are supported. Occlusion queries are created in aVkQueryPool
by specifying thequeryType
ofVK_QUERY_TYPE_OCCLUSION
in theVkQueryPoolCreateInfo
structure which is passed tovkCreateQueryPool
. If this feature is enabled, queries of this type can enableVK_QUERY_CONTROL_PRECISE_BIT
in theflags
parameter tovkCmdBeginQuery
. If this feature is not supported, the implementation supports only boolean occlusion queries. When any samples are passed, boolean queries will return a non-zero result value, otherwise a result value of zero is returned. When this feature is enabled andVK_QUERY_CONTROL_PRECISE_BIT
is set, occlusion queries will report the actual number of samples passed. -
pipelineStatisticsQuery
specifies whether the pipeline statistics queries are supported. If this feature is not enabled, queries of typeVK_QUERY_TYPE_PIPELINE_STATISTICS
cannot be created, and none of the VkQueryPipelineStatisticFlagBits bits can be set in thepipelineStatistics
member of theVkQueryPoolCreateInfo
structure. -
vertexPipelineStoresAndAtomics
specifies whether storage buffers and images support stores and atomic operations in the vertex, tessellation, and geometry shader stages. If this feature is not enabled, all storage image, storage texel buffers, and storage buffer variables used by these stages in shader modules must be decorated with theNonWritable
decoration (or thereadonly
memory qualifier in GLSL). -
fragmentStoresAndAtomics
specifies whether storage buffers and images support stores and atomic operations in the fragment shader stage. If this feature is not enabled, all storage image, storage texel buffers, and storage buffer variables used by the fragment stage in shader modules must be decorated with theNonWritable
decoration (or thereadonly
memory qualifier in GLSL). -
shaderTessellationAndGeometryPointSize
specifies whether thePointSize
built-in decoration is available in the tessellation control, tessellation evaluation, and geometry shader stages. If this feature is not enabled, members decorated with thePointSize
built-in decoration must not be read from or written to and all points written from a tessellation or geometry shader will have a size of 1.0. This also specifies whether shader modules can declare theTessellationPointSize
capability for tessellation control and evaluation shaders, or if the shader modules can declare theGeometryPointSize
capability for geometry shaders. An implementation supporting this feature must also support one or both of thetessellationShader
orgeometryShader
features. -
shaderImageGatherExtended
specifies whether the extended set of image gather instructions are available in shader code. If this feature is not enabled, theOpImage
*Gather
instructions do not support theOffset
andConstOffsets
operands. This also specifies whether shader modules can declare theImageGatherExtended
capability. -
shaderStorageImageExtendedFormats specifies whether all the extended storage image formats are available in shader code. If this feature is enabled, the VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT feature must be supported in optimalTilingFeatures for all of the extended formats. To query for additional properties, or if the feature is not enabled, vkGetPhysicalDeviceFormatProperties and vkGetPhysicalDeviceImageFormatProperties can be used to check for supported properties of individual formats as normal (see the sketch following this list).
- shaderStorageImageMultisample specifies whether multisampled storage images are supported. If this feature is not enabled, images that are created with a usage that includes VK_IMAGE_USAGE_STORAGE_BIT must be created with samples equal to VK_SAMPLE_COUNT_1_BIT. This also specifies whether shader modules can declare the StorageImageMultisample capability.
- shaderStorageImageReadWithoutFormat specifies whether storage images require a format qualifier to be specified when reading from storage images. If this feature is not enabled, the OpImageRead instruction must not have an OpTypeImage of Unknown. This also specifies whether shader modules can declare the StorageImageReadWithoutFormat capability.
- shaderStorageImageWriteWithoutFormat specifies whether storage images require a format qualifier to be specified when writing to storage images. If this feature is not enabled, the OpImageWrite instruction must not have an OpTypeImage of Unknown. This also specifies whether shader modules can declare the StorageImageWriteWithoutFormat capability.
- shaderUniformBufferArrayDynamicIndexing specifies whether arrays of uniform buffers can be indexed by dynamically uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC must be indexed only by constant integral expressions when aggregated into arrays in shader code. This also specifies whether shader modules can declare the UniformBufferArrayDynamicIndexing capability.
- shaderSampledImageArrayDynamicIndexing specifies whether arrays of samplers or sampled images can be indexed by dynamically uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_SAMPLER, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, or VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE must be indexed only by constant integral expressions when aggregated into arrays in shader code. This also specifies whether shader modules can declare the SampledImageArrayDynamicIndexing capability.
- shaderStorageBufferArrayDynamicIndexing specifies whether arrays of storage buffers can be indexed by dynamically uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC must be indexed only by constant integral expressions when aggregated into arrays in shader code. This also specifies whether shader modules can declare the StorageBufferArrayDynamicIndexing capability.
- shaderStorageImageArrayDynamicIndexing specifies whether arrays of storage images can be indexed by dynamically uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_STORAGE_IMAGE must be indexed only by constant integral expressions when aggregated into arrays in shader code. This also specifies whether shader modules can declare the StorageImageArrayDynamicIndexing capability.
- shaderClipDistance specifies whether clip distances are supported in shader code. If this feature is not enabled, any members decorated with the ClipDistance built-in decoration must not be read from or written to in shader modules. This also specifies whether shader modules can declare the ClipDistance capability.
- shaderCullDistance specifies whether cull distances are supported in shader code. If this feature is not enabled, any members decorated with the CullDistance built-in decoration must not be read from or written to in shader modules. This also specifies whether shader modules can declare the CullDistance capability.
- shaderFloat64 specifies whether 64-bit floats (doubles) are supported in shader code. If this feature is not enabled, 64-bit floating-point types must not be used in shader code. This also specifies whether shader modules can declare the Float64 capability.
- shaderInt64 specifies whether 64-bit integers (signed and unsigned) are supported in shader code. If this feature is not enabled, 64-bit integer types must not be used in shader code. This also specifies whether shader modules can declare the Int64 capability.
- shaderInt16 specifies whether 16-bit integers (signed and unsigned) are supported in shader code. If this feature is not enabled, 16-bit integer types must not be used in shader code. This also specifies whether shader modules can declare the Int16 capability.
- shaderResourceResidency specifies whether image operations that return resource residency information are supported in shader code. If this feature is not enabled, the OpImageSparse* instructions must not be used in shader code. This also specifies whether shader modules can declare the SparseResidency capability. The feature requires at least one of the sparseResidency* features to be supported.
- shaderResourceMinLod specifies whether image operations that specify the minimum resource LOD are supported in shader code. If this feature is not enabled, the MinLod image operand must not be used in shader code. This also specifies whether shader modules can declare the MinLod capability.
- sparseBinding specifies whether resource memory can be managed at opaque sparse block level instead of at the object level. If this feature is not enabled, resource memory must be bound only on a per-object basis using the vkBindBufferMemory and vkBindImageMemory commands. In this case, buffers and images must not be created with VK_BUFFER_CREATE_SPARSE_BINDING_BIT and VK_IMAGE_CREATE_SPARSE_BINDING_BIT set in the flags member of the VkBufferCreateInfo and VkImageCreateInfo structures, respectively. Otherwise resource memory can be managed as described in Sparse Resource Features.
- sparseResidencyBuffer specifies whether the device can access partially resident buffers. If this feature is not enabled, buffers must not be created with VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT set in the flags member of the VkBufferCreateInfo structure.
- sparseResidencyImage2D specifies whether the device can access partially resident 2D images with 1 sample per pixel. If this feature is not enabled, images with an imageType of VK_IMAGE_TYPE_2D and samples set to VK_SAMPLE_COUNT_1_BIT must not be created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT set in the flags member of the VkImageCreateInfo structure.
- sparseResidencyImage3D specifies whether the device can access partially resident 3D images. If this feature is not enabled, images with an imageType of VK_IMAGE_TYPE_3D must not be created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT set in the flags member of the VkImageCreateInfo structure.
- sparseResidency2Samples specifies whether the physical device can access partially resident 2D images with 2 samples per pixel. If this feature is not enabled, images with an imageType of VK_IMAGE_TYPE_2D and samples set to VK_SAMPLE_COUNT_2_BIT must not be created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT set in the flags member of the VkImageCreateInfo structure.
- sparseResidency4Samples specifies whether the physical device can access partially resident 2D images with 4 samples per pixel. If this feature is not enabled, images with an imageType of VK_IMAGE_TYPE_2D and samples set to VK_SAMPLE_COUNT_4_BIT must not be created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT set in the flags member of the VkImageCreateInfo structure.
- sparseResidency8Samples specifies whether the physical device can access partially resident 2D images with 8 samples per pixel. If this feature is not enabled, images with an imageType of VK_IMAGE_TYPE_2D and samples set to VK_SAMPLE_COUNT_8_BIT must not be created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT set in the flags member of the VkImageCreateInfo structure.
- sparseResidency16Samples specifies whether the physical device can access partially resident 2D images with 16 samples per pixel. If this feature is not enabled, images with an imageType of VK_IMAGE_TYPE_2D and samples set to VK_SAMPLE_COUNT_16_BIT must not be created with VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT set in the flags member of the VkImageCreateInfo structure.
- sparseResidencyAliased specifies whether the physical device can correctly access data aliased into multiple locations. If this feature is not enabled, the VK_BUFFER_CREATE_SPARSE_ALIASED_BIT and VK_IMAGE_CREATE_SPARSE_ALIASED_BIT enum values must not be used in flags members of the VkBufferCreateInfo and VkImageCreateInfo structures, respectively.
- variableMultisampleRate specifies whether all pipelines that will be bound to a command buffer during a subpass with no attachments must have the same value for VkPipelineMultisampleStateCreateInfo::rasterizationSamples. If set to VK_TRUE, the implementation supports variable multisample rates in a subpass with no attachments. If set to VK_FALSE, then all pipelines bound in such a subpass must have the same multisample rate. This has no effect in situations where a subpass uses any attachments.
- inheritedQueries specifies whether a secondary command buffer may be executed while a query is active.
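As noted for shaderStorageImageExtendedFormats above, per-format support can also be checked directly. The following fragment is an informal, non-normative sketch; the physicalDevice handle and the choice of VK_FORMAT_R16G16_SFLOAT are assumptions made only for illustration:

#include <stdbool.h>
#include <vulkan/vulkan.h>

/* Returns true if 'format' supports storage image usage with optimal tiling. */
static bool supportsOptimalStorageImage(VkPhysicalDevice physicalDevice, VkFormat format)
{
    VkFormatProperties properties;
    vkGetPhysicalDeviceFormatProperties(physicalDevice, format, &properties);
    return (properties.optimalTilingFeatures & VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT) != 0;
}

/* Example call site: supportsOptimalStorageImage(physicalDevice, VK_FORMAT_R16G16_SFLOAT) */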
The VkPhysicalDeviceVariablePointerFeatures
structure is defined as:
typedef struct VkPhysicalDeviceVariablePointerFeatures {
VkStructureType sType;
void* pNext;
VkBool32 variablePointersStorageBuffer;
VkBool32 variablePointers;
} VkPhysicalDeviceVariablePointerFeatures;
or the equivalent
typedef VkPhysicalDeviceVariablePointerFeatures VkPhysicalDeviceVariablePointerFeaturesKHR;
The members of the VkPhysicalDeviceVariablePointerFeatures
structure
describe the following features:
- variablePointersStorageBuffer specifies whether the implementation supports the SPIR-V VariablePointersStorageBuffer capability. When this feature is not enabled, shader modules must not declare the SPV_KHR_variable_pointers extension or the VariablePointersStorageBuffer capability.
- variablePointers specifies whether the implementation supports the SPIR-V VariablePointers capability. When this feature is not enabled, shader modules must not declare the VariablePointers capability.
If the VkPhysicalDeviceVariablePointerFeatures
structure is included
in the pNext
chain of VkPhysicalDeviceFeatures2, it is filled
with values indicating whether each feature is supported.
VkPhysicalDeviceVariablePointerFeatures
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable the features.
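The query-then-enable pattern just described might look as follows in application code. This is an informal sketch only: the physicalDevice handle is assumed, and queue creation and error handling are omitted.

VkPhysicalDeviceVariablePointerFeatures variablePointerFeatures = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES,
};
VkPhysicalDeviceFeatures2 features2 = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2,
    .pNext = &variablePointerFeatures,
};

/* Fills variablePointerFeatures with the values the implementation supports. */
vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);

VkDeviceCreateInfo deviceCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    /* Chaining the queried structure here enables the features it reports as VK_TRUE. */
    .pNext = &variablePointerFeatures,
    /* pQueueCreateInfos and the rest of the structure are omitted for brevity. */
};
VkDevice device;
vkCreateDevice(physicalDevice, &deviceCreateInfo, NULL, &device);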
The VkPhysicalDeviceMultiviewFeatures
structure is defined as:
typedef struct VkPhysicalDeviceMultiviewFeatures {
VkStructureType sType;
void* pNext;
VkBool32 multiview;
VkBool32 multiviewGeometryShader;
VkBool32 multiviewTessellationShader;
} VkPhysicalDeviceMultiviewFeatures;
or the equivalent
typedef VkPhysicalDeviceMultiviewFeatures VkPhysicalDeviceMultiviewFeaturesKHR;
The members of the VkPhysicalDeviceMultiviewFeatures
structure
describe the following features:
- multiview specifies whether the implementation supports multiview rendering within a render pass. If this feature is not enabled, the view mask of each subpass must always be zero.
- multiviewGeometryShader specifies whether the implementation supports multiview rendering within a render pass, with geometry shaders. If this feature is not enabled, then a pipeline compiled against a subpass with a non-zero view mask must not include a geometry shader.
- multiviewTessellationShader specifies whether the implementation supports multiview rendering within a render pass, with tessellation shaders. If this feature is not enabled, then a pipeline compiled against a subpass with a non-zero view mask must not include any tessellation shaders.
If the VkPhysicalDeviceMultiviewFeatures
structure is included in the
pNext
chain of VkPhysicalDeviceFeatures2, it is filled with
values indicating whether each feature is supported.
VkPhysicalDeviceMultiviewFeatures
can also be used in the pNext
chain of VkDeviceCreateInfo to enable the features.
To query 64-bit atomic support for signed and unsigned integers, call
vkGetPhysicalDeviceFeatures2 with a
VkPhysicalDeviceShaderAtomicInt64FeaturesKHR
structure included in the
pNext
chain of its pFeatures
parameter.
The VkPhysicalDeviceShaderAtomicInt64FeaturesKHR structure is defined as:
typedef struct VkPhysicalDeviceShaderAtomicInt64FeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 shaderBufferInt64Atomics;
VkBool32 shaderSharedInt64Atomics;
} VkPhysicalDeviceShaderAtomicInt64FeaturesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- shaderBufferInt64Atomics indicates whether shaders can support 64-bit unsigned and signed integer atomic operations on buffers.
- shaderSharedInt64Atomics indicates whether shaders can support 64-bit unsigned and signed integer atomic operations on shared memory.
To query the additionally supported 8-bit storage features, call
vkGetPhysicalDeviceFeatures2 with a
VkPhysicalDevice8BitStorageFeaturesKHR
structure included in the
pNext
chain of its pFeatures
parameter.
The VkPhysicalDevice8BitStorageFeaturesKHR
structure can also be in
the pNext
chain of a VkDeviceCreateInfo structure, in which case
it controls which additional features are enabled in the device.
The VkPhysicalDevice8BitStorageFeaturesKHR structure is defined as:
typedef struct VkPhysicalDevice8BitStorageFeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 storageBuffer8BitAccess;
VkBool32 uniformAndStorageBuffer8BitAccess;
VkBool32 storagePushConstant8;
} VkPhysicalDevice8BitStorageFeaturesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- storageBuffer8BitAccess indicates whether objects in the StorageBuffer storage class with the Block decoration can have 8-bit integer members. If this feature is not enabled, 8-bit integer members must not be used in such objects. This also indicates whether shader modules can declare the StorageBuffer8BitAccess capability.
- uniformAndStorageBuffer8BitAccess indicates whether objects in the Uniform storage class with the Block decoration and in the StorageBuffer storage class with the same decoration can have 8-bit integer members. If this feature is not enabled, 8-bit integer members must not be used in such objects. This also indicates whether shader modules can declare the UniformAndStorageBuffer8BitAccess capability.
- storagePushConstant8 indicates whether objects in the PushConstant storage class can have 8-bit integer members. If this feature is not enabled, 8-bit integer members must not be used in such objects. This also indicates whether shader modules can declare the StoragePushConstant8 capability.
To query the additionally supported 16-bit storage features, call
vkGetPhysicalDeviceFeatures2 with a
VkPhysicalDevice16BitStorageFeatures
structure included in the
pNext
chain of its pFeatures
parameter.
The VkPhysicalDevice16BitStorageFeatures
structure can also be in the
pNext
chain of a VkDeviceCreateInfo structure, in which case it
controls which additional features are enabled in the device.
The VkPhysicalDevice16BitStorageFeatures structure is defined as:
typedef struct VkPhysicalDevice16BitStorageFeatures {
VkStructureType sType;
void* pNext;
VkBool32 storageBuffer16BitAccess;
VkBool32 uniformAndStorageBuffer16BitAccess;
VkBool32 storagePushConstant16;
VkBool32 storageInputOutput16;
} VkPhysicalDevice16BitStorageFeatures;
or the equivalent
typedef VkPhysicalDevice16BitStorageFeatures VkPhysicalDevice16BitStorageFeaturesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- storageBuffer16BitAccess specifies whether objects in the StorageBuffer storage class with the Block decoration can have 16-bit integer and 16-bit floating-point members. If this feature is not enabled, 16-bit integer or 16-bit floating-point members must not be used in such objects. This also specifies whether shader modules can declare the StorageBuffer16BitAccess capability.
- uniformAndStorageBuffer16BitAccess specifies whether objects in the Uniform storage class with the Block decoration and in the StorageBuffer storage class with the same decoration can have 16-bit integer and 16-bit floating-point members. If this feature is not enabled, 16-bit integer or 16-bit floating-point members must not be used in such objects. This also specifies whether shader modules can declare the UniformAndStorageBuffer16BitAccess capability.
- storagePushConstant16 specifies whether objects in the PushConstant storage class can have 16-bit integer and 16-bit floating-point members. If this feature is not enabled, 16-bit integer or 16-bit floating-point members must not be used in such objects. This also specifies whether shader modules can declare the StoragePushConstant16 capability.
- storageInputOutput16 specifies whether objects in the Input and Output storage classes can have 16-bit integer and 16-bit floating-point members. If this feature is not enabled, 16-bit integer or 16-bit floating-point members must not be used in such objects. This also specifies whether shader modules can declare the StorageInputOutput16 capability.
To query features additionally supported by the VK_KHR_shader_float16_int8
extension, call vkGetPhysicalDeviceFeatures2KHR with a
VkPhysicalDeviceFloat16Int8FeaturesKHR
structure in the pNext
chain.
The VkPhysicalDeviceFloat16Int8FeaturesKHR
structure can also be in
the pNext
chain of a VkDeviceCreateInfo structure, in which case
it controls which additional features are enabled in the device.
The VkPhysicalDeviceFloat16Int8FeaturesKHR
structure is defined as:
typedef struct VkPhysicalDeviceFloat16Int8FeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 shaderFloat16;
VkBool32 shaderInt8;
} VkPhysicalDeviceFloat16Int8FeaturesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- shaderFloat16 indicates whether 16-bit floats (halfs) are supported in shader code. This also indicates whether shader modules can declare the Float16 capability.
- shaderInt8 indicates whether 8-bit integers (signed and unsigned) are supported in shader code. This also indicates whether shader modules can declare the Int8 capability.
The VkPhysicalDeviceSamplerYcbcrConversionFeatures
structure is
defined as:
typedef struct VkPhysicalDeviceSamplerYcbcrConversionFeatures {
VkStructureType sType;
void* pNext;
VkBool32 samplerYcbcrConversion;
} VkPhysicalDeviceSamplerYcbcrConversionFeatures;
or the equivalent
typedef VkPhysicalDeviceSamplerYcbcrConversionFeatures VkPhysicalDeviceSamplerYcbcrConversionFeaturesKHR;
The members of the VkPhysicalDeviceSamplerYcbcrConversionFeatures
structure describe the following feature:
- samplerYcbcrConversion specifies whether the implementation supports sampler Y′CBCR conversion. If samplerYcbcrConversion is VK_FALSE, sampler Y′CBCR conversion is not supported, and samplers using sampler Y′CBCR conversion must not be used.
The VkPhysicalDeviceProtectedMemoryFeatures
structure is defined as:
typedef struct VkPhysicalDeviceProtectedMemoryFeatures {
VkStructureType sType;
void* pNext;
VkBool32 protectedMemory;
} VkPhysicalDeviceProtectedMemoryFeatures;
- protectedMemory specifies whether protected memory is supported.
If the VkPhysicalDeviceProtectedMemoryFeatures
structure is included
in the pNext
chain of VkPhysicalDeviceFeatures2, it is filled
with a value indicating whether the feature is supported.
The VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 advancedBlendCoherentOperations;
} VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT;
The members of the VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT
structure describe the following features:
- advancedBlendCoherentOperations specifies whether blending using advanced blend operations is guaranteed to execute atomically and in primitive order. If this is VK_TRUE, VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT is treated the same as VK_ACCESS_COLOR_ATTACHMENT_READ_BIT, and advanced blending needs no additional synchronization over basic blending. If this is VK_FALSE, then memory dependencies are required to guarantee order between two advanced blending operations that occur on the same sample.
If the VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2, it is
filled with values indicating whether each feature is supported.
VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT
can also be used in the pNext chain of VkDeviceCreateInfo to enable the features.
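When advancedBlendCoherentOperations is VK_FALSE, an explicit memory dependency is required between two advanced blending operations that touch the same sample. As an informal, non-normative sketch, such a dependency could be expressed as a by-region barrier between color attachment writes and subsequent non-coherent reads; the commandBuffer handle is assumed, and the enclosing subpass is assumed to declare a matching self-dependency:

VkMemoryBarrier blendBarrier = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT,
};
vkCmdPipelineBarrier(commandBuffer,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,   /* producing blend operations */
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,   /* consuming blend operations */
    VK_DEPENDENCY_BY_REGION_BIT,
    1, &blendBarrier, 0, NULL, 0, NULL);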
The VkPhysicalDeviceConditionalRenderingFeaturesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceConditionalRenderingFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 conditionalRendering;
VkBool32 inheritedConditionalRendering;
} VkPhysicalDeviceConditionalRenderingFeaturesEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- conditionalRendering specifies whether conditional rendering is supported.
- inheritedConditionalRendering specifies whether a secondary command buffer can be executed while conditional rendering is active in the primary command buffer.
If the VkPhysicalDeviceConditionalRenderingFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2, it is
filled with values indicating the implementation-dependent behavior.
VkPhysicalDeviceConditionalRenderingFeaturesEXT
can also be used in the pNext chain of VkDeviceCreateInfo to enable the features.
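A typical use of these features, sketched informally: the commandBuffer and predicateBuffer handles are assumptions, and the EXT entry points must have been obtained from a device created with VK_EXT_conditional_rendering enabled.

VkConditionalRenderingBeginInfoEXT conditionalBegin = {
    .sType = VK_STRUCTURE_TYPE_CONDITIONAL_RENDERING_BEGIN_INFO_EXT,
    .buffer = predicateBuffer,   /* 32-bit predicate value read at 'offset' */
    .offset = 0,
    .flags = 0,                  /* or VK_CONDITIONAL_RENDERING_INVERTED_BIT_EXT */
};
vkCmdBeginConditionalRenderingEXT(commandBuffer, &conditionalBegin);
vkCmdDraw(commandBuffer, 3, 1, 0, 0);   /* executed only if the predicate is non-zero */
vkCmdEndConditionalRenderingEXT(commandBuffer);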
The VkPhysicalDeviceShaderDrawParameterFeatures
structure is defined
as:
typedef struct VkPhysicalDeviceShaderDrawParameterFeatures {
VkStructureType sType;
void* pNext;
VkBool32 shaderDrawParameters;
} VkPhysicalDeviceShaderDrawParameterFeatures;
If the VkPhysicalDeviceShaderDrawParameterFeatures
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2, it is
filled with a value indicating whether the feature is supported.
The VkPhysicalDeviceMeshShaderFeaturesNV
structure is defined as:
typedef struct VkPhysicalDeviceMeshShaderFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 taskShader;
VkBool32 meshShader;
} VkPhysicalDeviceMeshShaderFeaturesNV;
If the VkPhysicalDeviceMeshShaderFeaturesNV
structure is included in
the pNext
chain of VkPhysicalDeviceFeatures2, it is filled with
values indicating whether each feature is supported.
VkPhysicalDeviceMeshShaderFeaturesNV
can also be used in the pNext chain of VkDeviceCreateInfo to enable the features.
The VkPhysicalDeviceDescriptorIndexingFeaturesEXT
structure is defined
as:
typedef struct VkPhysicalDeviceDescriptorIndexingFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 shaderInputAttachmentArrayDynamicIndexing;
VkBool32 shaderUniformTexelBufferArrayDynamicIndexing;
VkBool32 shaderStorageTexelBufferArrayDynamicIndexing;
VkBool32 shaderUniformBufferArrayNonUniformIndexing;
VkBool32 shaderSampledImageArrayNonUniformIndexing;
VkBool32 shaderStorageBufferArrayNonUniformIndexing;
VkBool32 shaderStorageImageArrayNonUniformIndexing;
VkBool32 shaderInputAttachmentArrayNonUniformIndexing;
VkBool32 shaderUniformTexelBufferArrayNonUniformIndexing;
VkBool32 shaderStorageTexelBufferArrayNonUniformIndexing;
VkBool32 descriptorBindingUniformBufferUpdateAfterBind;
VkBool32 descriptorBindingSampledImageUpdateAfterBind;
VkBool32 descriptorBindingStorageImageUpdateAfterBind;
VkBool32 descriptorBindingStorageBufferUpdateAfterBind;
VkBool32 descriptorBindingUniformTexelBufferUpdateAfterBind;
VkBool32 descriptorBindingStorageTexelBufferUpdateAfterBind;
VkBool32 descriptorBindingUpdateUnusedWhilePending;
VkBool32 descriptorBindingPartiallyBound;
VkBool32 descriptorBindingVariableDescriptorCount;
VkBool32 runtimeDescriptorArray;
} VkPhysicalDeviceDescriptorIndexingFeaturesEXT;
The members of the VkPhysicalDeviceDescriptorIndexingFeaturesEXT
structure describe the following features:
- shaderInputAttachmentArrayDynamicIndexing indicates whether arrays of input attachments can be indexed by dynamically uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT must be indexed only by constant integral expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the InputAttachmentArrayDynamicIndexingEXT capability.
- shaderUniformTexelBufferArrayDynamicIndexing indicates whether arrays of uniform texel buffers can be indexed by dynamically uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER must be indexed only by constant integral expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the UniformTexelBufferArrayDynamicIndexingEXT capability.
- shaderStorageTexelBufferArrayDynamicIndexing indicates whether arrays of storage texel buffers can be indexed by dynamically uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER must be indexed only by constant integral expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the StorageTexelBufferArrayDynamicIndexingEXT capability.
- shaderUniformBufferArrayNonUniformIndexing indicates whether arrays of uniform buffers can be indexed by non-uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC must not be indexed by non-uniform integer expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the UniformBufferArrayNonUniformIndexingEXT capability.
- shaderSampledImageArrayNonUniformIndexing indicates whether arrays of samplers or sampled images can be indexed by non-uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_SAMPLER, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, or VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE must not be indexed by non-uniform integer expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the SampledImageArrayNonUniformIndexingEXT capability.
- shaderStorageBufferArrayNonUniformIndexing indicates whether arrays of storage buffers can be indexed by non-uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC must not be indexed by non-uniform integer expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the StorageBufferArrayNonUniformIndexingEXT capability.
- shaderStorageImageArrayNonUniformIndexing indicates whether arrays of storage images can be indexed by non-uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_STORAGE_IMAGE must not be indexed by non-uniform integer expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the StorageImageArrayNonUniformIndexingEXT capability.
- shaderInputAttachmentArrayNonUniformIndexing indicates whether arrays of input attachments can be indexed by non-uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT must not be indexed by non-uniform integer expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the InputAttachmentArrayNonUniformIndexingEXT capability.
- shaderUniformTexelBufferArrayNonUniformIndexing indicates whether arrays of uniform texel buffers can be indexed by non-uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER must not be indexed by non-uniform integer expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the UniformTexelBufferArrayNonUniformIndexingEXT capability.
- shaderStorageTexelBufferArrayNonUniformIndexing indicates whether arrays of storage texel buffers can be indexed by non-uniform integer expressions in shader code. If this feature is not enabled, resources with a descriptor type of VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER must not be indexed by non-uniform integer expressions when aggregated into arrays in shader code. This also indicates whether shader modules can declare the StorageTexelBufferArrayNonUniformIndexingEXT capability.
- descriptorBindingUniformBufferUpdateAfterBind indicates whether the implementation supports updating uniform buffer descriptors after a set is bound. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT must not be used with VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER.
- descriptorBindingSampledImageUpdateAfterBind indicates whether the implementation supports updating sampled image descriptors after a set is bound. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT must not be used with VK_DESCRIPTOR_TYPE_SAMPLER, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, or VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE.
- descriptorBindingStorageImageUpdateAfterBind indicates whether the implementation supports updating storage image descriptors after a set is bound. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT must not be used with VK_DESCRIPTOR_TYPE_STORAGE_IMAGE.
- descriptorBindingStorageBufferUpdateAfterBind indicates whether the implementation supports updating storage buffer descriptors after a set is bound. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT must not be used with VK_DESCRIPTOR_TYPE_STORAGE_BUFFER.
- descriptorBindingUniformTexelBufferUpdateAfterBind indicates whether the implementation supports updating uniform texel buffer descriptors after a set is bound. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT must not be used with VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER.
- descriptorBindingStorageTexelBufferUpdateAfterBind indicates whether the implementation supports updating storage texel buffer descriptors after a set is bound. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT must not be used with VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER.
- descriptorBindingUpdateUnusedWhilePending indicates whether the implementation supports updating descriptors while the set is in use. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_UNUSED_WHILE_PENDING_BIT_EXT must not be used.
- descriptorBindingPartiallyBound indicates whether the implementation supports statically using a descriptor set binding in which some descriptors are not valid. If this feature is not enabled, VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT must not be used.
- descriptorBindingVariableDescriptorCount indicates whether the implementation supports descriptor sets with a variable-sized last binding. If this feature is not enabled, VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT must not be used.
- runtimeDescriptorArray indicates whether the implementation supports the SPIR-V RuntimeDescriptorArrayEXT capability. If this feature is not enabled, descriptors must not be declared in runtime arrays.
If the VkPhysicalDeviceDescriptorIndexingFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether each feature is supported.
VkPhysicalDeviceDescriptorIndexingFeaturesEXT
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable features.
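For illustration, a descriptor set layout that relies on several of these features might be created as sketched below. This is a non-normative sketch: a valid device handle is assumed, as is that descriptorBindingSampledImageUpdateAfterBind and descriptorBindingPartiallyBound were enabled at device creation.

VkDescriptorSetLayoutBinding textureArrayBinding = {
    .binding = 0,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
    .descriptorCount = 1024,                      /* large, partially populated array */
    .stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT,
};
VkDescriptorBindingFlagsEXT bindingFlags =
    VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT |
    VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT;
VkDescriptorSetLayoutBindingFlagsCreateInfoEXT bindingFlagsInfo = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT,
    .bindingCount = 1,
    .pBindingFlags = &bindingFlags,
};
VkDescriptorSetLayoutCreateInfo layoutInfo = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
    .pNext = &bindingFlagsInfo,
    .flags = VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT,
    .bindingCount = 1,
    .pBindings = &textureArrayBinding,
};
VkDescriptorSetLayout layout;
vkCreateDescriptorSetLayout(device, &layoutInfo, NULL, &layout);

Descriptor sets using such a layout would be allocated from a pool created with VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT.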
The VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 vertexAttributeInstanceRateDivisor;
VkBool32 vertexAttributeInstanceRateZeroDivisor;
} VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- vertexAttributeInstanceRateDivisor specifies whether vertex attribute fetching may be repeated in case of instanced rendering.
- vertexAttributeInstanceRateZeroDivisor specifies whether a zero value for VkVertexInputBindingDivisorDescriptionEXT::divisor is supported.
If the VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2, it is
filled with values indicating the implementation-dependent behavior.
VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT
can also be used in the pNext chain of VkDeviceCreateInfo to enable the feature.
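For illustration only (the binding number and divisor value below are arbitrary assumptions), the feature is used by chaining a divisor state structure into the vertex input state at pipeline creation:

VkVertexInputBindingDivisorDescriptionEXT instanceDivisor = {
    .binding = 1,   /* a binding using VK_VERTEX_INPUT_RATE_INSTANCE */
    .divisor = 4,   /* advance this binding's attributes once every 4 instances */
};
VkPipelineVertexInputDivisorStateCreateInfoEXT divisorState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_DIVISOR_STATE_CREATE_INFO_EXT,
    .vertexBindingDivisorCount = 1,
    .pVertexBindingDivisors = &instanceDivisor,
};
VkPipelineVertexInputStateCreateInfo vertexInputState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
    .pNext = &divisorState,
    /* pVertexBindingDescriptions / pVertexAttributeDescriptions omitted */
};

A divisor of zero, which repeats the same attribute value for every instance in a draw, additionally requires vertexAttributeInstanceRateZeroDivisor.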
The VkPhysicalDeviceASTCDecodeFeaturesEXT
structure is defined as:
typedef struct VkPhysicalDeviceASTCDecodeFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 decodeModeSharedExponent;
} VkPhysicalDeviceASTCDecodeFeaturesEXT;
The members of the VkPhysicalDeviceASTCDecodeFeaturesEXT
structure
describe the following features:
If the VkPhysicalDeviceASTCDecodeFeaturesEXT structure is included in
the pNext chain of VkPhysicalDeviceFeatures2, it is filled
with values indicating whether each feature is supported.
VkPhysicalDeviceASTCDecodeFeaturesEXT can also be used in the
pNext chain of VkDeviceCreateInfo to enable features.
The VkPhysicalDeviceTransformFeedbackFeaturesEXT
structure is defined
as:
typedef struct VkPhysicalDeviceTransformFeedbackFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 transformFeedback;
VkBool32 geometryStreams;
} VkPhysicalDeviceTransformFeedbackFeaturesEXT;
The members of the VkPhysicalDeviceTransformFeedbackFeaturesEXT
structure describe the following features:
If the VkPhysicalDeviceTransformFeedbackFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether each feature is supported.
VkPhysicalDeviceTransformFeedbackFeaturesEXT
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable features.
To query the additionally supported memory model features, call
vkGetPhysicalDeviceFeatures2 with a
VkPhysicalDeviceVulkanMemoryModelFeaturesKHR
structure included in the
pNext
chain of its pFeatures
parameter.
The VkPhysicalDeviceVulkanMemoryModelFeaturesKHR
structure can also
be in the pNext
chain of a VkDeviceCreateInfo structure, in
which case it controls which additional features are enabled in the device.
The VkPhysicalDeviceVulkanMemoryModelFeaturesKHR structure is defined as:
typedef struct VkPhysicalDeviceVulkanMemoryModelFeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 vulkanMemoryModel;
VkBool32 vulkanMemoryModelDeviceScope;
} VkPhysicalDeviceVulkanMemoryModelFeaturesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- vulkanMemoryModel indicates whether the Vulkan Memory Model is supported, as defined in Vulkan Memory Model. This also indicates whether shader modules can declare the VulkanMemoryModelKHR capability.
- vulkanMemoryModelDeviceScope indicates whether the Vulkan Memory Model can use Device scope synchronization. This also indicates whether shader modules can declare the VulkanMemoryModelDeviceScopeKHR capability.
The VkPhysicalDeviceInlineUniformBlockFeaturesEXT
structure is defined
as:
typedef struct VkPhysicalDeviceInlineUniformBlockFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 inlineUniformBlock;
VkBool32 descriptorBindingInlineUniformBlockUpdateAfterBind;
} VkPhysicalDeviceInlineUniformBlockFeaturesEXT;
The members of the VkPhysicalDeviceInlineUniformBlockFeaturesEXT
structure describe the following features:
- inlineUniformBlock indicates whether the implementation supports inline uniform block descriptors. If this feature is not enabled, VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT must not be used.
- descriptorBindingInlineUniformBlockUpdateAfterBind indicates whether the implementation supports updating inline uniform block descriptors after a set is bound. If this feature is not enabled, VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT must not be used with VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT.
If the VkPhysicalDeviceInlineUniformBlockFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2, it is
filled with values indicating whether each feature is supported.
VkPhysicalDeviceInlineUniformBlockFeaturesEXT
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable features.
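As an informal sketch of how such descriptors are written (assuming a valid device and descriptorSet, and a binding of type VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT at binding 0):

const float blockData[4] = { 1.0f, 0.0f, 0.0f, 1.0f };

VkWriteDescriptorSetInlineUniformBlockEXT inlineWrite = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_INLINE_UNIFORM_BLOCK_EXT,
    .dataSize = sizeof(blockData),
    .pData = blockData,
};
VkWriteDescriptorSet write = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
    .pNext = &inlineWrite,
    .dstSet = descriptorSet,
    .dstBinding = 0,
    .dstArrayElement = 0,                 /* byte offset into the inline block */
    .descriptorCount = sizeof(blockData), /* number of bytes being written */
    .descriptorType = VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT,
};
vkUpdateDescriptorSets(device, 1, &write, 0, NULL);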
The VkPhysicalDeviceRepresentativeFragmentTestFeaturesNV
structure is
defined as:
typedef struct VkPhysicalDeviceRepresentativeFragmentTestFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 representativeFragmentTest;
} VkPhysicalDeviceRepresentativeFragmentTestFeaturesNV;
The members of the
VkPhysicalDeviceRepresentativeFragmentTestFeaturesNV
structure
describe the following features:
- representativeFragmentTest indicates whether the implementation supports the representative fragment test. See Representative Fragment Test.
If the VkPhysicalDeviceRepresentativeFragmentTestFeaturesNV
structure
is included in the pNext
chain of VkPhysicalDeviceFeatures2KHR,
it is filled with values indicating whether the feature is supported.
VkPhysicalDeviceRepresentativeFragmentTestFeaturesNV
can also be used
in the pNext
chain of VkDeviceCreateInfo to enable the feature.
The VkPhysicalDeviceExclusiveScissorFeaturesNV
structure is defined
as:
typedef struct VkPhysicalDeviceExclusiveScissorFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 exclusiveScissor;
} VkPhysicalDeviceExclusiveScissorFeaturesNV;
The members of the VkPhysicalDeviceExclusiveScissorFeaturesNV
structure describe the following features:
See Exclusive Scissor Test for more information.
If the VkPhysicalDeviceExclusiveScissorFeaturesNV
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether the feature is supported.
VkPhysicalDeviceExclusiveScissorFeaturesNV
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable the feature.
The VkPhysicalDeviceCornerSampledImageFeaturesNV
structure is defined
as:
typedef struct VkPhysicalDeviceCornerSampledImageFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 cornerSampledImage;
} VkPhysicalDeviceCornerSampledImageFeaturesNV;
The members of the VkPhysicalDeviceCornerSampledImageFeaturesNV
structure describe the following features:
- cornerSampledImage specifies whether images can be created with a VkImageCreateInfo::flags containing VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV. See Corner-Sampled Images.
If the VkPhysicalDeviceCornerSampledImageFeaturesNV
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether each feature is supported.
VkPhysicalDeviceCornerSampledImageFeaturesNV
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable features.
The VkPhysicalDeviceComputeShaderDerivativesFeaturesNV
structure is
defined as:
typedef struct VkPhysicalDeviceComputeShaderDerivativesFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 computeDerivativeGroupQuads;
VkBool32 computeDerivativeGroupLinear;
} VkPhysicalDeviceComputeShaderDerivativesFeaturesNV;
The members of the VkPhysicalDeviceComputeShaderDerivativesFeaturesNV
structure describe the following features:
See Compute Shader Derivatives for more information.
If the VkPhysicalDeviceComputeShaderDerivativesFeaturesNV
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether each feature is supported.
VkPhysicalDeviceComputeShaderDerivativesFeaturesNV
can also be used
in the pNext
chain of VkDeviceCreateInfo to enable features.
The VkPhysicalDeviceFragmentShaderBarycentricFeaturesNV
structure is
defined as:
typedef struct VkPhysicalDeviceFragmentShaderBarycentricFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 fragmentShaderBarycentric;
} VkPhysicalDeviceFragmentShaderBarycentricFeaturesNV;
The members of the VkPhysicalDeviceFragmentShaderBarycentricFeaturesNV
structure describe the following features:
See Barycentric Interpolation for more information.
If the VkPhysicalDeviceFragmentShaderBarycentricFeaturesNV
structure
is included in the pNext
chain of VkPhysicalDeviceFeatures2KHR,
it is filled with values indicating whether the feature is supported.
VkPhysicalDeviceFragmentShaderBarycentricFeaturesNV
can also be used
in the pNext
chain of VkDeviceCreateInfo to enable features.
The VkPhysicalDeviceShaderImageFootprintFeaturesNV
structure is
defined as:
typedef struct VkPhysicalDeviceShaderImageFootprintFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 imageFootprint;
} VkPhysicalDeviceShaderImageFootprintFeaturesNV;
See Texel Footprint Evaluation for more information.
If the VkPhysicalDeviceShaderImageFootprintFeaturesNV
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether each feature is supported.
VkPhysicalDeviceShaderImageFootprintFeaturesNV
can also be used in
the pNext
chain of VkDeviceCreateInfo to enable features.
The VkPhysicalDeviceShadingRateImageFeaturesNV
structure is defined
as:
typedef struct VkPhysicalDeviceShadingRateImageFeaturesNV {
VkStructureType sType;
void* pNext;
VkBool32 shadingRateImage;
VkBool32 shadingRateCoarseSampleOrder;
} VkPhysicalDeviceShadingRateImageFeaturesNV;
The members of the VkPhysicalDeviceShadingRateImageFeaturesNV
structure describe the following features:
- shadingRateImage indicates that the implementation supports the use of a shading rate image to derive an effective shading rate for fragment processing. It also indicates that the implementation supports the ShadingRateNV SPIR-V execution mode.
- shadingRateCoarseSampleOrder indicates that the implementation supports a user-configurable ordering of coverage samples in fragments larger than one pixel.
See Shading Rate Image for more information.
If the VkPhysicalDeviceShadingRateImageFeaturesNV
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether the feature is supported.
VkPhysicalDeviceShadingRateImageFeaturesNV
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable features.
The VkPhysicalDeviceFragmentDensityMapFeaturesEXT
structure is defined
as:
typedef struct VkPhysicalDeviceFragmentDensityMapFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 fragmentDensityMap;
VkBool32 fragmentDensityMapDynamic;
VkBool32 fragmentDensityMapNonSubsampledImages;
} VkPhysicalDeviceFragmentDensityMapFeaturesEXT;
The members of the VkPhysicalDeviceFragmentDensityMapFeaturesEXT
structure describe the following features:
- fragmentDensityMap specifies whether the implementation supports render passes with a fragment density map attachment. If this feature is not enabled and the pNext chain of VkRenderPassCreateInfo contains VkRenderPassFragmentDensityMapCreateInfoEXT, fragmentDensityMapAttachment must be VK_ATTACHMENT_UNUSED.
- fragmentDensityMapDynamic specifies whether the implementation supports dynamic fragment density map image views. If this feature is not enabled, VK_IMAGE_VIEW_CREATE_FRAGMENT_DENSITY_MAP_DYNAMIC_BIT_EXT must not be included in VkImageViewCreateInfo::flags.
- fragmentDensityMapNonSubsampledImages specifies whether the implementation supports regular non-subsampled image attachments with fragment density map render passes. If this feature is not enabled, render passes with a fragment density map attachment must only have subsampled attachments bound.
If the VkPhysicalDeviceFragmentDensityMapFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2, it is
filled with values indicating whether each feature is supported.
VkPhysicalDeviceFragmentDensityMapFeaturesEXT
can also be used in the pNext chain of VkDeviceCreateInfo to enable the features.
The VkPhysicalDeviceScalarBlockLayoutFeaturesEXT
structure is defined
as:
typedef struct VkPhysicalDeviceScalarBlockLayoutFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 scalarBlockLayout;
} VkPhysicalDeviceScalarBlockLayoutFeaturesEXT;
The members of the VkPhysicalDeviceScalarBlockLayoutFeaturesEXT
structure describe the following features:
- scalarBlockLayout indicates that the implementation supports the layout of resource blocks in shaders using scalar alignment.
If the VkPhysicalDeviceScalarBlockLayoutFeaturesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceFeatures2KHR, it
is filled with values indicating whether the feature is supported.
VkPhysicalDeviceScalarBlockLayoutFeaturesEXT
can also be used in the
pNext
chain of VkDeviceCreateInfo to enable this feature.
35.1.1. Feature Requirements
All Vulkan graphics implementations must support the following features:
- variablePointersStorageBuffer, if the VK_KHR_variable_pointers extension is supported.
- storageBuffer8BitAccess, if the VK_KHR_8bit_storage extension is supported.
- If the VK_EXT_descriptor_indexing extension is supported:
- inlineUniformBlock, if the VK_EXT_inline_uniform_block extension is supported.
- descriptorBindingInlineUniformBlockUpdateAfterBind, if the VK_EXT_inline_uniform_block and VK_EXT_descriptor_indexing extensions are both supported.
- If the VK_EXT_scalar_block_layout extension is supported:
All other features defined in the Specification are optional.
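An application or test harness could verify one of these requirements roughly as follows; this is a non-normative sketch assuming only a valid physicalDevice handle:

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <vulkan/vulkan.h>

/* Returns true if VK_KHR_variable_pointers is exposed by the device, in which
 * case variablePointersStorageBuffer is required to be reported as supported. */
static bool requiresVariablePointersStorageBuffer(VkPhysicalDevice physicalDevice)
{
    uint32_t count = 0;
    vkEnumerateDeviceExtensionProperties(physicalDevice, NULL, &count, NULL);
    VkExtensionProperties *extensions = calloc(count, sizeof(*extensions));
    vkEnumerateDeviceExtensionProperties(physicalDevice, NULL, &count, extensions);

    bool found = false;
    for (uint32_t i = 0; i < count; ++i) {
        if (strcmp(extensions[i].extensionName,
                   VK_KHR_VARIABLE_POINTERS_EXTENSION_NAME) == 0) {
            found = true;
            break;
        }
    }
    free(extensions);
    return found;
}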
35.2. Limits
There are a variety of implementation-dependent limits.
The VkPhysicalDeviceLimits
are properties of the physical device.
These are available in the limits
member of the
VkPhysicalDeviceProperties structure which is returned from
vkGetPhysicalDeviceProperties.
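For example, an application typically retrieves the limits once and validates its own requirements against them. A minimal, non-normative sketch (the physicalDevice handle and the 256-byte push constant requirement are assumptions):

VkPhysicalDeviceProperties deviceProperties;
vkGetPhysicalDeviceProperties(physicalDevice, &deviceProperties);
const VkPhysicalDeviceLimits *limits = &deviceProperties.limits;

if (limits->maxPushConstantsSize < 256) {
    /* A 256-byte push constant range cannot be used in a pipeline layout here;
       fall back to a uniform buffer instead. */
}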
The VkPhysicalDeviceLimits
structure is defined as:
typedef struct VkPhysicalDeviceLimits {
uint32_t maxImageDimension1D;
uint32_t maxImageDimension2D;
uint32_t maxImageDimension3D;
uint32_t maxImageDimensionCube;
uint32_t maxImageArrayLayers;
uint32_t maxTexelBufferElements;
uint32_t maxUniformBufferRange;
uint32_t maxStorageBufferRange;
uint32_t maxPushConstantsSize;
uint32_t maxMemoryAllocationCount;
uint32_t maxSamplerAllocationCount;
VkDeviceSize bufferImageGranularity;
VkDeviceSize sparseAddressSpaceSize;
uint32_t maxBoundDescriptorSets;
uint32_t maxPerStageDescriptorSamplers;
uint32_t maxPerStageDescriptorUniformBuffers;
uint32_t maxPerStageDescriptorStorageBuffers;
uint32_t maxPerStageDescriptorSampledImages;
uint32_t maxPerStageDescriptorStorageImages;
uint32_t maxPerStageDescriptorInputAttachments;
uint32_t maxPerStageResources;
uint32_t maxDescriptorSetSamplers;
uint32_t maxDescriptorSetUniformBuffers;
uint32_t maxDescriptorSetUniformBuffersDynamic;
uint32_t maxDescriptorSetStorageBuffers;
uint32_t maxDescriptorSetStorageBuffersDynamic;
uint32_t maxDescriptorSetSampledImages;
uint32_t maxDescriptorSetStorageImages;
uint32_t maxDescriptorSetInputAttachments;
uint32_t maxVertexInputAttributes;
uint32_t maxVertexInputBindings;
uint32_t maxVertexInputAttributeOffset;
uint32_t maxVertexInputBindingStride;
uint32_t maxVertexOutputComponents;
uint32_t maxTessellationGenerationLevel;
uint32_t maxTessellationPatchSize;
uint32_t maxTessellationControlPerVertexInputComponents;
uint32_t maxTessellationControlPerVertexOutputComponents;
uint32_t maxTessellationControlPerPatchOutputComponents;
uint32_t maxTessellationControlTotalOutputComponents;
uint32_t maxTessellationEvaluationInputComponents;
uint32_t maxTessellationEvaluationOutputComponents;
uint32_t maxGeometryShaderInvocations;
uint32_t maxGeometryInputComponents;
uint32_t maxGeometryOutputComponents;
uint32_t maxGeometryOutputVertices;
uint32_t maxGeometryTotalOutputComponents;
uint32_t maxFragmentInputComponents;
uint32_t maxFragmentOutputAttachments;
uint32_t maxFragmentDualSrcAttachments;
uint32_t maxFragmentCombinedOutputResources;
uint32_t maxComputeSharedMemorySize;
uint32_t maxComputeWorkGroupCount[3];
uint32_t maxComputeWorkGroupInvocations;
uint32_t maxComputeWorkGroupSize[3];
uint32_t subPixelPrecisionBits;
uint32_t subTexelPrecisionBits;
uint32_t mipmapPrecisionBits;
uint32_t maxDrawIndexedIndexValue;
uint32_t maxDrawIndirectCount;
float maxSamplerLodBias;
float maxSamplerAnisotropy;
uint32_t maxViewports;
uint32_t maxViewportDimensions[2];
float viewportBoundsRange[2];
uint32_t viewportSubPixelBits;
size_t minMemoryMapAlignment;
VkDeviceSize minTexelBufferOffsetAlignment;
VkDeviceSize minUniformBufferOffsetAlignment;
VkDeviceSize minStorageBufferOffsetAlignment;
int32_t minTexelOffset;
uint32_t maxTexelOffset;
int32_t minTexelGatherOffset;
uint32_t maxTexelGatherOffset;
float minInterpolationOffset;
float maxInterpolationOffset;
uint32_t subPixelInterpolationOffsetBits;
uint32_t maxFramebufferWidth;
uint32_t maxFramebufferHeight;
uint32_t maxFramebufferLayers;
VkSampleCountFlags framebufferColorSampleCounts;
VkSampleCountFlags framebufferDepthSampleCounts;
VkSampleCountFlags framebufferStencilSampleCounts;
VkSampleCountFlags framebufferNoAttachmentsSampleCounts;
uint32_t maxColorAttachments;
VkSampleCountFlags sampledImageColorSampleCounts;
VkSampleCountFlags sampledImageIntegerSampleCounts;
VkSampleCountFlags sampledImageDepthSampleCounts;
VkSampleCountFlags sampledImageStencilSampleCounts;
VkSampleCountFlags storageImageSampleCounts;
uint32_t maxSampleMaskWords;
VkBool32 timestampComputeAndGraphics;
float timestampPeriod;
uint32_t maxClipDistances;
uint32_t maxCullDistances;
uint32_t maxCombinedClipAndCullDistances;
uint32_t discreteQueuePriorities;
float pointSizeRange[2];
float lineWidthRange[2];
float pointSizeGranularity;
float lineWidthGranularity;
VkBool32 strictLines;
VkBool32 standardSampleLocations;
VkDeviceSize optimalBufferCopyOffsetAlignment;
VkDeviceSize optimalBufferCopyRowPitchAlignment;
VkDeviceSize nonCoherentAtomSize;
} VkPhysicalDeviceLimits;
- maxImageDimension1D is the maximum dimension (width) supported for all images created with an imageType of VK_IMAGE_TYPE_1D.
- maxImageDimension2D is the maximum dimension (width or height) supported for all images created with an imageType of VK_IMAGE_TYPE_2D and without VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT set in flags.
- maxImageDimension3D is the maximum dimension (width, height, or depth) supported for all images created with an imageType of VK_IMAGE_TYPE_3D.
- maxImageDimensionCube is the maximum dimension (width or height) supported for all images created with an imageType of VK_IMAGE_TYPE_2D and with VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT set in flags.
- maxImageArrayLayers is the maximum number of layers (arrayLayers) for an image.
- maxTexelBufferElements is the maximum number of addressable texels for a buffer view created on a buffer which was created with the VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT or VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT set in the usage member of the VkBufferCreateInfo structure.
- maxUniformBufferRange is the maximum value that can be specified in the range member of any VkDescriptorBufferInfo structures passed to a call to vkUpdateDescriptorSets for descriptors of type VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC.
- maxStorageBufferRange is the maximum value that can be specified in the range member of any VkDescriptorBufferInfo structures passed to a call to vkUpdateDescriptorSets for descriptors of type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC.
- maxPushConstantsSize is the maximum size, in bytes, of the pool of push constant memory. For each of the push constant ranges indicated by the pPushConstantRanges member of the VkPipelineLayoutCreateInfo structure, (offset + size) must be less than or equal to this limit.
- maxMemoryAllocationCount is the maximum number of device memory allocations, as created by vkAllocateMemory, which can simultaneously exist.
- maxSamplerAllocationCount is the maximum number of sampler objects, as created by vkCreateSampler, which can simultaneously exist on a device.
- bufferImageGranularity is the granularity, in bytes, at which buffer or linear image resources, and optimal image resources can be bound to adjacent offsets in the same VkDeviceMemory object without aliasing. See Buffer-Image Granularity for more details.
- sparseAddressSpaceSize is the total amount of address space available, in bytes, for sparse memory resources. This is an upper bound on the sum of the size of all sparse resources, regardless of whether any memory is bound to them.
- maxBoundDescriptorSets is the maximum number of descriptor sets that can be simultaneously used by a pipeline. All DescriptorSet decorations in shader modules must have a value less than maxBoundDescriptorSets. See Descriptor Sets.
- maxPerStageDescriptorSamplers is the maximum number of samplers that can be accessible to a single shader stage in a pipeline layout. Descriptors with a type of VK_DESCRIPTOR_TYPE_SAMPLER or VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. A descriptor is accessible to a shader stage when the stageFlags member of the VkDescriptorSetLayoutBinding structure has the bit for that shader stage set. See Sampler and Combined Image Sampler.
- maxPerStageDescriptorUniformBuffers is the maximum number of uniform buffers that can be accessible to a single shader stage in a pipeline layout. Descriptors with a type of VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. A descriptor is accessible to a shader stage when the stageFlags member of the VkDescriptorSetLayoutBinding structure has the bit for that shader stage set. See Uniform Buffer and Dynamic Uniform Buffer.
- maxPerStageDescriptorStorageBuffers is the maximum number of storage buffers that can be accessible to a single shader stage in a pipeline layout. Descriptors with a type of VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. A descriptor is accessible to a pipeline shader stage when the stageFlags member of the VkDescriptorSetLayoutBinding structure has the bit for that shader stage set. See Storage Buffer and Dynamic Storage Buffer.
- maxPerStageDescriptorSampledImages is the maximum number of sampled images that can be accessible to a single shader stage in a pipeline layout. Descriptors with a type of VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, or VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. A descriptor is accessible to a pipeline shader stage when the stageFlags member of the VkDescriptorSetLayoutBinding structure has the bit for that shader stage set. See Combined Image Sampler, Sampled Image, and Uniform Texel Buffer.
- maxPerStageDescriptorStorageImages is the maximum number of storage images that can be accessible to a single shader stage in a pipeline layout. Descriptors with a type of VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, or VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. A descriptor is accessible to a pipeline shader stage when the stageFlags member of the VkDescriptorSetLayoutBinding structure has the bit for that shader stage set. See Storage Image, and Storage Texel Buffer.
- maxPerStageDescriptorInputAttachments is the maximum number of input attachments that can be accessible to a single shader stage in a pipeline layout. Descriptors with a type of VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. A descriptor is accessible to a pipeline shader stage when the stageFlags member of the VkDescriptorSetLayoutBinding structure has the bit for that shader stage set. These are only supported for the fragment stage. See Input Attachment.
- maxPerStageResources is the maximum number of resources that can be accessible to a single shader stage in a pipeline layout. Descriptors with a type of VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER, VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC, or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. For the fragment shader stage the framebuffer color attachments also count against this limit.
- maxDescriptorSetSamplers is the maximum number of samplers that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_SAMPLER or VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Sampler and Combined Image Sampler.
- maxDescriptorSetUniformBuffers is the maximum number of uniform buffers that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Uniform Buffer and Dynamic Uniform Buffer.
- maxDescriptorSetUniformBuffersDynamic is the maximum number of dynamic uniform buffers that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Dynamic Uniform Buffer.
- maxDescriptorSetStorageBuffers is the maximum number of storage buffers that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Storage Buffer and Dynamic Storage Buffer.
- maxDescriptorSetStorageBuffersDynamic is the maximum number of dynamic storage buffers that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Dynamic Storage Buffer.
- maxDescriptorSetSampledImages is the maximum number of sampled images that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, or VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Combined Image Sampler, Sampled Image, and Uniform Texel Buffer.
- maxDescriptorSetStorageImages is the maximum number of storage images that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, or VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Storage Image, and Storage Texel Buffer.
- maxDescriptorSetInputAttachments is the maximum number of input attachments that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptors with a type of VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT count against this limit. Only descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit. See Input Attachment.
maxVertexInputAttributes is the maximum number of vertex input attributes that can be specified for a graphics pipeline. These are described in the array of VkVertexInputAttributeDescription structures that are provided at graphics pipeline creation time via the pVertexAttributeDescriptions member of the VkPipelineVertexInputStateCreateInfo structure. See Vertex Attributes and Vertex Input Description. -
maxVertexInputBindings is the maximum number of vertex buffers that can be specified for providing vertex attributes to a graphics pipeline. These are described in the array of VkVertexInputBindingDescription structures that are provided at graphics pipeline creation time via the pVertexBindingDescriptions member of the VkPipelineVertexInputStateCreateInfo structure. The binding member of VkVertexInputBindingDescription must be less than this limit. See Vertex Input Description. -
maxVertexInputAttributeOffset is the maximum vertex input attribute offset that can be added to the vertex input binding stride. The offset member of the VkVertexInputAttributeDescription structure must be less than or equal to this limit. See Vertex Input Description. -
maxVertexInputBindingStride is the maximum vertex input binding stride that can be specified in a vertex input binding. The stride member of the VkVertexInputBindingDescription structure must be less than or equal to this limit. See Vertex Input Description. -
maxVertexOutputComponents
is the maximum number of components of output variables which can be output by a vertex shader. See Vertex Shaders. -
maxTessellationGenerationLevel
is the maximum tessellation generation level supported by the fixed-function tessellation primitive generator. See Tessellation. -
maxTessellationPatchSize is the maximum patch size, in vertices, of patches that can be processed by the tessellation control shader and tessellation primitive generator. The patchControlPoints member of the VkPipelineTessellationStateCreateInfo structure specified at pipeline creation time and the value provided in the OutputVertices execution mode of shader modules must be less than or equal to this limit. See Tessellation. -
maxTessellationControlPerVertexInputComponents
is the maximum number of components of input variables which can be provided as per-vertex inputs to the tessellation control shader stage. -
maxTessellationControlPerVertexOutputComponents
is the maximum number of components of per-vertex output variables which can be output from the tessellation control shader stage. -
maxTessellationControlPerPatchOutputComponents
is the maximum number of components of per-patch output variables which can be output from the tessellation control shader stage. -
maxTessellationControlTotalOutputComponents
is the maximum total number of components of per-vertex and per-patch output variables which can be output from the tessellation control shader stage. -
maxTessellationEvaluationInputComponents
is the maximum number of components of input variables which can be provided as per-vertex inputs to the tessellation evaluation shader stage. -
maxTessellationEvaluationOutputComponents
is the maximum number of components of per-vertex output variables which can be output from the tessellation evaluation shader stage. -
maxGeometryShaderInvocations is the maximum invocation count supported for instanced geometry shaders. The value provided in the Invocations execution mode of shader modules must be less than or equal to this limit. See Geometry Shading. -
maxGeometryInputComponents
is the maximum number of components of input variables which can be provided as inputs to the geometry shader stage. -
maxGeometryOutputComponents
is the maximum number of components of output variables which can be output from the geometry shader stage. -
maxGeometryOutputVertices
is the maximum number of vertices which can be emitted by any geometry shader. -
maxGeometryTotalOutputComponents
is the maximum total number of components of output, across all emitted vertices, which can be output from the geometry shader stage. -
maxFragmentInputComponents
is the maximum number of components of input variables which can be provided as inputs to the fragment shader stage. -
maxFragmentOutputAttachments
is the maximum number of output attachments which can be written to by the fragment shader stage. -
maxFragmentDualSrcAttachments
is the maximum number of output attachments which can be written to by the fragment shader stage when blending is enabled and one of the dual source blend modes is in use. See Dual-Source Blending and dualSrcBlend. -
maxFragmentCombinedOutputResources
is the total number of storage buffers, storage images, and output buffers which can be used in the fragment shader stage. -
maxComputeSharedMemorySize is the maximum total storage size, in bytes, available for variables declared with the Workgroup storage class in shader modules (or with the shared storage qualifier in GLSL) in the compute shader stage. The amount of storage consumed by the variables declared with the Workgroup storage class is implementation-dependent. However, the amount of storage consumed may not exceed the largest block size that would be obtained if all active variables declared with the Workgroup storage class were assigned offsets in an arbitrary order by successively taking the smallest valid offset according to the Standard Storage Buffer Layout rules. (This is equivalent to using the GLSL std430 layout rules.) -
maxComputeWorkGroupCount[3] is the maximum number of local workgroups that can be dispatched by a single dispatch command. These three values represent the maximum number of local workgroups for the X, Y, and Z dimensions, respectively. The workgroup count parameters to the dispatch commands must be less than or equal to the corresponding limit. See Dispatching Commands. -
maxComputeWorkGroupInvocations is the maximum total number of compute shader invocations in a single local workgroup. The product of the X, Y, and Z sizes as specified by the LocalSize execution mode in shader modules and by the object decorated by the WorkgroupSize decoration must be less than or equal to this limit. -
maxComputeWorkGroupSize[3] is the maximum size of a local compute workgroup, per dimension. These three values represent the maximum local workgroup size in the X, Y, and Z dimensions, respectively. The x, y, and z sizes specified by the LocalSize execution mode and by the object decorated by the WorkgroupSize decoration in shader modules must be less than or equal to the corresponding limit. -
subPixelPrecisionBits is the number of bits of subpixel precision in framebuffer coordinates xf and yf. See Rasterization. -
subTexelPrecisionBits is the number of bits of precision in the division along an axis of an image used for minification and magnification filters. 2^subTexelPrecisionBits is the actual number of divisions along each axis of the image represented. Sub-texel values calculated during image sampling will snap to these locations when generating the filtered results. -
mipmapPrecisionBits is the number of bits of division that the LOD calculation for mipmap fetching gets snapped to when determining the contribution from each mip level to the mip filtered results. 2^mipmapPrecisionBits is the actual number of divisions. -
maxDrawIndexedIndexValue
is the maximum index value that can be used for indexed draw calls when using 32-bit indices. This excludes the primitive restart index value of 0xFFFFFFFF. See fullDrawIndexUint32. -
maxDrawIndirectCount
is the maximum draw count that is supported for indirect draw calls. See multiDrawIndirect. -
maxSamplerLodBias is the maximum absolute sampler LOD bias. The sum of the mipLodBias member of the VkSamplerCreateInfo structure and the Bias operand of image sampling operations in shader modules (or 0 if no Bias operand is provided to an image sampling operation) is clamped to the range [-maxSamplerLodBias, +maxSamplerLodBias]. See [samplers-mipLodBias]. -
maxSamplerAnisotropy is the maximum degree of sampler anisotropy. The maximum degree of anisotropic filtering used for an image sampling operation is the minimum of the maxAnisotropy member of the VkSamplerCreateInfo structure and this limit. See [samplers-maxAnisotropy]. -
maxViewports is the maximum number of active viewports. The viewportCount member of the VkPipelineViewportStateCreateInfo structure that is provided at pipeline creation must be less than or equal to this limit. -
maxViewportDimensions[2] are the maximum viewport dimensions in the X (width) and Y (height) dimensions, respectively. The maximum viewport dimensions must be greater than or equal to the largest image which can be created and used as a framebuffer attachment. See Controlling the Viewport. -
viewportBoundsRange[2] is the [minimum, maximum] range that the corners of a viewport must be contained in. This range must be at least [-2 × size, 2 × size - 1], where size = max(maxViewportDimensions[0], maxViewportDimensions[1]). See Controlling the Viewport.
Note: The intent of the viewportBoundsRange limit is to allow a maximum sized viewport to be arbitrarily shifted relative to the output target as long as at least some portion intersects. This would give a bounds limit of [-size + 1, 2 × size - 1], which would allow all possible non-empty-set intersections of the output target and the viewport. Since these numbers are typically powers of two, picking the signed number range using the smallest possible number of bits ends up with the specified range. -
viewportSubPixelBits
is the number of bits of subpixel precision for viewport bounds. The subpixel precision that floating-point viewport bounds are interpreted at is given by this limit. -
minMemoryMapAlignment is the minimum required alignment, in bytes, of host visible memory allocations within the host address space. When mapping a memory allocation with vkMapMemory, subtracting offset bytes from the returned pointer will always produce an integer multiple of this limit. See Host Access to Device Memory Objects. -
minTexelBufferOffsetAlignment is the minimum required alignment, in bytes, for the offset member of the VkBufferViewCreateInfo structure for texel buffers. When a buffer view is created for a buffer which was created with VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT or VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT set in the usage member of the VkBufferCreateInfo structure, the offset must be an integer multiple of this limit. -
minUniformBufferOffsetAlignment is the minimum required alignment, in bytes, for the offset member of the VkDescriptorBufferInfo structure for uniform buffers. When a descriptor of type VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER or VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC is updated, the offset must be an integer multiple of this limit. Similarly, dynamic offsets for uniform buffers must be multiples of this limit. -
minStorageBufferOffsetAlignment is the minimum required alignment, in bytes, for the offset member of the VkDescriptorBufferInfo structure for storage buffers. When a descriptor of type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC is updated, the offset must be an integer multiple of this limit. Similarly, dynamic offsets for storage buffers must be multiples of this limit. -
minTexelOffset is the minimum offset value for the ConstOffset image operand of any of the OpImageSample* or OpImageFetch* image instructions. -
maxTexelOffset is the maximum offset value for the ConstOffset image operand of any of the OpImageSample* or OpImageFetch* image instructions. -
minTexelGatherOffset is the minimum offset value for the Offset or ConstOffsets image operands of any of the OpImage*Gather image instructions. -
maxTexelGatherOffset is the maximum offset value for the Offset or ConstOffsets image operands of any of the OpImage*Gather image instructions. -
minInterpolationOffset is the minimum negative offset value for the offset operand of the InterpolateAtOffset extended instruction. -
maxInterpolationOffset is the maximum positive offset value for the offset operand of the InterpolateAtOffset extended instruction. -
subPixelInterpolationOffsetBits is the number of subpixel fractional bits that the x and y offsets to the InterpolateAtOffset extended instruction may be rounded to as fixed-point values. -
maxFramebufferWidth is the maximum width for a framebuffer. The width member of the VkFramebufferCreateInfo structure must be less than or equal to this limit. -
maxFramebufferHeight is the maximum height for a framebuffer. The height member of the VkFramebufferCreateInfo structure must be less than or equal to this limit. -
maxFramebufferLayers is the maximum layer count for a layered framebuffer. The layers member of the VkFramebufferCreateInfo structure must be less than or equal to this limit. -
framebufferColorSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the color sample counts that are supported for all framebuffer color attachments with floating- or fixed-point formats. There is no limit that specifies the color sample counts that are supported for all color attachments with integer formats. -
framebufferDepthSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the supported depth sample counts for all framebuffer depth/stencil attachments, when the format includes a depth component. -
framebufferStencilSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the supported stencil sample counts for all framebuffer depth/stencil attachments, when the format includes a stencil component. -
framebufferNoAttachmentsSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the supported sample counts for a framebuffer with no attachments. -
maxColorAttachments is the maximum number of color attachments that can be used by a subpass in a render pass. The colorAttachmentCount member of the VkSubpassDescription structure must be less than or equal to this limit. -
sampledImageColorSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the sample counts supported for all 2D images created with VK_IMAGE_TILING_OPTIMAL, usage containing VK_IMAGE_USAGE_SAMPLED_BIT, and a non-integer color format. -
sampledImageIntegerSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the sample counts supported for all 2D images created with VK_IMAGE_TILING_OPTIMAL, usage containing VK_IMAGE_USAGE_SAMPLED_BIT, and an integer color format. -
sampledImageDepthSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the sample counts supported for all 2D images created with VK_IMAGE_TILING_OPTIMAL, usage containing VK_IMAGE_USAGE_SAMPLED_BIT, and a depth format. -
sampledImageStencilSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the sample counts supported for all 2D images created with VK_IMAGE_TILING_OPTIMAL, usage containing VK_IMAGE_USAGE_SAMPLED_BIT, and a stencil format. -
storageImageSampleCounts is a bitmask [1] of VkSampleCountFlagBits indicating the sample counts supported for all 2D images created with VK_IMAGE_TILING_OPTIMAL and usage containing VK_IMAGE_USAGE_STORAGE_BIT. -
maxSampleMaskWords is the maximum number of array elements of a variable decorated with the SampleMask built-in decoration. -
timestampComputeAndGraphics specifies support for timestamps on all graphics and compute queues. If this limit is set to VK_TRUE, all queues that advertise the VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT in VkQueueFamilyProperties::queueFlags support VkQueueFamilyProperties::timestampValidBits of at least 36. See Timestamp Queries. -
timestampPeriod
is the number of nanoseconds required for a timestamp query to be incremented by 1. See Timestamp Queries. -
maxClipDistances is the maximum number of clip distances that can be used in a single shader stage. The size of any array declared with the ClipDistance built-in decoration in a shader module must be less than or equal to this limit. -
maxCullDistances is the maximum number of cull distances that can be used in a single shader stage. The size of any array declared with the CullDistance built-in decoration in a shader module must be less than or equal to this limit. -
maxCombinedClipAndCullDistances is the maximum combined number of clip and cull distances that can be used in a single shader stage. The sum of the sizes of any pair of arrays declared with the ClipDistance and CullDistance built-in decoration used by a single shader stage in a shader module must be less than or equal to this limit. -
discreteQueuePriorities is the number of discrete priorities that can be assigned to a queue based on the value of each member of VkDeviceQueueCreateInfo::pQueuePriorities. This must be at least 2, and levels must be spread evenly over the range, with at least one level at 1.0, and another at 0.0. See Queue Priority. -
pointSizeRange[2] is the range [minimum, maximum] of supported sizes for points. Values written to variables decorated with the PointSize built-in decoration are clamped to this range. -
lineWidthRange[2] is the range [minimum, maximum] of supported widths for lines. Values specified by the lineWidth member of the VkPipelineRasterizationStateCreateInfo or the lineWidth parameter to vkCmdSetLineWidth are clamped to this range. -
pointSizeGranularity is the granularity of supported point sizes. Not all point sizes in the range defined by pointSizeRange are supported. This limit specifies the granularity (or increment) between successive supported point sizes. -
lineWidthGranularity is the granularity of supported line widths. Not all line widths in the range defined by lineWidthRange are supported. This limit specifies the granularity (or increment) between successive supported line widths. -
strictLines specifies whether lines are rasterized according to the preferred method of rasterization. If set to VK_FALSE, lines may be rasterized under a relaxed set of rules. If set to VK_TRUE, lines are rasterized as per the strict definition. See Basic Line Segment Rasterization. -
standardSampleLocations specifies whether rasterization uses the standard sample locations as documented in Multisampling. If set to VK_TRUE, the implementation uses the documented sample locations. If set to VK_FALSE, the implementation may use different sample locations. -
optimalBufferCopyOffsetAlignment is the optimal buffer offset alignment in bytes for vkCmdCopyBufferToImage and vkCmdCopyImageToBuffer. The per texel alignment requirements are enforced, but applications should use the optimal alignment for optimal performance and power use. -
optimalBufferCopyRowPitchAlignment is the optimal buffer row pitch alignment in bytes for vkCmdCopyBufferToImage and vkCmdCopyImageToBuffer. Row pitch is the number of bytes between texels with the same X coordinate in adjacent rows (Y coordinates differ by one). The per texel alignment requirements are enforced, but applications should use the optimal alignment for optimal performance and power use. -
nonCoherentAtomSize is the size and alignment in bytes that bounds concurrent access to host-mapped device memory. A brief sketch applying this and the other alignment-related limits appears below, after the footnote.
[1] For all bitmasks of VkSampleCountFlagBits, the sample count limits defined above represent the minimum supported sample counts for each image type. Individual images may support additional sample counts, which are queried using vkGetPhysicalDeviceImageFormatProperties as described in Supported Sample Counts.
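The alignment-related limits above (minUniformBufferOffsetAlignment, nonCoherentAtomSize, optimalBufferCopyOffsetAlignment, and so on) are typically applied by rounding offsets and sizes up before recording commands. The following is a minimal, informal C sketch of that rounding; the helper names are not part of the API, it assumes the reported alignments are powers of two, and it omits clamping the flushed range to the size of the allocation.
#include <vulkan/vulkan.h>

/* Round a value up to a power-of-two alignment (illustrative helper). */
static VkDeviceSize alignUp(VkDeviceSize value, VkDeviceSize alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Dynamic uniform-buffer offsets must be multiples of
 * minUniformBufferOffsetAlignment. */
VkDeviceSize alignedDynamicOffset(const VkPhysicalDeviceLimits *limits,
                                  VkDeviceSize desiredOffset)
{
    return alignUp(desiredOffset, limits->minUniformBufferOffsetAlignment);
}

/* When flushing non-coherent mapped memory, expand the range to multiples
 * of nonCoherentAtomSize (clamping to the allocation size is omitted). */
void flushAligned(VkDevice device, VkDeviceMemory memory,
                  const VkPhysicalDeviceLimits *limits,
                  VkDeviceSize offset, VkDeviceSize size)
{
    VkMappedMemoryRange range = {0};
    range.sType  = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE;
    range.memory = memory;
    range.offset = offset & ~(limits->nonCoherentAtomSize - 1);
    range.size   = alignUp(offset + size, limits->nonCoherentAtomSize) - range.offset;
    vkFlushMappedMemoryRanges(device, 1, &range);
}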
Bits which may be set in the sample count limits returned by VkPhysicalDeviceLimits, as well as in other queries and structures representing image sample counts, are:
typedef enum VkSampleCountFlagBits {
VK_SAMPLE_COUNT_1_BIT = 0x00000001,
VK_SAMPLE_COUNT_2_BIT = 0x00000002,
VK_SAMPLE_COUNT_4_BIT = 0x00000004,
VK_SAMPLE_COUNT_8_BIT = 0x00000008,
VK_SAMPLE_COUNT_16_BIT = 0x00000010,
VK_SAMPLE_COUNT_32_BIT = 0x00000020,
VK_SAMPLE_COUNT_64_BIT = 0x00000040,
} VkSampleCountFlagBits;
-
VK_SAMPLE_COUNT_1_BIT
specifies an image with one sample per pixel. -
VK_SAMPLE_COUNT_2_BIT
specifies an image with 2 samples per pixel. -
VK_SAMPLE_COUNT_4_BIT
specifies an image with 4 samples per pixel. -
VK_SAMPLE_COUNT_8_BIT
specifies an image with 8 samples per pixel. -
VK_SAMPLE_COUNT_16_BIT
specifies an image with 16 samples per pixel. -
VK_SAMPLE_COUNT_32_BIT
specifies an image with 32 samples per pixel. -
VK_SAMPLE_COUNT_64_BIT
specifies an image with 64 samples per pixel.
typedef VkFlags VkSampleCountFlags;
VkSampleCountFlags
is a bitmask type for setting a mask of zero or
more VkSampleCountFlagBits.
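As an informal illustration of how these bitmask limits are consumed (the helper names are not part of the API), an application might intersect the color and depth framebuffer sample count masks and fall back to lower sample counts when a preferred count is unsupported:
#include <vulkan/vulkan.h>
#include <stdbool.h>

/* Check whether a sample count is supported for both color and depth
 * framebuffer attachments, per the limits described above. */
bool supportsFramebufferSampleCount(const VkPhysicalDeviceLimits *limits,
                                    VkSampleCountFlagBits samples)
{
    VkSampleCountFlags supported = limits->framebufferColorSampleCounts &
                                   limits->framebufferDepthSampleCounts;
    return (supported & samples) != 0;
}

/* Prefer 8x multisampling, fall back to 4x, then single-sampled. */
VkSampleCountFlagBits chooseSampleCount(const VkPhysicalDeviceLimits *limits)
{
    if (supportsFramebufferSampleCount(limits, VK_SAMPLE_COUNT_8_BIT))
        return VK_SAMPLE_COUNT_8_BIT;
    if (supportsFramebufferSampleCount(limits, VK_SAMPLE_COUNT_4_BIT))
        return VK_SAMPLE_COUNT_4_BIT;
    return VK_SAMPLE_COUNT_1_BIT;
}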
The VkPhysicalDevicePushDescriptorPropertiesKHR
structure is defined
as:
typedef struct VkPhysicalDevicePushDescriptorPropertiesKHR {
VkStructureType sType;
void* pNext;
uint32_t maxPushDescriptors;
} VkPhysicalDevicePushDescriptorPropertiesKHR;
The members of the VkPhysicalDevicePushDescriptorPropertiesKHR
structure describe the following implementation-dependent limits:
If the VkPhysicalDevicePushDescriptorPropertiesKHR
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
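The same query pattern applies to all of the extension property structures in this chapter. As an informal sketch (the function name is not part of the API), the structure can be chained into VkPhysicalDeviceProperties2 as follows:
#include <vulkan/vulkan.h>

/* Query maxPushDescriptors by chaining the structure into the pNext chain
 * of VkPhysicalDeviceProperties2 (requires Vulkan 1.1 or
 * VK_KHR_get_physical_device_properties2, plus VK_KHR_push_descriptor). */
uint32_t queryMaxPushDescriptors(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDevicePushDescriptorPropertiesKHR pushProps = {0};
    pushProps.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PUSH_DESCRIPTOR_PROPERTIES_KHR;

    VkPhysicalDeviceProperties2 props2 = {0};
    props2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
    props2.pNext = &pushProps;   /* chain the extension structure */

    vkGetPhysicalDeviceProperties2(physicalDevice, &props2);

    /* pushProps.maxPushDescriptors now holds the implementation limit. */
    return pushProps.maxPushDescriptors;
}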
The VkPhysicalDeviceMultiviewProperties
structure is defined as:
typedef struct VkPhysicalDeviceMultiviewProperties {
VkStructureType sType;
void* pNext;
uint32_t maxMultiviewViewCount;
uint32_t maxMultiviewInstanceIndex;
} VkPhysicalDeviceMultiviewProperties;
or the equivalent
typedef VkPhysicalDeviceMultiviewProperties VkPhysicalDeviceMultiviewPropertiesKHR;
The members of the VkPhysicalDeviceMultiviewProperties
structure
describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
maxMultiviewViewCount
is one greater than the maximum view index that can be used in a subpass. -
maxMultiviewInstanceIndex
is the maximum valid value of instance index allowed to be generated by a drawing command recorded within a subpass of a multiview render pass instance.
If the VkPhysicalDeviceMultiviewProperties
structure is included in
the pNext
chain of VkPhysicalDeviceProperties2, it is filled
with the implementation-dependent limits.
The members of the VkPhysicalDeviceFloatControlsPropertiesKHR
structure describe the following implementation-dependent limits:
typedef struct VkPhysicalDeviceFloatControlsPropertiesKHR {
VkStructureType sType;
void* pNext;
VkBool32 separateDenormSettings;
VkBool32 separateRoundingModeSettings;
VkBool32 shaderSignedZeroInfNanPreserveFloat16;
VkBool32 shaderSignedZeroInfNanPreserveFloat32;
VkBool32 shaderSignedZeroInfNanPreserveFloat64;
VkBool32 shaderDenormPreserveFloat16;
VkBool32 shaderDenormPreserveFloat32;
VkBool32 shaderDenormPreserveFloat64;
VkBool32 shaderDenormFlushToZeroFloat16;
VkBool32 shaderDenormFlushToZeroFloat32;
VkBool32 shaderDenormFlushToZeroFloat64;
VkBool32 shaderRoundingModeRTEFloat16;
VkBool32 shaderRoundingModeRTEFloat32;
VkBool32 shaderRoundingModeRTEFloat64;
VkBool32 shaderRoundingModeRTZFloat16;
VkBool32 shaderRoundingModeRTZFloat32;
VkBool32 shaderRoundingModeRTZFloat64;
} VkPhysicalDeviceFloatControlsPropertiesKHR;
-
separateDenormSettings
is a boolean value indicating whether the implementation supports separate settings for 16-bit and 64-bit denormals. -
separateRoundingModeSettings
is a boolean value indicating whether the implementation supports separate rounding modes for 16-bit and 64-bit floating point instructions. -
shaderSignedZeroInfNanPreserveFloat16 is a boolean value indicating whether the sign of a zero, NaNs, and ±∞ can be preserved in 16-bit floating-point computations. It also indicates whether the SignedZeroInfNanPreserve execution mode can be used for 16-bit floating-point types. -
shaderSignedZeroInfNanPreserveFloat32 is a boolean value indicating whether the sign of a zero, NaNs, and ±∞ can be preserved in 32-bit floating-point computations. It also indicates whether the SignedZeroInfNanPreserve execution mode can be used for 32-bit floating-point types. -
shaderSignedZeroInfNanPreserveFloat64 is a boolean value indicating whether the sign of a zero, NaNs, and ±∞ can be preserved in 64-bit floating-point computations. It also indicates whether the SignedZeroInfNanPreserve execution mode can be used for 64-bit floating-point types. -
shaderDenormPreserveFloat16
is a boolean value indicating whether denormals can be preserved in 16-bit floating-point computations. It also indicates whether theDenormPreserve
execution mode can be used for 16-bit floating-point types. -
shaderDenormPreserveFloat32
is a boolean value indicating whether denormals can be preserved in 32-bit floating-point computations. It also indicates whether theDenormPreserve
execution mode can be used for 32-bit floating-point types. -
shaderDenormPreserveFloat64
is a boolean value indicating whether denormals can be preserved in 64-bit floating-point computations. It also indicates whether theDenormPreserve
execution mode can be used for 64-bit floating-point types. -
shaderDenormFlushToZeroFloat16
is a boolean value indicating whether denormals can be flushed to zero in 16-bit floating-point computations. It also indicates whether theDenormFlushToZero
execution mode can be used for 16-bit floating-point types. -
shaderDenormFlushToZeroFloat32
is a boolean value indicating whether denormals can be flushed to zero in 32-bit floating-point computations. It also indicates whether theDenormFlushToZero
execution mode can be used for 32-bit floating-point types. -
shaderDenormFlushToZeroFloat64
is a boolean value indicating whether denormals can be flushed to zero in 64-bit floating-point computations. It also indicates whether theDenormFlushToZero
execution mode can be used for 64-bit floating-point types. -
shaderRoundingModeRTEFloat16
is a boolean value indicating whether an implementation supports the round-to-nearest-even rounding mode for 16-bit floating-point arithmetic and conversion instructions. It also indicates whether theRoundingModeRTE
execution mode can be used for 16-bit floating-point types. -
shaderRoundingModeRTEFloat32
is a boolean value indicating whether an implementation supports the round-to-nearest-even rounding mode for 32-bit floating-point arithmetic and conversion instructions. It also indicates whether theRoundingModeRTE
execution mode can be used for 32-bit floating-point types. -
shaderRoundingModeRTEFloat64
is a boolean value indicating whether an implementation supports the round-to-nearest-even rounding mode for 64-bit floating-point arithmetic and conversion instructions. It also indicates whether theRoundingModeRTE
execution mode can be used for 64-bit floating-point types. -
shaderRoundingModeRTZFloat16
is a boolean value indicating whether an implementation supports the round-towards-zero rounding mode for 16-bit floating-point arithmetic and conversion instructions. It also indicates whether theRoundingModeRTZ
execution mode can be used for 16-bit floating-point types. -
shaderRoundingModeRTZFloat32
is a boolean value indicating whether an implementation supports the round-towards-zero rounding mode for 32-bit floating-point arithmetic and conversion instructions. It also indicates whether theRoundingModeRTZ
execution mode can be used for 32-bit floating-point types. -
shaderRoundingModeRTZFloat64
is a boolean value indicating whether an implementation supports the round-towards-zero rounding mode for 64-bit floating-point arithmetic and conversion instructions. It also indicates whether theRoundingModeRTZ
execution mode can be used for 64-bit floating-point types.
If the VkPhysicalDeviceFloatControlsPropertiesKHR
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
The VkPhysicalDeviceDiscardRectanglePropertiesEXT
structure is defined
as:
typedef struct VkPhysicalDeviceDiscardRectanglePropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t maxDiscardRectangles;
} VkPhysicalDeviceDiscardRectanglePropertiesEXT;
The members of the VkPhysicalDeviceDiscardRectanglePropertiesEXT
structure describe the following implementation-dependent limits:
If the VkPhysicalDeviceDiscardRectanglePropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
The VkPhysicalDeviceSampleLocationsPropertiesEXT
structure is defined
as:
typedef struct VkPhysicalDeviceSampleLocationsPropertiesEXT {
VkStructureType sType;
void* pNext;
VkSampleCountFlags sampleLocationSampleCounts;
VkExtent2D maxSampleLocationGridSize;
float sampleLocationCoordinateRange[2];
uint32_t sampleLocationSubPixelBits;
VkBool32 variableSampleLocations;
} VkPhysicalDeviceSampleLocationsPropertiesEXT;
The members of the VkPhysicalDeviceSampleLocationsPropertiesEXT
structure describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
sampleLocationSampleCounts
is a bitmask of VkSampleCountFlagBits indicating the sample counts supporting custom sample locations. -
maxSampleLocationGridSize
is the maximum size of the pixel grid in which sample locations can vary that is supported for all sample counts insampleLocationSampleCounts
. -
sampleLocationCoordinateRange
[2] is the range of supported sample location coordinates. -
sampleLocationSubPixelBits
is the number of bits of subpixel precision for sample locations. -
variableSampleLocations
specifies whether the sample locations used by all pipelines that will be bound to a command buffer during a subpass must match. If set toVK_TRUE
, the implementation supports variable sample locations in a subpass. If set toVK_FALSE
, then the sample locations must stay constant in each subpass.
If the VkPhysicalDeviceSampleLocationsPropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
The VkPhysicalDeviceExternalMemoryHostPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceExternalMemoryHostPropertiesEXT {
VkStructureType sType;
void* pNext;
VkDeviceSize minImportedHostPointerAlignment;
} VkPhysicalDeviceExternalMemoryHostPropertiesEXT;
The members of the VkPhysicalDeviceExternalMemoryHostPropertiesEXT
structure describe the following implementation-dependent limits:
If the VkPhysicalDeviceExternalMemoryHostPropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
The VkPhysicalDeviceMultiviewPerViewAttributesPropertiesNVX
structure
is defined as:
typedef struct VkPhysicalDeviceMultiviewPerViewAttributesPropertiesNVX {
VkStructureType sType;
void* pNext;
VkBool32 perViewPositionAllComponents;
} VkPhysicalDeviceMultiviewPerViewAttributesPropertiesNVX;
The members of the
VkPhysicalDeviceMultiviewPerViewAttributesPropertiesNVX
structure
describe the following implementation-dependent limits:
If the VkPhysicalDeviceMultiviewPerViewAttributesPropertiesNVX
structure is included in the pNext
chain of
VkPhysicalDeviceProperties2, it is filled with the
implementation-dependent limits.
The VkPhysicalDevicePointClippingProperties
structure is defined as:
typedef struct VkPhysicalDevicePointClippingProperties {
VkStructureType sType;
void* pNext;
VkPointClippingBehavior pointClippingBehavior;
} VkPhysicalDevicePointClippingProperties;
or the equivalent
typedef VkPhysicalDevicePointClippingProperties VkPhysicalDevicePointClippingPropertiesKHR;
The members of the VkPhysicalDevicePointClippingProperties
structure
describe the following implementation-dependent limit:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
pointClippingBehavior
is the point clipping behavior supported by the implementation, and is of type VkPointClippingBehavior.
If the VkPhysicalDevicePointClippingProperties
structure is included
in the pNext
chain of VkPhysicalDeviceProperties2, it is filled
with the implementation-dependent limits.
The VkPhysicalDeviceSubgroupProperties
structure is defined as:
typedef struct VkPhysicalDeviceSubgroupProperties {
VkStructureType sType;
void* pNext;
uint32_t subgroupSize;
VkShaderStageFlags supportedStages;
VkSubgroupFeatureFlags supportedOperations;
VkBool32 quadOperationsInAllStages;
} VkPhysicalDeviceSubgroupProperties;
The members of the VkPhysicalDeviceSubgroupProperties
structure
describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
subgroupSize is the number of invocations in each subgroup. This will match any SubgroupSize decorated variable used in any shader module created on this device. subgroupSize is at least 1 if any of the physical device’s queues support VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT. -
supportedStages is a bitfield of VkShaderStageFlagBits describing the shader stages that subgroup operations are supported in. supportedStages will have the VK_SHADER_STAGE_COMPUTE_BIT bit set if any of the physical device’s queues support VK_QUEUE_COMPUTE_BIT. -
supportedOperations is a bitmask of VkSubgroupFeatureFlagBits specifying the sets of subgroup operations supported on this device. supportedOperations will have the VK_SUBGROUP_FEATURE_BASIC_BIT bit set if any of the physical device’s queues support VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT. -
quadOperationsInAllStages is a boolean that specifies whether quad subgroup operations are available in all stages, or are restricted to fragment and compute stages.
If the VkPhysicalDeviceSubgroupProperties
structure is included in the
pNext
chain of VkPhysicalDeviceProperties2, it is filled with
the implementation-dependent limits.
Bits which can be set in
VkPhysicalDeviceSubgroupProperties::supportedOperations
to
specify supported subgroup operations are:
typedef enum VkSubgroupFeatureFlagBits {
VK_SUBGROUP_FEATURE_BASIC_BIT = 0x00000001,
VK_SUBGROUP_FEATURE_VOTE_BIT = 0x00000002,
VK_SUBGROUP_FEATURE_ARITHMETIC_BIT = 0x00000004,
VK_SUBGROUP_FEATURE_BALLOT_BIT = 0x00000008,
VK_SUBGROUP_FEATURE_SHUFFLE_BIT = 0x00000010,
VK_SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT = 0x00000020,
VK_SUBGROUP_FEATURE_CLUSTERED_BIT = 0x00000040,
VK_SUBGROUP_FEATURE_QUAD_BIT = 0x00000080,
VK_SUBGROUP_FEATURE_PARTITIONED_BIT_NV = 0x00000100,
} VkSubgroupFeatureFlagBits;
- VK_SUBGROUP_FEATURE_BASIC_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniform capability. -
VK_SUBGROUP_FEATURE_VOTE_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformVote capability. -
VK_SUBGROUP_FEATURE_ARITHMETIC_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformArithmetic capability. -
VK_SUBGROUP_FEATURE_BALLOT_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformBallot capability. -
VK_SUBGROUP_FEATURE_SHUFFLE_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformShuffle capability. -
VK_SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformShuffleRelative capability. -
VK_SUBGROUP_FEATURE_CLUSTERED_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformClustered capability. -
VK_SUBGROUP_FEATURE_QUAD_BIT specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformQuad capability. -
VK_SUBGROUP_FEATURE_PARTITIONED_BIT_NV specifies the device will accept SPIR-V shader modules that contain the GroupNonUniformPartitionedNV capability.
typedef VkFlags VkSubgroupFeatureFlags;
VkSubgroupFeatureFlags
is a bitmask type for setting a mask of zero or
more VkSubgroupFeatureFlagBits.
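As an informal example of using these properties (the helper name is not part of the API), an application could check whether ballot operations can be used from compute shaders before relying on them:
#include <vulkan/vulkan.h>
#include <stdbool.h>

/* Determine whether subgroup ballot operations are usable in compute
 * shaders on this physical device. */
bool supportsComputeSubgroupBallot(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceSubgroupProperties subgroupProps = {0};
    subgroupProps.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES;

    VkPhysicalDeviceProperties2 props2 = {0};
    props2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
    props2.pNext = &subgroupProps;

    vkGetPhysicalDeviceProperties2(physicalDevice, &props2);

    bool computeStage = (subgroupProps.supportedStages & VK_SHADER_STAGE_COMPUTE_BIT) != 0;
    bool ballotOps    = (subgroupProps.supportedOperations & VK_SUBGROUP_FEATURE_BALLOT_BIT) != 0;
    return computeStage && ballotOps;
}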
The VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t advancedBlendMaxColorAttachments;
VkBool32 advancedBlendIndependentBlend;
VkBool32 advancedBlendNonPremultipliedSrcColor;
VkBool32 advancedBlendNonPremultipliedDstColor;
VkBool32 advancedBlendCorrelatedOverlap;
VkBool32 advancedBlendAllOperations;
} VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT;
The members of the VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT
structure describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
advancedBlendMaxColorAttachments
is one greater than the highest color attachment index that can be used in a subpass, for a pipeline that uses an advanced blend operation. -
advancedBlendIndependentBlend
specifies whether advanced blend operations can vary per-attachment. -
advancedBlendNonPremultipliedSrcColor
specifies whether the source color can be treated as non-premultiplied. If this isVK_FALSE
, then VkPipelineColorBlendAdvancedStateCreateInfoEXT::srcPremultiplied
must beVK_TRUE
. -
advancedBlendNonPremultipliedDstColor
specifies whether the destination color can be treated as non-premultiplied. If this isVK_FALSE
, then VkPipelineColorBlendAdvancedStateCreateInfoEXT::dstPremultiplied
must beVK_TRUE
. -
advancedBlendCorrelatedOverlap
specifies whether the overlap mode can be treated as correlated. If this isVK_FALSE
, then VkPipelineColorBlendAdvancedStateCreateInfoEXT::blendOverlap
must beVK_BLEND_OVERLAP_UNCORRELATED_EXT
. -
advancedBlendAllOperations
specifies whether all advanced blend operation enums are supported. See the valid usage of VkPipelineColorBlendAttachmentState.
If the VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT
structure
is included in the pNext
chain of VkPhysicalDeviceProperties2,
it is filled with the implementation-dependent limits.
The VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t maxVertexAttribDivisor;
} VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT;
The members of the VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT
structure describe the following implementation-dependent limits:
If the VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT
structure
is included in the pNext
chain of VkPhysicalDeviceProperties2,
it is filled with the implementation-dependent limits.
The VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT {
VkStructureType sType;
void* pNext;
VkBool32 filterMinmaxSingleComponentFormats;
VkBool32 filterMinmaxImageComponentMapping;
} VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT;
The members of the VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT
structure describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
filterMinmaxSingleComponentFormats
is a boolean value indicating whether a minimum set of required formats support min/max filtering. -
filterMinmaxImageComponentMapping
is a boolean value indicating whether the implementation supports non-identity component mapping of the image when doing min/max filtering.
If the VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
If filterMinmaxSingleComponentFormats
is VK_TRUE
, the following
formats must support the
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_MINMAX_BIT_EXT
feature with
VK_IMAGE_TILING_OPTIMAL
, if they support
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT
.
-
VK_FORMAT_R8_UNORM
-
VK_FORMAT_R8_SNORM
-
VK_FORMAT_R16_UNORM
-
VK_FORMAT_R16_SNORM
-
VK_FORMAT_R16_SFLOAT
-
VK_FORMAT_R32_SFLOAT
-
VK_FORMAT_D16_UNORM
-
VK_FORMAT_X8_D24_UNORM_PACK32
-
VK_FORMAT_D32_SFLOAT
-
VK_FORMAT_D16_UNORM_S8_UINT
-
VK_FORMAT_D24_UNORM_S8_UINT
-
VK_FORMAT_D32_SFLOAT_S8_UINT
If the format is a depth/stencil format, this bit only specifies that the depth aspect (not the stencil aspect) of an image of this format supports min/max filtering, and that min/max filtering of the depth aspect is supported when depth compare is disabled in the sampler.
If filterMinmaxImageComponentMapping
is VK_FALSE
the component
mapping of the image view used with min/max filtering must have been
created with the r
component set to
VK_COMPONENT_SWIZZLE_IDENTITY
.
Only the r
component of the sampled image value is defined and the
other component values are undefined.
If filterMinmaxImageComponentMapping
is VK_TRUE
this restriction
does not apply and image component mapping works as normal.
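For context, min/max filtering is requested at sampler creation time. The following is only a sketch, assuming the VkSamplerReductionModeCreateInfoEXT structure and the VK_SAMPLER_REDUCTION_MODE_MIN_EXT enumerant defined elsewhere in VK_EXT_sampler_filter_minmax, and assuming the sampled format supports VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_MINMAX_BIT_EXT; error handling is omitted.
#include <vulkan/vulkan.h>

/* Create a sampler that returns the per-component minimum of the texels in
 * the filter footprint (illustrative only). */
VkResult createMinSampler(VkDevice device, VkSampler *pSampler)
{
    VkSamplerReductionModeCreateInfoEXT reduction = {0};
    reduction.sType = VK_STRUCTURE_TYPE_SAMPLER_REDUCTION_MODE_CREATE_INFO_EXT;
    reduction.reductionMode = VK_SAMPLER_REDUCTION_MODE_MIN_EXT;

    VkSamplerCreateInfo info = {0};
    info.sType        = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
    info.pNext        = &reduction;   /* chain the reduction mode */
    info.magFilter    = VK_FILTER_LINEAR;
    info.minFilter    = VK_FILTER_LINEAR;
    info.mipmapMode   = VK_SAMPLER_MIPMAP_MODE_NEAREST;
    info.addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
    info.addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
    info.addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
    /* compareEnable is left at VK_FALSE: per the text above, min/max
     * filtering of a depth aspect is only supported when depth compare is
     * disabled in the sampler. */

    return vkCreateSampler(device, &info, NULL, pSampler);
}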
The VkPhysicalDeviceProtectedMemoryProperties
structure is defined as:
typedef struct VkPhysicalDeviceProtectedMemoryProperties {
VkStructureType sType;
void* pNext;
VkBool32 protectedNoFault;
} VkPhysicalDeviceProtectedMemoryProperties;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
protectedNoFault
specifies whether the undefined behavior may include process termination or device loss. IfprotectedNoFault
isVK_FALSE
, undefined behavior may include process termination or device loss. IfprotectedNoFault
isVK_TRUE
, undefined behavior will not include process termination or device loss.
If the VkPhysicalDeviceProtectedMemoryProperties
structure is included
in the pNext
chain of VkPhysicalDeviceProperties2, it is filled
with a value indicating the implementation-dependent behavior.
The VkPhysicalDeviceMaintenance3Properties
structure is defined as:
typedef struct VkPhysicalDeviceMaintenance3Properties {
VkStructureType sType;
void* pNext;
uint32_t maxPerSetDescriptors;
VkDeviceSize maxMemoryAllocationSize;
} VkPhysicalDeviceMaintenance3Properties;
or the equivalent
typedef VkPhysicalDeviceMaintenance3Properties VkPhysicalDeviceMaintenance3PropertiesKHR;
The members of the VkPhysicalDeviceMaintenance3Properties
structure
describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
maxPerSetDescriptors
is a maximum number of descriptors (summed over all descriptor types) in a single descriptor set that is guaranteed to satisfy any implementation-dependent constraints on the size of a descriptor set itself. Applications can query whether a descriptor set that goes beyond this limit is supported using vkGetDescriptorSetLayoutSupport. -
maxMemoryAllocationSize
is the maximum size of a memory allocation that can be created, even if there is more space available in the heap.
If the VkPhysicalDeviceMaintenance3Properties
structure is included in
the pNext
chain of VkPhysicalDeviceProperties2, it is filled
with the implementation-dependent limits.
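As an informal illustration of the query mentioned for maxPerSetDescriptors (the helper name and the chosen binding are hypothetical), vkGetDescriptorSetLayoutSupport can be used to ask whether a layout whose total descriptor count exceeds that limit is still supported:
#include <vulkan/vulkan.h>
#include <stdbool.h>

/* Ask the implementation whether a single-binding descriptor set layout
 * with the given descriptor count is supported. */
bool layoutIsSupported(VkDevice device, uint32_t descriptorCount)
{
    VkDescriptorSetLayoutBinding binding = {0};
    binding.binding         = 0;
    binding.descriptorType  = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE;
    binding.descriptorCount = descriptorCount;
    binding.stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT;

    VkDescriptorSetLayoutCreateInfo createInfo = {0};
    createInfo.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    createInfo.bindingCount = 1;
    createInfo.pBindings    = &binding;

    VkDescriptorSetLayoutSupport support = {0};
    support.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT;

    vkGetDescriptorSetLayoutSupport(device, &createInfo, &support);
    return support.supported == VK_TRUE;
}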
The VkPhysicalDeviceMeshShaderPropertiesNV
structure is defined as:
typedef struct VkPhysicalDeviceMeshShaderPropertiesNV {
VkStructureType sType;
void* pNext;
uint32_t maxDrawMeshTasksCount;
uint32_t maxTaskWorkGroupInvocations;
uint32_t maxTaskWorkGroupSize[3];
uint32_t maxTaskTotalMemorySize;
uint32_t maxTaskOutputCount;
uint32_t maxMeshWorkGroupInvocations;
uint32_t maxMeshWorkGroupSize[3];
uint32_t maxMeshTotalMemorySize;
uint32_t maxMeshOutputVertices;
uint32_t maxMeshOutputPrimitives;
uint32_t maxMeshMultiviewViewCount;
uint32_t meshOutputPerVertexGranularity;
uint32_t meshOutputPerPrimitiveGranularity;
} VkPhysicalDeviceMeshShaderPropertiesNV;
The members of the VkPhysicalDeviceMeshShaderPropertiesNV
structure
describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
maxDrawMeshTasksCount
is the maximum number of local workgroups that can be launched by a single draw mesh tasks command. See Programmable Mesh Shading. -
maxTaskWorkGroupInvocations
is the maximum total number of task shader invocations in a single local workgroup. The product of the X, Y, and Z sizes as specified by theLocalSize
execution mode in shader modules and by the object decorated by theWorkgroupSize
decoration must be less than or equal to this limit. -
maxTaskWorkGroupSize
[3] is the maximum size of a local task workgroup. These three values represent the maximum local workgroup size in the X, Y, and Z dimensions, respectively. Thex
,y
, andz
sizes specified by theLocalSize
execution mode and by the object decorated by theWorkgroupSize
decoration in shader modules must be less than or equal to the corresponding limit. -
maxTaskTotalMemorySize
is the maximum number of bytes that the task shader can use in total for shared and output memory combined. -
maxTaskOutputCount
is the maximum number of output tasks a single task shader workgroup can emit. -
maxMeshWorkGroupInvocations
is the maximum total number of mesh shader invocations in a single local workgroup. The product of the X, Y, and Z sizes as specified by theLocalSize
execution mode in shader modules and by the object decorated by theWorkgroupSize
decoration must be less than or equal to this limit. -
maxMeshWorkGroupSize
[3] is the maximum size of a local mesh workgroup. These three values represent the maximum local workgroup size in the X, Y, and Z dimensions, respectively. Thex
,y
, andz
sizes specified by theLocalSize
execution mode and by the object decorated by theWorkgroupSize
decoration in shader modules must be less than or equal to the corresponding limit. -
maxMeshTotalMemorySize
is the maximum number of bytes that the mesh shader can use in total for shared and output memory combined. -
maxMeshOutputVertices
is the maximum number of vertices a mesh shader output can store. -
maxMeshOutputPrimitives
is the maximum number of primitives a mesh shader output can store. -
maxMeshMultiviewViewCount
is the maximum number of multi-view views a mesh shader can use. -
meshOutputPerVertexGranularity
is the granularity with which mesh vertex outputs are allocated. The value can be used to compute the memory size used by the mesh shader, which must be less than or equal tomaxMeshTotalMemorySize
. -
meshOutputPerPrimitiveGranularity
is the granularity with which mesh outputs qualified as per-primitive are allocated. The value can be used to compute the memory size used by the mesh shader, which must be less than or equal tomaxMeshTotalMemorySize
.
If the VkPhysicalDeviceMeshShaderPropertiesNV
structure is included in
the pNext
chain of VkPhysicalDeviceProperties2, it is filled
with the implementation-dependent limits.
The VkPhysicalDeviceDescriptorIndexingPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceDescriptorIndexingPropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t maxUpdateAfterBindDescriptorsInAllPools;
VkBool32 shaderUniformBufferArrayNonUniformIndexingNative;
VkBool32 shaderSampledImageArrayNonUniformIndexingNative;
VkBool32 shaderStorageBufferArrayNonUniformIndexingNative;
VkBool32 shaderStorageImageArrayNonUniformIndexingNative;
VkBool32 shaderInputAttachmentArrayNonUniformIndexingNative;
VkBool32 robustBufferAccessUpdateAfterBind;
VkBool32 quadDivergentImplicitLod;
uint32_t maxPerStageDescriptorUpdateAfterBindSamplers;
uint32_t maxPerStageDescriptorUpdateAfterBindUniformBuffers;
uint32_t maxPerStageDescriptorUpdateAfterBindStorageBuffers;
uint32_t maxPerStageDescriptorUpdateAfterBindSampledImages;
uint32_t maxPerStageDescriptorUpdateAfterBindStorageImages;
uint32_t maxPerStageDescriptorUpdateAfterBindInputAttachments;
uint32_t maxPerStageUpdateAfterBindResources;
uint32_t maxDescriptorSetUpdateAfterBindSamplers;
uint32_t maxDescriptorSetUpdateAfterBindUniformBuffers;
uint32_t maxDescriptorSetUpdateAfterBindUniformBuffersDynamic;
uint32_t maxDescriptorSetUpdateAfterBindStorageBuffers;
uint32_t maxDescriptorSetUpdateAfterBindStorageBuffersDynamic;
uint32_t maxDescriptorSetUpdateAfterBindSampledImages;
uint32_t maxDescriptorSetUpdateAfterBindStorageImages;
uint32_t maxDescriptorSetUpdateAfterBindInputAttachments;
} VkPhysicalDeviceDescriptorIndexingPropertiesEXT;
The members of the VkPhysicalDeviceDescriptorIndexingPropertiesEXT
structure describe the following implementation-dependent limits:
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to an extension-specific structure. -
maxUpdateAfterBindDescriptorsInAllPools
is the maximum number of descriptors (summed over all descriptor types) that can be created across all pools that are created with theVK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT
bit set. Pool creation may fail when this limit is exceeded, or when the space this limit represents is unable to satisfy a pool creation due to fragmentation. -
shaderUniformBufferArrayNonUniformIndexingNative
is a boolean value indicating whether uniform buffer descriptors natively support nonuniform indexing. If this isVK_FALSE
, then a single dynamic instance of an instruction that nonuniformly indexes an array of uniform buffers may execute multiple times in order to access all the descriptors. -
shaderSampledImageArrayNonUniformIndexingNative
is a boolean value indicating whether sampler and image descriptors natively support nonuniform indexing. If this isVK_FALSE
, then a single dynamic instance of an instruction that nonuniformly indexes an array of samplers or images may execute multiple times in order to access all the descriptors. -
shaderStorageBufferArrayNonUniformIndexingNative
is a boolean value indicating whether storage buffer descriptors natively support nonuniform indexing. If this isVK_FALSE
, then a single dynamic instance of an instruction that nonuniformly indexes an array of storage buffers may execute multiple times in order to access all the descriptors. -
shaderStorageImageArrayNonUniformIndexingNative
is a boolean value indicating whether storage image descriptors natively support nonuniform indexing. If this isVK_FALSE
, then a single dynamic instance of an instruction that nonuniformly indexes an array of storage images may execute multiple times in order to access all the descriptors. -
shaderInputAttachmentArrayNonUniformIndexingNative
is a boolean value indicating whether input attachment descriptors natively support nonuniform indexing. If this isVK_FALSE
, then a single dynamic instance of an instruction that nonuniformly indexes an array of input attachments may execute multiple times in order to access all the descriptors. -
robustBufferAccessUpdateAfterBind is a boolean value indicating whether robustBufferAccess can be enabled in a device simultaneously with descriptorBindingUniformBufferUpdateAfterBind, descriptorBindingStorageBufferUpdateAfterBind, descriptorBindingUniformTexelBufferUpdateAfterBind, and/or descriptorBindingStorageTexelBufferUpdateAfterBind. If this is VK_FALSE, then either robustBufferAccess must be disabled or all of these update-after-bind features must be disabled. -
quadDivergentImplicitLod
is a boolean value indicating whether implicit level of detail calculations for image operations have well-defined results when the image and/or sampler objects used for the instruction are not uniform within a quad. See Derivative Image Operations. -
maxPerStageDescriptorUpdateAfterBindSamplers
is similar tomaxPerStageDescriptorSamplers
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxPerStageDescriptorUpdateAfterBindUniformBuffers
is similar tomaxPerStageDescriptorUniformBuffers
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxPerStageDescriptorUpdateAfterBindStorageBuffers
is similar tomaxPerStageDescriptorStorageBuffers
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxPerStageDescriptorUpdateAfterBindSampledImages
is similar tomaxPerStageDescriptorSampledImages
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxPerStageDescriptorUpdateAfterBindStorageImages
is similar tomaxPerStageDescriptorStorageImages
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxPerStageDescriptorUpdateAfterBindInputAttachments
is similar tomaxPerStageDescriptorInputAttachments
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxPerStageUpdateAfterBindResources
is similar tomaxPerStageResources
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindSamplers
is similar tomaxDescriptorSetSamplers
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindUniformBuffers
is similar tomaxDescriptorSetUniformBuffers
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindUniformBuffersDynamic
is similar tomaxDescriptorSetUniformBuffersDynamic
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindStorageBuffers
is similar tomaxDescriptorSetStorageBuffers
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindStorageBuffersDynamic
is similar tomaxDescriptorSetStorageBuffersDynamic
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindSampledImages
is similar tomaxDescriptorSetSampledImages
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindStorageImages
is similar tomaxDescriptorSetStorageImages
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set. -
maxDescriptorSetUpdateAfterBindInputAttachments
is similar tomaxDescriptorSetInputAttachments
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
bit set.
If the VkPhysicalDeviceDescriptorIndexingPropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
The VkPhysicalDeviceInlineUniformBlockPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceInlineUniformBlockPropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t maxInlineUniformBlockSize;
uint32_t maxPerStageDescriptorInlineUniformBlocks;
uint32_t maxPerStageDescriptorUpdateAfterBindInlineUniformBlocks;
uint32_t maxDescriptorSetInlineUniformBlocks;
uint32_t maxDescriptorSetUpdateAfterBindInlineUniformBlocks;
} VkPhysicalDeviceInlineUniformBlockPropertiesEXT;
The members of the VkPhysicalDeviceInlineUniformBlockPropertiesEXT
structure describe the following implementation-dependent limits:
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- maxInlineUniformBlockSize is the maximum size in bytes of an inline uniform block binding.
- maxPerStageDescriptorInlineUniformBlocks is the maximum number of inline uniform block bindings that can be accessible to a single shader stage in a pipeline layout. Descriptor bindings with a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT count against this limit. Only descriptor bindings in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit.
- maxPerStageDescriptorUpdateAfterBindInlineUniformBlocks is similar to maxPerStageDescriptorInlineUniformBlocks but counts descriptor bindings from descriptor sets created with or without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set.
- maxDescriptorSetInlineUniformBlocks is the maximum number of inline uniform block bindings that can be included in descriptor bindings in a pipeline layout across all pipeline shader stages and descriptor set numbers. Descriptor bindings with a descriptor type of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT count against this limit. Only descriptor bindings in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set count against this limit.
- maxDescriptorSetUpdateAfterBindInlineUniformBlocks is similar to maxDescriptorSetInlineUniformBlocks but counts descriptor bindings from descriptor sets created with or without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT bit set.
If the VkPhysicalDeviceInlineUniformBlockPropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
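As a non-normative sketch of how these limits might be consulted, the fragment below declares an inline uniform block binding; inlineProps is assumed to have been filled by a query like the one shown earlier, and blockBytes is a hypothetical application value. For this descriptor type, descriptorCount is expressed in bytes.
/* Desired inline uniform block size in bytes (hypothetical value, kept within the limit). */
uint32_t blockBytes = 64;
if (blockBytes > inlineProps.maxInlineUniformBlockSize)
    blockBytes = inlineProps.maxInlineUniformBlockSize;

VkDescriptorSetLayoutBinding binding = {0};
binding.binding         = 0;
binding.descriptorType  = VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT;
binding.descriptorCount = blockBytes;              /* size in bytes for this descriptor type */
binding.stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT;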
The VkPhysicalDeviceConservativeRasterizationPropertiesEXT
structure
is defined as:
typedef struct VkPhysicalDeviceConservativeRasterizationPropertiesEXT {
VkStructureType sType;
void* pNext;
float primitiveOverestimationSize;
float maxExtraPrimitiveOverestimationSize;
float extraPrimitiveOverestimationSizeGranularity;
VkBool32 primitiveUnderestimation;
VkBool32 conservativePointAndLineRasterization;
VkBool32 degenerateTrianglesRasterized;
VkBool32 degenerateLinesRasterized;
VkBool32 fullyCoveredFragmentShaderInputVariable;
VkBool32 conservativeRasterizationPostDepthCoverage;
} VkPhysicalDeviceConservativeRasterizationPropertiesEXT;
The members of the
VkPhysicalDeviceConservativeRasterizationPropertiesEXT
structure
describe the following implementation-dependent limits:
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- primitiveOverestimationSize is the size in pixels the generating primitive is increased at each of its edges during conservative rasterization overestimation mode. Even with a size of 0.0, conservative rasterization overestimation rules still apply and if any part of the pixel rectangle is covered by the generating primitive, fragments are generated for the entire pixel. However implementations may make the pixel coverage area even more conservative by increasing the size of the generating primitive.
- maxExtraPrimitiveOverestimationSize is the maximum size in pixels of extra overestimation the implementation supports in the pipeline state. A value of 0.0 means the implementation does not support any additional overestimation of the generating primitive during conservative rasterization. A value above 0.0 allows the application to further increase the size of the generating primitive during conservative rasterization overestimation.
- extraPrimitiveOverestimationSizeGranularity is the granularity of extra overestimation that can be specified in the pipeline state between 0.0 and maxExtraPrimitiveOverestimationSize inclusive. A value of 0.0 means the implementation can use the smallest representable non-zero value in the screen space pixel fixed-point grid.
- primitiveUnderestimation is true if the implementation supports the VK_CONSERVATIVE_RASTERIZATION_MODE_UNDERESTIMATE_EXT conservative rasterization mode in addition to VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT. Otherwise the implementation only supports VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT.
- conservativePointAndLineRasterization is true if the implementation supports conservative rasterization of point and line primitives as well as triangle primitives. Otherwise the implementation only supports triangle primitives.
- degenerateTrianglesRasterized is false if the implementation culls primitives generated from triangles that become zero area after they are quantized to the fixed-point rasterization pixel grid. degenerateTrianglesRasterized is true if these primitives are not culled and the provoking vertex attributes and depth value are used for the fragments. The primitive area calculation is done on the primitive generated from the clipped triangle if applicable. Zero area primitives are backfacing and the application can enable backface culling if desired.
- degenerateLinesRasterized is false if the implementation culls lines that become zero length after they are quantized to the fixed-point rasterization pixel grid. degenerateLinesRasterized is true if zero length lines are not culled and the provoking vertex attributes and depth value are used for the fragments.
- fullyCoveredFragmentShaderInputVariable is true if the implementation supports the SPIR-V builtin fragment shader input variable FullyCoveredEXT which specifies that conservative rasterization is enabled and the fragment area is fully covered by the generating primitive.
- conservativeRasterizationPostDepthCoverage is true if the implementation supports conservative rasterization with the PostDepthCoverage execution mode enabled. When supported the SampleMask built-in input variable will reflect the coverage after the early per-fragment depth and stencil tests are applied even when conservative rasterization is enabled. Otherwise the PostDepthCoverage execution mode must not be used when conservative rasterization is enabled.
If the VkPhysicalDeviceConservativeRasterizationPropertiesEXT
structure is included in the pNext
chain of
VkPhysicalDeviceProperties2, it is filled with the
implementation-dependent limits and properties.
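As a non-normative sketch, the fragment below fills a VkPipelineRasterizationConservativeStateCreateInfoEXT using these limits; conservProps is assumed to have been filled via vkGetPhysicalDeviceProperties2, and the requested extra overestimation value is a hypothetical application choice clamped to the reported maximum.
/* Hypothetical extra overestimation requested by the application, in pixels. */
float extra = 0.25f;
if (extra > conservProps.maxExtraPrimitiveOverestimationSize)
    extra = conservProps.maxExtraPrimitiveOverestimationSize;

VkPipelineRasterizationConservativeStateCreateInfoEXT conservState = {0};
conservState.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_CONSERVATIVE_STATE_CREATE_INFO_EXT;
conservState.conservativeRasterizationMode    = VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT;
conservState.extraPrimitiveOverestimationSize = extra;

/* Chain conservState into VkPipelineRasterizationStateCreateInfo::pNext when creating the pipeline. */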
The VkPhysicalDeviceFragmentDensityMapPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceFragmentDensityMapPropertiesEXT {
VkStructureType sType;
void* pNext;
VkExtent2D minFragmentDensityTexelSize;
VkExtent2D maxFragmentDensityTexelSize;
VkBool32 fragmentDensityInvocations;
} VkPhysicalDeviceFragmentDensityMapPropertiesEXT;
The members of the VkPhysicalDeviceFragmentDensityMapPropertiesEXT
structure describe the following implementation-dependent limits:
- minFragmentDensityTexelSize is the minimum fragment density texel size.
- maxFragmentDensityTexelSize is the maximum fragment density texel size.
- fragmentDensityInvocations specifies whether the implementation may invoke additional fragment shader invocations for each covered sample.
If the VkPhysicalDeviceFragmentDensityMapPropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2,
it is filled with the implementation-dependent limits and properties.
The VkPhysicalDeviceShaderCorePropertiesAMD
structure is defined as:
typedef struct VkPhysicalDeviceShaderCorePropertiesAMD {
VkStructureType sType;
void* pNext;
uint32_t shaderEngineCount;
uint32_t shaderArraysPerEngineCount;
uint32_t computeUnitsPerShaderArray;
uint32_t simdPerComputeUnit;
uint32_t wavefrontsPerSimd;
uint32_t wavefrontSize;
uint32_t sgprsPerSimd;
uint32_t minSgprAllocation;
uint32_t maxSgprAllocation;
uint32_t sgprAllocationGranularity;
uint32_t vgprsPerSimd;
uint32_t minVgprAllocation;
uint32_t maxVgprAllocation;
uint32_t vgprAllocationGranularity;
} VkPhysicalDeviceShaderCorePropertiesAMD;
The members of the VkPhysicalDeviceShaderCorePropertiesAMD
structure
describe the following implementation-dependent limits:
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- shaderEngineCount is an unsigned integer value indicating the number of shader engines found inside the shader core of the physical device.
- shaderArraysPerEngineCount is an unsigned integer value indicating the number of shader arrays inside a shader engine. Each shader array has its own scan converter, set of compute units, and a render back end (color and depth buffers). Shader arrays within a shader engine share shader processor input (wave launcher) and shader export (export buffer) units. Currently, a shader engine can have one or two shader arrays.
- computeUnitsPerShaderArray is an unsigned integer value indicating the number of compute units within a shader array. A compute unit houses a set of SIMDs along with a sequencer module and a local data store.
- simdPerComputeUnit is an unsigned integer value indicating the number of SIMDs inside a compute unit. Each SIMD processes a single instruction at a time.
- wavefrontsPerSimd is an unsigned integer value indicating the number of wavefront slots in each SIMD.
- wavefrontSize is an unsigned integer value indicating the number of channels (or threads) in a wavefront.
- sgprsPerSimd is an unsigned integer value indicating the number of physical Scalar General Purpose Registers (SGPRs) per SIMD.
- minSgprAllocation is an unsigned integer value indicating the minimum number of SGPRs allocated for a wave.
- maxSgprAllocation is an unsigned integer value indicating the maximum number of SGPRs allocated for a wave.
- sgprAllocationGranularity is an unsigned integer value indicating the granularity of SGPR allocation for a wave.
- vgprsPerSimd is an unsigned integer value indicating the number of physical Vector General Purpose Registers (VGPRs) per SIMD.
- minVgprAllocation is an unsigned integer value indicating the minimum number of VGPRs allocated for a wave.
- maxVgprAllocation is an unsigned integer value indicating the maximum number of VGPRs allocated for a wave.
- vgprAllocationGranularity is an unsigned integer value indicating the granularity of VGPR allocation for a wave.
If the VkPhysicalDeviceShaderCorePropertiesAMD
structure is included
in the pNext
chain of VkPhysicalDeviceProperties2, it is filled
with the implementation-dependent limits.
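As a non-normative illustration of how these counts compose, the arithmetic below derives an upper bound on concurrently resident wavefronts and threads; shaderCoreProps is assumed to have been filled via vkGetPhysicalDeviceProperties2.
/* Total number of wavefront slots across the device. */
uint64_t wavefrontSlots =
    (uint64_t)shaderCoreProps.shaderEngineCount *
    shaderCoreProps.shaderArraysPerEngineCount *
    shaderCoreProps.computeUnitsPerShaderArray *
    shaderCoreProps.simdPerComputeUnit *
    shaderCoreProps.wavefrontsPerSimd;

/* Upper bound on concurrently resident threads. */
uint64_t maxResidentThreads = wavefrontSlots * shaderCoreProps.wavefrontSize;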
The VkPhysicalDeviceShadingRateImagePropertiesNV
structure is defined
as:
typedef struct VkPhysicalDeviceShadingRateImagePropertiesNV {
VkStructureType sType;
void* pNext;
VkExtent2D shadingRateTexelSize;
uint32_t shadingRatePaletteSize;
uint32_t shadingRateMaxCoarseSamples;
} VkPhysicalDeviceShadingRateImagePropertiesNV;
The members of the VkPhysicalDeviceShadingRateImagePropertiesNV
structure describe the following implementation-dependent properties related
to the shading rate image feature:
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- shadingRateTexelSize indicates the width and height of the portion of the framebuffer corresponding to each texel in the shading rate image.
- shadingRatePaletteSize indicates the maximum number of palette entries supported for the shading rate image.
- shadingRateMaxCoarseSamples specifies the maximum number of coverage samples supported in a single fragment. If the product of the fragment size derived from the base shading rate and the number of coverage samples per pixel exceeds this limit, the final shading rate will be adjusted so that its product does not exceed the limit.
If the VkPhysicalDeviceShadingRateImagePropertiesNV
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits.
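As a non-normative sketch of the coverage-sample product described above, the check below tests whether a coarse fragment size is within shadingRateMaxCoarseSamples; shadingRateProps is assumed to have been filled via vkGetPhysicalDeviceProperties2, and the fragment size and sample count are hypothetical application choices.
/* Hypothetical 2x2-pixel coarse fragment with 4 samples per pixel. */
uint32_t fragmentWidth   = 2;
uint32_t fragmentHeight  = 2;
uint32_t samplesPerPixel = 4;

uint32_t coarseSamples = fragmentWidth * fragmentHeight * samplesPerPixel;
VkBool32 rateSupported =
    (coarseSamples <= shadingRateProps.shadingRateMaxCoarseSamples) ? VK_TRUE : VK_FALSE;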
The VkPhysicalDeviceTransformFeedbackPropertiesEXT
structure is
defined as:
typedef struct VkPhysicalDeviceTransformFeedbackPropertiesEXT {
VkStructureType sType;
void* pNext;
uint32_t maxTransformFeedbackStreams;
uint32_t maxTransformFeedbackBuffers;
VkDeviceSize maxTransformFeedbackBufferSize;
uint32_t maxTransformFeedbackStreamDataSize;
uint32_t maxTransformFeedbackBufferDataSize;
uint32_t maxTransformFeedbackBufferDataStride;
VkBool32 transformFeedbackQueries;
VkBool32 transformFeedbackStreamsLinesTriangles;
VkBool32 transformFeedbackRasterizationStreamSelect;
VkBool32 transformFeedbackDraw;
} VkPhysicalDeviceTransformFeedbackPropertiesEXT;
The members of the VkPhysicalDeviceTransformFeedbackPropertiesEXT
structure describe the following implementation-dependent limits:
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- maxTransformFeedbackStreams is the maximum number of vertex streams that can be output from geometry shaders declared with the GeometryStreams capability. If the implementation does not support VkPhysicalDeviceTransformFeedbackFeaturesEXT::geometryStreams then maxTransformFeedbackStreams must be set to 1.
- maxTransformFeedbackBuffers is the maximum number of transform feedback buffers that can be bound for capturing shader outputs from the last vertex processing stage.
- maxTransformFeedbackBufferSize is the maximum size that can be specified when binding a buffer for transform feedback in vkCmdBindTransformFeedbackBuffersEXT.
- maxTransformFeedbackStreamDataSize is the maximum amount of data in bytes for each vertex that can be captured to one or more transform feedback buffers associated with a specific vertex stream.
- maxTransformFeedbackBufferDataSize is the maximum amount of data in bytes for each vertex that can be captured to a specific transform feedback buffer.
- maxTransformFeedbackBufferDataStride is the maximum stride between each capture of vertex data to the buffer.
- transformFeedbackQueries is true if the implementation supports the VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT query type. transformFeedbackQueries is false if queries of this type cannot be created.
- transformFeedbackStreamsLinesTriangles is true if the implementation supports the geometry shader OpExecutionMode of OutputLineStrip and OutputTriangleStrip in addition to OutputPoints when more than one vertex stream is output. If transformFeedbackStreamsLinesTriangles is false the implementation only supports an OpExecutionMode of OutputPoints when more than one vertex stream is output from the geometry shader.
- transformFeedbackRasterizationStreamSelect is true if the implementation supports the GeometryStreams SPIR-V capability and the application can use VkPipelineRasterizationStateStreamCreateInfoEXT to modify which vertex stream output is used for rasterization. Otherwise vertex stream 0 must always be used for rasterization.
- transformFeedbackDraw is true if the implementation supports the vkCmdDrawIndirectByteCountEXT function; otherwise the function must not be called.
If the VkPhysicalDeviceTransformFeedbackPropertiesEXT
structure is
included in the pNext
chain of VkPhysicalDeviceProperties2, it
is filled with the implementation-dependent limits and properties.
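As a non-normative sketch, the fragment below checks transformFeedbackQueries before creating a transform feedback stream query pool; xfbProps is assumed to have been filled via vkGetPhysicalDeviceProperties2 and device is assumed to be a valid VkDevice.
if (xfbProps.transformFeedbackQueries) {
    VkQueryPoolCreateInfo queryInfo = {0};
    queryInfo.sType      = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO;
    queryInfo.queryType  = VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT;
    queryInfo.queryCount = 1;

    VkQueryPool queryPool;
    vkCreateQueryPool(device, &queryInfo, NULL, &queryPool);
}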
The VkPhysicalDeviceRayTracingPropertiesNV
structure is defined as:
typedef struct VkPhysicalDeviceRayTracingPropertiesNV {
VkStructureType sType;
void* pNext;
uint32_t shaderGroupHandleSize;
uint32_t maxRecursionDepth;
uint32_t maxShaderGroupStride;
uint32_t shaderGroupBaseAlignment;
uint64_t maxGeometryCount;
uint64_t maxInstanceCount;
uint64_t maxTriangleCount;
uint32_t maxDescriptorSetAccelerationStructures;
} VkPhysicalDeviceRayTracingPropertiesNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- shaderGroupHandleSize is the size in bytes of the shader header.
- maxRecursionDepth is the maximum number of levels of recursion allowed in a trace command.
- maxShaderGroupStride is the maximum stride in bytes allowed between shader groups in the SBT.
- shaderGroupBaseAlignment is the required alignment in bytes for the base of the SBTs.
- maxGeometryCount is the maximum number of geometries in the bottom level acceleration structure.
- maxInstanceCount is the maximum number of instances in the top level acceleration structure.
- maxTriangleCount is the maximum number of triangles in all geometries in the bottom level acceleration structure.
- maxDescriptorSetAccelerationStructures is the maximum number of acceleration structure descriptors that are allowed in a descriptor set.
If the VkPhysicalDeviceRayTracingPropertiesNV
structure is included in
the pNext
chain of VkPhysicalDeviceProperties2, it is filled
with the implementation-dependent limits.
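As a non-normative sketch of how an application might size a shader binding table against these limits, the arithmetic below derives a per-record stride and an aligned table offset; rtProps is assumed to have been filled via vkGetPhysicalDeviceProperties2, and recordDataSize is a hypothetical number of application bytes stored after each shader group handle.
/* Per-record stride: handle plus application data, checked against the limit. */
uint32_t recordDataSize = 16;
uint32_t sbtStride = rtProps.shaderGroupHandleSize + recordDataSize;
if (sbtStride > rtProps.maxShaderGroupStride)
    sbtStride = rtProps.maxShaderGroupStride;   /* shrink the per-record payload */

/* Start each shader group table at an offset that is a multiple of shaderGroupBaseAlignment. */
uint32_t     baseAlign        = rtProps.shaderGroupBaseAlignment;
VkDeviceSize raygenRegionSize = sbtStride;      /* a single ray generation group, for example */
VkDeviceSize missTableOffset  =
    ((raygenRegionSize + baseAlign - 1) / baseAlign) * baseAlign;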
35.2.1. Limit Requirements
The following table specifies the required minimum/maximum for all Vulkan graphics implementations. Where a limit corresponds to a fine-grained device feature which is optional, the feature name is listed with two required limits, one when the feature is supported and one when it is not supported. If an implementation supports a feature, the limits reported are the same whether or not the feature is enabled.
Type | Limit | Feature |
---|---|---|
(The rows of this table, which list each limit together with its type and the fine-grained feature it depends on, for example sparseBinding, were not preserved in this extraction.)
Limit | Unsupported Limit | Supported Limit | Limit Type 1 |
---|---|---|---|
(The rows of this table, which pair each limit name with its required value when the corresponding feature is unsupported, its required value when supported, and its limit type, were not preserved in this extraction; the footnotes referenced by the table follow.)
1. The Limit Type column specifies whether the limit is the minimum limit all implementations must support or the maximum limit all implementations must support. For bitmasks a minimum limit is the least bits all implementations must set, but they may have additional bits set beyond this minimum.
2. The maxPerStageResources must be at least the smallest of the following: the sum of the maxPerStageDescriptorUniformBuffers, maxPerStageDescriptorStorageBuffers, maxPerStageDescriptorSampledImages, maxPerStageDescriptorStorageImages, maxPerStageDescriptorInputAttachments, and maxColorAttachments limits, or 128. It may not be possible to reach this limit in every stage.
3. See maxViewportDimensions for the required relationship to other limits.
4. See viewportBoundsRange for the required relationship to other limits.
5. The values minInterpolationOffset and maxInterpolationOffset describe the closed interval of supported interpolation offsets: [minInterpolationOffset, maxInterpolationOffset]. The ULP is determined by subPixelInterpolationOffsetBits. If subPixelInterpolationOffsetBits is 4, this provides increments of (1/2^4) = 0.0625, and thus the range of supported interpolation offsets would be [-0.5, 0.4375].
6. The point size ULP is determined by pointSizeGranularity. If the pointSizeGranularity is 0.125, the range of supported point sizes must be at least [1.0, 63.875].
7. The line width ULP is determined by lineWidthGranularity. If the lineWidthGranularity is 0.0625, the range of supported line widths must be at least [1.0, 7.9375].
8. The minimum maxDescriptorSet* limit is n times the corresponding specification minimum maxPerStageDescriptor* limit, where n is the number of shader stages supported by the VkPhysicalDevice. If all shader stages are supported, n = 6 (vertex, tessellation control, tessellation evaluation, geometry, fragment, compute).
9. The UpdateAfterBind descriptor limits must each be greater than or equal to the corresponding non-UpdateAfterBind limit.
35.3. Additional Multisampling Capabilities
In addition to the minimum capabilities described in the previous section (Limits), implementations may support additional multisampling capabilities specific to a particular sample count.
To query additional sample count specific multisampling capabilities, call:
void vkGetPhysicalDeviceMultisamplePropertiesEXT(
VkPhysicalDevice physicalDevice,
VkSampleCountFlagBits samples,
VkMultisamplePropertiesEXT* pMultisampleProperties);
- physicalDevice is the physical device from which to query the additional multisampling capabilities.
- samples is the sample count to query the capabilities for.
- pMultisampleProperties is a pointer to a structure of type VkMultisamplePropertiesEXT, in which information about the additional multisampling capabilities specific to the sample count is returned.
The VkMultisamplePropertiesEXT
structure is defined as:
typedef struct VkMultisamplePropertiesEXT {
VkStructureType sType;
void* pNext;
VkExtent2D maxSampleLocationGridSize;
} VkMultisamplePropertiesEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- maxSampleLocationGridSize is the maximum size of the pixel grid in which sample locations can vary.
If the sample count for which additional multisampling capabilities are requested using vkGetPhysicalDeviceMultisamplePropertiesEXT is set in VkPhysicalDeviceSampleLocationsPropertiesEXT::sampleLocationSampleCounts, the width and height members of VkMultisamplePropertiesEXT::maxSampleLocationGridSize must be greater than or equal to the corresponding members of VkPhysicalDeviceSampleLocationsPropertiesEXT::maxSampleLocationGridSize, respectively, otherwise both members must be 0.
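As a non-normative sketch, the fragment below queries the 4-sample capabilities; instance and physicalDevice are assumed to be valid handles, and the command, which is introduced by VK_EXT_sample_locations, is retrieved here with vkGetInstanceProcAddr.
PFN_vkGetPhysicalDeviceMultisamplePropertiesEXT pfnGetMultisampleProps =
    (PFN_vkGetPhysicalDeviceMultisamplePropertiesEXT)
        vkGetInstanceProcAddr(instance, "vkGetPhysicalDeviceMultisamplePropertiesEXT");

VkMultisamplePropertiesEXT msProps = {0};
msProps.sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT;

pfnGetMultisampleProps(physicalDevice, VK_SAMPLE_COUNT_4_BIT, &msProps);

/* msProps.maxSampleLocationGridSize now describes the pixel grid over which
   sample locations can vary for 4-sample rendering. */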
35.4. Formats
The features for the set of formats (VkFormat) supported by the implementation are queried individually using the vkGetPhysicalDeviceFormatProperties command.
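As a non-normative sketch, the fragment below performs such a per-format query; physicalDevice is assumed to be a valid VkPhysicalDevice.
VkFormatProperties fmtProps;
vkGetPhysicalDeviceFormatProperties(physicalDevice, VK_FORMAT_R8G8B8A8_UNORM, &fmtProps);

/* Check whether the format can be sampled from an optimally tiled image. */
VkBool32 sampleable =
    (fmtProps.optimalTilingFeatures & VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT) ? VK_TRUE : VK_FALSE;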
35.4.1. Format Definition
The following image formats can be passed to, and may be returned from Vulkan commands. The memory required to store each format is discussed with that format, and also summarized in the Representation and Texel Block Size section and the Compatible formats table.
typedef enum VkFormat {
VK_FORMAT_UNDEFINED = 0,
VK_FORMAT_R4G4_UNORM_PACK8 = 1,
VK_FORMAT_R4G4B4A4_UNORM_PACK16 = 2,
VK_FORMAT_B4G4R4A4_UNORM_PACK16 = 3,
VK_FORMAT_R5G6B5_UNORM_PACK16 = 4,
VK_FORMAT_B5G6R5_UNORM_PACK16 = 5,
VK_FORMAT_R5G5B5A1_UNORM_PACK16 = 6,
VK_FORMAT_B5G5R5A1_UNORM_PACK16 = 7,
VK_FORMAT_A1R5G5B5_UNORM_PACK16 = 8,
VK_FORMAT_R8_UNORM = 9,
VK_FORMAT_R8_SNORM = 10,
VK_FORMAT_R8_USCALED = 11,
VK_FORMAT_R8_SSCALED = 12,
VK_FORMAT_R8_UINT = 13,
VK_FORMAT_R8_SINT = 14,
VK_FORMAT_R8_SRGB = 15,
VK_FORMAT_R8G8_UNORM = 16,
VK_FORMAT_R8G8_SNORM = 17,
VK_FORMAT_R8G8_USCALED = 18,
VK_FORMAT_R8G8_SSCALED = 19,
VK_FORMAT_R8G8_UINT = 20,
VK_FORMAT_R8G8_SINT = 21,
VK_FORMAT_R8G8_SRGB = 22,
VK_FORMAT_R8G8B8_UNORM = 23,
VK_FORMAT_R8G8B8_SNORM = 24,
VK_FORMAT_R8G8B8_USCALED = 25,
VK_FORMAT_R8G8B8_SSCALED = 26,
VK_FORMAT_R8G8B8_UINT = 27,
VK_FORMAT_R8G8B8_SINT = 28,
VK_FORMAT_R8G8B8_SRGB = 29,
VK_FORMAT_B8G8R8_UNORM = 30,
VK_FORMAT_B8G8R8_SNORM = 31,
VK_FORMAT_B8G8R8_USCALED = 32,
VK_FORMAT_B8G8R8_SSCALED = 33,
VK_FORMAT_B8G8R8_UINT = 34,
VK_FORMAT_B8G8R8_SINT = 35,
VK_FORMAT_B8G8R8_SRGB = 36,
VK_FORMAT_R8G8B8A8_UNORM = 37,
VK_FORMAT_R8G8B8A8_SNORM = 38,
VK_FORMAT_R8G8B8A8_USCALED = 39,
VK_FORMAT_R8G8B8A8_SSCALED = 40,
VK_FORMAT_R8G8B8A8_UINT = 41,
VK_FORMAT_R8G8B8A8_SINT = 42,
VK_FORMAT_R8G8B8A8_SRGB = 43,
VK_FORMAT_B8G8R8A8_UNORM = 44,
VK_FORMAT_B8G8R8A8_SNORM = 45,
VK_FORMAT_B8G8R8A8_USCALED = 46,
VK_FORMAT_B8G8R8A8_SSCALED = 47,
VK_FORMAT_B8G8R8A8_UINT = 48,
VK_FORMAT_B8G8R8A8_SINT = 49,
VK_FORMAT_B8G8R8A8_SRGB = 50,
VK_FORMAT_A8B8G8R8_UNORM_PACK32 = 51,
VK_FORMAT_A8B8G8R8_SNORM_PACK32 = 52,
VK_FORMAT_A8B8G8R8_USCALED_PACK32 = 53,
VK_FORMAT_A8B8G8R8_SSCALED_PACK32 = 54,
VK_FORMAT_A8B8G8R8_UINT_PACK32 = 55,
VK_FORMAT_A8B8G8R8_SINT_PACK32 = 56,
VK_FORMAT_A8B8G8R8_SRGB_PACK32 = 57,
VK_FORMAT_A2R10G10B10_UNORM_PACK32 = 58,
VK_FORMAT_A2R10G10B10_SNORM_PACK32 = 59,
VK_FORMAT_A2R10G10B10_USCALED_PACK32 = 60,
VK_FORMAT_A2R10G10B10_SSCALED_PACK32 = 61,
VK_FORMAT_A2R10G10B10_UINT_PACK32 = 62,
VK_FORMAT_A2R10G10B10_SINT_PACK32 = 63,
VK_FORMAT_A2B10G10R10_UNORM_PACK32 = 64,
VK_FORMAT_A2B10G10R10_SNORM_PACK32 = 65,
VK_FORMAT_A2B10G10R10_USCALED_PACK32 = 66,
VK_FORMAT_A2B10G10R10_SSCALED_PACK32 = 67,
VK_FORMAT_A2B10G10R10_UINT_PACK32 = 68,
VK_FORMAT_A2B10G10R10_SINT_PACK32 = 69,
VK_FORMAT_R16_UNORM = 70,
VK_FORMAT_R16_SNORM = 71,
VK_FORMAT_R16_USCALED = 72,
VK_FORMAT_R16_SSCALED = 73,
VK_FORMAT_R16_UINT = 74,
VK_FORMAT_R16_SINT = 75,
VK_FORMAT_R16_SFLOAT = 76,
VK_FORMAT_R16G16_UNORM = 77,
VK_FORMAT_R16G16_SNORM = 78,
VK_FORMAT_R16G16_USCALED = 79,
VK_FORMAT_R16G16_SSCALED = 80,
VK_FORMAT_R16G16_UINT = 81,
VK_FORMAT_R16G16_SINT = 82,
VK_FORMAT_R16G16_SFLOAT = 83,
VK_FORMAT_R16G16B16_UNORM = 84,
VK_FORMAT_R16G16B16_SNORM = 85,
VK_FORMAT_R16G16B16_USCALED = 86,
VK_FORMAT_R16G16B16_SSCALED = 87,
VK_FORMAT_R16G16B16_UINT = 88,
VK_FORMAT_R16G16B16_SINT = 89,
VK_FORMAT_R16G16B16_SFLOAT = 90,
VK_FORMAT_R16G16B16A16_UNORM = 91,
VK_FORMAT_R16G16B16A16_SNORM = 92,
VK_FORMAT_R16G16B16A16_USCALED = 93,
VK_FORMAT_R16G16B16A16_SSCALED = 94,
VK_FORMAT_R16G16B16A16_UINT = 95,
VK_FORMAT_R16G16B16A16_SINT = 96,
VK_FORMAT_R16G16B16A16_SFLOAT = 97,
VK_FORMAT_R32_UINT = 98,
VK_FORMAT_R32_SINT = 99,
VK_FORMAT_R32_SFLOAT = 100,
VK_FORMAT_R32G32_UINT = 101,
VK_FORMAT_R32G32_SINT = 102,
VK_FORMAT_R32G32_SFLOAT = 103,
VK_FORMAT_R32G32B32_UINT = 104,
VK_FORMAT_R32G32B32_SINT = 105,
VK_FORMAT_R32G32B32_SFLOAT = 106,
VK_FORMAT_R32G32B32A32_UINT = 107,
VK_FORMAT_R32G32B32A32_SINT = 108,
VK_FORMAT_R32G32B32A32_SFLOAT = 109,
VK_FORMAT_R64_UINT = 110,
VK_FORMAT_R64_SINT = 111,
VK_FORMAT_R64_SFLOAT = 112,
VK_FORMAT_R64G64_UINT = 113,
VK_FORMAT_R64G64_SINT = 114,
VK_FORMAT_R64G64_SFLOAT = 115,
VK_FORMAT_R64G64B64_UINT = 116,
VK_FORMAT_R64G64B64_SINT = 117,
VK_FORMAT_R64G64B64_SFLOAT = 118,
VK_FORMAT_R64G64B64A64_UINT = 119,
VK_FORMAT_R64G64B64A64_SINT = 120,
VK_FORMAT_R64G64B64A64_SFLOAT = 121,
VK_FORMAT_B10G11R11_UFLOAT_PACK32 = 122,
VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 = 123,
VK_FORMAT_D16_UNORM = 124,
VK_FORMAT_X8_D24_UNORM_PACK32 = 125,
VK_FORMAT_D32_SFLOAT = 126,
VK_FORMAT_S8_UINT = 127,
VK_FORMAT_D16_UNORM_S8_UINT = 128,
VK_FORMAT_D24_UNORM_S8_UINT = 129,
VK_FORMAT_D32_SFLOAT_S8_UINT = 130,
VK_FORMAT_BC1_RGB_UNORM_BLOCK = 131,
VK_FORMAT_BC1_RGB_SRGB_BLOCK = 132,
VK_FORMAT_BC1_RGBA_UNORM_BLOCK = 133,
VK_FORMAT_BC1_RGBA_SRGB_BLOCK = 134,
VK_FORMAT_BC2_UNORM_BLOCK = 135,
VK_FORMAT_BC2_SRGB_BLOCK = 136,
VK_FORMAT_BC3_UNORM_BLOCK = 137,
VK_FORMAT_BC3_SRGB_BLOCK = 138,
VK_FORMAT_BC4_UNORM_BLOCK = 139,
VK_FORMAT_BC4_SNORM_BLOCK = 140,
VK_FORMAT_BC5_UNORM_BLOCK = 141,
VK_FORMAT_BC5_SNORM_BLOCK = 142,
VK_FORMAT_BC6H_UFLOAT_BLOCK = 143,
VK_FORMAT_BC6H_SFLOAT_BLOCK = 144,
VK_FORMAT_BC7_UNORM_BLOCK = 145,
VK_FORMAT_BC7_SRGB_BLOCK = 146,
VK_FORMAT_ETC2_R8G8B8_UNORM_BLOCK = 147,
VK_FORMAT_ETC2_R8G8B8_SRGB_BLOCK = 148,
VK_FORMAT_ETC2_R8G8B8A1_UNORM_BLOCK = 149,
VK_FORMAT_ETC2_R8G8B8A1_SRGB_BLOCK = 150,
VK_FORMAT_ETC2_R8G8B8A8_UNORM_BLOCK = 151,
VK_FORMAT_ETC2_R8G8B8A8_SRGB_BLOCK = 152,
VK_FORMAT_EAC_R11_UNORM_BLOCK = 153,
VK_FORMAT_EAC_R11_SNORM_BLOCK = 154,
VK_FORMAT_EAC_R11G11_UNORM_BLOCK = 155,
VK_FORMAT_EAC_R11G11_SNORM_BLOCK = 156,
VK_FORMAT_ASTC_4x4_UNORM_BLOCK = 157,
VK_FORMAT_ASTC_4x4_SRGB_BLOCK = 158,
VK_FORMAT_ASTC_5x4_UNORM_BLOCK = 159,
VK_FORMAT_ASTC_5x4_SRGB_BLOCK = 160,
VK_FORMAT_ASTC_5x5_UNORM_BLOCK = 161,
VK_FORMAT_ASTC_5x5_SRGB_BLOCK = 162,
VK_FORMAT_ASTC_6x5_UNORM_BLOCK = 163,
VK_FORMAT_ASTC_6x5_SRGB_BLOCK = 164,
VK_FORMAT_ASTC_6x6_UNORM_BLOCK = 165,
VK_FORMAT_ASTC_6x6_SRGB_BLOCK = 166,
VK_FORMAT_ASTC_8x5_UNORM_BLOCK = 167,
VK_FORMAT_ASTC_8x5_SRGB_BLOCK = 168,
VK_FORMAT_ASTC_8x6_UNORM_BLOCK = 169,
VK_FORMAT_ASTC_8x6_SRGB_BLOCK = 170,
VK_FORMAT_ASTC_8x8_UNORM_BLOCK = 171,
VK_FORMAT_ASTC_8x8_SRGB_BLOCK = 172,
VK_FORMAT_ASTC_10x5_UNORM_BLOCK = 173,
VK_FORMAT_ASTC_10x5_SRGB_BLOCK = 174,
VK_FORMAT_ASTC_10x6_UNORM_BLOCK = 175,
VK_FORMAT_ASTC_10x6_SRGB_BLOCK = 176,
VK_FORMAT_ASTC_10x8_UNORM_BLOCK = 177,
VK_FORMAT_ASTC_10x8_SRGB_BLOCK = 178,
VK_FORMAT_ASTC_10x10_UNORM_BLOCK = 179,
VK_FORMAT_ASTC_10x10_SRGB_BLOCK = 180,
VK_FORMAT_ASTC_12x10_UNORM_BLOCK = 181,
VK_FORMAT_ASTC_12x10_SRGB_BLOCK = 182,
VK_FORMAT_ASTC_12x12_UNORM_BLOCK = 183,
VK_FORMAT_ASTC_12x12_SRGB_BLOCK = 184,
VK_FORMAT_G8B8G8R8_422_UNORM = 1000156000,
VK_FORMAT_B8G8R8G8_422_UNORM = 1000156001,
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM = 1000156002,
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM = 1000156003,
VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM = 1000156004,
VK_FORMAT_G8_B8R8_2PLANE_422_UNORM = 1000156005,
VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM = 1000156006,
VK_FORMAT_R10X6_UNORM_PACK16 = 1000156007,
VK_FORMAT_R10X6G10X6_UNORM_2PACK16 = 1000156008,
VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16 = 1000156009,
VK_FORMAT_G10X6B10X6G10X6R10X6_422_UNORM_4PACK16 = 1000156010,
VK_FORMAT_B10X6G10X6R10X6G10X6_422_UNORM_4PACK16 = 1000156011,
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16 = 1000156012,
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16 = 1000156013,
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16 = 1000156014,
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16 = 1000156015,
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16 = 1000156016,
VK_FORMAT_R12X4_UNORM_PACK16 = 1000156017,
VK_FORMAT_R12X4G12X4_UNORM_2PACK16 = 1000156018,
VK_FORMAT_R12X4G12X4B12X4A12X4_UNORM_4PACK16 = 1000156019,
VK_FORMAT_G12X4B12X4G12X4R12X4_422_UNORM_4PACK16 = 1000156020,
VK_FORMAT_B12X4G12X4R12X4G12X4_422_UNORM_4PACK16 = 1000156021,
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16 = 1000156022,
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16 = 1000156023,
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16 = 1000156024,
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16 = 1000156025,
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16 = 1000156026,
VK_FORMAT_G16B16G16R16_422_UNORM = 1000156027,
VK_FORMAT_B16G16R16G16_422_UNORM = 1000156028,
VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM = 1000156029,
VK_FORMAT_G16_B16R16_2PLANE_420_UNORM = 1000156030,
VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM = 1000156031,
VK_FORMAT_G16_B16R16_2PLANE_422_UNORM = 1000156032,
VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM = 1000156033,
VK_FORMAT_PVRTC1_2BPP_UNORM_BLOCK_IMG = 1000054000,
VK_FORMAT_PVRTC1_4BPP_UNORM_BLOCK_IMG = 1000054001,
VK_FORMAT_PVRTC2_2BPP_UNORM_BLOCK_IMG = 1000054002,
VK_FORMAT_PVRTC2_4BPP_UNORM_BLOCK_IMG = 1000054003,
VK_FORMAT_PVRTC1_2BPP_SRGB_BLOCK_IMG = 1000054004,
VK_FORMAT_PVRTC1_4BPP_SRGB_BLOCK_IMG = 1000054005,
VK_FORMAT_PVRTC2_2BPP_SRGB_BLOCK_IMG = 1000054006,
VK_FORMAT_PVRTC2_4BPP_SRGB_BLOCK_IMG = 1000054007,
VK_FORMAT_G8B8G8R8_422_UNORM_KHR = VK_FORMAT_G8B8G8R8_422_UNORM,
VK_FORMAT_B8G8R8G8_422_UNORM_KHR = VK_FORMAT_B8G8R8G8_422_UNORM,
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM_KHR = VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM,
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM_KHR = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM,
VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM_KHR = VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM,
VK_FORMAT_G8_B8R8_2PLANE_422_UNORM_KHR = VK_FORMAT_G8_B8R8_2PLANE_422_UNORM,
VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM_KHR = VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM,
VK_FORMAT_R10X6_UNORM_PACK16_KHR = VK_FORMAT_R10X6_UNORM_PACK16,
VK_FORMAT_R10X6G10X6_UNORM_2PACK16_KHR = VK_FORMAT_R10X6G10X6_UNORM_2PACK16,
VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16_KHR = VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16,
VK_FORMAT_G10X6B10X6G10X6R10X6_422_UNORM_4PACK16_KHR = VK_FORMAT_G10X6B10X6G10X6R10X6_422_UNORM_4PACK16,
VK_FORMAT_B10X6G10X6R10X6G10X6_422_UNORM_4PACK16_KHR = VK_FORMAT_B10X6G10X6R10X6G10X6_422_UNORM_4PACK16,
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16_KHR = VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16,
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16_KHR = VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16,
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16_KHR = VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16,
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16_KHR = VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16,
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16_KHR = VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16,
VK_FORMAT_R12X4_UNORM_PACK16_KHR = VK_FORMAT_R12X4_UNORM_PACK16,
VK_FORMAT_R12X4G12X4_UNORM_2PACK16_KHR = VK_FORMAT_R12X4G12X4_UNORM_2PACK16,
VK_FORMAT_R12X4G12X4B12X4A12X4_UNORM_4PACK16_KHR = VK_FORMAT_R12X4G12X4B12X4A12X4_UNORM_4PACK16,
VK_FORMAT_G12X4B12X4G12X4R12X4_422_UNORM_4PACK16_KHR = VK_FORMAT_G12X4B12X4G12X4R12X4_422_UNORM_4PACK16,
VK_FORMAT_B12X4G12X4R12X4G12X4_422_UNORM_4PACK16_KHR = VK_FORMAT_B12X4G12X4R12X4G12X4_422_UNORM_4PACK16,
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16_KHR = VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16,
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16_KHR = VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16,
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16_KHR = VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16,
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16_KHR = VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16,
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16_KHR = VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16,
VK_FORMAT_G16B16G16R16_422_UNORM_KHR = VK_FORMAT_G16B16G16R16_422_UNORM,
VK_FORMAT_B16G16R16G16_422_UNORM_KHR = VK_FORMAT_B16G16R16G16_422_UNORM,
VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM_KHR = VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM,
VK_FORMAT_G16_B16R16_2PLANE_420_UNORM_KHR = VK_FORMAT_G16_B16R16_2PLANE_420_UNORM,
VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM_KHR = VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM,
VK_FORMAT_G16_B16R16_2PLANE_422_UNORM_KHR = VK_FORMAT_G16_B16R16_2PLANE_422_UNORM,
VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM_KHR = VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM,
} VkFormat;
-
VK_FORMAT_UNDEFINED
specifies that the format is not specified. -
VK_FORMAT_R4G4_UNORM_PACK8
specifies a two-component, 8-bit packed unsigned normalized format that has a 4-bit R component in bits 4..7, and a 4-bit G component in bits 0..3. -
VK_FORMAT_R4G4B4A4_UNORM_PACK16
specifies a four-component, 16-bit packed unsigned normalized format that has a 4-bit R component in bits 12..15, a 4-bit G component in bits 8..11, a 4-bit B component in bits 4..7, and a 4-bit A component in bits 0..3. -
VK_FORMAT_B4G4R4A4_UNORM_PACK16
specifies a four-component, 16-bit packed unsigned normalized format that has a 4-bit B component in bits 12..15, a 4-bit G component in bits 8..11, a 4-bit R component in bits 4..7, and a 4-bit A component in bits 0..3. -
VK_FORMAT_R5G6B5_UNORM_PACK16
specifies a three-component, 16-bit packed unsigned normalized format that has a 5-bit R component in bits 11..15, a 6-bit G component in bits 5..10, and a 5-bit B component in bits 0..4. -
VK_FORMAT_B5G6R5_UNORM_PACK16
specifies a three-component, 16-bit packed unsigned normalized format that has a 5-bit B component in bits 11..15, a 6-bit G component in bits 5..10, and a 5-bit R component in bits 0..4. -
VK_FORMAT_R5G5B5A1_UNORM_PACK16
specifies a four-component, 16-bit packed unsigned normalized format that has a 5-bit R component in bits 11..15, a 5-bit G component in bits 6..10, a 5-bit B component in bits 1..5, and a 1-bit A component in bit 0. -
VK_FORMAT_B5G5R5A1_UNORM_PACK16
specifies a four-component, 16-bit packed unsigned normalized format that has a 5-bit B component in bits 11..15, a 5-bit G component in bits 6..10, a 5-bit R component in bits 1..5, and a 1-bit A component in bit 0. -
VK_FORMAT_A1R5G5B5_UNORM_PACK16
specifies a four-component, 16-bit packed unsigned normalized format that has a 1-bit A component in bit 15, a 5-bit R component in bits 10..14, a 5-bit G component in bits 5..9, and a 5-bit B component in bits 0..4. -
VK_FORMAT_R8_UNORM
specifies a one-component, 8-bit unsigned normalized format that has a single 8-bit R component. -
VK_FORMAT_R8_SNORM
specifies a one-component, 8-bit signed normalized format that has a single 8-bit R component. -
VK_FORMAT_R8_USCALED
specifies a one-component, 8-bit unsigned scaled integer format that has a single 8-bit R component. -
VK_FORMAT_R8_SSCALED
specifies a one-component, 8-bit signed scaled integer format that has a single 8-bit R component. -
VK_FORMAT_R8_UINT
specifies a one-component, 8-bit unsigned integer format that has a single 8-bit R component. -
VK_FORMAT_R8_SINT
specifies a one-component, 8-bit signed integer format that has a single 8-bit R component. -
VK_FORMAT_R8_SRGB
specifies a one-component, 8-bit unsigned normalized format that has a single 8-bit R component stored with sRGB nonlinear encoding. -
VK_FORMAT_R8G8_UNORM
specifies a two-component, 16-bit unsigned normalized format that has an 8-bit R component in byte 0, and an 8-bit G component in byte 1. -
VK_FORMAT_R8G8_SNORM
specifies a two-component, 16-bit signed normalized format that has an 8-bit R component in byte 0, and an 8-bit G component in byte 1. -
VK_FORMAT_R8G8_USCALED
specifies a two-component, 16-bit unsigned scaled integer format that has an 8-bit R component in byte 0, and an 8-bit G component in byte 1. -
VK_FORMAT_R8G8_SSCALED
specifies a two-component, 16-bit signed scaled integer format that has an 8-bit R component in byte 0, and an 8-bit G component in byte 1. -
VK_FORMAT_R8G8_UINT
specifies a two-component, 16-bit unsigned integer format that has an 8-bit R component in byte 0, and an 8-bit G component in byte 1. -
VK_FORMAT_R8G8_SINT
specifies a two-component, 16-bit signed integer format that has an 8-bit R component in byte 0, and an 8-bit G component in byte 1. -
VK_FORMAT_R8G8_SRGB
specifies a two-component, 16-bit unsigned normalized format that has an 8-bit R component stored with sRGB nonlinear encoding in byte 0, and an 8-bit G component stored with sRGB nonlinear encoding in byte 1. -
VK_FORMAT_R8G8B8_UNORM
specifies a three-component, 24-bit unsigned normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2. -
VK_FORMAT_R8G8B8_SNORM
specifies a three-component, 24-bit signed normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2. -
VK_FORMAT_R8G8B8_USCALED
specifies a three-component, 24-bit unsigned scaled format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2. -
VK_FORMAT_R8G8B8_SSCALED
specifies a three-component, 24-bit signed scaled format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2. -
VK_FORMAT_R8G8B8_UINT
specifies a three-component, 24-bit unsigned integer format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2. -
VK_FORMAT_R8G8B8_SINT
specifies a three-component, 24-bit signed integer format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2. -
VK_FORMAT_R8G8B8_SRGB
specifies a three-component, 24-bit unsigned normalized format that has an 8-bit R component stored with sRGB nonlinear encoding in byte 0, an 8-bit G component stored with sRGB nonlinear encoding in byte 1, and an 8-bit B component stored with sRGB nonlinear encoding in byte 2. -
VK_FORMAT_B8G8R8_UNORM
specifies a three-component, 24-bit unsigned normalized format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, and an 8-bit R component in byte 2. -
VK_FORMAT_B8G8R8_SNORM
specifies a three-component, 24-bit signed normalized format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, and an 8-bit R component in byte 2. -
VK_FORMAT_B8G8R8_USCALED
specifies a three-component, 24-bit unsigned scaled format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, and an 8-bit R component in byte 2. -
VK_FORMAT_B8G8R8_SSCALED
specifies a three-component, 24-bit signed scaled format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, and an 8-bit R component in byte 2. -
VK_FORMAT_B8G8R8_UINT
specifies a three-component, 24-bit unsigned integer format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, and an 8-bit R component in byte 2. -
VK_FORMAT_B8G8R8_SINT
specifies a three-component, 24-bit signed integer format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, and an 8-bit R component in byte 2. -
VK_FORMAT_B8G8R8_SRGB
specifies a three-component, 24-bit unsigned normalized format that has an 8-bit B component stored with sRGB nonlinear encoding in byte 0, an 8-bit G component stored with sRGB nonlinear encoding in byte 1, and an 8-bit R component stored with sRGB nonlinear encoding in byte 2. -
VK_FORMAT_R8G8B8A8_UNORM
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_R8G8B8A8_SNORM
specifies a four-component, 32-bit signed normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_R8G8B8A8_USCALED
specifies a four-component, 32-bit unsigned scaled format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_R8G8B8A8_SSCALED
specifies a four-component, 32-bit signed scaled format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_R8G8B8A8_UINT
specifies a four-component, 32-bit unsigned integer format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_R8G8B8A8_SINT
specifies a four-component, 32-bit signed integer format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_R8G8B8A8_SRGB
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit R component stored with sRGB nonlinear encoding in byte 0, an 8-bit G component stored with sRGB nonlinear encoding in byte 1, an 8-bit B component stored with sRGB nonlinear encoding in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_B8G8R8A8_UNORM
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, an 8-bit R component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_B8G8R8A8_SNORM
specifies a four-component, 32-bit signed normalized format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, an 8-bit R component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_B8G8R8A8_USCALED
specifies a four-component, 32-bit unsigned scaled format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, an 8-bit R component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_B8G8R8A8_SSCALED
specifies a four-component, 32-bit signed scaled format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, an 8-bit R component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_B8G8R8A8_UINT
specifies a four-component, 32-bit unsigned integer format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, an 8-bit R component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_B8G8R8A8_SINT
specifies a four-component, 32-bit signed integer format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, an 8-bit R component in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_B8G8R8A8_SRGB
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit B component stored with sRGB nonlinear encoding in byte 0, an 8-bit G component stored with sRGB nonlinear encoding in byte 1, an 8-bit R component stored with sRGB nonlinear encoding in byte 2, and an 8-bit A component in byte 3. -
VK_FORMAT_A8B8G8R8_UNORM_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has an 8-bit A component in bits 24..31, an 8-bit B component in bits 16..23, an 8-bit G component in bits 8..15, and an 8-bit R component in bits 0..7. -
VK_FORMAT_A8B8G8R8_SNORM_PACK32
specifies a four-component, 32-bit packed signed normalized format that has an 8-bit A component in bits 24..31, an 8-bit B component in bits 16..23, an 8-bit G component in bits 8..15, and an 8-bit R component in bits 0..7. -
VK_FORMAT_A8B8G8R8_USCALED_PACK32
specifies a four-component, 32-bit packed unsigned scaled integer format that has an 8-bit A component in bits 24..31, an 8-bit B component in bits 16..23, an 8-bit G component in bits 8..15, and an 8-bit R component in bits 0..7. -
VK_FORMAT_A8B8G8R8_SSCALED_PACK32
specifies a four-component, 32-bit packed signed scaled integer format that has an 8-bit A component in bits 24..31, an 8-bit B component in bits 16..23, an 8-bit G component in bits 8..15, and an 8-bit R component in bits 0..7. -
VK_FORMAT_A8B8G8R8_UINT_PACK32
specifies a four-component, 32-bit packed unsigned integer format that has an 8-bit A component in bits 24..31, an 8-bit B component in bits 16..23, an 8-bit G component in bits 8..15, and an 8-bit R component in bits 0..7. -
VK_FORMAT_A8B8G8R8_SINT_PACK32
specifies a four-component, 32-bit packed signed integer format that has an 8-bit A component in bits 24..31, an 8-bit B component in bits 16..23, an 8-bit G component in bits 8..15, and an 8-bit R component in bits 0..7. -
VK_FORMAT_A8B8G8R8_SRGB_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has an 8-bit A component in bits 24..31, an 8-bit B component stored with sRGB nonlinear encoding in bits 16..23, an 8-bit G component stored with sRGB nonlinear encoding in bits 8..15, and an 8-bit R component stored with sRGB nonlinear encoding in bits 0..7. -
VK_FORMAT_A2R10G10B10_UNORM_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has a 2-bit A component in bits 30..31, a 10-bit R component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit B component in bits 0..9. -
VK_FORMAT_A2R10G10B10_SNORM_PACK32
specifies a four-component, 32-bit packed signed normalized format that has a 2-bit A component in bits 30..31, a 10-bit R component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit B component in bits 0..9. -
VK_FORMAT_A2R10G10B10_USCALED_PACK32
specifies a four-component, 32-bit packed unsigned scaled integer format that has a 2-bit A component in bits 30..31, a 10-bit R component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit B component in bits 0..9. -
VK_FORMAT_A2R10G10B10_SSCALED_PACK32
specifies a four-component, 32-bit packed signed scaled integer format that has a 2-bit A component in bits 30..31, a 10-bit R component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit B component in bits 0..9. -
VK_FORMAT_A2R10G10B10_UINT_PACK32
specifies a four-component, 32-bit packed unsigned integer format that has a 2-bit A component in bits 30..31, a 10-bit R component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit B component in bits 0..9. -
VK_FORMAT_A2R10G10B10_SINT_PACK32
specifies a four-component, 32-bit packed signed integer format that has a 2-bit A component in bits 30..31, a 10-bit R component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit B component in bits 0..9. -
VK_FORMAT_A2B10G10R10_UNORM_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has a 2-bit A component in bits 30..31, a 10-bit B component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit R component in bits 0..9. -
VK_FORMAT_A2B10G10R10_SNORM_PACK32
specifies a four-component, 32-bit packed signed normalized format that has a 2-bit A component in bits 30..31, a 10-bit B component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit R component in bits 0..9. -
VK_FORMAT_A2B10G10R10_USCALED_PACK32
specifies a four-component, 32-bit packed unsigned scaled integer format that has a 2-bit A component in bits 30..31, a 10-bit B component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit R component in bits 0..9. -
VK_FORMAT_A2B10G10R10_SSCALED_PACK32
specifies a four-component, 32-bit packed signed scaled integer format that has a 2-bit A component in bits 30..31, a 10-bit B component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit R component in bits 0..9. -
VK_FORMAT_A2B10G10R10_UINT_PACK32
specifies a four-component, 32-bit packed unsigned integer format that has a 2-bit A component in bits 30..31, a 10-bit B component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit R component in bits 0..9. -
VK_FORMAT_A2B10G10R10_SINT_PACK32
specifies a four-component, 32-bit packed signed integer format that has a 2-bit A component in bits 30..31, a 10-bit B component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit R component in bits 0..9. -
VK_FORMAT_R16_UNORM
specifies a one-component, 16-bit unsigned normalized format that has a single 16-bit R component. -
VK_FORMAT_R16_SNORM
specifies a one-component, 16-bit signed normalized format that has a single 16-bit R component. -
VK_FORMAT_R16_USCALED
specifies a one-component, 16-bit unsigned scaled integer format that has a single 16-bit R component. -
VK_FORMAT_R16_SSCALED
specifies a one-component, 16-bit signed scaled integer format that has a single 16-bit R component. -
VK_FORMAT_R16_UINT
specifies a one-component, 16-bit unsigned integer format that has a single 16-bit R component. -
VK_FORMAT_R16_SINT
specifies a one-component, 16-bit signed integer format that has a single 16-bit R component. -
VK_FORMAT_R16_SFLOAT
specifies a one-component, 16-bit signed floating-point format that has a single 16-bit R component. -
VK_FORMAT_R16G16_UNORM
specifies a two-component, 32-bit unsigned normalized format that has a 16-bit R component in bytes 0..1, and a 16-bit G component in bytes 2..3. -
VK_FORMAT_R16G16_SNORM
specifies a two-component, 32-bit signed normalized format that has a 16-bit R component in bytes 0..1, and a 16-bit G component in bytes 2..3. -
VK_FORMAT_R16G16_USCALED
specifies a two-component, 32-bit unsigned scaled integer format that has a 16-bit R component in bytes 0..1, and a 16-bit G component in bytes 2..3. -
VK_FORMAT_R16G16_SSCALED
specifies a two-component, 32-bit signed scaled integer format that has a 16-bit R component in bytes 0..1, and a 16-bit G component in bytes 2..3. -
VK_FORMAT_R16G16_UINT
specifies a two-component, 32-bit unsigned integer format that has a 16-bit R component in bytes 0..1, and a 16-bit G component in bytes 2..3. -
VK_FORMAT_R16G16_SINT
specifies a two-component, 32-bit signed integer format that has a 16-bit R component in bytes 0..1, and a 16-bit G component in bytes 2..3. -
VK_FORMAT_R16G16_SFLOAT
specifies a two-component, 32-bit signed floating-point format that has a 16-bit R component in bytes 0..1, and a 16-bit G component in bytes 2..3. -
VK_FORMAT_R16G16B16_UNORM
specifies a three-component, 48-bit unsigned normalized format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, and a 16-bit B component in bytes 4..5. -
VK_FORMAT_R16G16B16_SNORM
specifies a three-component, 48-bit signed normalized format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, and a 16-bit B component in bytes 4..5. -
VK_FORMAT_R16G16B16_USCALED
specifies a three-component, 48-bit unsigned scaled integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, and a 16-bit B component in bytes 4..5. -
VK_FORMAT_R16G16B16_SSCALED
specifies a three-component, 48-bit signed scaled integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, and a 16-bit B component in bytes 4..5. -
VK_FORMAT_R16G16B16_UINT
specifies a three-component, 48-bit unsigned integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, and a 16-bit B component in bytes 4..5. -
VK_FORMAT_R16G16B16_SINT
specifies a three-component, 48-bit signed integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, and a 16-bit B component in bytes 4..5. -
VK_FORMAT_R16G16B16_SFLOAT
specifies a three-component, 48-bit signed floating-point format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, and a 16-bit B component in bytes 4..5. -
VK_FORMAT_R16G16B16A16_UNORM
specifies a four-component, 64-bit unsigned normalized format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7. -
VK_FORMAT_R16G16B16A16_SNORM
specifies a four-component, 64-bit signed normalized format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7. -
VK_FORMAT_R16G16B16A16_USCALED
specifies a four-component, 64-bit unsigned scaled integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7. -
VK_FORMAT_R16G16B16A16_SSCALED
specifies a four-component, 64-bit signed scaled integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7. -
VK_FORMAT_R16G16B16A16_UINT
specifies a four-component, 64-bit unsigned integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7. -
VK_FORMAT_R16G16B16A16_SINT
specifies a four-component, 64-bit signed integer format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7. -
VK_FORMAT_R16G16B16A16_SFLOAT
specifies a four-component, 64-bit signed floating-point format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7. -
VK_FORMAT_R32_UINT
specifies a one-component, 32-bit unsigned integer format that has a single 32-bit R component. -
VK_FORMAT_R32_SINT
specifies a one-component, 32-bit signed integer format that has a single 32-bit R component. -
VK_FORMAT_R32_SFLOAT
specifies a one-component, 32-bit signed floating-point format that has a single 32-bit R component. -
VK_FORMAT_R32G32_UINT
specifies a two-component, 64-bit unsigned integer format that has a 32-bit R component in bytes 0..3, and a 32-bit G component in bytes 4..7. -
VK_FORMAT_R32G32_SINT
specifies a two-component, 64-bit signed integer format that has a 32-bit R component in bytes 0..3, and a 32-bit G component in bytes 4..7. -
VK_FORMAT_R32G32_SFLOAT
specifies a two-component, 64-bit signed floating-point format that has a 32-bit R component in bytes 0..3, and a 32-bit G component in bytes 4..7. -
VK_FORMAT_R32G32B32_UINT
specifies a three-component, 96-bit unsigned integer format that has a 32-bit R component in bytes 0..3, a 32-bit G component in bytes 4..7, and a 32-bit B component in bytes 8..11. -
VK_FORMAT_R32G32B32_SINT
specifies a three-component, 96-bit signed integer format that has a 32-bit R component in bytes 0..3, a 32-bit G component in bytes 4..7, and a 32-bit B component in bytes 8..11. -
VK_FORMAT_R32G32B32_SFLOAT
specifies a three-component, 96-bit signed floating-point format that has a 32-bit R component in bytes 0..3, a 32-bit G component in bytes 4..7, and a 32-bit B component in bytes 8..11. -
VK_FORMAT_R32G32B32A32_UINT
specifies a four-component, 128-bit unsigned integer format that has a 32-bit R component in bytes 0..3, a 32-bit G component in bytes 4..7, a 32-bit B component in bytes 8..11, and a 32-bit A component in bytes 12..15. -
VK_FORMAT_R32G32B32A32_SINT
specifies a four-component, 128-bit signed integer format that has a 32-bit R component in bytes 0..3, a 32-bit G component in bytes 4..7, a 32-bit B component in bytes 8..11, and a 32-bit A component in bytes 12..15. -
VK_FORMAT_R32G32B32A32_SFLOAT
specifies a four-component, 128-bit signed floating-point format that has a 32-bit R component in bytes 0..3, a 32-bit G component in bytes 4..7, a 32-bit B component in bytes 8..11, and a 32-bit A component in bytes 12..15. -
VK_FORMAT_R64_UINT
specifies a one-component, 64-bit unsigned integer format that has a single 64-bit R component. -
VK_FORMAT_R64_SINT
specifies a one-component, 64-bit signed integer format that has a single 64-bit R component. -
VK_FORMAT_R64_SFLOAT
specifies a one-component, 64-bit signed floating-point format that has a single 64-bit R component. -
VK_FORMAT_R64G64_UINT
specifies a two-component, 128-bit unsigned integer format that has a 64-bit R component in bytes 0..7, and a 64-bit G component in bytes 8..15. -
VK_FORMAT_R64G64_SINT
specifies a two-component, 128-bit signed integer format that has a 64-bit R component in bytes 0..7, and a 64-bit G component in bytes 8..15. -
VK_FORMAT_R64G64_SFLOAT
specifies a two-component, 128-bit signed floating-point format that has a 64-bit R component in bytes 0..7, and a 64-bit G component in bytes 8..15. -
VK_FORMAT_R64G64B64_UINT
specifies a three-component, 192-bit unsigned integer format that has a 64-bit R component in bytes 0..7, a 64-bit G component in bytes 8..15, and a 64-bit B component in bytes 16..23. -
VK_FORMAT_R64G64B64_SINT
specifies a three-component, 192-bit signed integer format that has a 64-bit R component in bytes 0..7, a 64-bit G component in bytes 8..15, and a 64-bit B component in bytes 16..23. -
VK_FORMAT_R64G64B64_SFLOAT
specifies a three-component, 192-bit signed floating-point format that has a 64-bit R component in bytes 0..7, a 64-bit G component in bytes 8..15, and a 64-bit B component in bytes 16..23. -
VK_FORMAT_R64G64B64A64_UINT
specifies a four-component, 256-bit unsigned integer format that has a 64-bit R component in bytes 0..7, a 64-bit G component in bytes 8..15, a 64-bit B component in bytes 16..23, and a 64-bit A component in bytes 24..31. -
VK_FORMAT_R64G64B64A64_SINT
specifies a four-component, 256-bit signed integer format that has a 64-bit R component in bytes 0..7, a 64-bit G component in bytes 8..15, a 64-bit B component in bytes 16..23, and a 64-bit A component in bytes 24..31. -
VK_FORMAT_R64G64B64A64_SFLOAT
specifies a four-component, 256-bit signed floating-point format that has a 64-bit R component in bytes 0..7, a 64-bit G component in bytes 8..15, a 64-bit B component in bytes 16..23, and a 64-bit A component in bytes 24..31. -
VK_FORMAT_B10G11R11_UFLOAT_PACK32
specifies a three-component, 32-bit packed unsigned floating-point format that has a 10-bit B component in bits 22..31, an 11-bit G component in bits 11..21, an 11-bit R component in bits 0..10. See Unsigned 10-Bit Floating-Point Numbers and Unsigned 11-Bit Floating-Point Numbers. -
VK_FORMAT_E5B9G9R9_UFLOAT_PACK32
specifies a three-component, 32-bit packed unsigned floating-point format that has a 5-bit shared exponent in bits 27..31, a 9-bit B component mantissa in bits 18..26, a 9-bit G component mantissa in bits 9..17, and a 9-bit R component mantissa in bits 0..8. -
VK_FORMAT_D16_UNORM
specifies a one-component, 16-bit unsigned normalized format that has a single 16-bit depth component. -
VK_FORMAT_X8_D24_UNORM_PACK32
specifies a two-component, 32-bit format that has 24 unsigned normalized bits in the depth component and, optionally, 8 bits that are unused. -
VK_FORMAT_D32_SFLOAT
specifies a one-component, 32-bit signed floating-point format that has 32-bits in the depth component. -
VK_FORMAT_S8_UINT
specifies a one-component, 8-bit unsigned integer format that has 8-bits in the stencil component. -
VK_FORMAT_D16_UNORM_S8_UINT
specifies a two-component, 24-bit format that has 16 unsigned normalized bits in the depth component and 8 unsigned integer bits in the stencil component. -
VK_FORMAT_D24_UNORM_S8_UINT
specifies a two-component, 32-bit packed format that has 8 unsigned integer bits in the stencil component, and 24 unsigned normalized bits in the depth component. -
VK_FORMAT_D32_SFLOAT_S8_UINT
specifies a two-component format that has 32 signed float bits in the depth component and 8 unsigned integer bits in the stencil component. There are, optionally, 24 bits that are unused. -
VK_FORMAT_BC1_RGB_UNORM_BLOCK
specifies a three-component, block-compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data. This format has no alpha and is considered opaque. -
VK_FORMAT_BC1_RGB_SRGB_BLOCK
specifies a three-component, block-compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data with sRGB nonlinear encoding. This format has no alpha and is considered opaque. -
VK_FORMAT_BC1_RGBA_UNORM_BLOCK
specifies a four-component, block-compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data, and provides 1 bit of alpha. -
VK_FORMAT_BC1_RGBA_SRGB_BLOCK
specifies a four-component, block-compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data with sRGB nonlinear encoding, and provides 1 bit of alpha. -
VK_FORMAT_BC2_UNORM_BLOCK
specifies a four-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with the first 64 bits encoding alpha values followed by 64 bits encoding RGB values. -
VK_FORMAT_BC2_SRGB_BLOCK
specifies a four-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with the first 64 bits encoding alpha values followed by 64 bits encoding RGB values with sRGB nonlinear encoding. -
VK_FORMAT_BC3_UNORM_BLOCK
specifies a four-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with the first 64 bits encoding alpha values followed by 64 bits encoding RGB values. -
VK_FORMAT_BC3_SRGB_BLOCK
specifies a four-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with the first 64 bits encoding alpha values followed by 64 bits encoding RGB values with sRGB nonlinear encoding. -
VK_FORMAT_BC4_UNORM_BLOCK
specifies a one-component, block-compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized red texel data. -
VK_FORMAT_BC4_SNORM_BLOCK
specifies a one-component, block-compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of signed normalized red texel data. -
VK_FORMAT_BC5_UNORM_BLOCK
specifies a two-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RG texel data with the first 64 bits encoding red values followed by 64 bits encoding green values. -
VK_FORMAT_BC5_SNORM_BLOCK
specifies a two-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of signed normalized RG texel data with the first 64 bits encoding red values followed by 64 bits encoding green values. -
VK_FORMAT_BC6H_UFLOAT_BLOCK
specifies a three-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned floating-point RGB texel data. -
VK_FORMAT_BC6H_SFLOAT_BLOCK
specifies a three-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of signed floating-point RGB texel data. -
VK_FORMAT_BC7_UNORM_BLOCK
specifies a four-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_BC7_SRGB_BLOCK
specifies a four-component, block-compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ETC2_R8G8B8_UNORM_BLOCK
specifies a three-component, ETC2 compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data. This format has no alpha and is considered opaque. -
VK_FORMAT_ETC2_R8G8B8_SRGB_BLOCK
specifies a three-component, ETC2 compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data with sRGB nonlinear encoding. This format has no alpha and is considered opaque. -
VK_FORMAT_ETC2_R8G8B8A1_UNORM_BLOCK
specifies a four-component, ETC2 compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data, and provides 1 bit of alpha. -
VK_FORMAT_ETC2_R8G8B8A1_SRGB_BLOCK
specifies a four-component, ETC2 compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGB texel data with sRGB nonlinear encoding, and provides 1 bit of alpha. -
VK_FORMAT_ETC2_R8G8B8A8_UNORM_BLOCK
specifies a four-component, ETC2 compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with the first 64 bits encoding alpha values followed by 64 bits encoding RGB values. -
VK_FORMAT_ETC2_R8G8B8A8_SRGB_BLOCK
specifies a four-component, ETC2 compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with the first 64 bits encoding alpha values followed by 64 bits encoding RGB values with sRGB nonlinear encoding applied. -
VK_FORMAT_EAC_R11_UNORM_BLOCK
specifies a one-component, ETC2 compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized red texel data. -
VK_FORMAT_EAC_R11_SNORM_BLOCK
specifies a one-component, ETC2 compressed format where each 64-bit compressed texel block encodes a 4×4 rectangle of signed normalized red texel data. -
VK_FORMAT_EAC_R11G11_UNORM_BLOCK
specifies a two-component, ETC2 compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RG texel data with the first 64 bits encoding red values followed by 64 bits encoding green values. -
VK_FORMAT_EAC_R11G11_SNORM_BLOCK
specifies a two-component, ETC2 compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of signed normalized RG texel data with the first 64 bits encoding red values followed by 64 bits encoding green values. -
VK_FORMAT_ASTC_4x4_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_4x4_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 4×4 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_5x4_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 5×4 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_5x4_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 5×4 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_5x5_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 5×5 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_5x5_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 5×5 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_6x5_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 6×5 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_6x5_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 6×5 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_6x6_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 6×6 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_6x6_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 6×6 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_8x5_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes an 8×5 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_8x5_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes an 8×5 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_8x6_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes an 8×6 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_8x6_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes an 8×6 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_8x8_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes an 8×8 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_8x8_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes an 8×8 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_10x5_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×5 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_10x5_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×5 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_10x6_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×6 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_10x6_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×6 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_10x8_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×8 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_10x8_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×8 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_10x10_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×10 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_10x10_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 10×10 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_12x10_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 12×10 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_12x10_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 12×10 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_ASTC_12x12_UNORM_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 12×12 rectangle of unsigned normalized RGBA texel data. -
VK_FORMAT_ASTC_12x12_SRGB_BLOCK
specifies a four-component, ASTC compressed format where each 128-bit compressed texel block encodes a 12×12 rectangle of unsigned normalized RGBA texel data with sRGB nonlinear encoding applied to the RGB components. -
VK_FORMAT_G8B8G8R8_422_UNORM
specifies a four-component, 32-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has an 8-bit G component for the even i coordinate in byte 0, an 8-bit B component in byte 1, an 8-bit G component for the odd i coordinate in byte 2, and an 8-bit R component in byte 3. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_B8G8R8G8_422_UNORM
specifies a four-component, 32-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has an 8-bit B component in byte 0, an 8-bit G component for the even i coordinate in byte 1, an 8-bit R component in byte 2, and an 8-bit G component for the odd i coordinate in byte 3. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit G component in plane 0, an 8-bit B component in plane 1, and an 8-bit R component in plane 2. The horizontal and vertical dimensions of the R and B planes are halved relative to the image dimensions, and each R and B component is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit G component in plane 0, and a two-component, 16-bit BR plane 1 consisting of an 8-bit B component in byte 0 and an 8-bit R component in byte 1. The horizontal and vertical dimensions of the BR plane are halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit G component in plane 0, an 8-bit B component in plane 1, and an 8-bit R component in plane 2. The horizontal dimension of the R and B planes is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G8_B8R8_2PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit G component in plane 0, and a two-component, 16-bit BR plane 1 consisting of an 8-bit B component in byte 0 and an 8-bit R component in byte 1. The horizontal dimension of the BR plane is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit G component in plane 0, an 8-bit B component in plane 1, and an 8-bit R component in plane 2. Each plane has the same dimensions and each R, G and B component contributes to a single texel. The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. -
VK_FORMAT_R10X6_UNORM_PACK16
specifies a one-component, 16-bit unsigned normalized format that has a single 10-bit R component in the top 10 bits of a 16-bit word, with the bottom 6 bits set to 0. -
VK_FORMAT_R10X6G10X6_UNORM_2PACK16
specifies a two-component, 32-bit unsigned normalized format that has a 10-bit R component in the top 10 bits of the word in bytes 0..1, and a 10-bit G component in the top 10 bits of the word in bytes 2..3, with the bottom 6 bits of each word set to 0. -
VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16
specifies a four-component, 64-bit unsigned normalized format that has a 10-bit R component in the top 10 bits of the word in bytes 0..1, a 10-bit G component in the top 10 bits of the word in bytes 2..3, a 10-bit B component in the top 10 bits of the word in bytes 4..5, and a 10-bit A component in the top 10 bits of the word in bytes 6..7, with the bottom 6 bits of each word set to 0. -
VK_FORMAT_G10X6B10X6G10X6R10X6_422_UNORM_4PACK16
specifies a four-component, 64-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has a 10-bit G component for the even i coordinate in the top 10 bits of the word in bytes 0..1, a 10-bit B component in the top 10 bits of the word in bytes 2..3, a 10-bit G component for the odd i coordinate in the top 10 bits of the word in bytes 4..5, and a 10-bit R component in the top 10 bits of the word in bytes 6..7, with the bottom 6 bits of each word set to 0. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_B10X6G10X6R10X6G10X6_422_UNORM_4PACK16
specifies a four-component, 64-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has a 10-bit B component in the top 10 bits of the word in bytes 0..1, a 10-bit G component for the even i coordinate in the top 10 bits of the word in bytes 2..3, a 10-bit R component in the top 10 bits of the word in bytes 4..5, and a 10-bit G component for the odd i coordinate in the top 10 bits of the word in bytes 6..7, with the bottom 6 bits of each word set to 0. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 10-bit G component in the top 10 bits of each 16-bit word of plane 0, a 10-bit B component in the top 10 bits of each 16-bit word of plane 1, and a 10-bit R component in the top 10 bits of each 16-bit word of plane 2, with the bottom 6 bits of each word set to 0. The horizontal and vertical dimensions of the R and B planes are halved relative to the image dimensions, and each R and B component is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 10-bit G component in the top 10 bits of each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 10-bit B component in the top 10 bits of the word in bytes 0..1, and a 10-bit R component in the top 10 bits of the word in bytes 2..3, with the bottom 6 bits of each word set to 0. The horizontal and vertical dimensions of the BR plane are halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 10-bit G component in the top 10 bits of each 16-bit word of plane 0, a 10-bit B component in the top 10 bits of each 16-bit word of plane 1, and a 10-bit R component in the top 10 bits of each 16-bit word of plane 2, with the bottom 6 bits of each word set to 0. The horizontal dimension of the R and B planes is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 10-bit G component in the top 10 bits of each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 10-bit B component in the top 10 bits of the word in bytes 0..1, and a 10-bit R component in the top 10 bits of the word in bytes 2..3, with the bottom 6 bits of each word set to 0. The horizontal dimension of the BR plane is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 10-bit G component in the top 10 bits of each 16-bit word of plane 0, a 10-bit B component in the top 10 bits of each 16-bit word of plane 1, and a 10-bit R component in the top 10 bits of each 16-bit word of plane 2, with the bottom 6 bits of each word set to 0. Each plane has the same dimensions and each R, G and B component contributes to a single texel. The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. -
VK_FORMAT_R12X4_UNORM_PACK16
specifies a one-component, 16-bit unsigned normalized format that has a single 12-bit R component in the top 12 bits of a 16-bit word, with the bottom 4 bits set to 0. -
VK_FORMAT_R12X4G12X4_UNORM_2PACK16
specifies a two-component, 32-bit unsigned normalized format that has a 12-bit R component in the top 12 bits of the word in bytes 0..1, and a 12-bit G component in the top 12 bits of the word in bytes 2..3, with the bottom 4 bits of each word set to 0. -
VK_FORMAT_R12X4G12X4B12X4A12X4_UNORM_4PACK16
specifies a four-component, 64-bit unsigned normalized format that has a 12-bit R component in the top 12 bits of the word in bytes 0..1, a 12-bit G component in the top 12 bits of the word in bytes 2..3, a 12-bit B component in the top 12 bits of the word in bytes 4..5, and a 12-bit A component in the top 12 bits of the word in bytes 6..7, with the bottom 4 bits of each word set to 0. -
VK_FORMAT_G12X4B12X4G12X4R12X4_422_UNORM_4PACK16
specifies a four-component, 64-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has a 12-bit G component for the even i coordinate in the top 12 bits of the word in bytes 0..1, a 12-bit B component in the top 12 bits of the word in bytes 2..3, a 12-bit G component for the odd i coordinate in the top 12 bits of the word in bytes 4..5, and a 12-bit R component in the top 12 bits of the word in bytes 6..7, with the bottom 4 bits of each word set to 0. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_B12X4G12X4R12X4G12X4_422_UNORM_4PACK16
specifies a four-component, 64-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has a 12-bit B component in the top 12 bits of the word in bytes 0..1, a 12-bit G component for the even i coordinate in the top 12 bits of the word in bytes 2..3, a 12-bit R component in the top 12 bits of the word in bytes 4..5, and a 12-bit G component for the odd i coordinate in the top 12 bits of the word in bytes 6..7, with the bottom 4 bits of each word set to 0. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 12-bit G component in the top 12 bits of each 16-bit word of plane 0, a 12-bit B component in the top 12 bits of each 16-bit word of plane 1, and a 12-bit R component in the top 12 bits of each 16-bit word of plane 2, with the bottom 4 bits of each word set to 0. The horizontal and vertical dimensions of the R and B planes are halved relative to the image dimensions, and each R and B component is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 12-bit G component in the top 12 bits of each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 12-bit B component in the top 12 bits of the word in bytes 0..1, and a 12-bit R component in the top 12 bits of the word in bytes 2..3, with the bottom 4 bits of each word set to 0. The horizontal and vertical dimensions of the BR plane are halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 12-bit G component in the top 12 bits of each 16-bit word of plane 0, a 12-bit B component in the top 12 bits of each 16-bit word of plane 1, and a 12-bit R component in the top 12 bits of each 16-bit word of plane 2, with the bottom 4 bits of each word set to 0. The horizontal dimension of the R and B planes is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 12-bit G component in the top 12 bits of each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 12-bit B component in the top 12 bits of the word in bytes 0..1, and a 12-bit R component in the top 12 bits of the word in bytes 2..3, with the bottom 4 bits of each word set to 0. The horizontal dimension of the BR plane is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16
specifies an unsigned normalized multi-planar format that has a 12-bit G component in the top 12 bits of each 16-bit word of plane 0, a 12-bit B component in the top 12 bits of each 16-bit word of plane 1, and a 12-bit R component in the top 12 bits of each 16-bit word of plane 2, with the bottom 4 bits of each word set to 0. Each plane has the same dimensions and each R, G and B component contributes to a single texel. The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. -
VK_FORMAT_G16B16G16R16_422_UNORM
specifies a four-component, 64-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has a 16-bit G component for the even i coordinate in the word in bytes 0..1, a 16-bit B component in the word in bytes 2..3, a 16-bit G component for the odd i coordinate in the word in bytes 4..5, and a 16-bit R component in the word in bytes 6..7. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_B16G16R16G16_422_UNORM
specifies a four-component, 64-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has a 16-bit B component in the word in bytes 0..1, a 16-bit G component for the even i coordinate in the word in bytes 2..3, a 16-bit R component in the word in bytes 4..5, and a 16-bit G component for the odd i coordinate in the word in bytes 6..7. Images in this format must be defined with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block. -
VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit G component in each 16-bit word of plane 0, a 16-bit B component in each 16-bit word of plane 1, and a 16-bit R component in each 16-bit word of plane 2. The horizontal and vertical dimensions of the R and B planes are halved relative to the image dimensions, and each R and B component is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G16_B16R16_2PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit G component in each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 16-bit B component in the word in bytes 0..1, and a 16-bit R component in the word in bytes 2..3. The horizontal and vertical dimensions of the BR plane are halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\) and \(\lfloor j_G \times 0.5 \rfloor = j_B = j_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width and height that is a multiple of two. -
VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit G component in each 16-bit word of plane 0, a 16-bit B component in each 16-bit word of plane 1, and a 16-bit R component in each 16-bit word of plane 2. The horizontal dimension of the R and B planes is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G16_B16R16_2PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit G component in each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 16-bit B component in the word in bytes 0..1, and a 16-bit R component in the word in bytes 2..3. The horizontal dimension of the BR plane is halved relative to the image dimensions, and each R and B value is shared with the G components for which \(\lfloor i_G \times 0.5 \rfloor = i_B = i_R\). The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. Images in this format must be defined with a width that is a multiple of two. -
VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit G component in each 16-bit word of plane 0, a 16-bit B component in each 16-bit word of plane 1, and a 16-bit R component in each 16-bit word of plane 2. Each plane has the same dimensions and each R, G and B component contributes to a single texel. The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, VK_IMAGE_ASPECT_PLANE_1_BIT for the B plane, and VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane.
Compatible formats of planes of multi-planar formats
Individual planes of multi-planar formats are compatible with single-plane formats if they occupy the same number of bits per texel block. In the following table, individual planes of a multi-planar format are compatible with the format listed against the relevant plane index for that multi-planar format.
Plane | Compatible format for plane | Width relative to the width w of the plane with the largest dimensions | Height relative to the height h of the plane with the largest dimensions |
---|---|---|---|
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM | | | |
0 | VK_FORMAT_R8_UNORM | w | h |
1 | VK_FORMAT_R8_UNORM | w/2 | h/2 |
2 | VK_FORMAT_R8_UNORM | w/2 | h/2 |
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM | | | |
0 | VK_FORMAT_R8_UNORM | w | h |
1 | VK_FORMAT_R8G8_UNORM | w/2 | h/2 |
VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM | | | |
0 | VK_FORMAT_R8_UNORM | w | h |
1 | VK_FORMAT_R8_UNORM | w/2 | h |
2 | VK_FORMAT_R8_UNORM | w/2 | h |
VK_FORMAT_G8_B8R8_2PLANE_422_UNORM | | | |
0 | VK_FORMAT_R8_UNORM | w | h |
1 | VK_FORMAT_R8G8_UNORM | w/2 | h |
VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM | | | |
0 | VK_FORMAT_R8_UNORM | w | h |
1 | VK_FORMAT_R8_UNORM | w | h |
2 | VK_FORMAT_R8_UNORM | w | h |
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R10X6_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R10X6_UNORM_PACK16 | w/2 | h/2 |
2 | VK_FORMAT_R10X6_UNORM_PACK16 | w/2 | h/2 |
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R10X6_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R10X6G10X6_UNORM_2PACK16 | w/2 | h/2 |
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R10X6_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R10X6_UNORM_PACK16 | w/2 | h |
2 | VK_FORMAT_R10X6_UNORM_PACK16 | w/2 | h |
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R10X6_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R10X6G10X6_UNORM_2PACK16 | w/2 | h |
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R10X6_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R10X6_UNORM_PACK16 | w | h |
2 | VK_FORMAT_R10X6_UNORM_PACK16 | w | h |
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R12X4_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R12X4_UNORM_PACK16 | w/2 | h/2 |
2 | VK_FORMAT_R12X4_UNORM_PACK16 | w/2 | h/2 |
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R12X4_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R12X4G12X4_UNORM_2PACK16 | w/2 | h/2 |
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R12X4_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R12X4_UNORM_PACK16 | w/2 | h |
2 | VK_FORMAT_R12X4_UNORM_PACK16 | w/2 | h |
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R12X4_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R12X4G12X4_UNORM_2PACK16 | w/2 | h |
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16 | | | |
0 | VK_FORMAT_R12X4_UNORM_PACK16 | w | h |
1 | VK_FORMAT_R12X4_UNORM_PACK16 | w | h |
2 | VK_FORMAT_R12X4_UNORM_PACK16 | w | h |
VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM | | | |
0 | VK_FORMAT_R16_UNORM | w | h |
1 | VK_FORMAT_R16_UNORM | w/2 | h/2 |
2 | VK_FORMAT_R16_UNORM | w/2 | h/2 |
VK_FORMAT_G16_B16R16_2PLANE_420_UNORM | | | |
0 | VK_FORMAT_R16_UNORM | w | h |
1 | VK_FORMAT_R16G16_UNORM | w/2 | h/2 |
VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM | | | |
0 | VK_FORMAT_R16_UNORM | w | h |
1 | VK_FORMAT_R16_UNORM | w/2 | h |
2 | VK_FORMAT_R16_UNORM | w/2 | h |
VK_FORMAT_G16_B16R16_2PLANE_422_UNORM | | | |
0 | VK_FORMAT_R16_UNORM | w | h |
1 | VK_FORMAT_R16G16_UNORM | w/2 | h |
VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM | | | |
0 | VK_FORMAT_R16_UNORM | w | h |
1 | VK_FORMAT_R16_UNORM | w | h |
2 | VK_FORMAT_R16_UNORM | w | h |
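As an informal illustration of how these per-plane aspects are used, the following minimal C sketch queries the layout of each plane of a linearly tiled image. The `device` and `image` handles are assumed to have been created elsewhere, with the image using VK_FORMAT_G8_B8R8_2PLANE_420_UNORM:

```c
#include <stdio.h>
#include <vulkan/vulkan.h>

/* Query the layout of each plane of a linearly tiled image created in
 * VK_FORMAT_G8_B8R8_2PLANE_420_UNORM.  `device` and `image` are assumed
 * to have been created elsewhere. */
static void print_plane_layouts(VkDevice device, VkImage image)
{
    const VkImageAspectFlagBits planeAspects[2] = {
        VK_IMAGE_ASPECT_PLANE_0_BIT,   /* G plane  */
        VK_IMAGE_ASPECT_PLANE_1_BIT,   /* BR plane */
    };

    for (uint32_t p = 0; p < 2; ++p) {
        VkImageSubresource subresource = {
            .aspectMask = planeAspects[p],
            .mipLevel   = 0,
            .arrayLayer = 0,
        };
        VkSubresourceLayout layout;
        vkGetImageSubresourceLayout(device, image, &subresource, &layout);

        printf("plane %u: offset %llu, rowPitch %llu\n", p,
               (unsigned long long)layout.offset,
               (unsigned long long)layout.rowPitch);
    }
}
```

A three-plane format would additionally use VK_IMAGE_ASPECT_PLANE_2_BIT for the R plane.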
Packed Formats
For the purposes of address alignment when accessing buffer memory containing vertex attribute or texel data, the following formats are considered packed: whole texels or attributes are stored in bitfields of a single 8-, 16-, or 32-bit fundamental data type. A usage sketch follows the list.
- Packed into 8-bit data types:
  - VK_FORMAT_R4G4_UNORM_PACK8
- Packed into 16-bit data types:
  - VK_FORMAT_R4G4B4A4_UNORM_PACK16
  - VK_FORMAT_B4G4R4A4_UNORM_PACK16
  - VK_FORMAT_R5G6B5_UNORM_PACK16
  - VK_FORMAT_B5G6R5_UNORM_PACK16
  - VK_FORMAT_R5G5B5A1_UNORM_PACK16
  - VK_FORMAT_B5G5R5A1_UNORM_PACK16
  - VK_FORMAT_A1R5G5B5_UNORM_PACK16
- Packed into 32-bit data types:
  - VK_FORMAT_A8B8G8R8_UNORM_PACK32
  - VK_FORMAT_A8B8G8R8_SNORM_PACK32
  - VK_FORMAT_A8B8G8R8_USCALED_PACK32
  - VK_FORMAT_A8B8G8R8_SSCALED_PACK32
  - VK_FORMAT_A8B8G8R8_UINT_PACK32
  - VK_FORMAT_A8B8G8R8_SINT_PACK32
  - VK_FORMAT_A8B8G8R8_SRGB_PACK32
  - VK_FORMAT_A2R10G10B10_UNORM_PACK32
  - VK_FORMAT_A2R10G10B10_SNORM_PACK32
  - VK_FORMAT_A2R10G10B10_USCALED_PACK32
  - VK_FORMAT_A2R10G10B10_SSCALED_PACK32
  - VK_FORMAT_A2R10G10B10_UINT_PACK32
  - VK_FORMAT_A2R10G10B10_SINT_PACK32
  - VK_FORMAT_A2B10G10R10_UNORM_PACK32
  - VK_FORMAT_A2B10G10R10_SNORM_PACK32
  - VK_FORMAT_A2B10G10R10_USCALED_PACK32
  - VK_FORMAT_A2B10G10R10_SSCALED_PACK32
  - VK_FORMAT_A2B10G10R10_UINT_PACK32
  - VK_FORMAT_A2B10G10R10_SINT_PACK32
  - VK_FORMAT_B10G11R11_UFLOAT_PACK32
  - VK_FORMAT_E5B9G9R9_UFLOAT_PACK32
  - VK_FORMAT_X8_D24_UNORM_PACK32
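For example, here is a minimal sketch of a packed format used as a vertex attribute; the Vertex structure and attribute locations are illustrative, not defined by this specification. Because VK_FORMAT_A2B10G10R10_SNORM_PACK32 occupies a single 32-bit word, its offset is aligned to the size of that underlying type rather than to an individual component size:

```c
#include <stddef.h>
#include <stdint.h>
#include <vulkan/vulkan.h>

/* Illustrative vertex layout: the packed normal occupies one 32-bit word,
 * so its offset within the vertex is aligned to the 4-byte underlying type
 * rather than to the size of an individual 10- or 2-bit component. */
typedef struct Vertex {
    float    position[3];  /* VK_FORMAT_R32G32B32_SFLOAT, offset 0          */
    uint32_t normal;       /* VK_FORMAT_A2B10G10R10_SNORM_PACK32, offset 12 */
} Vertex;

static const VkVertexInputBindingDescription binding = {
    .binding   = 0,
    .stride    = sizeof(Vertex),
    .inputRate = VK_VERTEX_INPUT_RATE_VERTEX,
};

static const VkVertexInputAttributeDescription attributes[2] = {
    { .location = 0, .binding = 0,
      .format = VK_FORMAT_R32G32B32_SFLOAT,
      .offset = offsetof(Vertex, position) },
    { .location = 1, .binding = 0,
      .format = VK_FORMAT_A2B10G10R10_SNORM_PACK32,
      .offset = offsetof(Vertex, normal) },
};
```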
Identification of Formats
A “format” is represented by a single enum value. The name of a format is usually built up by using the following pattern:
VK_FORMAT_{component-format|compression-scheme}_{numeric-format}
The component-format indicates either the size of the R, G, B, and A components (if they are present) in the case of a color format, or the size of the depth (D) and stencil (S) components (if they are present) in the case of a depth/stencil format (see below). An X indicates a component that is unused, but may be present for padding.
Numeric format | Description |
---|---|
UNORM | The components are unsigned normalized values in the range \([0,1]\) |
SNORM | The components are signed normalized values in the range \([-1,1]\) |
USCALED | The components are unsigned integer values that get converted to floating-point in the range \([0,2^n-1]\) |
SSCALED | The components are signed integer values that get converted to floating-point in the range \([-2^{n-1},2^{n-1}-1]\) |
UINT | The components are unsigned integer values in the range \([0,2^n-1]\) |
SINT | The components are signed integer values in the range \([-2^{n-1},2^{n-1}-1]\) |
UFLOAT | The components are unsigned floating-point numbers (used by packed, shared exponent, and some compressed formats) |
SFLOAT | The components are signed floating-point numbers |
SRGB | The R, G, and B components are unsigned normalized values that represent values using sRGB nonlinear encoding, while the A component (if one exists) is a regular unsigned normalized value |
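As an informal illustration of the normalized ranges above, a minimal sketch of the fixed-point conversions for 8-bit components, assuming the usual \(c/(2^n-1)\) rule for UNORM and \(\max(c/(2^{n-1}-1), -1)\) for SNORM:

```c
#include <stdint.h>

/* Illustrative conversions for 8-bit components (n = 8).
 * An n-bit UNORM value c represents c / (2^n - 1), so 255 -> 1.0f. */
static float unorm8_to_float(uint8_t c)
{
    return (float)c / 255.0f;
}

/* An n-bit SNORM value c represents max(c / (2^(n-1) - 1), -1.0),
 * so both -128 and -127 map to -1.0f. */
static float snorm8_to_float(int8_t c)
{
    float f = (float)c / 127.0f;
    return f < -1.0f ? -1.0f : f;
}
```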
The suffix _PACKnn
indicates that the format is packed into an
underlying type with nn bits.
The suffix _mPACKnn
is a short-hand that indicates that the format has
several components (which may or may not be stored in separate planes)
that are each packed into an underlying type with nn bits.
The suffix _BLOCK
indicates that the format is a block-compressed
format, with the representation of multiple pixels encoded interdependently
within a region.
Compression scheme | Description |
---|---|
BC | Block Compression. See Block-Compressed Image Formats. |
ETC2 | Ericsson Texture Compression. See ETC Compressed Image Formats. |
EAC | ETC2 Alpha Compression. See ETC Compressed Image Formats. |
ASTC | Adaptive Scalable Texture Compression (LDR Profile). See ASTC Compressed Image Formats. |
For multi-planar images, the components in separate planes are separated
by underscores, and the number of planes is indicated by the addition of a
_2PLANE
or _3PLANE
suffix.
Similarly, the separate aspects of depth-stencil formats are separated by
underscores, although these are not considered separate planes.
Formats are suffixed by _422
to indicate that planes other than the
first are reduced in size by a factor of two horizontally or that the R and
B values appear at half the horizontal frequency of the G values, _420
to indicate that planes other than the first are reduced in size by a factor
of two both horizontally and vertically, and _444
for consistency to
indicate that all three planes of a three-planar image are the same size.
Note
No common format has a single plane containing both R and B channels that does not store these channels at reduced horizontal resolution.
Representation and Texel Block Size
Color formats must be represented in memory in exactly the form indicated by the format’s name. This means that promoting one format to another with more bits per component and/or additional components must not occur for color formats. Depth/stencil formats have more relaxed requirements as discussed below.
Each format has a texel block size, the number of bytes used to store one texel block (a single addressable element of an uncompressed image, or a single compressed block of a compressed image). The texel block size for each format is shown in the Compatible formats table.
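For example, a minimal sketch (illustrative only, using the 4×4, 64-bit BC1 texel block described above) of how the texel block size determines the storage required for one level of a block-compressed image:

```c
#include <stdint.h>

/* Illustrative size of one mip level of a block-compressed image:
 * round the image extent up to whole texel blocks, then multiply by the
 * texel block size.  For VK_FORMAT_BC1_RGB_UNORM_BLOCK the block is
 * 4x4 texels in 8 bytes (a 64-bit block); other compressed formats use
 * the block extent and size given in their descriptions. */
static uint64_t compressed_level_size(uint32_t width, uint32_t height,
                                      uint32_t blockWidth, uint32_t blockHeight,
                                      uint32_t blockBytes)
{
    uint64_t blocksX = (width  + blockWidth  - 1) / blockWidth;  /* round up */
    uint64_t blocksY = (height + blockHeight - 1) / blockHeight;
    return blocksX * blocksY * blockBytes;
}

/* e.g. a 1024x512 BC1 level: 256 x 128 blocks x 8 bytes = 262144 bytes */
```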
The representation of non-packed formats is that the first component specified in the name of the format is in the lowest memory addresses and the last component specified is in the highest memory addresses. See Byte mappings for non-packed/compressed color formats. The in-memory ordering of bytes within a component is determined by the host endianness.
Byte mappings for non-packed/compressed color formats (byte 0 is at the lowest memory address):

Format | Byte mapping, from lowest address to highest |
---|---|
R8 formats | R |
R8G8 formats | R, G |
R8G8B8 formats | R, G, B |
B8G8R8 formats | B, G, R |
R8G8B8A8 formats | R, G, B, A |
B8G8R8A8 formats | B, G, R, A |
VK_FORMAT_G8B8G8R8_422_UNORM | G0, B, G1, R |
VK_FORMAT_B8G8R8G8_422_UNORM | B, G0, R, G1 |
R16 formats | R (bytes 0..1) |
R16G16 formats | R (bytes 0..1), G (bytes 2..3) |
R16G16B16 formats | R (bytes 0..1), G (bytes 2..3), B (bytes 4..5) |
R16G16B16A16 formats | R (bytes 0..1), G (bytes 2..3), B (bytes 4..5), A (bytes 6..7) |
VK_FORMAT_G16B16G16R16_422_UNORM | G0 (bytes 0..1), B (bytes 2..3), G1 (bytes 4..5), R (bytes 6..7) |
VK_FORMAT_B16G16R16G16_422_UNORM | B (bytes 0..1), G0 (bytes 2..3), R (bytes 4..5), G1 (bytes 6..7) |
R32 formats | R (bytes 0..3) |
R32G32 formats | R (bytes 0..3), G (bytes 4..7) |
R32G32B32 formats | R (bytes 0..3), G (bytes 4..7), B (bytes 8..11) |
R32G32B32A32 formats | R (bytes 0..3), G (bytes 4..7), B (bytes 8..11), A (bytes 12..15) |
R64 formats | R (bytes 0..7) |
R64G64 formats | R (bytes 0..7), G (bytes 8..15) |
R64G64B64 formats | R (bytes 0..7), G (bytes 8..15), B (bytes 16..23) |
R64G64B64A64 formats | R (bytes 0..7), G (bytes 8..15), B (bytes 16..23), A (bytes 24..31) |
Packed formats store multiple components within one underlying type. The bit representation is that the first component specified in the name of the format is in the most-significant bits and the last component specified is in the least-significant bits of the underlying type. The in-memory ordering of bytes comprising the underlying type is determined by the host endianness.
[Tables: Bit mappings for packed 8-, 16-, and 32-bit formats. The first component named in the format occupies the most-significant bits of the underlying type and the last component the least-significant bits. Layouts shown are, for 8-bit packed formats: R4G4. For 16-bit packed formats: R4G4B4A4, B4G4R4A4, R5G6B5, B5G6R5, R5G5B5A1, B5G5R5A1, A1R5G5B5, R10X6, R12X4. For 32-bit packed formats: A8B8G8R8, A2R10G10B10, A2B10G10R10, B10G11R11, E5B9G9R9, X8D24.]
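As an informative sketch of the packing rule, the helper below assembles one VK_FORMAT_R5G6B5_UNORM_PACK16 texel: R, named first, is placed in the most-significant bits of the underlying 16-bit type, and the resulting word is stored with host endianness.
#include <stdint.h>

/* Pack one R5G6B5_UNORM_PACK16 texel: R in bits 15..11, G in bits 10..5,
 * B in bits 4..0 of the underlying 16-bit type. */
static uint16_t packR5G6B5(uint32_t r5, uint32_t g6, uint32_t b5)
{
    return (uint16_t)(((r5 & 0x1Fu) << 11) | ((g6 & 0x3Fu) << 5) | (b5 & 0x1Fu));
}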
Depth/Stencil Formats
Depth/stencil formats are considered opaque and need not be stored in the exact number of bits per texel or component ordering indicated by the format enum. However, implementations must not substitute a different depth or stencil precision than that described in the format (e.g. D16 must not be implemented as D24 or D32).
Format Compatibility Classes
Uncompressed color formats are compatible with each other if they occupy the same number of bits per texel block. Compressed color formats are compatible with each other if the only difference between them is the numerical type of the uncompressed pixels (e.g. signed vs. unsigned, or SRGB vs. UNORM encoding). Each depth/stencil format is only compatible with itself. In the following table, all the formats in the same row are compatible.
[Table: Compatible formats. Each row lists a compatibility class, its texel block size and texels per block, and the formats belonging to it. The classes are: 8-bit; 16-bit; 24-bit; 32-bit; 32-bit G8B8G8R8; 32-bit B8G8R8G8; 48-bit; 64-bit; 64-bit R10G10B10A10; 64-bit G10B10G10R10; 64-bit B10G10R10G10; 64-bit R12G12B12A12; 64-bit G12B12G12R12; 64-bit B12G12R12G12; 64-bit G16B16G16R16; 64-bit B16G16R16G16; 96-bit; 128-bit; 192-bit; 256-bit; BC1_RGB (64 bit); BC1_RGBA (64 bit); BC2 (128 bit); BC3 (128 bit); BC4 (64 bit); BC5 (128 bit); BC6H (128 bit); BC7 (128 bit); ETC2_RGB (64 bit); ETC2_RGBA (64 bit); ETC2_EAC_RGBA (128 bit); EAC_R (64 bit); EAC_RG (128 bit); ASTC_4x4, ASTC_5x4, ASTC_5x5, ASTC_6x5, ASTC_6x6, ASTC_8x5, ASTC_8x6, ASTC_8x8, ASTC_10x5, ASTC_10x6, ASTC_10x8, ASTC_10x10, ASTC_12x10, ASTC_12x12 (128 bit each); D16 (16 bit); D24 (32 bit); D32 (32 bit); S8 (8 bit); D16S8 (24 bit); D24S8 (32 bit); D32S8 (40 bit); and the 8-, 10-, 12- and 16-bit multi-planar classes (3-plane 420, 2-plane 420, 3-plane 422, 2-plane 422, and 3-plane 444 for each bit depth).]
35.4.2. Format Properties
To query supported format features which are properties of the physical device, call:
void vkGetPhysicalDeviceFormatProperties(
VkPhysicalDevice physicalDevice,
VkFormat format,
VkFormatProperties* pFormatProperties);
- physicalDevice is the physical device from which to query the format properties.
- format is the format whose properties are queried.
- pFormatProperties is a pointer to a VkFormatProperties structure in which physical device properties for format are returned.
The VkFormatProperties
structure is defined as:
typedef struct VkFormatProperties {
VkFormatFeatureFlags linearTilingFeatures;
VkFormatFeatureFlags optimalTilingFeatures;
VkFormatFeatureFlags bufferFeatures;
} VkFormatProperties;
- linearTilingFeatures is a bitmask of VkFormatFeatureFlagBits specifying features supported by images created with a tiling parameter of VK_IMAGE_TILING_LINEAR.
- optimalTilingFeatures is a bitmask of VkFormatFeatureFlagBits specifying features supported by images created with a tiling parameter of VK_IMAGE_TILING_OPTIMAL.
- bufferFeatures is a bitmask of VkFormatFeatureFlagBits specifying features supported by buffers.
Note
If no format feature flags are supported, the format itself is not supported, and images of that format cannot be created.
If format is a block-compressed format, then bufferFeatures must not support any features for the format.
If format is a multi-plane format then linearTilingFeatures and optimalTilingFeatures must not contain VK_FORMAT_FEATURE_DISJOINT_BIT.
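As an informative example (not normative), this query can be used to check whether optimal-tiling images of a format support linear filtering of sampled images; the physical device handle is assumed to have been obtained from vkEnumeratePhysicalDevices.
#include <vulkan/vulkan.h>

/* Returns VK_TRUE if optimal-tiling images of `format` can be sampled with
 * linear filtering on the given physical device. */
static VkBool32 supportsLinearSampling(VkPhysicalDevice physicalDevice, VkFormat format)
{
    VkFormatProperties props;
    vkGetPhysicalDeviceFormatProperties(physicalDevice, format, &props);
    return (props.optimalTilingFeatures &
            VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT) ? VK_TRUE : VK_FALSE;
}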
Bits which can be set in the VkFormatProperties members linearTilingFeatures, optimalTilingFeatures, and bufferFeatures, and in VkDrmFormatModifierPropertiesEXT::drmFormatModifierTilingFeatures, are:
typedef enum VkFormatFeatureFlagBits {
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT = 0x00000001,
VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT = 0x00000002,
VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT = 0x00000004,
VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT = 0x00000008,
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT = 0x00000010,
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_ATOMIC_BIT = 0x00000020,
VK_FORMAT_FEATURE_VERTEX_BUFFER_BIT = 0x00000040,
VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT = 0x00000080,
VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT = 0x00000100,
VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT = 0x00000200,
VK_FORMAT_FEATURE_BLIT_SRC_BIT = 0x00000400,
VK_FORMAT_FEATURE_BLIT_DST_BIT = 0x00000800,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT = 0x00001000,
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT = 0x00004000,
VK_FORMAT_FEATURE_TRANSFER_DST_BIT = 0x00008000,
VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT = 0x00020000,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT = 0x00040000,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT = 0x00080000,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT = 0x00100000,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_FORCEABLE_BIT = 0x00200000,
VK_FORMAT_FEATURE_DISJOINT_BIT = 0x00400000,
VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT = 0x00800000,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG = 0x00002000,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_MINMAX_BIT_EXT = 0x00010000,
VK_FORMAT_FEATURE_FRAGMENT_DENSITY_MAP_BIT_EXT = 0x01000000,
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR = VK_FORMAT_FEATURE_TRANSFER_SRC_BIT,
VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR = VK_FORMAT_FEATURE_TRANSFER_DST_BIT,
VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT_KHR = VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT_KHR = VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT_KHR = VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT_KHR = VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT,
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_FORCEABLE_BIT_KHR = VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_FORCEABLE_BIT,
VK_FORMAT_FEATURE_DISJOINT_BIT_KHR = VK_FORMAT_FEATURE_DISJOINT_BIT,
VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT_KHR = VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT,
} VkFormatFeatureFlagBits;
The following bits may be set in linearTilingFeatures, optimalTilingFeatures, and drmFormatModifierTilingFeatures, specifying that the features are supported by images or image views created with the queried vkGetPhysicalDeviceFormatProperties::format:
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT specifies that an image view can be sampled from.
- VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT specifies that an image view can be used as a storage image.
- VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT specifies that an image view can be used as a storage image that supports atomic operations.
- VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT specifies that an image view can be used as a framebuffer color attachment and as an input attachment.
- VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT specifies that an image view can be used as a framebuffer color attachment that supports blending and as an input attachment.
- VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT specifies that an image view can be used as a framebuffer depth/stencil attachment and as an input attachment.
- VK_FORMAT_FEATURE_BLIT_SRC_BIT specifies that an image can be used as srcImage for the vkCmdBlitImage command.
- VK_FORMAT_FEATURE_BLIT_DST_BIT specifies that an image can be used as dstImage for the vkCmdBlitImage command.
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT specifies that if VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT is also set, an image view can be used with a sampler that has either of magFilter or minFilter set to VK_FILTER_LINEAR, or mipmapMode set to VK_SAMPLER_MIPMAP_MODE_LINEAR. If VK_FORMAT_FEATURE_BLIT_SRC_BIT is also set, an image can be used as the srcImage to vkCmdBlitImage with a filter of VK_FILTER_LINEAR. This bit must only be exposed for formats that also support VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT or VK_FORMAT_FEATURE_BLIT_SRC_BIT. If the format being queried is a depth/stencil format, this bit only specifies that the depth aspect (not the stencil aspect) of an image of this format supports linear filtering, and that linear filtering of the depth aspect is supported whether depth compare is enabled in the sampler or not. If this bit is not present, linear filtering with depth compare disabled is unsupported and linear filtering with depth compare enabled is supported, but may compute the filtered value in an implementation-dependent manner which differs from the normal rules of linear filtering. The resulting value must be in the range [0,1] and should be proportional to, or a weighted average of, the number of comparison passes or failures.
- VK_FORMAT_FEATURE_TRANSFER_SRC_BIT specifies that an image can be used as a source image for copy commands.
- VK_FORMAT_FEATURE_TRANSFER_DST_BIT specifies that an image can be used as a destination image for copy commands and clear commands.
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_MINMAX_BIT_EXT specifies that VkImage can be used as a sampled image with a min or max VkSamplerReductionModeEXT. This bit must only be exposed for formats that also support VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT.
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG specifies that VkImage can be used with a sampler that has either of magFilter or minFilter set to VK_FILTER_CUBIC_IMG, or be the source image for a blit with filter set to VK_FILTER_CUBIC_IMG. This bit must only be exposed for formats that also support VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT. If the format being queried is a depth/stencil format, this only specifies that the depth aspect is cubic filterable.
- VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT specifies that an application can define a sampler Y’CBCR conversion using this format as a source, and that an image of this format can be used with a VkSamplerYcbcrConversionCreateInfo xChromaOffset and/or yChromaOffset of VK_CHROMA_LOCATION_MIDPOINT. Otherwise both xChromaOffset and yChromaOffset must be VK_CHROMA_LOCATION_COSITED_EVEN. If a format does not incorporate chroma downsampling (it is not a “422” or “420” format) but the implementation supports sampler Y’CBCR conversion for this format, the implementation must set VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT.
- VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT specifies that an application can define a sampler Y’CBCR conversion using this format as a source, and that an image of this format can be used with a VkSamplerYcbcrConversionCreateInfo xChromaOffset and/or yChromaOffset of VK_CHROMA_LOCATION_COSITED_EVEN. Otherwise both xChromaOffset and yChromaOffset must be VK_CHROMA_LOCATION_MIDPOINT. If neither VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT nor VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT is set, the application must not define a sampler Y’CBCR conversion using this format as a source.
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT specifies that the format can do linear sampler filtering (min/magFilter) whilst sampler Y’CBCR conversion is enabled.
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT specifies that the format can have different chroma, min, and mag filters.
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT specifies that reconstruction is explicit, as described in Chroma Reconstruction. If this bit is not present, reconstruction is implicit by default.
- VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_FORCEABLE_BIT specifies that reconstruction can be forcibly made explicit by setting VkSamplerYcbcrConversionCreateInfo::forceExplicitReconstruction to VK_TRUE.
- VK_FORMAT_FEATURE_DISJOINT_BIT specifies that a multi-planar image can have VK_IMAGE_CREATE_DISJOINT_BIT set during image creation. An implementation must not set VK_FORMAT_FEATURE_DISJOINT_BIT for single-plane formats.
- VK_FORMAT_FEATURE_FRAGMENT_DENSITY_MAP_BIT_EXT specifies that an image view can be used as a fragment density map attachment.
The following bits may be set in bufferFeatures, specifying that the features are supported by buffers or buffer views created with the queried vkGetPhysicalDeviceFormatProperties::format:
- VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT specifies that the format can be used to create a buffer view that can be bound to a VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER descriptor.
- VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT specifies that the format can be used to create a buffer view that can be bound to a VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER descriptor.
- VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_ATOMIC_BIT specifies that atomic operations are supported on VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER descriptors with this format.
- VK_FORMAT_FEATURE_VERTEX_BUFFER_BIT specifies that the format can be used as a vertex attribute format (VkVertexInputAttributeDescription::format).
typedef VkFlags VkFormatFeatureFlags;
VkFormatFeatureFlags
is a bitmask type for setting a mask of zero or
more VkFormatFeatureFlagBits.
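As a further informative example, bufferFeatures can be consulted to decide whether a format may be used for a vertex attribute:
#include <vulkan/vulkan.h>

/* Returns VK_TRUE if `format` may be used as
 * VkVertexInputAttributeDescription::format on this physical device. */
static VkBool32 supportsVertexFormat(VkPhysicalDevice physicalDevice, VkFormat format)
{
    VkFormatProperties props;
    vkGetPhysicalDeviceFormatProperties(physicalDevice, format, &props);
    return (props.bufferFeatures & VK_FORMAT_FEATURE_VERTEX_BUFFER_BIT) ? VK_TRUE : VK_FALSE;
}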
To query supported format features which are properties of the physical device, call:
void vkGetPhysicalDeviceFormatProperties2(
VkPhysicalDevice physicalDevice,
VkFormat format,
VkFormatProperties2* pFormatProperties);
or the equivalent command
void vkGetPhysicalDeviceFormatProperties2KHR(
VkPhysicalDevice physicalDevice,
VkFormat format,
VkFormatProperties2* pFormatProperties);
- physicalDevice is the physical device from which to query the format properties.
- format is the format whose properties are queried.
- pFormatProperties is a pointer to a VkFormatProperties2 structure in which physical device properties for format are returned.
vkGetPhysicalDeviceFormatProperties2
behaves similarly to
vkGetPhysicalDeviceFormatProperties, with the ability to return
extended information in a pNext
chain of output structures.
The VkFormatProperties2
structure is defined as:
typedef struct VkFormatProperties2 {
VkStructureType sType;
void* pNext;
VkFormatProperties formatProperties;
} VkFormatProperties2;
or the equivalent
typedef VkFormatProperties2 VkFormatProperties2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- formatProperties is a structure of type VkFormatProperties describing features supported by the requested format.
To obtain the list of Linux DRM format
modifiers compatible with a VkFormat, add
VkDrmFormatModifierPropertiesListEXT to the pNext
chain of
VkFormatProperties2.
The VkDrmFormatModifierPropertiesListEXT structure is defined as:
typedef struct VkDrmFormatModifierPropertiesListEXT {
VkStructureType sType;
void* pNext;
uint32_t drmFormatModifierCount;
VkDrmFormatModifierPropertiesEXT* pDrmFormatModifierProperties;
} VkDrmFormatModifierPropertiesListEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- drmFormatModifierCount is an inout parameter related to the number of modifiers compatible with the format, as described below.
- pDrmFormatModifierProperties is either NULL or an array of VkDrmFormatModifierPropertiesEXT structures.
If pDrmFormatModifierProperties is NULL, then the function returns in drmFormatModifierCount the number of modifiers compatible with the queried format.
Otherwise, the application must set drmFormatModifierCount to the length of the array pDrmFormatModifierProperties; the function will write at most drmFormatModifierCount elements to the array, and will return in drmFormatModifierCount the number of elements written.
Among the elements in array pDrmFormatModifierProperties, each returned drmFormatModifier must be unique.
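An informative sketch of the usual enumerate-then-fill pattern for this query follows; it assumes the VK_EXT_image_drm_format_modifier extension is enabled and that the allocation succeeds.
#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Enumerate the DRM format modifiers compatible with `format`. */
static void listDrmFormatModifiers(VkPhysicalDevice physicalDevice, VkFormat format)
{
    VkDrmFormatModifierPropertiesListEXT modList = {
        .sType = VK_STRUCTURE_TYPE_DRM_FORMAT_MODIFIER_PROPERTIES_LIST_EXT,
        .pDrmFormatModifierProperties = NULL,  /* first call: query the count only */
    };
    VkFormatProperties2 props2 = {
        .sType = VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2,
        .pNext = &modList,
    };

    vkGetPhysicalDeviceFormatProperties2(physicalDevice, format, &props2);

    modList.pDrmFormatModifierProperties =
        malloc(modList.drmFormatModifierCount * sizeof(VkDrmFormatModifierPropertiesEXT));

    /* Second call: fill at most drmFormatModifierCount elements. */
    vkGetPhysicalDeviceFormatProperties2(physicalDevice, format, &props2);

    /* ... inspect modList.pDrmFormatModifierProperties[i].drmFormatModifier ... */

    free(modList.pDrmFormatModifierProperties);
}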
The VkDrmFormatModifierPropertiesEXT structure describes properties of a VkFormat when that format is combined with a Linux DRM format modifier. These properties, like those of VkFormatProperties2, are independent of any particular image.
The VkDrmFormatModifierPropertiesEXT structure is defined as:
typedef struct VkDrmFormatModifierPropertiesEXT {
uint64_t drmFormatModifier;
uint32_t drmFormatModifierPlaneCount;
VkFormatFeatureFlags drmFormatModifierTilingFeatures;
} VkDrmFormatModifierPropertiesEXT;
- drmFormatModifier is a Linux DRM format modifier.
- drmFormatModifierPlaneCount is the number of memory planes in any image created with format and drmFormatModifier. An image’s memory plane count is distinct from its format plane count, as explained below.
- drmFormatModifierTilingFeatures is a bitmask of VkFormatFeatureFlagBits that are supported by any image created with format and drmFormatModifier.
The returned drmFormatModifierTilingFeatures must contain at least one bit.
The implementation must not return DRM_FORMAT_MOD_INVALID in drmFormatModifier.
An image’s memory plane count (as returned by drmFormatModifierPlaneCount) is distinct from its format plane count (in the sense of multi-planar Y’CBCR formats).
In VkImageAspectFlags, each VK_IMAGE_ASPECT_MEMORY_PLANE_i_BIT_EXT represents a memory plane and each VK_IMAGE_ASPECT_PLANE_i_BIT a format plane.
An image’s set of format planes is an ordered partition of the image’s
content into separable groups of format channels.
The ordered partition is encoded in the name of each VkFormat.
For example, VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
contains two format
planes; the first plane contains the green channel and the second plane
contains the blue channel and red channel.
If the format name does not contain PLANE
, then the format contains a
single plane; for example, VK_FORMAT_R8G8B8A8_UNORM
.
Some commands, such as vkCmdCopyBufferToImage, do not operate on all
format channels in the image, but instead operate only on the format
planes explicitly chosen by the application and operate on each format
plane independently.
An image’s set of memory planes is an ordered partition of the image’s memory rather than the image’s content. Each memory plane is a contiguous range of memory. The union of an image’s memory planes is not necessarily contiguous.
If an image is linear, then the partition is the same for memory planes and for format planes.
Therefore, if the returned drmFormatModifier is DRM_FORMAT_MOD_LINEAR, then drmFormatModifierPlaneCount must equal the format plane count, and drmFormatModifierTilingFeatures must be identical to the VkFormatProperties2::linearTilingFeatures returned in the same pNext chain.
If an image is non-linear, then the partition of the image’s memory into memory planes is implementation-specific and may be unrelated to the partition of the image’s content into format planes.
For example, consider an image whose format
is
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM
, tiling
is
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
, whose drmFormatModifier
is not DRM_FORMAT_MOD_LINEAR
, and flags
lacks
VK_IMAGE_CREATE_DISJOINT_BIT
.
The image has 3 format planes, and commands such as vkCmdCopyBufferToImage act on each format plane independently, as if the data of each format plane were separable from the data of the other planes.
In a straightforward implementation, the implementation may store the
image’s content in 3 adjacent memory planes where each memory plane
corresponds exactly to a format plane.
However, the implementation may also store the image’s content in a single
memory plane where all format channels are combined using an
implementation-private block-compressed format; or the implementation may
store the image’s content in a collection of 7 adjacent memory planes
using an implementation-private sharding technique.
Because the image is non-linear and non-disjoint, the implementation has
much freedom when choosing the image’s placement in memory.
The memory plane count applies to function parameters and structures only when the API specifies an explicit requirement on drmFormatModifierPlaneCount.
In all other cases, the memory plane count is ignored.
35.4.3. Required Format Support
Implementations must support at least the following set of features on the listed formats. For images, these features must be supported for every VkImageType (including arrayed and cube variants) unless otherwise noted. These features are supported on existing formats without needing to advertise an extension or needing to explicitly enable them. Support for additional functionality beyond the requirements listed here is queried using the vkGetPhysicalDeviceFormatProperties command.
Note
Unless otherwise excluded below, the required formats are supported for all VkImageCreateFlags values as long as those flag values are otherwise allowed.
The following tables show which feature bits must be supported for each format.
Formats that are required to support VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT must also support VK_FORMAT_FEATURE_TRANSFER_SRC_BIT and VK_FORMAT_FEATURE_TRANSFER_DST_BIT.
✓ | This feature must be supported on the named format
† | This feature must be supported on at least some of the named formats, with more information in the table where the symbol appears
[Tables: Required format support. Each table lists a group of formats and the feature bits that must be supported for them in linearTilingFeatures, optimalTilingFeatures, and bufferFeatures (✓ = required for that format; † = required for at least some formats in the group).]
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG must be supported for the following formats:
-
VK_FORMAT_R4G4_UNORM_PACK8
-
VK_FORMAT_R4G4B4A4_UNORM_PACK16
-
VK_FORMAT_B4G4R4A4_UNORM_PACK16
-
VK_FORMAT_R5G6B5_UNORM_PACK16
-
VK_FORMAT_B5G6R5_UNORM_PACK16
-
VK_FORMAT_R5G5B5A1_UNORM_PACK16
-
VK_FORMAT_B5G5R5A1_UNORM_PACK16
-
VK_FORMAT_A1R5G5B5_UNORM_PACK16
-
VK_FORMAT_R8_UNORM
-
VK_FORMAT_R8_SNORM
-
VK_FORMAT_R8_SRGB
-
VK_FORMAT_R8G8_UNORM
-
VK_FORMAT_R8G8_SNORM
-
VK_FORMAT_R8G8_SRGB
-
VK_FORMAT_R8G8B8_UNORM
-
VK_FORMAT_R8G8B8_SNORM
-
VK_FORMAT_R8G8B8_SRGB
-
VK_FORMAT_B8G8R8_UNORM
-
VK_FORMAT_B8G8R8_SNORM
-
VK_FORMAT_B8G8R8_SRGB
-
VK_FORMAT_R8G8B8A8_UNORM
-
VK_FORMAT_R8G8B8A8_SNORM
-
VK_FORMAT_R8G8B8A8_SRGB
-
VK_FORMAT_B8G8R8A8_UNORM
-
VK_FORMAT_B8G8R8A8_SNORM
-
VK_FORMAT_B8G8R8A8_SRGB
-
VK_FORMAT_A8B8G8R8_UNORM_PACK32
-
VK_FORMAT_A8B8G8R8_SNORM_PACK32
-
VK_FORMAT_A8B8G8R8_USCALED_PACK32
-
VK_FORMAT_A8B8G8R8_SSCALED_PACK32
-
VK_FORMAT_A8B8G8R8_UINT_PACK32
-
VK_FORMAT_A8B8G8R8_SINT_PACK32
-
VK_FORMAT_A8B8G8R8_SRGB_PACK32
If ETC2 compressed formats are supported, the following additional formats
must support VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG
:
-
VK_FORMAT_ETC2_R8G8B8_UNORM_BLOCK
-
VK_FORMAT_ETC2_R8G8B8_SRGB_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A1_UNORM_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A1_SRGB_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A8_UNORM_BLOCK
-
VK_FORMAT_ETC2_R8G8B8A8_SRGB_BLOCK
To be used with VkImageView with subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT, sampler Y’CBCR conversion must be enabled for the following formats:
[Table: Formats requiring sampler Y’CBCR conversion for VK_IMAGE_ASPECT_COLOR_BIT image views, listing each format, its number of planes, and the feature bits required for it. Format features marked ✓ must be supported only if VkPhysicalDeviceSamplerYcbcrConversionFeatures is enabled.]
Implementations are not required to support the
VK_IMAGE_CREATE_SPARSE_BINDING_BIT
,
VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
, or
VK_IMAGE_CREATE_SPARSE_ALIASED_BIT
VkImageCreateFlags for the
above formats that require sampler Y’CBCR
conversion.
To determine whether the implementation supports sparse image creation flags
with these formats use vkGetPhysicalDeviceImageFormatProperties or
vkGetPhysicalDeviceImageFormatProperties2.
VK_FORMAT_FEATURE_FRAGMENT_DENSITY_MAP_BIT_EXT
must be supported for
the following formats if the fragment
density map feature is enabled:
-
VK_FORMAT_R8G8_UNORM
35.5. Additional Image Capabilities
In addition to the minimum capabilities described in the previous sections (Limits and Formats), implementations may support additional capabilities for certain types of images. For example, larger dimensions or additional sample counts for certain image types, or additional capabilities for linear tiling format images.
To query additional capabilities specific to image types, call:
VkResult vkGetPhysicalDeviceImageFormatProperties(
VkPhysicalDevice physicalDevice,
VkFormat format,
VkImageType type,
VkImageTiling tiling,
VkImageUsageFlags usage,
VkImageCreateFlags flags,
VkImageFormatProperties* pImageFormatProperties);
-
physicalDevice
is the physical device from which to query the image capabilities. -
format
is a VkFormat value specifying the image format, corresponding to VkImageCreateInfo::format
. -
type
is a VkImageType value specifying the image type, corresponding to VkImageCreateInfo::imageType
. -
tiling
is a VkImageTiling value specifying the image tiling, corresponding to VkImageCreateInfo::tiling
. -
usage
is a bitmask of VkImageUsageFlagBits specifying the intended usage of the image, corresponding to VkImageCreateInfo::usage
. -
flags
is a bitmask of VkImageCreateFlagBits specifying additional parameters of the image, corresponding to VkImageCreateInfo::flags
. -
pImageFormatProperties
points to an instance of the VkImageFormatProperties structure in which capabilities are returned.
The format
, type
, tiling
, usage
, and flags
parameters correspond to parameters that would be consumed by
vkCreateImage (as members of VkImageCreateInfo
).
If format
is not a supported image format, or if the combination of
format
, type
, tiling
, usage
, and flags
is not
supported for images, then vkGetPhysicalDeviceImageFormatProperties
returns VK_ERROR_FORMAT_NOT_SUPPORTED
.
The limitations on an image format that are reported by
vkGetPhysicalDeviceImageFormatProperties
have the following property:
if usage1
and usage2
of type VkImageUsageFlags are such that
the bits set in usage1
are a subset of the bits set in usage2
, and
flags1
and flags2
of type VkImageCreateFlags are such that
the bits set in flags1
are a subset of the bits set in flags2
,
then the limitations for usage1
and flags1
must be no more strict
than the limitations for usage2
and flags2
, for all values of
format
, type
, and tiling
.
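An informative sketch of such a query follows, using parameters that match what would later be passed to vkCreateImage; the format, type, tiling, and usage shown are arbitrary examples.
#include <vulkan/vulkan.h>

/* Check whether a 2D optimal-tiling sampled + transfer-dst image of `format`
 * can be created, and if so retrieve its capability limits. */
static VkResult queryImageCaps(VkPhysicalDevice physicalDevice, VkFormat format,
                               VkImageFormatProperties *caps)
{
    VkResult result = vkGetPhysicalDeviceImageFormatProperties(
        physicalDevice, format, VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_OPTIMAL,
        VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT,
        0, caps);

    if (result == VK_ERROR_FORMAT_NOT_SUPPORTED) {
        /* This format/type/tiling/usage/flags combination cannot be used
         * with vkCreateImage. */
    }
    return result;
}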
The VkImageFormatProperties
structure is defined as:
typedef struct VkImageFormatProperties {
VkExtent3D maxExtent;
uint32_t maxMipLevels;
uint32_t maxArrayLayers;
VkSampleCountFlags sampleCounts;
VkDeviceSize maxResourceSize;
} VkImageFormatProperties;
- maxExtent are the maximum image dimensions. See the Allowed Extent Values section below for how these values are constrained by type.
- maxMipLevels is the maximum number of mipmap levels. maxMipLevels must be equal to the number of levels in the complete mipmap chain based on the maxExtent.width, maxExtent.height, and maxExtent.depth, except when one of the following conditions is true, in which case it may instead be 1:
  - vkGetPhysicalDeviceImageFormatProperties::tiling was VK_IMAGE_TILING_LINEAR
  - VkPhysicalDeviceImageFormatInfo2::tiling was VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
  - the VkPhysicalDeviceImageFormatInfo2::pNext chain included an instance of VkPhysicalDeviceExternalImageFormatInfo with a handle type included in the handleTypes member for which mipmap image support is not required
  - image format is one of those listed in Formats requiring sampler Y’CBCR conversion for VK_IMAGE_ASPECT_COLOR_BIT image views
  - flags contains VK_IMAGE_CREATE_SUBSAMPLED_BIT_EXT
- maxArrayLayers is the maximum number of array layers.
  - If tiling is VK_IMAGE_TILING_LINEAR, then maxArrayLayers must either be equal to 1 or be no less than VkPhysicalDeviceLimits::maxImageArrayLayers.
  - If tiling is VK_IMAGE_TILING_OPTIMAL and type is VK_IMAGE_TYPE_3D, then maxArrayLayers must either be equal to 1 or be no less than VkPhysicalDeviceLimits::maxImageArrayLayers.
  - If tiling is VK_IMAGE_TILING_OPTIMAL and type is not VK_IMAGE_TYPE_3D, then maxArrayLayers must be no less than VkPhysicalDeviceLimits::maxImageArrayLayers.
  - If tiling is VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT, then maxArrayLayers must not be 0.
- sampleCounts is a bitmask of VkSampleCountFlagBits specifying all the supported sample counts for this image as described below.
- maxResourceSize is an upper bound on the total image size in bytes, inclusive of all image subresources. Implementations may have an address space limit on total size of a resource, which is advertised by this property. maxResourceSize must be at least 2³¹.
Note
There is no mechanism to query the size of an image before creating it, to compare that size against maxResourceSize.
If the combination of parameters to
vkGetPhysicalDeviceImageFormatProperties
is not supported by the
implementation for use in vkCreateImage, then all members of
VkImageFormatProperties
will be filled with zero.
Note
Filling |
To determine the image capabilities compatible with an external memory handle type, call:
VkResult vkGetPhysicalDeviceExternalImageFormatPropertiesNV(
VkPhysicalDevice physicalDevice,
VkFormat format,
VkImageType type,
VkImageTiling tiling,
VkImageUsageFlags usage,
VkImageCreateFlags flags,
VkExternalMemoryHandleTypeFlagsNV externalHandleType,
VkExternalImageFormatPropertiesNV* pExternalImageFormatProperties);
-
physicalDevice
is the physical device from which to query the image capabilities -
format
is the image format, corresponding to VkImageCreateInfo::format
. -
type
is the image type, corresponding to VkImageCreateInfo::imageType
. -
tiling
is the image tiling, corresponding to VkImageCreateInfo::tiling
. -
usage
is the intended usage of the image, corresponding to VkImageCreateInfo::usage
. -
flags
is a bitmask describing additional parameters of the image, corresponding to VkImageCreateInfo::flags
. -
externalHandleType
is either one of the bits from VkExternalMemoryHandleTypeFlagBitsNV, or 0. -
pExternalImageFormatProperties
points to an instance of the VkExternalImageFormatPropertiesNV structure in which capabilities are returned.
If externalHandleType
is 0,
pExternalImageFormatProperties
::imageFormatProperties will return the
same values as a call to vkGetPhysicalDeviceImageFormatProperties, and
the other members of pExternalImageFormatProperties
will all be 0.
Otherwise, they are filled in as described for
VkExternalImageFormatPropertiesNV.
The VkExternalImageFormatPropertiesNV
structure is defined as:
typedef struct VkExternalImageFormatPropertiesNV {
VkImageFormatProperties imageFormatProperties;
VkExternalMemoryFeatureFlagsNV externalMemoryFeatures;
VkExternalMemoryHandleTypeFlagsNV exportFromImportedHandleTypes;
VkExternalMemoryHandleTypeFlagsNV compatibleHandleTypes;
} VkExternalImageFormatPropertiesNV;
- imageFormatProperties will be filled in as when calling vkGetPhysicalDeviceImageFormatProperties, but the values returned may vary depending on the external handle type requested.
- externalMemoryFeatures is a bitmask of VkExternalMemoryFeatureFlagBitsNV, indicating properties of the external memory handle type (vkGetPhysicalDeviceExternalImageFormatPropertiesNV::externalHandleType) being queried, or 0 if the external memory handle type is 0.
- exportFromImportedHandleTypes is a bitmask of VkExternalMemoryHandleTypeFlagBitsNV containing a bit set for every external handle type that may be used to create memory from which the handles of the type specified in vkGetPhysicalDeviceExternalImageFormatPropertiesNV::externalHandleType can be exported, or 0 if the external memory handle type is 0.
- compatibleHandleTypes is a bitmask of VkExternalMemoryHandleTypeFlagBitsNV containing a bit set for every external handle type that may be specified simultaneously with the handle type specified by vkGetPhysicalDeviceExternalImageFormatPropertiesNV::externalHandleType when calling vkAllocateMemory, or 0 if the external memory handle type is 0. compatibleHandleTypes will always contain vkGetPhysicalDeviceExternalImageFormatPropertiesNV::externalHandleType.
Bits which can be set in
VkExternalImageFormatPropertiesNV::externalMemoryFeatures
,
indicating properties of the external memory handle type, are:
typedef enum VkExternalMemoryFeatureFlagBitsNV {
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT_NV = 0x00000001,
VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT_NV = 0x00000002,
VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT_NV = 0x00000004,
} VkExternalMemoryFeatureFlagBitsNV;
-
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT_NV
specifies that external memory of the specified type must be created as a dedicated allocation when used in the manner specified. -
VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT_NV
specifies that the implementation supports exporting handles of the specified type. -
VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT_NV
specifies that the implementation supports importing handles of the specified type.
typedef VkFlags VkExternalMemoryFeatureFlagsNV;
VkExternalMemoryFeatureFlagsNV
is a bitmask type for setting a mask of
zero or more VkExternalMemoryFeatureFlagBitsNV.
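An informative sketch follows; it assumes the VK_NV_external_memory_capabilities extension is enabled and, because this is an extension command, obtains the entry point through vkGetInstanceProcAddr.
#include <vulkan/vulkan.h>

/* Returns VK_TRUE if an opaque Win32 handle can be exported for a 2D
 * optimal-tiling sampled image of `format`. */
static VkBool32 canExportOpaqueWin32NV(VkInstance instance,
                                       VkPhysicalDevice physicalDevice, VkFormat format)
{
    PFN_vkGetPhysicalDeviceExternalImageFormatPropertiesNV pfn =
        (PFN_vkGetPhysicalDeviceExternalImageFormatPropertiesNV)vkGetInstanceProcAddr(
            instance, "vkGetPhysicalDeviceExternalImageFormatPropertiesNV");
    VkExternalImageFormatPropertiesNV extProps;

    if (pfn == NULL)
        return VK_FALSE;
    if (pfn(physicalDevice, format, VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_OPTIMAL,
            VK_IMAGE_USAGE_SAMPLED_BIT, 0,
            VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV, &extProps) != VK_SUCCESS)
        return VK_FALSE;
    return (extProps.externalMemoryFeatures &
            VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT_NV) ? VK_TRUE : VK_FALSE;
}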
To query additional capabilities specific to image types, call:
VkResult vkGetPhysicalDeviceImageFormatProperties2(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceImageFormatInfo2* pImageFormatInfo,
VkImageFormatProperties2* pImageFormatProperties);
or the equivalent command
VkResult vkGetPhysicalDeviceImageFormatProperties2KHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceImageFormatInfo2* pImageFormatInfo,
VkImageFormatProperties2* pImageFormatProperties);
-
physicalDevice
is the physical device from which to query the image capabilities. -
pImageFormatInfo
points to an instance of the VkPhysicalDeviceImageFormatInfo2 structure, describing the parameters that would be consumed by vkCreateImage. -
pImageFormatProperties
points to an instance of the VkImageFormatProperties2 structure in which capabilities are returned.
vkGetPhysicalDeviceImageFormatProperties2
behaves similarly to
vkGetPhysicalDeviceImageFormatProperties, with the ability to return
extended information in a pNext
chain of output structures.
The VkPhysicalDeviceImageFormatInfo2
structure is defined as:
typedef struct VkPhysicalDeviceImageFormatInfo2 {
VkStructureType sType;
const void* pNext;
VkFormat format;
VkImageType type;
VkImageTiling tiling;
VkImageUsageFlags usage;
VkImageCreateFlags flags;
} VkPhysicalDeviceImageFormatInfo2;
or the equivalent
typedef VkPhysicalDeviceImageFormatInfo2 VkPhysicalDeviceImageFormatInfo2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure. The pNext chain of VkPhysicalDeviceImageFormatInfo2 is used to provide additional image parameters to vkGetPhysicalDeviceImageFormatProperties2.
- format is a VkFormat value indicating the image format, corresponding to VkImageCreateInfo::format.
- type is a VkImageType value indicating the image type, corresponding to VkImageCreateInfo::imageType.
- tiling is a VkImageTiling value indicating the image tiling, corresponding to VkImageCreateInfo::tiling.
- usage is a bitmask of VkImageUsageFlagBits indicating the intended usage of the image, corresponding to VkImageCreateInfo::usage.
- flags is a bitmask of VkImageCreateFlagBits indicating additional parameters of the image, corresponding to VkImageCreateInfo::flags.
The members of VkPhysicalDeviceImageFormatInfo2
correspond to the
arguments to vkGetPhysicalDeviceImageFormatProperties, with
sType
and pNext
added for extensibility.
The VkImageFormatProperties2
structure is defined as:
typedef struct VkImageFormatProperties2 {
VkStructureType sType;
void* pNext;
VkImageFormatProperties imageFormatProperties;
} VkImageFormatProperties2;
or the equivalent
typedef VkImageFormatProperties2 VkImageFormatProperties2KHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure. The pNext chain of VkImageFormatProperties2 is used to allow the specification of additional capabilities to be returned from vkGetPhysicalDeviceImageFormatProperties2.
- imageFormatProperties is an instance of a VkImageFormatProperties structure in which capabilities are returned.
If the combination of parameters to
vkGetPhysicalDeviceImageFormatProperties2
is not supported by the
implementation for use in vkCreateImage, then all members of
imageFormatProperties
will be filled with zero.
Note
Filling |
To determine if texture gather functions that take explicit LOD and/or bias argument values can be used with a given image format, add VkTextureLODGatherFormatPropertiesAMD to the pNext chain of the VkImageFormatProperties2 structure passed to vkGetPhysicalDeviceImageFormatProperties2.
The VkTextureLODGatherFormatPropertiesAMD
structure is defined as:
typedef struct VkTextureLODGatherFormatPropertiesAMD {
VkStructureType sType;
void* pNext;
VkBool32 supportsTextureGatherLODBiasAMD;
} VkTextureLODGatherFormatPropertiesAMD;
- sType is the type of this structure.
- pNext is NULL.
- supportsTextureGatherLODBiasAMD tells if the image format can be used with texture gather bias/LOD functions, as introduced by the VK_AMD_texture_gather_bias_lod extension. This field is set by the implementation. Any user-specified value is ignored.
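An informative sketch of this query chain follows; the image type, tiling, and usage shown are arbitrary examples, and the VK_AMD_texture_gather_bias_lod extension is assumed to be enabled.
#include <vulkan/vulkan.h>

/* Query whether explicit-LOD/bias texture gather is supported for `format`. */
static VkBool32 supportsGatherLodBiasAMD(VkPhysicalDevice physicalDevice, VkFormat format)
{
    VkTextureLODGatherFormatPropertiesAMD gatherProps = {
        .sType = VK_STRUCTURE_TYPE_TEXTURE_LOD_GATHER_FORMAT_PROPERTIES_AMD,
    };
    VkImageFormatProperties2 imageProps = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2,
        .pNext = &gatherProps,
    };
    VkPhysicalDeviceImageFormatInfo2 info = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2,
        .format = format,
        .type = VK_IMAGE_TYPE_2D,
        .tiling = VK_IMAGE_TILING_OPTIMAL,
        .usage = VK_IMAGE_USAGE_SAMPLED_BIT,
    };

    if (vkGetPhysicalDeviceImageFormatProperties2(physicalDevice, &info, &imageProps) != VK_SUCCESS)
        return VK_FALSE;
    return gatherProps.supportsTextureGatherLODBiasAMD;
}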
To determine the image capabilities compatible with an external memory
handle type, add VkPhysicalDeviceExternalImageFormatInfo to the
pNext
chain of the VkPhysicalDeviceImageFormatInfo2 structure
and VkExternalImageFormatProperties
to the pNext
chain of the
VkImageFormatProperties2 structure.
The VkPhysicalDeviceExternalImageFormatInfo
structure is defined as:
typedef struct VkPhysicalDeviceExternalImageFormatInfo {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagBits handleType;
} VkPhysicalDeviceExternalImageFormatInfo;
or the equivalent
typedef VkPhysicalDeviceExternalImageFormatInfo VkPhysicalDeviceExternalImageFormatInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- handleType is a VkExternalMemoryHandleTypeFlagBits value specifying the memory handle type that will be used with the memory associated with the image.
If handleType
is 0, vkGetPhysicalDeviceImageFormatProperties2
will behave as if VkPhysicalDeviceExternalImageFormatInfo was not
present, and VkExternalImageFormatProperties will be ignored.
If handleType
is not compatible with the format
, type
,
tiling
, usage
, and flags
specified in
VkPhysicalDeviceImageFormatInfo2, then
vkGetPhysicalDeviceImageFormatProperties2 returns
VK_ERROR_FORMAT_NOT_SUPPORTED
.
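An informative sketch follows, chaining both structures to query opaque POSIX file descriptor support for an arbitrary 2D sampled-image configuration.
#include <vulkan/vulkan.h>

/* Query external-memory capabilities of the opaque-fd handle type for a
 * 2D optimal-tiling sampled image of `format`. */
static VkResult queryOpaqueFdImageProps(VkPhysicalDevice physicalDevice, VkFormat format,
                                        VkExternalMemoryProperties *out)
{
    VkPhysicalDeviceExternalImageFormatInfo externalInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO,
        .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
    };
    VkPhysicalDeviceImageFormatInfo2 info = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2,
        .pNext = &externalInfo,
        .format = format,
        .type = VK_IMAGE_TYPE_2D,
        .tiling = VK_IMAGE_TILING_OPTIMAL,
        .usage = VK_IMAGE_USAGE_SAMPLED_BIT,
    };
    VkExternalImageFormatProperties externalProps = {
        .sType = VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES,
    };
    VkImageFormatProperties2 props = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2,
        .pNext = &externalProps,
    };

    VkResult result = vkGetPhysicalDeviceImageFormatProperties2(physicalDevice, &info, &props);
    if (result == VK_SUCCESS)
        *out = externalProps.externalMemoryProperties;
    return result;
}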
Possible values of
VkPhysicalDeviceExternalImageFormatInfo::handleType
, specifying
an external memory handle type, are:
typedef enum VkExternalMemoryHandleTypeFlagBits {
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT = 0x00000001,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT = 0x00000002,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT = 0x00000004,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_BIT = 0x00000008,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_KMT_BIT = 0x00000010,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_HEAP_BIT = 0x00000020,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_RESOURCE_BIT = 0x00000040,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT = 0x00000200,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID = 0x00000400,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT = 0x00000080,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_MAPPED_FOREIGN_MEMORY_BIT_EXT = 0x00000100,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_KHR = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT_KHR = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_BIT_KHR = VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_BIT,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_KMT_BIT_KHR = VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_KMT_BIT,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_HEAP_BIT_KHR = VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_HEAP_BIT,
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_RESOURCE_BIT_KHR = VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_RESOURCE_BIT,
} VkExternalMemoryHandleTypeFlagBits;
or the equivalent
typedef VkExternalMemoryHandleTypeFlagBits VkExternalMemoryHandleTypeFlagBitsKHR;
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT specifies a POSIX file descriptor handle that has only limited valid usage outside of Vulkan and other compatible APIs. It must be compatible with the POSIX system calls dup, dup2, close, and the non-standard system call dup3. Additionally, it must be transportable over a socket using an SCM_RIGHTS control message. It owns a reference to the underlying memory resource represented by its Vulkan memory object.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT specifies an NT handle that has only limited valid usage outside of Vulkan and other compatible APIs. It must be compatible with the functions DuplicateHandle, CloseHandle, CompareObjectHandles, GetHandleInformation, and SetHandleInformation. It owns a reference to the underlying memory resource represented by its Vulkan memory object.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT specifies a global share handle that has only limited valid usage outside of Vulkan and other compatible APIs. It is not compatible with any native APIs. It does not own a reference to the underlying memory resource represented by its Vulkan memory object, and will therefore become invalid when all Vulkan memory objects associated with it are destroyed.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_BIT specifies an NT handle returned by IDXGIResource1::CreateSharedHandle referring to a Direct3D 10 or 11 texture resource. It owns a reference to the memory used by the Direct3D resource.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_KMT_BIT specifies a global share handle returned by IDXGIResource::GetSharedHandle referring to a Direct3D 10 or 11 texture resource. It does not own a reference to the underlying Direct3D resource, and will therefore become invalid when all Vulkan memory objects and Direct3D resources associated with it are destroyed.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_HEAP_BIT specifies an NT handle returned by ID3D12Device::CreateSharedHandle referring to a Direct3D 12 heap resource. It owns a reference to the resources used by the Direct3D heap.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_RESOURCE_BIT specifies an NT handle returned by ID3D12Device::CreateSharedHandle referring to a Direct3D 12 committed resource. It owns a reference to the memory used by the Direct3D resource.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT specifies a host pointer returned by a host memory allocation command. It does not own a reference to the underlying memory resource, and will therefore become invalid if the host memory is freed.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_MAPPED_FOREIGN_MEMORY_BIT_EXT specifies a host pointer to host mapped foreign memory. It does not own a reference to the underlying memory resource, and will therefore become invalid if the foreign memory is unmapped or otherwise becomes no longer available.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT is a file descriptor for a Linux dma_buf. It owns a reference to the underlying memory resource represented by its Vulkan memory object.
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID specifies an AHardwareBuffer object defined by the Android NDK. See Android Hardware Buffers for more details of this handle type.
Some external memory handle types can only be shared within the same underlying physical device and/or the same driver version, as defined in the following table:
[Table: External memory handle type compatibility, listing for each handle type whether the underlying physical device and the driver version must match between the exporting and importing implementations ("Must match") or whether there is no restriction.]
Note
The above table does not restrict the drivers and devices with which
|
Note
Even though the above table does not restrict the drivers and devices with
which |
typedef VkFlags VkExternalMemoryHandleTypeFlags;
or the equivalent
typedef VkExternalMemoryHandleTypeFlags VkExternalMemoryHandleTypeFlagsKHR;
VkExternalMemoryHandleTypeFlags
is a bitmask type for setting a mask
of zero or more VkExternalMemoryHandleTypeFlagBits.
The VkExternalImageFormatProperties
structure is defined as:
typedef struct VkExternalImageFormatProperties {
VkStructureType sType;
void* pNext;
VkExternalMemoryProperties externalMemoryProperties;
} VkExternalImageFormatProperties;
or the equivalent
typedef VkExternalImageFormatProperties VkExternalImageFormatPropertiesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- externalMemoryProperties is an instance of the VkExternalMemoryProperties structure specifying various capabilities of the external handle type when used with the specified image creation parameters.
The VkExternalMemoryProperties
structure is defined as:
typedef struct VkExternalMemoryProperties {
VkExternalMemoryFeatureFlags externalMemoryFeatures;
VkExternalMemoryHandleTypeFlags exportFromImportedHandleTypes;
VkExternalMemoryHandleTypeFlags compatibleHandleTypes;
} VkExternalMemoryProperties;
or the equivalent
typedef VkExternalMemoryProperties VkExternalMemoryPropertiesKHR;
- externalMemoryFeatures is a bitmask of VkExternalMemoryFeatureFlagBits specifying the features of handleType.
- exportFromImportedHandleTypes is a bitmask of VkExternalMemoryHandleTypeFlagBits specifying which types of imported handle handleType can be exported from.
- compatibleHandleTypes is a bitmask of VkExternalMemoryHandleTypeFlagBits specifying handle types which can be specified at the same time as handleType when creating an image compatible with external memory.
compatibleHandleTypes
must include at least handleType
.
Inclusion of a handle type in compatibleHandleTypes
does not imply the
values returned in VkImageFormatProperties2 will be the same when
VkPhysicalDeviceExternalImageFormatInfo::handleType
is set to
that type.
The application is responsible for querying the capabilities of all handle
types intended for concurrent use in a single image and intersecting them to
obtain the compatible set of capabilities.
Bits which may be set in
VkExternalMemoryProperties::externalMemoryFeatures
, specifying
features of an external memory handle type, are:
typedef enum VkExternalMemoryFeatureFlagBits {
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT = 0x00000001,
VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT = 0x00000002,
VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT = 0x00000004,
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT_KHR = VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT,
VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT_KHR = VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT,
VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT_KHR = VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT,
} VkExternalMemoryFeatureFlagBits;
or the equivalent
typedef VkExternalMemoryFeatureFlagBits VkExternalMemoryFeatureFlagBitsKHR;
- VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT specifies that images or buffers created with the specified parameters and handle type must use the mechanisms defined by VkMemoryDedicatedRequirements and VkMemoryDedicatedAllocateInfo to create (or import) a dedicated allocation for the image or buffer.
- VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT specifies that handles of this type can be exported from Vulkan memory objects.
- VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT specifies that handles of this type can be imported as Vulkan memory objects.
Because their semantics in external APIs roughly align with that of an image
or buffer with a dedicated allocation in Vulkan, implementations are
required to report VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT
for
the following external handle types:
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_BIT
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_KMT_BIT
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_RESOURCE_BIT
- VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID, for images only
Implementations must not report
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT
for buffers with
external handle type
VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID
.
typedef VkFlags VkExternalMemoryFeatureFlags;
or the equivalent
typedef VkExternalMemoryFeatureFlags VkExternalMemoryFeatureFlagsKHR;
VkExternalMemoryFeatureFlags
is a bitmask type for setting a mask of
zero or more VkExternalMemoryFeatureFlagBits.
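As an informative, non-normative illustration of the queries described in this section, the following C sketch chains VkPhysicalDeviceExternalImageFormatInfo and VkExternalImageFormatProperties through vkGetPhysicalDeviceImageFormatProperties2 to test whether one handle type is exportable; the format, usage, and handle type are arbitrary values chosen for the sketch.
#include <vulkan/vulkan.h>

// Sketch: query the external memory capabilities of a 2D optimal-tiling image.
VkBool32 isOpaqueFdExportable(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceExternalImageFormatInfo externalInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO,
        .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
    };
    VkPhysicalDeviceImageFormatInfo2 imageInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2,
        .pNext = &externalInfo,
        .format = VK_FORMAT_R8G8B8A8_UNORM,
        .type = VK_IMAGE_TYPE_2D,
        .tiling = VK_IMAGE_TILING_OPTIMAL,
        .usage = VK_IMAGE_USAGE_SAMPLED_BIT,
    };

    VkExternalImageFormatProperties externalProps = {
        .sType = VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES,
    };
    VkImageFormatProperties2 imageProps = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2,
        .pNext = &externalProps,
    };

    if (vkGetPhysicalDeviceImageFormatProperties2(physicalDevice, &imageInfo,
                                                  &imageProps) != VK_SUCCESS)
        return VK_FALSE;  // This parameter combination is not supported at all.

    // The handle type is usable for export only if the implementation reports
    // the corresponding feature bit.
    return (externalProps.externalMemoryProperties.externalMemoryFeatures &
            VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT) != 0;
}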
To query the image capabilities that are compatible with a
Linux DRM format modifier, set
VkPhysicalDeviceImageFormatInfo2::tiling
to
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
and add
VkPhysicalDeviceImageDrmFormatModifierInfoEXT to the pNext
chain
of VkPhysicalDeviceImageFormatInfo2.
The VkPhysicalDeviceImageDrmFormatModifierInfoEXT structure is defined as:
typedef struct VkPhysicalDeviceImageDrmFormatModifierInfoEXT {
VkStructureType sType;
const void* pNext;
uint64_t drmFormatModifier;
VkSharingMode sharingMode;
uint32_t queueFamilyIndexCount;
const uint32_t* pQueueFamilyIndices;
} VkPhysicalDeviceImageDrmFormatModifierInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- drmFormatModifier is the image’s Linux DRM format modifier, corresponding to VkImageDrmFormatModifierExplicitCreateInfoEXT::modifier or to VkImageDrmFormatModifierListCreateInfoEXT::pModifiers.
- sharingMode specifies how the image will be accessed by multiple queue families.
- queueFamilyIndexCount is the number of entries in the pQueueFamilyIndices array.
- pQueueFamilyIndices is a list of queue families that will access the image (ignored if sharingMode is not VK_SHARING_MODE_CONCURRENT).
If the drmFormatModifier
is incompatible with the parameters specified
in VkPhysicalDeviceImageFormatInfo2 and its pNext
chain, then
vkGetPhysicalDeviceImageFormatProperties2 returns
VK_ERROR_FORMAT_NOT_SUPPORTED
.
The implementation must support the query of any drmFormatModifier
,
including unknown and invalid modifier values.
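A minimal, non-normative sketch of the query described above; the modifier value is assumed to come from elsewhere (for example another API or a prior Vulkan query), and error handling is reduced to the VK_ERROR_FORMAT_NOT_SUPPORTED case.
#include <vulkan/vulkan.h>

// Sketch: check whether a candidate DRM format modifier supports the desired
// image parameters.
VkBool32 modifierSupportsUsage(VkPhysicalDevice physicalDevice,
                               uint64_t drmFormatModifier,
                               VkFormat format,
                               VkImageUsageFlags usage)
{
    VkPhysicalDeviceImageDrmFormatModifierInfoEXT modifierInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_MODIFIER_INFO_EXT,
        .drmFormatModifier = drmFormatModifier,
        .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    };
    VkPhysicalDeviceImageFormatInfo2 imageInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2,
        .pNext = &modifierInfo,
        .format = format,
        .type = VK_IMAGE_TYPE_2D,
        .tiling = VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT,
        .usage = usage,
    };
    VkImageFormatProperties2 imageProps = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2,
    };

    // VK_ERROR_FORMAT_NOT_SUPPORTED indicates the modifier is incompatible
    // with the requested parameters, as described above.
    return vkGetPhysicalDeviceImageFormatProperties2(physicalDevice, &imageInfo,
                                                     &imageProps) == VK_SUCCESS;
}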
To determine the number of combined image samplers required to support a
multi-planar format, add VkSamplerYcbcrConversionImageFormatProperties
to the pNext
chain of the VkImageFormatProperties2 structure in
a call to vkGetPhysicalDeviceImageFormatProperties2
.
The VkSamplerYcbcrConversionImageFormatProperties
structure is defined
as:
typedef struct VkSamplerYcbcrConversionImageFormatProperties {
VkStructureType sType;
void* pNext;
uint32_t combinedImageSamplerDescriptorCount;
} VkSamplerYcbcrConversionImageFormatProperties;
or the equivalent
typedef VkSamplerYcbcrConversionImageFormatProperties VkSamplerYcbcrConversionImageFormatPropertiesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- combinedImageSamplerDescriptorCount is the number of combined image sampler descriptors that the implementation uses to access the format.
combinedImageSamplerDescriptorCount
affects only the count towards the
maxDescriptorSetSamplers
, maxDescriptorSetSampledImages
,
maxPerStageDescriptorSamplers
, and
maxPerStageDescriptorSampledImages
limits, and does not affect binding
numbers in the VkDescriptorSetLayoutBinding.
combinedImageSamplerDescriptorCount
is a number between 1 and the
number of planes in the format.
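The following non-normative sketch shows one way an application might retrieve combinedImageSamplerDescriptorCount for a multi-planar format; the format and usage are arbitrary examples.
#include <vulkan/vulkan.h>

// Sketch: determine how many combined image sampler descriptors a
// multi-planar format consumes, so the application can budget against the
// descriptor set limits mentioned above.
uint32_t combinedSamplerDescriptorCount(VkPhysicalDevice physicalDevice,
                                        VkFormat multiPlanarFormat)
{
    VkPhysicalDeviceImageFormatInfo2 imageInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2,
        .format = multiPlanarFormat,   // e.g. VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
        .type = VK_IMAGE_TYPE_2D,
        .tiling = VK_IMAGE_TILING_OPTIMAL,
        .usage = VK_IMAGE_USAGE_SAMPLED_BIT,
    };

    VkSamplerYcbcrConversionImageFormatProperties ycbcrProps = {
        .sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_IMAGE_FORMAT_PROPERTIES,
    };
    VkImageFormatProperties2 imageProps = {
        .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2,
        .pNext = &ycbcrProps,
    };

    if (vkGetPhysicalDeviceImageFormatProperties2(physicalDevice, &imageInfo,
                                                  &imageProps) != VK_SUCCESS)
        return 0;  // Format/usage combination not supported.

    return ycbcrProps.combinedImageSamplerDescriptorCount;
}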
To obtain optimal Android hardware buffer usage flags for specific image
creation parameters, attach an instance of
VkAndroidHardwareBufferUsageANDROID
to the pNext
chain of a
VkImageFormatProperties2 structure passed to
vkGetPhysicalDeviceImageFormatProperties2.
This structure is defined as:
typedef struct VkAndroidHardwareBufferUsageANDROID {
VkStructureType sType;
void* pNext;
uint64_t androidHardwareBufferUsage;
} VkAndroidHardwareBufferUsageANDROID;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- androidHardwareBufferUsage returns the Android hardware buffer usage flags.
The androidHardwareBufferUsage
field must include Android hardware
buffer usage flags listed in the
AHardwareBuffer Usage
Equivalence table when the corresponding Vulkan image usage or image
creation flags are included in the usage
or flags
fields of
VkPhysicalDeviceImageFormatInfo2.
It must include at least one GPU usage flag (AHARDWAREBUFFER_USAGE_GPU_*), even if none of the corresponding Vulkan usages or flags are requested.
Note
Requiring at least one GPU usage flag ensures that Android hardware buffer memory will be allocated in a memory pool accessible to the Vulkan implementation, and that specializing the memory layout based on usage flags does not prevent it from being compatible with Vulkan. Implementations may avoid unnecessary restrictions caused by this requirement by using vendor usage flags to indicate that only the Vulkan uses indicated in VkImageFormatProperties2 are required. |
35.5.1. Supported Sample Counts
vkGetPhysicalDeviceImageFormatProperties
returns a bitmask of
VkSampleCountFlagBits in sampleCounts
specifying the supported
sample counts for the image parameters.
sampleCounts
will be set to VK_SAMPLE_COUNT_1_BIT
if at least
one of the following conditions is true:
- tiling is VK_IMAGE_TILING_LINEAR
- type is not VK_IMAGE_TYPE_2D
- flags contains VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
- Neither the VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT flag nor the VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT flag in VkFormatProperties::optimalTilingFeatures returned by vkGetPhysicalDeviceFormatProperties is set
- VkPhysicalDeviceExternalImageFormatInfoKHR::handleType is an external handle type for which multisampled image support is not required
- usage contains VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV
- usage contains VK_IMAGE_USAGE_FRAGMENT_DENSITY_MAP_BIT_EXT
Otherwise, the bits set in sampleCounts
will be the sample counts
supported for the specified values of usage
and format
.
For each bit set in usage
, the supported sample counts relate to the
limits in VkPhysicalDeviceLimits
as follows:
- If usage includes VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT and format is a floating- or fixed-point color format, a superset of VkPhysicalDeviceLimits::framebufferColorSampleCounts
- If usage includes VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, and format includes a depth aspect, a superset of VkPhysicalDeviceLimits::framebufferDepthSampleCounts
- If usage includes VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, and format includes a stencil aspect, a superset of VkPhysicalDeviceLimits::framebufferStencilSampleCounts
- If usage includes VK_IMAGE_USAGE_SAMPLED_BIT, and format includes a color aspect, a superset of VkPhysicalDeviceLimits::sampledImageColorSampleCounts
- If usage includes VK_IMAGE_USAGE_SAMPLED_BIT, and format includes a depth aspect, a superset of VkPhysicalDeviceLimits::sampledImageDepthSampleCounts
- If usage includes VK_IMAGE_USAGE_SAMPLED_BIT, and format is an integer format, a superset of VkPhysicalDeviceLimits::sampledImageIntegerSampleCounts
- If usage includes VK_IMAGE_USAGE_STORAGE_BIT, a superset of VkPhysicalDeviceLimits::storageImageSampleCounts
If multiple bits are set in usage
, sampleCounts
will be the
intersection of the per-usage values described above.
If none of the bits described above are set in usage
, then there is no
corresponding limit in VkPhysicalDeviceLimits
.
In this case, sampleCounts
must include at least
VK_SAMPLE_COUNT_1_BIT
.
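A brief, non-normative sketch of how an application might apply these rules to test for 4x multisampling support; the format and usage are arbitrary examples.
#include <vulkan/vulkan.h>

// Sketch: check whether 4x multisampling is available for a color attachment
// with the given format, using the sampleCounts rules above.
VkBool32 supports4xColorAttachment(VkPhysicalDevice physicalDevice, VkFormat format)
{
    VkImageFormatProperties props;
    VkResult result = vkGetPhysicalDeviceImageFormatProperties(
        physicalDevice, format, VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_OPTIMAL,
        VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT, 0 /* flags */, &props);

    if (result != VK_SUCCESS)
        return VK_FALSE;  // Image parameters not supported at all.

    return (props.sampleCounts & VK_SAMPLE_COUNT_4_BIT) != 0;
}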
35.5.2. Allowed Extent Values Based On Image Type
Implementations may support extent values larger than the required minimum/maximum values for certain types of images subject to the constraints below.
Note
Implementations must support images with dimensions up to the required minimum/maximum values for all types of images. It follows that the query for additional capabilities must return extent values that are at least as large as the required values. |
For VK_IMAGE_TYPE_1D
:
- maxExtent.width ≥ VkPhysicalDeviceLimits.maxImageDimension1D
- maxExtent.height = 1
- maxExtent.depth = 1
For VK_IMAGE_TYPE_2D
when flags
does not contain
VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
:
- maxExtent.width ≥ VkPhysicalDeviceLimits.maxImageDimension2D
- maxExtent.height ≥ VkPhysicalDeviceLimits.maxImageDimension2D
- maxExtent.depth = 1
For VK_IMAGE_TYPE_2D
when flags
contains
VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
:
- maxExtent.width ≥ VkPhysicalDeviceLimits.maxImageDimensionCube
- maxExtent.height ≥ VkPhysicalDeviceLimits.maxImageDimensionCube
- maxExtent.depth = 1
For VK_IMAGE_TYPE_3D
:
- maxExtent.width ≥ VkPhysicalDeviceLimits.maxImageDimension3D
- maxExtent.height ≥ VkPhysicalDeviceLimits.maxImageDimension3D
- maxExtent.depth ≥ VkPhysicalDeviceLimits.maxImageDimension3D
35.6. Additional Buffer Capabilities
In addition to the capabilities described in the previous sections (Limits and Formats), implementations may support additional buffer capabilities.
To query the external handle types supported by buffers, call:
void vkGetPhysicalDeviceExternalBufferProperties(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceExternalBufferInfo* pExternalBufferInfo,
VkExternalBufferProperties* pExternalBufferProperties);
or the equivalent command
void vkGetPhysicalDeviceExternalBufferPropertiesKHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceExternalBufferInfo* pExternalBufferInfo,
VkExternalBufferProperties* pExternalBufferProperties);
- physicalDevice is the physical device from which to query the buffer capabilities.
- pExternalBufferInfo points to an instance of the VkPhysicalDeviceExternalBufferInfo structure, describing the parameters that would be consumed by vkCreateBuffer.
- pExternalBufferProperties points to an instance of the VkExternalBufferProperties structure in which capabilities are returned.
The VkPhysicalDeviceExternalBufferInfo
structure is defined as:
typedef struct VkPhysicalDeviceExternalBufferInfo {
VkStructureType sType;
const void* pNext;
VkBufferCreateFlags flags;
VkBufferUsageFlags usage;
VkExternalMemoryHandleTypeFlagBits handleType;
} VkPhysicalDeviceExternalBufferInfo;
or the equivalent
typedef VkPhysicalDeviceExternalBufferInfo VkPhysicalDeviceExternalBufferInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkBufferCreateFlagBits describing additional parameters of the buffer, corresponding to VkBufferCreateInfo::flags.
- usage is a bitmask of VkBufferUsageFlagBits describing the intended usage of the buffer, corresponding to VkBufferCreateInfo::usage.
- handleType is a VkExternalMemoryHandleTypeFlagBits value specifying the memory handle type that will be used with the memory associated with the buffer.
The VkExternalBufferProperties
structure is defined as:
typedef struct VkExternalBufferProperties {
VkStructureType sType;
void* pNext;
VkExternalMemoryProperties externalMemoryProperties;
} VkExternalBufferProperties;
or the equivalent
typedef VkExternalBufferProperties VkExternalBufferPropertiesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- externalMemoryProperties is an instance of the VkExternalMemoryProperties structure specifying various capabilities of the external handle type when used with the specified buffer creation parameters.
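As a non-normative illustration, the following sketch queries whether buffer memory created with an arbitrary example usage could be exported as an opaque POSIX file descriptor.
#include <vulkan/vulkan.h>

// Sketch: determine whether buffer memory with the given usage can be
// exported as an opaque file descriptor.
VkBool32 canExportBufferAsOpaqueFd(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceExternalBufferInfo bufferInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_BUFFER_INFO,
        .usage = VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
        .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
    };
    VkExternalBufferProperties bufferProps = {
        .sType = VK_STRUCTURE_TYPE_EXTERNAL_BUFFER_PROPERTIES,
    };

    vkGetPhysicalDeviceExternalBufferProperties(physicalDevice, &bufferInfo,
                                                &bufferProps);

    return (bufferProps.externalMemoryProperties.externalMemoryFeatures &
            VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT) != 0;
}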
35.7. Optional Semaphore Capabilities
Semaphores may support import and export of their payload to external handles. To query the external handle types supported by semaphores, call:
void vkGetPhysicalDeviceExternalSemaphoreProperties(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceExternalSemaphoreInfo* pExternalSemaphoreInfo,
VkExternalSemaphoreProperties* pExternalSemaphoreProperties);
or the equivalent command
void vkGetPhysicalDeviceExternalSemaphorePropertiesKHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceExternalSemaphoreInfo* pExternalSemaphoreInfo,
VkExternalSemaphoreProperties* pExternalSemaphoreProperties);
- physicalDevice is the physical device from which to query the semaphore capabilities.
- pExternalSemaphoreInfo points to an instance of the VkPhysicalDeviceExternalSemaphoreInfo structure, describing the parameters that would be consumed by vkCreateSemaphore.
- pExternalSemaphoreProperties points to an instance of the VkExternalSemaphoreProperties structure in which capabilities are returned.
The VkPhysicalDeviceExternalSemaphoreInfo
structure is defined as:
typedef struct VkPhysicalDeviceExternalSemaphoreInfo {
VkStructureType sType;
const void* pNext;
VkExternalSemaphoreHandleTypeFlagBits handleType;
} VkPhysicalDeviceExternalSemaphoreInfo;
or the equivalent
typedef VkPhysicalDeviceExternalSemaphoreInfo VkPhysicalDeviceExternalSemaphoreInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- handleType is a VkExternalSemaphoreHandleTypeFlagBits value specifying the external semaphore handle type for which capabilities will be returned.
Bits which may be set in
VkPhysicalDeviceExternalSemaphoreInfo::handleType
, specifying an
external semaphore handle type, are:
typedef enum VkExternalSemaphoreHandleTypeFlagBits {
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT = 0x00000001,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT = 0x00000002,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT = 0x00000004,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT = 0x00000008,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT = 0x00000010,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT_KHR = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT_KHR = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT_KHR = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT,
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT,
} VkExternalSemaphoreHandleTypeFlagBits;
or the equivalent
typedef VkExternalSemaphoreHandleTypeFlagBits VkExternalSemaphoreHandleTypeFlagBitsKHR;
- VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT specifies a POSIX file descriptor handle that has only limited valid usage outside of Vulkan and other compatible APIs. It must be compatible with the POSIX system calls dup, dup2, close, and the non-standard system call dup3. Additionally, it must be transportable over a socket using an SCM_RIGHTS control message. It owns a reference to the underlying synchronization primitive represented by its Vulkan semaphore object.
- VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT specifies an NT handle that has only limited valid usage outside of Vulkan and other compatible APIs. It must be compatible with the functions DuplicateHandle, CloseHandle, CompareObjectHandles, GetHandleInformation, and SetHandleInformation. It owns a reference to the underlying synchronization primitive represented by its Vulkan semaphore object.
- VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT specifies a global share handle that has only limited valid usage outside of Vulkan and other compatible APIs. It is not compatible with any native APIs. It does not own a reference to the underlying synchronization primitive represented by its Vulkan semaphore object, and will therefore become invalid when all Vulkan semaphore objects associated with it are destroyed.
- VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT specifies an NT handle returned by ID3D12Device::CreateSharedHandle referring to a Direct3D 12 fence. It owns a reference to the underlying synchronization primitive associated with the Direct3D fence.
- VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT specifies a POSIX file descriptor handle to a Linux Sync File or Android Fence object. It can be used with any native API accepting a valid sync file or fence as input. It owns a reference to the underlying synchronization primitive associated with the file descriptor. Implementations which support importing this handle type must accept any type of sync or fence FD supported by the native system they are running on.
Note
Handles of type |
Some external semaphore handle types can only be shared within the same underlying physical device and/or the same driver version, as defined in the following table:
Handle type | VkPhysicalDeviceIDProperties::driverUUID | VkPhysicalDeviceIDProperties::deviceUUID |
---|---|---|
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT | Must match | Must match |
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT | Must match | Must match |
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT | Must match | Must match |
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT | Must match | Must match |
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT | No restriction | No restriction |
typedef VkFlags VkExternalSemaphoreHandleTypeFlags;
or the equivalent
typedef VkExternalSemaphoreHandleTypeFlags VkExternalSemaphoreHandleTypeFlagsKHR;
VkExternalSemaphoreHandleTypeFlags
is a bitmask type for setting a
mask of zero or more VkExternalSemaphoreHandleTypeFlagBits.
The VkExternalSemaphoreProperties
structure is defined as:
typedef struct VkExternalSemaphoreProperties {
VkStructureType sType;
void* pNext;
VkExternalSemaphoreHandleTypeFlags exportFromImportedHandleTypes;
VkExternalSemaphoreHandleTypeFlags compatibleHandleTypes;
VkExternalSemaphoreFeatureFlags externalSemaphoreFeatures;
} VkExternalSemaphoreProperties;
or the equivalent
typedef VkExternalSemaphoreProperties VkExternalSemaphorePropertiesKHR;
- exportFromImportedHandleTypes is a bitmask of VkExternalSemaphoreHandleTypeFlagBits specifying which types of imported handle handleType can be exported from.
- compatibleHandleTypes is a bitmask of VkExternalSemaphoreHandleTypeFlagBits specifying handle types which can be specified at the same time as handleType when creating a semaphore.
- externalSemaphoreFeatures is a bitmask of VkExternalSemaphoreFeatureFlagBits describing the features of handleType.
If handleType
is not supported by the implementation, then
VkExternalSemaphoreProperties::externalSemaphoreFeatures
will be
set to zero.
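A non-normative sketch of this query, checking whether opaque file descriptor handles are both exportable and importable; the handle type is an arbitrary example.
#include <vulkan/vulkan.h>

// Sketch: check whether opaque-FD semaphore handles can be both exported and
// imported on this physical device.
VkBool32 supportsOpaqueFdSemaphores(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceExternalSemaphoreInfo semaphoreInfo = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_SEMAPHORE_INFO,
        .handleType = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT,
    };
    VkExternalSemaphoreProperties semaphoreProps = {
        .sType = VK_STRUCTURE_TYPE_EXTERNAL_SEMAPHORE_PROPERTIES,
    };

    vkGetPhysicalDeviceExternalSemaphoreProperties(physicalDevice, &semaphoreInfo,
                                                   &semaphoreProps);

    // A handle type the implementation does not support reports zero features.
    const VkExternalSemaphoreFeatureFlags required =
        VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT |
        VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT;
    return (semaphoreProps.externalSemaphoreFeatures & required) == required;
}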
Possible values of
VkExternalSemaphoreProperties::externalSemaphoreFeatures
,
specifying the features of an external semaphore handle type, are:
typedef enum VkExternalSemaphoreFeatureFlagBits {
VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT = 0x00000001,
VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT = 0x00000002,
VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT_KHR = VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT,
VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT_KHR = VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT,
} VkExternalSemaphoreFeatureFlagBits;
or the equivalent
typedef VkExternalSemaphoreFeatureFlagBits VkExternalSemaphoreFeatureFlagBitsKHR;
- VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT specifies that handles of this type can be exported from Vulkan semaphore objects.
- VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT specifies that handles of this type can be imported as Vulkan semaphore objects.
typedef VkFlags VkExternalSemaphoreFeatureFlags;
or the equivalent
typedef VkExternalSemaphoreFeatureFlags VkExternalSemaphoreFeatureFlagsKHR;
VkExternalSemaphoreFeatureFlags
is a bitmask type for setting a mask
of zero or more VkExternalSemaphoreFeatureFlagBits.
35.8. Optional Fence Capabilities
Fences may support import and export of their payload to external handles. To query the external handle types supported by fences, call:
void vkGetPhysicalDeviceExternalFenceProperties(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceExternalFenceInfo* pExternalFenceInfo,
VkExternalFenceProperties* pExternalFenceProperties);
or the equivalent command
void vkGetPhysicalDeviceExternalFencePropertiesKHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceExternalFenceInfo* pExternalFenceInfo,
VkExternalFenceProperties* pExternalFenceProperties);
- physicalDevice is the physical device from which to query the fence capabilities.
- pExternalFenceInfo points to an instance of the VkPhysicalDeviceExternalFenceInfo structure, describing the parameters that would be consumed by vkCreateFence.
- pExternalFenceProperties points to an instance of the VkExternalFenceProperties structure in which capabilities are returned.
The VkPhysicalDeviceExternalFenceInfo
structure is defined as:
typedef struct VkPhysicalDeviceExternalFenceInfo {
VkStructureType sType;
const void* pNext;
VkExternalFenceHandleTypeFlagBits handleType;
} VkPhysicalDeviceExternalFenceInfo;
or the equivalent
typedef VkPhysicalDeviceExternalFenceInfo VkPhysicalDeviceExternalFenceInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- handleType is a VkExternalFenceHandleTypeFlagBits value indicating an external fence handle type for which capabilities will be returned.
Note
Handles of type |
Bits which may be set in
VkPhysicalDeviceExternalFenceInfo::handleType
, and in the
exportFromImportedHandleTypes
and compatibleHandleTypes
members
of VkExternalFenceProperties, to indicate external fence handle types,
are:
typedef enum VkExternalFenceHandleTypeFlagBits {
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT = 0x00000001,
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT = 0x00000002,
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT = 0x00000004,
VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT = 0x00000008,
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT,
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT_KHR = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT,
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT_KHR = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT,
VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR = VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT,
} VkExternalFenceHandleTypeFlagBits;
or the equivalent
typedef VkExternalFenceHandleTypeFlagBits VkExternalFenceHandleTypeFlagBitsKHR;
- VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT specifies a POSIX file descriptor handle that has only limited valid usage outside of Vulkan and other compatible APIs. It must be compatible with the POSIX system calls dup, dup2, close, and the non-standard system call dup3. Additionally, it must be transportable over a socket using an SCM_RIGHTS control message. It owns a reference to the underlying synchronization primitive represented by its Vulkan fence object.
- VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT specifies an NT handle that has only limited valid usage outside of Vulkan and other compatible APIs. It must be compatible with the functions DuplicateHandle, CloseHandle, CompareObjectHandles, GetHandleInformation, and SetHandleInformation. It owns a reference to the underlying synchronization primitive represented by its Vulkan fence object.
- VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT specifies a global share handle that has only limited valid usage outside of Vulkan and other compatible APIs. It is not compatible with any native APIs. It does not own a reference to the underlying synchronization primitive represented by its Vulkan fence object, and will therefore become invalid when all Vulkan fence objects associated with it are destroyed.
- VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT specifies a POSIX file descriptor handle to a Linux Sync File or Android Fence. It can be used with any native API accepting a valid sync file or fence as input. It owns a reference to the underlying synchronization primitive associated with the file descriptor. Implementations which support importing this handle type must accept any type of sync or fence FD supported by the native system they are running on.
Some external fence handle types can only be shared within the same underlying physical device and/or the same driver version, as defined in the following table:
Handle type | VkPhysicalDeviceIDProperties::driverUUID | VkPhysicalDeviceIDProperties::deviceUUID |
---|---|---|
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT | Must match | Must match |
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_BIT | Must match | Must match |
VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT | Must match | Must match |
VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT | No restriction | No restriction |
typedef VkFlags VkExternalFenceHandleTypeFlags;
or the equivalent
typedef VkExternalFenceHandleTypeFlags VkExternalFenceHandleTypeFlagsKHR;
VkExternalFenceHandleTypeFlags
is a bitmask type for setting a mask of
zero or more VkExternalFenceHandleTypeFlagBits.
The VkExternalFenceProperties
structure is defined as:
typedef struct VkExternalFenceProperties {
VkStructureType sType;
void* pNext;
VkExternalFenceHandleTypeFlags exportFromImportedHandleTypes;
VkExternalFenceHandleTypeFlags compatibleHandleTypes;
VkExternalFenceFeatureFlags externalFenceFeatures;
} VkExternalFenceProperties;
or the equivalent
typedef VkExternalFenceProperties VkExternalFencePropertiesKHR;
- exportFromImportedHandleTypes is a bitmask of VkExternalFenceHandleTypeFlagBits indicating which types of imported handle handleType can be exported from.
- compatibleHandleTypes is a bitmask of VkExternalFenceHandleTypeFlagBits specifying handle types which can be specified at the same time as handleType when creating a fence.
- externalFenceFeatures is a bitmask of VkExternalFenceFeatureFlagBits indicating the features of handleType.
If handleType
is not supported by the implementation, then
VkExternalFenceProperties::externalFenceFeatures
will be set to
zero.
Bits which may be set in
VkExternalFenceProperties::externalFenceFeatures
, indicating
features of a fence external handle type, are:
typedef enum VkExternalFenceFeatureFlagBits {
VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT = 0x00000001,
VK_EXTERNAL_FENCE_FEATURE_IMPORTABLE_BIT = 0x00000002,
VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT_KHR = VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT,
VK_EXTERNAL_FENCE_FEATURE_IMPORTABLE_BIT_KHR = VK_EXTERNAL_FENCE_FEATURE_IMPORTABLE_BIT,
} VkExternalFenceFeatureFlagBits;
or the equivalent
typedef VkExternalFenceFeatureFlagBits VkExternalFenceFeatureFlagBitsKHR;
- VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT specifies handles of this type can be exported from Vulkan fence objects.
- VK_EXTERNAL_FENCE_FEATURE_IMPORTABLE_BIT specifies handles of this type can be imported to Vulkan fence objects.
typedef VkFlags VkExternalFenceFeatureFlags;
or the equivalent
typedef VkExternalFenceFeatureFlags VkExternalFenceFeatureFlagsKHR;
VkExternalFenceFeatureFlags
is a bitmask type for setting a mask of
zero or more VkExternalFenceFeatureFlagBits.
35.9. Timestamp Calibration Capabilities
To query the set of time domains for which a physical device supports timestamp calibration, call:
VkResult vkGetPhysicalDeviceCalibrateableTimeDomainsEXT(
VkPhysicalDevice physicalDevice,
uint32_t* pTimeDomainCount,
VkTimeDomainEXT* pTimeDomains);
- physicalDevice is the physical device from which to query the set of calibrateable time domains.
- pTimeDomainCount is a pointer to an integer related to the number of calibrateable time domains available or queried, as described below.
- pTimeDomains is either NULL or a pointer to an array of VkTimeDomainEXT values, indicating the supported calibrateable time domains.
If pTimeDomains
is NULL
, then the number of calibrateable time
domains supported for the given physicalDevice
is returned in
pTimeDomainCount
.
Otherwise, pTimeDomainCount
must point to a variable set by the user
to the number of elements in the pTimeDomains
array, and on return the
variable is overwritten with the number of values actually written to
pTimeDomains
.
If the value of pTimeDomainCount
is less than the number of
calibrateable time domains supported, at most pTimeDomainCount
values
will be written to pTimeDomains
.
If pTimeDomainCount
is smaller than the number of calibrateable time
domains supported for the given physicalDevice
, VK_INCOMPLETE
will be returned instead of VK_SUCCESS
to indicate that not all the
available values were returned.
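A non-normative sketch of the enumeration pattern described above. Because vkGetPhysicalDeviceCalibrateableTimeDomainsEXT is an extension command, the sketch assumes its address has been retrieved with vkGetInstanceProcAddr and is passed in by the caller.
#include <stdlib.h>
#include <vulkan/vulkan.h>

// Sketch of the two-call enumeration pattern: first query the count, then
// retrieve the time domains.
uint32_t countCalibrateableTimeDomains(VkPhysicalDevice physicalDevice,
    PFN_vkGetPhysicalDeviceCalibrateableTimeDomainsEXT pfnGetTimeDomains)
{
    uint32_t count = 0;
    pfnGetTimeDomains(physicalDevice, &count, NULL);   // First call: get the count.
    if (count == 0)
        return 0;

    VkTimeDomainEXT *domains = malloc(count * sizeof(*domains));
    if (domains == NULL)
        return 0;

    // Second call: fill the array. VK_INCOMPLETE would indicate that fewer
    // entries were requested than are available.
    VkResult result = pfnGetTimeDomains(physicalDevice, &count, domains);
    free(domains);
    return (result == VK_SUCCESS || result == VK_INCOMPLETE) ? count : 0;
}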
36. Debugging
To aid developers in tracking down errors in the application’s use of Vulkan, particularly in combination with an external debugger or profiler, debugging extensions may be available.
The VkObjectType enumeration defines values, each of which corresponds to a specific Vulkan handle type. These values can be used to associate debug information with a particular type of object through one or more extensions.
typedef enum VkObjectType {
VK_OBJECT_TYPE_UNKNOWN = 0,
VK_OBJECT_TYPE_INSTANCE = 1,
VK_OBJECT_TYPE_PHYSICAL_DEVICE = 2,
VK_OBJECT_TYPE_DEVICE = 3,
VK_OBJECT_TYPE_QUEUE = 4,
VK_OBJECT_TYPE_SEMAPHORE = 5,
VK_OBJECT_TYPE_COMMAND_BUFFER = 6,
VK_OBJECT_TYPE_FENCE = 7,
VK_OBJECT_TYPE_DEVICE_MEMORY = 8,
VK_OBJECT_TYPE_BUFFER = 9,
VK_OBJECT_TYPE_IMAGE = 10,
VK_OBJECT_TYPE_EVENT = 11,
VK_OBJECT_TYPE_QUERY_POOL = 12,
VK_OBJECT_TYPE_BUFFER_VIEW = 13,
VK_OBJECT_TYPE_IMAGE_VIEW = 14,
VK_OBJECT_TYPE_SHADER_MODULE = 15,
VK_OBJECT_TYPE_PIPELINE_CACHE = 16,
VK_OBJECT_TYPE_PIPELINE_LAYOUT = 17,
VK_OBJECT_TYPE_RENDER_PASS = 18,
VK_OBJECT_TYPE_PIPELINE = 19,
VK_OBJECT_TYPE_DESCRIPTOR_SET_LAYOUT = 20,
VK_OBJECT_TYPE_SAMPLER = 21,
VK_OBJECT_TYPE_DESCRIPTOR_POOL = 22,
VK_OBJECT_TYPE_DESCRIPTOR_SET = 23,
VK_OBJECT_TYPE_FRAMEBUFFER = 24,
VK_OBJECT_TYPE_COMMAND_POOL = 25,
VK_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION = 1000156000,
VK_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE = 1000085000,
VK_OBJECT_TYPE_SURFACE_KHR = 1000000000,
VK_OBJECT_TYPE_SWAPCHAIN_KHR = 1000001000,
VK_OBJECT_TYPE_DISPLAY_KHR = 1000002000,
VK_OBJECT_TYPE_DISPLAY_MODE_KHR = 1000002001,
VK_OBJECT_TYPE_DEBUG_REPORT_CALLBACK_EXT = 1000011000,
VK_OBJECT_TYPE_OBJECT_TABLE_NVX = 1000086000,
VK_OBJECT_TYPE_INDIRECT_COMMANDS_LAYOUT_NVX = 1000086001,
VK_OBJECT_TYPE_DEBUG_UTILS_MESSENGER_EXT = 1000128000,
VK_OBJECT_TYPE_VALIDATION_CACHE_EXT = 1000160000,
VK_OBJECT_TYPE_ACCELERATION_STRUCTURE_NV = 1000165000,
VK_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_KHR = VK_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE,
VK_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION_KHR = VK_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION,
} VkObjectType;
VkObjectType | Vulkan Handle Type |
---|---|
VK_OBJECT_TYPE_UNKNOWN | Unknown/Undefined Handle |
VK_OBJECT_TYPE_INSTANCE | VkInstance |
VK_OBJECT_TYPE_PHYSICAL_DEVICE | VkPhysicalDevice |
VK_OBJECT_TYPE_DEVICE | VkDevice |
VK_OBJECT_TYPE_QUEUE | VkQueue |
VK_OBJECT_TYPE_SEMAPHORE | VkSemaphore |
VK_OBJECT_TYPE_COMMAND_BUFFER | VkCommandBuffer |
VK_OBJECT_TYPE_FENCE | VkFence |
VK_OBJECT_TYPE_DEVICE_MEMORY | VkDeviceMemory |
VK_OBJECT_TYPE_BUFFER | VkBuffer |
VK_OBJECT_TYPE_IMAGE | VkImage |
VK_OBJECT_TYPE_EVENT | VkEvent |
VK_OBJECT_TYPE_QUERY_POOL | VkQueryPool |
VK_OBJECT_TYPE_BUFFER_VIEW | VkBufferView |
VK_OBJECT_TYPE_IMAGE_VIEW | VkImageView |
VK_OBJECT_TYPE_SHADER_MODULE | VkShaderModule |
VK_OBJECT_TYPE_PIPELINE_CACHE | VkPipelineCache |
VK_OBJECT_TYPE_PIPELINE_LAYOUT | VkPipelineLayout |
VK_OBJECT_TYPE_RENDER_PASS | VkRenderPass |
VK_OBJECT_TYPE_PIPELINE | VkPipeline |
VK_OBJECT_TYPE_DESCRIPTOR_SET_LAYOUT | VkDescriptorSetLayout |
VK_OBJECT_TYPE_SAMPLER | VkSampler |
VK_OBJECT_TYPE_DESCRIPTOR_POOL | VkDescriptorPool |
VK_OBJECT_TYPE_DESCRIPTOR_SET | VkDescriptorSet |
VK_OBJECT_TYPE_FRAMEBUFFER | VkFramebuffer |
VK_OBJECT_TYPE_COMMAND_POOL | VkCommandPool |
VK_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION | VkSamplerYcbcrConversion |
VK_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE | VkDescriptorUpdateTemplate |
VK_OBJECT_TYPE_SURFACE_KHR | VkSurfaceKHR |
VK_OBJECT_TYPE_SWAPCHAIN_KHR | VkSwapchainKHR |
VK_OBJECT_TYPE_DISPLAY_KHR | VkDisplayKHR |
VK_OBJECT_TYPE_DISPLAY_MODE_KHR | VkDisplayModeKHR |
VK_OBJECT_TYPE_DEBUG_REPORT_CALLBACK_EXT | VkDebugReportCallbackEXT |
VK_OBJECT_TYPE_OBJECT_TABLE_NVX | VkObjectTableNVX |
VK_OBJECT_TYPE_INDIRECT_COMMANDS_LAYOUT_NVX | VkIndirectCommandsLayoutNVX |
VK_OBJECT_TYPE_DEBUG_UTILS_MESSENGER_EXT | VkDebugUtilsMessengerEXT |
VK_OBJECT_TYPE_VALIDATION_CACHE_EXT | VkValidationCacheEXT |
VK_OBJECT_TYPE_ACCELERATION_STRUCTURE_NV | VkAccelerationStructureNV |
If this Specification was generated with any such extensions included, they will be described in the remainder of this chapter.
36.1. Debug Utilities
Vulkan provides flexible debugging utilities for debugging an application.
The Object Debug Annotation section describes how to associate either a name or binary data with a specific Vulkan object.
The Queue Labels section describes how to annotate and group the work submitted to a queue.
The Command Buffer Labels section describes how to associate logical elements of the scene with commands in a VkCommandBuffer.
The Debug Messengers section describes how to create debug messenger objects associated with an application supplied callback to capture debug messages from a variety of Vulkan components.
36.1.1. Object Debug Annotation
It can be useful for an application to provide its own content relative to a specific Vulkan object. The following commands allow application developers to associate user-defined information with Vulkan objects.
Object Naming
An object can be provided a user-defined name by calling
vkSetDebugUtilsObjectNameEXT
as defined below.
VkResult vkSetDebugUtilsObjectNameEXT(
VkDevice device,
const VkDebugUtilsObjectNameInfoEXT* pNameInfo);
- device is the device that created the object.
- pNameInfo is a pointer to an instance of the VkDebugUtilsObjectNameInfoEXT structure specifying the parameters of the name to set on the object.
The VkDebugUtilsObjectNameInfoEXT
structure is defined as:
typedef struct VkDebugUtilsObjectNameInfoEXT {
VkStructureType sType;
const void* pNext;
VkObjectType objectType;
uint64_t objectHandle;
const char* pObjectName;
} VkDebugUtilsObjectNameInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- objectType is a VkObjectType specifying the type of the object to be named.
- objectHandle is the object to be named.
- pObjectName is a null-terminated UTF-8 string specifying the name to apply to objectHandle.
Applications may change the name associated with an object simply by
calling vkSetDebugUtilsObjectNameEXT
again with a new string.
If pObjectName
is an empty string, then any previously set name is
removed.
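A non-normative sketch of object naming; the object, its name, and the hypothetical helper are illustrative only, and the entry point is assumed to have been obtained with vkGetInstanceProcAddr.
#include <vulkan/vulkan.h>

// Sketch: give a buffer a human-readable name that debuggers and validation
// messages can display.
void nameBuffer(VkDevice device, VkBuffer buffer, const char *name,
                PFN_vkSetDebugUtilsObjectNameEXT pfnSetObjectName)
{
    const VkDebugUtilsObjectNameInfoEXT nameInfo = {
        .sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_NAME_INFO_EXT,
        .objectType = VK_OBJECT_TYPE_BUFFER,
        .objectHandle = (uint64_t)buffer,   // non-dispatchable handles fit in 64 bits
        .pObjectName = name,                // e.g. "Scene vertex buffer"
    };
    pfnSetObjectName(device, &nameInfo);
}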
Object Data Association
In addition to setting a name for an object, debugging and validation layers may have uses for additional binary data on a per-object basis that have no other place in the Vulkan API.
For example, a VkShaderModule
could have additional debugging data
attached to it to aid in offline shader tracing.
Additional data can be attached to an object by calling
vkSetDebugUtilsObjectTagEXT
as defined below.
VkResult vkSetDebugUtilsObjectTagEXT(
VkDevice device,
const VkDebugUtilsObjectTagInfoEXT* pTagInfo);
- device is the device that created the object.
- pTagInfo is a pointer to an instance of the VkDebugUtilsObjectTagInfoEXT structure specifying the parameters of the tag to attach to the object.
The VkDebugUtilsObjectTagInfoEXT
structure is defined as:
typedef struct VkDebugUtilsObjectTagInfoEXT {
VkStructureType sType;
const void* pNext;
VkObjectType objectType;
uint64_t objectHandle;
uint64_t tagName;
size_t tagSize;
const void* pTag;
} VkDebugUtilsObjectTagInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- objectType is a VkObjectType specifying the type of the object to be named.
- objectHandle is the object to be tagged.
- tagName is a numerical identifier of the tag.
- tagSize is the number of bytes of data to attach to the object.
- pTag is an array of tagSize bytes containing the data to be associated with the object.
The tagName
parameter gives a name or identifier to the type of data
being tagged.
This can be used by debugging layers to easily filter for only data that can
be used by that implementation.
36.1.2. Queue Labels
All Vulkan work must be submitted using queues. It is possible for an application to use multiple queues, each containing multiple command buffers, when performing work. It can be useful to identify which queue, or even where in a queue, something has occurred.
To begin identifying a region using a debug label inside a queue, you may use the vkQueueBeginDebugUtilsLabelEXT command.
Then, when the region of interest has passed, you may end the label region using vkQueueEndDebugUtilsLabelEXT.
Additionally, a single debug label may be inserted at any time using vkQueueInsertDebugUtilsLabelEXT.
A queue debug label region is opened by calling:
void vkQueueBeginDebugUtilsLabelEXT(
VkQueue queue,
const VkDebugUtilsLabelEXT* pLabelInfo);
- queue is the queue in which to start a debug label region.
- pLabelInfo is a pointer to an instance of the VkDebugUtilsLabelEXT structure specifying the parameters of the label region to open.
The VkDebugUtilsLabelEXT
structure is defined as:
typedef struct VkDebugUtilsLabelEXT {
VkStructureType sType;
const void* pNext;
const char* pLabelName;
float color[4];
} VkDebugUtilsLabelEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- pLabelName is a pointer to a null-terminated UTF-8 string that contains the name of the label.
- color is an optional RGBA color value that can be associated with the label. A particular implementation may choose to ignore this color value. The values contain RGBA values in order, in the range 0.0 to 1.0. If all elements in color are set to 0.0 then it is ignored.
A queue debug label region is closed by calling:
void vkQueueEndDebugUtilsLabelEXT(
VkQueue queue);
- queue is the queue in which a debug label region should be closed.
The calls to vkQueueBeginDebugUtilsLabelEXT and vkQueueEndDebugUtilsLabelEXT must be matched and balanced.
A single label can be inserted into a queue by calling:
void vkQueueInsertDebugUtilsLabelEXT(
VkQueue queue,
const VkDebugUtilsLabelEXT* pLabelInfo);
- queue is the queue into which a debug label will be inserted.
- pLabelInfo is a pointer to an instance of the VkDebugUtilsLabelEXT structure specifying the parameters of the label to insert.
36.1.3. Command Buffer Labels
Typical Vulkan applications will submit many command buffers in each frame, with each command buffer containing a large number of individual commands. Being able to logically annotate regions of command buffers that belong together as well as hierarchically subdivide the frame is important to a developer’s ability to navigate the commands viewed holistically.
To identify the beginning of a debug label region in a command buffer, vkCmdBeginDebugUtilsLabelEXT can be used as defined below.
To indicate the end of a debug label region in a command buffer, vkCmdEndDebugUtilsLabelEXT can be used.
To insert a single command buffer debug label inside of a command buffer, vkCmdInsertDebugUtilsLabelEXT can be used as defined below.
A command buffer debug label region can be opened by calling:
void vkCmdBeginDebugUtilsLabelEXT(
VkCommandBuffer commandBuffer,
const VkDebugUtilsLabelEXT* pLabelInfo);
- commandBuffer is the command buffer into which the command is recorded.
- pLabelInfo is a pointer to an instance of the VkDebugUtilsLabelEXT structure specifying the parameters of the label region to open.
A command buffer label region can be closed by calling:
void vkCmdEndDebugUtilsLabelEXT(
VkCommandBuffer commandBuffer);
- commandBuffer is the command buffer into which the command is recorded.
An application may open a debug label region in one command buffer and close it in another, or otherwise split debug label regions across multiple command buffers or multiple queue submissions. When viewed from the linear series of submissions to a single queue, the calls to vkCmdBeginDebugUtilsLabelEXT and vkCmdEndDebugUtilsLabelEXT must be matched and balanced.
A single debug label can be inserted into a command buffer by calling:
void vkCmdInsertDebugUtilsLabelEXT(
VkCommandBuffer commandBuffer,
const VkDebugUtilsLabelEXT* pLabelInfo);
- commandBuffer is the command buffer into which the command is recorded.
- pLabelInfo is a pointer to an instance of the VkDebugUtilsLabelEXT structure specifying the parameters of the label to insert.
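A non-normative sketch of a labelled command buffer region; the label name and color are arbitrary, and the entry points are assumed to have been obtained with vkGetInstanceProcAddr.
#include <vulkan/vulkan.h>

// Sketch: annotate the commands belonging to one logical part of the frame.
void recordLabeledRegion(VkCommandBuffer commandBuffer,
                         PFN_vkCmdBeginDebugUtilsLabelEXT pfnCmdBeginLabel,
                         PFN_vkCmdEndDebugUtilsLabelEXT pfnCmdEndLabel)
{
    const VkDebugUtilsLabelEXT label = {
        .sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT,
        .pLabelName = "Shadow map pass",
        .color = { 0.3f, 0.3f, 0.3f, 1.0f },   // optional; implementations may ignore it
    };

    pfnCmdBeginLabel(commandBuffer, &label);
    // ... record the commands that belong to this region ...
    pfnCmdEndLabel(commandBuffer);
}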
36.1.4. Debug Messengers
Vulkan allows an application to register multiple callbacks with any Vulkan component wishing to report debug information. Some callbacks may log the information to a file, others may cause a debug break point or other application defined behavior. A primary producer of callback messages are the validation layers. An application can register callbacks even when no validation layers are enabled, but they will only be called for the Vulkan loader and, if implemented, other layer and driver events.
A VkDebugUtilsMessengerEXT
is a messenger object which handles passing
along debug messages to a provided debug callback.
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDebugUtilsMessengerEXT)
The debug messenger will provide detailed feedback on the application’s use of Vulkan when events of interest occur. When an event of interest does occur, the debug messenger will submit a debug message to the debug callback that was provided during its creation. Additionally, the debug messenger is responsible for filtering out debug messages that the callback is not interested in, and will only provide desired debug messages.
A debug messenger triggers a debug callback with a debug message when an event of interest occurs. To create a debug messenger which will trigger a debug callback, call:
VkResult vkCreateDebugUtilsMessengerEXT(
VkInstance instance,
const VkDebugUtilsMessengerCreateInfoEXT* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDebugUtilsMessengerEXT* pMessenger);
- instance is the instance the messenger will be used with.
- pCreateInfo points to a VkDebugUtilsMessengerCreateInfoEXT structure which contains the callback pointer as well as defines the conditions under which this messenger will trigger the callback.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pMessenger is a pointer to record the VkDebugUtilsMessengerEXT object created.
The application must ensure that vkCreateDebugUtilsMessengerEXT is
not executed in parallel with any Vulkan command that is also called with
instance
or child of instance
as the dispatchable argument.
The definition of VkDebugUtilsMessengerCreateInfoEXT
is:
typedef struct VkDebugUtilsMessengerCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkDebugUtilsMessengerCreateFlagsEXT flags;
VkDebugUtilsMessageSeverityFlagsEXT messageSeverity;
VkDebugUtilsMessageTypeFlagsEXT messageType;
PFN_vkDebugUtilsMessengerCallbackEXT pfnUserCallback;
void* pUserData;
} VkDebugUtilsMessengerCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is 0 and reserved for future use.
- messageSeverity is a bitmask of VkDebugUtilsMessageSeverityFlagBitsEXT specifying which severity of event(s) will cause this callback to be called.
- messageType is a bitmask of VkDebugUtilsMessageTypeFlagBitsEXT specifying which type of event(s) will cause this callback to be called.
- pfnUserCallback is the application callback function to call.
- pUserData is user data to be passed to the callback.
For each VkDebugUtilsMessengerEXT that is created, the VkDebugUtilsMessengerCreateInfoEXT::messageSeverity and VkDebugUtilsMessengerCreateInfoEXT::messageType determine when that VkDebugUtilsMessengerCreateInfoEXT::pfnUserCallback is called.
The process to determine if the user’s pfnUserCallback is triggered when an event occurs is as follows:
- The implementation will perform a bitwise AND of the event’s VkDebugUtilsMessageSeverityFlagBitsEXT with the messageSeverity provided during creation of the VkDebugUtilsMessengerEXT object.
  - If the value is 0, the message is skipped.
- The implementation will perform a bitwise AND of the event’s VkDebugUtilsMessageTypeFlagBitsEXT with the messageType provided during the creation of the VkDebugUtilsMessengerEXT object.
  - If the value is 0, the message is skipped.
- The callback will trigger a debug message for the current event.
The callback will come directly from the component that detected the event, unless some other layer intercepts the calls for its own purposes (filter them in a different way, log to a system error log, etc.).
An application can receive multiple callbacks if multiple
VkDebugUtilsMessengerEXT
objects are created.
A callback will always be executed in the same thread as the originating
Vulkan call.
A callback can be called from multiple threads simultaneously (if the application is making Vulkan calls from multiple threads).
Bits which can be set in
VkDebugUtilsMessengerCreateInfoEXT::messageSeverity
, specifying
event severities which cause a debug messenger to call the callback, are:
typedef enum VkDebugUtilsMessageSeverityFlagBitsEXT {
VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT = 0x00000001,
VK_DEBUG_UTILS_MESSAGE_SEVERITY_INFO_BIT_EXT = 0x00000010,
VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT = 0x00000100,
VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT = 0x00001000,
} VkDebugUtilsMessageSeverityFlagBitsEXT;
- VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT specifies the most verbose output, indicating all diagnostic messages from the Vulkan loader, layers, and drivers should be captured.
- VK_DEBUG_UTILS_MESSAGE_SEVERITY_INFO_BIT_EXT specifies an informational message such as resource details that may be handy when debugging an application.
- VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT specifies use of Vulkan that may expose an app bug. Such cases may not be immediately harmful, such as a fragment shader outputting to a location with no attachment. Other cases may point to behavior that is almost certainly bad when unintended, such as using an image whose memory has not been filled. In general, if you see a warning but you know that the behavior is intended/desired, then simply ignore the warning.
- VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT specifies an error that may cause undefined results, including an application crash.
Note
The values of VkDebugUtilsMessageSeverityFlagBitsEXT are sorted based on severity. The higher the flag value, the more severe the message. This allows for simple boolean operation comparisons when looking at VkDebugUtilsMessageSeverityFlagBitsEXT values, as shown in the sketch below.
In addition, space has been left between the enums to allow for later addition of new severities in between the existing values. |
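The comparison the note refers to might look like the following non-normative sketch.
#include <vulkan/vulkan.h>

// Minimal sketch of the numeric comparison the note describes: anything at
// least as severe as a warning is treated as "important".
static VkBool32 isAtLeastWarning(VkDebugUtilsMessageSeverityFlagBitsEXT severity)
{
    return severity >= VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT;
}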
typedef VkFlags VkDebugUtilsMessageSeverityFlagsEXT;
VkDebugUtilsMessageSeverityFlagsEXT
is a bitmask type for setting a
mask of zero or more VkDebugUtilsMessageSeverityFlagBitsEXT.
Bits which can be set in
VkDebugUtilsMessengerCreateInfoEXT::messageType
, specifying
event types which cause a debug messenger to call the callback, are:
typedef enum VkDebugUtilsMessageTypeFlagBitsEXT {
VK_DEBUG_UTILS_MESSAGE_TYPE_GENERAL_BIT_EXT = 0x00000001,
VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT = 0x00000002,
VK_DEBUG_UTILS_MESSAGE_TYPE_PERFORMANCE_BIT_EXT = 0x00000004,
} VkDebugUtilsMessageTypeFlagBitsEXT;
- VK_DEBUG_UTILS_MESSAGE_TYPE_GENERAL_BIT_EXT specifies that some general event has occurred. This is typically a non-specification, non-performance event.
- VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT specifies that something has occurred during validation against the Vulkan specification that may indicate invalid behavior.
- VK_DEBUG_UTILS_MESSAGE_TYPE_PERFORMANCE_BIT_EXT specifies a potentially non-optimal use of Vulkan, e.g. using vkCmdClearColorImage when setting VkAttachmentDescription::loadOp to VK_ATTACHMENT_LOAD_OP_CLEAR would have worked.
typedef VkFlags VkDebugUtilsMessageTypeFlagsEXT;
VkDebugUtilsMessageTypeFlagsEXT
is a bitmask type for setting a mask
of zero or more VkDebugUtilsMessageTypeFlagBitsEXT.
The prototype for the
VkDebugUtilsMessengerCreateInfoEXT::pfnUserCallback
function
implemented by the application is:
typedef VkBool32 (VKAPI_PTR *PFN_vkDebugUtilsMessengerCallbackEXT)(
VkDebugUtilsMessageSeverityFlagBitsEXT messageSeverity,
VkDebugUtilsMessageTypeFlagsEXT messageTypes,
const VkDebugUtilsMessengerCallbackDataEXT* pCallbackData,
void* pUserData);
- messageSeverity specifies the VkDebugUtilsMessageSeverityFlagBitsEXT that triggered this callback.
- messageTypes is a bitmask of VkDebugUtilsMessageTypeFlagBitsEXT specifying which type of event(s) triggered this callback.
- pCallbackData contains all the callback-related data in the VkDebugUtilsMessengerCallbackDataEXT structure.
- pUserData is the user data provided when the VkDebugUtilsMessengerEXT was created.
The callback must not call vkDestroyDebugUtilsMessengerEXT.
The callback returns a VkBool32
, which is interpreted in a
layer-specified manner.
The application should always return VK_FALSE
.
The VK_TRUE
value is reserved for use in layer development.
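Putting the pieces together, the following non-normative sketch defines a callback that prints each message and returns VK_FALSE, and registers it for validation warnings and errors; the severity and type masks are arbitrary examples, and the create function pointer is assumed to have been obtained with vkGetInstanceProcAddr.
#include <stdio.h>
#include <vulkan/vulkan.h>

// Sketch: a callback that prints every message it receives.
static VKAPI_ATTR VkBool32 VKAPI_CALL
debugCallback(VkDebugUtilsMessageSeverityFlagBitsEXT messageSeverity,
              VkDebugUtilsMessageTypeFlagsEXT messageTypes,
              const VkDebugUtilsMessengerCallbackDataEXT *pCallbackData,
              void *pUserData)
{
    (void)messageSeverity; (void)messageTypes; (void)pUserData;
    fprintf(stderr, "[%s] %s\n",
            pCallbackData->pMessageIdName ? pCallbackData->pMessageIdName : "-",
            pCallbackData->pMessage);
    return VK_FALSE;  // Applications should always return VK_FALSE.
}

// Sketch: create a messenger that only forwards validation and performance
// messages of warning severity or above to the callback.
VkResult createValidationMessenger(VkInstance instance,
                                   PFN_vkCreateDebugUtilsMessengerEXT pfnCreate,
                                   VkDebugUtilsMessengerEXT *pMessenger)
{
    const VkDebugUtilsMessengerCreateInfoEXT createInfo = {
        .sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT,
        .messageSeverity = VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT |
                           VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT,
        .messageType = VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT |
                       VK_DEBUG_UTILS_MESSAGE_TYPE_PERFORMANCE_BIT_EXT,
        .pfnUserCallback = debugCallback,
        .pUserData = NULL,
    };
    return pfnCreate(instance, &createInfo, NULL, pMessenger);
}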
The definition of VkDebugUtilsMessengerCallbackDataEXT
is:
typedef struct VkDebugUtilsMessengerCallbackDataEXT {
VkStructureType sType;
const void* pNext;
VkDebugUtilsMessengerCallbackDataFlagsEXT flags;
const char* pMessageIdName;
int32_t messageIdNumber;
const char* pMessage;
uint32_t queueLabelCount;
const VkDebugUtilsLabelEXT* pQueueLabels;
uint32_t cmdBufLabelCount;
const VkDebugUtilsLabelEXT* pCmdBufLabels;
uint32_t objectCount;
const VkDebugUtilsObjectNameInfoEXT* pObjects;
} VkDebugUtilsMessengerCallbackDataEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is 0 and reserved for future use.
- pMessageIdName is a null-terminated string that identifies the particular message ID that is associated with the provided message. If the message corresponds to a validation layer message, then this string may contain the portion of the Vulkan specification that is believed to have been violated.
- messageIdNumber is the ID number of the triggering message. If the message corresponds to a validation layer message, then this number is related to the internal number associated with the message being triggered.
- pMessage is a null-terminated string detailing the trigger conditions.
- queueLabelCount is a count of items contained in the pQueueLabels array.
- pQueueLabels is NULL or a pointer to an array of VkDebugUtilsLabelEXT active in the current VkQueue at the time the callback was triggered. Refer to Queue Labels for more information.
- cmdBufLabelCount is a count of items contained in the pCmdBufLabels array.
- pCmdBufLabels is NULL or a pointer to an array of VkDebugUtilsLabelEXT active in the current VkCommandBuffer at the time the callback was triggered. Refer to Command Buffer Labels for more information.
- objectCount is a count of items contained in the pObjects array.
- pObjects is a pointer to an array of VkDebugUtilsObjectNameInfoEXT objects related to the detected issue. The array is roughly in order of importance, but the 0th element is always guaranteed to be the most important object for this message.
Note
This structure should only be considered valid during the lifetime of the triggered callback. |
Since adding queue and command buffer labels behaves like pushing and
popping onto a stack, the order of both pQueueLabels
and
pCmdBufLabels
is based on the order the labels were defined.
The result is that the first label in either pQueueLabels
or
pCmdBufLabels
will be the first defined (and therefore the oldest)
while the last label in each list will be the most recent.
Note
Likewise, |
There may be times that a user wishes to intentionally submit a debug message. To do this, call:
void vkSubmitDebugUtilsMessageEXT(
VkInstance instance,
VkDebugUtilsMessageSeverityFlagBitsEXT messageSeverity,
VkDebugUtilsMessageTypeFlagsEXT messageTypes,
const VkDebugUtilsMessengerCallbackDataEXT* pCallbackData);
- instance is the debug stream’s VkInstance.
- messageSeverity is the VkDebugUtilsMessageSeverityFlagBitsEXT severity of this event/message.
- messageTypes is a bitmask of VkDebugUtilsMessageTypeFlagBitsEXT specifying which type of event(s) to identify with this message.
- pCallbackData contains all the callback-related data in the VkDebugUtilsMessengerCallbackDataEXT structure.
The call will propagate through the layers and generate callback(s) as
indicated by the message’s flags.
The parameters are passed on to the callback in addition to the
pUserData
value that was defined at the time the messenger was
registered.
To destroy a VkDebugUtilsMessengerEXT
object, call:
void vkDestroyDebugUtilsMessengerEXT(
VkInstance instance,
VkDebugUtilsMessengerEXT messenger,
const VkAllocationCallbacks* pAllocator);
- instance is the instance where the callback was created.
- messenger is the VkDebugUtilsMessengerEXT object to destroy. messenger is an externally synchronized object and must not be used on more than one thread at a time. This means that vkDestroyDebugUtilsMessengerEXT must not be called when a callback is active.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
The application must ensure that vkDestroyDebugUtilsMessengerEXT is
not executed in parallel with any Vulkan command that is also called with
instance
or child of instance
as the dispatchable argument.
36.2. Debug Markers
Debug markers provide a flexible way for debugging and validation layers to receive annotation and debug information.
The Object Annotation section describes how to associate a name or binary data with a Vulkan object.
The Command Buffer Markers section describes how to associate logical elements of the scene with commands in the command buffer.
36.2.1. Object Annotation
The commands in this section allow application developers to associate user-defined information with Vulkan objects at will.
An object can be given a user-friendly name by calling:
VkResult vkDebugMarkerSetObjectNameEXT(
VkDevice device,
const VkDebugMarkerObjectNameInfoEXT* pNameInfo);
- device is the device that created the object.
- pNameInfo is a pointer to an instance of the VkDebugMarkerObjectNameInfoEXT structure specifying the parameters of the name to set on the object.
The VkDebugMarkerObjectNameInfoEXT
structure is defined as:
typedef struct VkDebugMarkerObjectNameInfoEXT {
VkStructureType sType;
const void* pNext;
VkDebugReportObjectTypeEXT objectType;
uint64_t object;
const char* pObjectName;
} VkDebugMarkerObjectNameInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- objectType is a VkDebugReportObjectTypeEXT specifying the type of the object to be named.
- object is the object to be named.
- pObjectName is a null-terminated UTF-8 string specifying the name to apply to object.
Applications may change the name associated with an object simply by
calling vkDebugMarkerSetObjectNameEXT
again with a new string.
To remove a previously set name, pObjectName
should be set to an
empty string.
In addition to setting a name for an object, debugging and validation layers
may have uses for additional binary data on a per-object basis that has no
other place in the Vulkan API.
For example, a VkShaderModule
could have additional debugging data
attached to it to aid in offline shader tracing.
To attach data to an object, call:
VkResult vkDebugMarkerSetObjectTagEXT(
VkDevice device,
const VkDebugMarkerObjectTagInfoEXT* pTagInfo);
- device is the device that created the object.
- pTagInfo is a pointer to an instance of the VkDebugMarkerObjectTagInfoEXT structure specifying the parameters of the tag to attach to the object.
The VkDebugMarkerObjectTagInfoEXT
structure is defined as:
typedef struct VkDebugMarkerObjectTagInfoEXT {
VkStructureType sType;
const void* pNext;
VkDebugReportObjectTypeEXT objectType;
uint64_t object;
uint64_t tagName;
size_t tagSize;
const void* pTag;
} VkDebugMarkerObjectTagInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- objectType is a VkDebugReportObjectTypeEXT specifying the type of the object to be named.
- object is the object to be tagged.
- tagName is a numerical identifier of the tag.
- tagSize is the number of bytes of data to attach to the object.
- pTag is an array of tagSize bytes containing the data to be associated with the object.
The tagName
parameter gives a name or identifier to the type of data
being tagged.
This can be used by debugging layers to easily filter for only data that can
be used by that implementation.
36.2.2. Command Buffer Markers
Typical Vulkan applications will submit many command buffers in each frame, with each command buffer containing a large number of individual commands. Being able to logically annotate regions of command buffers that belong together as well as hierarchically subdivide the frame is important to a developer’s ability to navigate the commands viewed holistically.
The marker commands vkCmdDebugMarkerBeginEXT
and
vkCmdDebugMarkerEndEXT
define regions of a series of commands that are
grouped together, and they can be nested to create a hierarchy.
The vkCmdDebugMarkerInsertEXT
command allows insertion of a single
label within a command buffer.
A marker region can be opened by calling:
void vkCmdDebugMarkerBeginEXT(
VkCommandBuffer commandBuffer,
const VkDebugMarkerMarkerInfoEXT* pMarkerInfo);
- commandBuffer is the command buffer into which the command is recorded.
- pMarkerInfo is a pointer to an instance of the VkDebugMarkerMarkerInfoEXT structure specifying the parameters of the marker region to open.
The VkDebugMarkerMarkerInfoEXT
structure is defined as:
typedef struct VkDebugMarkerMarkerInfoEXT {
VkStructureType sType;
const void* pNext;
const char* pMarkerName;
float color[4];
} VkDebugMarkerMarkerInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- pMarkerName is a pointer to a null-terminated UTF-8 string that contains the name of the marker.
- color is an optional RGBA color value that can be associated with the marker. A particular implementation may choose to ignore this color value. The values contain RGBA values in order, in the range 0.0 to 1.0. If all elements in color are set to 0.0 then it is ignored.
A marker region can be closed by calling:
void vkCmdDebugMarkerEndEXT(
VkCommandBuffer commandBuffer);
- commandBuffer is the command buffer into which the command is recorded.
An application may open a marker region in one command buffer and close it
in another, or otherwise split marker regions across multiple command
buffers or multiple queue submissions.
When viewed from the linear series of submissions to a single queue, the
calls to vkCmdDebugMarkerBeginEXT
and vkCmdDebugMarkerEndEXT
must be matched and balanced.
A single marker label can be inserted into a command buffer by calling:
void vkCmdDebugMarkerInsertEXT(
VkCommandBuffer commandBuffer,
const VkDebugMarkerMarkerInfoEXT* pMarkerInfo);
- commandBuffer is the command buffer into which the command is recorded.
- pMarkerInfo is a pointer to an instance of the VkDebugMarkerMarkerInfoEXT structure specifying the parameters of the marker to insert.
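As an informative sketch, the following opens a marker region, inserts a single label, and closes the region within one command buffer. The commandBuffer handle and the pfnCmdDebugMarkerBegin, pfnCmdDebugMarkerInsert, and pfnCmdDebugMarkerEnd function pointers are assumed to have been obtained by the application with vkGetDeviceProcAddr.

VkDebugMarkerMarkerInfoEXT markerInfo = {
    .sType = VK_STRUCTURE_TYPE_DEBUG_MARKER_MARKER_INFO_EXT,
    .pNext = NULL,
    .pMarkerName = "Shadow pass",
    .color = { 0.0f, 0.0f, 0.0f, 0.0f },   // all zero: the optional color is ignored
};
pfnCmdDebugMarkerBegin(commandBuffer, &markerInfo);

// ... record the commands that belong to this region ...

markerInfo.pMarkerName = "Upload shadow casters";
pfnCmdDebugMarkerInsert(commandBuffer, &markerInfo);   // a single label inside the region

pfnCmdDebugMarkerEnd(commandBuffer);                   // balances the matching begin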
36.3. Debug Report Callbacks
Debug report callbacks are represented by VkDebugReportCallbackEXT
handles:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDebugReportCallbackEXT)
Debug report callbacks give more detailed feedback on the application’s use of Vulkan when events of interest occur.
To register a debug report callback, an application uses vkCreateDebugReportCallbackEXT.
VkResult vkCreateDebugReportCallbackEXT(
VkInstance instance,
const VkDebugReportCallbackCreateInfoEXT* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDebugReportCallbackEXT* pCallback);
- instance is the instance the callback will be logged on.
- pCreateInfo points to a VkDebugReportCallbackCreateInfoEXT structure which defines the conditions under which this callback will be called.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pCallback is a pointer to record the VkDebugReportCallbackEXT object created.
The definition of VkDebugReportCallbackCreateInfoEXT is:
typedef struct VkDebugReportCallbackCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkDebugReportFlagsEXT flags;
PFN_vkDebugReportCallbackEXT pfnCallback;
void* pUserData;
} VkDebugReportCallbackCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- flags is a bitmask of VkDebugReportFlagBitsEXT specifying which event(s) will cause this callback to be called.
- pfnCallback is the application callback function to call.
- pUserData is user data to be passed to the callback.
For each VkDebugReportCallbackEXT that is created, VkDebugReportCallbackCreateInfoEXT::flags determine when that VkDebugReportCallbackCreateInfoEXT::pfnCallback is called.
When an event happens, the implementation will do a bitwise AND of the
event’s VkDebugReportFlagBitsEXT flags to each
VkDebugReportCallbackEXT
object’s flags.
For each non-zero result the corresponding callback will be called.
The callback will come directly from the component that detected the event,
unless some other layer intercepts the calls for its own purposes (filter
them in a different way, log to a system error log, etc.).
An application may receive multiple callbacks if multiple
VkDebugReportCallbackEXT
objects were created.
A callback will always be executed in the same thread as the originating
Vulkan call.
A callback may be called from multiple threads simultaneously (if the application is making Vulkan calls from multiple threads).
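For example, the following informative sketch registers a callback for errors and warnings. The instance handle and the myDebugReportCallback function (sketched later in this section) are assumed to be supplied by the application, and vkCreateDebugReportCallbackEXT is assumed to have been queried with vkGetInstanceProcAddr because VK_EXT_debug_report is an instance extension.

PFN_vkCreateDebugReportCallbackEXT pfnCreateDebugReportCallback =
    (PFN_vkCreateDebugReportCallbackEXT)vkGetInstanceProcAddr(instance, "vkCreateDebugReportCallbackEXT");

VkDebugReportCallbackCreateInfoEXT createInfo = {
    .sType = VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT,
    .pNext = NULL,
    .flags = VK_DEBUG_REPORT_ERROR_BIT_EXT |
             VK_DEBUG_REPORT_WARNING_BIT_EXT |
             VK_DEBUG_REPORT_PERFORMANCE_WARNING_BIT_EXT,
    .pfnCallback = myDebugReportCallback,   // application-defined callback, see below
    .pUserData = NULL,
};
VkDebugReportCallbackEXT callback;
VkResult result = pfnCreateDebugReportCallback(instance, &createInfo, NULL, &callback);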
Bits which can be set in VkDebugReportCallbackCreateInfoEXT::flags, specifying events which cause a debug report, are:
typedef enum VkDebugReportFlagBitsEXT {
VK_DEBUG_REPORT_INFORMATION_BIT_EXT = 0x00000001,
VK_DEBUG_REPORT_WARNING_BIT_EXT = 0x00000002,
VK_DEBUG_REPORT_PERFORMANCE_WARNING_BIT_EXT = 0x00000004,
VK_DEBUG_REPORT_ERROR_BIT_EXT = 0x00000008,
VK_DEBUG_REPORT_DEBUG_BIT_EXT = 0x00000010,
} VkDebugReportFlagBitsEXT;
- VK_DEBUG_REPORT_ERROR_BIT_EXT specifies an error that may cause undefined results, including an application crash.
- VK_DEBUG_REPORT_WARNING_BIT_EXT specifies use of Vulkan that may expose an app bug. Such cases may not be immediately harmful, such as a fragment shader outputting to a location with no attachment. Other cases may point to behavior that is almost certainly bad when unintended, such as using an image whose memory has not been filled. In general, if you see a warning but you know that the behavior is intended/desired, then simply ignore the warning.
- VK_DEBUG_REPORT_PERFORMANCE_WARNING_BIT_EXT specifies a potentially non-optimal use of Vulkan, e.g. using vkCmdClearColorImage when setting VkAttachmentDescription::loadOp to VK_ATTACHMENT_LOAD_OP_CLEAR would have worked.
- VK_DEBUG_REPORT_INFORMATION_BIT_EXT specifies an informational message such as resource details that may be handy when debugging an application.
- VK_DEBUG_REPORT_DEBUG_BIT_EXT specifies diagnostic information from the implementation and layers.
typedef VkFlags VkDebugReportFlagsEXT;
VkDebugReportFlagsEXT
is a bitmask type for setting a mask of zero or
more VkDebugReportFlagBitsEXT.
The prototype for the
VkDebugReportCallbackCreateInfoEXT::pfnCallback
function
implemented by the application is:
typedef VkBool32 (VKAPI_PTR *PFN_vkDebugReportCallbackEXT)(
VkDebugReportFlagsEXT flags,
VkDebugReportObjectTypeEXT objectType,
uint64_t object,
size_t location,
int32_t messageCode,
const char* pLayerPrefix,
const char* pMessage,
void* pUserData);
- flags specifies the VkDebugReportFlagBitsEXT that triggered this callback.
- objectType is a VkDebugReportObjectTypeEXT value specifying the type of object being used or created at the time the event was triggered.
- object is the object where the issue was detected. If objectType is VK_DEBUG_REPORT_OBJECT_TYPE_UNKNOWN_EXT, object is undefined.
- location is a component (layer, driver, loader) defined value that specifies the location of the trigger. This is an optional value.
- messageCode is a layer-defined value indicating what test triggered this callback.
- pLayerPrefix is a null-terminated string that is an abbreviation of the name of the component making the callback. pLayerPrefix is only valid for the duration of the callback.
- pMessage is a null-terminated string detailing the trigger conditions. pMessage is only valid for the duration of the callback.
- pUserData is the user data given when the VkDebugReportCallbackEXT was created.
The callback must not call vkDestroyDebugReportCallbackEXT.
The callback returns a VkBool32, which is interpreted in a layer-specified manner.
The application should always return VK_FALSE.
The VK_TRUE value is reserved for use in layer development.
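An informative sketch of such a callback is shown below; myDebugReportCallback is an application-chosen name for the function registered through VkDebugReportCallbackCreateInfoEXT::pfnCallback, <vulkan/vulkan.h> is assumed to be included, and the callback simply logs the message and returns VK_FALSE.

#include <stdio.h>

VKAPI_ATTR VkBool32 VKAPI_CALL myDebugReportCallback(
    VkDebugReportFlagsEXT      flags,
    VkDebugReportObjectTypeEXT objectType,
    uint64_t                   object,
    size_t                     location,
    int32_t                    messageCode,
    const char*                pLayerPrefix,
    const char*                pMessage,
    void*                      pUserData)
{
    (void)flags; (void)objectType; (void)object; (void)location; (void)pUserData;

    // The strings are only valid for the duration of the callback, so use
    // (or copy) them here rather than storing the pointers.
    fprintf(stderr, "[%s] code %d: %s\n", pLayerPrefix, messageCode, pMessage);

    return VK_FALSE;    // VK_TRUE is reserved for layer development
}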
object must be a Vulkan object or VK_NULL_HANDLE.
If objectType is not VK_DEBUG_REPORT_OBJECT_TYPE_UNKNOWN_EXT and object is not VK_NULL_HANDLE, object must be a Vulkan object of the corresponding type associated with objectType as defined in VkDebugReportObjectTypeEXT and Vulkan Handle Relationship.
Possible values passed to the objectType parameter of the callback function specified by VkDebugReportCallbackCreateInfoEXT::pfnCallback, specifying the type of object handle being reported, are:
typedef enum VkDebugReportObjectTypeEXT {
VK_DEBUG_REPORT_OBJECT_TYPE_UNKNOWN_EXT = 0,
VK_DEBUG_REPORT_OBJECT_TYPE_INSTANCE_EXT = 1,
VK_DEBUG_REPORT_OBJECT_TYPE_PHYSICAL_DEVICE_EXT = 2,
VK_DEBUG_REPORT_OBJECT_TYPE_DEVICE_EXT = 3,
VK_DEBUG_REPORT_OBJECT_TYPE_QUEUE_EXT = 4,
VK_DEBUG_REPORT_OBJECT_TYPE_SEMAPHORE_EXT = 5,
VK_DEBUG_REPORT_OBJECT_TYPE_COMMAND_BUFFER_EXT = 6,
VK_DEBUG_REPORT_OBJECT_TYPE_FENCE_EXT = 7,
VK_DEBUG_REPORT_OBJECT_TYPE_DEVICE_MEMORY_EXT = 8,
VK_DEBUG_REPORT_OBJECT_TYPE_BUFFER_EXT = 9,
VK_DEBUG_REPORT_OBJECT_TYPE_IMAGE_EXT = 10,
VK_DEBUG_REPORT_OBJECT_TYPE_EVENT_EXT = 11,
VK_DEBUG_REPORT_OBJECT_TYPE_QUERY_POOL_EXT = 12,
VK_DEBUG_REPORT_OBJECT_TYPE_BUFFER_VIEW_EXT = 13,
VK_DEBUG_REPORT_OBJECT_TYPE_IMAGE_VIEW_EXT = 14,
VK_DEBUG_REPORT_OBJECT_TYPE_SHADER_MODULE_EXT = 15,
VK_DEBUG_REPORT_OBJECT_TYPE_PIPELINE_CACHE_EXT = 16,
VK_DEBUG_REPORT_OBJECT_TYPE_PIPELINE_LAYOUT_EXT = 17,
VK_DEBUG_REPORT_OBJECT_TYPE_RENDER_PASS_EXT = 18,
VK_DEBUG_REPORT_OBJECT_TYPE_PIPELINE_EXT = 19,
VK_DEBUG_REPORT_OBJECT_TYPE_DESCRIPTOR_SET_LAYOUT_EXT = 20,
VK_DEBUG_REPORT_OBJECT_TYPE_SAMPLER_EXT = 21,
VK_DEBUG_REPORT_OBJECT_TYPE_DESCRIPTOR_POOL_EXT = 22,
VK_DEBUG_REPORT_OBJECT_TYPE_DESCRIPTOR_SET_EXT = 23,
VK_DEBUG_REPORT_OBJECT_TYPE_FRAMEBUFFER_EXT = 24,
VK_DEBUG_REPORT_OBJECT_TYPE_COMMAND_POOL_EXT = 25,
VK_DEBUG_REPORT_OBJECT_TYPE_SURFACE_KHR_EXT = 26,
VK_DEBUG_REPORT_OBJECT_TYPE_SWAPCHAIN_KHR_EXT = 27,
VK_DEBUG_REPORT_OBJECT_TYPE_DEBUG_REPORT_CALLBACK_EXT_EXT = 28,
VK_DEBUG_REPORT_OBJECT_TYPE_DISPLAY_KHR_EXT = 29,
VK_DEBUG_REPORT_OBJECT_TYPE_DISPLAY_MODE_KHR_EXT = 30,
VK_DEBUG_REPORT_OBJECT_TYPE_OBJECT_TABLE_NVX_EXT = 31,
VK_DEBUG_REPORT_OBJECT_TYPE_INDIRECT_COMMANDS_LAYOUT_NVX_EXT = 32,
VK_DEBUG_REPORT_OBJECT_TYPE_VALIDATION_CACHE_EXT_EXT = 33,
VK_DEBUG_REPORT_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION_EXT = 1000156000,
VK_DEBUG_REPORT_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_EXT = 1000085000,
VK_DEBUG_REPORT_OBJECT_TYPE_ACCELERATION_STRUCTURE_NV_EXT = 1000165000,
VK_DEBUG_REPORT_OBJECT_TYPE_DEBUG_REPORT_EXT = VK_DEBUG_REPORT_OBJECT_TYPE_DEBUG_REPORT_CALLBACK_EXT_EXT,
VK_DEBUG_REPORT_OBJECT_TYPE_VALIDATION_CACHE_EXT = VK_DEBUG_REPORT_OBJECT_TYPE_VALIDATION_CACHE_EXT_EXT,
VK_DEBUG_REPORT_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_KHR_EXT = VK_DEBUG_REPORT_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_EXT,
VK_DEBUG_REPORT_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION_KHR_EXT = VK_DEBUG_REPORT_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION_EXT,
} VkDebugReportObjectTypeEXT;
[Table: each VkDebugReportObjectTypeEXT value and the Vulkan handle type it corresponds to; VK_DEBUG_REPORT_OBJECT_TYPE_UNKNOWN_EXT corresponds to an unknown or undefined handle.]
To inject its own messages into the debug stream, call:
void vkDebugReportMessageEXT(
VkInstance instance,
VkDebugReportFlagsEXT flags,
VkDebugReportObjectTypeEXT objectType,
uint64_t object,
size_t location,
int32_t messageCode,
const char* pLayerPrefix,
const char* pMessage);
- instance is the debug stream’s VkInstance.
- flags specifies the VkDebugReportFlagBitsEXT classification of this event/message.
- objectType is a VkDebugReportObjectTypeEXT specifying the type of object being used or created at the time the event was triggered.
- object is the object where the issue was detected. object can be VK_NULL_HANDLE if there is no object associated with the event.
- location is an application defined value.
- messageCode is an application defined value.
- pLayerPrefix is the abbreviation of the component making this event/message.
- pMessage is a null-terminated string detailing the trigger conditions.
The call will propagate through the layers and generate callback(s) as
indicated by the message’s flags.
The parameters are passed on to the callback in addition to the
pUserData
value that was defined at the time the callback was
registered.
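As an informative sketch, an application could inject a message classified as informational like this, assuming pfnDebugReportMessage was queried from the instance with vkGetInstanceProcAddr; there is no associated object, so the object type is unknown and the object argument is zero.

pfnDebugReportMessage(instance,
                      VK_DEBUG_REPORT_INFORMATION_BIT_EXT,
                      VK_DEBUG_REPORT_OBJECT_TYPE_UNKNOWN_EXT,
                      0,                       // no object associated with this message
                      0,                       // location: application defined
                      0,                       // messageCode: application defined
                      "APP",                   // abbreviation of the reporting component
                      "Frame 1024 started");   // trigger conditions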
To destroy a VkDebugReportCallbackEXT
object, call:
void vkDestroyDebugReportCallbackEXT(
VkInstance instance,
VkDebugReportCallbackEXT callback,
const VkAllocationCallbacks* pAllocator);
- instance is the instance where the callback was created.
- callback is the VkDebugReportCallbackEXT object to destroy. callback is an externally synchronized object and must not be used on more than one thread at a time. This means that vkDestroyDebugReportCallbackEXT must not be called when a callback is active.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
36.4. Device Loss Debugging
36.4.1. Device Diagnostic Checkpoints
Device execution progress can be tracked for the purposes of debugging a device loss by annotating the command stream with application-defined diagnostic checkpoints.
Each diagnostic checkpoint command is executed at two pipeline stages: VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT and VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT.
If the device is lost, the application can call
vkGetQueueCheckpointDataNV to retrieve checkpoint data associated with
both pipeline stages, indicating the range of diagnostic checkpoints that
are currently in the execution pipeline on the device.
Device diagnostic checkpoints are inserted into the command stream by calling vkCmdSetCheckpointNV.
void vkCmdSetCheckpointNV(
VkCommandBuffer commandBuffer,
const void* pCheckpointMarker);
- commandBuffer is the command buffer that will receive the marker.
- pCheckpointMarker is an opaque application-provided value that will be associated with the checkpoint.
Note that pCheckpointMarker
is treated as an opaque value.
It does not need to be a valid pointer and will not be dereferenced by the
implementation.
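For example, a hypothetical application-side counter can be used directly as the checkpoint marker, since the value is never dereferenced; pfnCmdSetCheckpoint is assumed to have been queried with vkGetDeviceProcAddr.

#include <stdint.h>

uint32_t drawIndex = 42;    // hypothetical application-side progress counter
pfnCmdSetCheckpoint(commandBuffer, (const void*)(uintptr_t)drawIndex);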
If the device encounters an error during execution, the implementation will
return a VK_ERROR_DEVICE_LOST
error to the application at a certain
point during host execution.
When this happens, the application can call
vkGetQueueCheckpointDataNV to retrieve information on the most recent
diagnostic checkpoints that were executed by the device.
void vkGetQueueCheckpointDataNV(
VkQueue queue,
uint32_t* pCheckpointDataCount,
VkCheckpointDataNV* pCheckpointData);
- queue is the VkQueue object the caller would like to retrieve checkpoint data for.
- pCheckpointDataCount is a pointer to an integer related to the number of checkpoint markers available or queried, as described below.
- pCheckpointData is either NULL or a pointer to an array of VkCheckpointDataNV structures.
If pCheckpointData is NULL, then the number of checkpoint markers available is returned in pCheckpointDataCount.
Otherwise, pCheckpointDataCount must point to a variable set by the user to the number of elements in the pCheckpointData array, and on return the variable is overwritten with the number of structures actually written to pCheckpointData.
If pCheckpointDataCount is less than the number of checkpoint markers available, at most pCheckpointDataCount structures will be written.
The VkCheckpointDataNV structure is defined as:
typedef struct VkCheckpointDataNV {
VkStructureType sType;
void* pNext;
VkPipelineStageFlagBits stage;
void* pCheckpointMarker;
} VkCheckpointDataNV;
- sType is the type of this structure.
- pNext is NULL or a pointer to an extension-specific structure.
- stage indicates which pipeline stage the checkpoint marker data refers to.
- pCheckpointMarker contains the value of the last checkpoint marker executed in the stage that stage refers to.
Note that the stages at which a checkpoint marker can be executed are implementation-defined and can be queried by calling vkGetPhysicalDeviceQueueFamilyProperties2.
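An informative sketch of retrieving the checkpoint data after a VK_ERROR_DEVICE_LOST, using the two-call enumeration pattern described above, is shown below. The queue handle and the pfnGetQueueCheckpointData pointer (queried with vkGetDeviceProcAddr) are assumed to be provided by the application.

#include <stdio.h>
#include <stdlib.h>

uint32_t count = 0;
pfnGetQueueCheckpointData(queue, &count, NULL);          // first call: query the count

VkCheckpointDataNV* checkpoints = calloc(count, sizeof(VkCheckpointDataNV));
for (uint32_t i = 0; i < count; ++i) {
    checkpoints[i].sType = VK_STRUCTURE_TYPE_CHECKPOINT_DATA_NV;
    checkpoints[i].pNext = NULL;
}
pfnGetQueueCheckpointData(queue, &count, checkpoints);   // second call: retrieve the data

for (uint32_t i = 0; i < count; ++i) {
    printf("stage 0x%x reached marker %p\n",
           (unsigned)checkpoints[i].stage, checkpoints[i].pCheckpointMarker);
}
free(checkpoints);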
Appendix A: Vulkan Environment for SPIR-V
Shaders for Vulkan are defined by the Khronos SPIR-V Specification as well as the Khronos SPIR-V Extended Instructions for GLSL Specification. This appendix defines additional SPIR-V requirements applying to Vulkan shaders.
Versions and Formats
A Vulkan 1.1 implementation must support the 1.0, 1.1, 1.2, and 1.3 versions of SPIR-V and the 1.0 version of the SPIR-V Extended Instructions for GLSL.
A SPIR-V module passed into vkCreateShaderModule is interpreted as a series of 32-bit words in host endianness, with literal strings packed as described in section 2.2 of the SPIR-V Specification. The first few words of the SPIR-V module must be a magic number and a SPIR-V version number, as described in section 2.3 of the SPIR-V Specification.
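As an informative illustration (assuming the codeWords, codeSizeInBytes, device, and shaderModule names are provided by the application), the first word of the module can be checked against the SPIR-V magic number, 0x07230203, which is defined in section 2.3 of the SPIR-V Specification, before the module is passed to vkCreateShaderModule:

const uint32_t* words = (const uint32_t*)codeWords;     // 32-bit words in host endianness
if (codeSizeInBytes >= 8 && (codeSizeInBytes % 4) == 0 && words[0] == 0x07230203u) {
    VkShaderModuleCreateInfo createInfo = {
        .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
        .pNext = NULL,
        .flags = 0,
        .codeSize = codeSizeInBytes,    // size in bytes
        .pCode = words,
    };
    VkResult result = vkCreateShaderModule(device, &createInfo, NULL, &shaderModule);
}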
Capabilities
The SPIR-V capabilities listed below must be supported if the corresponding feature or extension is enabled, or if no features or extensions are listed for that capability. Extensions are only listed when there is not also a feature bit associated with that capability.
[Table: each SPIR-V OpCapability and the Vulkan feature or extension name that enables it; for example, the DenormPreserve, DenormFlushToZero, SignedZeroInfNanPreserve, RoundingModeRTE, and RoundingModeRTZ capabilities are enabled by the shaderDenormPreserveFloat16/Float32/Float64, shaderDenormFlushToZeroFloat16/Float32/Float64, shaderSignedZeroInfNanPreserveFloat16/Float32/Float64, shaderRoundingModeRTEFloat16/Float32/Float64, and shaderRoundingModeRTZFloat16/Float32/Float64 features, respectively.]
The application can pass a SPIR-V module to vkCreateShaderModule that uses the Float16 or the Int8 SPIR-V capabilities, and can pass a SPIR-V module that uses any of the following SPIR-V extensions:
- SPV_KHR_variable_pointers
- SPV_AMD_shader_explicit_vertex_parameter
- SPV_AMD_gcn_shader
- SPV_AMD_gpu_shader_half_float
- SPV_AMD_gpu_shader_int16
- SPV_AMD_shader_ballot
- SPV_AMD_shader_fragment_mask
- SPV_AMD_shader_image_load_store_lod
- SPV_AMD_shader_trinary_minmax
- SPV_AMD_texture_gather_bias_lod
- SPV_KHR_shader_draw_parameters
- SPV_KHR_8bit_storage
- SPV_KHR_16bit_storage
- SPV_KHR_float_controls
- SPV_KHR_storage_buffer_storage_class
- SPV_KHR_post_depth_coverage
- SPV_EXT_shader_stencil_export
- SPV_KHR_shader_ballot
- SPV_KHR_subgroup_vote
- SPV_NV_sample_mask_override_coverage
- SPV_NV_geometry_shader_passthrough
- SPV_NV_mesh_shader
- SPV_NV_viewport_array2
- SPV_EXT_shader_viewport_index_layer
- SPV_NVX_multiview_per_view_attributes
- SPV_EXT_descriptor_indexing
- SPV_KHR_vulkan_memory_model
- SPV_NV_compute_shader_derivatives
- SPV_NV_fragment_shader_barycentric
- SPV_NV_shader_image_footprint
- SPV_NV_shading_rate
- SPV_NV_ray_tracing
- SPV_GOOGLE_hlsl_functionality1
- SPV_GOOGLE_decorate_string
- SPV_EXT_fragment_invocation_density
The application must not pass a SPIR-V module containing any of the following to vkCreateShaderModule:
- any OpCapability not listed above,
- an unsupported capability, or
- a capability which corresponds to a Vulkan feature or extension which has not been enabled.
Validation Rules within a Module
A SPIR-V module passed to vkCreateShaderModule must conform to the following rules:
-
Every entry point must have no return value and accept no arguments.
-
Recursion: The static function-call graph for an entry point must not contain cycles.
-
The Logical addressing model must be selected.
-
Scope for execution must be limited to:
-
Workgroup
-
Subgroup
-
-
Scope for memory must be limited to:
-
Device
-
If
vulkanMemoryModel
is enabled and vulkanMemoryModelDeviceScope
is not enabled, Device scope must not be used. -
If
vulkanMemoryModel
is not enabled, Device scope only extends to the queue family, not the whole device.
-
-
QueueFamilyKHR
-
If
vulkanMemoryModel
is not enabled, QueueFamilyKHR must not be used.
-
-
Workgroup
-
Subgroup
-
Invocation
-
-
Scope for Non Uniform Group Operations must be limited to:
-
Subgroup
-
-
Storage Class must be limited to:
-
UniformConstant
-
Input
-
Uniform
-
Output
-
Workgroup
-
Private
-
Function
-
PushConstant
-
Image
-
StorageBuffer
-
RayPayloadNV
-
IncomingRayPayloadNV
-
HitAttributeNV
-
CallableDataNV
-
IncomingCallableDataNV
-
ShaderRecordBufferNV
-
-
Memory semantics must obey the following rules:
-
Acquire must not be used with
OpAtomicStore
. -
Release must not be used with
OpAtomicLoad
. -
AcquireRelease must not be used with
OpAtomicStore
or OpAtomicLoad
. -
Sequentially consistent atomics and barriers are not supported and SequentiallyConsistent is treated as AcquireRelease. SequentiallyConsistent should not be used.
-
OpMemoryBarrier
must use one of Acquire, Release, AcquireRelease, or SequentiallyConsistent and must include at least one storage class. -
If the semantics for
OpControlBarrier
includes one of Acquire, Release, AcquireRelease, or SequentiallyConsistent, then it must include at least one storage class. -
SubgroupMemory, CrossWorkgroupMemory, and AtomicCounterMemory are ignored.
-
-
Any
OpVariable
with an Initializer
operand must have one of the following as its Storage Class operand:-
Output
-
Private
-
Function
-
-
The
OriginLowerLeft
execution mode must not be used; fragment entry points must declare OriginUpperLeft
. -
The
PixelCenterInteger
execution mode must not be used. Pixels are always centered at half-integer coordinates. -
Images and Samplers
-
OpTypeImage
must declare a scalar 32-bit float or 32-bit integer type for the “Sampled Type”. (RelaxedPrecision
can be applied to a sampling instruction and to the variable holding the result of a sampling instruction.) -
OpTypeImage
must have a “Sampled” operand of 1 (sampled image) or 2 (storage image). -
If shaderStorageImageReadWithoutFormat is not enabled and an
OpTypeImage
has “Image Format” operand of Unknown, any variables created with the given type must be decorated with NonReadable
. -
If shaderStorageImageWriteWithoutFormat is not enabled and an
OpTypeImage
has “Image Format” operand of Unknown, any variables created with the given type must be decorated with NonWritable
. -
OpImageQuerySizeLod
, and OpImageQueryLevels
must only consume an “Image” operand whose type has its “Sampled” operand set to 1. -
The (u,v) coordinates used for a
SubpassData
must be the <id> of a constant vector (0,0), or if a layer coordinate is used, must be a vector that was formed with constant 0 for the u and v components. -
The “Depth” operand of
OpTypeImage
is ignored. -
Objects of types
OpTypeImage
,OpTypeSampler
,OpTypeSampledImage
, and arrays of these types must not be stored to or modified.
-
-
Decorations
-
Any
BuiltIn
decoration not listed in Built-In Variables must not be used. -
Any
BuiltIn
decoration that corresponds only to Vulkan features or extensions that have not been enabled must not be used. -
The
GLSLShared
and GLSLPacked
decorations must not be used. -
The
Flat
,NoPerspective
,Sample
, and Centroid
decorations must not be used on variables with storage class other than Input
or on variables used in the interface of non-fragment shader entry points. -
The
Patch
decoration must not be used on variables in the interface of a vertex, geometry, or fragment shader stage’s entry point. -
The
ViewportRelativeNV
decoration must only be used on a variable decorated with Layer
in the vertex, tessellation evaluation, or geometry shader stages. -
The
ViewportRelativeNV
decoration must not be used unless a variable decorated with one of ViewportIndex or ViewportMaskNV is also statically used by the same OpEntryPoint
. -
The
ViewportMaskNV
and ViewportIndex
decorations must not both be statically used by one or more OpEntryPoint
’s that form the vertex processing stages of a graphics pipeline. -
Only the round-to-nearest-even and the round-to-zero rounding modes can be used for the
FPRoundingMode
decoration. -
The
FPRoundingMode
decoration can only be used for the floating-point conversion instructions as described in the SPV_KHR_16bit_storage
SPIR-V extension. -
DescriptorSet
and Binding
decorations must obey the constraints on storage class, type, and descriptor type described in DescriptorSet and Binding Assignment
-
-
OpTypeRuntimeArray
must only be used for:-
the last member of an
OpTypeStruct
that is in the StorageBuffer storage class decorated as Block, or that is in the Uniform storage class decorated as BufferBlock
. -
If the
RuntimeDescriptorArrayEXT
capability is supported, an array of variables with storage class Uniform, StorageBuffer, or UniformConstant
, or for the outermost dimension of an array of arrays of such variables.
-
-
Linkage: See Shader Interfaces for additional linking and validation rules.
-
If
OpControlBarrier
is used in fragment, vertex, tessellation evaluation, or geometry stages, the execution Scope must be Subgroup
. -
Compute Shaders
-
For each compute shader entry point, either a
LocalSize
execution mode or an object decorated with the WorkgroupSize
decoration must be specified. -
For compute shaders using the
DerivativeGroupQuadsNV
execution mode, the first two dimensions of the local workgroup size must be a multiple of two. -
For compute shaders using the
DerivativeGroupLinearNV
execution mode, the product of the dimensions of the local workgroup size must be a multiple of four.
-
-
“Result Type” for Non Uniform Group Operations must be limited to 32-bit float, 32-bit integer, boolean, or vectors of these types. If the
Float64
capability is enabled, double and vectors of double types are also permitted. -
“Mask” for
OpGroupNonUniformShuffleXor
must be a specialization constant or a constant, or if the dynamic instance is called within a loop construct it must be one of:-
A specialization constant.
-
A constant.
-
An arithmetic operation whose operands are 1., 2., or 4.
-
A phi node whose operands are 1., 2., or 3.
-
-
If
OpGroupNonUniformBallotBitCount
is used, the group operation must be one of:-
Reduce
-
InclusiveScan
-
ExclusiveScan
-
-
Atomic instructions must declare a scalar 32-bit integer type, or a scalar 64-bit integer type if the
Int64Atomics
capability is enabled, for the value pointed to by Pointer.-
shaderBufferInt64Atomics must be enabled for 64-bit integer atomic operations to be supported on a Pointer with a Storage Class of StorageBuffer or Uniform.
-
shaderSharedInt64Atomics must be enabled for 64-bit integer atomic operations to be supported on a Pointer with a Storage Class of Workgroup.
-
-
The Pointer operand of all atomic instructions must have a Storage Class limited to:
-
Uniform
-
Workgroup
-
Image
-
StorageBuffer
-
-
If an instruction loads from or stores to a resource (including atomics and image instructions) and the resource descriptor being accessed is not dynamically uniform, then the operand corresponding to that resource (e.g. the pointer or sampled image operand) must be decorated with
NonUniformEXT
. -
If separateDenormSettings is VK_FALSE, then the entry point must use the same denormals execution mode for both 16-bit and 64-bit floating-point types. -
If separateRoundingModeSettings is VK_FALSE, then the entry point must use the same rounding execution mode for both 16-bit and 64-bit floating-point types. -
If shaderSignedZeroInfNanPreserveFloat16 is VK_FALSE, then SignedZeroInfNanPreserve for 16-bit floating-point type must not be used. -
If shaderSignedZeroInfNanPreserveFloat32 is VK_FALSE, then SignedZeroInfNanPreserve for 32-bit floating-point type must not be used. -
If shaderSignedZeroInfNanPreserveFloat64 is VK_FALSE, then SignedZeroInfNanPreserve for 64-bit floating-point type must not be used. -
If shaderDenormPreserveFloat16 is VK_FALSE, then DenormPreserve for 16-bit floating-point type must not be used. -
If shaderDenormPreserveFloat32 is VK_FALSE, then DenormPreserve for 32-bit floating-point type must not be used. -
If shaderDenormPreserveFloat64 is VK_FALSE, then DenormPreserve for 64-bit floating-point type must not be used. -
If shaderDenormFlushToZeroFloat16 is VK_FALSE, then DenormFlushToZero for 16-bit floating-point type must not be used. -
If shaderDenormFlushToZeroFloat32 is VK_FALSE, then DenormFlushToZero for 32-bit floating-point type must not be used. -
If shaderDenormFlushToZeroFloat64 is VK_FALSE, then DenormFlushToZero for 64-bit floating-point type must not be used. -
If shaderRoundingModeRTEFloat16 is VK_FALSE, then RoundingModeRTE for 16-bit floating-point type must not be used. -
If shaderRoundingModeRTEFloat32 is VK_FALSE, then RoundingModeRTE for 32-bit floating-point type must not be used. -
If shaderRoundingModeRTEFloat64 is VK_FALSE, then RoundingModeRTE for 64-bit floating-point type must not be used. -
If shaderRoundingModeRTZFloat16 is VK_FALSE, then RoundingModeRTZ for 16-bit floating-point type must not be used. -
If shaderRoundingModeRTZFloat32 is VK_FALSE, then RoundingModeRTZ for 32-bit floating-point type must not be used. -
If shaderRoundingModeRTZFloat64 is VK_FALSE, then RoundingModeRTZ for 64-bit floating-point type must not be used. -
The Offset plus size of the type of each variable, in the output interface of the entry point being compiled, decorated with XfbBuffer must not be greater than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackBufferDataSize -
For any given XfbBuffer value, define the buffer data size to be the smallest number of bytes such that, for all outputs decorated with the same XfbBuffer value, the size of the output interface variable plus the Offset is less than or equal to the buffer data size. For a given Stream, the sum of all the buffer data sizes for all buffers writing to that stream must not exceed VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreamDataSize -
Output variables or block members decorated with Offset that have a 64-bit type, or a composite type containing a 64-bit type, must specify an Offset value aligned to an 8 byte boundary -
Any output block or block member decorated with Offset containing a 64-bit type consumes a multiple of 8 bytes -
The size of any output block, that contains any member decorated with Offset that is a 64-bit type, must be a multiple of 8 -
The first member of an output block that specifies an Offset decoration must specify an Offset value that is aligned to an 8 byte boundary if that block contains any member decorated with Offset and is a 64-bit type -
Output variables or block members decorated with Offset that have a 32-bit type, or a composite type containing a 32-bit type, must specify an Offset value aligned to a 4 byte boundary -
Output variables, blocks or block members decorated with Offset must only contain base types that have components that are either 32-bit or 64-bit in size -
The Stream value to OpEmitStreamVertex and OpEndStreamPrimitive must be less than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreams -
If the geometry shader emits to more than one vertex stream and VkPhysicalDeviceTransformFeedbackPropertiesEXT::transformFeedbackStreamsLinesTriangles is VK_FALSE, then the execution mode must be OutputPoints -
Only variables or block members in the output interface decorated with Offset can be captured for transform feedback, and those variables or block members must also be decorated with XfbBuffer and XfbStride, or inherit XfbBuffer and XfbStride decorations from a block that contains them -
All variables or block members in the output interface of the entry point being compiled decorated with a specific XfbBuffer value must all be decorated with identical XfbStride values -
If any variables or block members in the output interface of the entry point being compiled are decorated with Stream, then all variables belonging to the same XfbBuffer must specify the same Stream value -
Output variables, blocks or block members that are not decorated with Stream default to vertex stream zero -
For any two variables or block members in the output interface of the entry point being compiled with the same XfbBuffer value, the ranges determined by the Offset decoration and the size of the type must not overlap -
The stream number value to Stream must be less than VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackStreams -
The XFB Stride value to XfbStride must be less than or equal to VkPhysicalDeviceTransformFeedbackPropertiesEXT::maxTransformFeedbackBufferDataStride
-
RayPayloadNV
storage class must only be used in ray generation, any hit, closest hit or miss shaders. -
IncomingRayPayloadNV
storage class must only be used in closest hit, any hit, or miss shaders. -
HitAttributeNV
storage class must only be used in intersection, any hit, or closest hit shaders. -
CallableDataNV
storage class must only be used in ray generation, closest hit, miss, and callable shaders. -
IncomingCallableDataNV
storage class must only be used in callable shaders.
Precision and Operation of SPIR-V Instructions
The following rules apply to half, single, and double-precision floating point instructions:
-
Positive and negative infinities and positive and negative zeros are generated as dictated by IEEE 754, but subject to the precisions allowed in the following table.
-
Dividing a non-zero by a zero results in the appropriately signed IEEE 754 infinity.
-
Signaling NaNs are not required to be generated and exceptions are never raised. Signaling NaNs may be converted to quiet NaN values by any floating-point instruction.
-
By default, the implementation may perform optimizations on half, single, or double-precision floating-point instructions respectively that ignore the sign of a zero, or assume that arguments and results are not NaNs or \(\pm\infty\). This does not apply to OpIsNan and OpIsInf, which must always correctly detect NaNs and \(\pm\infty\). If the entry point is declared with the SignedZeroInfNanPreserve execution mode, then the sign of a zero, NaNs, and \(\pm\infty\) must not be ignored. -
The following core SPIR-V instructions must respect the SignedZeroInfNanPreserve execution mode: OpPhi, OpSelect, OpReturnValue, OpVectorExtractDynamic, OpVectorInsertDynamic, OpVectorShuffle, OpCompositeConstruct, OpCompositeExtract, OpCompositeInsert, OpCopyObject, OpTranspose, OpFConvert, OpFNegate, OpFAdd, OpFSub, OpFMul, OpStore. This execution mode must also be respected by OpLoad except for loads from the Input storage class in the fragment shader stage with the floating-point result type. Other SPIR-V instructions may also respect the SignedZeroInfNanPreserve execution mode.
-
-
Denormalized values are supported.
-
By default, any half, single, or double-precision denormalized value input into a shader or potentially generated by any instruction or any extended instructions for GLSL in a shader may be flushed to zero.
-
If the entry point is declared with the
DenormFlushToZero
execution mode, then for the affected instructions the denormalized result must be flushed to zero and the denormalized operands may be flushed to zero. Denormalized values obtained via unpacking an integer into a vector of values with smaller bit width and interpreting those values as floating-point numbers must be flushed to zero. -
The following core SPIR-V instructions must respect the DenormFlushToZero execution mode: OpSpecConstantOp (except when the opcode is OpQuantizeToF16), OpFConvert, OpFNegate, OpFAdd, OpFSub, OpFMul, OpFDiv, OpFRem, OpFMod, OpVectorTimesScalar, OpMatrixTimesScalar, OpVectorTimesMatrix, OpMatrixTimesVector, OpMatrixTimesMatrix, OpOuterProduct, OpDot; and the following extended instructions for GLSL: Round, RoundEven, Trunc, FAbs, Floor, Ceil, Fract, Radians, Degrees, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Tanh, Asinh, Acosh, Atanh, Atan2, Pow, Exp, Log, Exp2, Log2, Sqrt, InverseSqrt, Determinant, MatrixInverse, Modf, ModfStruct, FMin, FMax, FClamp, FMix, Step, SmoothStep, Fma, UnpackHalf2x16, UnpackDouble2x32, Length, Distance, Cross, Normalize, FaceForward, Reflect, Refract, NMin, NMax, NClamp. Other SPIR-V instructions may also respect the DenormFlushToZero execution mode. -
The following core SPIR-V instructions must respect the DenormPreserve execution mode: OpPhi, OpSelect, OpReturnValue, OpVectorExtractDynamic, OpVectorInsertDynamic, OpVectorShuffle, OpCompositeConstruct, OpCompositeExtract, OpCompositeInsert, OpCopyObject, OpTranspose, OpStore, OpSpecConstantOp, OpFConvert, OpFNegate, OpFAdd, OpFSub, OpFMul, OpVectorTimesScalar, OpMatrixTimesScalar, OpVectorTimesMatrix, OpMatrixTimesVector, OpMatrixTimesMatrix, OpOuterProduct, OpDot, OpFOrdEqual, OpFUnordEqual, OpFOrdNotEqual, OpFUnordNotEqual, OpFOrdLessThan, OpFUnordLessThan, OpFOrdGreaterThan, OpFUnordGreaterThan, OpFOrdLessThanEqual, OpFUnordLessThanEqual, OpFOrdGreaterThanEqual, OpFUnordGreaterThanEqual; and the following extended instructions for GLSL: FAbs, FSign, Radians, Degrees, FMin, FMax, FClamp, FMix, Fma, PackHalf2x16, PackDouble2x32, UnpackHalf2x16, UnpackDouble2x32, NMin, NMax, NClamp. This execution mode must also be respected by OpLoad except for loads from the Input storage class in the fragment shader stage with the floating-point result type. Other SPIR-V instructions may also respect the DenormPreserve execution mode.
-
The precision of double-precision instructions is at least that of single precision.
The precision of operations is defined either in terms of rounding, as an error bound in ULP, or as inherited from a formula as follows.
Operations described as “correctly rounded” will return the infinitely
precise result, x, rounded so as to be representable in
floating-point.
The rounding mode is not specified, unless the entry point is declared with
the RoundingModeRTE
or the RoundingModeRTZ
execution mode.
These execution modes affect only correctly rounded SPIR-V instructions.
These execution modes do not affect OpQuantizeToF16
.
If the rounding mode is not specified then this rounding is implementation
specific, subject to the following rules.
If x is exactly representable then x will be returned.
Otherwise, either the floating-point value closest to and no less than
x or the value closest to and no greater than x will be
returned.
Where an error bound of n ULP (units in the last place) is given, for an operation with infinitely precise result x the value returned must be in the range [x - n * ulp(x), x + n * ulp(x)]. The function ulp(x) is defined as follows:
-
If there exist non-equal floating-point numbers a and b such that a ≤ x ≤ b then ulp(x) is the minimum possible distance between such numbers, \(ulp(x) = \mathrm{min}_{a,b} | b - a |\). If such numbers do not exist then ulp(x) is defined to be the difference between the two finite floating-point numbers nearest to x.
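As an informative example, assuming IEEE 754 binary32 arithmetic: the representable numbers in the interval \([1,2)\) are spaced \(2^{-23}\) apart, so for \(x = 1.5\) the definition above gives \(ulp(x) = 2^{-23}\), and an error bound of 2 ULP permits any returned value in \([1.5 - 2^{-22}, 1.5 + 2^{-22}]\).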
Where the range of allowed return values includes any value of magnitude larger than that of the largest representable finite floating-point number, operations may return an infinity of the appropriate sign. If the infinitely precise result of the operation is not mathematically defined then the value returned is undefined.
Where an operation’s precision is described as being inherited from a
formula, the result returned must be at least as accurate as the result of
computing an approximation to x using a formula equivalent to the
given formula applied to the supplied inputs.
Specifically, the formula given may be transformed using the mathematical
associativity, commutativity and distributivity of the operators involved to
yield an equivalent formula.
The SPIR-V precision rules, when applied to each such formula and the given
input values, define a range of permitted values.
If NaN is one of the permitted values then the operation may return
any result, otherwise let the largest permitted value in any of the ranges
be Fmax and the smallest be Fmin.
The operation must return a value in the range [x - E, x + E] where
\(E = \mathrm{max} \left( | x - F_{\mathrm{min}} |, | x -
F_{\mathrm{max}} | \right) \).
If the entry point is declared with the DenormFlushToZero
execution
mode, then any intermediate denormal value(s) while evaluating the formula
may be flushed to zero.
Denormal final results must be flushed to zero.
If the entry point is declared with the DenormPreserve
execution mode,
then denormals must be preserved throughout the formula.
For half- (16 bit) and single- (32 bit) precision instructions, precisions are required to be at least as follows:
[Table: required precision of core SPIR-V floating-point instructions, in single precision (unless decorated with RelaxedPrecision) and in half precision; entries are given as “correctly rounded”, as “correct result”, as an ULP bound over a restricted range, or as inherited from a formula such as x - y × trunc(x/y) or x - y × floor(x/y).]
[Table: required precision of GLSL.std.450 extended instructions, in single precision (unless decorated with RelaxedPrecision) and in half precision; bounds are given as “correctly rounded”, in ULP, as absolute error over a stated range such as \([-\pi, \pi]\) or [0.5, 2.0], or as inherited from an equivalent formula.]
GLSL.std.450 extended instructions specifically defined in terms of the above instructions inherit the above errors. GLSL.std.450 extended instructions not listed above and not defined in terms of the above have undefined precision. These include, for example, the trigonometric functions and determinant.
For the OpSRem
and OpSMod
instructions, if either operand is
negative the result is undefined.
Compatibility Between SPIR-V Image Formats And Vulkan Formats
Images which are read from or written to by shaders must have SPIR-V image formats compatible with the Vulkan image formats backing the image under the circumstances described for texture image validation. The compatible formats are:
[Table: each SPIR-V Image Format and the compatible Vulkan VkFormat.]
Appendix B: Memory Model
Agent
Operation is a general term for any task that is executed on the system.
An operation is by definition something that is executed, thus if an instruction is skipped due to flow control it does not constitute an operation.
Each operation is executed by a particular agent. Possible agents include each shader invocation, each host thread, and each fixed-function stage of the pipeline.
Memory Location
A memory location identifies unique storage for 8 bits of data. Memory operations access a set of memory locations consisting of one or more memory locations at a time, e.g. an operation accessing a 32-bit integer in memory would read/write a set of four memory locations. Two sets of memory locations overlap if the intersection of their sets of memory locations is non-empty. A memory operation must not affect memory at a memory location not within its set of memory locations.
Memory locations for buffers and images are explicitly allocated in VkDeviceMemory objects, and are implicitly allocated for SPIR-V variables in each shader invocation.
Allocation
The values stored in newly allocated memory locations are determined by a SPIR-V variable’s initializer, if present, or else are undefined. At the time an allocation is created there have been no memory operations to any of its memory locations. The initialization is not considered to be a memory operation.
For tessellation control shader output variables, a consequence of initialization not being considered a memory operation is that some implementations may need to insert a barrier between the initialization of the output variables and any reads of those variables.
Memory Operation
For an operation A and memory location M:
A write whose value is the same as what was already in those memory locations is still considered to be a write and has all the same effects.
Reference
A reference is an object that a particular agent can use to access a set of memory locations. On the host, a reference is a host virtual address. On the device, a reference is:
-
The descriptor that a variable is bound to, for variables in Image, Uniform, or StorageBuffer storage classes. If the variable is an array (or array of arrays, etc.) then each element of the array may be a unique reference.
-
The variable itself for variables in other storage classes.
Two memory accesses through distinct references may require availability and visibility operations as defined below.
Program-Order
A dynamic instance of an instruction is defined in SPIR-V (https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#DynamicInstance) as a way of referring to a particular execution of a static instruction. Program-order is an ordering on dynamic instances of instructions executed by a single shader invocation:
-
(Basic block): If instructions A and B are in the same basic block, and A is listed in the module before B, then the n’th dynamic instance of A is program-ordered before the n’th dynamic instance of B.
-
(Branch): The dynamic instance of a branch or switch instruction is program-ordered before the dynamic instance of the OpLabel instruction to which it transfers control.
-
(Call entry): The dynamic instance of a function call instruction is program-ordered before the dynamic instances of the
OpFunctionParameter
instructions and the body of the called function. -
(Call exit): The dynamic instance of the instruction following a function call instruction is program-ordered after the dynamic instance of the return instruction executed by the called function.
-
(Transitive Closure): If dynamic instance A of any instruction is program-ordered before dynamic instance B of any instruction and B is program-ordered before dynamic instance C of any instruction then A is program-ordered before C.
-
(Complete definition): No other dynamic instances are program-ordered.
For instructions executed on the host, the source language defines the program-order relation (e.g. as “sequenced-before”).
Scope
A scope describes a set of shader invocations, where each such set is a scope instance. Scopes are defined hierarchically such that a more inclusive scope includes one or more sets of less inclusive scope instances. The scopes defined by SPIR-V are as follows, defined from most inclusive to least inclusive:
-
CrossDevice
identifies all shader invocations in a Vulkan instance across all shader launches, and all host threads interacting with that instance. -
Device
identifies all shader invocations that execute on a given device, including those from different shader launches. -
QueueFamilyKHR
identifies all shader invocations that execute on any queue in a given queue family, including those from different shader launches. -
Workgroup
identifies all invocations in a single workgroup. -
Subgroup
identifies all invocations in a single subgroup. -
Invocation
identifies a single invocation.
Atomic and barrier instructions include scopes which identify sets of shader invocations that must obey the requested ordering and atomicity rules of the operation, as defined below.
Atomic Operation
An atomic operation on the device is any SPIR-V operation whose name
begins with OpAtomic
.
An atomic operation on the host is any operation performed with an
std::atomic typed object.
Each atomic operation has a memory scope and a
semantics.
Informally, the scope determines which other agents it is atomic with
respect to, and the semantics constrains
its ordering against other memory accesses.
Device atomic operations have explicit scopes and semantics.
Each host atomic operation implicitly uses the CrossDevice
scope, and
uses a memory semantics equivalent to a C++ std::memory_order value of
relaxed, acquire, release, acq_rel, or seq_cst.
Two atomic operations A and B are potentially-mutually-ordered if and only if all of the following are true:
-
They access the same set of memory locations.
-
They use the same reference.
-
A is in the instance of B’s memory scope.
-
B is in the instance of A’s memory scope.
Two atomic operations A and B are mutually-ordered if and only if they are potentially-mutually-ordered and any of the following are true:
-
A and B are both device operations.
-
A and B are both host operations.
-
A is a device operation, B is a host operation, and the implementation supports concurrent host- and device-atomics.
If two atomic operations are not mutually-ordered, and if their sets of memory locations overlap, then each must be synchronized against the other as if they were non-atomic operations.
Scoped Modification Order
For a given atomic operation A, all atomic operations that are mutually-ordered with A occur in an order known as A’s scoped modification order. A’s scoped modification order relates no other operations.
Invocations outside the instance of A’s memory scope may observe the values at A’s set of memory locations becoming visible to it in an order that disagrees with the scoped modification order.
It is valid to have non-atomic operations or atomics in a different scope instance to the same set of memory locations, as long as they are synchronized against each other as if they were non-atomic (if they are not, it is treated as a data race). That means this definition of A’s scoped modification order could include atomic operations that occur much later, after intervening non-atomics. That is a bit non-intuitive, but it helps to keep this definition simple and non-circular.
Memory Semantics
Non-atomic memory operations, by default, may be observed by one agent in a different order than they were written by another agent.
Atomics and some synchronization operations include memory semantics, which are flags that constrain the order in which other memory accesses (including non-atomic memory accesses and availability and visibility operations) performed by the same agent can be observed by other agents, or can observe accesses by other agents.
Device instructions that include semantics are OpAtomic
*,
OpControlBarrier
, OpMemoryBarrier
, and OpMemoryNamedBarrier
.
Host instructions that include semantics are some std::atomic methods and
memory fences.
SPIR-V supports the following memory semantics:
-
Relaxed: No constraints on order of other memory accesses.
-
Acquire: A memory read with this semantic performs an acquire operation. A memory barrier with this semantic is an acquire barrier.
-
Release: A memory write with this semantic performs a release operation. A memory barrier with this semantic is a release barrier.
-
AcquireRelease: A memory read-modify-write operation with this semantic performs both an acquire operation and a release operation, and inherits the limitations on ordering from both of those operations. A memory barrier with this semantic is both a release and acquire barrier.
SPIR-V does not support “consume” semantics on the device.
The memory semantics operand also includes storage class semantics which indicate which storage classes are constrained by the synchronization. SPIR-V storage class semantics include:
-
UniformMemory
-
WorkgroupMemory
-
ImageMemory
-
OutputMemoryKHR
Each SPIR-V memory operation accesses a single storage class. Semantics in synchronization operations can include a combination of storage classes.
The UniformMemory storage class semantic applies to accesses to memory in the Uniform and StorageBuffer storage classes. The WorkgroupMemory storage class semantic applies to accesses to memory in the Workgroup storage class. The ImageMemory storage class semantic applies to accesses to memory in the Image storage class. The OutputMemoryKHR storage class semantic applies to accesses to memory in the Output storage class.
Informally, these constraints limit how memory operations can be reordered, and these limits apply not only to the order of accesses as performed in the agent that executes the instruction, but also to the order the effects of writes become visible to all other agents within the same instance of the instruction’s memory scope.
Release and acquire operations in different threads can act as synchronization operations, to guarantee that writes that happened before the release are visible after the acquire. (This is not a formal definition, just an informative forward reference.)
The OutputMemoryKHR storage class semantic is only useful in tessellation control shaders, which is the only execution model where output variables are shared between invocations.
The memory semantics operand also optionally includes availability and visibility flags, which apply optional availability and visibility operations as described in availability and visibility. The availability/visibility flags are:
-
MakeAvailable: Semantics must be Release or AcquireRelease. Performs an availability operation before the release operation or barrier.
-
MakeVisible: Semantics must be Acquire or AcquireRelease. Performs a visibility operation after the acquire operation or barrier.
The specifics of these operations are defined in Availability and Visibility Semantics.
Host atomic operations may support a different list of memory semantics and synchronization operations, depending on the host architecture and source language.
Release Sequence
After an atomic operation A performs a release operation on a set of memory locations M, the release sequence headed by A is the longest continuous subsequence of A’s scoped modification order that consists of:
-
the atomic operation A as its first element
-
atomic read-modify-write operations on M by any agent
The atomics in the last bullet must be mutually-ordered with A by virtue of being in A’s scoped modification order. |
This intentionally omits “atomic writes to M performed by the same agent that performed A”, which is present in the corresponding C++ definition. |
Synchronizes-With
Synchronizes-with is a relation between operations, where each operation is either an atomic operation or a memory barrier (aka fence on the host).
If A and B are atomic operations, then A synchronizes-with B if and only if all of the following are true:
-
A performs a release operation
-
B performs an acquire operation
-
A and B are mutually-ordered
-
B reads a value written by A or by an operation in the release sequence headed by A
OpControlBarrier, OpMemoryBarrier, and OpMemoryNamedBarrier are memory barrier instructions in SPIR-V.
If A is a release barrier and B is an atomic operation that performs an acquire operation, then A synchronizes-with B if and only if all of the following are true:
-
there exists an atomic write X (with any memory semantics)
-
A is program-ordered before X
-
X and B are mutually-ordered
-
B reads a value written by X or by an operation in the release sequence headed by X
-
If X is relaxed, it is still considered to head a hypothetical release sequence for this rule
-
-
A and B are in the instance of each other’s memory scopes
-
X’s storage class is in A’s semantics.
If A is an atomic operation that performs a release operation and B is an acquire barrier, then A synchronizes-with B if and only if all of the following are true:
-
there exists an atomic read X (with any memory semantics)
-
X is program-ordered before B
-
X and A are mutually-ordered
-
X reads a value written by A or by an operation in the release sequence headed by A
-
A and B are in the instance of each other’s memory scopes
-
X’s storage class is in B’s semantics.
If A is a release barrier and B is an acquire barrier, then A synchronizes-with B if all of the following are true:
-
there exists an atomic write X (with any memory semantics)
-
A is program-ordered before X
-
there exists an atomic read Y (with any memory semantics)
-
Y is program-ordered before B
-
X and Y are mutually-ordered
-
Y reads the value written by X or by an operation in the release sequence headed by X
-
If X is relaxed, it is still considered to head a hypothetical release sequence for this rule
-
-
A and B are in the instance of each other’s memory scopes
-
X’s and Y’s storage class is in A’s and B’s semantics.
-
NOTE: X and Y must have the same storage class, because they are mutually ordered.
-
If A is a release barrier and B is an acquire barrier and C is a control barrier (where A can optionally equal C and B can optionally equal C), then A synchronizes-with B if all of the following are true:
-
A is program-ordered before (or equals) C
-
C is program-ordered before (or equals) B
-
A and B are in the instance of each other’s memory scopes
-
A and B are in the instance of C’s execution scope
This is similar to the barrier-barrier synchronization above, but with a control barrier filling the role of the relaxed atomics. |
No other release and acquire barriers synchronize-with each other.
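The release-barrier/acquire-barrier rule above has a direct host analogue using fences; a minimal C11 sketch (names illustrative), in which the relaxed atomics play the roles of X and Y and the two fences play the roles of A and B:

#include <stdatomic.h>

int data;                                      /* ordinary, non-atomic data       */
atomic_int guard = ATOMIC_VAR_INIT(0);

void writer(void)
{
    data = 1;                                  /* non-atomic write                */
    atomic_thread_fence(memory_order_release); /* release barrier (A)             */
    atomic_store_explicit(&guard, 1, memory_order_relaxed);     /* relaxed write X */
}

void reader(void)
{
    if (atomic_load_explicit(&guard, memory_order_relaxed) == 1) { /* relaxed read Y */
        atomic_thread_fence(memory_order_acquire);  /* acquire barrier (B)        */
        /* A synchronizes-with B, so the write to data is guaranteed visible.     */
        int v = data;
        (void)v;
    }
}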
System-Synchronizes-With
System-synchronizes-with is a relation between arbitrary operations on the device or host. Certain operations system-synchronize-with each other, which informally means the first operation occurs before the second and that the synchronization is performed without using application-visible memory accesses.
If there is an execution dependency between two operations A and B, then the operation in the first synchronization scope system-synchronizes-with the operation in the second synchronization scope.
This covers all Vulkan synchronization primitives, including device operations executing before a synchronization primitive is signaled, wait operations happening before subsequent device operations, signal operations happening before host operations that wait on them, and host operations happening before vkQueueSubmit. The list is spread throughout the synchronization chapter, and is not repeated here. |
System-synchronizes-with implicitly includes all storage class semantics and has CrossDevice scope.
If A system-synchronizes-with B, we also say A is system-synchronized-before B and B is system-synchronized-after A.
Private vs. Non-Private
By default, non-atomic memory operations are treated as private, meaning such a memory operation is not intended to be used for communication with other agents. Memory operations with the NonPrivatePointerKHR/NonPrivateTexelKHR bit set are treated as non-private, and are intended to be used for communication with other agents.
More precisely, for private memory operations to be Location-Ordered between distinct agents requires using system-synchronizes-with rather than shader-based synchronization. Private memory operations still obey program-order.
Atomic operations are always considered non-private.
Inter-Thread-Happens-Before
Let SC be a non-empty set of storage class semantics. Then (using template syntax) operation A inter-thread-happens-before<SC> operation B if and only if any of the following is true:
-
A system-synchronizes-with B
-
A synchronizes-with B, and both A and B have all of SC in their semantics
-
A is an operation on memory in a storage class in SC or that has all of SC in its semantics, B is a release barrier or release atomic with all of SC in its semantics, and A is program-ordered before B
-
A is an acquire barrier or acquire atomic with all of SC in its semantics, B is an operation on memory in a storage class in SC or that has all of SC in its semantics, and A is program-ordered before B
-
A and B are both host operations and A inter-thread-happens-before B as defined in the host language spec
-
A inter-thread-happens-before<SC> some X and X inter-thread-happens-before<SC> B
Happens-Before
Operation A happens-before operation B if and only if any of the following is true:
-
A is program-ordered before B
-
A inter-thread-happens-before<SC> B for some set of storage classes SC
Happens-after is defined similarly.
Unlike C++, happens-before is not always sufficient for a write to be visible to a read. Additional availability and visibility operations may be required for writes to be visible-to other memory accesses. |
Happens-before is not transitive, but each of program-order and inter-thread-happens-before<SC> are transitive. These can be thought of as covering the “single-threaded” case and the “multi-threaded” case, and it’s not necessary (and not valid) to form chains between the two. |
Availability and Visibility
Availability and visibility are states of a write operation, which (informally) track how far the write has permeated the system, i.e. which agents and references are able to observe the write. Availability state is per memory domain. Visibility state is per (agent,reference) pair. Availability and visibility states are per-memory location for each write.
Memory domains are named according to the agents whose memory accesses use the domain. Domains used by shader invocations are organized hierarchically into multiple smaller memory domains which correspond to the different scopes. The memory domains defined in Vulkan include:
-
host - accessible by host agents
-
device - accessible by all device agents for a particular device
-
shader - accessible by shader agents for a particular device, corresponding to the Device scope
-
queue family instance - accessible by shader agents in a single queue family, corresponding to the QueueFamilyKHR scope
-
workgroup instance - accessible by shader agents in the same workgroup, corresponding to the Workgroup scope
-
subgroup instance - accessible by shader agents in the same subgroup, corresponding to the Subgroup scope
These do not correspond to storage classes or to device-local and host-local VkDeviceMemory allocations; rather, they indicate whether a write can be made visible only to agents in the same subgroup, the same workgroup, any shader invocation, anywhere on the device, or on the host. The shader, queue family instance, workgroup instance, and subgroup instance domains are only used for shader-based availability/visibility operations; in other cases writes can be made available from/visible to the shader via the device domain.
Availability operations, visibility operations, and memory domain operations alter the state of the write operations that happen-before them, and which are included in their source scope to be available or visible to their destination scope.
-
For an availability operation, the source scope is a set of (agent,reference,memory location) tuples, and the destination scope is a set of memory domains.
-
For a memory domain operation, the source scope is a memory domain and the destination scope is a memory domain.
-
For a visibility operation, the source scope is a set of memory domains and the destination scope is a set of (agent,reference,memory location) tuples.
How the scopes are determined depends on the specific operation. Availability and memory domain operations expand the set of memory domains to which the write is available. Visibility operations expand the set of (agent,reference,memory location) tuples to which the write is visible.
Recall that availability and visibility states are per-memory location, and let W be a write operation to one or more locations performed by agent A via reference R. Let L be one of the locations written. (W,L) (the write W to L), is initially not available to any memory domain and only visible to (A,R,L). An availability operation AV that happens-after W and that includes (A,R,L) in its source scope makes (W,L) available to the memory domains in its destination scope.
A memory domain operation DOM that happens-after AV and for which (W,L) is available in the source scope makes (W,L) available in the destination memory domain.
A visibility operation VIS that happens-after AV (or DOM) and for which (W,L) is available in any domain in the source scope makes (W,L) visible to all (agent,reference,L) tuples included in its destination scope.
If write W2 happens-after W, and their sets of memory locations overlap, then W will not be available/visible to all agents/references for those memory locations that overlap (and future AV/DOM/VIS ops can’t revive W’s write to those locations).
Availability, memory domain, and visibility operations are treated like other non-atomic memory accesses for the purpose of memory semantics, meaning they can be ordered by release-acquire sequences or memory barriers.
Availability, Visibility, and Domain Operations
The following operations generate availability, visibility, and domain operations. When multiple availability/visibility/domain operations are described, they are system-synchronized-with each other in the order listed.
An operation that performs a memory dependency generates:
-
If the source access mask includes VK_ACCESS_HOST_WRITE_BIT, then the dependency includes a memory domain operation from host domain to device domain.
-
An availability operation with source scope of all writes in the first access scope of the dependency and a destination scope of the device domain.
-
A visibility operation with source scope of the device domain and destination scope of the second access scope of the dependency.
-
If the destination access mask includes VK_ACCESS_HOST_READ_BIT or VK_ACCESS_HOST_WRITE_BIT, then the dependency includes a memory domain operation from device domain to host domain.
vkFlushMappedMemoryRanges performs an availability operation, with a source scope of (agents,references) = (all host threads, all mapped memory ranges passed to the command), and destination scope of the host domain.
vkInvalidateMappedMemoryRanges performs a visibility operation, with a source scope of the host domain and a destination scope of (agents,references) = (all host threads, all mapped memory ranges passed to the command).
vkQueueSubmit performs a memory domain operation from host to device, and a visibility operation with source scope of the device domain and destination scope of all agents and references on the device.
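As an illustration of the host-to-device path described above, a hedged C sketch follows; the helper name, srcData, srcSize, and the already-created handles are assumptions, and error handling is omitted:

#include <string.h>
#include <vulkan/vulkan.h>

/* Hypothetical helper: upload host data and hand it to the device domain. */
void uploadAndSubmit(VkDevice device, VkDeviceMemory memory,
                     VkQueue queue, const VkSubmitInfo *submit,
                     const void *srcData, size_t srcSize)
{
    void *ptr = NULL;
    vkMapMemory(device, memory, 0, VK_WHOLE_SIZE, 0, &ptr);
    memcpy(ptr, srcData, srcSize);                        /* host write W */

    /* Availability operation: make the host writes available to the host domain
       (needed for non-coherent memory). */
    VkMappedMemoryRange range = {
        .sType  = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
        .memory = memory,
        .offset = 0,
        .size   = VK_WHOLE_SIZE,
    };
    vkFlushMappedMemoryRanges(device, 1, &range);

    /* vkQueueSubmit performs a memory domain operation from the host domain to
       the device domain and a visibility operation to all device agents and
       references, so the submitted work can read the data written above. */
    vkQueueSubmit(queue, 1, submit, VK_NULL_HANDLE);
    vkUnmapMemory(device, memory);
}

The reverse direction (device writes read by the host) uses a memory dependency whose destination access mask includes VK_ACCESS_HOST_READ_BIT, followed by vkInvalidateMappedMemoryRanges on the host.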
Availability and Visibility Semantics
A memory barrier or atomic operation via agent A that includes MakeAvailable in its semantics performs an availability operation whose source scope includes agent A and all references in the storage classes in that instruction’s storage class semantics, and all memory locations, and whose destination scope is a set of memory domains selected as specified below. The implicit availability operation is program-ordered between the barrier or atomic and all other operations program-ordered before the barrier or atomic.
A memory barrier or atomic operation via agent A that includes MakeVisible in its semantics performs a visibility operation whose source scope is a set of memory domains selected as specified below, and whose destination scope includes agent A and all references in the storage classes in that instruction’s storage class semantics, and all memory locations. The implicit visibility operation is program-ordered between the barrier or atomic and all other operations program-ordered after the barrier or atomic.
The memory domains are selected based on the memory scope of the instruction as follows:
-
Device scope uses the shader domain
-
QueueFamilyKHR scope uses the queue family instance domain
-
Workgroup scope uses the workgroup instance domain
-
Subgroup scope uses the subgroup instance domain
-
Invocation scope performs no availability/visibility operations.
When an availability operation performed by an agent A includes a memory domain D in its destination scope, where D corresponds to scope instance S, it also includes the memory domains that correspond to each smaller scope instance S' that is a subset of S and that includes A. Similarly for visibility operations.
Per-Instruction Availability and Visibility Semantics
A memory write instruction that includes MakePointerAvailable, or an image write instruction that includes MakeTexelAvailable, performs an availability operation whose source scope includes the agent and reference used to perform the write and the memory locations written by the instruction, and whose destination scope is a set of memory domains selected by the Scope operand specified in Availability and Visibility Semantics. The implicit availability operation is program-ordered between the write and all other operations program-ordered after the write.
A memory read instruction that includes MakePointerVisible, or an image read instruction that includes MakeTexelVisible, performs a visibility operation whose source scope is a set of memory domains selected by the Scope operand as specified in Availability and Visibility Semantics, and whose destination scope includes the agent and reference used to perform the read and the memory locations read by the instruction. The implicit visibility operation is program-ordered between the read and all other operations program-ordered before the read.
Although reads with per-instruction visibility only perform visibility ops from the shader or workgroup instance or subgroup instance domain, they will also see writes that were made visible via the device domain, i.e. those writes previously performed by non-shader agents and made visible via API commands. |
It is expected that all invocations in a subgroup execute on the same processor with the same path to memory, and thus availability and visibility operations with subgroup scope can be expected to be “free”. |
Location-Ordered
Let X and Y be memory accesses to overlapping sets of memory locations M, where X != Y. Let (AX,RX) be the agent and reference used for X, and (AY,RY) be the agent and reference used for Y. For now, let “→” denote happens-before and “→rcpo” denote the reflexive closure of program-ordered before.
If D1 and D2 are different memory domains, then let DOM(D1,D2) be a memory domain operation from D1 to D2. Otherwise, let DOM(D,D) be a placeholder such that X→DOM(D,D)→Y if and only if X→Y.
X is location-ordered before Y for a location L in M if and only if any of the following is true:
-
AX == AY and RX == RY and X→Y
-
NOTE: this case means no availability/visibility ops required when it’s the same (agent,reference).
-
-
X and Y are mutually-ordered atomics, and X is before Y in X’s scoped modification order
-
X is a read, both X and Y are non-private, and X→Y
-
X is a read, and X (transitively) system-synchronizes with Y
-
If RX == RY and AX and AY access a common memory domain D (e.g. are in the same workgroup instance if D is the workgroup instance domain), and both X and Y are non-private:
-
X is a write, Y is a write, AV(AX,RX,D,L) is an availability operation making (X,L) available to domain D, and X→rcpoAV(AX,RX,D,L)→Y
-
X is a write, Y is a read, AV(AX,RX,D,L) is an availability operation making (X,L) available to domain D, VIS(AY,RY,D,L) is a visibility operation making writes to L available in domain D visible to Y, and X→rcpoAV(AX,RX,D,L)→VIS(AY,RY,D,L)→rcpoY
-
-
Let DX and DY each be either the device domain or the host domain, depending on whether AX and AY execute on the device or host:
-
X is a write and Y is a write, and X→AV(AX,RX,DX,L)→DOM(DX,DY)→Y
-
X is a write and Y is a read, and X→AV(AX,RX,DX,L)→DOM(DX,DY)→VIS(AY,RY,DY,L)→Y
-
The final bullet (synchronization through device/host domain) requires API-level synchronization operations, since the device/host domains are not accessible via shader instructions. And “device domain” is not to be confused with “device scope”, which synchronizes through the “shader domain”. |
Data Race
Let X and Y be operations that access overlapping sets of memory locations M, where X != Y, and at least one of X and Y is a write, and X and Y are not mutually-ordered atomic operations. If there does not exist a location-ordered relation between X and Y for each location in M, then there is a data race.
Applications must ensure that no data races occur during the execution of their application.
Data races can only occur due to instructions that are actually executed, and for example an instruction skipped due to flow control must not contribute to a data race. |
Visible-To
Let X be a write and Y be a read whose sets of memory locations overlap, and let M be the set of memory locations that overlap. Let M2 be a non-empty subset of M. Then X is visible-to Y for memory locations M2 if and only if all of the following are true:
-
X is location-ordered before Y for each location L in M2.
-
There does not exist another write Z to any location L in M2 such that X is location-ordered before Z for location L and Z is location-ordered before Y for location L.
If X is visible-to Y, then Y reads the value written by X for locations M2.
It is possible for there to be a write between X and Y that overwrites a subset of the memory locations, but the remaining memory locations (M2) will still be visible-to Y. |
Scoped Modification Order Coherence
Let A and B be mutually-ordered atomic operations, where A happens-before B, and let O be A’s scoped modification order. Then:
-
If A and B are both writes, then A must be earlier than B in O
-
If A and B are both reads, then the write that A takes its value from must be earlier in O than (or the same as) the write that B takes its value from
-
If A is a write and B is a read, then B must take its value from A or a write later than A in O
-
If A is a read and B is a write, then A must take its value from a write earlier than B in O
Shader I/O
If a shader invocation A in a shader stage other than Vertex
performs a
memory read operation X from an object in the Input
storage class, then
X is system-synchronized-after all writes to the corresponding Output
storage variable(s) in the upstream shader invocation(s) that contribute to
generating invocation A, and those writes are all visible-to X.
It is not necessary for the upstream shader invocations to have completed execution, they only need to have generated the output that is being read. |
Deallocation
A call to vkFreeMemory must happen-after all memory operations on all memory locations in that VkDeviceMemory object.
Normally, device memory operations in a given queue are synchronized with vkFreeMemory by having a host thread wait on a fence signalled by that queue, and the wait happens-before the call to vkFreeMemory on the host. |
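A hedged sketch of that pattern (the helper name is illustrative, the handles are assumed to have been created elsewhere, and error handling is omitted):

#include <vulkan/vulkan.h>

/* Hypothetical helper: free device memory only after the queue work that may
   access it has completed. */
void freeAfterFence(VkDevice device, VkQueue queue,
                    const VkSubmitInfo *submit, VkFence fence,
                    VkDeviceMemory memory)
{
    vkQueueSubmit(queue, 1, submit, fence);        /* device work using 'memory' */

    /* Waiting on the fence system-synchronizes the device work with the host,
       so all device memory operations happen-before the free. */
    vkWaitForFences(device, 1, &fence, VK_TRUE, UINT64_MAX);

    vkFreeMemory(device, memory, NULL);
}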
The deallocation of SPIR-V variables is managed by the system and happens-after all operations on those variables.
Informative Descriptions
This subsection is non-normative, and offers more easily understandable consequences of the memory model for app/compiler developers.
Let SC be the storage class(es) specified by a release or acquire operation or barrier.
-
An atomic write with release semantics must not be reordered against any read or write to SC that is program-ordered before it (regardless of the storage class the atomic is in).
-
An atomic read with acquire semantics must not be reordered against any read or write to SC that is program-ordered after it (regardless of the storage class the atomic is in).
-
Any write to SC program-ordered after a release barrier must not be reordered against any read or write to SC program-ordered before that barrier.
-
Any read from SC program-ordered before an acquire barrier must not be reordered against any read or write to SC program-ordered after the barrier.
A control barrier (even if it has no memory semantics) must not be reordered against any memory barriers.
This memory model allows memory accesses with and without availability and visibility operations, as well as atomic operations, all to be performed on the same memory location. This is critical to allow reasoning about memory that is reused in multiple ways, e.g. across the lifetime of different shader invocations or draw calls. While GLSL (and legacy SPIR-V) applies the “coherent” decoration to variables (for historical reasons), this model treats each memory access instruction as having optional implicit availability/visibility operations. GLSL to SPIR-V compilers should map all (non-atomic) operations on a coherent variable to Make{Pointer,Texel}{Available,Visible} flags in this model.
Atomic operations implicitly have availability/visibility operations, and the scope of those operations is taken from the atomic operation’s scope.
Tessellation Output Ordering
For SPIR-V that uses the Vulkan Memory Model, the OutputMemory
storage
class is used to synchronize accesses to tessellation control output
variables.
For legacy SPIR-V that does not enable the Vulkan Memory Model via
OpMemoryModel
, tessellation outputs can be ordered using a control
barrier with no particular memory scope or semantics, as defined below.
Let X and Y be memory operations performed by shader invocations AX and AY. Operation X is tessellation-output-ordered before operation Y if and only if all of the following are true:
-
There is a dynamic instance of an
OpControlBarrier
instruction C such that X is program-ordered before C in AX and C is program-ordered before Y in AY. -
AX and AY are in the same instance of C’s execution scope.
If shader invocations AX and AY in the TessellationControl
execution model execute memory operations X and Y, respectively, on the
Output
storage class, and X is tessellation-output-ordered before Y
with a scope of Workgroup
, then X is location-ordered before Y, and if
X is a write and Y is a read then X is visible-to Y.
Appendix C: Compressed Image Formats
The compressed texture formats used by Vulkan are described in the specifically identified sections of the Khronos Data Format Specification, version 1.1.
Unless otherwise described, the quantities encoded in these compressed formats are treated as normalized, unsigned values.
Those formats listed as sRGB-encoded have in-memory representations of R, G and B components which are nonlinearly-encoded as R', G', and B'; any alpha component is unchanged. As part of filtering, the nonlinear R', G', and B' values are converted to linear R, G, and B components; any alpha component is unchanged. The conversion between linear and nonlinear encoding is performed as described in the “KHR_DF_TRANSFER_SRGB” section of the Khronos Data Format Specification.
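For reference, the nonlinear-to-linear (EOTF) part of that conversion can be sketched as below; this is the standard sRGB transfer function, and the normative definition remains the KHR_DF_TRANSFER_SRGB section of the Khronos Data Format Specification:

#include <math.h>

/* Convert one sRGB-encoded component C' in [0,1] to its linear value. */
static double srgb_eotf(double cprime)
{
    return (cprime <= 0.04045) ? cprime / 12.92
                               : pow((cprime + 0.055) / 1.055, 2.4);
}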
Block-Compressed Image Formats
VkFormat | Khronos Data Format Specification description
---|---
Formats described in the “S3TC Compressed Texture Image Formats” chapter: |
VK_FORMAT_BC1_RGB_UNORM_BLOCK | BC1 with no alpha
VK_FORMAT_BC1_RGB_SRGB_BLOCK | BC1 with no alpha, sRGB-encoded
VK_FORMAT_BC1_RGBA_UNORM_BLOCK | BC1 with alpha
VK_FORMAT_BC1_RGBA_SRGB_BLOCK | BC1 with alpha, sRGB-encoded
VK_FORMAT_BC2_UNORM_BLOCK | BC2
VK_FORMAT_BC2_SRGB_BLOCK | BC2, sRGB-encoded
VK_FORMAT_BC3_UNORM_BLOCK | BC3
VK_FORMAT_BC3_SRGB_BLOCK | BC3, sRGB-encoded
Formats described in the “RGTC Compressed Texture Image Formats” chapter: |
VK_FORMAT_BC4_UNORM_BLOCK | BC4 unsigned
VK_FORMAT_BC4_SNORM_BLOCK | BC4 signed
VK_FORMAT_BC5_UNORM_BLOCK | BC5 unsigned
VK_FORMAT_BC5_SNORM_BLOCK | BC5 signed
Formats described in the “BPTC Compressed Texture Image Formats” chapter: |
VK_FORMAT_BC6H_UFLOAT_BLOCK | BC6H (unsigned version)
VK_FORMAT_BC6H_SFLOAT_BLOCK | BC6H (signed version)
VK_FORMAT_BC7_UNORM_BLOCK | BC7
VK_FORMAT_BC7_SRGB_BLOCK | BC7, sRGB-encoded
ETC Compressed Image Formats
The following formats are described in the “ETC2 Compressed Texture Image Formats” chapter of the Khronos Data Format Specification.
VkFormat | Khronos Data Format Specification description
---|---
VK_FORMAT_ETC2_R8G8B8_UNORM_BLOCK | RGB ETC2
VK_FORMAT_ETC2_R8G8B8_SRGB_BLOCK | RGB ETC2 with sRGB encoding
VK_FORMAT_ETC2_R8G8B8A1_UNORM_BLOCK | RGB ETC2 with punch-through alpha
VK_FORMAT_ETC2_R8G8B8A1_SRGB_BLOCK | RGB ETC2 with punch-through alpha and sRGB
VK_FORMAT_ETC2_R8G8B8A8_UNORM_BLOCK | RGBA ETC2
VK_FORMAT_ETC2_R8G8B8A8_SRGB_BLOCK | RGBA ETC2 with sRGB encoding
VK_FORMAT_EAC_R11_UNORM_BLOCK | Unsigned R11 EAC
VK_FORMAT_EAC_R11_SNORM_BLOCK | Signed R11 EAC
VK_FORMAT_EAC_R11G11_UNORM_BLOCK | Unsigned RG11 EAC
VK_FORMAT_EAC_R11G11_SNORM_BLOCK | Signed RG11 EAC
ASTC Compressed Image Formats
ASTC formats are described in the “ASTC Compressed Texture Image Formats” chapter of the Khronos Data Format Specification.
VkFormat | Compressed texel block dimensions | sRGB-encoded
---|---|---
VK_FORMAT_ASTC_4x4_UNORM_BLOCK | 4 × 4 | No
VK_FORMAT_ASTC_4x4_SRGB_BLOCK | 4 × 4 | Yes
VK_FORMAT_ASTC_5x4_UNORM_BLOCK | 5 × 4 | No
VK_FORMAT_ASTC_5x4_SRGB_BLOCK | 5 × 4 | Yes
VK_FORMAT_ASTC_5x5_UNORM_BLOCK | 5 × 5 | No
VK_FORMAT_ASTC_5x5_SRGB_BLOCK | 5 × 5 | Yes
VK_FORMAT_ASTC_6x5_UNORM_BLOCK | 6 × 5 | No
VK_FORMAT_ASTC_6x5_SRGB_BLOCK | 6 × 5 | Yes
VK_FORMAT_ASTC_6x6_UNORM_BLOCK | 6 × 6 | No
VK_FORMAT_ASTC_6x6_SRGB_BLOCK | 6 × 6 | Yes
VK_FORMAT_ASTC_8x5_UNORM_BLOCK | 8 × 5 | No
VK_FORMAT_ASTC_8x5_SRGB_BLOCK | 8 × 5 | Yes
VK_FORMAT_ASTC_8x6_UNORM_BLOCK | 8 × 6 | No
VK_FORMAT_ASTC_8x6_SRGB_BLOCK | 8 × 6 | Yes
VK_FORMAT_ASTC_8x8_UNORM_BLOCK | 8 × 8 | No
VK_FORMAT_ASTC_8x8_SRGB_BLOCK | 8 × 8 | Yes
VK_FORMAT_ASTC_10x5_UNORM_BLOCK | 10 × 5 | No
VK_FORMAT_ASTC_10x5_SRGB_BLOCK | 10 × 5 | Yes
VK_FORMAT_ASTC_10x6_UNORM_BLOCK | 10 × 6 | No
VK_FORMAT_ASTC_10x6_SRGB_BLOCK | 10 × 6 | Yes
VK_FORMAT_ASTC_10x8_UNORM_BLOCK | 10 × 8 | No
VK_FORMAT_ASTC_10x8_SRGB_BLOCK | 10 × 8 | Yes
VK_FORMAT_ASTC_10x10_UNORM_BLOCK | 10 × 10 | No
VK_FORMAT_ASTC_10x10_SRGB_BLOCK | 10 × 10 | Yes
VK_FORMAT_ASTC_12x10_UNORM_BLOCK | 12 × 10 | No
VK_FORMAT_ASTC_12x10_SRGB_BLOCK | 12 × 10 | Yes
VK_FORMAT_ASTC_12x12_UNORM_BLOCK | 12 × 12 | No
VK_FORMAT_ASTC_12x12_SRGB_BLOCK | 12 × 12 | Yes
ASTC decode mode
If the VK_EXT_astc_decode_mode
extension is enabled the ASTC decoding
described in the Khronos Data Format Specification is
modified by replacing or modifying the corresponding sections as described
below.
VkFormat | Decoding mode
---|---
VK_FORMAT_R16G16B16A16_SFLOAT | decode_float16
VK_FORMAT_R8G8B8A8_UNORM | decode_unorm8
VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 | decode_rgb9e5
LDR and HDR Modes
Note
This replaces section 16.5 in the Khronos Data Format Specification. |
The decoding process for LDR content can be simplified if it is known in advance that sRGB output is required. This selection is therefore included as part of the global configuration.
The two modes differ in various ways, as shown in ASTC differences between LDR and HDR modes.
Operation | LDR Mode | HDR Mode
---|---|---
Returned Value | Determined by decoding mode | Determined by decoding mode
sRGB compatible | Yes | No
LDR endpoint decoding precision | 16 bits, or 8 bits for sRGB | 16 bits
HDR endpoint mode results | Error color | As decoded
Error results | Error color | Vector of NaNs (0xFFFF)
The type of the values returned by the decoding process is determined by the decoding mode as shown in ASTC decoding modes.
Decode mode | LDR Mode | HDR Mode
---|---|---
decode_float16 | Vector of FP16 values | Vector of FP16 values
decode_unorm8 | Vector of 8-bit unsigned normalized values | invalid
decode_rgb9e5 | Vector using a shared exponent format | Vector using a shared exponent format
Using the decode_unorm8 decoding mode in HDR mode gives undefined results.
For sRGB, the decoding mode is ignored, and the decoding always returns a vector of 8-bit unsigned normalized values.
The error color is opaque, fully-saturated magenta, (R,G,B,A) = (0xFF,0x00,0xFF,0xFF). This has been chosen as it is much more noticeable than black or white, and occurs far less often in valid images.
For linear RGB decode, the error color may be either opaque fully-saturated magenta (R,G,B,A) = (1.0,0.0,1.0,1.0) or a vector of four NaNs (R,G,B,A) = (NaN,NaN,NaN,NaN). In the latter case, the recommended NaN value returned is 0xFFFF.
When using the decode_rgb9e5 decoding mode in HDR mode, error results will return the error color because NaN cannot be represented.
The error color is returned as an informative response to invalid conditions, including invalid block encodings or use of reserved endpoint modes.
Future, forward-compatible extensions to ASTC may define valid interpretations of these conditions, which will decode to some other color. Therefore, encoders and applications must not rely on invalid encodings as a way of generating the error color.
Note
This replaces section 16.19 in the Khronos Data Format Specification. |
Once the effective weight i for the texel has been calculated, the color endpoints are interpolated and expanded.
For LDR endpoint modes, each color component C is calculated from the corresponding 8-bit endpoint components C0 and C1 as follows:
If sRGB conversion is not enabled, or for the alpha channel in any case, C0 and C1 are first expanded to 16 bits by bit replication:
C0 = (C0 << 8) | C0; C1 = (C1 << 8) | C1;
If sRGB conversion is enabled, C0 and C1 for the R, G, and B channels are expanded to 16 bits differently, as follows:
C0 = (C0 << 8) | 0x80; C1 = (C1 << 8) | 0x80;
C0 and C1 are then interpolated to produce a UNORM16 result C:
C = floor( (C0*(64-i) + C1*i + 32)/64 )
If sRGB conversion is not enabled and the decoding mode is decode_float16, then if C = 65535 the final result is 1.0 (0x3C00); otherwise C is divided by 65536 and the infinite-precision result of the division is converted to FP16 with round-to-zero semantics.
If sRGB conversion is not enabled and the decoding mode is decode_unorm8, then the top 8 bits of the interpolation result for the R, G, B, and A channels are used as the final result.
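Combining the bit replication and interpolation steps above, a minimal sketch of the non-sRGB decode_unorm8 path for a single component (the function name is illustrative):

#include <stdint.h>

/* c0, c1: 8-bit endpoint components; i: effective texel weight in [0,64]. */
uint8_t astc_ldr_decode_unorm8(uint8_t c0, uint8_t c1, uint32_t i)
{
    uint32_t C0 = ((uint32_t)c0 << 8) | c0;   /* expand to 16 bits by bit replication */
    uint32_t C1 = ((uint32_t)c1 << 8) | c1;
    uint32_t C  = (C0 * (64 - i) + C1 * i + 32) / 64;   /* UNORM16 interpolation */
    return (uint8_t)(C >> 8);                 /* top 8 bits are the final result */
}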
If sRGB conversion is not enabled and the decoding mode is decode_rgb9e5, then the final result is a combination of the (UNORM16) values of C for the three color components (Cr, Cg, and Cb) computed as follows:
int lz = clz17(Cr | Cg | Cb | 1);
if (Cr == 65535) { Cr = 65536; lz = 0; }
if (Cg == 65535) { Cg = 65536; lz = 0; }
if (Cb == 65535) { Cb = 65536; lz = 0; }
Cr <<= lz;
Cg <<= lz;
Cb <<= lz;
Cr = (Cr >> 8) & 0x1FF;
Cg = (Cg >> 8) & 0x1FF;
Cb = (Cb >> 8) & 0x1FF;
uint32_t exponent = 16 - lz;
uint32_t texel = (exponent << 27) | (Cb << 18) | (Cg << 9) | Cr;
The clz17() function counts leading zeros in a 17-bit value.
If sRGB conversion is enabled, then the decoding mode is ignored, and the top 8 bits of the interpolation result for the R, G and B channels are passed to the external sRGB conversion block and used as the final result. The A channel uses the decode_float16 decoding mode.
For HDR endpoint modes, color values are represented in a 12-bit pseudo-logarithmic representation, and interpolation occurs in a piecewise-approximate logarithmic manner as follows:
In LDR mode, the error result is returned.
In HDR mode, the color components from each endpoint, C0 and C1, are initially shifted left 4 bits to become 16-bit integer values and these are interpolated in the same way as LDR. The 16-bit value C is then decomposed into the top five bits, E, and the bottom 11 bits M, which are then processed and recombined with E to form the final value Cf:
C = floor( (C0*(64-i) + C1*i + 32)/64 )
E = (C & 0xF800) >> 11;
M = C & 0x7FF;
if (M < 512)        { Mt = 3*M; }
else if (M >= 1536) { Mt = 5*M - 2048; }
else                { Mt = 4*M - 512; }
Cf = (E << 10) + (Mt >> 3)
This interpolation is a considerably closer approximation to a logarithmic space than simple 16-bit interpolation.
This final value Cf is interpreted as an IEEE FP16 value. If the result is +Inf or NaN, it is converted to the bit pattern 0x7BFF, which is the largest representable finite value.
If the decoding mode is decode_rgb9e5, then the final result is a combination of the (IEEE FP16) values of Cf for the three color components (Cr, Cg, and Cb) computed as follows:
if (Cr > 0x7c00) Cr = 0; else if (Cr == 0x7c00) Cr = 0x7bff;
if (Cg > 0x7c00) Cg = 0; else if (Cg == 0x7c00) Cg = 0x7bff;
if (Cb > 0x7c00) Cb = 0; else if (Cb == 0x7c00) Cb = 0x7bff;
int Re = (Cr >> 10) & 0x1F;
int Ge = (Cg >> 10) & 0x1F;
int Be = (Cb >> 10) & 0x1F;
int Rex = (Re == 0) ? 1 : Re;
int Gex = (Ge == 0) ? 1 : Ge;
int Bex = (Be == 0) ? 1 : Be;
int Xm = ((Cr | Cg | Cb) & 0x200) >> 9;
int Xe = Re | Ge | Be;
uint32_t rshift, gshift, bshift, expo;
if (Xe == 0) {
    expo = rshift = gshift = bshift = Xm;
} else if (Re >= Ge && Re >= Be) {
    expo = Rex + 1;
    rshift = 2;
    gshift = Rex - Gex + 2;
    bshift = Rex - Bex + 2;
} else if (Ge >= Be) {
    expo = Gex + 1;
    rshift = Gex - Rex + 2;
    gshift = 2;
    bshift = Gex - Bex + 2;
} else {
    expo = Bex + 1;
    rshift = Bex - Rex + 2;
    gshift = Bex - Gex + 2;
    bshift = 2;
}
int Rm = (Cr & 0x3FF) | (Re == 0 ? 0 : 0x400);
int Gm = (Cg & 0x3FF) | (Ge == 0 ? 0 : 0x400);
int Bm = (Cb & 0x3FF) | (Be == 0 ? 0 : 0x400);
Rm = (Rm >> rshift) & 0x1FF;
Gm = (Gm >> gshift) & 0x1FF;
Bm = (Bm >> bshift) & 0x1FF;
uint32_t texel = (expo << 27) | (Bm << 18) | (Gm << 9) | (Rm << 0);
Void-Extent Blocks
Note
This modifies section 16.23 in the Khronos Data Format Specification. |
In the HDR case, if the decoding mode is decode_rgb9e5, then any negative color component values are set to 0 before conversion to the shared exponent format (as described in Weight Application).
Appendix D: Core Revisions (Informative)
New minor versions of the Vulkan API are defined periodically by the Khronos Vulkan Working Group. These consist of some amount of additional functionality added to the core API, some of which may be promoted from extensions, other parts of which may be new. Extensions that are promoted in this way typically have their functionality replicated directly in the core, but with extension suffixes dropped. The existing values with suffixes are still present in the API itself as aliases of the original extension functionality. Any differences between the core and extension version of the functionality will be documented in the extension appendix, and mentioned briefly in the version description in this appendix.
Note
For structure and enumeration aliases, the aliased extension type is
semantically identical to the new core type.
The C99 headers simply define the extension type names as typedefs of the corresponding core types. For command aliases, however, there are two separate entry point definitions, because the C99 ABI has no way to alias command definitions without resorting to the preprocessor. Calling either entry point definition will produce identical behavior within the bounds of the Specification, and should still invoke the same entry point in the implementation. Debug tools may use separate entry points with different debug behavior, for instance to write the appropriate command name to an output log.
It’s possible to build the specification for earlier versions, but to aid readability of the latest versions, this appendix gives an overview of the changes as compared to earlier versions.
Version 1.1
Vulkan Version 1.1 promoted a number of key extensions into the core API:
The only changes to the functionality added by these extensions were to
VK_KHR_shader_draw_parameters
, which had a
feature bit added to determine
support in the core API, and
variablePointersStorageBuffer
from VK_KHR_variable_pointers
was
made optional.
Additionally, Vulkan 1.1 added support for subgroup operations, protected memory, and a new command to enumerate the instance version.
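For example, the new instance version query can be used as follows (a minimal sketch; error handling is omitted, and on a Vulkan 1.0 loader the command must instead be obtained via vkGetInstanceProcAddr):

#include <stdio.h>
#include <vulkan/vulkan.h>

int main(void)
{
    uint32_t apiVersion = 0;
    /* vkEnumerateInstanceVersion was added in Vulkan 1.1. */
    vkEnumerateInstanceVersion(&apiVersion);
    printf("Instance supports Vulkan %u.%u.%u\n",
           VK_VERSION_MAJOR(apiVersion),
           VK_VERSION_MINOR(apiVersion),
           VK_VERSION_PATCH(apiVersion));
    return 0;
}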
New Object Types
New Defines
New Enum Constants
-
Extending VkBufferCreateFlagBits:
-
VK_BUFFER_CREATE_PROTECTED_BIT
-
-
Extending VkCommandPoolCreateFlagBits:
-
VK_COMMAND_POOL_CREATE_PROTECTED_BIT
-
-
Extending VkDependencyFlagBits:
-
VK_DEPENDENCY_DEVICE_GROUP_BIT
-
VK_DEPENDENCY_VIEW_LOCAL_BIT
-
-
Extending VkDeviceQueueCreateFlagBits:
-
VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT
-
-
Extending VkFormat:
-
VK_FORMAT_G8B8G8R8_422_UNORM
-
VK_FORMAT_B8G8R8G8_422_UNORM
-
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM
-
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
-
VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM
-
VK_FORMAT_G8_B8R8_2PLANE_422_UNORM
-
VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM
-
VK_FORMAT_R10X6_UNORM_PACK16
-
VK_FORMAT_R10X6G10X6_UNORM_2PACK16
-
VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16
-
VK_FORMAT_G10X6B10X6G10X6R10X6_422_UNORM_4PACK16
-
VK_FORMAT_B10X6G10X6R10X6G10X6_422_UNORM_4PACK16
-
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16
-
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16
-
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16
-
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16
-
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16
-
VK_FORMAT_R12X4_UNORM_PACK16
-
VK_FORMAT_R12X4G12X4_UNORM_2PACK16
-
VK_FORMAT_R12X4G12X4B12X4A12X4_UNORM_4PACK16
-
VK_FORMAT_G12X4B12X4G12X4R12X4_422_UNORM_4PACK16
-
VK_FORMAT_B12X4G12X4R12X4G12X4_422_UNORM_4PACK16
-
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16
-
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16
-
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16
-
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16
-
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16
-
VK_FORMAT_G16B16G16R16_422_UNORM
-
VK_FORMAT_B16G16R16G16_422_UNORM
-
VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM
-
VK_FORMAT_G16_B16R16_2PLANE_420_UNORM
-
VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM
-
VK_FORMAT_G16_B16R16_2PLANE_422_UNORM
-
VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM
-
-
Extending VkFormatFeatureFlagBits:
-
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT
-
VK_FORMAT_FEATURE_TRANSFER_DST_BIT
-
VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_FORCEABLE_BIT
-
VK_FORMAT_FEATURE_DISJOINT_BIT
-
VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT
-
-
Extending VkImageAspectFlagBits:
-
VK_IMAGE_ASPECT_PLANE_0_BIT
-
VK_IMAGE_ASPECT_PLANE_1_BIT
-
VK_IMAGE_ASPECT_PLANE_2_BIT
-
-
Extending VkImageCreateFlagBits:
-
VK_IMAGE_CREATE_ALIAS_BIT
-
VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT
-
VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT
-
VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT
-
VK_IMAGE_CREATE_EXTENDED_USAGE_BIT
-
VK_IMAGE_CREATE_PROTECTED_BIT
-
VK_IMAGE_CREATE_DISJOINT_BIT
-
-
Extending VkImageCreateFlagBits:
-
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL
-
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL
-
-
Extending VkMemoryHeapFlagBits:
-
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT
-
-
Extending VkMemoryPropertyFlagBits:
-
VK_MEMORY_PROPERTY_PROTECTED_BIT
-
-
Extending VkObjectType:
-
VK_OBJECT_TYPE_SAMPLER_YCBCR_CONVERSION
-
VK_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE
-
-
Extending VkPipelineCreateFlagBits:
-
VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT
-
VK_PIPELINE_CREATE_DISPATCH_BASE
-
-
Extending VkQueueFlagBits:
-
VK_QUEUE_PROTECTED_BIT
-
-
Extending VkResult:
-
VK_ERROR_OUT_OF_POOL_MEMORY
-
VK_ERROR_INVALID_EXTERNAL_HANDLE
-
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES
-
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO
-
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES
-
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS
-
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO
-
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_FLAGS_INFO
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_RENDER_PASS_BEGIN_INFO
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_SUBMIT_INFO
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_BIND_SPARSE_INFO
-
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_DEVICE_GROUP_INFO
-
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_DEVICE_GROUP_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO
-
VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2
-
VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2
-
VK_STRUCTURE_TYPE_IMAGE_SPARSE_MEMORY_REQUIREMENTS_INFO_2
-
VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2
-
VK_STRUCTURE_TYPE_SPARSE_IMAGE_MEMORY_REQUIREMENTS_2
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2
-
VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2
-
VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2
-
VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2
-
VK_STRUCTURE_TYPE_SPARSE_IMAGE_FORMAT_PROPERTIES_2
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SPARSE_IMAGE_FORMAT_INFO_2
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_POINT_CLIPPING_PROPERTIES
-
VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO
-
VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO
-
VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_DOMAIN_ORIGIN_STATE_CREATE_INFO
-
VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_FEATURES
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PROPERTIES
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES
-
VK_STRUCTURE_TYPE_PROTECTED_SUBMIT_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_FEATURES
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_PROPERTIES
-
VK_STRUCTURE_TYPE_DEVICE_QUEUE_INFO_2
-
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO
-
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO
-
VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO
-
VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_YCBCR_CONVERSION_FEATURES
-
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_IMAGE_FORMAT_PROPERTIES
-
VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO
-
VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_BUFFER_INFO
-
VK_STRUCTURE_TYPE_EXTERNAL_BUFFER_PROPERTIES
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES
-
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_BUFFER_CREATE_INFO
-
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO
-
VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_FENCE_INFO
-
VK_STRUCTURE_TYPE_EXTERNAL_FENCE_PROPERTIES
-
VK_STRUCTURE_TYPE_EXPORT_FENCE_CREATE_INFO
-
VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_SEMAPHORE_INFO
-
VK_STRUCTURE_TYPE_EXTERNAL_SEMAPHORE_PROPERTIES
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MAINTENANCE_3_PROPERTIES
-
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_DRAW_PARAMETER_FEATURES
-
New Enums
New Structures
New Functions
Appendix E: Layers & Extensions (Informative)
Extensions to the Vulkan API can be defined by authors, groups of authors, and the Khronos Vulkan Working Group. In order not to compromise the readability of the Vulkan Specification, the core Specification does not incorporate most extensions. The online Registry of extensions is available at URL
and allows generating versions of the Specification incorporating different extensions.
Most of the content previously in this appendix does not specify use of specific Vulkan extensions and layers, but rather specifies the processes by which extensions and layers are created. As of version 1.0.21 of the Vulkan Specification, this content has been migrated to the Vulkan Documentation and Extensions document. Authors creating extensions and layers must follow the mandatory procedures in that document.
The remainder of this appendix documents a set of extensions chosen when this document was built. Versions of the Specification published in the Registry include:
-
Core API + mandatory extensions required of all Vulkan implementations.
-
Core API + all registered and published Khronos (
KHR
) extensions. -
Core API + all registered and published extensions.
Extensions are grouped as Khronos KHR
, multivendor EXT
, and then
alphabetically by author ID.
Within each group, extensions are listed in alphabetical order by their
name.
Note
As of the initial Vulkan 1.1 public release, the KHX author ID is no longer in use; the extensions that previously used it have been promoted to core functionality or to KHR extensions. Some vendors may use an alternate author ID ending in X to indicate experimental extensions.
List of Current Extensions
VK_KHR_8bit_storage
- Name String
-
VK_KHR_8bit_storage
- Extension Type
-
Device extension
- Registered Extension Number
-
178
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_storage_buffer_storage_class
-
- Contact
-
-
Alexander Galazin alegal-arm
-
- Last Modified Date
-
2018-02-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires SPV_KHR_8bit_storage
-
- Contributors
-
-
Alexander Galazin, Arm
-
The VK_KHR_8bit_storage
extension allows use of 8-bit types in uniform and
storage buffers, and push constant blocks.
This extension introduces several new optional features which map to SPIR-V
capabilities and allow access to 8-bit data in Block
-decorated objects
in the Uniform
and the StorageBuffer
storage classes, and objects
in the PushConstant
storage class.
The StorageBuffer8BitAccess
capability must be supported by all
implementations of this extension.
The other capabilities are optional.
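A hedged sketch of querying these optional features (the helper name is illustrative; on a Vulkan 1.0 instance, vkGetPhysicalDeviceFeatures2KHR from VK_KHR_get_physical_device_properties2 would be used instead):

#include <vulkan/vulkan.h>

/* Query which of the 8-bit storage features a physical device supports. */
void query8BitStorage(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDevice8BitStorageFeaturesKHR features8 = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES_KHR,
    };
    VkPhysicalDeviceFeatures2 features2 = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2,
        .pNext = &features8,
    };
    vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);

    /* storageBuffer8BitAccess must be supported by all implementations of this
       extension; the other two features are optional. */
    VkBool32 ssbo8 = features8.storageBuffer8BitAccess;
    VkBool32 ubo8  = features8.uniformAndStorageBuffer8BitAccess;
    VkBool32 push8 = features8.storagePushConstant8;
    (void)ssbo8; (void)ubo8; (void)push8;
}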
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES_KHR
-
New Structures
New SPIR-V Capabilities
Issues
Version History
-
Revision 1, 2018-02-05 (Alexander Galazin)
-
Initial draft
-
VK_KHR_android_surface
- Name String
-
VK_KHR_android_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
9
- Revision
-
6
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Jesse Hall critsec
-
- Last Modified Date
-
2016-01-14
- IP Status
-
No known IP claims.
- Contributors
-
-
Patrick Doane, Blizzard
-
Jason Ekstrand, Intel
-
Ian Elliott, LunarG
-
Courtney Goeltzenleuchter, LunarG
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Antoine Labour, Google
-
Jon Leech, Khronos
-
David Mao, AMD
-
Norbert Nopper, Freescale
-
Alon Or-bach, Samsung
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Ray Smith, ARM
-
Jeff Vigil, Qualcomm
-
Chia-I Wu, LunarG
-
The VK_KHR_android_surface
extension is an instance extension.
It provides a mechanism to create a VkSurfaceKHR object (defined by
the VK_KHR_surface
extension) that refers to an ANativeWindow
,
Android’s native surface type.
The ANativeWindow
represents the producer endpoint of any buffer queue,
regardless of consumer endpoint.
Common consumer endpoints for ANativeWindows
are the system window
compositor, video encoders, and application-specific compositors importing
the images through a SurfaceTexture
.
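A hedged sketch of creating a surface from an ANativeWindow (the window handle is assumed to come from the Android NDK, the helper name is illustrative, and error handling is omitted):

#define VK_USE_PLATFORM_ANDROID_KHR
#include <vulkan/vulkan.h>

VkSurfaceKHR createAndroidSurface(VkInstance instance, struct ANativeWindow *window)
{
    VkAndroidSurfaceCreateInfoKHR createInfo = {
        .sType  = VK_STRUCTURE_TYPE_ANDROID_SURFACE_CREATE_INFO_KHR,
        .window = window,               /* producer endpoint of a buffer queue */
    };
    VkSurfaceKHR surface = VK_NULL_HANDLE;
    vkCreateAndroidSurfaceKHR(instance, &createInfo, NULL, &surface);
    return surface;
}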
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_ANDROID_SURFACE_CREATE_INFO_KHR
-
New Enums
None
New Structures
New Functions
Issues
1) Does Android need a way to query for compatibility between a particular physical device (and queue family?) and a specific Android display?
RESOLVED: No. Currently on Android, any physical device is expected to be able to present to the system compositor, and all queue families must support the necessary image layout transitions and synchronization operations.
Version History
-
Revision 1, 2015-09-23 (Jesse Hall)
-
Initial draft.
-
-
Revision 2, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_android_surface to VK_KHR_android_surface.
-
-
Revision 3, 2015-11-03 (Daniel Rakos)
-
Added allocation callbacks to surface creation function.
-
-
Revision 4, 2015-11-10 (Jesse Hall)
-
Removed VK_ERROR_INVALID_ANDROID_WINDOW_KHR.
-
-
Revision 5, 2015-11-28 (Daniel Rakos)
-
Updated the surface create function to take a pCreateInfo structure.
-
-
Revision 6, 2016-01-14 (James Jones)
-
Moved VK_ERROR_NATIVE_WINDOW_IN_USE_KHR from the VK_KHR_android_surface to the VK_KHR_surface extension.
-
VK_KHR_create_renderpass2
- Name String
-
VK_KHR_create_renderpass2
- Extension Type
-
Device extension
- Registered Extension Number
-
110
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_multiview
-
Requires
VK_KHR_maintenance2
-
- Contact
-
-
Tobias Hector tobias
-
- Last Modified Date
-
2018-02-07
- Contributors
-
-
Tobias Hector
-
Jeff Bolz
-
This extension provides a new entry point to create render passes in a way
that can be easily extended by other extensions through the substructures of
render pass creation.
The Vulkan 1.0 render pass creation sub-structures do not include
sType
/pNext
members.
Additionally, the renderpass begin/next/end commands have been augmented
with new extensible structures for passing additional subpass information.
The VkRenderPassMultiviewCreateInfo and VkInputAttachmentAspectReference structures that extended the original VkRenderPassCreateInfo are not accepted into the new creation functions, and instead their parameters are folded into this extension as follows:
-
Elements of VkRenderPassMultiviewCreateInfo::
pViewMasks
are now specified in VkSubpassDescription2KHR::viewMask
. -
Elements of VkRenderPassMultiviewCreateInfo::
pViewOffsets
are now specified in VkSubpassDependency2KHR::viewOffset
. -
VkRenderPassMultiviewCreateInfo::
correlationMaskCount
and VkRenderPassMultiviewCreateInfo::pCorrelationMasks
are directly specified in VkRenderPassCreateInfo2KHR. -
VkInputAttachmentAspectReference::
aspectMask
is now specified in the relevant input attachment reference in VkAttachmentReference2KHR::aspectMask
The details of these mappings are explained fully in the new structures.
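A minimal sketch of the new creation path, assuming a single color attachment and no multiview; the helper name and attachment settings are illustrative, the command is loaded through vkGetDeviceProcAddr, and error handling is omitted:

#include <vulkan/vulkan.h>

/* Create a one-subpass render pass through the extensible *2KHR structures. */
VkRenderPass createSimpleRenderPass2(VkDevice device, VkFormat colorFormat,
                                     PFN_vkCreateRenderPass2KHR pfnCreateRenderPass2)
{
    VkAttachmentDescription2KHR attachment = {
        .sType          = VK_STRUCTURE_TYPE_ATTACHMENT_DESCRIPTION_2_KHR,
        .format         = colorFormat,
        .samples        = VK_SAMPLE_COUNT_1_BIT,
        .loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR,
        .storeOp        = VK_ATTACHMENT_STORE_OP_STORE,
        .stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
        .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
        .initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED,
        .finalLayout    = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
    };
    VkAttachmentReference2KHR colorRef = {
        .sType      = VK_STRUCTURE_TYPE_ATTACHMENT_REFERENCE_2_KHR,
        .attachment = 0,
        .layout     = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
        .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,  /* replaces
                         VkInputAttachmentAspectReference; only consumed for
                         input attachments */
    };
    VkSubpassDescription2KHR subpass = {
        .sType                = VK_STRUCTURE_TYPE_SUBPASS_DESCRIPTION_2_KHR,
        .pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS,
        .viewMask             = 0,                /* replaces
                         VkRenderPassMultiviewCreateInfo::pViewMasks */
        .colorAttachmentCount = 1,
        .pColorAttachments    = &colorRef,
    };
    VkRenderPassCreateInfo2KHR createInfo = {
        .sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO_2_KHR,
        .attachmentCount = 1,
        .pAttachments    = &attachment,
        .subpassCount    = 1,
        .pSubpasses      = &subpass,
    };
    VkRenderPass renderPass = VK_NULL_HANDLE;
    pfnCreateRenderPass2(device, &createInfo, NULL, &renderPass);
    return renderPass;
}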
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_ATTACHMENT_DESCRIPTION_2_KHR
-
VK_STRUCTURE_TYPE_ATTACHMENT_REFERENCE_2_KHR
-
VK_STRUCTURE_TYPE_SUBPASS_DESCRIPTION_2_KHR
-
VK_STRUCTURE_TYPE_SUBPASS_DEPENDENCY_2_KHR
-
VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO_2_KHR
-
VK_STRUCTURE_TYPE_SUBPASS_BEGIN_INFO_KHR
-
VK_STRUCTURE_TYPE_SUBPASS_END_INFO_KHR
-
New Structures
New Functions
Version History
-
Revision 1, 2018-02-07 (Tobias Hector)
-
Internal revisions
-
VK_KHR_display
- Name String
-
VK_KHR_display
- Extension Type
-
Instance extension
- Registered Extension Number
-
3
- Revision
-
21
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
- Last Modified Date
-
2017-03-13
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Norbert Nopper, Freescale
-
Jeff Vigil, Qualcomm
-
Daniel Rakos, AMD
-
This extension provides the API to enumerate displays and available modes on a given device.
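A hedged sketch of the enumeration pattern (the helper name is illustrative, error handling and allocation checks are omitted, and a real application would also query planes and select a mode):

#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Enumerate the displays attached to a physical device and their modes. */
void listDisplays(VkPhysicalDevice physicalDevice)
{
    uint32_t displayCount = 0;
    vkGetPhysicalDeviceDisplayPropertiesKHR(physicalDevice, &displayCount, NULL);

    VkDisplayPropertiesKHR *displays = calloc(displayCount, sizeof(*displays));
    vkGetPhysicalDeviceDisplayPropertiesKHR(physicalDevice, &displayCount, displays);

    for (uint32_t i = 0; i < displayCount; ++i) {
        uint32_t modeCount = 0;
        vkGetDisplayModePropertiesKHR(physicalDevice, displays[i].display,
                                      &modeCount, NULL);
        /* ...query the VkDisplayModePropertiesKHR array and pick a mode here... */
    }
    free(displays);
}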
New Object Types
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DISPLAY_MODE_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_DISPLAY_SURFACE_CREATE_INFO_KHR
-
New Enums
New Structures
New Functions
Issues
1) Which properties of a mode should be fixed in the mode info vs. settable in some other function when setting the mode? E.g., do we need to double the size of the mode pool to include both stereo and non-stereo modes? YUV and RGB scanout even if they both take RGB input images? BGR vs. RGB input? etc.
PROPOSED RESOLUTION: Many modern displays support at most a handful of resolutions and timings natively. Other “modes” are expected to be supported using scaling hardware on the display engine or GPU. Other properties, such as rotation and mirroring should not require duplicating hardware modes just to express all combinations. Further, these properties may be implemented on a per-display or per-overlay granularity.
To avoid the exponential growth of modes as mutable properties are added, as
was the case with EGLConfig
/WGL pixel formats/GLXFBConfig
, this
specification should separate out hardware properties and configurable state
into separate objects.
Modes and overlay planes will express capabilities of the hardware, while a
separate structure will allow applications to configure scaling, rotation,
mirroring, color keys, LUT values, alpha masks, etc.
for a given swapchain independent of the mode in use.
Constraints on these settings will be established by properties of the
immutable objects.
Note the resolution of this issue may affect issue 5 as well.
2) What properties of a display itself are useful?
PROPOSED RESOLUTION: This issue is too broad. It was meant to prompt general discussion, but resolving this issue amounts to completing this specification. All interesting properties should be included. The issue will remain as a placeholder since removing it would make it hard to parse existing discussion notes that refer to issues by number.
3) How are multiple overlay planes within a display or mode enumerated?
PROPOSED RESOLUTION: They are referred to by an index. Each display will report the number of overlay planes it contains.
4) Should swapchains be created relative to a mode or a display?
PROPOSED RESOLUTION: When using this extension, swapchains are created relative to a mode and a plane. The mode implies the display object the swapchain will present to. If the specified mode is not the display’s current mode, the new mode will be applied when the first image is presented to the swapchain, and the default operating system mode, if any, will be restored when the swapchain is destroyed.
5) Should users query generic ranges from displays and construct their own modes explicitly using those constraints rather than querying a fixed set of modes (Most monitors only have one real “mode” these days, even though many support relatively arbitrary scaling, either on the monitor side or in the GPU display engine, making “modes” something of a relic/compatibility construct).
PROPOSED RESOLUTION: Expose both. Display info structures will expose a set of predefined modes, as well as any attributes necessary to construct a customized mode.
6) Is it fine if we return the display and display mode handles in the structure used to query their properties?
PROPOSED RESOLUTION: Yes.
7) Is there a possibility that not all displays of a device work with all of the present queues of a device? If yes, how do we determine which displays work with which present queues?
PROPOSED RESOLUTION: No known hardware has such limitations, but
determining such limitations is supported automatically using the existing
VK_KHR_surface
and VK_KHR_swapchain
query mechanisms.
8) Should all presentation need to be done relative to an overlay plane, or can a display mode + display be used alone to target an output?
PROPOSED RESOLUTION: Require specifying a plane explicitly.
9) Should displays have an associated window system display, such as an
HDC
or Display
*?
PROPOSED RESOLUTION: No.
Displays are independent of any windowing system in use on the system.
Further, neither HDC
nor Display
* refer to a physical display
object.
10) Are displays queried from a physical GPU or from a device instance?
PROPOSED RESOLUTION: Developers prefer to query modes directly from the physical GPU so they can use display information as an input to their device selection algorithms prior to device creation. This avoids the need to create dummy device instances to enumerate displays.
This preference must be weighed against the extra initialization that must be done by driver vendors prior to device instance creation to support this usage.
11) Should displays and/or modes be dispatchable objects? If functions are to take displays, overlays, or modes as their first parameter, they must be dispatchable objects as defined in Khronos bug 13529. If they are not added to the list of dispatchable objects, functions operating on them must take some higher-level object as their first parameter. There is no performance case against making them dispatchable objects, but they would be the first extension objects to be dispatchable.
PROPOSED RESOLUTION: Do not make displays or modes dispatchable. They will dispatch based on their associated physical device.
12) Should hardware cursor capabilities be exposed?
PROPOSED RESOLUTION: Defer. This could be a separate extension on top of the base WSI specs.
editing-note
There appears to be a missing sentence for the first part of issue 13 here. |
if they are one physical display device to an end user, but may internally be implemented as two side-by-side displays using the same display engine (and sometimes cabling) resources as two physically separate display devices.
RESOLVED: Tiled displays will appear as a single display object in this API.
14) Should the raw EDID data be included in the display information?
RESOLVED: No. A future extension could be added which reports the EDID if necessary. This may be complicated by the outcome of issue 13.
15) Should min and max scaling factor capabilities of overlays be exposed?
RESOLVED: Yes. This is exposed indirectly by allowing applications to query the min/max position and extent of the source and destination regions from which image contents are fetched by the display engine when using a particular mode and overlay pair.
16) Should devices be able to expose planes that can be moved between displays? If so, how?
RESOLVED: Yes. Applications can determine which displays a given plane supports using vkGetDisplayPlaneSupportedDisplaysKHR.
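As an informal illustration of this resolution, the following sketch shows the usual two-call enumeration pattern for vkGetDisplayPlaneSupportedDisplaysKHR; it assumes a valid physicalDevice handle, uses a fixed-size array purely for brevity, and omits error handling.
uint32_t displayCount = 0;
vkGetDisplayPlaneSupportedDisplaysKHR(physicalDevice, 0 /* planeIndex */, &displayCount, NULL);

VkDisplayKHR supportedDisplays[16];                 // arbitrary upper bound for this sketch
if (displayCount > 16)
    displayCount = 16;
vkGetDisplayPlaneSupportedDisplaysKHR(physicalDevice, 0 /* planeIndex */, &displayCount, supportedDisplays);
// supportedDisplays[0..displayCount-1] now lists every display that plane 0 can be placed on.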
17) Should there be a way to destroy display modes? If so, does it support destroying “built in” modes?
RESOLVED: Not in this extension. A future extension could add this functionality.
18) What should the lifetime of display and built-in display mode objects be?
RESOLVED: The lifetime of the instance. These objects cannot be destroyed. A future extension may be added to expose a way to destroy these objects and/or support display hotplug.
19) Should persistent mode for smart panels be enabled/disabled at swapchain creation time, or on a per-present basis?
RESOLVED: On a per-present basis.
Examples
Note
The example code for the VK_KHR_display extension has been removed from this appendix; it has been integrated into the official Vulkan SDK cube demo.
Version History
-
Revision 1, 2015-02-24 (James Jones)
-
Initial draft
-
-
Revision 2, 2015-03-12 (Norbert Nopper)
-
Added overlay enumeration for a display.
-
-
Revision 3, 2015-03-17 (Norbert Nopper)
-
Fixed typos and namings as discussed in Bugzilla.
-
Reordered and grouped functions.
-
Added functions to query count of display, mode and overlay.
-
Added native display handle, which may be needed on some platforms to create a native Window.
-
-
Revision 4, 2015-03-18 (Norbert Nopper)
-
Removed primary and virtualPosition members (see comment of James Jones in Bugzilla).
-
Added native overlay handle to info structure.
-
Replaced , with ; in struct.
-
-
Revision 6, 2015-03-18 (Daniel Rakos)
-
Added WSI extension suffix to all items.
-
Made the whole API more "Vulkanish".
-
Replaced all functions with a single vkGetDisplayInfoKHR function to better match the rest of the API.
-
Made the display, display mode, and overlay objects be first class objects, not subclasses of VkBaseObject as they do not support the common functions anyways.
-
Renamed *Info structures to *Properties.
-
Removed overlayIndex field from VkOverlayProperties as there is an implicit index already as a result of moving to a "Vulkanish" API.
-
Displays are now queried through the physical GPU rather than the device, to match the rest of the Vulkan API. This is also something ISVs explicitly requested.
-
Added issue (6) and (7).
-
-
Revision 7, 2015-03-25 (James Jones)
-
Added an issues section
-
Added rotation and mirroring flags
-
-
Revision 8, 2015-03-25 (James Jones)
-
Combined the duplicate issues sections introduced in last change.
-
Added proposed resolutions to several issues.
-
-
Revision 9, 2015-04-01 (Daniel Rakos)
-
Rebased extension against Vulkan 0.82.0
-
-
Revision 10, 2015-04-01 (James Jones)
-
Added issues (10) and (11).
-
Added more straw-man issue resolutions, and cleaned up the proposed resolution for issue (4).
-
Updated the rotation and mirroring enums to have proper bitmask semantics.
-
-
Revision 11, 2015-04-15 (James Jones)
-
Added proposed resolution for issues (1) and (2).
-
Added issues (12), (13), (14), and (15)
-
Removed pNativeHandle field from overlay structure.
-
Fixed small compilation errors in example code.
-
-
Revision 12, 2015-07-29 (James Jones)
-
Rewrote the guts of the extension against the latest WSI swapchain specifications and the latest Vulkan API.
-
Address overlay planes by their index rather than an object handle and refer to them as "planes" rather than "overlays" to make it slightly clearer that even a display with no "overlays" still has at least one base "plane" that images can be displayed on.
-
Updated most of the issues.
-
Added an "extension type" section to the specification header.
-
Re-used the VK_EXT_KHR_surface surface transform enumerations rather than redefining them here.
-
Updated the example code to use the new semantics.
-
-
Revision 13, 2015-08-21 (Ian Elliott)
-
Renamed this extension and all of its enumerations, types, functions, etc. This makes it compliant with the proposed standard for Vulkan extensions.
-
Switched from "revision" to "version", including use of the VK_MAKE_VERSION macro in the header file.
-
-
Revision 14, 2015-09-01 (James Jones)
-
Restore single-field revision number.
-
-
Revision 15, 2015-09-08 (James Jones)
-
Added alpha flags enum.
-
Added premultiplied alpha support.
-
-
Revision 16, 2015-09-08 (James Jones)
-
Added description section to the spec.
-
Added issues 16 - 18.
-
-
Revision 17, 2015-10-02 (James Jones)
-
Planes are now a property of the entire device rather than individual displays. This allows planes to be moved between multiple displays on devices that support it.
-
Added a function to create a VkSurfaceKHR object describing a display plane and mode to align with the new per-platform surface creation conventions.
-
Removed detailed mode timing data. It was agreed that the mode extents and refresh rate are sufficient for current use cases. Other information could be added back in as an extension if it is needed in the future.
-
Added support for smart/persistent/buffered display devices.
-
-
Revision 18, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_display to VK_KHR_display.
-
-
Revision 19, 2015-11-02 (James Jones)
-
Updated example code to match revision 17 changes.
-
-
Revision 20, 2015-11-03 (Daniel Rakos)
-
Added allocation callbacks to creation functions.
-
-
Revision 21, 2015-11-10 (Jesse Hall)
-
Added VK_DISPLAY_PLANE_ALPHA_OPAQUE_BIT_KHR, and use VkDisplayPlaneAlphaFlagBitsKHR for VkDisplayPlanePropertiesKHR::alphaMode instead of VkDisplayPlaneAlphaFlagsKHR, since it only represents one mode.
-
Added reserved flags bitmask to VkDisplayPlanePropertiesKHR.
-
Use VkSurfaceTransformFlagBitsKHR instead of obsolete VkSurfaceTransformKHR.
-
Renamed vkGetDisplayPlaneSupportedDisplaysKHR parameters for clarity.
-
-
Revision 22, 2015-12-18 (James Jones)
-
Added missing "planeIndex" parameter to vkGetDisplayPlaneSupportedDisplaysKHR()
-
-
Revision 23, 2017-03-13 (James Jones)
-
Closed all remaining issues. The specification and implementations have been shipping with the proposed resolutions for some time now.
-
Removed the sample code and noted it has been integrated into the official Vulkan SDK cube demo.
-
VK_KHR_display_swapchain
- Name String
-
VK_KHR_display_swapchain
- Extension Type
-
Device extension
- Registered Extension Number
-
4
- Revision
-
9
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_swapchain
-
Requires
VK_KHR_display
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2017-03-13
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Jeff Vigil, Qualcomm
-
Jesse Hall, Google
-
This extension provides an API to create a swapchain directly on a device’s display without any underlying window system.
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DISPLAY_PRESENT_INFO_KHR
-
-
Extending VkResult:
-
VK_ERROR_INCOMPATIBLE_DISPLAY_KHR
-
New Enums
None
New Structures
New Functions
Issues
1) Should swapchains sharing images each hold a reference to the images, or should it be up to the application to destroy the swapchains and images in an order that avoids the need for reference counting?
RESOLVED: Take a reference. The lifetime of presentable images is already complex enough.
2) Should the srcRect/dstRect parameters be specified as part of the present command, or at swapchain creation time?
RESOLVED: As part of the presentation command. This allows moving and scaling the image on the screen without the need to respecify the mode or create a new swapchain and presentable images.
3) Should srcRect/dstRect be specified as rects, or as separate offset/extent values?
RESOLVED: As rects. Specifying them separately might make it easier for hardware to expose support for one but not the other, but in such cases applications must just take care to obey the reported capabilities and not use non-zero offsets or extents that require scaling, as appropriate.
4) How can applications create multiple swapchains that use the same images?
RESOLVED: By calling vkCreateSharedSwapchainsKHR.
An earlier resolution used vkCreateSwapchainKHR, chaining multiple VkSwapchainCreateInfoKHR structures through pNext. In order to allow each swapchain to also allow other extension structs, a level of indirection was used: VkSwapchainCreateInfoKHR::pNext pointed to a different structure, which had both an sType/pNext for additional extensions, and also had a pointer to the next VkSwapchainCreateInfoKHR structure. The number of swapchains to be created could only be found by walking this linked list of alternating structures, and the pSwapchains out parameter was reinterpreted to be an array of VkSwapchainKHR handles.
Another option considered was a method to specify a “shared” swapchain when creating a new swapchain, such that groups of swapchains using the same images could be built up one at a time. This was deemed unusable because drivers need to know all of the displays an image will be used on when determining which internal formats and layouts to use for that image.
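As an informal illustration of the resolution above, the following sketch creates two swapchains that share the same presentable images with a single vkCreateSharedSwapchainsKHR call; the device handle and the contents of the two VkSwapchainCreateInfoKHR structures are assumed to have been set up elsewhere, and error handling is omitted.
VkSwapchainCreateInfoKHR createInfos[2];
// Both structures are assumed to be fully populated with identical image
// parameters but different display surfaces.

VkSwapchainKHR swapchains[2];
VkResult result = vkCreateSharedSwapchainsKHR(
    device,
    2,              // swapchainCount
    createInfos,
    NULL,           // pAllocator
    swapchains);
// On success, swapchains[0] and swapchains[1] reference the same set of presentable images.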
Examples
Note
The example code for the VK_KHR_display_swapchain extension has been removed from this appendix; it has been integrated into the official Vulkan SDK cube demo.
Version History
-
Revision 1, 2015-07-29 (James Jones)
-
Initial draft
-
-
Revision 2, 2015-08-21 (Ian Elliott)
-
Renamed this extension and all of its enumerations, types, functions, etc. This makes it compliant with the proposed standard for Vulkan extensions.
-
Switched from "revision" to "version", including use of the VK_MAKE_VERSION macro in the header file.
-
-
Revision 3, 2015-09-01 (James Jones)
-
Restore single-field revision number.
-
-
Revision 4, 2015-09-08 (James Jones)
-
Allow creating multiple swap chains that share the same images using a single call to vkCreateSwapChainKHR().
-
-
Revision 5, 2015-09-10 (Alon Or-bach)
-
Removed underscores from SWAP_CHAIN in two enums.
-
-
Revision 6, 2015-10-02 (James Jones)
-
Added support for smart panels/buffered displays.
-
-
Revision 7, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_display_swapchain to VK_KHR_display_swapchain.
-
-
Revision 8, 2015-11-03 (Daniel Rakos)
-
Updated sample code based on the changes to VK_KHR_swapchain.
-
-
Revision 9, 2015-11-10 (Jesse Hall)
-
Replaced VkDisplaySwapchainCreateInfoKHR with vkCreateSharedSwapchainsKHR, changing resolution of issue #4.
-
-
Revision 10, 2017-03-13 (James Jones)
-
Closed all remaining issues. The specification and implementations have been shipping with the proposed resolutions for some time now.
-
Removed the sample code and noted it has been integrated into the official Vulkan SDK cube demo.
-
VK_KHR_draw_indirect_count
- Name String
-
VK_KHR_draw_indirect_count
- Extension Type
-
Device extension
- Registered Extension Number
-
170
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Status
-
Draft
- Last Modified Date
-
2017-08-25
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Derrick Owens, AMD
-
Graham Sellers, AMD
-
Daniel Rakos, AMD
-
Dominik Witczak, AMD
-
Piers Daniell, NVIDIA
-
This extension is based on the VK_AMD_draw_indirect_count extension. This extension allows an application to source the number of draw calls for indirect draw calls from a buffer. This enables applications to generate an arbitrary number of draw commands and execute them without host intervention.
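As an informal illustration, the following sketch records a count-driven indirect draw; commandBuffer, drawBuffer, countBuffer, and MAX_DRAW_COUNT are placeholder names, drawBuffer is assumed to hold an array of VkDrawIndirectCommand records, and countBuffer is assumed to hold a device-written uint32_t draw count.
vkCmdDrawIndirectCountKHR(
    commandBuffer,
    drawBuffer,                        // buffer of VkDrawIndirectCommand records
    0,                                 // offset into drawBuffer
    countBuffer,                       // buffer containing the actual draw count
    0,                                 // countBufferOffset
    MAX_DRAW_COUNT,                    // upper bound on the count read from countBuffer
    sizeof(VkDrawIndirectCommand));    // stride between draw records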
New Functions
Version History
-
Revision 1, 2017-08-25 (Piers Daniell)
-
Initial draft based off VK_AMD_draw_indirect_count
-
VK_KHR_driver_properties
- Name String
-
VK_KHR_driver_properties
- Extension Type
-
Device extension
- Registered Extension Number
-
197
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2018-04-11
- IP Status
-
No known IP claims.
- Contributors
-
-
Baldur Karlsson
-
Matthaeus G. Chajdas, AMD
-
Piers Daniell, NVIDIA
-
Alexander Galazin, Arm
-
Jesse Hall, Google
-
Daniel Rakos, AMD
-
This extension provides a new physical device query which allows retrieving information about the driver implementation, allowing applications to determine which physical device corresponds to which particular vendor’s driver, and which conformance test suite version the driver implementation is compliant with.
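As an informal illustration, the following sketch retrieves the driver identification by chaining the new structure into a physical device properties query; it assumes Vulkan 1.1's vkGetPhysicalDeviceProperties2 (or the equivalent entry point from VK_KHR_get_physical_device_properties2) is available and that physicalDevice is a valid handle.
VkPhysicalDeviceDriverPropertiesKHR driverProps = { 0 };
driverProps.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES_KHR;

VkPhysicalDeviceProperties2 props2 = { 0 };
props2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
props2.pNext = &driverProps;

vkGetPhysicalDeviceProperties2(physicalDevice, &props2);
// driverProps.driverID, driverProps.driverName, driverProps.driverInfo, and
// driverProps.conformanceVersion now describe the installed driver.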
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES_KHR
-
-
VK_MAX_DRIVER_NAME_SIZE_KHR
-
VK_MAX_DRIVER_INFO_SIZE_KHR
New Enums
None.
New Structures
New Functions
None.
Issues
None.
Examples
None.
Version History
-
Revision 1, 2018-04-11 (Daniel Rakos)
-
Internal revisions
-
VK_KHR_external_fence_fd
- Name String
-
VK_KHR_external_fence_fd
- Extension Type
-
Device extension
- Registered Extension Number
-
116
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_fence
-
- Contact
-
-
Jesse Hall critsec
-
- Last Modified Date
-
2017-05-08
- IP Status
-
No known IP claims.
- Contributors
-
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Cass Everitt, Oculus
-
Contributors to
VK_KHR_external_semaphore_fd
-
An application using external memory may wish to synchronize access to that memory using fences. This extension enables an application to export fence payload to and import fence payload from POSIX file descriptors.
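As an informal illustration, the following sketch exports a fence payload to a file descriptor and imports it into a second fence; fence and importedFence are placeholder handles, the exporting fence is assumed to have been created with a VkExportFenceCreateInfoKHR requesting the opaque-FD handle type, and error handling is omitted.
VkFenceGetFdInfoKHR getFdInfo = { 0 };
getFdInfo.sType = VK_STRUCTURE_TYPE_FENCE_GET_FD_INFO_KHR;
getFdInfo.fence = fence;
getFdInfo.handleType = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;

int fd = -1;
vkGetFenceFdKHR(device, &getFdInfo, &fd);          // ownership of fd passes to the application

VkImportFenceFdInfoKHR importInfo = { 0 };
importInfo.sType = VK_STRUCTURE_TYPE_IMPORT_FENCE_FD_INFO_KHR;
importInfo.fence = importedFence;
importInfo.handleType = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
importInfo.fd = fd;                                // ownership returns to the driver on success

vkImportFenceFdKHR(device, &importInfo);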
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_IMPORT_FENCE_FD_INFO_KHR
-
VK_STRUCTURE_TYPE_FENCE_GET_FD_INFO_KHR
New Enums
None.
New Structs
New Functions
Issues
This extension borrows concepts, semantics, and language from VK_KHR_external_semaphore_fd. That extension’s issues apply equally to this extension.
VK_KHR_external_fence_win32
- Name String
-
VK_KHR_external_fence_win32
- Extension Type
-
Device extension
- Registered Extension Number
-
115
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_fence
-
- Contact
-
-
Jesse Hall critsec
-
- Last Modified Date
-
2017-05-08
- IP Status
-
No known IP claims.
- Contributors
-
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Cass Everitt, Oculus
-
Contributors to
VK_KHR_external_semaphore_win32
-
An application using external memory may wish to synchronize access to that memory using fences. This extension enables an application to export fence payload to and import fence payload from Windows handles.
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_IMPORT_FENCE_WIN32_HANDLE_INFO_KHR
-
VK_STRUCTURE_TYPE_EXPORT_FENCE_WIN32_HANDLE_INFO_KHR
-
VK_STRUCTURE_TYPE_FENCE_GET_WIN32_HANDLE_INFO_KHR
New Enums
None.
New Structs
New Functions
Issues
This extension borrows concepts, semantics, and language from VK_KHR_external_semaphore_win32. That extension’s issues apply equally to this extension.
1) Should D3D12 fence handle types be supported, like they are for semaphores?
RESOLVED: No. Doing so would require extending the fence signal and wait operations to provide values to signal / wait for, like VkD3D12FenceSubmitInfoKHR does. A D3D12 fence can be signaled by importing it into a VkSemaphore instead of a VkFence, and applications can check status or wait on the D3D12 fence using non-Vulkan APIs. The convenience of being able to do these operations on VkFence objects doesn’t justify the extra API complexity.
VK_KHR_external_memory_fd
- Name String
-
VK_KHR_external_memory_fd
- Extension Type
-
Device extension
- Registered Extension Number
-
75
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_memory
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-21
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
An application may wish to reference device memory in multiple Vulkan logical devices or instances, in multiple processes, and/or in multiple APIs. This extension enables an application to export POSIX file descriptor handles from Vulkan memory objects and to import Vulkan memory objects from POSIX file descriptor handles exported from other Vulkan memory objects or from similar resources in other APIs.
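As an informal illustration, the following sketch allocates exportable memory and obtains a POSIX file descriptor for it; size and memoryTypeIndex are placeholder values determined elsewhere, and error handling is omitted.
// At allocation time, mark the memory as exportable.
VkExportMemoryAllocateInfoKHR exportInfo = { 0 };
exportInfo.sType = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR;
exportInfo.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;

VkMemoryAllocateInfo allocInfo = { 0 };
allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
allocInfo.pNext = &exportInfo;
allocInfo.allocationSize = size;
allocInfo.memoryTypeIndex = memoryTypeIndex;

VkDeviceMemory memory;
vkAllocateMemory(device, &allocInfo, NULL, &memory);

// Export a POSIX file descriptor referring to the allocation.
VkMemoryGetFdInfoKHR getFdInfo = { 0 };
getFdInfo.sType = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR;
getFdInfo.memory = memory;
getFdInfo.handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;

int fd = -1;
vkGetMemoryFdKHR(device, &getFdInfo, &fd);
// Per issue 1 below, the application owns fd until it is imported again or closed.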
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR
-
VK_STRUCTURE_TYPE_MEMORY_FD_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR
New Enums
None.
New Functions
Issues
1) Does the application need to close the file descriptor returned by vkGetMemoryFdKHR?
RESOLVED: Yes, unless it is passed back in to a driver instance to import the memory. A successful get call transfers ownership of the file descriptor to the application, and a successful import transfers it back to the driver. Destroying the original memory object will not close the file descriptor or remove its reference to the underlying memory resource associated with it.
2) Do drivers ever need to expose multiple file descriptors per memory object?
RESOLVED: No. This would indicate there are actually multiple memory objects, rather than a single memory object.
3) How should the valid size and memory type for POSIX file descriptor memory handles created outside of Vulkan be specified?
RESOLVED: The valid memory types are queried directly from the external handle. The size will be specified by future extensions that introduce such external memory handle types.
VK_KHR_external_memory_win32
- Name String
-
VK_KHR_external_memory_win32
- Extension Type
-
Device extension
- Registered Extension Number
-
74
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_memory
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-21
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Carsten Rohde, NVIDIA
-
An application may wish to reference device memory in multiple Vulkan logical devices or instances, in multiple processes, and/or in multiple APIs. This extension enables an application to export Windows handles from Vulkan memory objects and to import Vulkan memory objects from Windows handles exported from other Vulkan memory objects or from similar resources in other APIs.
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_KHR
-
VK_STRUCTURE_TYPE_EXPORT_MEMORY_WIN32_HANDLE_INFO_KHR
-
VK_STRUCTURE_TYPE_MEMORY_WIN32_HANDLE_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_MEMORY_GET_WIN32_HANDLE_INFO_KHR
New Enums
None.
New Structs
New Functions
Issues
1) Do applications need to call CloseHandle() on the values returned from vkGetMemoryWin32HandleKHR when handleType is VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_KHR?
editing-note
(Jon) This issue refers to a token from …
RESOLVED: Yes, unless it is passed back in to another driver instance to import the object. A successful get call transfers ownership of the handle to the application. Destroying the memory object will not destroy the handle or the handle’s reference to the underlying memory resource.
2) Should the language regarding KMT/Windows 7 handles be moved to a separate extension so that it can be deprecated over time?
RESOLVED: No. Support for them can be deprecated by drivers if they choose, by no longer returning them in the supported handle types of the instance level queries.
3) How should the valid size and memory type for windows memory handles created outside of Vulkan be specified?
RESOLVED: The valid memory types are queried directly from the external handle. The size is determined by the associated image or buffer memory requirements for external handle types that require dedicated allocations, and by the size specified when creating the object from which the handle was exported for other external handle types.
VK_KHR_external_semaphore_fd
- Name String
-
VK_KHR_external_semaphore_fd
- Extension Type
-
Device extension
- Registered Extension Number
-
80
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_semaphore
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-21
- IP Status
-
No known IP claims.
- Contributors
-
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Carsten Rohde, NVIDIA
-
An application using external memory may wish to synchronize access to that memory using semaphores. This extension enables an application to export semaphore payload to and import semaphore payload from POSIX file descriptors.
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_IMPORT_SEMAPHORE_FD_INFO_KHR
-
VK_STRUCTURE_TYPE_SEMAPHORE_GET_FD_INFO_KHR
New Enums
None.
New Structs
New Functions
Issues
1) Does the application need to close the file descriptor returned by vkGetSemaphoreFdKHR?
RESOLVED: Yes, unless it is passed back in to a driver instance to import the semaphore. A successful get call transfers ownership of the file descriptor to the application, and a successful import transfers it back to the driver. Destroying the original semaphore object will not close the file descriptor or remove its reference to the underlying semaphore resource associated with it.
VK_KHR_external_semaphore_win32
- Name String
-
VK_KHR_external_semaphore_win32
- Extension Type
-
Device extension
- Registered Extension Number
-
79
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_semaphore
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-21
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Carsten Rohde, NVIDIA
-
An application using external memory may wish to synchronize access to that memory using semaphores. This extension enables an application to export semaphore payload to and import semaphore payload from Windows handles.
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_IMPORT_SEMAPHORE_WIN32_HANDLE_INFO_KHR
-
VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_WIN32_HANDLE_INFO_KHR
-
VK_STRUCTURE_TYPE_D3D12_FENCE_SUBMIT_INFO_KHR
-
VK_STRUCTURE_TYPE_SEMAPHORE_GET_WIN32_HANDLE_INFO_KHR
New Enums
None.
New Structs
New Functions
Issues
1) Do applications need to call CloseHandle() on the values returned from vkGetSemaphoreWin32HandleKHR when handleType is VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_BIT_KHR?
RESOLVED: Yes, unless it is passed back in to another driver instance to import the object. A successful get call transfers ownership of the handle to the application. Destroying the semaphore object will not destroy the handle or the handle’s reference to the underlying semaphore resource.
2) Should the language regarding KMT/Windows 7 handles be moved to a separate extension so that it can be deprecated over time?
RESOLVED: No. Support for them can be deprecated by drivers if they choose, by no longer returning them in the supported handle types of the instance level queries.
3) Should applications be allowed to specify additional object attributes for shared handles?
RESOLVED: Yes. Applications will be allowed to provide similar attributes to those they would to any other handle creation API.
4) How do applications communicate the desired fence values to use with D3D12_FENCE-based Vulkan semaphores?
RESOLVED: There are a couple of options. The values for the signaled and reset states could be communicated up front when creating the object and remain static for the life of the Vulkan semaphore, or they could be specified using auxiliary structures when submitting semaphore signal and wait operations, similar to what is done with the keyed mutex extensions. The latter is more flexible and consistent with the keyed mutex usage, but the former is a much simpler API.
Since Vulkan tends to favor flexibility and consistency over simplicity, a new structure specifying D3D12 fence acquire and release values is added to the vkQueueSubmit function.
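As an informal illustration of the chosen approach, the following sketch supplies per-submit D3D12 fence values through the auxiliary structure; queue, commandBuffer, and d3d12FenceSemaphore (a semaphore imported from a D3D12 fence) are placeholder handles, and the wait/signal values are arbitrary.
uint64_t waitValue   = 1;   // D3D12 fence value the queue should wait for
uint64_t signalValue = 2;   // D3D12 fence value the queue should set on completion
VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;

VkD3D12FenceSubmitInfoKHR fenceValues = { 0 };
fenceValues.sType = VK_STRUCTURE_TYPE_D3D12_FENCE_SUBMIT_INFO_KHR;
fenceValues.waitSemaphoreValuesCount   = 1;
fenceValues.pWaitSemaphoreValues       = &waitValue;
fenceValues.signalSemaphoreValuesCount = 1;
fenceValues.pSignalSemaphoreValues     = &signalValue;

VkSubmitInfo submitInfo = { 0 };
submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.pNext = &fenceValues;               // per-submit values for the D3D12 fence
submitInfo.waitSemaphoreCount = 1;
submitInfo.pWaitSemaphores = &d3d12FenceSemaphore;
submitInfo.pWaitDstStageMask = &waitStage;
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &commandBuffer;
submitInfo.signalSemaphoreCount = 1;
submitInfo.pSignalSemaphores = &d3d12FenceSemaphore;

vkQueueSubmit(queue, 1, &submitInfo, VK_NULL_HANDLE);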
VK_KHR_get_display_properties2
- Name String
-
VK_KHR_get_display_properties2
- Extension Type
-
Instance extension
- Registered Extension Number
-
122
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_display
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2017-02-21
- IP Status
-
No known IP claims.
- Contributors
-
-
Ian Elliott, Google
-
James Jones, NVIDIA
-
This extension provides new entry points to query device display properties and capabilities in a way that can be easily extended by other extensions, without introducing any further entry points. This extension can be considered the VK_KHR_display equivalent of the VK_KHR_get_physical_device_properties2 extension.
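As an informal illustration, the following sketch enumerates displays through the extensible query; it assumes a valid physicalDevice handle, uses a fixed-size array purely for brevity, and omits error handling.
uint32_t count = 0;
vkGetPhysicalDeviceDisplayProperties2KHR(physicalDevice, &count, NULL);

VkDisplayProperties2KHR props[8] = { 0 };          // arbitrary upper bound for this sketch
if (count > 8)
    count = 8;
for (uint32_t i = 0; i < count; ++i)
    props[i].sType = VK_STRUCTURE_TYPE_DISPLAY_PROPERTIES_2_KHR;   // pNext may chain extension structs

vkGetPhysicalDeviceDisplayProperties2KHR(physicalDevice, &count, props);
// props[i].displayProperties holds the same data the original VK_KHR_display query reports.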
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DISPLAY_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_DISPLAY_PLANE_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_DISPLAY_MODE_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_DISPLAY_PLANE_INFO_2_KHR
-
VK_STRUCTURE_TYPE_DISPLAY_PLANE_CAPABILITIES_2_KHR
-
New Enums
None.
New Structures
New Functions
Issues
1) What should this extension be named?
RESOLVED: VK_KHR_get_display_properties2. Other alternatives:
-
VK_KHR_display2
-
One extension, combined with VK_KHR_get_surface_capabilities2.
2) Should extensible input structs be added for these new functions:
RESOLVED:
-
vkGetPhysicalDeviceDisplayProperties2KHR: No. The only current input is a VkPhysicalDevice. Other inputs wouldn’t make sense.
-
vkGetPhysicalDeviceDisplayPlaneProperties2KHR: No. The only current input is a VkPhysicalDevice. Other inputs wouldn’t make sense.
-
vkGetDisplayModeProperties2KHR: No. The only current inputs are a VkPhysicalDevice and a VkDisplayModeKHR. Other inputs wouldn’t make sense.
3) Should additional display query functions be extended?
RESOLVED:
-
vkGetDisplayPlaneSupportedDisplaysKHR: No. Extensions should instead extend vkGetDisplayPlaneCapabilitiesKHR().
Version History
-
Revision 1, 2017-02-21 (James Jones)
-
Initial draft.
-
VK_KHR_get_surface_capabilities2
- Name String
-
VK_KHR_get_surface_capabilities2
- Extension Type
-
Instance extension
- Registered Extension Number
-
120
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2017-02-27
- IP Status
-
No known IP claims.
- Contributors
-
-
Ian Elliott, Google
-
James Jones, NVIDIA
-
Alon Or-bach, Samsung
-
This extension provides new entry points to query device surface capabilities in a way that can be easily extended by other extensions, without introducing any further entry points. This extension can be considered the VK_KHR_surface equivalent of the VK_KHR_get_physical_device_properties2 extension.
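As an informal illustration, the following sketch queries surface capabilities through the extensible entry point; physicalDevice and surface are placeholder handles and error handling is omitted.
VkPhysicalDeviceSurfaceInfo2KHR surfaceInfo = { 0 };
surfaceInfo.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SURFACE_INFO_2_KHR;
surfaceInfo.surface = surface;

VkSurfaceCapabilities2KHR caps = { 0 };
caps.sType = VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_KHR;
// caps.pNext may chain extension-specific output structures here.

vkGetPhysicalDeviceSurfaceCapabilities2KHR(physicalDevice, &surfaceInfo, &caps);
// caps.surfaceCapabilities holds the same data vkGetPhysicalDeviceSurfaceCapabilitiesKHR returns.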
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SURFACE_INFO_2_KHR
-
VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_KHR
-
VK_STRUCTURE_TYPE_SURFACE_FORMAT_2_KHR
-
New Enums
None.
Issues
1) What should this extension be named?
RESOLVED: VK_KHR_get_surface_capabilities2. Other alternatives:
-
VK_KHR_surface2
-
One extension, combining a separate display-specific query extension.
2) Should additional WSI query functions be extended?
RESOLVED:
-
vkGetPhysicalDeviceSurfaceCapabilitiesKHR: Yes. The need for this motivated the extension.
-
vkGetPhysicalDeviceSurfaceSupportKHR: No. Currently only has boolean output. Extensions should instead extend vkGetPhysicalDeviceSurfaceCapabilities2KHR.
-
vkGetPhysicalDeviceSurfacePresentModesKHR: No. Recent discussion concluded this introduced too much variability for applications to deal with. Extensions should instead extend vkGetPhysicalDeviceSurfaceCapabilities2KHR.
-
vkGetPhysicalDeviceXlibPresentationSupportKHR: Not in this extension.
-
vkGetPhysicalDeviceXcbPresentationSupportKHR: Not in this extension.
-
vkGetPhysicalDeviceWaylandPresentationSupportKHR: Not in this extension.
-
vkGetPhysicalDeviceWin32PresentationSupportKHR: Not in this extension.
Version History
-
Revision 1, 2017-02-27 (James Jones)
-
Initial draft.
-
VK_KHR_image_format_list
- Name String
-
VK_KHR_image_format_list
- Extension Type
-
Device extension
- Registered Extension Number
-
148
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jason Ekstrand jekstrand
-
- Last Modified Date
-
2017-03-20
- IP Status
-
No known IP claims.
- Contributors
-
-
Jason Ekstrand, Intel
-
Jan-Harald Fredriksen, ARM
-
Jeff Bolz, NVIDIA
-
Jeff Leger, Qualcomm
-
Neil Henning, Codeplay
-
On some implementations, setting the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT on image creation can cause access to that image to perform worse than an equivalent image created without VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT, because the implementation does not know what view formats will be paired with the image.
This extension allows an application to provide the list of all formats that can be used with an image when it is created. The implementation may then be able to create a more efficient image that supports the subset of formats required by the application without having to support all formats in the format compatibility class of the image format.
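As an informal illustration, the following sketch restricts a mutable-format image to two view formats at creation time; the formats shown are arbitrary and the remaining VkImageCreateInfo members are assumed to be filled in as usual.
VkFormat viewFormats[2] = {
    VK_FORMAT_R8G8B8A8_UNORM,
    VK_FORMAT_R8G8B8A8_SRGB,
};

VkImageFormatListCreateInfoKHR formatList = { 0 };
formatList.sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO_KHR;
formatList.viewFormatCount = 2;
formatList.pViewFormats = viewFormats;

VkImageCreateInfo imageInfo = { 0 };
imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageInfo.pNext = &formatList;                        // restrict mutable views to the two formats above
imageInfo.flags = VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT;
imageInfo.format = VK_FORMAT_R8G8B8A8_UNORM;
// Other members set to application-desired values

VkImage image;
vkCreateImage(device, &imageInfo, NULL, &image);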
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO_KHR
New Enums
None.
New Structs
New Functions
None.
Issues
VK_KHR_incremental_present
- Name String
-
VK_KHR_incremental_present
- Extension Type
-
Device extension
- Registered Extension Number
-
85
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_swapchain
-
- Contact
-
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2016-11-02
- IP Status
-
No known IP claims.
- Contributors
-
-
Ian Elliott, Google
-
Jesse Hall, Google
-
Alon Or-bach, Samsung
-
James Jones, NVIDIA
-
Daniel Rakos, AMD
-
Ray Smith, ARM
-
Mika Isojarvi, Google
-
Jeff Juliano, NVIDIA
-
Jeff Bolz, NVIDIA
-
This device extension extends vkQueuePresentKHR, from the VK_KHR_swapchain extension, allowing an application to specify a list of rectangular, modified regions of each image to present. This should be used in situations where an application is only changing a small portion of the presentable images within a swapchain, since it enables the presentation engine to avoid wasting time presenting parts of the surface that haven’t changed.
This extension is leveraged from the EGL_KHR_swap_buffers_with_damage extension.
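As an informal illustration, the following sketch marks a single 64x64 region of one swapchain image as modified; presentInfo is assumed to be an otherwise fully populated VkPresentInfoKHR for a single swapchain, and queue is a valid presentation queue.
VkRectLayerKHR dirtyRect;
dirtyRect.offset.x = 0;
dirtyRect.offset.y = 0;
dirtyRect.extent.width  = 64;        // only this 64x64 region changed since the last present
dirtyRect.extent.height = 64;
dirtyRect.layer = 0;

VkPresentRegionKHR region;
region.rectangleCount = 1;
region.pRectangles = &dirtyRect;

VkPresentRegionsKHR regions = { 0 };
regions.sType = VK_STRUCTURE_TYPE_PRESENT_REGIONS_KHR;
regions.swapchainCount = 1;          // must match presentInfo.swapchainCount
regions.pRegions = &region;

presentInfo.pNext = &regions;        // chain the damage information into the present
vkQueuePresentKHR(queue, &presentInfo);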
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PRESENT_REGIONS_KHR
-
New Enums
None.
New Structures
New Functions
None.
Examples
None.
Issues
1) How should we handle stereoscopic-3D swapchains? We need to add a layer for each rectangle. One approach is to create another struct that contains the VkRect2D plus a layer, and have VkPresentRegionsKHR point to an array of that struct. Another approach is to have two parallel arrays, pRectangles and pLayers, where pRectangles[i] and pLayers[i] must be used together. Which approach should we use, and if the array of a new structure, what should that structure be called?
RESOLVED: Create a new structure, which is a VkRect2D plus a layer, and will be called VkRectLayerKHR.
2) Where is the origin of the VkRectLayerKHR?
RESOLVED: The upper left corner of the presentable image(s) of the swapchain, per the definition of framebuffer coordinates.
3) Does the rectangular region, VkRectLayerKHR, specify pixels of the swapchain’s image(s), or of the surface?
RESOLVED: Of the image(s). Some presentation engines may scale the pixels of a swapchain’s image(s) to the size of the surface. The size of the swapchain’s image(s) will be consistent, where the size of the surface may vary over time.
4) What if all of the rectangles for a given swapchain contain a width and/or height of zero?
RESOLVED: The application is indicating that no pixels changed since the last present. The presentation engine may use such a hint and not update any pixels for the swapchain. However, all other semantics of vkQueuePresentKHR must still be honored, including waiting for semaphores to signal.
Version History
-
Revision 1, 2016-11-02 (Ian Elliott)
-
Internal revisions
-
VK_KHR_push_descriptor
- Name String
-
VK_KHR_push_descriptor
- Extension Type
-
Device extension
- Registered Extension Number
-
81
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-09-12
- IP Status
-
No known IP claims.
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Michael Worcester, Imagination Technologies
-
This extension allows descriptors to be written into the command buffer, while the implementation is responsible for managing their memory. Push descriptors may enable easier porting from older APIs and in some cases can be more efficient than writing descriptors into descriptor sets.
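As an informal illustration, the following sketch pushes a single uniform buffer descriptor while recording a command buffer; commandBuffer, pipelineLayout, and uniformBuffer are placeholder handles, and set 0 of the pipeline layout is assumed to have been created with VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR.
VkDescriptorBufferInfo bufferInfo = { 0 };
bufferInfo.buffer = uniformBuffer;
bufferInfo.range  = VK_WHOLE_SIZE;

VkWriteDescriptorSet write = { 0 };
write.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
write.dstBinding = 0;
write.descriptorCount = 1;
write.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
write.pBufferInfo = &bufferInfo;
// write.dstSet is ignored for push descriptors.

vkCmdPushDescriptorSetKHR(
    commandBuffer,
    VK_PIPELINE_BIND_POINT_GRAPHICS,
    pipelineLayout,
    0,          // set number
    1,          // descriptorWriteCount
    &write);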
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PUSH_DESCRIPTOR_PROPERTIES_KHR
-
-
Extending VkDescriptorSetLayoutCreateFlagBits
-
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR
-
-
Extending VkDescriptorUpdateTemplateType
-
VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR
New Enums
None.
New Structures
Issues
None.
Examples
None.
Version History
-
Revision 1, 2016-10-15 (Jeff Bolz)
-
Internal revisions
-
-
Revision 2, 2017-09-12 (Tobias Hector)
-
Added interactions with Vulkan 1.1
-
VK_KHR_sampler_mirror_clamp_to_edge
- Name String
-
VK_KHR_sampler_mirror_clamp_to_edge
- Extension Type
-
Device extension
- Registered Extension Number
-
15
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Tobias Hector tobski
-
- Last Modified Date
-
2016-02-16
- Contributors
-
-
Tobias Hector, Imagination Technologies
-
VK_KHR_sampler_mirror_clamp_to_edge extends the set of sampler address modes to include an additional mode (VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE) that effectively uses a texture map twice as large as the original image, in which the additional half of the new image is a mirror image of the original image.
This new mode relaxes the need to generate images whose opposite edges match by using the original image to generate a matching “mirror image”. This mode allows the texture to be mirrored only once in the negative s, t, and r directions.
New Enum Constants
-
Extending VkSamplerAddressMode:
-
VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE
-
Example
Creating a sampler with the new address mode in each dimension
VkSamplerCreateInfo createInfo =
{
    VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO, // sType
    // Other members set to application-desired values
};

createInfo.addressModeU = VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE;
createInfo.addressModeV = VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE;
createInfo.addressModeW = VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE;

VkSampler sampler;
VkResult result = vkCreateSampler(
    device,
    &createInfo,
    NULL,          // pAllocator
    &sampler);
Version History
-
Revision 1, 2016-02-16 (Tobias Hector)
-
Initial draft
-
VK_KHR_shader_atomic_int64
- Name String
-
VK_KHR_shader_atomic_int64
- Extension Type
-
Device extension
- Registered Extension Number
-
181
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Aaron Hagan ahagan
-
- Last Modified Date
-
2018-07-05
- Interactions and External Dependencies
-
-
This extension requires the GL_ARB_gpu_shader_int64 and GL_EXT_shader_atomic_int64 extensions for GLSL source languages.
-
- Contributors
-
-
Aaron Hagan, AMD
-
Daniel Rakos, AMD
-
Jeff Bolz, NVIDIA
-
Neil Henning, Codeplay
-
This extension advertises the SPIR-V Int64Atomics capability for Vulkan, which allows a shader to contain 64-bit atomic operations on signed and unsigned integers. The supported operations include OpAtomicMin, OpAtomicMax, OpAtomicAnd, OpAtomicOr, OpAtomicXor, OpAtomicAdd, OpAtomicExchange, and OpAtomicCompareExchange.
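As an informal illustration, the following sketch queries the new feature structure before device creation; it assumes Vulkan 1.1's vkGetPhysicalDeviceFeatures2 (or the equivalent entry point from VK_KHR_get_physical_device_properties2) is available.
VkPhysicalDeviceShaderAtomicInt64FeaturesKHR atomicInt64Features = { 0 };
atomicInt64Features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_ATOMIC_INT64_FEATURES_KHR;

VkPhysicalDeviceFeatures2 features2 = { 0 };
features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
features2.pNext = &atomicInt64Features;

vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);

if (atomicInt64Features.shaderBufferInt64Atomics) {
    // 64-bit atomics on buffer memory are supported; chain the same structure
    // into VkDeviceCreateInfo::pNext to enable them at device creation.
}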
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_ATOMIC_INT64_FEATURES_KHR
-
New SPIR-V Capabilities
New Structures
Version History
-
Revision 1, 2018-07-05 (Aaron Hagan)
-
Internal revisions
-
VK_KHR_shader_float16_int8
- Name String
-
VK_KHR_shader_float16_int8
- Extension Type
-
Device extension
- Registered Extension Number
-
83
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Alexander Galazin alegal-arm
-
- Last Modified Date
-
2018-03-07
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension interacts with
VK_KHR_8bit_storage
-
This extension interacts with
VK_KHR_16bit_storage
-
This extension interacts with
VK_KHR_shader_float_controls
-
- Contributors
-
-
Alexander Galazin, Arm
-
Jan-Harald Fredriksen, Arm
-
Jeff Bolz, NVIDIA
-
Graeme Leese, Broadcom
-
Daniel Rakos, AMD
-
Description
The VK_KHR_shader_float16_int8 extension allows use of 16-bit floating-point types and 8-bit integer types in shaders for arithmetic operations. It introduces two new optional features, shaderFloat16 and shaderInt8, which directly map to the Float16 and the Int8 SPIR-V capabilities. The VK_KHR_shader_float16_int8 extension also specifies precision requirements for half-precision floating-point SPIR-V operations.
This extension doesn’t enable use of 8-bit integer types or 16-bit floating-point types in any shader input and output interfaces, and therefore doesn’t supersede the VK_KHR_8bit_storage or VK_KHR_16bit_storage extensions.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT16_INT8_FEATURES_KHR
-
New Structures
New Functions
-
None
Version History
-
Revision 1, 2018-03-07 (Alexander Galazin)
-
Initial draft
-
VK_KHR_shader_float_controls
- Name String
-
VK_KHR_shader_float_controls
- Extension Type
-
Device extension
- Registered Extension Number
-
198
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Alexander Galazin alegal-arm
-
- Last Modified Date
-
2018-09-11
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires
SPV_KHR_float_controls
-
- Contributors
-
-
Alexander Galazin, Arm
-
Jan-Harald Fredriksen, Arm
-
Jeff Bolz, NVIDIA
-
Graeme Leese, Broadcom
-
Daniel Rakos, AMD
-
Description
The VK_KHR_shader_float_controls
extension enables efficient use of
floating-point computations through the ability to query and override the
implementation’s default behavior for rounding modes, denormals, signed
zero, and infinity.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR
-
New Enums
-
None
New Structures
New Functions
-
None
New SPIR-V Capabilities
Issues
1) Which instructions must flush denorms?
RESOLVED: Only floating-point conversion, floating-point arithmetic, floating-point relational (except OpIsNan and OpIsInf), and floating-point GLSL.std.450 extended instructions must flush denormals.
2) What is the denorm behavior for intermediate results?
RESOLVED: When a SPIR-V instruction is implemented as a sequence of other instructions:
- In the DenormFlushToZero execution mode the intermediate instructions may flush denormals, but the final result of the sequence must not be denormal.
- In the DenormPreserve execution mode denormals must be preserved throughout the whole sequence.
3) Do denorm and rounding mode controls apply to OpSpecConstantOp?
RESOLVED: Yes, except when the opcode is OpQuantizeToF16.
4) The SPIR-V specification says that OpConvertFToU and OpConvertFToS unconditionally round towards zero. Do the rounding mode controls specified through the execution modes apply to them?
RESOLVED: No, these instructions unconditionally round towards zero.
5) Do any of the "Pack" GLSL.std.450 instructions count as conversion instructions and have the rounding mode apply?
RESOLVED: No, only instructions listed in the section "3.32.11. Conversion Instructions" of the SPIR-V specification count as conversion instructions.
6) When using inf/nan-ignore mode, what is expected of OpIsNan and OpIsInf?
RESOLVED: These instructions must always accurately detect inf/nan if it is passed to them.
Version History
-
Revision 3, 2018-09-11 (Alexander Galazin)
-
Minor restructuring
-
-
Revision 2, 2018-04-17 (Alexander Galazin)
-
Added issues and resolutions
-
-
Revision 1, 2018-04-11 (Alexander Galazin)
-
Initial draft
-
VK_KHR_shared_presentable_image
- Name String
-
VK_KHR_shared_presentable_image
- Extension Type
-
Device extension
- Registered Extension Number
-
112
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_swapchain
-
Requires
VK_KHR_get_surface_capabilities2
-
- Contact
-
-
Alon Or-bach alonorbach
-
- Last Modified Date
-
2017-03-20
- IP Status
-
No known IP claims.
- Contributors
-
-
Alon Or-bach, Samsung Electronics
-
Ian Elliott, Google
-
Jesse Hall, Google
-
Pablo Ceballos, Google
-
Chris Forbes, Google
-
Jeff Juliano, NVIDIA
-
James Jones, NVIDIA
-
Daniel Rakos, AMD
-
Tobias Hector, Imagination Technologies
-
Graham Connor, Imagination Technologies
-
Michael Worcester, Imagination Technologies
-
Cass Everitt, Oculus
-
Johannes Van Waveren, Oculus
-
This extension extends VK_KHR_swapchain to enable creation of a shared presentable image. This allows the application to use the image while the presentation engine is accessing it, in order to reduce the latency between rendering and presentation.
New Object Types
None.
New Enum Constants
-
Extending VkPresentModeKHR:
-
VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR
-
VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR
-
-
Extending VkImageLayout:
-
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
-
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_SHARED_PRESENT_SURFACE_CAPABILITIES_KHR
-
New Enums
None.
New Structures
New Functions
Issues
1) Should we allow a Vulkan WSI swapchain to toggle between normal usage and shared presentation usage?
RESOLVED: No. WSI swapchains are typically recreated with new properties instead of having their properties changed. This can also save resources, assuming that fewer images are needed for shared presentation, and assuming that most VR applications do not need to switch between normal and shared usage.
2) Should we have a query for determining how the presentation engine refresh is triggered?
RESOLVED: Yes. This is done via which presentation modes a surface supports.
3) Should the object representing a shared presentable image be an extension of a VkSwapchainKHR or a separate object?
RESOLVED: Extension of a swapchain due to overlap in creation properties and to allow common functionality between shared and normal presentable images and swapchains.
4) What should we call the extension and the new structures it creates?
RESOLVED: Shared presentable image / shared present.
5) Should the minImageCount and presentMode values of the VkSwapchainCreateInfoKHR be ignored, or required to be compatible values?
RESOLVED: minImageCount must be set to 1, and presentMode should be set to either VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR or VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR.
6) What should the layout of the shared presentable image be?
RESOLVED: After acquiring the shared presentable image, the application must transition it to the VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR layout prior to it being used. After this initial transition, any image usage that was requested during swapchain creation can be performed on the image without layout transitions being performed.
7) Do we need a new API for the trigger to refresh new content?
RESOLVED: vkQueuePresentKHR acts as the API to trigger a refresh, as this allows combination with other compatible extensions to vkQueuePresentKHR.
8) How should an application detect a VK_ERROR_OUT_OF_DATE_KHR error on a swapchain using the VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR present mode?
RESOLVED: Introduce vkGetSwapchainStatusKHR to allow applications to query the status of a swapchain using a shared presentation mode.
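As an informal illustration of this resolution, the following sketch polls a shared-present swapchain; device and sharedSwapchain are placeholder handles.
VkResult status = vkGetSwapchainStatusKHR(device, sharedSwapchain);
if (status == VK_ERROR_OUT_OF_DATE_KHR) {
    // The shared-present swapchain no longer matches the surface;
    // query the surface capabilities again and recreate the swapchain.
}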
9) What should subsequent calls to vkQueuePresentKHR for VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR swapchains be defined to do?
RESOLVED: State that implementations may use it as a hint for updated content.
10) Can the ownership of a shared presentable image be transferred to a different queue?
RESOLVED: No. It is not possible to transfer ownership of a shared presentable image obtained from a swapchain created using VK_SHARING_MODE_EXCLUSIVE after it has been presented.
11) How should vkQueueSubmit behave if a command buffer uses an image from a VK_ERROR_OUT_OF_DATE_KHR swapchain?
RESOLVED: vkQueueSubmit is expected to return the VK_ERROR_DEVICE_LOST error.
12) Can Vulkan provide any guarantee on the order of rendering, to enable beam chasing?
RESOLVED: This could be achieved via use of render passes to ensure strip rendering.
Version History
-
Revision 1, 2017-03-20 (Alon Or-bach)
-
Internal revisions
-
VK_KHR_surface
- Name String
-
VK_KHR_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
1
- Revision
-
25
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
James Jones cubanismo
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2016-08-25
- IP Status
-
No known IP claims.
- Contributors
-
-
Patrick Doane, Blizzard
-
Ian Elliott, LunarG
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
David Mao, AMD
-
Norbert Nopper, Freescale
-
Alon Or-bach, Samsung
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Jeff Vigil, Qualcomm
-
Chia-I Wu, LunarG
-
Jason Ekstrand, Intel
-
The VK_KHR_surface extension is an instance extension. It introduces VkSurfaceKHR objects, which abstract native platform surface or window objects for use with Vulkan. It also provides a way to determine whether a queue family in a physical device supports presenting to a particular surface.
Separate extensions for each platform provide the mechanisms for creating VkSurfaceKHR objects, but once created they may be used in this and other platform-independent extensions, in particular the VK_KHR_swapchain extension.
New Object Types
New Enum Constants
-
Extending VkResult:
-
VK_ERROR_SURFACE_LOST_KHR
-
VK_ERROR_NATIVE_WINDOW_IN_USE_KHR
-
New Enums
New Structures
New Functions
Examples
Note
The example code for the VK_KHR_surface extension has been removed from this version of the appendix.
Issues
1) Should this extension include a method to query whether a physical device supports presenting to a specific window or native surface on a given platform?
RESOLVED: Yes. Without this, applications would need to create a device instance to determine whether a particular window can be presented to. Knowing that a device supports presentation to a platform in general is not sufficient, as a single machine might support multiple seats, or instances of the platform that each use different underlying physical devices. Additionally, on some platforms, such as the X Window System, different drivers and devices might be used for different windows depending on which section of the desktop they exist on.
2) Should the vkGetPhysicalDeviceSurfaceCapabilitiesKHR, vkGetPhysicalDeviceSurfaceFormatsKHR, and vkGetPhysicalDeviceSurfacePresentModesKHR functions from VK_KHR_swapchain be modified to operate on physical devices and moved to this extension to implement the resolution of issue 1?
RESOLVED: No, separate query functions are needed, as the purposes served are similar but incompatible. The vkGetPhysicalDeviceSurface*KHR functions return information that could potentially depend on an initialized device. For example, the formats supported for presentation to the surface might vary depending on which device extensions are enabled. The query introduced to resolve issue 1 should be used only to query generic driver or platform properties. The physical device parameter is intended to serve only as an identifier rather than a stateful object.
3) Should Vulkan include support for Xlib or XCB as the API for accessing the X Window System platform?
RESOLVED: Both. XCB is a more modern and efficient API, but Xlib usage is deeply ingrained in many applications and likely will remain in use for the foreseeable future. Not all drivers necessarily need to support both, but including both as options in the core specification will probably encourage support, which should in turn ease adoption of the Vulkan API in older codebases. Additionally, the performance improvements possible with XCB likely will not have a measurable impact on the performance of Vulkan presentation and other minimal window system interactions defined here.
4) Should the GBM platform be included in the list of platform enums?
RESOLVED: Deferred, and will be addressed with a platform-specific extension to be written in the future.
Version History
-
Revision 1, 2015-05-20 (James Jones)
-
Initial draft, based on LunarG KHR spec, other KHR specs, patches attached to bugs.
-
-
Revision 2, 2015-05-22 (Ian Elliott)
-
Created initial Description section.
-
Removed query for whether a platform requires the use of a queue for presentation, since it was decided that presentation will always be modeled as being part of the queue.
-
Fixed typos and other minor mistakes.
-
-
Revision 3, 2015-05-26 (Ian Elliott)
-
Improved the Description section.
-
-
Revision 4, 2015-05-27 (James Jones)
-
Fixed compilation errors in example code.
-
-
Revision 5, 2015-06-01 (James Jones)
-
Added issues 1 and 2 and made related spec updates.
-
-
Revision 6, 2015-06-01 (James Jones)
-
Merged the platform type mappings table previously removed from VK_KHR_swapchain with the platform description table in this spec.
-
Added issues 3 and 4 documenting choices made when building the initial list of native platforms supported.
-
-
Revision 7, 2015-06-11 (Ian Elliott)
-
Updated table 1 per input from the KHR TSG.
-
Updated issue 4 (GBM) per discussion with Daniel Stone. He will create a platform-specific extension sometime in the future.
-
-
Revision 8, 2015-06-17 (James Jones)
-
Updated enum-extending values using new convention.
-
Fixed the value of VK_SURFACE_PLATFORM_INFO_TYPE_SUPPORTED_KHR.
-
-
Revision 9, 2015-06-17 (James Jones)
-
Rebased on Vulkan API version 126.
-
-
Revision 10, 2015-06-18 (James Jones)
-
Marked issues 2 and 3 resolved.
-
-
Revision 11, 2015-06-23 (Ian Elliott)
-
Examples now show use of function pointers for extension functions.
-
Eliminated extraneous whitespace.
-
-
Revision 12, 2015-07-07 (Daniel Rakos)
-
Added error section describing when each error is expected to be reported.
-
Replaced the term "queue node index" with "queue family index" in the spec as that is the agreed term to be used in the latest version of the core header and spec.
-
Replaced bool32_t with VkBool32.
-
-
Revision 13, 2015-08-06 (Daniel Rakos)
-
Updated spec against latest core API header version.
-
-
Revision 14, 2015-08-20 (Ian Elliott)
-
Renamed this extension and all of its enumerations, types, functions, etc. This makes it compliant with the proposed standard for Vulkan extensions.
-
Switched from "revision" to "version", including use of the VK_MAKE_VERSION macro in the header file.
-
Did miscellaneous cleanup, etc.
-
-
Revision 15, 2015-08-20 (Ian Elliott—porting a 2015-07-29 change from James Jones)
-
Moved the surface transform enums here from VK_WSI_swapchain so they could be re-used by VK_WSI_display.
-
-
Revision 16, 2015-09-01 (James Jones)
-
Restore single-field revision number.
-
-
Revision 17, 2015-09-01 (James Jones)
-
Fix example code compilation errors.
-
-
Revision 18, 2015-09-26 (Jesse Hall)
-
Replaced VkSurfaceDescriptionKHR with the VkSurfaceKHR object, which is created via layered extensions. Added VkDestroySurfaceKHR.
-
-
Revision 19, 2015-09-28 (Jesse Hall)
-
Renamed from VK_EXT_KHR_swapchain to VK_EXT_KHR_surface.
-
-
Revision 20, 2015-09-30 (Jeff Vigil)
-
Add error result VK_ERROR_SURFACE_LOST_KHR.
-
-
Revision 21, 2015-10-15 (Daniel Rakos)
-
Updated the resolution of issue #2 and include the surface capability queries in this extension.
-
Renamed SurfaceProperties to SurfaceCapabilities as it better reflects that the values returned are the capabilities of the surface on a particular device.
-
Other minor cleanup and consistency changes.
-
-
Revision 22, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_surface to VK_KHR_surface.
-
-
Revision 23, 2015-11-03 (Daniel Rakos)
-
Added allocation callbacks to vkDestroySurfaceKHR.
-
-
Revision 24, 2015-11-10 (Jesse Hall)
-
Removed VkSurfaceTransformKHR. Use VkSurfaceTransformFlagBitsKHR instead.
-
Rename VkSurfaceCapabilitiesKHR member maxImageArraySize to maxImageArrayLayers.
-
-
Revision 25, 2016-01-14 (James Jones)
-
Moved VK_ERROR_NATIVE_WINDOW_IN_USE_KHR from the VK_KHR_android_surface to the VK_KHR_surface extension.
-
-
2016-08-23 (Ian Elliott)
-
Update the example code, to not have so many characters per line, and to split out a new example to show how to obtain function pointers.
-
-
2016-08-25 (Ian Elliott)
-
A note was added at the beginning of the example code, stating that it will be removed from future versions of the appendix.
-
VK_KHR_swapchain
- Name String
-
VK_KHR_swapchain
- Extension Type
-
Device extension
- Registered Extension Number
-
2
- Revision
-
70
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
James Jones cubanismo
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2017-10-06
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Interacts with Vulkan 1.1
-
- Contributors
-
-
Patrick Doane, Blizzard
-
Ian Elliott, LunarG
-
Jesse Hall, Google
-
Mathias Heyer, NVIDIA
-
James Jones, NVIDIA
-
David Mao, AMD
-
Norbert Nopper, Freescale
-
Alon Or-bach, Samsung
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Jeff Vigil, Qualcomm
-
Chia-I Wu, LunarG
-
Jason Ekstrand, Intel
-
Matthaeus G. Chajdas, AMD
-
Ray Smith, ARM
-
The VK_KHR_swapchain extension is the device-level companion to the VK_KHR_surface extension. It introduces VkSwapchainKHR objects, which provide the ability to present rendering results to a surface.
New Object Types
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_PRESENT_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_CAPABILITIES_KHR
-
VK_STRUCTURE_TYPE_IMAGE_SWAPCHAIN_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_SWAPCHAIN_INFO_KHR
-
VK_STRUCTURE_TYPE_ACQUIRE_NEXT_IMAGE_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_SWAPCHAIN_CREATE_INFO_KHR
-
-
Extending VkImageLayout:
-
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR
-
-
Extending VkResult:
-
VK_SUBOPTIMAL_KHR
-
VK_ERROR_OUT_OF_DATE_KHR
-
New Structures
New Functions
Issues
1) Does this extension allow the application to specify the memory backing of the presentable images?
RESOLVED: No. Unlike standard images, the implementation will allocate the memory backing of the presentable image.
2) What operations are allowed on presentable images?
RESOLVED: This is determined by the image usage flags specified when creating the presentable image’s swapchain.
3) Does this extension support MSAA presentable images?
RESOLVED: No. Presentable images are always single-sampled. Multi-sampled rendering must use regular images. To present the rendering results the application must manually resolve the multi-sampled image to a single-sampled presentable image prior to presentation.
4) Does this extension support stereo/multi-view presentable images?
RESOLVED: Yes. The number of views associated with a presentable image is determined by the imageArrayLayers specified when creating a swapchain. All presentable images in a given swapchain use the same array size.
5) Are the layers of stereo presentable images half-sized?
RESOLVED: No. The image extents always match those requested by the application.
6) Do the “present” and “acquire next image” commands operate on a queue? If not, do they need to include explicit semaphore objects to interlock them with queue operations?
RESOLVED: The present command operates on a queue. The image ownership operation it represents happens in order with other operations on the queue, so no explicit semaphore object is required to synchronize its actions.
Applications may want to acquire the next image in separate threads from those in which they manage their queue, or in multiple threads. To make such usage easier, the acquire next image command takes a semaphore to signal as a method of explicit synchronization. The application must later queue a wait for this semaphore before queuing execution of any commands using the image.
7) Does vkAcquireNextImageKHR block if no images are available?
RESOLVED: The command takes a timeout parameter.
Special values for the timeout are 0, which makes the call a non-blocking
operation, and UINT64_MAX
, which blocks indefinitely.
Values in between will block for up to the specified time.
The call will return when an image becomes available or an error occurs.
It may, but is not required to, return before the specified timeout expires
if the swapchain becomes out of date.
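As an informal illustration of these timeout semantics (not part of the extension text), the following sketch first polls without blocking and then waits indefinitely; device, swapchain, and imageAcquiredSemaphore are assumed to have been created by the application:
// Non-blocking poll: with a timeout of 0, VK_NOT_READY is returned if no
// image is available yet.
uint32_t imageIndex;
VkResult result = vkAcquireNextImageKHR(device, swapchain, 0,
                                        imageAcquiredSemaphore, VK_NULL_HANDLE,
                                        &imageIndex);
if (result == VK_NOT_READY) {
    // Block until an image becomes available or an error occurs.
    result = vkAcquireNextImageKHR(device, swapchain, UINT64_MAX,
                                   imageAcquiredSemaphore, VK_NULL_HANDLE,
                                   &imageIndex);
}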
8) Can multiple presents be queued using one vkQueuePresentKHR call?
RESOLVED: Yes. VkPresentInfoKHR contains a list of swapchains and corresponding image indices that will be presented. When supported, all presentations queued with a single vkQueuePresentKHR call will be applied atomically as one operation. The same swapchain must not appear in the list more than once. Later extensions may provide applications stronger guarantees of atomicity for such present operations, and/or allow them to query whether atomic presentation of a particular group of swapchains is possible.
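For illustration, a sketch of one possible call presenting to two swapchains at once; presentQueue, swapchainA/swapchainB, imageIndexA/imageIndexB, and renderingDoneSemaphore are assumed application-created objects:
VkSwapchainKHR swapchains[2]   = { swapchainA, swapchainB };
uint32_t       imageIndices[2] = { imageIndexA, imageIndexB };
VkResult       results[2];

VkPresentInfoKHR presentInfo = {
    .sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
    .pNext = NULL,
    .waitSemaphoreCount = 1,
    .pWaitSemaphores = &renderingDoneSemaphore,
    .swapchainCount = 2,
    .pSwapchains = swapchains,
    .pImageIndices = imageIndices,
    .pResults = results            // optional per-swapchain results
};

VkResult result = vkQueuePresentKHR(presentQueue, &presentInfo);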
9) How do the presentation and acquire next image functions notify the application the targeted surface has changed?
RESOLVED: Two new result codes are introduced for this purpose:
-
VK_SUBOPTIMAL_KHR
- Presentation will still succeed, subject to the window resize behavior, but the swapchain is no longer configured optimally for the surface it targets. Applications should query updated surface information and recreate their swapchain at the next convenient opportunity.
-
VK_ERROR_OUT_OF_DATE_KHR
- Failure. The swapchain is no longer compatible with the surface it targets. The application must query updated surface information and recreate the swapchain before presentation will succeed.
These can be returned by both vkAcquireNextImageKHR and vkQueuePresentKHR.
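A minimal sketch of how an application might react to these result codes; recreateSwapchain and swapchainNeedsRecreation are hypothetical application-side helpers, and presentQueue/presentInfo are assumed to exist:
VkResult result = vkQueuePresentKHR(presentQueue, &presentInfo);
if (result == VK_ERROR_OUT_OF_DATE_KHR) {
    // The present failed; the swapchain must be recreated before presentation can succeed.
    recreateSwapchain();
} else if (result == VK_SUBOPTIMAL_KHR) {
    // The present succeeded; recreate the swapchain at the next convenient opportunity.
    swapchainNeedsRecreation = VK_TRUE;
}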
10) Does the vkAcquireNextImageKHR command return a semaphore to the application via an output parameter, or accept a semaphore to signal from the application as an object handle parameter?
RESOLVED: Accept a semaphore to signal as an object handle. This avoids the need to specify whether the application must destroy the semaphore or whether it is owned by the swapchain, and if the latter, what its lifetime is and whether it can be re-used for other operations once it is received from vkAcquireNextImageKHR.
11) What types of swapchain queuing behavior should be exposed? Options include swap interval specification, mailbox/most recent vs. FIFO queue management, targeting specific vertical blank intervals or absolute times for a given present operation, and probably others. For some of these, whether they are specified at swapchain creation time or as per-present parameters needs to be decided as well.
RESOLVED: The base swapchain extension will expose 3 possible behaviors (of which, FIFO will always be supported):
-
Immediate present: Does not wait for vertical blanking period to update the current image, likely resulting in visible tearing. No internal queue is used. Present requests are applied immediately.
-
Mailbox queue: Waits for the next vertical blanking period to update the current image. No tearing should be observed. An internal single-entry queue is used to hold pending presentation requests. If the queue is full when a new presentation request is received, the new request replaces the existing entry, and any images associated with the prior entry become available for re-use by the application.
-
FIFO queue: Waits for the next vertical blanking period to update the current image. No tearing should be observed. An internal queue containing
numSwapchainImages
- 1 entries is used to hold pending presentation requests. New requests are appended to the end of the queue, and one request is removed from the beginning of the queue and processed during each vertical blanking period in which the queue is non-empty.
Not all surfaces will support all of these modes, so the modes supported will be returned using a surface info query. All surfaces must support the FIFO queue mode. Applications must choose one of these modes up front when creating a swapchain. Switching modes can be accomplished by recreating the swapchain.
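A sketch of selecting a present mode under these rules, preferring mailbox when exposed and otherwise falling back to the always-supported FIFO mode; physicalDevice and surface are assumed to exist, and the fixed-size array is purely for brevity:
uint32_t modeCount = 0;
vkGetPhysicalDeviceSurfacePresentModesKHR(physicalDevice, surface, &modeCount, NULL);
VkPresentModeKHR modes[8];
if (modeCount > 8) modeCount = 8;
vkGetPhysicalDeviceSurfacePresentModesKHR(physicalDevice, surface, &modeCount, modes);

VkPresentModeKHR chosenMode = VK_PRESENT_MODE_FIFO_KHR;  // always supported
for (uint32_t i = 0; i < modeCount; ++i) {
    if (modes[i] == VK_PRESENT_MODE_MAILBOX_KHR) {
        chosenMode = VK_PRESENT_MODE_MAILBOX_KHR;
        break;
    }
}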
12) Can VK_PRESENT_MODE_MAILBOX_KHR
provide non-blocking guarantees
for vkAcquireNextImageKHR? If so, what is the proper criteria?
RESOLVED: Yes. The difficulty is not immediately obvious here. Naively, if at least 3 images are requested, mailbox mode should always have an image available for the application if the application does not own any images when the call to vkAcquireNextImageKHR was made. However, some presentation engines may have more than one “current” image, and would still need to block in some cases. The right requirement appears to be that if the application allocates the surface’s minimum number of images + 1 then it is guaranteed non-blocking behavior when it does not currently own any images.
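A sketch of the image-count heuristic described above, assuming surfaceCapabilities was previously filled in by vkGetPhysicalDeviceSurfaceCapabilitiesKHR and swapchainCreateInfo is being assembled by the application:
uint32_t desiredImageCount = surfaceCapabilities.minImageCount + 1;
// A maxImageCount of 0 indicates no practical upper limit (see issue 14).
if (surfaceCapabilities.maxImageCount > 0 &&
    desiredImageCount > surfaceCapabilities.maxImageCount) {
    desiredImageCount = surfaceCapabilities.maxImageCount;
}
swapchainCreateInfo.minImageCount = desiredImageCount;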
13) Is there a way to create and initialize a new swapchain for a surface
that has generated a VK_SUBOPTIMAL_KHR
return code while still using
the old swapchain?
RESOLVED: Not as part of this specification. This could be useful to allow the application to create an “optimal” replacement swapchain and rebuild all its command buffers using it in a background thread at a low priority while continuing to use the “suboptimal” swapchain in the main thread. It could probably use the same “atomic replace” semantics proposed for recreating direct-to-device swapchains without incurring a mode switch. However, after discussion, it was determined some platforms probably could not support concurrent swapchains for the same surface though, so this will be left out of the base KHR extensions. A future extension could add this for platforms where it is supported.
14) Should there be a special value for
VkSurfaceCapabilitiesKHR::maxImageCount
to indicate there are no
practical limits on the number of images in a swapchain?
RESOLVED: Yes. There will often be cases where there is no practical limit to the number of images in a swapchain other than the amount of available resources (i.e., memory) in the system. Trying to derive a hard limit from things like memory size is prone to failure. It is better in such cases to leave it to applications to figure such soft limits out via trial/failure iterations.
15) Should there be a special value for
VkSurfaceCapabilitiesKHR::currentExtent
to indicate the size of
the platform surface is undefined?
RESOLVED: Yes. On some platforms (Wayland, for example), the surface size is defined by the images presented to it rather than the other way around.
16) Should there be a special value for
VkSurfaceCapabilitiesKHR::maxImageExtent
to indicate there is no
practical limit on the surface size?
RESOLVED: No. It seems unlikely such a system would exist. 0 could be used to indicate the platform places no limits on the extents beyond those imposed by Vulkan for normal images, but this query could just as easily return those same limits, so a special “unlimited” value does not seem useful for this field.
17) How should surface rotation and mirroring be exposed to applications? How do they specify rotation and mirroring transforms applied prior to presentation?
RESOLVED: Applications can query both the supported and current transforms
of a surface.
Both are specified relative to the device’s “natural” display rotation and
direction.
The supported transforms indicates which orientations the presentation
engine accepts images in.
For example, a presentation engine that does not support transforming
surfaces as part of presentation, and which is presenting to a surface that
is displayed with a 90-degree rotation, would return only one supported
transform bit: VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR
.
Applications must transform their rendering by the transform they specify in the preTransform field when creating the swapchain.
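One possible way to choose preTransform, sketched under the assumption that surfaceCapabilities and swapchainCreateInfo already exist:
VkSurfaceTransformFlagBitsKHR preTransform;
if (surfaceCapabilities.supportedTransforms & VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR) {
    // The presentation engine can perform the rotation; render upright.
    preTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
} else {
    // Otherwise use the surface's current transform; the application must
    // rotate its own rendering to match (see above).
    preTransform = surfaceCapabilities.currentTransform;
}
swapchainCreateInfo.preTransform = preTransform;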
18) Can surfaces ever not support VK_MIRROR_NONE
? Can they support
vertical and horizontal mirroring simultaneously? Relatedly, should
VK_MIRROR_NONE
[_BIT] be zero, or bit one, and should applications be
allowed to specify multiple pre and current mirror transform bits, or
exactly one?
RESOLVED: Since some platforms may not support presenting with a transform
other than the native window’s current transform, and prerotation/mirroring
are specified relative to the device’s natural rotation and direction,
rather than relative to the surface’s current rotation and direction, it is
necessary to express lack of support for no mirroring.
To allow this, the MIRROR_NONE
enum must occupy a bit in the flags.
Since MIRROR_NONE
must be a bit in the bitmask rather than a bitmask
with no values set, allowing more than one bit to be set in the bitmask
would make it possible to describe undefined transforms such as
VK_MIRROR_NONE_BIT
| VK_MIRROR_HORIZONTAL_BIT
, or a transform
that includes both “no mirroring” and “horizontal mirroring”
simultaneously.
Therefore, it is desirable to allow specifying all supported mirroring
transforms using only one bit.
The question then becomes, should there be a
VK_MIRROR_HORIZONTAL_AND_VERTICAL_BIT
to represent a simultaneous
horizontal and vertical mirror transform? However, such a transform is
equivalent to a 180 degree rotation, so presentation engines and
applications that wish to support or use such a transform can express it
through rotation instead.
Therefore, 3 exclusive bits are sufficient to express all needed mirroring
transforms.
19) Should support for sRGB be required?
RESOLVED: With the advent of UHD and HDR display devices, proper color space information is vital to the display pipeline represented by the swapchain. The application can discover the supported format/color-space pairs and select a pair most suited to its rendering needs. Currently only the sRGB color space is supported; future extensions may provide support for more color spaces. See issues 23 and 24.
20) Is there a mechanism to modify or replace an existing swapchain with one targeting the same surface?
RESOLVED: Yes. This is described above in the text.
21) Should there be a way to set prerotation and mirroring using native APIs when presenting using a Vulkan swapchain?
RESOLVED: Yes.
The transforms that can be expressed in this extension are a subset of those
possible on native platforms.
If a platform exposes a method to specify the transform of presented images
for a given surface using native methods and exposes more transforms or
other properties for surfaces than Vulkan supports, it might be impossible,
difficult, or inconvenient to set some of those properties using Vulkan KHR
extensions and some using the native interfaces.
To avoid overwriting properties set using native commands when presenting
using a Vulkan swapchain, the application can set the pretransform to
“inherit”, in which case the current native properties will be used, or if
none are available, a platform-specific default will be used.
Platforms that do not specify a reasonable default or do not provide native
mechanisms to specify such transforms should not include the inherit bits in
the supportedTransforms
bitmask they return in
VkSurfaceCapabilitiesKHR.
22) Should the content of presentable images be clipped by objects obscuring their target surface?
RESOLVED: Applications can choose which behavior they prefer. Allowing the content to be clipped could enable more optimal presentation methods on some platforms, but some applications might rely on the content of presentable images to perform techniques such as partial updates or motion blurs.
23) What is the purpose of specifying a VkColorSpaceKHR along with VkFormat when creating a swapchain?
RESOLVED: While Vulkan itself is color space agnostic (e.g. even the
meaning of R, G, B and A can be freely defined by the rendering
application), the swapchain eventually will have to present the images on a
display device with specific color reproduction characteristics.
If any color space transformations are necessary before an image can be
displayed, the color space of the presented image must be known to the
swapchain.
A swapchain will only support a restricted set of color format and -space
pairs.
This set can be discovered via vkGetPhysicalDeviceSurfaceFormatsKHR.
As it can be expected that most display devices support the sRGB color
space, at least one format/color-space pair has to be exposed, where the
color space is VK_COLOR_SPACE_SRGB_NONLINEAR_KHR
.
24) How are sRGB formats and the sRGB color space related?
RESOLVED: While Vulkan exposes a number of SRGB texture formats, using
such formats does not guarantee working in a specific color space.
It merely means that the hardware can directly support applying the
non-linear transfer functions defined by the sRGB standard color space when
reading from or writing to images of these formats.
Still, it is unlikely that a swapchain will expose a *_SRGB
format
along with any color space other than
VK_COLOR_SPACE_SRGB_NONLINEAR_KHR
.
On the other hand, non-*_SRGB
formats will very likely be exposed paired with the sRGB color space.
This means the hardware will not apply any transfer function when reading
from or writing to such images, yet they will still be presented on a device
with sRGB display characteristics.
In this case the application is responsible for applying the transfer
function, for instance by using shader math.
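For illustration, a sketch that prefers an *_SRGB format paired with VK_COLOR_SPACE_SRGB_NONLINEAR_KHR and otherwise falls back to the first reported pair; physicalDevice and surface are assumed, and the fixed-size array is for brevity only:
uint32_t formatCount = 0;
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &formatCount, NULL);
VkSurfaceFormatKHR formats[32];
if (formatCount > 32) formatCount = 32;
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &formatCount, formats);

VkSurfaceFormatKHR chosen = formats[0];
for (uint32_t i = 0; i < formatCount; ++i) {
    if (formats[i].format == VK_FORMAT_B8G8R8A8_SRGB &&
        formats[i].colorSpace == VK_COLOR_SPACE_SRGB_NONLINEAR_KHR) {
        chosen = formats[i];  // the hardware applies the sRGB transfer function
        break;
    }
}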
25) How are the lifetime of surfaces and swapchains targeting them related?
RESOLVED: A surface must outlive any swapchains targeting it. A VkSurfaceKHR owns the binding of the native window to the Vulkan driver.
26) How can the client control the way the alpha channel of swapchain images is treated by the presentation engine during compositing?
RESOLVED: We should add new enum values to allow the client to negotiate
with the presentation engine on how to treat image alpha values during the
compositing process.
Since not all platforms can practically control this through the Vulkan
driver, a value of VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR
is provided like
for surface transforms.
27) Is vkCreateSwapchainKHR the right function to return
VK_ERROR_NATIVE_WINDOW_IN_USE_KHR
, or should the various
platform-specific VkSurfaceKHR factory functions catch this error
earlier?
RESOLVED: For most platforms, the VkSurfaceKHR structure is a simple container holding the data that identifies a native window or other object representing a surface on a particular platform. For the surface factory functions to return this error, they would likely need to register a reference on the native objects with the native display server somehow, and ensure no other such references exist. Surfaces were not intended to be that heavyweight.
Swapchains are intended to be the objects that directly manipulate native windows and communicate with the native presentation mechanisms. Swapchains will already need to communicate with the native display server to negotiate allocation and/or presentation of presentable images for a native surface. Therefore, it makes more sense for swapchain creation to be the point at which native object exclusivity is enforced. Platforms may choose to enforce further restrictions on the number of VkSurfaceKHR objects that may be created for the same native window if such a requirement makes sense on a particular platform, but a global requirement is only sensible at the swapchain level.
Examples
Note
The example code for the VK_KHR_surface and VK_KHR_swapchain extensions was removed from this appendix. The WSI example code was ported to the "cube" demo that is shipped with the official Khronos SDK, and is maintained there.
Version History
-
Revision 1, 2015-05-20 (James Jones)
-
Initial draft, based on LunarG KHR spec, other KHR specs, patches attached to bugs.
-
-
Revision 2, 2015-05-22 (Ian Elliott)
-
Made many agreed-upon changes from 2015-05-21 KHR TSG meeting. This includes using only a queue for presentation, and having an explicit function to acquire the next image.
-
Fixed typos and other minor mistakes.
-
-
Revision 3, 2015-05-26 (Ian Elliott)
-
Improved the Description section.
-
Added or resolved issues that were found in improving the Description. For example, pSurfaceDescription is used consistently, instead of sometimes using pSurface.
-
-
Revision 4, 2015-05-27 (James Jones)
-
Fixed some grammatical errors and typos
-
Filled in the description of imageUseFlags when creating a swapchain.
-
Added a description of swapInterval.
-
Replaced the paragraph describing the order of operations on a queue for image ownership and presentation.
-
-
Revision 5, 2015-05-27 (James Jones)
-
Imported relevant issues from the (abandoned) vk_wsi_persistent_swapchain_images extension.
-
Added issues 6 and 7, regarding behavior of the acquire next image and present commands with respect to queues.
-
Updated spec language and examples to align with proposed resolutions to issues 6 and 7.
-
-
Revision 6, 2015-05-27 (James Jones)
-
Added issue 8, regarding atomic presentation of multiple swapchains
-
Updated spec language and examples to align with proposed resolution to issue 8.
-
-
Revision 7, 2015-05-27 (James Jones)
-
Fixed compilation errors in example code, and made related spec fixes.
-
-
Revision 8, 2015-05-27 (James Jones)
-
Added issue 9, and the related VK_SUBOPTIMAL_KHR result code.
-
Renamed VK_OUT_OF_DATE_KHR to VK_ERROR_OUT_OF_DATE_KHR.
-
-
Revision 9, 2015-05-27 (James Jones)
-
Added inline proposed resolutions (marked with [JRJ]) to some XXX questions/issues. These should be moved to the issues section in a subsequent update if the proposals are adopted.
-
-
Revision 10, 2015-05-28 (James Jones)
-
Converted vkAcquireNextImageKHR back to a non-queue operation that uses a VkSemaphore object for explicit synchronization.
-
Added issue 10 to determine whether vkAcquireNextImageKHR generates or returns semaphores, or whether it operates on a semaphore provided by the application.
-
-
Revision 11, 2015-05-28 (James Jones)
-
Marked issues 6, 7, and 8 resolved.
-
Renamed VkSurfaceCapabilityPropertiesKHR to VkSurfacePropertiesKHR to better convey the mutable nature of the info it contains.
-
-
Revision 12, 2015-05-28 (James Jones)
-
Added issue 11 with a proposed resolution, and the related issue 12.
-
Updated various sections of the spec to match the proposed resolution to issue 11.
-
-
Revision 13, 2015-06-01 (James Jones)
-
Moved some structures to VK_EXT_KHR_swap_chain to resolve the spec’s issues 1 and 2.
-
-
Revision 14, 2015-06-01 (James Jones)
-
Added code for example 4 demonstrating how an application might make use of the two different present and acquire next image KHR result codes.
-
Added issue 13.
-
-
Revision 15, 2015-06-01 (James Jones)
-
Added issues 14 - 16 and related spec language.
-
Fixed some spelling errors.
-
Added language describing the meaningful return values for vkAcquireNextImageKHR and vkQueuePresentKHR.
-
-
Revision 16, 2015-06-02 (James Jones)
-
Added issues 17 and 18, as well as related spec language.
-
Removed some erroneous text added by mistake in the last update.
-
-
Revision 17, 2015-06-15 (Ian Elliott)
-
Changed special value from "-1" to "0" so that the data types can be unsigned.
-
-
Revision 18, 2015-06-15 (Ian Elliott)
-
Clarified the values of VkSurfacePropertiesKHR::minImageCount and the timeout parameter of the vkAcquireNextImageKHR function.
-
-
Revision 19, 2015-06-17 (James Jones)
-
Misc. cleanup. Removed resolved inline issues and fixed typos.
-
Fixed clarification of VkSurfacePropertiesKHR::minImageCount made in version 18.
-
Added a brief "Image Ownership" definition to the list of terms used in the spec.
-
-
Revision 20, 2015-06-17 (James Jones)
-
Updated enum-extending values using new convention.
-
-
Revision 21, 2015-06-17 (James Jones)
-
Added language describing how to use VK_IMAGE_LAYOUT_PRESENT_SOURCE_KHR.
-
Cleaned up an XXX comment regarding the description of which queues vkQueuePresentKHR can be used on.
-
-
Revision 22, 2015-06-17 (James Jones)
-
Rebased on Vulkan API version 126.
-
-
Revision 23, 2015-06-18 (James Jones)
-
Updated language for issue 12 to read as a proposed resolution.
-
Marked issues 11, 12, 13, 16, and 17 resolved.
-
Temporarily added links to the relevant bugs under the remaining unresolved issues.
-
Added issues 19 and 20 as well as proposed resolutions.
-
-
Revision 24, 2015-06-19 (Ian Elliott)
-
Changed special value for VkSurfacePropertiesKHR::currentExtent back to "-1" from "0". This value will never need to be unsigned, and "0" is actually a legal value.
-
-
Revision 25, 2015-06-23 (Ian Elliott)
-
Examples now show use of function pointers for extension functions.
-
Eliminated extraneous whitespace.
-
-
Revision 26, 2015-06-25 (Ian Elliott)
-
Resolved Issues 9 & 10 per KHR TSG meeting.
-
-
Revision 27, 2015-06-25 (James Jones)
-
Added oldSwapchain member to VkSwapchainCreateInfoKHR.
-
-
Revision 28, 2015-06-25 (James Jones)
-
Added the "inherit" bits to the rotation and mirroring flags and the associated issue 21.
-
-
Revision 29, 2015-06-25 (James Jones)
-
Added the "clipped" flag to VkSwapchainCreateInfoKHR, and the associated issue 22.
-
Specified that presenting an image does not modify it.
-
-
Revision 30, 2015-06-25 (James Jones)
-
Added language to the spec that clarifies the behavior of vkCreateSwapchainKHR() when the oldSwapchain field of VkSwapchainCreateInfoKHR is not NULL.
-
-
Revision 31, 2015-06-26 (Ian Elliott)
-
Example of new VkSwapchainCreateInfoKHR members, "oldSwapchain" and "clipped".
-
Example of using VkSurfacePropertiesKHR::{min|max}ImageCount to set VkSwapchainCreateInfoKHR::minImageCount.
-
Rename vkGetSurfaceInfoKHR()'s 4th parameter to "pDataSize", for consistency with other functions.
-
Add macro with C-string name of extension (just to header file).
-
-
Revision 32, 2015-06-26 (James Jones)
-
Minor adjustments to the language describing the behavior of "oldSwapchain"
-
Fixed the version date on my previous two updates.
-
-
Revision 33, 2015-06-26 (Jesse Hall)
-
Add usage flags to VkSwapchainCreateInfoKHR
-
-
Revision 34, 2015-06-26 (Ian Elliott)
-
Rename vkQueuePresentKHR()'s 2nd parameter to "pPresentInfo", for consistency with other functions.
-
-
Revision 35, 2015-06-26 (Jason Ekstrand)
-
Merged the VkRotationFlagBitsKHR and VkMirrorFlagBitsKHR enums into a single VkSurfaceTransformFlagBitsKHR enum.
-
-
Revision 36, 2015-06-26 (Jason Ekstrand)
-
Added a VkSurfaceTransformKHR enum that is not a bitmask. Each value in VkSurfaceTransformKHR corresponds directly to one of the bits in VkSurfaceTransformFlagBitsKHR so transforming from one to the other is easy. Having a separate enum means that currentTransform and preTransform are now unambiguous by definition.
-
-
Revision 37, 2015-06-29 (Ian Elliott)
-
Corrected one of the signatures of vkAcquireNextImageKHR, which had the last two parameters switched from what it is elsewhere in the specification and header files.
-
-
Revision 38, 2015-06-30 (Ian Elliott)
-
Corrected a typo in description of the vkGetSwapchainInfoKHR() function.
-
Corrected a typo in header file comment for VkPresentInfoKHR::sType.
-
-
Revision 39, 2015-07-07 (Daniel Rakos)
-
Added error section describing when each error is expected to be reported.
-
Replaced bool32_t with VkBool32.
-
-
Revision 40, 2015-07-10 (Ian Elliott)
-
Updated to work with version 138 of the "vulkan.h" header. This includes declaring the VkSwapchainKHR type using the new VK_DEFINE_NONDISP_HANDLE macro, and no longer extending VkObjectType (which was eliminated).
-
-
Revision 41 2015-07-09 (Mathias Heyer)
-
Added color space language.
-
-
Revision 42, 2015-07-10 (Daniel Rakos)
-
Updated query mechanism to reflect the convention changes done in the core spec.
-
Removed "queue" from the name of VK_STRUCTURE_TYPE_QUEUE_PRESENT_INFO_KHR to be consistent with the established naming convention.
-
Removed reference to the no longer existing VkObjectType enum.
-
-
Revision 43, 2015-07-17 (Daniel Rakos)
-
Added support for concurrent sharing of swapchain images across queue families.
-
Updated sample code based on recent changes
-
-
Revision 44, 2015-07-27 (Ian Elliott)
-
Noted that support for VK_PRESENT_MODE_FIFO_KHR is required. That is ICDs may optionally support IMMEDIATE and MAILBOX, but must support FIFO.
-
-
Revision 45, 2015-08-07 (Ian Elliott)
-
Corrected a typo in spec file (type and variable name had wrong case for the imageColorSpace member of the VkSwapchainCreateInfoKHR struct).
-
Corrected a typo in header file (last parameter in PFN_vkGetSurfacePropertiesKHR was missing "KHR" at the end of type: VkSurfacePropertiesKHR).
-
-
Revision 46, 2015-08-20 (Ian Elliott)
-
Renamed this extension and all of its enumerations, types, functions, etc. This makes it compliant with the proposed standard for Vulkan extensions.
-
Switched from "revision" to "version", including use of the VK_MAKE_VERSION macro in the header file.
-
Made improvements to several descriptions.
-
Changed the status of several issues from PROPOSED to RESOLVED, leaving no unresolved issues.
-
Resolved several TODOs, did miscellaneous cleanup, etc.
-
-
Revision 47, 2015-08-20 (Ian Elliott—porting a 2015-07-29 change from James Jones)
-
Moved the surface transform enums to VK_WSI_swapchain so they could be re-used by VK_WSI_display.
-
-
Revision 48, 2015-09-01 (James Jones)
-
Various minor cleanups.
-
-
Revision 49, 2015-09-01 (James Jones)
-
Restore single-field revision number.
-
-
Revision 50, 2015-09-01 (James Jones)
-
Update Example #4 to include code that illustrates how to use the oldSwapchain field.
-
-
Revision 51, 2015-09-01 (James Jones)
-
Fix example code compilation errors.
-
-
Revision 52, 2015-09-08 (Matthaeus G. Chajdas)
-
Corrected a typo.
-
-
Revision 53, 2015-09-10 (Alon Or-bach)
-
Removed underscore from SWAP_CHAIN left in VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR.
-
-
Revision 54, 2015-09-11 (Jesse Hall)
-
Described the execution and memory coherence requirements for image transitions to and from VK_IMAGE_LAYOUT_PRESENT_SOURCE_KHR.
-
-
Revision 55, 2015-09-11 (Ray Smith)
-
Added errors for destroying and binding memory to presentable images
-
-
Revision 56, 2015-09-18 (James Jones)
-
Added fence argument to vkAcquireNextImageKHR
-
Added example of how to meter a host thread based on presentation rate.
-
-
Revision 57, 2015-09-26 (Jesse Hall)
-
Replace VkSurfaceDescriptionKHR with VkSurfaceKHR.
-
Added issue 25 with agreed resolution.
-
-
Revision 58, 2015-09-28 (Jesse Hall)
-
Renamed from VK_EXT_KHR_device_swapchain to VK_EXT_KHR_swapchain.
-
-
Revision 59, 2015-09-29 (Ian Elliott)
-
Changed vkDestroySwapchainKHR() to return void.
-
-
Revision 60, 2015-10-01 (Jeff Vigil)
-
Added error result VK_ERROR_SURFACE_LOST_KHR.
-
-
Revision 61, 2015-10-05 (Jason Ekstrand)
-
Added the VkCompositeAlpha enum and corresponding structure fields.
-
-
Revision 62, 2015-10-12 (Daniel Rakos)
-
Added VK_PRESENT_MODE_FIFO_RELAXED_KHR.
-
-
Revision 63, 2015-10-15 (Daniel Rakos)
-
Moved surface capability queries to VK_EXT_KHR_surface.
-
-
Revision 64, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_swapchain to VK_KHR_swapchain.
-
-
Revision 65, 2015-10-28 (Ian Elliott)
-
Added optional pResult member to VkPresentInfoKHR, so that per-swapchain results can be obtained from vkQueuePresentKHR().
-
-
Revision 66, 2015-11-03 (Daniel Rakos)
-
Added allocation callbacks to create and destroy functions.
-
Updated resource transition language.
-
Updated sample code.
-
-
Revision 67, 2015-11-10 (Jesse Hall)
-
Add reserved flags bitmask to VkSwapchainCreateInfoKHR.
-
Modify naming and member ordering to match API style conventions, and so the VkSwapchainCreateInfoKHR image property members mirror corresponding VkImageCreateInfo members but with an 'image' prefix.
-
Make VkPresentInfoKHR::pResults non-const; it is an output array parameter.
-
Make pPresentInfo parameter to vkQueuePresentKHR const.
-
-
Revision 68, 2016-04-05 (Ian Elliott)
-
Moved the "validity" include for vkAcquireNextImage to be in its proper place, after the prototype and list of parameters.
-
Clarified language about presentable images, including how they are acquired, when applications can and cannot use them, etc. As part of this, removed language about "ownership" of presentable images, and replaced it with more-consistent language about presentable images being "acquired" by the application.
-
-
2016-08-23 (Ian Elliott)
-
Update the example code, to use the final API command names, to not have so many characters per line, and to split out a new example to show how to obtain function pointers. This code is more similar to the LunarG "cube" demo program.
-
-
2016-08-25 (Ian Elliott)
-
A note was added at the beginning of the example code, stating that it will be removed from future versions of the appendix.
-
-
Revision 69, 2017-09-07 (Tobias Hector)
-
Added interactions with Vulkan 1.1
-
-
Revision 70, 2017-10-06 (Ian Elliott)
-
Corrected interactions with Vulkan 1.1
-
VK_KHR_swapchain_mutable_format
- Name String
-
VK_KHR_swapchain_mutable_format
- Extension Type
-
Device extension
- Registered Extension Number
-
201
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_swapchain
-
Requires
VK_KHR_maintenance2
-
Requires
VK_KHR_image_format_list
-
- Contact
-
-
Daniel Rakos drakos-arm
-
- Last Modified Date
-
2018-03-28
- IP Status
-
No known IP claims.
- Contributors
-
-
Jason Ekstrand, Intel
-
Jan-Harald Fredriksen, ARM
-
Jesse Hall, Google
-
Daniel Rakos, AMD
-
Ray Smith, ARM
-
Short Description
Allows processing of swapchain images as different formats to that used by the window system, which is particularly useful for switching between sRGB and linear RGB formats.
Description
This extension adds a new swapchain creation flag that enables creating image views from presentable images with a different format than the one used to create the swapchain.
New Object Types
None.
New Enum Constants
-
Extending VkSwapchainCreateFlagBitsKHR:
-
VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR
-
New Enums
None.
New Structures
None.
New Functions
None.
Issues
1) Are there any new capabilities needed?
RESOLVED: No. It is expected that all implementations exposing this extension support swapchain image format mutability.
2) Do we need a separate VK_SWAPCHAIN_CREATE_EXTENDED_USAGE_BIT_KHR
?
RESOLVED: No.
This extension requires VK_KHR_maintenance2
and presentable images of
swapchains created with VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR
are
created internally in a way equivalent to specifying both
VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT
and
VK_IMAGE_CREATE_EXTENDED_USAGE_BIT_KHR
.
3) Do we need a separate structure to allow specifying an image format list for swapchains?
RESOLVED: No.
We simply use the same VkImageFormatListCreateInfoKHR structure
introduced by VK_KHR_image_format_list
.
The structure is required to be included in the pNext
chain of
VkSwapchainCreateInfoKHR for swapchains created with
VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR
.
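A sketch of this pNext chaining; the two formats listed are illustrative, and the remaining VkSwapchainCreateInfoKHR members are assumed to be filled in elsewhere by the application:
VkFormat viewFormats[2] = {
    VK_FORMAT_B8G8R8A8_UNORM,
    VK_FORMAT_B8G8R8A8_SRGB
};

VkImageFormatListCreateInfoKHR formatList = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO_KHR,
    .pNext = NULL,
    .viewFormatCount = 2,
    .pViewFormats = viewFormats
};

VkSwapchainCreateInfoKHR swapchainCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR,
    .pNext = &formatList,
    .flags = VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR,
    .imageFormat = VK_FORMAT_B8G8R8A8_UNORM,
    .imageColorSpace = VK_COLOR_SPACE_SRGB_NONLINEAR_KHR,
    // remaining members set to application-desired values
};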
Version History
-
Revision 1, 2018-03-28 (Daniel Rakos)
-
Internal revisions.
-
VK_KHR_wayland_surface
- Name String
-
VK_KHR_wayland_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
7
- Revision
-
6
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Jesse Hall critsec
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2015-11-28
- IP Status
-
No known IP claims.
- Contributors
-
-
Patrick Doane, Blizzard
-
Jason Ekstrand, Intel
-
Ian Elliott, LunarG
-
Courtney Goeltzenleuchter, LunarG
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Antoine Labour, Google
-
Jon Leech, Khronos
-
David Mao, AMD
-
Norbert Nopper, Freescale
-
Alon Or-bach, Samsung
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Ray Smith, ARM
-
Jeff Vigil, Qualcomm
-
Chia-I Wu, LunarG
-
The VK_KHR_wayland_surface
extension is an instance extension.
It provides a mechanism to create a VkSurfaceKHR object (defined by
the VK_KHR_surface
extension) that refers to a Wayland
wl_surface
, as well as a query to determine support for rendering to a
Wayland compositor.
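A sketch of creating such a surface; instance, waylandDisplay, and waylandSurface are assumed to have been obtained by the application from its Vulkan and Wayland windowing code:
VkWaylandSurfaceCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_WAYLAND_SURFACE_CREATE_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .display = waylandDisplay,   // struct wl_display*
    .surface = waylandSurface    // struct wl_surface*
};

VkSurfaceKHR surface;
VkResult result = vkCreateWaylandSurfaceKHR(instance, &createInfo, NULL, &surface);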
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_WAYLAND_SURFACE_CREATE_INFO_KHR
-
New Enums
None
New Structures
Issues
1) Does Wayland need a way to query for compatibility between a particular
physical device and a specific Wayland display? This would be a more general
query than vkGetPhysicalDeviceSurfaceSupportKHR: if the
Wayland-specific query returned VK_TRUE
for a (VkPhysicalDevice,
struct wl_display*
) pair, then the physical device could be assumed to
support presentation to any VkSurfaceKHR for surfaces on the display.
RESOLVED: Yes. vkGetPhysicalDeviceWaylandPresentationSupportKHR was added to address this issue.
2) Should we require surfaces created with vkCreateWaylandSurfaceKHR
to support the VK_PRESENT_MODE_MAILBOX_KHR
present mode?
RESOLVED: Yes.
Wayland is an inherently mailbox window system and mailbox support is
required for some Wayland compositor interactions to work as expected.
While handling these interactions may be possible with
VK_PRESENT_MODE_FIFO_KHR
, it is much more difficult to do without
deadlock and requiring all Wayland applications to be able to support
implementations which only support VK_PRESENT_MODE_FIFO_KHR
would be
an onerous restriction on application developers.
Version History
-
Revision 1, 2015-09-23 (Jesse Hall)
-
Initial draft, based on the previous contents of VK_EXT_KHR_swapchain (later renamed VK_EXT_KHR_surface).
-
-
Revision 2, 2015-10-02 (James Jones)
-
Added vkGetPhysicalDeviceWaylandPresentationSupportKHR() to resolve issue #1.
-
Adjusted wording of issue #1 to match the agreed-upon solution.
-
Renamed "window" parameters to "surface" to match Wayland conventions.
-
-
Revision 3, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_wayland_surface to VK_KHR_wayland_surface.
-
-
Revision 4, 2015-11-03 (Daniel Rakos)
-
Added allocation callbacks to vkCreateWaylandSurfaceKHR.
-
-
Revision 5, 2015-11-28 (Daniel Rakos)
-
Updated the surface create function to take a pCreateInfo structure.
-
-
Revision 6, 2017-02-08 (Jason Ekstrand)
-
Added the requirement that implementations support VK_PRESENT_MODE_MAILBOX_KHR.
-
Added wording about interactions between vkQueuePresentKHR and the Wayland requests sent to the compositor.
-
VK_KHR_win32_keyed_mutex
- Name String
-
VK_KHR_win32_keyed_mutex
- Extension Type
-
Device extension
- Registered Extension Number
-
76
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_memory_win32
-
- Contact
-
-
Carsten Rohde crohde
-
- Last Modified Date
-
2016-10-21
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Carsten Rohde, NVIDIA
-
Applications that wish to import Direct3D 11 memory objects into the Vulkan API may wish to use the native keyed mutex mechanism to synchronize access to the memory between Vulkan and Direct3D. This extension provides a way for an application to access the keyed mutex associated with an imported Vulkan memory object when submitting command buffers to a queue.
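A sketch of chaining the keyed mutex information into a queue submission; memory is assumed to be a VkDeviceMemory object imported from a Direct3D 11 resource, the key and timeout values are illustrative, and commandBuffer/queue are application-created:
uint64_t acquireKey = 0, releaseKey = 1;
uint32_t acquireTimeoutMs = 1000;

VkWin32KeyedMutexAcquireReleaseInfoKHR keyedMutexInfo = {
    .sType = VK_STRUCTURE_TYPE_WIN32_KEYED_MUTEX_ACQUIRE_RELEASE_INFO_KHR,
    .pNext = NULL,
    .acquireCount = 1,
    .pAcquireSyncs = &memory,
    .pAcquireKeys = &acquireKey,
    .pAcquireTimeouts = &acquireTimeoutMs,
    .releaseCount = 1,
    .pReleaseSyncs = &memory,
    .pReleaseKeys = &releaseKey
};

VkSubmitInfo submitInfo = {
    .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
    .pNext = &keyedMutexInfo,
    .commandBufferCount = 1,
    .pCommandBuffers = &commandBuffer
};

VkResult result = vkQueueSubmit(queue, 1, &submitInfo, VK_NULL_HANDLE);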
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_WIN32_KEYED_MUTEX_ACQUIRE_RELEASE_INFO_KHR
New Enums
None.
New Structs
New Functions
None.
Issues
None.
VK_KHR_win32_surface
- Name String
-
VK_KHR_win32_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
10
- Revision
-
6
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Jesse Hall critsec
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2017-04-24
- IP Status
-
No known IP claims.
- Contributors
-
-
Patrick Doane, Blizzard
-
Jason Ekstrand, Intel
-
Ian Elliott, LunarG
-
Courtney Goeltzenleuchter, LunarG
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Antoine Labour, Google
-
Jon Leech, Khronos
-
David Mao, AMD
-
Norbert Nopper, Freescale
-
Alon Or-bach, Samsung
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Ray Smith, ARM
-
Jeff Vigil, Qualcomm
-
Chia-I Wu, LunarG
-
The VK_KHR_win32_surface
extension is an instance extension.
It provides a mechanism to create a VkSurfaceKHR object (defined by
the VK_KHR_surface
extension) that refers to a Win32 HWND
, as
well as a query to determine support for rendering to the windows desktop.
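A sketch of creating such a surface; instance, hInstance, and hWnd are assumed to come from the application's Vulkan and Win32 setup code:
VkWin32SurfaceCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .hinstance = hInstance,
    .hwnd = hWnd
};

VkSurfaceKHR surface;
VkResult result = vkCreateWin32SurfaceKHR(instance, &createInfo, NULL, &surface);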
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR
-
New Enums
None
New Structures
Issues
1) Does Win32 need a way to query for compatibility between a particular physical device and a specific screen? Compatibility between a physical device and a window generally only depends on what screen the window is on. However, there is not an obvious way to identify a screen without already having a window on the screen.
RESOLVED: No. While it may be useful, there is not a clear way to do this on Win32. However, a method was added to query support for presenting to the windows desktop as a whole.
2) If a native window object (HWND
) is used by one graphics API, and
then is later used by a different graphics API (one of which is Vulkan), can
these uses interfere with each other?
RESOLVED: Yes.
Use of a window object by multiple graphics APIs results in undefined behavior. Such use may succeed when using one Vulkan implementation but fail when using a different Vulkan implementation. Potential failures include:
-
Creating then destroying a flip presentation model DXGI swapchain on a window object can prevent vkCreateSwapchainKHR from succeeding on the same window object.
-
Creating then destroying a VkSwapchainKHR on a window object can prevent creation of a bitblt model DXGI swapchain on the same window object.
-
Creating then destroying a VkSwapchainKHR on a window object can effectively SetPixelFormat to a different format than the format chosen by an OpenGL application.
-
Creating then destroying a VkSwapchainKHR on a window object on one VkPhysicalDevice can prevent vkCreateSwapchainKHR from succeeding on the same window object, but on a different VkPhysicalDevice that is associated with a different Vulkan ICD.
In all cases the problem can be worked around by creating a new window object.
Technical details include:
-
Creating a DXGI swapchain over a window object can alter the object for the remainder of its lifetime. The alteration persists even after the DXGI swapchain has been destroyed. This alteration can make it impossible for a conformant Vulkan implementation to create a VkSwapchainKHR over the same window object. Mention of this alteration can be found in the remarks section of the MSDN documentation for DXGI_SWAP_EFFECT.
-
Calling GDI’s SetPixelFormat (needed by OpenGL’s WGL layer) on a window object alters the object for the remainder of its lifetime. The MSDN documentation for SetPixelFormat explains that a window object’s pixel format can be set only one time.
-
Creating a VkSwapchainKHR over a window object can alter the object for the remainder of its lifetime. Either of the above alterations may occur as a side effect of creating a VkSwapchainKHR.
Version History
-
Revision 1, 2015-09-23 (Jesse Hall)
-
Initial draft, based on the previous contents of VK_EXT_KHR_swapchain (later renamed VK_EXT_KHR_surface).
-
-
Revision 2, 2015-10-02 (James Jones)
-
Added presentation support query for win32 desktops.
-
-
Revision 3, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_win32_surface to VK_KHR_win32_surface.
-
-
Revision 4, 2015-11-03 (Daniel Rakos)
-
Added allocation callbacks to vkCreateWin32SurfaceKHR.
-
-
Revision 5, 2015-11-28 (Daniel Rakos)
-
Updated the surface create function to take a pCreateInfo structure.
-
-
Revision 6, 2017-04-24 (Jeff Juliano)
-
Add issue 2 addressing reuse of a native window object in a different Graphics API, or by a different Vulkan ICD.
-
VK_KHR_xcb_surface
- Name String
-
VK_KHR_xcb_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
6
- Revision
-
6
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Jesse Hall critsec
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2015-11-28
- IP Status
-
No known IP claims.
- Contributors
-
-
Patrick Doane, Blizzard
-
Jason Ekstrand, Intel
-
Ian Elliott, LunarG
-
Courtney Goeltzenleuchter, LunarG
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Antoine Labour, Google
-
Jon Leech, Khronos
-
David Mao, AMD
-
Norbert Nopper, Freescale
-
Alon Or-bach, Samsung
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Ray Smith, ARM
-
Jeff Vigil, Qualcomm
-
Chia-I Wu, LunarG
-
The VK_KHR_xcb_surface
extension is an instance extension.
It provides a mechanism to create a VkSurfaceKHR object (defined by
the VK_KHR_surface
extension) that refers to an X11 Window
, using
the XCB client-side library, as well as a query to determine support for
rendering via XCB.
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_XCB_SURFACE_CREATE_INFO_KHR
-
New Enums
None
New Structures
Issues
1) Does XCB need a way to query for compatibility between a particular
physical device and a specific screen? This would be a more general query
than vkGetPhysicalDeviceSurfaceSupportKHR: If it returned
VK_TRUE
, then the physical device could be assumed to support
presentation to any window on that screen.
RESOLVED: Yes, this is needed for toolkits that want to create a VkDevice before creating a window. To ensure the query is reliable, it must be made against a particular X visual rather than the screen in general.
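A sketch of the query and of surface creation; physicalDevice, queueFamilyIndex, connection, visualId, window, and instance are assumed to come from the application's Vulkan and XCB setup code:
// Can this queue family present to windows created with this visual?
VkBool32 supported = vkGetPhysicalDeviceXcbPresentationSupportKHR(
    physicalDevice, queueFamilyIndex, connection, visualId);

VkXcbSurfaceCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_XCB_SURFACE_CREATE_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .connection = connection,   // xcb_connection_t*
    .window = window            // xcb_window_t
};

VkSurfaceKHR surface;
VkResult result = vkCreateXcbSurfaceKHR(instance, &createInfo, NULL, &surface);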
Version History
-
Revision 1, 2015-09-23 (Jesse Hall)
-
Initial draft, based on the previous contents of VK_EXT_KHR_swapchain (later renamed VK_EXT_KHR_surface).
-
-
Revision 2, 2015-10-02 (James Jones)
-
Added presentation support query for an (xcb_connection_t*, xcb_visualid_t) pair.
-
Removed "root" parameter from CreateXcbSurfaceKHR(), as it is redundant when a window on the same screen is specified as well.
-
Adjusted wording of issue #1 and added agreed upon resolution.
-
-
Revision 3, 2015-10-14 (Ian Elliott)
-
Removed "root" parameter from CreateXcbSurfaceKHR() in one more place.
-
-
Revision 4, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_xcb_surface to VK_KHR_xcb_surface.
-
-
Revision 5, 2015-10-23 (Daniel Rakos)
-
Added allocation callbacks to vkCreateXcbSurfaceKHR.
-
-
Revision 6, 2015-11-28 (Daniel Rakos)
-
Updated the surface create function to take a pCreateInfo structure.
-
VK_KHR_xlib_surface
- Name String
-
VK_KHR_xlib_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
5
- Revision
-
6
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Jesse Hall critsec
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2015-11-28
- IP Status
-
No known IP claims.
- Contributors
-
-
Patrick Doane, Blizzard
-
Jason Ekstrand, Intel
-
Ian Elliott, LunarG
-
Courtney Goeltzenleuchter, LunarG
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Antoine Labour, Google
-
Jon Leech, Khronos
-
David Mao, AMD
-
Norbert Nopper, Freescale
-
Alon Or-bach, Samsung
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Ray Smith, ARM
-
Jeff Vigil, Qualcomm
-
Chia-I Wu, LunarG
-
The VK_KHR_xlib_surface
extension is an instance extension.
It provides a mechanism to create a VkSurfaceKHR object (defined by
the VK_KHR_surface
extension) that refers to an X11 Window
, using
the Xlib client-side library, as well as a query to determine support for
rendering via Xlib.
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_XLIB_SURFACE_CREATE_INFO_KHR
-
New Enums
None
New Structures
Issues
1) Does X11 need a way to query for compatibility between a particular
physical device and a specific screen? This would be a more general query
than vkGetPhysicalDeviceSurfaceSupportKHR : if it returned
VK_TRUE
, then the physical device could be assumed to support
presentation to any window on that screen.
RESOLVED: Yes, this is needed for toolkits that want to create a VkDevice before creating a window. To ensure the query is reliable, it must be made against a particular X visual rather than the screen in general.
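A sketch of the equivalent Xlib query and surface creation; physicalDevice, queueFamilyIndex, dpy, visualId, window, and instance are assumed to come from the application's Vulkan and Xlib setup code:
// Can this queue family present to windows created with this visual?
VkBool32 supported = vkGetPhysicalDeviceXlibPresentationSupportKHR(
    physicalDevice, queueFamilyIndex, dpy, visualId);

VkXlibSurfaceCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_XLIB_SURFACE_CREATE_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .dpy = dpy,          // Display*
    .window = window     // X11 Window
};

VkSurfaceKHR surface;
VkResult result = vkCreateXlibSurfaceKHR(instance, &createInfo, NULL, &surface);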
Version History
-
Revision 1, 2015-09-23 (Jesse Hall)
-
Initial draft, based on the previous contents of VK_EXT_KHR_swapchain (later renamed VK_EXT_KHR_surface).
-
-
Revision 2, 2015-10-02 (James Jones)
-
Added presentation support query for (Display*, VisualID) pair.
-
Removed "root" parameter from CreateXlibSurfaceKHR(), as it is redundant when a window on the same screen is specified as well.
-
Added appropriate X errors.
-
Adjusted wording of issue #1 and added agreed upon resolution.
-
-
Revision 3, 2015-10-14 (Ian Elliott)
-
Renamed this extension from VK_EXT_KHR_x11_surface to VK_EXT_KHR_xlib_surface.
-
-
Revision 4, 2015-10-26 (Ian Elliott)
-
Renamed from VK_EXT_KHR_xlib_surface to VK_KHR_xlib_surface.
-
-
Revision 5, 2015-11-03 (Daniel Rakos)
-
Added allocation callbacks to vkCreateXlibSurfaceKHR.
-
-
Revision 6, 2015-11-28 (Daniel Rakos)
-
Updated the surface create function to take a pCreateInfo structure.
-
VK_EXT_acquire_xlib_display
- Name String
-
VK_EXT_acquire_xlib_display
- Extension Type
-
Instance extension
- Registered Extension Number
-
90
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_EXT_direct_mode_display
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-12-13
- IP Status
-
No known IP claims.
- Contributors
-
-
Dave Airlie, Red Hat
-
Pierre Boudier, NVIDIA
-
James Jones, NVIDIA
-
Damien Leone, NVIDIA
-
Pierre-Loup Griffais, Valve
-
Liam Middlebrook, NVIDIA
-
Daniel Vetter, Intel
-
This extension allows an application to take exclusive control on a display currently associated with an X11 screen. When control is acquired, the display will be deassociated from the X11 screen until control is released or the specified display connection is closed. Essentially, the X11 screen will behave as if the monitor has been unplugged until control is released.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
Issues
1) Should vkAcquireXlibDisplayEXT take an RandR display ID, or a Vulkan display handle as input?
RESOLVED: A Vulkan display handle. Otherwise there would be no way to specify handles to displays that had been “blacklisted” or prevented from being included in the X11 display list by some native platform or vendor-specific mechanism.
2) How does an application figure out which RandR display corresponds to a Vulkan display?
RESOLVED: A new function, vkGetRandROutputDisplayEXT, is introduced for this purpose.
3) Should vkGetRandROutputDisplayEXT be part of this extension, or a general Vulkan + RandR or Vulkan + Xlib extension?
RESOLVED: To avoid yet another extension, include it in this extension.
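A sketch combining the two functions; dpy is assumed to be an open Xlib display connection and rrOutput an RandR output identifier obtained through the application's own RandR queries:
// Map the RandR output to the corresponding Vulkan display handle.
VkDisplayKHR display;
VkResult result = vkGetRandROutputDisplayEXT(physicalDevice, dpy, rrOutput, &display);

if (result == VK_SUCCESS) {
    // Take exclusive control of the display away from the X11 screen.
    result = vkAcquireXlibDisplayEXT(physicalDevice, dpy, display);
}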
Version History
-
Revision 1, 2016-12-13 (James Jones)
-
Initial draft
-
VK_EXT_astc_decode_mode
- Name String
-
VK_EXT_astc_decode_mode
- Extension Type
-
Device extension
- Registered Extension Number
-
68
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jan-Harald Fredriksen janharaldfredriksen-arm
-
- Last Modified Date
-
2018-08-07
- Contributors
-
-
Jan-Harald Fredriksen, Arm
-
The existing specification requires that low dynamic range (LDR) ASTC textures are decompressed to FP16 values per component. In many cases, decompressing LDR textures to a lower precision intermediate result gives acceptable image quality. Source material for LDR textures is typically authored as 8-bit UNORM values, so decoding to FP16 values adds little value. On the other hand, reducing precision of the decoded result reduces the size of the decompressed data, potentially improving texture cache performance and saving power.
The goal of this extension is to enable this efficiency gain on existing ASTC texture data. This is achieved by giving the application the ability to select the intermediate decoding precision.
Three decoding options are provided:
-
Decode to VK_FORMAT_R16G16B16A16_SFLOAT precision: This is the default, and matches the required behavior in the core API.
-
Decode to VK_FORMAT_R8G8B8A8_UNORM precision: This is provided as an option in LDR mode.
-
Decode to VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 precision: This is provided as an option in both LDR and HDR mode. In this mode, negative values cannot be represented and are clamped to zero. The alpha component is ignored, and the results are as if alpha was 1.0. This decode mode is optional and support can be queried via the physical device properties.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_IMAGE_VIEW_ASTC_DECODE_MODE_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ASTC_DECODE_FEATURES_EXT
-
New Enums
None.
New Functions
None.
Issues
1) Are implementations allowed to decode at a higher precision than what is requested?
RESOLUTION: No. If we allow this, then this extension could be exposed on all implementations that support ASTC. But developers would have no way of knowing what precision was actually used, and thus whether the image quality is sufficient at reduced precision.
2) Should the decode mode be image view state and/or sampler state?
RESOLUTION: Image view state only. Some implementations treat the different decode modes as different texture formats.
Example
Create an image view that decodes to VK_FORMAT_R8G8B8A8_UNORM precision:
VkImageViewASTCDecodeModeEXT decodeMode =
{
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_ASTC_DECODE_MODE_EXT,
    .pNext = NULL,
    .decodeMode = VK_FORMAT_R8G8B8A8_UNORM
};

VkImageViewCreateInfo createInfo =
{
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .pNext = &decodeMode,
    .format = VK_FORMAT_ASTC_8x8_UNORM_BLOCK,
    // flags, image, viewType, components, and subresourceRange set to
    // application-desired values
};

VkImageView imageView;
VkResult result = vkCreateImageView(device, &createInfo, NULL, &imageView);
Version History
-
Revision 1, 2018-08-07 (Jan-Harald Fredriksen)
-
Initial revision
-
VK_EXT_blend_operation_advanced
- Name String
-
VK_EXT_blend_operation_advanced
- Extension Type
-
Device extension
- Registered Extension Number
-
149
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-06-12
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension adds a number of “advanced” blending operations that can be used to perform new color blending operations, many of which are more complex than the standard blend modes provided by unextended Vulkan. This extension requires different styles of usage, depending on the level of hardware support and the enabled features:
-
If VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT::advancedBlendCoherentOperations is VK_FALSE, the new blending operations are supported, but a memory dependency must separate each advanced blend operation on a given sample. VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT is used to synchronize reads using advanced blend operations.
-
If VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT::advancedBlendCoherentOperations is VK_TRUE, advanced blend operations obey primitive order just like basic blend operations.
In unextended Vulkan, the set of blending operations is limited, and can be
expressed very simply.
The VK_BLEND_OP_MIN
and VK_BLEND_OP_MAX
blend operations simply
compute component-wise minimums or maximums of source and destination color
components.
The VK_BLEND_OP_ADD
, VK_BLEND_OP_SUBTRACT
, and
VK_BLEND_OP_REVERSE_SUBTRACT
modes multiply the source and destination
colors by source and destination factors and either add the two products
together or subtract one from the other.
This limited set of operations supports many common blending operations but
precludes the use of more sophisticated transparency and blending operations
commonly available in many dedicated imaging APIs.
This extension provides a number of new “advanced” blending operations.
Unlike traditional blending operations using VK_BLEND_OP_ADD
, these
blending equations do not use source and destination factors specified by
VkBlendFactor.
Instead, each blend operation specifies a complete equation based on the
source and destination colors.
These new blend operations are used for both RGB and alpha components; they
must not be used to perform separate RGB and alpha blending (via different
values of color and alpha VkBlendOp).
These blending operations are performed using premultiplied colors, where
RGB colors can be considered premultiplied or non-premultiplied by alpha,
according to the srcPremultiplied
and dstPremultiplied
members
of VkPipelineColorBlendAdvancedStateCreateInfoEXT.
If a color is considered non-premultiplied, the (R,G,B) color components are
multiplied by the alpha component prior to blending.
For non-premultiplied color components in the range [0,1], the
corresponding premultiplied color component would have values in the range
[0 × A, 1 × A].
Many of these advanced blending equations are formulated where the result of
blending source and destination colors with partial coverage have three
separate contributions: from the portions covered by both the source and the
destination, from the portion covered only by the source, and from the
portion covered only by the destination.
The blend parameter
VkPipelineColorBlendAdvancedStateCreateInfoEXT::blendOverlap
can be used to specify a correlation between source and destination pixel
coverage.
If set to VK_BLEND_OVERLAP_CONJOINT_EXT
, the source and destination
are considered to have maximal overlap, as would be the case if drawing two
objects on top of each other.
If set to VK_BLEND_OVERLAP_DISJOINT_EXT
, the source and destination
are considered to have minimal overlap, as would be the case when rendering
a complex polygon tessellated into individual non-intersecting triangles.
If set to VK_BLEND_OVERLAP_UNCORRELATED_EXT
, the source and
destination coverage are assumed to have no spatial correlation within the
pixel.
In addition to the coherency issues on implementations not supporting
advancedBlendCoherentOperations
, this extension has several
limitations worth noting.
First, the new blend operations have a limit on the number of color
attachments they can be used with, as indicated by
VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT::advancedBlendMaxColorAttachments
.
Additionally, blending precision may be limited to 16-bit floating-point,
which may result in a loss of precision and dynamic range for framebuffer
formats with 32-bit floating-point components, and in a loss of precision
for formats with 12- and 16-bit signed or unsigned normalized integer
components.
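A sketch of configuring the premultiplication and overlap behavior described above; the structure is chained into the color blend state at pipeline creation, and the values shown are illustrative:
VkPipelineColorBlendAdvancedStateCreateInfoEXT advancedState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_ADVANCED_STATE_CREATE_INFO_EXT,
    .pNext = NULL,
    .srcPremultiplied = VK_TRUE,
    .dstPremultiplied = VK_TRUE,
    .blendOverlap = VK_BLEND_OVERLAP_UNCORRELATED_EXT
};

VkPipelineColorBlendStateCreateInfo colorBlendState = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
    .pNext = &advancedState,
    // pAttachments would select one of the advanced VkBlendOp values,
    // e.g. VK_BLEND_OP_MULTIPLY_EXT, for both color and alpha
};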
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_BLEND_OPERATION_ADVANCED_FEATURES_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_BLEND_OPERATION_ADVANCED_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_ADVANCED_STATE_CREATE_INFO_EXT
-
-
Extending VkAccessFlagBits:
-
VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT
-
-
Extending VkBlendOp:
-
VK_BLEND_OP_ZERO_EXT
-
VK_BLEND_OP_SRC_EXT
-
VK_BLEND_OP_DST_EXT
-
VK_BLEND_OP_SRC_OVER_EXT
-
VK_BLEND_OP_DST_OVER_EXT
-
VK_BLEND_OP_SRC_IN_EXT
-
VK_BLEND_OP_DST_IN_EXT
-
VK_BLEND_OP_SRC_OUT_EXT
-
VK_BLEND_OP_DST_OUT_EXT
-
VK_BLEND_OP_SRC_ATOP_EXT
-
VK_BLEND_OP_DST_ATOP_EXT
-
VK_BLEND_OP_XOR_EXT
-
VK_BLEND_OP_MULTIPLY_EXT
-
VK_BLEND_OP_SCREEN_EXT
-
VK_BLEND_OP_OVERLAY_EXT
-
VK_BLEND_OP_DARKEN_EXT
-
VK_BLEND_OP_LIGHTEN_EXT
-
VK_BLEND_OP_COLORDODGE_EXT
-
VK_BLEND_OP_COLORBURN_EXT
-
VK_BLEND_OP_HARDLIGHT_EXT
-
VK_BLEND_OP_SOFTLIGHT_EXT
-
VK_BLEND_OP_DIFFERENCE_EXT
-
VK_BLEND_OP_EXCLUSION_EXT
-
VK_BLEND_OP_INVERT_EXT
-
VK_BLEND_OP_INVERT_RGB_EXT
-
VK_BLEND_OP_LINEARDODGE_EXT
-
VK_BLEND_OP_LINEARBURN_EXT
-
VK_BLEND_OP_VIVIDLIGHT_EXT
-
VK_BLEND_OP_LINEARLIGHT_EXT
-
VK_BLEND_OP_PINLIGHT_EXT
-
VK_BLEND_OP_HARDMIX_EXT
-
VK_BLEND_OP_HSL_HUE_EXT
-
VK_BLEND_OP_HSL_SATURATION_EXT
-
VK_BLEND_OP_HSL_COLOR_EXT
-
VK_BLEND_OP_HSL_LUMINOSITY_EXT
-
VK_BLEND_OP_PLUS_EXT
-
VK_BLEND_OP_PLUS_CLAMPED_EXT
-
VK_BLEND_OP_PLUS_CLAMPED_ALPHA_EXT
-
VK_BLEND_OP_PLUS_DARKER_EXT
-
VK_BLEND_OP_MINUS_EXT
-
VK_BLEND_OP_MINUS_CLAMPED_EXT
-
VK_BLEND_OP_CONTRAST_EXT
-
VK_BLEND_OP_INVERT_OVG_EXT
-
VK_BLEND_OP_RED_EXT
-
VK_BLEND_OP_GREEN_EXT
-
VK_BLEND_OP_BLUE_EXT
-
New Enums
New Structures
New Functions
None.
Issues
None.
Version History
-
Revisions 1-2, 2017-06-12 (Jeff Bolz)
-
Internal revisions
-
VK_EXT_calibrated_timestamps
- Name String
-
VK_EXT_calibrated_timestamps
- Extension Type
-
Device extension
- Registered Extension Number
-
185
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2018-10-04
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Alan Harrison, AMD
-
Derrick Owens, AMD
-
Daniel Rakos, AMD
-
Keith Packard, Valve
-
This extension provides an interface to query calibrated timestamps obtained quasi simultaneously from two time domains.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT
-
New Enums
New Structures
Issues
1) Is the device timestamp value returned in the same time domain as the timestamp values written by vkCmdWriteTimestamp?
RESOLVED: Yes.
2) What time domain is the host timestamp returned in?
RESOLVED: A query is provided to determine the calibrateable time domains. The expected host time domain used on Windows is that of QueryPerformanceCounter, and on Linux that of CLOCK_MONOTONIC.
3) Should we support other time domain combinations than just one host and the device time domain?
RESOLVED: Supporting that would need the application to query the set of supported time domains, while supporting only one host and the device time domain would only need a query for the host time domain type. The proposed API chooses the general approach for the sake of extensibility.
4) Shouldn’t we use CLOCK_MONOTONIC_RAW instead of CLOCK_MONOTONIC?
RESOLVED: CLOCK_MONOTONIC is usable in a wider set of situations, however, it is subject to NTP adjustments so some use cases may prefer CLOCK_MONOTONIC_RAW. Thus this extension allows both to be exposed.
5) How can the application extrapolate future device timestamp values from the calibrated timestamp value?
RESOLVED: VkPhysicalDeviceLimits::timestampPeriod
makes it
possible to calculate future device timestamps as follows:
futureTimestamp = calibratedTimestamp + deltaNanoseconds / timestampPeriod
6) Can the host and device timestamp values drift apart over longer periods of time?
RESOLVED: Yes, especially as some time domains by definition allow for that to happen (e.g. CLOCK_MONOTONIC is subject to NTP adjustments). Thus it’s recommended that applications re-calibrate from time to time.
7) Should we add a query for reporting the maximum deviation of the timestamp values returned by calibrated timestamp queries?
RESOLVED: A global query seems inappropriate and difficult to enforce. However, it’s possible to return the maximum deviation any single calibrated timestamp query can have by sampling one of the time domains twice as follows:
timestampX = timestampX_before = SampleTimeDomain(X)
for each time domain Y != X
timestampY = SampleTimeDomain(Y)
timestampX_after = SampleTimeDomain(X)
maxDeviation = timestampX_after - timestampX_before
8) Can the maximum deviation reported ever be zero?
RESOLVED: Unless the tick of each clock corresponding to the set of time domains coincides and all clocks can literally be sampled simultaneously, there isn’t really a possibility for the maximum deviation to be zero, so by convention the maximum deviation is always at least the maximum of the length of the ticks of the set of time domains calibrated and thus can never be zero.
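As a brief illustration of issues 5 and 7 (a sketch only; deltaNanoseconds and timestampPeriod are hypothetical application-side values, and VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT is assumed to have been reported as calibrateable), an application could sample both domains and extrapolate a future device timestamp as follows:
extern VkDevice device;
extern float timestampPeriod; // from VkPhysicalDeviceLimits::timestampPeriod
// Must call extension functions through a function pointer:
PFN_vkGetCalibratedTimestampsEXT pfnGetCalibratedTimestampsEXT = (PFN_vkGetCalibratedTimestampsEXT)vkGetDeviceProcAddr(device, "vkGetCalibratedTimestampsEXT");
const VkCalibratedTimestampInfoEXT timestampInfos[2] = {
    { VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT, NULL, VK_TIME_DOMAIN_DEVICE_EXT },
    { VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT, NULL, VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT }
};
uint64_t timestamps[2];
uint64_t maxDeviation; // computed by the implementation as in the pseudocode of issue 7
pfnGetCalibratedTimestampsEXT(device, 2, timestampInfos, timestamps, &maxDeviation);
// Extrapolate a device timestamp deltaNanoseconds into the future (issue 5):
const double deltaNanoseconds = 16666666.0; // hypothetical, roughly one 60 Hz frame
uint64_t futureDeviceTimestamp = timestamps[0] + (uint64_t)(deltaNanoseconds / timestampPeriod);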
Version History
-
Revision 1, 2018-10-04 (Daniel Rakos)
-
Internal revisions.
-
VK_EXT_conditional_rendering
- Name String
-
VK_EXT_conditional_rendering
- Extension Type
-
Device extension
- Registered Extension Number
-
82
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Vikram Kushwaha vkushwaha
-
- Last Modified Date
-
2018-05-21
- IP Status
-
No known IP claims.
- Contributors
-
-
Vikram Kushwaha, NVIDIA
-
Daniel Rakos, AMD
-
Jesse Hall, Google
-
Jeff Bolz, NVIDIA
-
Piers Daniell, NVIDIA
-
Stuart Smith, Imagination Technologies
-
This extension allows the execution of one or more rendering commands to be conditional on a value in buffer memory. This may help an application reduce the latency by conditionally discarding rendering commands without application intervention. The conditional rendering commands are limited to draws, compute dispatches and clearing attachments within a conditional rendering block.
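As a minimal sketch (not taken from the extension itself), rendering commands can be wrapped in a conditional rendering block whose predicate is a 32-bit value in buffer memory; the device, command buffer, and buffer below are hypothetical, and the buffer must have been created with VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT.
extern VkDevice device;
extern VkCommandBuffer commandBuffer;
extern VkBuffer conditionBuffer; // holds the 32-bit predicate at offset 0
// Must call extension functions through a function pointer:
PFN_vkCmdBeginConditionalRenderingEXT pfnCmdBeginConditionalRenderingEXT = (PFN_vkCmdBeginConditionalRenderingEXT)vkGetDeviceProcAddr(device, "vkCmdBeginConditionalRenderingEXT");
PFN_vkCmdEndConditionalRenderingEXT pfnCmdEndConditionalRenderingEXT = (PFN_vkCmdEndConditionalRenderingEXT)vkGetDeviceProcAddr(device, "vkCmdEndConditionalRenderingEXT");
const VkConditionalRenderingBeginInfoEXT conditionalBegin = {
    VK_STRUCTURE_TYPE_CONDITIONAL_RENDERING_BEGIN_INFO_EXT, // sType
    NULL,                                                   // pNext
    conditionBuffer,                                        // buffer
    0,                                                      // offset
    0                                                       // flags
};
pfnCmdBeginConditionalRenderingEXT(commandBuffer, &conditionalBegin);
vkCmdDraw(commandBuffer, 3, 1, 0, 0); // discarded by the device if the predicate is zero
pfnCmdEndConditionalRenderingEXT(commandBuffer);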
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_COMMAND_BUFFER_INHERITANCE_CONDITIONAL_RENDERING_INFO_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CONDITIONAL_RENDERING_FEATURES_EXT
-
VK_STRUCTURE_TYPE_CONDITIONAL_RENDERING_BEGIN_INFO_EXT
-
-
Extending VkAccessFlagBits:
-
VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT
-
-
Extending VkBufferUsageFlagBits:
-
VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT
-
-
Extending VkPipelineStageFlagBits:
-
VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT
-
New Enums
Issues
1) Should conditional rendering affect copy and blit commands?
RESOLVED: Conditional rendering should not affect copies and blits.
2) Should secondary command buffers be allowed to execute while conditional rendering is active in the primary command buffer?
RESOLVED: The rendering commands in a secondary command buffer will be
affected by an active conditional rendering in the primary command buffer if the
conditionalRenderingEnable
is set to VK_TRUE
.
Conditional rendering must not be active in the primary command buffer if
conditionalRenderingEnable
is VK_FALSE
.
Examples
None.
Version History
-
Revision 1, 2018-04-19 (Vikram Kushwaha)
-
First Version
-
-
Revision 2, 2018-05-21 (Vikram Kushwaha)
-
Add new pipeline stage, access flags and limit conditional rendering to a subpass or entire renderpass.
-
VK_EXT_conservative_rasterization
- Name String
-
VK_EXT_conservative_rasterization
- Extension Type
-
Device extension
- Registered Extension Number
-
102
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2017-08-28
- Contributors
-
-
Daniel Koch, NVIDIA
-
Daniel Rakos, AMD
-
Jeff Bolz, NVIDIA
-
Slawomir Grajewski, Intel
-
Stu Smith, Imagination Technologies
-
This extension adds a new rasterization mode called conservative rasterization. There are two modes of conservative rasterization: overestimation and underestimation.
When overestimation is enabled, if any part of the primitive, including its edges, covers any part of the rectangular pixel area, including its sides, then a fragment is generated with all coverage samples turned on. This extension allows for some variation in implementations by accounting for differences in overestimation, where the generating primitive size is increased at each of its edges by some sub-pixel amount to further increase conservative pixel coverage. Implementations can allow the application to specify an extra overestimation beyond the base overestimation the implementation already does. It also allows implementations to either cull degenerate primitives or rasterize them.
When underestimation is enabled, fragments are only generated if the rectangular pixel area is fully covered by the generating primitive. If supported by the implementation, when a pixel rectangle is fully covered the fragment shader built-in input variable FullyCoveredEXT is set to true. The built-in variable works in either overestimation or underestimation mode.
Implementations can process degenerate triangles and lines by either discarding them or generating conservative fragments for them. Degenerate triangles are those that end up with zero area after the rasterizer quantizes them to the fixed-point pixel grid. Degenerate lines are those with zero length after quantization.
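A minimal sketch of enabling overestimation conservative rasterization at pipeline creation time follows; it assumes the extension is enabled and that an extra overestimation size of 0.0 is acceptable given the queried conservative rasterization properties.
const VkPipelineRasterizationConservativeStateCreateInfoEXT conservativeState = {
    VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_CONSERVATIVE_STATE_CREATE_INFO_EXT, // sType
    NULL,                                                // pNext
    0,                                                   // flags
    VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT, // conservativeRasterizationMode
    0.0f                                                 // extraPrimitiveOverestimationSize
};
VkPipelineRasterizationStateCreateInfo rasterizationState = {0};
rasterizationState.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rasterizationState.pNext = &conservativeState; // chain the conservative rasterization state
rasterizationState.lineWidth = 1.0f;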
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CONSERVATIVE_RASTERIZATION_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_CONSERVATIVE_STATE_CREATE_INFO_EXT
-
New Structures
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2017-08-28 (Piers Daniell)
-
Internal revisions
-
VK_EXT_debug_utils
- Name String
-
VK_EXT_debug_utils
- Extension Type
-
Instance extension
- Registered Extension Number
-
129
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Mark Young marky-lunarg
-
- Last Modified Date
-
2017-09-14
- Revision
-
1
- IP Status
-
No known IP claims.
- Dependencies
-
-
This extension is written against version 1.0 of the Vulkan API.
-
Requires VkObjectType
-
- Contributors
-
-
Mark Young, LunarG
-
Baldur Karlsson
-
Ian Elliott, Google
-
Courtney Goeltzenleuchter, Google
-
Karl Schultz, LunarG
-
Mark Lobodzinski, LunarG
-
Mike Schuchardt, LunarG
-
Jaakko Konttinen, AMD
-
Dan Ginsburg, Valve Software
-
Rolando Olivares, Epic Games
-
Dan Baker, Oxide Games
-
Kyle Spagnoli, NVIDIA
-
Jon Ashburn, LunarG
-
Due to the nature of the Vulkan interface, there is very little error
information available to the developer and application.
By using the VK_EXT_debug_utils
extension, developers can obtain more
information.
When combined with validation layers, even more detailed feedback on the
application’s use of Vulkan will be provided.
This extension provides the following capabilities:
-
The ability to create a debug messenger which will pass along debug messages to an application supplied callback.
-
The ability to identify specific Vulkan objects using a name or tag to improve tracking.
-
The ability to identify specific sections within a
VkQueue
or VkCommandBuffer
using labels to aid organization and offline analysis in external tools.
The main difference between this extension and VK_EXT_debug_report
and
VK_EXT_debug_marker
is that those extensions use
VkDebugReportObjectTypeEXT to identify objects.
This extension uses the core VkObjectType in place of
VkDebugReportObjectTypeEXT.
The primary reason for this move is that no future object type handle
enumeration values will be added to VkDebugReportObjectTypeEXT since
the creation of VkObjectType.
In addition, this extension combines the functionality of both
VK_EXT_debug_report
and VK_EXT_debug_marker
by allowing object
name and debug markers (now called labels) to be returned to the
application’s callback function.
This should assist in clarifying the details of a debug message including:
what objects are involved and potentially which location within a VkQueue or
VkCommandBuffer the message occurred.
New Object Types
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_NAME_INFO_EXT
-
VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_TAG_INFO_EXT
-
VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT
-
VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CALLBACK_DATA_EXT
-
VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT
-
-
Extending VkResult:
-
VK_ERROR_VALIDATION_FAILED_EXT
-
New Structures
New Functions
New Function Pointers
Examples
Example 1
VK_EXT_debug_utils
allows an application to register multiple callbacks
with any Vulkan component wishing to report debug information.
Some callbacks may log the information to a file, others may cause a debug
break point or other application defined behavior.
An application can register callbacks even when no validation layers are
enabled, but they will only be called for loader and, if implemented, driver
events.
To capture events that occur while creating or destroying an instance an
application can link a VkDebugUtilsMessengerCreateInfoEXT structure
to the pNext
element of the VkInstanceCreateInfo structure given
to vkCreateInstance.
This callback is only valid for the duration of the vkCreateInstance
and the vkDestroyInstance call.
Use vkCreateDebugUtilsMessengerEXT to create persistent callback
objects.
Example uses: Create three callback objects.
One will log errors and warnings to the debug console using Windows
OutputDebugString
.
The second will cause the debugger to break at that callback when an error
happens and the third will log warnings to stdout.
extern VkInstance instance;
VkResult res;
VkDebugUtilsMessengerEXT cb1, cb2, cb3;
// Must call extension functions through a function pointer:
PFN_vkCreateDebugUtilsMessengerEXT pfnCreateDebugUtilsMessengerEXT = (PFN_vkCreateDebugUtilsMessengerEXT)vkGetInstanceProcAddr(instance, "vkCreateDebugUtilsMessengerEXT");
PFN_vkDestroyDebugUtilsMessengerEXT pfnDestroyDebugUtilsMessengerEXT = (PFN_vkDestroyDebugUtilsMessengerEXT)vkGetInstanceProcAddr(instance, "vkDestroyDebugUtilsMessengerEXT");
VkDebugUtilsMessengerCreateInfoEXT callback1 = {
VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT, // sType
NULL, // pNext
0, // flags
VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT | // messageSeverity
VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT,
VK_DEBUG_UTILS_MESSAGE_TYPE_GENERAL_BIT_EXT | // messageType
VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT,
myOutputDebugString, // pfnUserCallback
NULL // pUserData
};
res = pfnCreateDebugUtilsMessengerEXT(instance, &callback1, NULL, &cb1);
if (res != VK_SUCCESS) {
// Do error handling for VK_ERROR_OUT_OF_MEMORY
}
callback1.messageSeverity = VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT;
callback1.pfnUserCallback = myDebugBreak;
callback1.pUserData = NULL;
res = pfnCreateDebugUtilsMessengerEXT(instance, &callback1, NULL, &cb2);
if (res != VK_SUCCESS) {
// Do error handling for VK_ERROR_OUT_OF_MEMORY
}
VkDebugUtilsMessengerCreateInfoEXT callback3 = {
VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT, // sType
NULL, // pNext
0, // flags
VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT, // messageSeverity
VK_DEBUG_UTILS_MESSAGE_TYPE_GENERAL_BIT_EXT | // messageType
VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT,
mystdOutLogger, // pfnUserCallback
NULL // pUserData
};
res = pfnCreateDebugUtilsMessengerEXT(instance, &callback3, NULL, &cb3);
if (res != VK_SUCCESS) {
// Do error handling for VK_ERROR_OUT_OF_MEMORY
}
...
// Remove callbacks when cleaning up
pfnDestroyDebugUtilsMessengerEXT(instance, cb1, NULL);
pfnDestroyDebugUtilsMessengerEXT(instance, cb2, NULL);
pfnDestroyDebugUtilsMessengerEXT(instance, cb3, NULL);
Example 2
Associate a name with an image, for easier debugging in external tools or with validation layers that can print a friendly name when referring to objects in error messages.
extern VkDevice device;
extern VkImage image;
// Must call extension functions through a function pointer:
PFN_vkSetDebugUtilsObjectNameEXT pfnSetDebugUtilsObjectNameEXT = (PFN_vkSetDebugUtilsObjectNameEXT)vkGetDeviceProcAddr(device, "vkSetDebugUtilsObjectNameEXT");
// Set a name on the image
const VkDebugUtilsObjectNameInfoEXT imageNameInfo =
{
VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_NAME_INFO_EXT, // sType
NULL, // pNext
VK_OBJECT_TYPE_IMAGE, // objectType
(uint64_t)image, // objectHandle
"Brick Diffuse Texture", // pObjectName
};
pfnSetDebugUtilsObjectNameEXT(device, &imageNameInfo);
// A subsequent error might print:
// Image 'Brick Diffuse Texture' (0xc0dec0dedeadbeef) is used in a
// command buffer with no memory bound to it.
Example 3
Annotating regions of a workload with naming information so that offline analysis tools can display a more usable visualization of the commands submitted.
extern VkDevice device;
extern VkQueue queue;
extern VkFence fence;
extern VkCommandBuffer commandBuffer;
// Must call extension functions through a function pointer:
PFN_vkQueueBeginDebugUtilsLabelEXT pfnQueueBeginDebugUtilsLabelEXT = (PFN_vkQueueBeginDebugUtilsLabelEXT)vkGetDeviceProcAddr(device, "vkQueueBeginDebugUtilsLabelEXT");
PFN_vkQueueEndDebugUtilsLabelEXT pfnQueueEndDebugUtilsLabelEXT = (PFN_vkQueueEndDebugUtilsLabelEXT)vkGetDeviceProcAddr(device, "vkQueueEndDebugUtilsLabelEXT");
PFN_vkCmdBeginDebugUtilsLabelEXT pfnCmdBeginDebugUtilsLabelEXT = (PFN_vkCmdBeginDebugUtilsLabelEXT)vkGetDeviceProcAddr(device, "vkCmdBeginDebugUtilsLabelEXT");
PFN_vkCmdEndDebugUtilsLabelEXT pfnCmdEndDebugUtilsLabelEXT = (PFN_vkCmdEndDebugUtilsLabelEXT)vkGetDeviceProcAddr(device, "vkCmdEndDebugUtilsLabelEXT");
PFN_vkCmdInsertDebugUtilsLabelEXT pfnCmdInsertDebugUtilsLabelEXT = (PFN_vkCmdInsertDebugUtilsLabelEXT)vkGetDeviceProcAddr(device, "vkCmdInsertDebugUtilsLabelEXT");
// Describe the area being rendered
const VkDebugUtilsLabelEXT houseLabel =
{
VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT, // sType
NULL, // pNext
"Brick House", // pLabelName
{ 1.0f, 0.0f, 0.0f, 1.0f }, // color
};
// Start an annotated group of calls under the 'Brick House' name
pfnCmdBeginDebugUtilsLabelEXT(commandBuffer, &houseLabel);
{
// A mutable structure for each part being rendered
VkDebugUtilsLabelEXT housePartLabel =
{
VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT, // sType
NULL, // pNext
NULL, // pLabelName
{ 0.0f, 0.0f, 0.0f, 0.0f }, // color
};
// Set the name and insert the marker
housePartLabel.pLabelName = "Walls";
pfnCmdInsertDebugUtilsLabelEXT(commandBuffer, &housePartLabel);
// Insert the drawcall for the walls
vkCmdDrawIndexed(commandBuffer, 1000, 1, 0, 0, 0);
// Insert a recursive region for two sets of windows
housePartLabel.pLabelName = "Windows";
pfnCmdBeginDebugUtilsLabelEXT(commandBuffer, &housePartLabel);
{
vkCmdDrawIndexed(commandBuffer, 75, 6, 1000, 0, 0);
vkCmdDrawIndexed(commandBuffer, 100, 2, 1450, 0, 0);
}
pfnCmdEndDebugUtilsLabelEXT(commandBuffer);
housePartLabel.pLabelName = "Front Door";
pfnCmdInsertDebugUtilsLabelEXT(commandBuffer, &housePartLabel);
vkCmdDrawIndexed(commandBuffer, 350, 1, 1650, 0, 0);
housePartLabel.pLabelName = "Roof";
pfnCmdInsertDebugUtilsLabelEXT(commandBuffer, &housePartLabel);
vkCmdDrawIndexed(commandBuffer, 500, 1, 2000, 0, 0);
}
// End the house annotation started above
pfnCmdEndDebugUtilsLabelEXT(commandBuffer);
// Do other work
vkEndCommandBuffer(commandBuffer);
// Describe the queue being used
const VkDebugUtilsLabelEXT queueLabel =
{
VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT, // sType
NULL, // pNext
"Main Render Work", // pLabelName
{ 0.0f, 1.0f, 0.0f, 1.0f }, // color
};
// Identify the queue label region
pfnQueueBeginDebugUtilsLabelEXT(queue, &queueLabel);
// Submit the work for the main render thread
const VkCommandBuffer cmd_bufs[] = {commandBuffer};
VkSubmitInfo submit_info = {.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
.pNext = NULL,
.waitSemaphoreCount = 0,
.pWaitSemaphores = NULL,
.pWaitDstStageMask = NULL,
.commandBufferCount = 1,
.pCommandBuffers = cmd_bufs,
.signalSemaphoreCount = 0,
.pSignalSemaphores = NULL};
vkQueueSubmit(queue, 1, &submit_info, fence);
// End the queue label region
pfnQueueEndDebugUtilsLabelEXT(queue);
Issues
1) Should we just name this extension VK_EXT_debug_report2?
RESOLVED: No. There are enough additional changes to the structures to break backwards compatibility, so a new name was chosen that does not indicate any interaction with the previous extension.
2) Will validation layers immediately support all the new features?
RESOLVED: Not immediately. As one can imagine, there is a lot of work involved with converting the validation layer logging over to the new functionality. Basic logging, as seen in the original VK_EXT_debug_report extension, will be made available immediately. However, adding the labels and object names will take time. Since the priority for Khronos at this time is to continue focusing on Valid Usage statements, it may take a while before the new functionality is fully exposed.
3) If the validation layers won’t expose the new functionality immediately, then what’s the point of this extension?
RESOLVED: We needed a replacement for VK_EXT_debug_report because the VkDebugReportObjectTypeEXT enumeration will no longer be updated and any new objects will need to be debugged using the new functionality provided by this extension.
4) Should this extension be split into two separate parts (1 extension that is an instance extension providing the callback functionality, and another device extension providing the general debug marker and annotation functionality)?
RESOLVED: No, the functionality for this extension is too closely related. If the extension were split up, it would be unclear where the structures and enums should live, and how to define that the device-level behavior exposed by the instance extension is only valid when the device extension is enabled and its functionality is available. It is cleaner to define this all as an instance extension, and it also allows the application to enable all of the debug functionality with one enable string during vkCreateInstance.
Version History
-
Revision 1, 2017-09-14 (Mark Young and all listed Contributors)
-
Initial draft, based on VK_EXT_debug_report and VK_EXT_debug_marker in addition to previous feedback supplied from various companies including Valve, Epic, and Oxide games.
-
VK_EXT_depth_range_unrestricted
- Name String
-
VK_EXT_depth_range_unrestricted
- Extension Type
-
Device extension
- Registered Extension Number
-
14
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2017-06-22
- Contributors
-
-
Daniel Koch, NVIDIA
-
Jeff Bolz, NVIDIA
-
This extension removes the VkViewport minDepth
and
maxDepth
restrictions that the values must be between 0.0
and 1.0
,
inclusive.
It also removes the same restriction on
VkPipelineDepthStencilStateCreateInfo minDepthBounds
and
maxDepthBounds
.
Finally, it removes the restriction on the depth
value in
VkClearDepthStencilValue.
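For illustration only, with this extension enabled a viewport may be set with a depth range outside 0.0 to 1.0; the viewport dimensions and command buffer below are hypothetical.
extern VkCommandBuffer commandBuffer;
// A depth range outside [0.0, 1.0] is only valid when
// VK_EXT_depth_range_unrestricted is enabled.
const VkViewport viewport = {
    0.0f, 0.0f,       // x, y
    1920.0f, 1080.0f, // width, height
    -1.0f, 2.0f       // minDepth, maxDepth
};
vkCmdSetViewport(commandBuffer, 0, 1, &viewport);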
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
Issues
1) How do VkViewport minDepth
and maxDepth
values outside
of the 0.0
to 1.0
range interact with
Primitive Clipping?
RESOLVED: The behavior described in Primitive
Clipping still applies.
If depth clamping is disabled the depth values are still clipped to 0
≤ zc ≤ wc before the viewport transform.
If depth clamping is enabled the above equation is ignored and the depth
values are instead clamped to the VkViewport minDepth
and
maxDepth
values, which in the case of this extension can be outside of
the 0.0
to 1.0
range.
2) What happens if a resulting depth fragment is outside of the 0.0
to
1.0
range and the depth buffer is fixed-point rather than floating-point?
RESOLVED: The supported range of a fixed-point depth buffer is 0.0
to
1.0
and depth fragments are clamped to this range.
Version History
-
Revision 1, 2017-06-22 (Piers Daniell)
-
Internal revisions
-
VK_EXT_descriptor_indexing
- Name String
-
VK_EXT_descriptor_indexing
- Extension Type
-
Device extension
- Registered Extension Number
-
162
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_maintenance3
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Status
-
Complete
- Last Modified Date
-
2017-10-02
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Daniel Rakos, AMD
-
Slawomir Grajewski, Intel
-
Tobias Hector, Imagination Technologies
-
This extension adds several small features which together enable
applications to create large descriptor sets containing substantially all of
their resources, and to select amongst those resources with dynamic
(non-uniform) indexes in the shader.
There are feature enables and SPIR-V capabilities for non-uniform descriptor
indexing in the shader, and non-uniform indexing in the shader requires use
of a new NonUniformEXT
decoration defined in the
SPV_EXT_descriptor_indexing SPIR-V extension.
There are descriptor set layout binding creation flags enabling several
features:
-
Descriptors can be updated after they are bound to a command buffer, such that the execution of the command buffer reflects the most recent update to the descriptors.
-
Descriptors that are not used by any pending command buffers can be updated, which enables writing new descriptors for frame N+1 while frame N is executing.
-
Relax the requirement that all descriptors in a binding that is “statically used” must be valid, such that descriptors that are not accessed by a submission need not be valid and can be updated while that submission is executing.
-
The final binding in a descriptor set layout can have a variable size (and unsized arrays of resources are allowed in the GL_EXT_nonuniform_qualifier and SPV_EXT_descriptor_indexing extensions).
Note that it is valid for multiple descriptor arrays in a shader to use the same set and binding number, as long as they are all compatible with the descriptor type in the pipeline layout. This means a single array binding in the descriptor set can serve multiple texture dimensionalities, or an array of buffer descriptors can be used with multiple different block layouts.
There are new descriptor set layout and descriptor pool creation flags that
are required to opt in to the update-after-bind functionality, and there are
separate maxPerStage* and maxDescriptorSet* limits that apply to
these descriptor set layouts which may be much higher than the pre-existing
limits.
The old limits only count descriptors in non-updateAfterBind descriptor set
layouts, and the new limits count descriptors in all descriptor set layouts
in the pipeline layout.
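The following sketch, which assumes the relevant descriptor indexing features are supported and enabled and uses a hypothetical descriptor count, shows how a descriptor set layout might opt in to update-after-bind and partially bound behavior; the pool that the set is allocated from must likewise be created with VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT.
extern VkDevice device;
const VkDescriptorBindingFlagsEXT bindingFlags =
    VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT_EXT |
    VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT;
const VkDescriptorSetLayoutBindingFlagsCreateInfoEXT bindingFlagsInfo = {
    VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT, // sType
    NULL,          // pNext
    1,             // bindingCount
    &bindingFlags  // pBindingFlags
};
const VkDescriptorSetLayoutBinding binding = {
    0,                                // binding
    VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, // descriptorType
    4096,                             // descriptorCount (hypothetical)
    VK_SHADER_STAGE_FRAGMENT_BIT,     // stageFlags
    NULL                              // pImmutableSamplers
};
VkDescriptorSetLayoutCreateInfo layoutInfo = {0};
layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.pNext = &bindingFlagsInfo;
layoutInfo.flags = VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT;
layoutInfo.bindingCount = 1;
layoutInfo.pBindings = &binding;
VkDescriptorSetLayout setLayout;
vkCreateDescriptorSetLayout(device, &layoutInfo, NULL, &setLayout);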
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_FEATURES_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_VARIABLE_DESCRIPTOR_COUNT_ALLOCATE_INFO_EXT
-
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_VARIABLE_DESCRIPTOR_COUNT_LAYOUT_SUPPORT_EXT
-
-
Extending VkDescriptorPoolCreateFlagBits:
-
VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT_EXT
-
-
Extending VkDescriptorSetLayoutCreateFlagBits:
-
VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT
-
-
Extending VkResult:
-
VK_ERROR_FRAGMENTATION_EXT
-
New Enums
New Structures
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2017-07-26 (Jeff Bolz)
-
Internal revisions
-
VK_EXT_direct_mode_display
- Name String
-
VK_EXT_direct_mode_display
- Extension Type
-
Instance extension
- Registered Extension Number
-
89
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_display
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-12-13
- IP Status
-
No known IP claims.
- Contributors
-
-
Pierre Boudier, NVIDIA
-
James Jones, NVIDIA
-
Damien Leone, NVIDIA
-
Pierre-Loup Griffais, Valve
-
Liam Middlebrook, NVIDIA
-
This extension, along with related platform extensions, allows applications to take exclusive control of displays associated with a native windowing system. This is especially useful for virtual reality applications that wish to hide HMDs (head mounted displays) from the native platform’s display management system, desktop, and/or other applications.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
Issues
1) Should this extension and its related platform-specific extensions
leverage VK_KHR_display
, or provide separate equivalent interfaces?
RESOLVED: Use VK_KHR_display
concepts and objects.
VK_KHR_display
can be used to enumerate all displays on the system,
including those attached to/in use by a window system or native platform,
but VK_KHR_display_swapchain
will fail to create a swapchain on in-use
displays.
This extension and its platform-specific children will allow applications to
grab in-use displays away from window systems and/or native platforms,
allowing them to be used with VK_KHR_display_swapchain
.
2) Are separate calls needed to acquire displays and enable direct mode?
RESOLVED: No, these operations happen in one combined command. Acquiring a display puts it into direct mode.
Version History
-
Revision 1, 2016-12-13 (James Jones)
-
Initial draft
-
VK_EXT_discard_rectangles
- Name String
-
VK_EXT_discard_rectangles
- Extension Type
-
Device extension
- Registered Extension Number
-
100
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2016-12-22
- Interactions and External Dependencies
-
-
Interacts with VK_KHR_device_group
-
Interacts with Vulkan 1.1
-
- Contributors
-
-
Daniel Koch, NVIDIA
-
Jeff Bolz, NVIDIA
-
This extension provides additional orthogonally aligned “discard rectangles” specified in framebuffer-space coordinates that restrict rasterization of all points, lines and triangles.
Up to an implementation-dependent number of discard rectangles, specified by
maxDiscardRectangles, can be operational at once.
When one or more discard rectangles are active, rasterized fragments can
either survive if the fragment is within any of the operational discard
rectangles (VK_DISCARD_RECTANGLE_MODE_INCLUSIVE_EXT
mode) or be
rejected if the fragment is within any of the operational discard rectangles
(VK_DISCARD_RECTANGLE_MODE_EXCLUSIVE_EXT
mode).
These discard rectangles operate orthogonally to the existing scissor test functionality. The discard rectangles can be different for each physical device in a device group by specifying the device mask and setting discard rectangle dynamic state.
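A minimal sketch of baking one inclusive discard rectangle into a pipeline follows; the rectangle is hypothetical, and an application could instead list VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT as dynamic state and use vkCmdSetDiscardRectangleEXT.
const VkRect2D discardRectangle = { { 0, 0 }, { 256, 256 } }; // hypothetical region
const VkPipelineDiscardRectangleStateCreateInfoEXT discardRectangleState = {
    VK_STRUCTURE_TYPE_PIPELINE_DISCARD_RECTANGLE_STATE_CREATE_INFO_EXT, // sType
    NULL,                                    // pNext
    0,                                       // flags
    VK_DISCARD_RECTANGLE_MODE_INCLUSIVE_EXT, // discardRectangleMode
    1,                                       // discardRectangleCount
    &discardRectangle                        // pDiscardRectangles
};
// Chain discardRectangleState into VkGraphicsPipelineCreateInfo::pNext.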
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DISCARD_RECTANGLE_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_PIPELINE_DISCARD_RECTANGLE_STATE_CREATE_INFO_EXT
-
-
Extending VkDynamicState
-
VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT
-
New Structures
New Functions
Issues
None.
Version History
-
Revision 1, 2016-12-22 (Piers Daniell)
-
Internal revisions
-
VK_EXT_display_control
- Name String
-
VK_EXT_display_control
- Extension Type
-
Device extension
- Registered Extension Number
-
92
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_EXT_display_surface_counter
-
Requires
VK_KHR_swapchain
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-12-13
- IP Status
-
No known IP claims.
- Contributors
-
-
Pierre Boudier, NVIDIA
-
James Jones, NVIDIA
-
Damien Leone, NVIDIA
-
Pierre-Loup Griffais, Valve
-
Daniel Vetter, Intel
-
This extension defines a set of utility functions for use with the
VK_KHR_display
and VK_KHR_display_swapchain
extensions.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DISPLAY_POWER_INFO_EXT
-
VK_STRUCTURE_TYPE_DEVICE_EVENT_INFO_EXT
-
VK_STRUCTURE_TYPE_DISPLAY_EVENT_INFO_EXT
-
VK_STRUCTURE_TYPE_SWAPCHAIN_COUNTER_CREATE_INFO_EXT
-
New Structures
New Functions
Issues
1) Should this extension add an explicit “WaitForVsync” API or a fence signaled at vsync that the application can wait on?
RESOLVED: A fence. A separate API could later be provided that allows exporting the fence to a native object that could be inserted into standard run loops on POSIX and Windows systems.
2) Should callbacks be added for a vsync event, or in general to monitor events in Vulkan?
RESOLVED: No, fences should be used. Some events are generated by interrupts which are managed in the kernel. In order to use a callback provided by the application, drivers would need to have the userspace driver spawn threads that would wait on the kernel event, and hence the callbacks could be difficult for the application to synchronize with its other work given they would arrive on a foreign thread.
3) Should vblank or scanline events be exposed?
RESOLVED: Vblank events. Scanline events could be added by a separate extension, but the latency of processing an interrupt and waking up a userspace event is high enough that the accuracy of a scanline event would be rather low. Further, per-scanline interrupts are not supported by all hardware.
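As a sketch of the fence-based approach described in issue 1 (the device and display handles are hypothetical), an application might register for the next vblank on a display and wait on the returned fence:
extern VkDevice device;
extern VkDisplayKHR display;
// Must call extension functions through a function pointer:
PFN_vkRegisterDisplayEventEXT pfnRegisterDisplayEventEXT = (PFN_vkRegisterDisplayEventEXT)vkGetDeviceProcAddr(device, "vkRegisterDisplayEventEXT");
const VkDisplayEventInfoEXT displayEventInfo = {
    VK_STRUCTURE_TYPE_DISPLAY_EVENT_INFO_EXT,  // sType
    NULL,                                      // pNext
    VK_DISPLAY_EVENT_TYPE_FIRST_PIXEL_OUT_EXT  // displayEvent
};
VkFence vblankFence;
pfnRegisterDisplayEventEXT(device, display, &displayEventInfo, NULL, &vblankFence);
vkWaitForFences(device, 1, &vblankFence, VK_TRUE, UINT64_MAX);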
Version History
-
Revision 1, 2016-12-13 (James Jones)
-
Initial draft
-
VK_EXT_display_surface_counter
- Name String
-
VK_EXT_display_surface_counter
- Extension Type
-
Instance extension
- Registered Extension Number
-
91
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_display
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-12-13
- IP Status
-
No known IP claims.
- Contributors
-
-
Pierre Boudier, NVIDIA
-
James Jones, NVIDIA
-
Damien Leone, NVIDIA
-
Pierre-Loup Griffais, Valve
-
Daniel Vetter, Intel
-
This extension defines a vertical blanking period counter associated with display surfaces. It provides a mechanism to query support for such a counter from a VkSurfaceKHR object.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_EXT
-
New Structures
New Functions
Issues
None.
Version History
-
Revision 1, 2016-12-13 (James Jones)
-
Initial draft
-
VK_EXT_external_memory_dma_buf
- Name String
-
VK_EXT_external_memory_dma_buf
- Extension Type
-
Device extension
- Registered Extension Number
-
126
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_memory_fd
-
- Contact
-
-
Chad Versace chadversary
-
- Last Modified Date
-
2017-10-10
- IP Status
-
No known IP claims.
- Contributors
-
-
Chad Versace, Google
-
James Jones, NVIDIA
-
Jason Ekstrand, Intel
-
A dma_buf is a type of file descriptor, defined by the Linux kernel, that allows sharing memory across kernel device drivers and across processes. This extension enables applications to import a dma_buf as VkDeviceMemory; to export VkDeviceMemory as a dma_buf; and to create VkBuffer objects that can be bound to that memory.
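A minimal sketch of importing a dma_buf as VkDeviceMemory follows; the file descriptor, allocation size, and memory type index are hypothetical, with the memory type normally chosen from the memoryTypeBits reported by vkGetMemoryFdPropertiesKHR.
extern VkDevice device;
extern int dmaBufFd;             // received from another API or process
extern uint32_t memoryTypeIndex; // selected using vkGetMemoryFdPropertiesKHR
const VkImportMemoryFdInfoKHR importInfo = {
    VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,    // sType
    NULL,                                           // pNext
    VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT, // handleType
    dmaBufFd                                        // fd (ownership transfers on success)
};
const VkMemoryAllocateInfo allocateInfo = {
    VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, // sType
    &importInfo,                            // pNext
    65536,                                  // allocationSize (hypothetical)
    memoryTypeIndex                         // memoryTypeIndex
};
VkDeviceMemory memory;
vkAllocateMemory(device, &allocateInfo, NULL, &memory);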
New Enum Constants
-
Extending VkExternalMemoryHandleTypeFlagBitsKHR:
-
VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT
-
Issues
1. How does the application, when creating a VkImage that it intends to bind to dma_buf VkDeviceMemory that contains an externally produced image, specify the memory layout (such as row pitch and DRM format modifier) of the VkImage? In other words, how does the application achieve behavior comparable to that provided by EGL_EXT_image_dma_buf_import and EGL_EXT_image_dma_buf_import_modifiers?
RESOLVED. Features comparable to those in EGL_EXT_image_dma_buf_import and EGL_EXT_image_dma_buf_import_modifiers will be provided by an extension layered atop this one.
2. Without the ability to specify the memory layout of external dma_buf images, how is this extension useful?
RESOLVED.
This extension provides exactly one new feature: the ability to
import/export between dma_bufs and VkDeviceMemory.
This feature, together with features provided by
VK_KHR_external_memory_fd
, is sufficient to bind a VkBuffer to
dma_buf.
Version History
-
Revision 1, 2017-10-10 (Chad Versace)
-
Squashed internal revisions
-
VK_EXT_external_memory_host
- Name String
-
VK_EXT_external_memory_host
- Extension Type
-
Device extension
- Registered Extension Number
-
179
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_memory
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2017-11-10
- IP Status
-
No known IP claims.
- Contributors
-
-
Jaakko Konttinen, AMD
-
David Mao, AMD
-
Daniel Rakos, AMD
-
Tobias Hector, Imagination Technologies
-
Jason Ekstrand, Intel
-
James Jones, NVIDIA
-
This extension enables an application to import host allocations and host mapped foreign device memory to Vulkan memory objects.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT
-
VK_STRUCTURE_TYPE_MEMORY_HOST_POINTER_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_MEMORY_HOST_PROPERTIES_EXT
-
-
Extending VkExternalMemoryHandleTypeFlagBitsKHR:
-
VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT
-
VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_MAPPED_FOREIGN_MEMORY_BIT_EXT
-
New Enums
None.
New Structs
New Functions
Issues
1) What memory type has to be used to import host pointers?
RESOLVED: Depends on the implementation. Applications have to use the new vkGetMemoryHostPointerPropertiesEXT command to query the supported memory types for a particular host pointer. The reported memory types may include memory types that come from a memory heap that is otherwise not usable for regular memory object allocation and thus such a heap’s size may be zero.
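A minimal sketch of that query followed by an import is shown below; the host pointer, size, and memory type selection are hypothetical, and both the pointer and size must be integer multiples of minImportedHostPointerAlignment.
extern VkDevice device;
extern void* hostPointer;
extern VkDeviceSize hostAllocationSize;
// Must call extension functions through a function pointer:
PFN_vkGetMemoryHostPointerPropertiesEXT pfnGetMemoryHostPointerPropertiesEXT = (PFN_vkGetMemoryHostPointerPropertiesEXT)vkGetDeviceProcAddr(device, "vkGetMemoryHostPointerPropertiesEXT");
VkMemoryHostPointerPropertiesEXT hostPointerProperties = { VK_STRUCTURE_TYPE_MEMORY_HOST_POINTER_PROPERTIES_EXT };
pfnGetMemoryHostPointerPropertiesEXT(device, VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT, hostPointer, &hostPointerProperties);
uint32_t memoryTypeIndex = 0; // hypothetical: pick a bit set in hostPointerProperties.memoryTypeBits
const VkImportMemoryHostPointerInfoEXT importInfo = {
    VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT,  // sType
    NULL,                                                   // pNext
    VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT, // handleType
    hostPointer                                             // pHostPointer
};
const VkMemoryAllocateInfo allocateInfo = {
    VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, // sType
    &importInfo,                            // pNext
    hostAllocationSize,                     // allocationSize
    memoryTypeIndex                         // memoryTypeIndex
};
VkDeviceMemory memory;
vkAllocateMemory(device, &allocateInfo, NULL, &memory);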
2) Can the application still access the contents of the host allocation after importing?
RESOLVED: Yes. However, usual synchronization requirements apply.
3) Can the application free the host allocation?
RESOLVED: No, it violates valid usage conditions. Using the memory object imported from a host allocation that’s already freed thus results in undefined behavior.
4) Is vkMapMemory expected to return the same host address which was specified when importing it to the memory object?
RESOLVED: No. Implementations are allowed to return the same address but it’s not required. Some implementations might return a different virtual mapping of the allocation, although the same physical pages will be used.
5) Is there any limitation on the alignment of the host pointer and/or size?
RESOLVED: Yes.
Both the address and the size have to be an integer multiple of
minImportedHostPointerAlignment
.
In addition, some platforms and foreign devices may have additional
restrictions.
6) Can the same host allocation be imported multiple times into a given physical device?
RESOLVED: No, at least not guaranteed by this extension. Some platforms do not allow locking the same physical pages for device access multiple times, so attempting to do it may result in undefined behavior.
7) Does this extension support exporting the new handle type?
RESOLVED: No.
8) Should we include the possibility to import host mapped foreign device memory using this API?
RESOLVED: Yes, through a separate handle type. Implementations are still allowed to support only one of the handle types introduced by this extension by not returning import support for a particular handle type as returned in VkExternalMemoryPropertiesKHR.
Version History
-
Revision 1, 2017-11-10 (Daniel Rakos)
-
Internal revisions
-
VK_EXT_fragment_density_map
- Name String
-
VK_EXT_fragment_density_map
- Extension Type
-
Device extension
- Registered Extension Number
-
219
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Matthew Netsch mnetsch
-
- Last Modified Date
-
2018-09-25
- Interactions and External Dependencies
-
-
This extension requires the SPV_EXT_fragment_invocation_density SPIR-V extension.
-
- Contributors
-
-
Matthew Netsch, Qualcomm Technologies, Inc.
-
Robert VanReenen, Qualcomm Technologies, Inc.
-
Jonathan Wicks, Qualcomm Technologies, Inc.
-
Tate Hornbeck, Qualcomm Technologies, Inc.
-
Sam Holmes, Qualcomm Technologies, Inc.
-
Jeff Leger, Qualcomm Technologies, Inc.
-
Jan-Harald Fredriksen, ARM
-
Jeff Bolz, NVIDIA
-
Pat Brown, NVIDIA
-
Daniel Rakos, AMD
-
Piers Daniell, NVIDIA
-
This extension allows an application to specify areas of the render target where the fragment shader may be invoked fewer times. These fragments are broadcast to multiple pixels to cover the render target.
The primary use of this extension is to reduce workloads in areas where lower quality may not be perceived such as the distorted edges of a lens or the periphery of a user’s gaze.
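As a sketch only (the attachment index is hypothetical and the rest of the render pass setup is omitted), a render pass can designate one of its attachments as the fragment density map by chaining VkRenderPassFragmentDensityMapCreateInfoEXT onto VkRenderPassCreateInfo; the referenced image must be created with VK_IMAGE_USAGE_FRAGMENT_DENSITY_MAP_BIT_EXT.
const VkRenderPassFragmentDensityMapCreateInfoEXT densityMapInfo = {
    VK_STRUCTURE_TYPE_RENDER_PASS_FRAGMENT_DENSITY_MAP_CREATE_INFO_EXT, // sType
    NULL,                                                               // pNext
    { 1, VK_IMAGE_LAYOUT_FRAGMENT_DENSITY_MAP_OPTIMAL_EXT }             // fragmentDensityMapAttachment
};
VkRenderPassCreateInfo renderPassInfo = {0};
renderPassInfo.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
renderPassInfo.pNext = &densityMapInfo;
// attachments, subpasses, and dependencies omitted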
New Object Types
None.
New Enum Constants
-
Extending VkAccessFlagBits:
-
VK_ACCESS_FRAGMENT_DENSITY_MAP_READ_BIT_EXT
-
-
Extending VkFormatFeatureFlagBits:
-
VK_FORMAT_FEATURE_FRAGMENT_DENSITY_MAP_BIT_EXT
-
-
Extending VkImageCreateFlagBits:
-
VK_IMAGE_CREATE_SUBSAMPLED_BIT_EXT
-
-
Extending VkImageLayout:
-
VK_IMAGE_LAYOUT_FRAGMENT_DENSITY_MAP_OPTIMAL_EXT
-
-
Extending VkImageUsageFlagBits:
-
VK_IMAGE_USAGE_FRAGMENT_DENSITY_MAP_BIT_EXT
-
-
Extending VkImageViewCreateFlagBits:
-
VK_IMAGE_VIEW_CREATE_FRAGMENT_DENSITY_MAP_DYNAMIC_BIT_EXT
-
-
Extending VkPipelineStageFlagBits:
-
VK_PIPELINE_STAGE_FRAGMENT_DENSITY_PROCESS_BIT_EXT
-
-
Extending VkSamplerCreateFlagBits:
-
VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT
-
VK_SAMPLER_CREATE_SUBSAMPLED_COARSE_RECONSTRUCTION_BIT_EXT
-
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_DENSITY_MAP_FEATURES_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_DENSITY_MAP_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_RENDER_PASS_FRAGMENT_DENSITY_MAP_CREATE_INFO_EXT
-
New Enums
None.
New Structures
New Functions
None.
New or Modified Built-In Variables
New Variable Decorations
None.
New SPIR-V Capabilities
Version History
-
Revision 1, 2018-09-25 (Matthew Netsch)
-
Initial version
-
VK_EXT_global_priority
- Name String
-
VK_EXT_global_priority
- Extension Type
-
Device extension
- Registered Extension Number
-
175
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Andres Rodriguez lostgoat
-
- Last Modified Date
-
2017-10-06
- IP Status
-
No known IP claims.
- Contributors
-
-
Andres Rodriguez, Valve
-
Pierre-Loup Griffais, Valve
-
Dan Ginsburg, Valve
-
Mitch Singer, AMD
-
In Vulkan, users can specify device-scope queue priorities.
In some cases it may be useful to extend this concept to a system-wide
scope.
This extension provides a mechanism for callers to set their system-wide
priority.
The default queue priority is VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT
.
The driver implementation will attempt to skew hardware resource allocation in favour of the higher-priority task. Therefore, higher-priority work may retain similar latency and throughput characteristics even if the system is congested with lower priority work.
The global priority level of a queue shall take precedence over the
per-process queue priority
(VkDeviceQueueCreateInfo
::pQueuePriorities
).
Abuse of this feature may result in starving the rest of the system from
hardware resources.
Therefore, the driver implementation may deny requests to acquire a priority
above the default priority (VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT
) if
the caller does not have sufficient privileges.
In this scenario VK_ERROR_NOT_PERMITTED_EXT
is returned.
The driver implementation may fail the queue allocation request if resources
required to complete the operation have been exhausted (either by the same
process or a different process).
In this scenario VK_ERROR_INITIALIZATION_FAILED
is returned.
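A minimal sketch of requesting a higher system-wide priority for a queue follows; the queue family index is hypothetical, and device creation may fail with VK_ERROR_NOT_PERMITTED_EXT as described above.
const VkDeviceQueueGlobalPriorityCreateInfoEXT globalPriorityInfo = {
    VK_STRUCTURE_TYPE_DEVICE_QUEUE_GLOBAL_PRIORITY_CREATE_INFO_EXT, // sType
    NULL,                                                           // pNext
    VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT                               // globalPriority
};
const float queuePriority = 1.0f;
VkDeviceQueueCreateInfo queueInfo = {0};
queueInfo.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
queueInfo.pNext = &globalPriorityInfo; // request the system-wide priority
queueInfo.queueFamilyIndex = 0;        // hypothetical
queueInfo.queueCount = 1;
queueInfo.pQueuePriorities = &queuePriority;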
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DEVICE_QUEUE_GLOBAL_PRIORITY_CREATE_INFO_EXT
-
-
Extending VkResult:
-
VK_ERROR_NOT_PERMITTED_EXT
-
New Enums
New Structures
New Functions
None.
Issues
None.
Version History
-
Revision 2, 2017-11-03 (Andres Rodriguez)
-
Fixed VkQueueGlobalPriorityEXT missing _EXT suffix
-
-
Revision 1, 2017-10-06 (Andres Rodriguez)
-
First version.
-
VK_EXT_hdr_metadata
- Name String
-
VK_EXT_hdr_metadata
- Extension Type
-
Device extension
- Registered Extension Number
-
106
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_swapchain
-
- Contact
-
-
Courtney Goeltzenleuchter courtney-g
-
- Last Modified Date
-
2017-03-04
- IP Status
-
No known IP claims.
- Contributors
-
-
Courtney Goeltzenleuchter, Google
-
This extension defines two new structures and a function to assign SMPTE
(the Society of Motion Picture and Television Engineers) 2086 metadata and
CTA (Consumer Technology Association) 861.3 metadata to a swapchain.
The metadata includes the color primaries, white point, and luminance range
of the mastering display, which all together define the color volume that
contains all the possible colors the mastering display can produce.
The mastering display is the display where creative work is done and
creative intent is established.
To preserve such creative intent as much as possible and achieve consistent
color reproduction on different viewing displays, it is useful for the
display pipeline to know the color volume of the original mastering display
where content was created or tuned.
This avoids performing unnecessary mapping of colors that are not
displayable on the original mastering display.
The metadata also includes the maxContentLightLevel
and
maxFrameAverageLightLevel
as defined by CTA 861.3.
While the general purpose of the metadata is to assist in the transformation between different color volumes of different displays and help achieve better color reproduction, it is not in the scope of this extension to define how exactly the metadata should be used in such a process. It is up to the implementation to determine how to make use of the metadata.
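The following sketch assigns hypothetical mastering display and content light level values to a swapchain; the primaries shown approximate BT.2020 and are illustrative only.
extern VkDevice device;
extern VkSwapchainKHR swapchain;
// Must call extension functions through a function pointer:
PFN_vkSetHdrMetadataEXT pfnSetHdrMetadataEXT = (PFN_vkSetHdrMetadataEXT)vkGetDeviceProcAddr(device, "vkSetHdrMetadataEXT");
const VkHdrMetadataEXT metadata = {
    VK_STRUCTURE_TYPE_HDR_METADATA_EXT, // sType
    NULL,                               // pNext
    { 0.708f, 0.292f },                 // displayPrimaryRed
    { 0.170f, 0.797f },                 // displayPrimaryGreen
    { 0.131f, 0.046f },                 // displayPrimaryBlue
    { 0.3127f, 0.3290f },               // whitePoint
    1000.0f,                            // maxLuminance
    0.001f,                             // minLuminance
    1000.0f,                            // maxContentLightLevel
    400.0f                              // maxFrameAverageLightLevel
};
pfnSetHdrMetadataEXT(device, 1, &swapchain, &metadata);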
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_HDR_METADATA_EXT
-
New Structures
New Functions
Issues
1) Do we need a query function?
PROPOSED: No, Vulkan does not provide queries for state that the application can track on its own.
2) Should we specify default if not specified by the application?
PROPOSED: No, that leaves the default up to the display.
Version History
-
Revision 1, 2016-12-27 (Courtney Goeltzenleuchter)
-
Initial version
-
VK_EXT_image_drm_format_modifier
- Name String
-
VK_EXT_image_drm_format_modifier
- Extension Type
-
Device extension
- Registered Extension Number
-
159
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_bind_memory2
-
Requires
VK_KHR_image_format_list
-
Requires
VK_KHR_sampler_ycbcr_conversion
-
- Contact
-
-
Chad Versace chadversary
-
- Last Modified Date
-
2018-08-29
- IP Status
-
No known IP claims.
- Contributors
-
-
Antoine Labour, Google
-
Bas Nieuwenhuizen, Google
-
Chad Versace, Google
-
James Jones, NVIDIA
-
Jason Ekstrand, Intel
-
Jörg Wagner, ARM
-
Kristian Høgsberg Kristensen, Google
-
Ray Smith, ARM
-
Overview
Summary
This extension provides the ability to use DRM format modifiers with images, enabling Vulkan to better integrate with the Linux ecosystem of graphics, video, and display APIs.
Its functionality closely overlaps with
EGL_EXT_image_dma_buf_import_modifiers
2
and
EGL_MESA_image_dma_buf_export
3.
Unlike the EGL extensions, this extension does not require the use of a
specific handle type (such as a dma_buf) for external memory and provides
more explicit control of image creation.
Introduction to DRM Format Modifiers
A DRM format modifier is a 64-bit, vendor-prefixed, semi-opaque unsigned
integer.
Most modifiers represent a concrete, vendor-specific tiling format for
images.
Some exceptions are DRM_FORMAT_MOD_LINEAR
(which is not
vendor-specific); DRM_FORMAT_MOD_NONE
(which is an alias of
DRM_FORMAT_MOD_LINEAR
due to historical accident); and
DRM_FORMAT_MOD_INVALID
(which does not represent a tiling format).
The modifier’s vendor prefix consists of the 8 most significant bits.
The canonical list of modifiers and vendor prefixes is found in
drm_fourcc.h
in the Linux kernel source.
The other dominant source of modifiers are vendor kernel trees.
One goal of modifiers in the Linux ecosystem is to enumerate for each vendor a reasonably sized set of tiling formats that are appropriate for images shared across processes, APIs, and/or devices, where each participating component may possibly be from different vendors. A non-goal is to enumerate all tiling formats supported by all vendors. Some tiling formats used internally by vendors are inappropriate for sharing; no modifiers should be assigned to such tiling formats.
Modifier values typically do not describe memory layouts. More precisely, a modifier's lower 56 bits usually have no structure. Instead, modifiers name memory layouts; they name a small set of vendor-preferred layouts for image sharing. As a consequence, in each vendor namespace the modifier values are often sequentially allocated starting at 1.
Each modifier is usually supported by a single vendor and its name matches
the pattern {VENDOR}_FORMAT_MOD_*
or DRM_FORMAT_MOD_{VENDOR}_*
.
Examples are I915_FORMAT_MOD_X_TILED
and
DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED
.
An exception is DRM_FORMAT_MOD_LINEAR
, which is supported by most
vendors.
Many APIs in Linux use modifiers to negotiate and specify the memory
layout of shared images.
For example, a Wayland compositor and Wayland client may, by relaying
modifiers over the Wayland protocol zwp_linux_dmabuf_v1
, negotiate a
vendor-specific tiling format for a shared wl_buffer
.
The client may allocate the underlying memory for the wl_buffer
with
GBM, providing the chosen modifier to gbm_bo_create_with_modifiers
.
The client may then import the wl_buffer
into Vulkan for producing
image content, providing the resource’s dma_buf to
VkImportMemoryFdInfoKHR and its modifier to
VkImageDrmFormatModifierExplicitCreateInfoEXT.
The compositor may then import the wl_buffer
into OpenGL for sampling,
providing the resource’s dma_buf and modifier to eglCreateImage
.
The compositor may also bypass OpenGL and submit the wl_buffer
directly
to the kernel’s display API, providing the dma_buf and modifier through
drm_mode_fb_cmd2
.
Format Translation
Modifier-capable APIs often pair modifiers with DRM formats, which are
defined in
drm_fourcc.h
.
However, VK_EXT_image_drm_format_modifier
uses VkFormat instead of
DRM formats.
The application must convert between VkFormat and DRM format when it
sends or receives a DRM format to or from an external API.
The mapping from VkFormat to DRM format is lossy. Therefore, when receiving a DRM format from an external API, often the application must use information from the external API to accurately map the DRM format to a VkFormat. For example, DRM formats do not distinguish between RGB and sRGB (as of 2018-03-28); external information is required to identify the image’s colorspace.
The mapping between VkFormat and DRM format is also incomplete. For some DRM formats there exist no corresponding Vulkan format, and for some Vulkan formats there exist no corresponding DRM format.
Usage Patterns
Three primary usage patterns are intended for this extension:
-
Negotiation. The application negotiates with modifier-aware, external components to determine sets of image creation parameters supported among all components.
In the Linux ecosystem, the negotiation usually assumes the image is a 2D, single-sampled, non-mipmapped, non-array image; this extension permits that assumption but does not require it. The result of the negotiation usually resembles a set of tuples such as (drmFormat, drmFormatModifier), where each participating component supports all tuples in the set.
Many details of this negotiation—such as the protocol used during negotiation, the set of image creation parameters expressible in the protocol, and how the protocol chooses which process and which API will create the image—are outside the scope of this specification.
In this extension, vkGetPhysicalDeviceFormatProperties2 with VkDrmFormatModifierPropertiesListEXT serves a primary role during the negotiation, and vkGetPhysicalDeviceImageFormatProperties2 with VkPhysicalDeviceImageDrmFormatModifierInfoEXT serves a secondary role.
-
Import. The application imports an image with a modifier.
In this pattern, the application receives from an external source the image’s memory and its creation parameters, which are often the result of the negotiation described above. Some image creation parameters are implicitly defined by the external source; for example,
VK_IMAGE_TYPE_2D
is often assumed. Some image creation parameters are usually explicit, such as the image’s format, drmFormatModifier, and extent; and each plane’s offset and rowPitch.
Before creating the image, the application first verifies that the physical device supports the received creation parameters by querying vkGetPhysicalDeviceFormatProperties2 with VkDrmFormatModifierPropertiesListEXT and vkGetPhysicalDeviceImageFormatProperties2 with VkPhysicalDeviceImageDrmFormatModifierInfoEXT. Then the application creates the image by chaining VkImageDrmFormatModifierExplicitCreateInfoEXT and VkExternalMemoryImageCreateInfo onto VkImageCreateInfo.
-
Export. The application creates an image and allocates its memory. Then the application exports to modifier-aware consumers the image’s memory handles; its creation parameters; its modifier; and the offset, size, and rowPitch of each memory plane.
In this pattern, the Vulkan device is the authority for the image; it is the allocator of the image’s memory and the decider of the image’s creation parameters. When choosing the image’s creation parameters, the application usually chooses a tuple (format, drmFormatModifier) from the result of the negotiation described above. The negotiation’s result often contains multiple tuples that share the same format but differ in their modifier. In this case, the application should defer the choice of the image’s modifier to the Vulkan implementation by providing all such modifiers to VkImageDrmFormatModifierListCreateInfoEXT::pDrmFormatModifiers; and the implementation should choose from pDrmFormatModifiers the optimal modifier in consideration with the other image parameters.
The application creates the image by chaining VkImageDrmFormatModifierListCreateInfoEXT and VkExternalMemoryImageCreateInfo onto VkImageCreateInfo. The protocol and APIs by which the application will share the image with external consumers will likely determine the value of VkExternalMemoryImageCreateInfo::handleTypes. The implementation chooses for the image an optimal modifier from VkImageDrmFormatModifierListCreateInfoEXT::pDrmFormatModifiers. The application then queries the implementation-chosen modifier with vkGetImageDrmFormatModifierPropertiesEXT, and queries the memory layout of each plane with vkGetImageSubresourceLayout.
The application then allocates the image’s memory with VkMemoryAllocateInfo, adding chained extension structures for external memory; binds it to the image; and exports the memory, for example, with vkGetMemoryFdKHR.
Finally, the application sends the image’s creation parameters, its modifier, its per-plane memory layout, and the exported memory handle to the external consumers. The details of how the application transmits this information to external consumers is outside the scope of this specification.
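A condensed sketch of the export pattern described above follows; the format, extent, usage, and negotiated modifier list are hypothetical, and the VkExternalMemoryImageCreateInfo chaining, memory allocation, and export steps are omitted for brevity.
extern VkDevice device;
extern uint64_t negotiatedModifiers[];   // from the external negotiation
extern uint32_t negotiatedModifierCount;
const VkImageDrmFormatModifierListCreateInfoEXT modifierList = {
    VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_LIST_CREATE_INFO_EXT, // sType
    NULL,                    // pNext
    negotiatedModifierCount, // drmFormatModifierCount
    negotiatedModifiers      // pDrmFormatModifiers
};
VkImageCreateInfo imageInfo = {0};
imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageInfo.pNext = &modifierList;
imageInfo.imageType = VK_IMAGE_TYPE_2D;
imageInfo.format = VK_FORMAT_R8G8B8A8_UNORM;      // hypothetical
imageInfo.extent = (VkExtent3D){ 1920, 1080, 1 }; // hypothetical
imageInfo.mipLevels = 1;
imageInfo.arrayLayers = 1;
imageInfo.samples = VK_SAMPLE_COUNT_1_BIT;
imageInfo.tiling = VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT;
imageInfo.usage = VK_IMAGE_USAGE_SAMPLED_BIT;
imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
VkImage image;
vkCreateImage(device, &imageInfo, NULL, &image);
// Query the modifier the implementation chose, to send to external consumers:
PFN_vkGetImageDrmFormatModifierPropertiesEXT pfnGetImageDrmFormatModifierPropertiesEXT = (PFN_vkGetImageDrmFormatModifierPropertiesEXT)vkGetDeviceProcAddr(device, "vkGetImageDrmFormatModifierPropertiesEXT");
VkImageDrmFormatModifierPropertiesEXT chosenModifier = { VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_PROPERTIES_EXT };
pfnGetImageDrmFormatModifierPropertiesEXT(device, image, &chosenModifier);
// chosenModifier.drmFormatModifier and the per-plane layouts from
// vkGetImageSubresourceLayout are then shared with external consumers.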
Prior Art
Extension
EGL_EXT_image_dma_buf_import
1
introduced the ability to create an EGLImage
by importing for each
plane a dma_buf, offset, and row pitch.
Later, extension
EGL_EXT_image_dma_buf_import_modifiers
2
introduced the ability to query which combination of formats and modifiers
the implementation supports and to specify modifiers during creation of
the EGLImage
.
Extension
EGL_MESA_image_dma_buf_export
3
is the inverse of EGL_EXT_image_dma_buf_import_modifiers
.
The Linux kernel modesetting API (KMS), when configuring the display’s
framebuffer with struct
drm_mode_fb_cmd2
4, allows one to
specify the framebuffer’s modifier as well as a per-plane memory handle,
offset, and row pitch.
GBM, a graphics buffer manager for Linux, allows creation of a gbm_bo
(that is, a graphics buffer object) by importing data similar to that in
EGL_EXT_image_dma_buf_import_modifiers
1;
and symmetrically allows exporting the same data from the gbm_bo
.
See the references to modifier and plane in
gbm.h
5.
New Object Types
None.
New Enum Constants
-
Extending VkResult:
-
VK_ERROR_INVALID_DRM_FORMAT_MODIFIER_PLANE_LAYOUT_EXT
-
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DRM_FORMAT_MODIFIER_PROPERTIES_LIST_EXT
-
VK_STRUCTURE_TYPE_DRM_FORMAT_MODIFIER_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_MODIFIER_INFO_EXT
-
VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_LIST_CREATE_INFO_EXT
-
VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_EXPLICIT_CREATE_INFO_EXT
-
VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_PROPERTIES_EXT
-
-
Extending VkImageTiling:
-
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
-
-
Extending VkImageAspectFlagBits:
-
VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT_EXT
-
VK_IMAGE_ASPECT_MEMORY_PLANE_1_BIT_EXT
-
VK_IMAGE_ASPECT_MEMORY_PLANE_2_BIT_EXT
-
VK_IMAGE_ASPECT_MEMORY_PLANE_3_BIT_EXT
-
New Enums
None.
New Structures
-
Extends VkFormatProperties2:
-
Member of VkDrmFormatModifierPropertiesListEXT:
-
Extends VkPhysicalDeviceImageFormatInfo2:
-
Extends VkImageCreateInfo:
-
Parameter to vkGetImageDrmFormatModifierPropertiesEXT:
New Functions
Issues
1) Should this extension define a single DRM format modifier per
VkImage
? Or define one per plane?
RESOLVED: There exists a single DRM format modifier per VkImage
.
DISCUSSION: Prior art, such as
EGL_EXT_image_dma_buf_import_modifiers
2,
struct drm_mode_fb_cmd2
4, and
struct
gbm_import_fd_modifier_data
5,
allows defining one modifier per plane.
However, developers of the GBM and kernel APIs concede it was a mistake.
Beginning in Linux 4.10, the kernel requires that the application provide
the same DRM format modifier for each plane.
(See Linux commit
bae781b259269590109e8a4a8227331362b88212).
And GBM provides an entrypoint, gbm_bo_get_modifier
, for querying the
modifier of the image but does not provide one to query the modifier of
individual planes.
2) When creating an image with VkImageDrmFormatModifierExplicitCreateInfoEXT, which is typically used when importing an image, should the application explicitly provide the size of each plane?
RESOLVED: No.
The application must not provide the size.
To enforce this, the API requires that
VkImageDrmFormatModifierExplicitCreateInfoEXT::pPlaneLayouts::size
must be 0.
DISCUSSION: Prior art, such as
EGL_EXT_image_dma_buf_import_modifiers
2,
struct drm_mode_fb_cmd2
4, and
struct
gbm_import_fd_modifier_data
5,
omits from the API the size of each plane.
Instead, the APIs infer each plane’s size from the import parameters, which
include the image’s pixel format and a dma_buf, offset, and row pitch for
each plane.
However, Vulkan differs from EGL and GBM with regards to image creation in the following ways:
-
Undedicated allocation by default. When importing or exporting a set of dma_bufs as an
EGLImage
or gbm_bo, common practice mandates that each dma_buf’s memory be dedicated (in the sense of VK_KHR_dedicated_allocation) to the image (though not necessarily dedicated to a single plane). In particular, neither the GBM documentation nor the EGL extension specifications explicitly state this requirement, but in light of common practice this is likely due to under-specification rather than intentional omission. In contrast, VK_EXT_image_drm_format_modifier permits, but does not require, the implementation to require dedicated allocations for images created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT
. -
Separation of image creation and memory allocation. When importing a set of dma_bufs as an
EGLImage
orgbm_bo
, EGL and GBM create the image resource and bind it to memory (the dma_bufs) simultaneously. This allows EGL and GBM to query each dma_buf’s size during image creation. In Vulkan, image creation and memory allocation are independent unless a dedicated allocation is used (as inVK_KHR_dedicated_allocation
). Therefore, without requiring dedicated allocation, Vulkan cannot query the size of each dma_buf (or other external handle) when calculating the image’s memory layout. Even if dedication allocation were required, Vulkan cannot calculate the image’s memory layout until after the image is bound to its dma_ufs.
The above differences complicate the potential inference of plane size in Vulkan. Consider the following problematic cases:
-
Padding. Some plane of the image may require implementation-dependent padding.
-
Metadata. For some modifiers, the image may have a metadata plane which requires a non-trivial calculation to determine its size.
-
Mipmapped, array, and 3D images. The implementation may support VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT for images whose mipLevels, arrayLayers, or depth is greater than 1. For such images with certain modifiers, the calculation of each plane’s size may be non-trivial.
However, an application-provided plane size solves none of the above problems.
For simplicity, consider an external image with a single memory plane.
The implementation is obviously capable of calculating the image’s size when
its tiling is VK_IMAGE_TILING_OPTIMAL
.
Likewise, any reasonable implementation is capable of calculating the
image’s size when its tiling uses a supported modifier.
Suppose that the external image’s size is smaller than the
implementation-calculated size.
If the application provided the external image’s size to
vkCreateImage, the implementation would observe the mismatched size
and recognize its inability to comprehend the external image’s layout
(unless the implementation used the application-provided size to select a
refinement of the tiling layout indicated by the modifier, which is
strongly discouraged).
The implementation would observe the conflict, and reject image creation
with VK_ERROR_INVALID_DRM_FORMAT_MODIFIER_PLANE_LAYOUT_EXT
.
On the other hand, if the application did not provide the external image’s
size to vkCreateImage, then the application would observe after
calling vkGetImageMemoryRequirements that the external image’s size is
less than the size required by the implementation.
The application would observe the conflict and refuse to bind the
VkImage
to the external memory.
In both cases, the result is explicit failure.
Suppose that the external image’s size is larger than the
implementation-calculated size.
If the application provided the external image’s size to
vkCreateImage, for reasons similar to above the implementation would
observe the mismatched size and recognize its inability to comprehend the
image data residing in the extra size.
The implementation, however, must assume that image data resides in the
entire size provided by the application.
The implementation would observe the conflict and reject image creation with
VK_ERROR_INVALID_DRM_FORMAT_MODIFIER_PLANE_LAYOUT_EXT
.
On the other hand, if the application did not provide the external image’s
size to vkCreateImage, then the application would observe after
calling vkGetImageMemoryRequirements that the external image’s size is
larger than the implementation-usable size.
The application would observe the conflict and refuse to bind the
VkImage
to the external memory.
In both cases, the result is explicit failure.
Therefore, an application-provided size provides no benefit, and this
extension should not require it.
This decision renders VkSubresourceLayout::size
an unused field
during image creation, and thus introduces a risk that implementations may
require applications to submit sideband creation parameters in the unused
field.
To prevent implementations from relying on sideband data, this extension
requires the application to set size
to 0.
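For illustration, the following sketch (informative only) creates a single-plane image from an externally provided modifier and row pitch; the device handle and the modifier and rowPitch values are assumed to come from the exporting API, and the remaining VkImageCreateInfo and external-memory parameters are elided.
extern VkDevice device;
extern uint64_t modifier;     // obtained from the exporting API
extern VkDeviceSize rowPitch; // obtained from the exporting API

VkSubresourceLayout planeLayout = {
    0,        // offset
    0,        // size: must be 0, as discussed above
    rowPitch, // rowPitch
    0,        // arrayPitch
    0         // depthPitch
};
VkImageDrmFormatModifierExplicitCreateInfoEXT modifierInfo = {
    VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_EXPLICIT_CREATE_INFO_EXT, // sType
    NULL,        // pNext (a VkExternalMemoryImageCreateInfo would typically follow here)
    modifier,    // drmFormatModifier
    1,           // drmFormatModifierPlaneCount
    &planeLayout // pPlaneLayouts
};
VkImageCreateInfo imageInfo = { VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO };
imageInfo.pNext = &modifierInfo;
imageInfo.tiling = VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT;
// ... remaining image parameters as usual ...
VkImage image;
vkCreateImage(device, &imageInfo, NULL, &image);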
References
Version History
-
Revision 1.0, 2018-08-29 (Chad Versace)
-
First stable revision
-
VK_EXT_inline_uniform_block
- Name String
-
VK_EXT_inline_uniform_block
- Extension Type
-
Device extension
- Registered Extension Number
-
139
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_maintenance1
-
- Contact
-
-
Daniel Rakos aqnuep
-
- Last Modified Date
-
2018-08-01
- IP Status
-
No known IP claims.
- Contributors
-
-
Daniel Rakos, AMD
-
Jeff Bolz, NVIDIA
-
Slawomir Grajewski, Intel
-
Neil Henning, Codeplay
-
This extension introduces the ability to back uniform blocks directly with descriptor sets by storing inline uniform data within descriptor pool storage. Compared to push constants, this new construct allows uniform data to be reused across multiple disjoint sets of draw or dispatch commands, and may enable uniform data to be accessed with fewer indirections than uniforms backed by buffer memory.
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INLINE_UNIFORM_BLOCK_FEATURES_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INLINE_UNIFORM_BLOCK_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_INLINE_UNIFORM_BLOCK_EXT
-
VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_INLINE_UNIFORM_BLOCK_CREATE_INFO_EXT
-
-
Extending VkDescriptorType:
-
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT
-
New Enums
None
New Structures
New Functions
None
New Built-In Variables
None
Issues
1) Do we need a new storage class for inline uniform blocks vs uniform blocks?
RESOLVED: No.
The Uniform
storage class is used to allow the same syntax used for
both uniform buffers and inline uniform blocks.
2) Is the descriptor array index and array size expressed in terms of bytes or dwords for inline uniform block descriptors?
RESOLVED: In bytes, but both must be a multiple of 4, similar to how push
constant ranges are specified.
The descriptorCount
of VkDescriptorSetLayoutBinding
thus
provides the total number of bytes a particular binding with an inline
uniform block descriptor type can hold, while the srcArrayElement
,
dstArrayElement
, and descriptorCount
members of
VkWriteDescriptorSet
, VkCopyDescriptorSet
, and
VkDescriptorUpdateTemplateEntry
(where applicable) specify the byte
offset and number of bytes to write/copy to the binding’s backing store.
Additionally, the stride
member of
VkDescriptorUpdateTemplateEntry
is ignored for inline uniform blocks
and a default value of one is used, meaning that the data to update inline
uniform block bindings with must be contiguous in memory.
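For illustration, the following sketch (informative only) writes 16 bytes of inline uniform data at byte offset 4 of binding 0; the device and descriptor set handles are assumed, and the binding is assumed to have been created with the inline uniform block descriptor type and sufficient size.
extern VkDevice device;
extern VkDescriptorSet descriptorSet;
const float data[4] = { 0.0f, 1.0f, 2.0f, 3.0f };

VkWriteDescriptorSetInlineUniformBlockEXT inlineWrite = {
    VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_INLINE_UNIFORM_BLOCK_EXT, // sType
    NULL,         // pNext
    sizeof(data), // dataSize: a multiple of 4
    data          // pData
};
VkWriteDescriptorSet write = {
    VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,      // sType
    &inlineWrite,                                // pNext
    descriptorSet,                               // dstSet
    0,                                           // dstBinding
    4,                                           // dstArrayElement: byte offset into the block
    sizeof(data),                                // descriptorCount: number of bytes written
    VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT, // descriptorType
    NULL,                                        // pImageInfo
    NULL,                                        // pBufferInfo
    NULL                                         // pTexelBufferView
};
vkUpdateDescriptorSets(device, 1, &write, 0, NULL);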
3) What layout rules apply for uniform blocks corresponding to inline constants?
RESOLVED: They use the same layout rules as uniform buffers.
4) Do we need to add non-uniform indexing features/properties as introduced
by VK_EXT_descriptor_indexing
for inline uniform blocks?
RESOLVED: No, because inline uniform blocks are not allowed to be “arrayed”. A single binding with an inline uniform block descriptor type corresponds to a single uniform block instance and the array indices inside that binding refer to individual offsets within the uniform block (see issue #2). However, this extension does introduce new features/properties about the level of support for update-after-bind inline uniform blocks.
5) Is the descriptorBindingVariableDescriptorCount
feature introduced
by VK_EXT_descriptor_indexing
supported for inline uniform blocks?
RESOLVED: Yes, as long as other inline uniform block specific limits are respected.
6) Do the robustness guarantees of robustBufferAccess
apply to inline
uniform block accesses?
RESOLVED: No, similarly to push constants, as they are not backed by buffer memory like uniform buffers.
Version History
-
Revision 1, 2018-08-01 (Daniel Rakos)
-
Internal revisions
-
VK_EXT_pci_bus_info
- Name String
-
VK_EXT_pci_bus_info
- Extension Type
-
Device extension
- Registered Extension Number
-
213
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Matthaeus G. Chajdas anteru
-
- Last Modified Date
-
2018-12-10
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Daniel Rakos, AMD
-
This extension adds a new query to obtain PCI bus information about a physical device.
Not all physical devices have PCI bus information, either due to the device not being connected to the system through a PCI interface or due to platform specific restrictions and policies. Thus this extension is only expected to be supported by physical devices which can provide the information.
As a consequence, applications should always check for the presence of the extension string for each individual physical device for which they intend to issue the new query, and should not make any assumptions about the availability of the extension on any given platform.
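For illustration, the following sketch (informative only) issues the new query; it assumes a VkPhysicalDevice that advertises this extension and uses the Vulkan 1.1 entry point vkGetPhysicalDeviceProperties2.
extern VkPhysicalDevice physicalDevice;

VkPhysicalDevicePCIBusInfoPropertiesEXT pciInfo = {
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PCI_BUS_INFO_PROPERTIES_EXT, // sType
    NULL                                                           // pNext
};
VkPhysicalDeviceProperties2 props = {
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2, // sType
    &pciInfo                                        // pNext
};
vkGetPhysicalDeviceProperties2(physicalDevice, &props);
// pciInfo.pciDomain, pciInfo.pciBus, pciInfo.pciDevice, and pciInfo.pciFunction
// now describe the device's PCI address.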
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PCI_BUS_INFO_PROPERTIES_EXT
-
New Enums
None.
New Structures
New Functions
None.
Issues
None.
Examples
None.
Version History
-
Revision 2, 2018-12-10 (Daniel Rakos)
-
Changed all members of the new structure to have the uint32_t type
-
-
Revision 1, 2018-10-11 (Daniel Rakos)
-
Initial revision
-
VK_EXT_post_depth_coverage
- Name String
-
VK_EXT_post_depth_coverage
- Extension Type
-
Device extension
- Registered Extension Number
-
156
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2017-07-17
- Interactions and External Dependencies
-
-
This extension requires the SPV_KHR_post_depth_coverage SPIR-V extension.
-
This extension requires GL_ARB_post_depth_coverage or GL_EXT_post_depth_coverage for GLSL-based source languages.
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_KHR_post_depth_coverage
which allows the fragment shader to control whether values in the
SampleMask
built-in input variable reflect the coverage after the
early per-fragment depth and stencil tests are applied.
This extension adds a new PostDepthCoverage
execution mode under the
SampleMaskPostDepthCoverage
capability.
When this mode is specified along with EarlyFragmentTests
, the value of
an input variable decorated with the
SampleMask
built-in
reflects the coverage after the early fragment
tests are applied.
Otherwise, it reflects the coverage before the depth and stencil tests.
When using GLSL source-based shading languages, the post_depth_coverage
layout qualifier from GL_ARB_post_depth_coverage or
GL_EXT_post_depth_coverage maps to the PostDepthCoverage
execution
mode.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New Built-In Variables
None.
New Variable Decoration
None.
New SPIR-V Capabilities
Issues
None yet.
Version History
-
Revision 1, 2017-07-17 (Daniel Koch)
-
Internal revisions
-
VK_EXT_queue_family_foreign
- Name String
-
VK_EXT_queue_family_foreign
- Extension Type
-
Device extension
- Registered Extension Number
-
127
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_memory
-
- Contact
-
-
Chad Versace chadversary
-
- Last Modified Date
-
2017-11-01
- IP Status
-
No known IP claims.
- Contributors
-
-
Chad Versace, Google
-
James Jones, NVIDIA
-
Jason Ekstrand, Intel
-
Jesse Hall, Google
-
Daniel Rakos, AMD
-
Ray Smith, ARM
-
This extension defines a special queue family,
VK_QUEUE_FAMILY_FOREIGN_EXT
, which can be used to transfer ownership
of resources backed by external memory to foreign, external queues.
This is similar to VK_QUEUE_FAMILY_EXTERNAL_KHR
, defined in
VK_KHR_external_memory
.
The key differences between the two are:
-
The queues represented by VK_QUEUE_FAMILY_EXTERNAL_KHR must share the same physical device and the same driver version as the current VkInstance. VK_QUEUE_FAMILY_FOREIGN_EXT has no such restrictions. It can represent devices and drivers from other vendors, and can even represent non-Vulkan-capable devices. -
All resources backed by external memory support VK_QUEUE_FAMILY_EXTERNAL_KHR. Support for VK_QUEUE_FAMILY_FOREIGN_EXT is more restrictive. -
Applications should expect transitions to/from VK_QUEUE_FAMILY_FOREIGN_EXT to be more expensive than transitions to/from VK_QUEUE_FAMILY_EXTERNAL_KHR.
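For illustration, the following sketch (informative only) records an image memory barrier that releases ownership of an externally backed image to a foreign queue; the command buffer, image, queue family index, layouts, and access masks are placeholders for whatever the application actually uses, and the matching acquire is performed by the foreign device outside of Vulkan.
extern VkCommandBuffer cmd;
extern VkImage externalImage;
extern uint32_t graphicsQueueFamilyIndex;

VkImageMemoryBarrier release = {
    VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,   // sType
    NULL,                                     // pNext
    VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,     // srcAccessMask
    0,                                        // dstAccessMask
    VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL, // oldLayout
    VK_IMAGE_LAYOUT_GENERAL,                  // newLayout
    graphicsQueueFamilyIndex,                 // srcQueueFamilyIndex
    VK_QUEUE_FAMILY_FOREIGN_EXT,              // dstQueueFamilyIndex (release to foreign)
    externalImage,                            // image
    { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 } // subresourceRange
};
vkCmdPipelineBarrier(cmd,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, // srcStageMask
    VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,          // dstStageMask
    0, 0, NULL, 0, NULL, 1, &release);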
New Enum Constants
-
Special constants:
-
VK_QUEUE_FAMILY_FOREIGN_EXT
-
Version History
-
Revision 1, 2017-11-01 (Chad Versace)
-
Squashed internal revisions
-
VK_EXT_sample_locations
- Name String
-
VK_EXT_sample_locations
- Extension Type
-
Device extension
- Registered Extension Number
-
144
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2017-08-02
- Contributors
-
-
Mais Alnasser, AMD
-
Matthaeus G. Chajdas, AMD
-
Maciej Jesionowski, AMD
-
Daniel Rakos, AMD
-
Slawomir Grajewski, Intel
-
Jeff Bolz, NVIDIA
-
Bill Licea-Kane, Qualcomm
-
This extension allows an application to modify the locations of samples within a pixel used in rasterization. Additionally, it allows applications to specify different sample locations for each pixel in a group of adjacent pixels, which can increase antialiasing quality (particularly if a custom resolve shader is used that takes advantage of these different locations).
It is common for implementations to optimize the storage of depth values by storing values that can be used to reconstruct depth at each sample location, rather than storing separate depth values for each sample. For example, the depth values from a single triangle may be represented using plane equations. When the depth value for a sample is needed, it is automatically evaluated at the sample location. Modifying the sample locations causes the reconstruction to no longer evaluate the same depth values as when the samples were originally generated, thus the depth aspect of a depth/stencil attachment must be cleared before rendering to it using different sample locations.
Some implementations may need to evaluate depth image values while performing image layout transitions. To accommodate this, instances of the VkSampleLocationsInfoEXT structure can be specified for each situation where an explicit or automatic layout transition has to take place. VkSampleLocationsInfoEXT can be chained from VkImageMemoryBarrier structures to provide sample locations for layout transitions performed by vkCmdWaitEvents and vkCmdPipelineBarrier calls, and VkRenderPassSampleLocationsBeginInfoEXT can be chained from VkRenderPassBeginInfo to provide sample locations for layout transitions performed implicitly by a render pass instance.
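For illustration, the following sketch (informative only) chains custom sample locations into an image memory barrier for a 4-sample depth image; the specific sample positions are arbitrary example values, and the remaining barrier fields are elided.
static const VkSampleLocationEXT locations[4] = {
    { 0.125f, 0.375f }, { 0.625f, 0.125f },
    { 0.375f, 0.875f }, { 0.875f, 0.625f }
};
VkSampleLocationsInfoEXT sampleLocations = {
    VK_STRUCTURE_TYPE_SAMPLE_LOCATIONS_INFO_EXT, // sType
    NULL,                  // pNext
    VK_SAMPLE_COUNT_4_BIT, // sampleLocationsPerPixel
    { 1, 1 },              // sampleLocationGridSize
    4,                     // sampleLocationsCount
    locations              // pSampleLocations
};
VkImageMemoryBarrier barrier = { VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER };
barrier.pNext = &sampleLocations;
// ... fill in access masks, layouts, image, and subresource range for the
//     depth image, then record the barrier with vkCmdPipelineBarrier ...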
New Object Types
None.
New Enum Constants
-
Extending VkImageCreateFlagBits:
-
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT
-
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_SAMPLE_LOCATIONS_INFO_EXT
-
VK_STRUCTURE_TYPE_RENDER_PASS_SAMPLE_LOCATIONS_BEGIN_INFO_EXT
-
VK_STRUCTURE_TYPE_PIPELINE_SAMPLE_LOCATIONS_STATE_CREATE_INFO_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT
-
-
Extending VkDynamicState:
-
VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT
-
New Enums
None.
New Structures
Issues
None.
Version History
-
Revision 1, 2017-08-02 (Daniel Rakos)
-
Internal revisions
-
VK_EXT_sampler_filter_minmax
- Name String
-
VK_EXT_sampler_filter_minmax
- Extension Type
-
Device extension
- Registered Extension Number
-
131
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-05-19
- IP Status
-
No known IP claims.
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Piers Daniell, NVIDIA
-
In unextended Vulkan, minification and magnification filters such as LINEAR allow sampled image lookups to return a filtered texel value produced by computing a weighted average of a collection of texels in the neighborhood of the texture coordinate provided.
This extension provides a new sampler parameter which allows applications to produce a filtered texel value by computing a component-wise minimum (MIN) or maximum (MAX) of the texels that would normally be averaged. The reduction mode is orthogonal to the minification and magnification filter parameters. The filter parameters are used to identify the set of texels used to produce a final filtered value; the reduction mode identifies how these texels are combined.
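For illustration, the following sketch (informative only) creates a sampler whose lookups return the component-wise maximum of the fetched texels; the device handle is assumed and the remaining sampler state is elided.
extern VkDevice device;

VkSamplerReductionModeCreateInfoEXT reductionInfo = {
    VK_STRUCTURE_TYPE_SAMPLER_REDUCTION_MODE_CREATE_INFO_EXT, // sType
    NULL,                             // pNext
    VK_SAMPLER_REDUCTION_MODE_MAX_EXT // reductionMode
};
VkSamplerCreateInfo samplerInfo = { VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO };
samplerInfo.pNext = &reductionInfo;
samplerInfo.magFilter = VK_FILTER_LINEAR; // the filter still selects the texel set
samplerInfo.minFilter = VK_FILTER_LINEAR;
// ... remaining sampler state as usual ...
VkSampler sampler;
vkCreateSampler(device, &samplerInfo, NULL, &sampler);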
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_FILTER_MINMAX_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_SAMPLER_REDUCTION_MODE_CREATE_INFO_EXT
-
-
Extending VkFormatFeatureFlagBits
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_MINMAX_BIT_EXT
-
New Enums
New Functions
None.
New Built-In Variables
None.
New SPIR-V Capabilities
None.
Issues
None.
Examples
None.
Version History
-
Revision 2, 2017-05-19 (Piers Daniell)
-
Renamed to EXT
-
-
Revision 1, 2017-03-25 (Jeff Bolz)
-
Internal revisions
-
VK_EXT_scalar_block_layout
- Name String
-
VK_EXT_scalar_block_layout
- Extension Type
-
Device extension
- Registered Extension Number
-
222
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Tobias Hector tobski
-
- Last Modified Date
-
2018-11-14
- Contributors
-
-
Jeff Bolz
-
Jan-Harald Fredriksen
-
Graeme Leese
-
Jason Ekstrand
-
John Kessenich
-
Short Description
Enables C-like structure layout for SPIR-V blocks.
Description
This extension modifies the alignment rules for uniform buffers, storage buffers and push constants, allowing non-scalar types to be aligned solely based on the size of their components, without additional requirements.
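For illustration, the following sketch (informative only) enables the feature at device creation; queue creation and extension enablement are elided.
VkPhysicalDeviceScalarBlockLayoutFeaturesEXT scalarLayoutFeatures = {
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SCALAR_BLOCK_LAYOUT_FEATURES_EXT, // sType
    NULL,   // pNext
    VK_TRUE // scalarBlockLayout
};
VkDeviceCreateInfo deviceInfo = { VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO };
deviceInfo.pNext = &scalarLayoutFeatures;
// ... queue create infos and enabled extensions (including
//     VK_EXT_scalar_block_layout) as usual, then vkCreateDevice ...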
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SCALAR_BLOCK_LAYOUT_FEATURES_EXT
-
New Structures
Issues
None.
Version History
-
Revision 1, 2018-11-14 (Tobias Hector)
-
Initial draft
-
VK_EXT_separate_stencil_usage
- Name String
-
VK_EXT_separate_stencil_usage
- Extension Type
-
Device extension
- Registered Extension Number
-
247
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2018-11-08
- IP Status
-
No known IP claims.
- Contributors
-
-
Daniel Rakos, AMD
-
Jordan Logan, AMD
-
This extension allows specifying separate usage flags for the stencil aspect of images with a depth-stencil format at image creation time.
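For illustration, the following sketch (informative only) creates a depth/stencil image whose stencil aspect is restricted to attachment usage while the image as a whole may also be sampled; the remaining image parameters are elided.
VkImageStencilUsageCreateInfoEXT stencilUsage = {
    VK_STRUCTURE_TYPE_IMAGE_STENCIL_USAGE_CREATE_INFO_EXT, // sType
    NULL,                                       // pNext
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT // stencilUsage
};
VkImageCreateInfo imageInfo = { VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO };
imageInfo.pNext = &stencilUsage;
imageInfo.format = VK_FORMAT_D32_SFLOAT_S8_UINT;
imageInfo.usage = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT |
                  VK_IMAGE_USAGE_SAMPLED_BIT; // applies to the depth aspect
// ... remaining image parameters as usual, then vkCreateImage ...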
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_IMAGE_STENCIL_USAGE_CREATE_INFO_EXT
-
New Enums
None.
New Structures
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2018-11-08 (Daniel Rakos)
-
Internal revisions.
-
VK_EXT_shader_stencil_export
- Name String
-
VK_EXT_shader_stencil_export
- Extension Type
-
Device extension
- Registered Extension Number
-
141
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Dominik Witczak dominikwitczakamd
-
- Last Modified Date
-
2017-07-19
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_EXT_shader_stencil_export SPIR-V extension.
-
- Contributors
-
-
Dominik Witczak, AMD
-
Daniel Rakos, AMD
-
Rex Xu, AMD
-
This extension adds support for the SPIR-V extension SPV_EXT_shader_stencil_export, providing a mechanism whereby a shader may generate the stencil reference value per invocation. When stencil testing is enabled, this allows the test to be performed against the value generated in the shader.
Version History
-
Revision 1, 2017-07-19 (Dominik Witczak)
-
Initial draft
-
VK_EXT_shader_subgroup_ballot
- Name String
-
VK_EXT_shader_subgroup_ballot
- Extension Type
-
Device extension
- Registered Extension Number
-
65
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2016-11-28
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires the SPV_KHR_shader_ballot SPIR-V extension.
-
This extension requires the GL_ARB_shader_ballot extension for GLSL source languages.
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Neil Henning, Codeplay
-
Daniel Koch, NVIDIA Corporation
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_KHR_shader_ballot
This extension provides the ability for a group of invocations, which execute in parallel, to do limited forms of cross-invocation communication via a group broadcast of an invocation value, or broadcast of a bitarray representing a predicate value from each invocation in the group.
This extension provides access to a number of additional built-in shader variables in Vulkan:
-
SubgroupEqMaskKHR
, which contains the subgroup mask of the current subgroup invocation, -
SubgroupGeMaskKHR
, which contains the subgroup mask of the invocations greater than or equal to the current invocation, -
SubgroupGtMaskKHR
, which contains the subgroup mask of the invocations greater than the current invocation, -
SubgroupLeMaskKHR
, which contains the subgroup mask of the invocations less than or equal to the current invocation, -
SubgroupLtMaskKHR
, which contains the subgroup mask of the invocations less than the current invocation, -
SubgroupLocalInvocationId
, which contains the index of an invocation within a subgroup, and -
SubgroupSize
, which contains the maximum number of invocations in a subgroup.
Additionally, this extension provides access to the new SPIR-V instructions:
-
OpSubgroupBallotKHR
, -
OpSubgroupFirstInvocationKHR
, and -
OpSubgroupReadInvocationKHR
.
When using GLSL source-based shader languages, the following variables and shader functions from GL_ARB_shader_ballot can map to these SPIR-V built-in decorations and instructions:
-
in uint64_t gl_SubGroupEqMaskARB;
→SubgroupEqMaskKHR
, -
in uint64_t gl_SubGroupGeMaskARB;
→SubgroupGeMaskKHR
, -
in uint64_t gl_SubGroupGtMaskARB;
→SubgroupGtMaskKHR
, -
in uint64_t gl_SubGroupLeMaskARB;
→SubgroupLeMaskKHR
, -
in uint64_t gl_SubGroupLtMaskARB;
→SubgroupLtMaskKHR
, -
in uint gl_SubGroupInvocationARB;
→SubgroupLocalInvocationId
, -
uniform uint gl_SubGroupSizeARB;
→SubgroupSize
, -
ballotARB
() →OpSubgroupBallotKHR
, -
readFirstInvocationARB
() →OpSubgroupFirstInvocationKHR
, and -
readInvocationARB
() →OpSubgroupReadInvocationKHR
.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New Built-In Variables
New SPIR-V Capabilities
Issues
None.
Version History
-
Revision 1, 2016-11-28 (Daniel Koch)
-
Initial draft
-
VK_EXT_shader_subgroup_vote
- Name String
-
VK_EXT_shader_subgroup_vote
- Extension Type
-
Device extension
- Registered Extension Number
-
66
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2016-11-28
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires the SPV_KHR_subgroup_vote SPIR-V extension.
-
This extension requires the GL_ARB_shader_group_vote extension for GLSL source languages.
-
- Contributors
-
-
Neil Henning, Codeplay
-
Daniel Koch, NVIDIA Corporation
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_KHR_subgroup_vote
This extension provides new SPIR-V instructions:
-
OpSubgroupAllKHR
, -
OpSubgroupAnyKHR
, and -
OpSubgroupAllEqualKHR
.
to compute the composite of a set of boolean conditions across a group of shader invocations that are running concurrently (a subgroup). These composite results may be used to execute shaders more efficiently on a VkPhysicalDevice.
When using GLSL source-based shader languages, the following shader functions from GL_ARB_shader_group_vote can map to these SPIR-V instructions:
-
anyInvocationARB
() →OpSubgroupAnyKHR
, -
allInvocationsARB
() →OpSubgroupAllKHR
, and -
allInvocationsEqualARB
() →OpSubgroupAllEqualKHR
.
The subgroup across which the boolean conditions are evaluated is implementation-dependent, and this extension provides no guarantee over how individual shader invocations are assigned to subgroups. In particular, a subgroup has no necessary relationship with the compute shader local workgroup — any pair of shader invocations in a compute local workgroup may execute in different subgroups as used by these instructions.
Compute shaders operate on an explicitly specified group of threads (a local workgroup), but many implementations will also group non-compute shader invocations and execute them concurrently. When executing code like
if (condition) {
result = do_fast_path();
} else {
result = do_general_path();
}
where condition
diverges between invocations, an implementation might
first execute do_fast_path
() for the invocations where condition
is true and leave the other invocations dormant.
Once do_fast_path
() returns, it might call do_general_path
() for
invocations where condition
is false
and leave the other
invocations dormant.
In this case, the shader executes both the fast and the general path and
might be better off just using the general path for all invocations.
This extension provides the ability to avoid divergent execution by evaluating a condition across an entire subgroup using code like:
if (allInvocationsARB(condition)) {
result = do_fast_path();
} else {
result = do_general_path();
}
The built-in function allInvocationsARB
() will return the same value
for all invocations in the group, so the group will either execute
do_fast_path
() or do_general_path
(), but never both.
For example, shader code might want to evaluate a complex function
iteratively by starting with an approximation of the result and then
refining the approximation.
Some input values may require a small number of iterations to generate an
accurate result (do_fast_path
) while others require a larger number
(do_general_path
).
In another example, shader code might want to evaluate a complex function
(do_general_path
) that can be greatly simplified when assuming a
specific value for one of its inputs (do_fast_path
).
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New Built-In Variables
None.
New SPIR-V Capabilities
Issues
None.
Version History
-
Revision 1, 2016-11-28 (Daniel Koch)
-
Initial draft
-
VK_EXT_shader_viewport_index_layer
- Name String
-
VK_EXT_shader_viewport_index_layer
- Extension Type
-
Device extension
- Registered Extension Number
-
163
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2017-08-08
- Interactions and External Dependencies
-
-
This extension requires the SPV_EXT_shader_viewport_index_layer SPIR-V extension.
-
This extension requires the GL_ARB_shader_viewport_layer_array, GL_AMD_vertex_shader_layer, GL_AMD_vertex_shader_viewport_index, or GL_NV_viewport_array2 extensions for GLSL source languages.
-
This extension requires the
multiViewport
feature. -
This extension interacts with the
tessellationShader
feature.
-
- Contributors
-
-
Piers Daniell, NVIDIA
-
Jeff Bolz, NVIDIA
-
Jan-Harald Fredriksen, ARM
-
Daniel Rakos, AMD
-
Slawomir Grajewski, Intel
-
This extension adds support for the ShaderViewportIndexLayerEXT
capability from the SPV_EXT_shader_viewport_index_layer extension in
Vulkan.
This extension allows variables decorated with the Layer
and
ViewportIndex
built-ins to be exported from vertex or tessellation
shaders, using the ShaderViewportIndexLayerEXT
capability.
When using GLSL source-based shading languages, the gl_ViewportIndex
and gl_Layer
built-in variables map to the SPIR-V ViewportIndex
and Layer
built-in decorations, respectively.
Behaviour of these variables is extended as described in the GL_ARB_shader_viewport_layer_array specification (or the precursor GL_AMD_vertex_shader_layer, GL_AMD_vertex_shader_viewport_index, and GL_NV_viewport_array2 extensions).
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New or Modified Built-In Variables
-
(modified)
Layer
-
(modified)
ViewportIndex
New Variable Decoration
None.
New SPIR-V Capabilities
Issues
None yet!
Version History
-
Revision 1, 2017-08-08 (Daniel Koch)
-
Internal drafts
-
VK_EXT_swapchain_colorspace
- Name String
-
VK_EXT_swapchain_colorspace
- Extension Type
-
Instance extension
- Registered Extension Number
-
105
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Courtney Goeltzenleuchter courtney-g
-
- Last Modified Date
-
2017-03-15
- IP Status
-
No known IP claims.
- Contributors
-
-
Courtney Goeltzenleuchter, Google
-
New Enum Constants
-
Extending VkColorSpaceKHR:
-
VK_COLOR_SPACE_DISPLAY_P3_NONLINEAR_EXT
- supports the Display-P3 color space and applies an sRGB-like transfer function. -
VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT
- supports the extended sRGB color space and applies a linear transfer function. -
VK_COLOR_SPACE_EXTENDED_SRGB_NONLINEAR_EXT
- supports the extended sRGB color space with an sRGB nonlinear transfer function. -
VK_COLOR_SPACE_DCI_P3_LINEAR_EXT
- supports the DCI-P3 color space and applies a linear OETF. -
VK_COLOR_SPACE_DCI_P3_NONLINEAR_EXT
- supports the DCI-P3 color space and applies the Gamma 2.6 OETF. -
VK_COLOR_SPACE_BT709_LINEAR_EXT
- supports the BT709 color space and applies a linear transfer function. -
VK_COLOR_SPACE_BT709_NONLINEAR_EXT
- supports the BT709 color space and applies the SMPTE 170M OETF. -
VK_COLOR_SPACE_BT2020_LINEAR_EXT
- supports the BT2020 color space and applies a linear OETF. -
VK_COLOR_SPACE_HDR10_ST2084_EXT
- supports HDR10 (BT2020 color space and applies the SMPTE ST2084 Perceptual Quantizer (PQ) OETF). -
VK_COLOR_SPACE_DOLBYVISION_EXT
- supports Dolby Vision (BT2020 color space, proprietary encoding, and applies the SMPTE ST2084 OETF). -
VK_COLOR_SPACE_HDR10_HLG_EXT
- supports HDR10 (BT2020 color space and applies the Hybrid Log Gamma (HLG) OETF). -
VK_COLOR_SPACE_ADOBERGB_LINEAR_EXT
- supports the AdobeRGB color space and applies a linear OETF. -
VK_COLOR_SPACE_ADOBERGB_NONLINEAR_EXT
- supports the AdobeRGB color space and applies the Gamma 2.2 OETF. -
VK_COLOR_SPACE_PASS_THROUGH_EXT
- color components are used “as is”. This is intended to allow applications to supply data for color spaces not described here.
-
Issues
1) Does the spec need to specify which kinds of image formats support the color spaces?
RESOLVED: Pixel format is independent of color space (though some color spaces really want / need floating point color components to be useful). Therefore, this extension does not document which formats support which color spaces. An application can call vkGetPhysicalDeviceSurfaceFormatsKHR to query what a particular implementation supports.
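For illustration, the following sketch (informative only) performs that query and looks for Display-P3 support; the physical device and surface handles are assumed, and the color space chosen is just an example.
extern VkPhysicalDevice physicalDevice;
extern VkSurfaceKHR surface;

uint32_t count = 0;
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &count, NULL);
VkSurfaceFormatKHR *formats =
    (VkSurfaceFormatKHR *)malloc(count * sizeof(VkSurfaceFormatKHR));
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &count, formats);
for (uint32_t i = 0; i < count; ++i) {
    if (formats[i].colorSpace == VK_COLOR_SPACE_DISPLAY_P3_NONLINEAR_EXT) {
        // formats[i].format / formats[i].colorSpace can be used for the swapchain.
    }
}
free(formats);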
2) How does application determine if HW supports appropriate transfer function for a colorspace?
RESOLVED: The extension indicates that the implementation must not perform the OETF encoding if it is not sRGB; that responsibility falls to the application shaders. Any other native OETF / EOTF functions supported by an implementation can be described by a separate extension.
Version History
-
Revision 1, 2016-12-27 (Courtney Goeltzenleuchter)
-
Initial version
-
-
Revision 2, 2017-01-19 (Courtney Goeltzenleuchter)
-
Add pass through and multiple options for BT2020.
-
Clean up some issues with equations not displaying properly.
-
-
Revision 3, 2017-06-23 (Courtney Goeltzenleuchter)
-
Add extended sRGB non-linear enum.
-
VK_EXT_transform_feedback
- Name String
-
VK_EXT_transform_feedback
- Extension Type
-
Device extension
- Registered Extension Number
-
29
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2018-10-09
- Contributors
-
-
Baldur Karlsson, Valve
-
Boris Zanin, Mobica
-
Daniel Rakos, AMD
-
Donald Scorgie, Imagination
-
Henri Verbeet, CodeWeavers
-
Jan-Harald Fredriksen, Arm
-
Jason Ekstrand, Intel
-
Jeff Bolz, NVIDIA
-
Jesse Barker, Unity
-
Jesse Hall, Google
-
Pierre-Loup Griffais, Valve
-
Philip Rebohle, DXVK
-
Ruihao Zhang, Qualcomm
-
Samuel Pitoiset, Valve
-
Slawomir Grajewski, Intel
-
Stu Smith, Imagination Technologies
-
This extension adds transform feedback to the Vulkan API by exposing the
SPIR-V TransformFeedback
and GeometryStreams
capabilities to
capture vertex, tessellation or geometry shader outputs to one or more
buffers.
It adds API functionality to bind transform feedback buffers to capture the
primitives emitted by the graphics pipeline from SPIR-V outputs decorated
for transform feedback.
The transform feedback capture can be paused and resumed by way of storing
and retrieving a byte counter.
The captured data can be drawn again where the vertex count is derived from
the byte counter without CPU intervention.
If the implementation is capable, a vertex stream other than zero can be
rasterized.
All these features are designed to match the full capabilities of OpenGL core transform feedback functionality and beyond. Many of the features are optional to allow base OpenGL ES GPUs to also implement this extension.
The primary purpose of the functionality exposed by this extension is to support translation layers from other 3D APIs. This functionality is not considered forward looking, and is not expected to be promoted to a KHR extension or to core Vulkan. Unless this is needed for translation, it is recommended that developers use alternative techniques of using the GPU to process and capture vertex data.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_FEATURES_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_STREAM_CREATE_INFO_EXT
-
-
Extending VkQueryType:
-
VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT
-
-
Extending VkBufferUsageFlagBits:
-
VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_BUFFER_BIT_EXT
-
VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_COUNTER_BUFFER_BIT_EXT
-
-
Extending VkAccessFlagBits:
-
VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT
-
VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT
-
VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT
-
-
Extending VkPipelineStageFlagBits:
-
VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT
-
New Structures
New Functions
Issues
1) Should we include pause/resume functionality?
RESOLVED: Yes, this is needed to ease layering other APIs which have this
functionality.
To pause use vkCmdEndTransformFeedbackEXT
and provide valid buffer
handles in the pCounterBuffers
array and offsets in the
pCounterBufferOffsets
array for the implementation to save the resume
points.
Then to resume use vkCmdBeginTransformFeedbackEXT
with the previous
pCounterBuffers
and pCounterBufferOffsets
values.
Between the pause and resume there needs to be a memory barrier for the
counter buffers with a source access of
VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT
at pipeline stage
VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT
to a destination access
of VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT
at pipeline stage
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
.
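For illustration, the following sketch (informative only) pauses and later resumes capture with a single counter buffer; the command buffer and counter buffer handles are assumed, and any other work recorded between the pause and the resume is elided.
extern VkCommandBuffer cmd;
extern VkBuffer counterBuffer;
VkDeviceSize counterOffset = 0;

// Pause: the implementation writes the current byte counter to counterBuffer.
vkCmdEndTransformFeedbackEXT(cmd, 0, 1, &counterBuffer, &counterOffset);

// Make the counter write visible to the subsequent resume.
VkBufferMemoryBarrier counterBarrier = {
    VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,            // sType
    NULL,                                               // pNext
    VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT, // srcAccessMask
    VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT,  // dstAccessMask
    VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED,   // queue family indices
    counterBuffer, 0, VK_WHOLE_SIZE                     // buffer, offset, size
};
vkCmdPipelineBarrier(cmd,
    VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT, // srcStageMask
    VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT,          // dstStageMask
    0, 0, NULL, 1, &counterBarrier, 0, NULL);

// Resume: the byte counter is reloaded so capture continues where it stopped.
vkCmdBeginTransformFeedbackEXT(cmd, 0, 1, &counterBuffer, &counterOffset);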
2) How does this interact with multiview?
RESOLVED: Transform feedback cannot be made active in a render pass with multiview enabled.
3) How should queries be done?
RESOLVED: There is a new query type
VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT
.
A query pool created with this type will capture 2 integers -
numPrimitivesWritten and numPrimitivesNeeded - for the specified vertex
stream output from the last vertex processing stage.
The vertex stream output queried is zero by default, but can be specified
with the new vkCmdBeginQueryIndexedEXT
and
vkCmdEndQueryIndexedEXT
commands.
Version History
-
Revision 1, 2018-10-09 (Piers Daniell)
-
Internal revisions
-
VK_EXT_validation_cache
- Name String
-
VK_EXT_validation_cache
- Extension Type
-
Device extension
- Registered Extension Number
-
161
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Cort Stratton cdwfs
-
- Last Modified Date
-
2017-08-29
- IP Status
-
No known IP claims.
- Contributors
-
-
Cort Stratton, Google
-
Chris Forbes, Google
-
This extension provides a mechanism for caching the results of potentially expensive internal validation operations across multiple runs of a Vulkan application. At the core is the VkValidationCacheEXT object type, which is managed similarly to the existing VkPipelineCache.
The new struct VkShaderModuleValidationCacheCreateInfoEXT can be
included in the pNext
chain at vkCreateShaderModule time.
It contains a VkValidationCacheEXT to use when validating the
VkShaderModule.
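For illustration, the following sketch (informative only) creates an empty validation cache and passes it to shader module creation; the device handle is assumed, previously saved cache data could be supplied through pInitialData, and the SPIR-V code itself is elided.
extern VkDevice device;

PFN_vkCreateValidationCacheEXT pfnCreateValidationCacheEXT =
    (PFN_vkCreateValidationCacheEXT)vkGetDeviceProcAddr(device, "vkCreateValidationCacheEXT");

VkValidationCacheCreateInfoEXT cacheInfo = {
    VK_STRUCTURE_TYPE_VALIDATION_CACHE_CREATE_INFO_EXT, // sType
    NULL, // pNext
    0,    // flags
    0,    // initialDataSize
    NULL  // pInitialData
};
VkValidationCacheEXT validationCache;
pfnCreateValidationCacheEXT(device, &cacheInfo, NULL, &validationCache);

VkShaderModuleValidationCacheCreateInfoEXT shaderCacheInfo = {
    VK_STRUCTURE_TYPE_SHADER_MODULE_VALIDATION_CACHE_CREATE_INFO_EXT, // sType
    NULL,           // pNext
    validationCache // validationCache
};
VkShaderModuleCreateInfo moduleInfo = { VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO };
moduleInfo.pNext = &shaderCacheInfo;
// ... codeSize and pCode as usual, then vkCreateShaderModule ...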
New Object Types
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_VALIDATION_CACHE_CREATE_INFO_EXT
-
VK_STRUCTURE_TYPE_SHADER_MODULE_VALIDATION_CACHE_CREATE_INFO_EXT
-
New Functions
Issues
None.
Version History
-
Revision 1, 2017-08-29 (Cort Stratton)
-
Initial draft
-
VK_EXT_validation_flags
- Name String
-
VK_EXT_validation_flags
- Extension Type
-
Instance extension
- Registered Extension Number
-
62
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Tobin Ehlis tobine
-
- Last Modified Date
-
2016-09-06
- IP Status
-
No known IP claims.
- Contributors
-
-
Tobin Ehlis, Google
-
Courtney Goeltzenleuchter, Google
-
This extension provides the VkValidationFlagsEXT struct that can be
included in the pNext
chain of the VkInstanceCreateInfo
structure passed as the pCreateInfo
parameter of
vkCreateInstance.
The new struct contains an array of VkValidationCheckEXT values that
will be disabled by the validation layers.
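For illustration, the following sketch (informative only) disables shader validation at instance creation; the remaining instance parameters are elided.
VkValidationCheckEXT disabledChecks[] = { VK_VALIDATION_CHECK_SHADERS_EXT };

VkValidationFlagsEXT validationFlags = {
    VK_STRUCTURE_TYPE_VALIDATION_FLAGS_EXT, // sType
    NULL,          // pNext
    1,             // disabledValidationCheckCount
    disabledChecks // pDisabledValidationChecks
};
VkInstanceCreateInfo instanceInfo = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
instanceInfo.pNext = &validationFlags;
// ... application info, enabled layers and extensions as usual, then vkCreateInstance ...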
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_VALIDATION_FLAGS_EXT
-
New Enums
New Structures
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2016-08-26 (Courtney Goeltzenleuchter)
-
Initial draft
-
VK_EXT_vertex_attribute_divisor
- Name String
-
VK_EXT_vertex_attribute_divisor
- Extension Type
-
Device extension
- Registered Extension Number
-
191
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Vikram Kushwaha vkushwaha
-
- Last Modified Date
-
2018-08-03
- IP Status
-
No known IP claims.
- Contributors
-
-
Vikram Kushwaha, NVIDIA
-
Jason Ekstrand, Intel
-
This extension allows instance-rate vertex attributes to be repeated for a certain number of instances instead of advancing for every instance when instanced rendering is enabled.
New Object Types
None.
New Enum Constants
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_PROPERTIES_EXT
-
VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_DIVISOR_STATE_CREATE_INFO_EXT
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_FEATURES_EXT
New Enums
None.
New Structures
New Functions
None.
Issues
1) What is the effect of a non-zero value for firstInstance
?
RESOLVED: The Vulkan API should follow the OpenGL convention and offset
attribute fetching by firstInstance
while computing vertex attribute
offsets.
2) Should zero be an allowed divisor?
RESOLVED: Yes. A zero divisor means the vertex attribute is repeated for all instances.
Examples
To create a vertex binding such that the first binding uses instanced rendering and the same attribute is used for every 4 draw instances, an application could use the following set of structures:
const VkVertexInputBindingDivisorDescriptionEXT divisorDesc =
{
    0, // binding
    4  // divisor
};
const VkPipelineVertexInputDivisorStateCreateInfoEXT divisorInfo =
{
    VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_DIVISOR_STATE_CREATE_INFO_EXT, // sType
    NULL,         // pNext
    1,            // vertexBindingDivisorCount
    &divisorDesc  // pVertexBindingDivisors
};
const VkVertexInputBindingDescription binding =
{
    0,                            // binding
    sizeof(Vertex),               // stride
    VK_VERTEX_INPUT_RATE_INSTANCE // inputRate
};
const VkPipelineVertexInputStateCreateInfo viInfo =
{
    VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO, // sType
    &divisorInfo,                                              // pNext
    ...
};
//...
Version History
-
Revision 1, 2017-12-04 (Vikram Kushwaha)
-
First Version
-
-
Revision 2, 2018-07-16 (Jason Ekstrand)
-
Adjust the interaction between
divisor
andfirstInstance
to match the OpenGL convention. -
Disallow divisors of zero.
-
-
Revision 3, 2018-08-03 (Vikram Kushwaha)
-
Allow a zero divisor.
-
Add a physical device features structure to query/enable this feature.
-
VK_AMD_buffer_marker
- Name String
-
VK_AMD_buffer_marker
- Extension Type
-
Device extension
- Registered Extension Number
-
180
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2018-01-26
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Jaakko Konttinen, AMD
-
Daniel Rakos, AMD
-
This extension adds a new operation to execute pipelined writes of small
marker values into a VkBuffer
object.
The primary purpose of these markers is to facilitate the development of debugging tools for tracking which pipelined command contributed to device loss.
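For illustration, the following sketch (informative only) writes a marker value once all previously submitted commands have completed; the device, command buffer, and marker buffer handles are assumed, and the marker value is arbitrary.
extern VkDevice device;
extern VkCommandBuffer cmd;
extern VkBuffer markerBuffer; // a buffer created for marker writes

PFN_vkCmdWriteBufferMarkerAMD pfnCmdWriteBufferMarkerAMD =
    (PFN_vkCmdWriteBufferMarkerAMD)vkGetDeviceProcAddr(device, "vkCmdWriteBufferMarkerAMD");

pfnCmdWriteBufferMarkerAMD(cmd,
    VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, // pipelineStage
    markerBuffer,                         // dstBuffer
    0,                                    // dstOffset
    0xA0000001u);                         // marker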
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
Examples
None.
Version History
-
Revision 1, 2018-01-26 (Jaakko Konttinen)
-
Initial revision
-
VK_AMD_gcn_shader
- Name String
-
VK_AMD_gcn_shader
- Extension Type
-
Device extension
- Registered Extension Number
-
26
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Dominik Witczak dominikwitczakamd
-
- Last Modified Date
-
2016-05-30
- IP Status
-
No known IP claims.
- Contributors
-
-
Dominik Witczak, AMD
-
Daniel Rakos, AMD
-
Rex Xu, AMD
-
Graham Sellers, AMD
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_AMD_gcn_shader
editing-note
Shouldn’t the SPV extension be in the Interactions and External Dependencies block?
Version History
-
Revision 1, 2016-05-30 (Dominik Witczak)
-
Initial draft
-
VK_AMD_gpu_shader_half_float
- Name String
-
VK_AMD_gpu_shader_half_float
- Extension Type
-
Device extension
- Registered Extension Number
-
37
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Dominik Witczak dominikwitczakamd
-
- Last Modified Date
-
2016-09-21
- IP Status
-
No known IP claims.
- Contributors
-
-
Daniel Rakos, AMD
-
Dominik Witczak, AMD
-
Donglin Wei, AMD
-
Graham Sellers, AMD
-
Qun Lin, AMD
-
Rex Xu, AMD
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_AMD_gpu_shader_half_float
editing-note
Shouldn’t the SPV extension be in the Interactions and External Dependencies block?
Version History
-
Revision 1, 2016-09-21 (Dominik Witczak)
-
Initial draft
-
VK_AMD_gpu_shader_int16
- Name String
-
VK_AMD_gpu_shader_int16
- Extension Type
-
Device extension
- Registered Extension Number
-
133
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Qun Lin linqun
-
- Last Modified Date
-
2017-06-08
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_AMD_gpu_shader_int16 SPIR-V extension.
-
- Contributors
-
-
Daniel Rakos, AMD
-
Dominik Witczak, AMD
-
Matthaeus G. Chajdas, AMD
-
Rex Xu, AMD
-
Timothy Lottes, AMD
-
Zhi Cai, AMD
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_AMD_gpu_shader_int16
Version History
-
Revision 1, 2017-06-18 (Dominik Witczak)
-
First version.
-
VK_AMD_memory_overallocation_behavior
- Name String
-
VK_AMD_memory_overallocation_behavior
- Extension Type
-
Device extension
- Registered Extension Number
-
190
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Martin Dinkov mdinkov
-
- Last Modified Date
-
2018-09-19
- IP Status
-
No known IP claims.
- Contributors
-
-
Martin Dinkov, AMD
-
Matthaeus Chajdas, AMD
-
Daniel Rakos, AMD
-
Jon Campbell, AMD
-
This extension allows controlling whether explicit overallocation beyond the device memory heap sizes (reported by VkPhysicalDeviceMemoryProperties) is allowed or not. Overallocation may lead to performance loss and is not supported for all platforms.
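For illustration, the following sketch (informative only) disallows overallocation at device creation; queue creation and extension enablement are elided.
VkDeviceMemoryOverallocationCreateInfoAMD overallocInfo = {
    VK_STRUCTURE_TYPE_DEVICE_MEMORY_OVERALLOCATION_CREATE_INFO_AMD, // sType
    NULL,                                            // pNext
    VK_MEMORY_OVERALLOCATION_BEHAVIOR_DISALLOWED_AMD // overallocationBehavior
};
VkDeviceCreateInfo deviceInfo = { VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO };
deviceInfo.pNext = &overallocInfo;
// ... queue create infos and enabled extensions as usual, then vkCreateDevice ...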
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DEVICE_MEMORY_OVERALLOCATION_CREATE_INFO_AMD
-
New Enums
New Structures
New Functions
None.
Examples
None.
Version History
-
Revision 1, 2018-09-19 (Martin Dinkov)
-
Initial draft.
-
VK_AMD_mixed_attachment_samples
- Name String
-
VK_AMD_mixed_attachment_samples
- Extension Type
-
Device extension
- Registered Extension Number
-
137
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Matthaeus G. Chajdas anteru
-
- Last Modified Date
-
2017-07-24
- Contributors
-
-
Mais Alnasser, AMD
-
Matthaeus G. Chajdas, AMD
-
Maciej Jesionowski, AMD
-
Daniel Rakos, AMD
-
This extension enables applications to use multisampled rendering with a depth/stencil sample count that is larger than the color sample count. Having a depth/stencil sample count larger than the color sample count allows maintaining geometry and coverage information at a higher sample rate than color information. All samples are depth/stencil tested, but only the first color sample count number of samples get a corresponding color output.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2017-07-24 (Daniel Rakos)
-
Internal revisions
-
VK_AMD_rasterization_order
- Name String
-
VK_AMD_rasterization_order
- Extension Type
-
Device extension
- Registered Extension Number
-
19
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2016-04-25
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Jaakko Konttinen, AMD
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Dominik Witczak, AMD
-
This extension introduces the possibility for the application to control the order of primitive rasterization. In unextended Vulkan, the following stages are guaranteed to execute in API order:
-
depth bounds test
-
stencil test, stencil op, and stencil write
-
depth test and depth write
-
occlusion queries
-
blending, logic op, and color write
This extension enables applications to opt into a relaxed, implementation defined primitive rasterization order that may allow better parallel processing of primitives and thus enabling higher primitive throughput. It is applicable in cases where the primitive rasterization order is known to not affect the output of the rendering or any differences caused by a different rasterization order are not a concern from the point of view of the application’s purpose.
A few examples of cases when using the relaxed primitive rasterization order would not have an effect on the final rendering:
-
If the primitives rendered are known to not overlap in framebuffer space.
-
If depth testing is used with a comparison operator of
VK_COMPARE_OP_LESS
,VK_COMPARE_OP_LESS_OR_EQUAL
,VK_COMPARE_OP_GREATER
, orVK_COMPARE_OP_GREATER_OR_EQUAL
, and the primitives rendered are known to not overlap in clip space. -
If depth testing is not used and blending is enabled for all attachments with a commutative blend operator.
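For illustration, the following sketch (informative only) opts a pipeline into the relaxed rasterization order; the remaining rasterization state and pipeline creation parameters are elided.
VkPipelineRasterizationStateRasterizationOrderAMD rasterOrder = {
    VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_RASTERIZATION_ORDER_AMD, // sType
    NULL,                              // pNext
    VK_RASTERIZATION_ORDER_RELAXED_AMD // rasterizationOrder
};
VkPipelineRasterizationStateCreateInfo rasterState = {
    VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO // sType
};
rasterState.pNext = &rasterOrder;
// ... remaining rasterization state, then reference rasterState from
//     VkGraphicsPipelineCreateInfo::pRasterizationState ...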
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_RASTERIZATION_ORDER_AMD
-
New Enums
New Structures
New Functions
None
Issues
1) How is this extension useful to application developers?
RESOLVED: Allows them to increase primitive throughput for cases when strict API order rasterization is not important due to the nature of the content, the configuration used, or the requirements towards the output of the rendering.
2) How does this extension interact with content optimizations aiming to reduce overdraw by appropriately ordering the input primitives?
RESOLVED: While the relaxed rasterization order might somewhat limit the effectiveness of such content optimizations, most of the benefits of it are expected to be retained even when the relaxed rasterization order is used, so applications should still apply these optimizations even if they intend to use the extension.
3) Are there any guarantees about the primitive rasterization order when using the new relaxed mode?
RESOLVED: No. In this case the rasterization order is completely implementation dependent, but in practice it is expected to partially still follow the order of incoming primitives.
4) Does the new relaxed rasterization order have any adverse effect on repeatability and other invariance rules of the API?
RESOLVED: Yes, in the sense that it extends the list of exceptions when the repeatability requirement does not apply.
Examples
None
Issues
None
Version History
-
Revision 1, 2016-04-25 (Daniel Rakos)
-
Initial draft.
-
VK_AMD_shader_ballot
- Name String
-
VK_AMD_shader_ballot
- Extension Type
-
Device extension
- Registered Extension Number
-
38
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Dominik Witczak dominikwitczakamd
-
- Last Modified Date
-
2016-09-19
- IP Status
-
No known IP claims.
- Contributors
-
-
Qun Lin, AMD
-
Graham Sellers, AMD
-
Daniel Rakos, AMD
-
Rex Xu, AMD
-
Dominik Witczak, AMD
-
Matthäus G. Chajdas, AMD
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_AMD_shader_ballot
editing-note
Shouldn’t the SPV extension be in the Interactions and External Dependencies block?
Version History
-
Revision 1, 2016-09-19 (Dominik Witczak)
-
Initial draft
-
VK_AMD_shader_core_properties
- Name String
-
VK_AMD_shader_core_properties
- Extension Type
-
Device extension
- Registered Extension Number
-
186
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Martin Dinkov mdinkov
-
- Last Modified Date
-
2018-02-15
- IP Status
-
No known IP claims.
- Contributors
-
-
Martin Dinkov, AMD
-
Matthaeus Chajdas, AMD
-
This extension exposes shader core properties for a target physical device through the VK_KHR_get_physical_device_properties2 extension. Please refer to the example below for proper usage.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD
-
New Enums
None.
New Structures
New Functions
None.
Examples
This example retrieves the shader core properties for a physical device.
extern VkInstance instance;
extern VkPhysicalDevice physicalDevice; // assumed to have been enumerated already
PFN_vkGetPhysicalDeviceProperties2 pfnVkGetPhysicalDeviceProperties2 =
    reinterpret_cast<PFN_vkGetPhysicalDeviceProperties2>
    (vkGetInstanceProcAddr(instance, "vkGetPhysicalDeviceProperties2") );
VkPhysicalDeviceProperties2 general_props;
VkPhysicalDeviceShaderCorePropertiesAMD shader_core_properties;
shader_core_properties.pNext = nullptr;
shader_core_properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD;
general_props.pNext = &shader_core_properties;
general_props.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
// After this call, shader_core_properties has been populated
pfnVkGetPhysicalDeviceProperties2(physicalDevice, &general_props);
printf("Number of shader engines: %u\n",
    shader_core_properties.shaderEngineCount);
printf("Number of shader arrays: %u\n",
    shader_core_properties.shaderArraysPerEngineCount);
printf("Number of CUs per shader array: %u\n",
    shader_core_properties.computeUnitsPerShaderArray);
printf("Number of SIMDs per compute unit: %u\n",
    shader_core_properties.simdPerComputeUnit);
printf("Number of wavefront slots in each SIMD: %u\n",
    shader_core_properties.wavefrontsPerSimd);
printf("Number of threads per wavefront: %u\n",
    shader_core_properties.wavefrontSize);
printf("Number of physical SGPRs per SIMD: %u\n",
    shader_core_properties.sgprsPerSimd);
printf("Minimum number of SGPRs that can be allocated by a wave: %u\n",
    shader_core_properties.minSgprAllocation);
printf("Number of available SGPRs: %u\n",
    shader_core_properties.maxSgprAllocation);
printf("SGPRs are allocated in groups of this size: %u\n",
    shader_core_properties.sgprAllocationGranularity);
printf("Number of physical VGPRs per SIMD: %u\n",
    shader_core_properties.vgprsPerSimd);
printf("Minimum number of VGPRs that can be allocated by a wave: %u\n",
    shader_core_properties.minVgprAllocation);
printf("Number of available VGPRs: %u\n",
    shader_core_properties.maxVgprAllocation);
printf("VGPRs are allocated in groups of this size: %u\n",
    shader_core_properties.vgprAllocationGranularity);
Version History
-
Revision 1, 2018-02-15 (Martin Dinkov)
-
Initial draft.
-
VK_AMD_shader_explicit_vertex_parameter
- Name String
-
VK_AMD_shader_explicit_vertex_parameter
- Extension Type
-
Device extension
- Registered Extension Number
-
22
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Qun Lin linqun
-
- Last Modified Date
-
2016-05-10
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Qun Lin, AMD
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Rex Xu, AMD
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_AMD_shader_explicit_vertex_parameter
editing-note
Shouldn’t the SPV extension be in the Interactions and External Dependencies block?
Version History
-
Revision 1, 2016-05-10 (Daniel Rakos)
-
Initial draft
-
VK_AMD_shader_fragment_mask
- Name String
-
VK_AMD_shader_fragment_mask
- Extension Type
-
Device extension
- Registered Extension Number
-
138
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Aaron Hagan AaronHaganAMD
-
- Last Modified Date
-
2017-08-16
- IP Status
-
No known IP claims.
- Dependencies
-
-
Requires the SPV_AMD_shader_fragment_mask SPIR-V extension.
-
- Contributors
-
-
Aaron Hagan, AMD
-
Daniel Rakos, AMD
-
Timothy Lottes, AMD
-
This extension provides efficient read access to the fragment mask in compressed multisampled color surfaces. The fragment mask is a lookup table that associates color samples with color fragment values.
From a shader, the fragment mask can be fetched with a call to
fragmentMaskFetchAMD
, which returns a single uint
where each
subsequent four bits specify the color fragment index corresponding to the
color sample, starting from the least significant bit.
For example, when eight color samples are used, the color fragment index for
color sample 0 will be in bits 0-3 of the fragment mask, and for color sample 7
the index will be in bits 28-31.
The color fragment for a particular color sample may then be fetched with
the corresponding fragment mask value using the fragmentFetchAMD
shader
function.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New SPIR-V Capabilities
New Structures
None.
New Functions
None.
Examples
This example shows a shader that queries the fragment mask from a multisampled compressed surface and uses it to query fragment values.
#version 450 core
#extension GL_AMD_shader_fragment_mask: enable
layout(binding = 0) uniform sampler2DMS s2DMS;
layout(binding = 1) uniform isampler2DMSArray is2DMSArray;
layout(binding = 2, input_attachment_index = 0) uniform usubpassInputMS usubpassMS;
layout(location = 0) out vec4 fragColor;
void main()
{
vec4 fragOne = vec4(0.0);
uint fragMask = fragmentMaskFetchAMD(s2DMS, ivec2(2, 3));
uint fragIndex = (fragMask & 0xF0) >> 4;
fragOne += fragmentFetchAMD(s2DMS, ivec2(2, 3), 1);
fragMask = fragmentMaskFetchAMD(is2DMSArray, ivec3(2, 3, 1));
fragIndex = (fragMask & 0xF0) >> 4;
fragOne += fragmentFetchAMD(is2DMSArray, ivec3(2, 3, 1), fragIndex);
fragMask = fragmentMaskFetchAMD(usubpassMS);
fragIndex = (fragMask & 0xF0) >> 4;
fragOne += fragmentFetchAMD(usubpassMS, fragIndex);
fragColor = fragOne;
}
Version History
-
Revision 1, 2017-08-16 (Aaron Hagan)
-
Initial draft
-
VK_AMD_shader_image_load_store_lod
- Name String
-
VK_AMD_shader_image_load_store_lod
- Extension Type
-
Device extension
- Registered Extension Number
-
47
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Dominik Witczak dominikwitczakamd
-
- Last Modified Date
-
2017-08-21
- Interactions and External Dependencies
-
-
This extension requires the SPV_AMD_shader_image_load_store_lod SPIR-V extension.
-
This extension requires GL_AMD_shader_image_load_store_lod for GLSL-based source languages.
-
- IP Status
-
No known IP claims.
- Contributors
-
-
Dominik Witczak, AMD
-
Qun Lin, AMD
-
Rex Xu, AMD
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_AMD_shader_image_load_store_lod
Version History
-
Revision 1, 2017-08-21 (Dominik Witczak)
-
Initial draft
-
VK_AMD_shader_info
- Name String
-
VK_AMD_shader_info
- Extension Type
-
Device extension
- Registered Extension Number
-
43
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jaakko Konttinen jaakkoamd
-
- Last Modified Date
-
2017-10-09
- IP Status
-
No known IP claims.
- Contributors
-
-
Jaakko Konttinen, AMD
-
This extension adds a way to query certain information about a compiled shader which is part of a pipeline. This information may include shader disassembly, shader binary and various statistics about a shader’s resource usage.
While this extension provides a mechanism for extracting this information, the details regarding the contents or format of this information are not specified by this extension and may be provided by the vendor externally.
Furthermore, all information types are optionally supported, and users should not assume every implementation supports querying every type of information.
New Object Types
None.
New Enum Constants
None.
New Enums
New Structures
New Functions
Examples
This example extracts the register usage of a fragment shader within a particular graphics pipeline:
extern VkDevice device;
extern VkPipeline gfxPipeline;

PFN_vkGetShaderInfoAMD pfnGetShaderInfoAMD = (PFN_vkGetShaderInfoAMD)vkGetDeviceProcAddr(
    device, "vkGetShaderInfoAMD");

VkShaderStatisticsInfoAMD statistics = {};
size_t dataSize = sizeof(statistics);

if (pfnGetShaderInfoAMD(device,
                        gfxPipeline,
                        VK_SHADER_STAGE_FRAGMENT_BIT,
                        VK_SHADER_INFO_TYPE_STATISTICS_AMD,
                        &dataSize,
                        &statistics) == VK_SUCCESS)
{
    printf("VGPR usage: %d\n", statistics.resourceUsage.numUsedVgprs);
    printf("SGPR usage: %d\n", statistics.resourceUsage.numUsedSgprs);
}
The following example continues the previous example by subsequently attempting to query and print shader disassembly about the fragment shader:
// Query disassembly size (if available)
if (pfnGetShaderInfoAMD(device,
                        gfxPipeline,
                        VK_SHADER_STAGE_FRAGMENT_BIT,
                        VK_SHADER_INFO_TYPE_DISASSEMBLY_AMD,
                        &dataSize,
                        nullptr) == VK_SUCCESS)
{
    printf("Fragment shader disassembly:\n");

    void* disassembly = malloc(dataSize);

    // Query disassembly and print
    if (pfnGetShaderInfoAMD(device,
                            gfxPipeline,
                            VK_SHADER_STAGE_FRAGMENT_BIT,
                            VK_SHADER_INFO_TYPE_DISASSEMBLY_AMD,
                            &dataSize,
                            disassembly) == VK_SUCCESS)
    {
        printf("%s", (char*)disassembly);
    }

    free(disassembly);
}
Version History
-
Revision 1, 2017-10-09 (Jaakko Konttinen)
-
Initial revision
-
VK_AMD_shader_trinary_minmax
- Name String
-
VK_AMD_shader_trinary_minmax
- Extension Type
-
Device extension
- Registered Extension Number
-
21
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Qun Lin linqun
-
- Last Modified Date
-
2016-05-10
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Qun Lin, AMD
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Rex Xu, AMD
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_AMD_shader_trinary_minmax
editing-note: Shouldn’t the SPV extension be in the Interactions and External Dependencies block?
Version History
-
Revision 1, 2016-05-10 (Daniel Rakos)
-
Initial draft
-
VK_AMD_texture_gather_bias_lod
- Name String
-
VK_AMD_texture_gather_bias_lod
- Extension Type
-
Device extension
- Registered Extension Number
-
42
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Rex Xu amdrexu
-
- Last Modified Date
-
2017-03-21
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_AMD_texture_gather_bias_lod SPIR-V extension.
-
- Contributors
-
-
Dominik Witczak, AMD
-
Daniel Rakos, AMD
-
Graham Sellers, AMD
-
Matthaeus G. Chajdas, AMD
-
Qun Lin, AMD
-
Rex Xu, AMD
-
Timothy Lottes, AMD
-
This extension adds two related features.
Firstly, support for the following SPIR-V extension in Vulkan is added:
-
SPV_AMD_texture_gather_bias_lod
Secondly, the extension allows the application to query which formats can be used together with the new function prototypes introduced by the SPIR-V extension.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_TEXTURE_LOD_GATHER_FORMAT_PROPERTIES_AMD
-
New Enums
None.
New SPIR-V Capabilities
New Structures
New Functions
None.
Examples
struct VkTextureLODGatherFormatPropertiesAMD
{
    VkStructureType sType;
    const void*     pNext;
    VkBool32        supportsTextureGatherLODBiasAMD;
};

// ----------------------------------------------------------------------------------------
// How to detect if an image format can be used with the new function prototypes.

VkPhysicalDeviceImageFormatInfo2      formatInfo;
VkImageFormatProperties2              formatProps;
VkTextureLODGatherFormatPropertiesAMD textureLODGatherSupport;

textureLODGatherSupport.sType = VK_STRUCTURE_TYPE_TEXTURE_LOD_GATHER_FORMAT_PROPERTIES_AMD;
textureLODGatherSupport.pNext = nullptr;

formatInfo.sType  = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2;
formatInfo.pNext  = nullptr;
formatInfo.format = ...;
formatInfo.type   = ...;
formatInfo.tiling = ...;
formatInfo.usage  = ...;
formatInfo.flags  = ...;

formatProps.sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2;
formatProps.pNext = &textureLODGatherSupport;

vkGetPhysicalDeviceImageFormatProperties2(physical_device, &formatInfo, &formatProps);

if (textureLODGatherSupport.supportsTextureGatherLODBiasAMD == VK_TRUE)
{
    // physical device supports SPV_AMD_texture_gather_bias_lod for the specified
    // format configuration.
}
else
{
    // physical device does not support SPV_AMD_texture_gather_bias_lod for the
    // specified format configuration.
}
Version History
-
Revision 1, 2017-03-21 (Dominik Witczak)
-
Initial draft
-
VK_ANDROID_external_memory_android_hardware_buffer
- Name String
-
VK_ANDROID_external_memory_android_hardware_buffer
- Extension Type
-
Device extension
- Registered Extension Number
-
130
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_sampler_ycbcr_conversion
-
Requires
VK_KHR_external_memory
-
Requires
VK_EXT_queue_family_foreign
-
- Contact
-
-
Jesse Hall critsec
-
- Last Modified Date
-
2018-03-04
- IP Status
-
No known IP claims.
- Contributors
-
-
Ray Smith, ARM
-
Chad Versace, Google
-
Jesse Hall, Google
-
Tobias Hector, Imagination
-
James Jones, NVIDIA
-
Tony Zlatinski, NVIDIA
-
Matthew Netsch, Qualcomm
-
Andrew Garrard, Samsung
-
This extension enables an application to import Android AHardwareBuffer objects created outside of the Vulkan device into Vulkan memory objects, where they can be bound to images and buffers. It also allows exporting an AHardwareBuffer from a Vulkan memory object for symmetry with other operating systems. But since not all AHardwareBuffer usages and formats have Vulkan equivalents, exporting from Vulkan provides strictly less functionality than creating the AHardwareBuffer externally and importing it.
Some AHardwareBuffer images have implementation-defined external formats that may not correspond to Vulkan formats. Sampler Y’CbCr conversion can be used to sample from these images and convert them to a known color space.
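As a non-normative sketch (not taken from the Specification's own example code), importing an existing AHardwareBuffer into a Vulkan memory object might look roughly as follows. The hardwareBuffer handle is assumed to come from the application, memory-type selection is simplified, and the dedicated-allocation and image-binding steps required when the memory backs an image are omitted:
// Minimal sketch: import an AHardwareBuffer into VkDeviceMemory.
// Requires VK_USE_PLATFORM_ANDROID_KHR to be defined before including <vulkan/vulkan.h>.
extern VkDevice device;
extern struct AHardwareBuffer* hardwareBuffer;   // created or received outside of Vulkan

VkAndroidHardwareBufferPropertiesANDROID ahbProps = { VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_PROPERTIES_ANDROID };
vkGetAndroidHardwareBufferPropertiesANDROID(device, hardwareBuffer, &ahbProps);

VkImportAndroidHardwareBufferInfoANDROID importInfo = { VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID };
importInfo.buffer = hardwareBuffer;

// Pick the lowest memory type permitted by the buffer's properties (simplified).
uint32_t memoryTypeIndex = 0;
while (((ahbProps.memoryTypeBits >> memoryTypeIndex) & 1) == 0)
    ++memoryTypeIndex;

VkMemoryAllocateInfo allocInfo = { VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO };
allocInfo.pNext           = &importInfo;
allocInfo.allocationSize  = ahbProps.allocationSize;
allocInfo.memoryTypeIndex = memoryTypeIndex;

VkDeviceMemory memory;
vkAllocateMemory(device, &allocInfo, NULL, &memory);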
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_USAGE_ANDROID
-
VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_PROPERTIES_ANDROID
-
VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_FORMAT_PROPERTIES_ANDROID
-
VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID
-
VK_STRUCTURE_TYPE_MEMORY_GET_ANDROID_HARDWARE_BUFFER_INFO_ANDROID
-
VK_STRUCTURE_TYPE_EXTERNAL_FORMAT_ANDROID
-
-
Extending VkExternalMemoryHandleTypeFlagBits:
-
VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID
-
New Enums
None.
New Structs
Issues
1) Other external memory objects are represented as weakly-typed handles (e.g. Win32 HANDLE or POSIX file descriptor), and require a handle type parameter along with handles. AHardwareBuffer is strongly typed, so naming the handle type is redundant. Does symmetry justify adding handle type parameters/fields anyway?
RESOLVED: No.
The handle type is already provided in places that treat external memory objects generically. In the places we would add it, the application code that would have to provide the handle type value is already dealing with AHardwareBuffer-specific commands/structures; the extra symmetry would not be enough to make that code generic.
2) The internal layout and therefore size of an AHardwareBuffer image may depend on native usage flags that do not have corresponding Vulkan counterparts. Do we provide this info to vkCreateImage somehow, or allow the allocation size reported by vkGetImageMemoryRequirements to be approximate?
RESOLVED: Allow the allocation size to be unspecified when allocating the memory.
It has to work this way for exported image memory anyway, since AHardwareBuffer allocation happens in vkAllocateMemory, and internally is performed by a separate HAL, not the Vulkan implementation itself. There is a similar issue with vkGetImageSubresourceLayout: the layout is determined by the allocator HAL, so it is not known until the image is bound to memory.
3) Should the result of sampling an external-format image with the suggested Y’CbCr conversion parameters yield the same results as using a samplerExternalOES in OpenGL ES?
RESOLVED: This would be desirable, so that apps converting from OpenGL ES to Vulkan could get the same output given the same input. But since sampling and conversion from Y’CbCr images is so loosely defined in OpenGL ES, multiple implementations do it in a way that doesn’t conform to Vulkan’s requirements. Modifying the OpenGL ES implementation would be difficult, and would change the output of existing unmodified applications. Changing the output only for applications that are being modified gives developers the chance to notice and mitigate any problems. Implementations are encouraged to minimize differences as much as possible without causing compatibility problems for existing OpenGL ES applications or violating Vulkan requirements.
4) Should an AHardwareBuffer with AHARDWAREBUFFER_USAGE_CPU_* usage be mappable in Vulkan? Should it be possible to export an AHardwareBuffer with such usage?
RESOLVED: Optional, and mapping in Vulkan is not the same as AHardwareBuffer_lock.
The semantics of these are different: mapping memory in Vulkan is persistent, just gives a raw view of the memory contents, and does not involve ownership. AHardwareBuffer_lock gives the host exclusive access to the buffer, is temporary, and allows for reformatting copy-in/copy-out.
Implementations are not required to support host-visible memory types for imported Android hardware buffers or resources backed by them. If a host-visible memory type is supported and used, the memory can be mapped in Vulkan, but doing so follows Vulkan semantics: it is just a raw view of the data and does not imply ownership (this means implementations must not internally call AHardwareBuffer_lock to implement vkMapMemory, or assume the application has done so). Implementations are not required to support linear-tiled images backed by Android hardware buffers, even if the AHardwareBuffer has CPU usage. There is no reliable way to allocate memory in Vulkan that can be exported to an AHardwareBuffer with CPU usage.
5) Android may add new AHardwareBuffer formats and usage flags over time. Can references to them be added to this extension, or do they need a new extension?
RESOLVED: This extension can document the interaction between the new AHB formats/usages and existing Vulkan features. No new Vulkan features or implementation requirements can be added. The extension version number will be incremented when this additional documentation is added, but the version number does not indicate that an implementation supports Vulkan memory or resources that map to the new AHardwareBuffer features: support for that must be queried with vkGetPhysicalDeviceImageFormatProperties2 or is implied by successfully allocating an AHardwareBuffer outside of Vulkan that uses the new feature and has a GPU usage flag.
In essence, these are new features added to a new Android API level, rather than new Vulkan features. The extension will only document how existing Vulkan features map to that new Android feature.
VK_FUCHSIA_imagepipe_surface
- Name String
-
VK_FUCHSIA_imagepipe_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
215
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Craig Stout cdotstout
-
- Last Modified Date
-
2018-07-27
- IP Status
-
No known IP claims.
- Contributors
-
-
Craig Stout, Google
-
Ian Elliott, Google
-
Jesse Hall, Google
-
The VK_FUCHSIA_imagepipe_surface extension is an instance extension. It provides a mechanism to create a VkSurfaceKHR object (defined by the VK_KHR_surface extension) that refers to a Fuchsia imagePipeHandle.
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_IMAGEPIPE_SURFACE_CREATE_INFO_FUCHSIA
-
New Enums
None
New Structures
New Functions
Issues
None
Version History
-
Revision 1, 2018-07-27 (Craig Stout)
-
Initial draft.
-
VK_GOOGLE_decorate_string
- Name String
-
VK_GOOGLE_decorate_string
- Extension Type
-
Device extension
- Registered Extension Number
-
225
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Hai Nguyen chaoticbob
-
- Last Modified Date
-
2018-07-09
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_GOOGLE_decorate_string SPIR-V extension.
-
- Contributors
-
-
Hai Nguyen, Google
-
Neil Henning, AMD
-
The VK_GOOGLE_decorate_string extension allows use of the SPV_GOOGLE_decorate_string extension in SPIR-V shader modules.
New Enum Constants
None.
New Structures
None.
New SPIR-V Capabilities
None.
Issues
Version History
-
Revision 1, 2018-07-09 (Neil Henning)
-
Initial draft
-
VK_GOOGLE_display_timing
- Name String
-
VK_GOOGLE_display_timing
- Extension Type
-
Device extension
- Registered Extension Number
-
93
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_swapchain
-
- Contact
-
-
Ian Elliott ianelliottus
-
- Last Modified Date
-
2017-02-14
- IP Status
-
No known IP claims.
- Contributors
-
-
Ian Elliott, Google
-
Jesse Hall, Google
-
This device extension allows an application that uses the VK_KHR_swapchain extension to obtain information about the presentation engine’s display, to obtain timing information about each present, and to schedule a present to happen no earlier than a desired time. An application can use this to minimize various visual anomalies (e.g. stuttering).
Traditional game and real-time animation applications need to correctly position their geometry for when the presentable image will be presented to the user. To accomplish this, applications need various timing information about the presentation engine’s display. They need to know when presentable images were actually presented, and when they could have been presented. Applications also need to tell the presentation engine to display an image no sooner than a given time. This allows the application to avoid stuttering, so the animation looks smooth to the user.
This extension treats variable-refresh-rate (VRR) displays as if they are fixed-refresh-rate (FRR) displays.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PRESENT_TIMES_INFO_GOOGLE
-
New Enums
None.
New Structures
Issues
None.
Examples
Note
The example code for this extension (like the …
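In lieu of the original example code, the following non-normative sketch shows the basic pattern of querying the refresh period and requesting that an image be presented no earlier than a chosen time; device, swapchain, queue, imageIndex, myFrameId and lastActualPresentTime are assumed to be managed by the application:
extern VkDevice       device;
extern VkSwapchainKHR swapchain;
extern VkQueue        queue;
extern uint32_t       imageIndex;
extern uint32_t       myFrameId;               // application-defined present ID
extern uint64_t       lastActualPresentTime;   // e.g. from vkGetPastPresentationTimingGOOGLE

// Query the display's refresh period once per swapchain.
VkRefreshCycleDurationGOOGLE refresh;
vkGetRefreshCycleDurationGOOGLE(device, swapchain, &refresh);

// Ask for this image to appear no earlier than two refresh cycles after the
// last observed present (a simple pacing policy, purely illustrative).
VkPresentTimeGOOGLE presentTime;
presentTime.presentID          = myFrameId;
presentTime.desiredPresentTime = lastActualPresentTime + 2 * refresh.refreshDuration;

VkPresentTimesInfoGOOGLE timesInfo = { VK_STRUCTURE_TYPE_PRESENT_TIMES_INFO_GOOGLE };
timesInfo.swapchainCount = 1;
timesInfo.pTimes         = &presentTime;

VkPresentInfoKHR presentInfo = { VK_STRUCTURE_TYPE_PRESENT_INFO_KHR };
presentInfo.pNext          = &timesInfo;
presentInfo.swapchainCount = 1;
presentInfo.pSwapchains    = &swapchain;
presentInfo.pImageIndices  = &imageIndex;
vkQueuePresentKHR(queue, &presentInfo);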
Version History
-
Revision 1, 2017-02-14 (Ian Elliott)
-
Internal revisions
-
VK_GOOGLE_hlsl_functionality1
- Name String
-
VK_GOOGLE_hlsl_functionality1
- Extension Type
-
Device extension
- Registered Extension Number
-
224
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Hai Nguyen chaoticbob
-
- Last Modified Date
-
2018-07-09
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_GOOGLE_hlsl_functionality1 SPIR-V extension.
-
- Contributors
-
-
Hai Nguyen, Google
-
Neil Henning, AMD
-
The VK_GOOGLE_hlsl_functionality1 extension allows use of the SPV_GOOGLE_hlsl_functionality1 extension in SPIR-V shader modules.
New Enum Constants
None.
New Structures
None.
New SPIR-V Capabilities
None.
Issues
Version History
-
Revision 1, 2018-07-09 (Neil Henning)
-
Initial draft
-
VK_IMG_filter_cubic
- Name String
-
VK_IMG_filter_cubic
- Extension Type
-
Device extension
- Registered Extension Number
-
16
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Tobias Hector tobski
-
- Last Modified Date
-
2016-02-23
- Contributors
-
-
Tobias Hector, Imagination Technologies
-
VK_IMG_filter_cubic adds an additional, high quality cubic filtering mode to Vulkan, using a Catmull-Rom bicubic filter. Performing this kind of filtering can be done in a shader by using 16 samples and a number of instructions, but this can be inefficient. The cubic filter mode exposes an optimized high quality texture sampling using fixed texture sampling functionality.
New Enum Constants
-
Extending VkFilter:
-
VK_FILTER_CUBIC_IMG
-
-
Extending VkFormatFeatureFlagBits
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG
-
Example
Creating a sampler with the new filter for both magnification and minification
VkSamplerCreateInfo createInfo =
{
    VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO // sType
    // Other members set to application-desired values
};

createInfo.magFilter = VK_FILTER_CUBIC_IMG;
createInfo.minFilter = VK_FILTER_CUBIC_IMG;

VkSampler sampler;
VkResult result = vkCreateSampler(
    device,
    &createInfo,
    NULL,       // pAllocator
    &sampler);
Version History
-
Revision 1, 2016-02-23 (Tobias Hector)
-
Initial version
-
VK_MVK_ios_surface
- Name String
-
VK_MVK_ios_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
123
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Bill Hollings billhollings
-
- Last Modified Date
-
2017-02-24
- IP Status
-
No known IP claims.
- Contributors
-
-
Bill Hollings, The Brenwill Workshop Ltd.
-
The VK_MVK_ios_surface extension is an instance extension. It provides a mechanism to create a VkSurfaceKHR object (defined by the VK_KHR_surface extension) that refers to a UIView, the native surface type of iOS, which is underpinned by a CAMetalLayer, to support rendering to the surface using Apple’s Metal framework.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_IOS_SURFACE_CREATE_INFO_MVK
-
New Enums
None.
New Structures
New Functions
Issues
None.
Version History
-
Revision 1, 2017-02-15 (Bill Hollings)
-
Initial draft.
-
-
Revision 2, 2017-02-24 (Bill Hollings)
-
Minor syntax fix to emphasize firm requirement for UIView to be backed by a CAMetalLayer.
-
VK_MVK_macos_surface
- Name String
-
VK_MVK_macos_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
124
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Bill Hollings billhollings
-
- Last Modified Date
-
2017-02-24
- IP Status
-
No known IP claims.
- Contributors
-
-
Bill Hollings, The Brenwill Workshop Ltd.
-
The VK_MVK_macos_surface extension is an instance extension. It provides a mechanism to create a VkSurfaceKHR object (defined by the VK_KHR_surface extension) that refers to an NSView, the native surface type of macOS, which is underpinned by a CAMetalLayer, to support rendering to the surface using Apple’s Metal framework.
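For illustration only, creating such a surface might look like the following non-normative sketch; instance is assumed to exist, and nsView is assumed to be a pointer to an NSView backed by a CAMetalLayer, passed through from Objective-C code:
extern VkInstance  instance;
extern const void* nsView;   // NSView* backed by a CAMetalLayer

VkMacOSSurfaceCreateInfoMVK createInfo = { VK_STRUCTURE_TYPE_MACOS_SURFACE_CREATE_INFO_MVK };
createInfo.pView = nsView;

VkSurfaceKHR surface;
VkResult result = vkCreateMacOSSurfaceMVK(instance, &createInfo, NULL, &surface);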
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_MACOS_SURFACE_CREATE_INFO_MVK
-
New Enums
None.
New Structures
New Functions
Issues
None.
Version History
-
Revision 1, 2017-02-15 (Bill Hollings)
-
Initial draft.
-
-
Revision 2, 2017-02-24 (Bill Hollings)
-
Minor syntax fix to emphasize firm requirement for NSView to be backed by a CAMetalLayer.
-
VK_NN_vi_surface
- Name String
-
VK_NN_vi_surface
- Extension Type
-
Instance extension
- Registered Extension Number
-
63
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_surface
-
- Contact
-
-
Mathias Heyer mheyer
-
- Last Modified Date
-
2016-12-02
- IP Status
-
No known IP claims.
- Contributors
-
-
Mathias Heyer, NVIDIA
-
Michael Chock, NVIDIA
-
Yasuhiro Yoshioka, Nintendo
-
Daniel Koch, NVIDIA
-
The VK_NN_vi_surface extension is an instance extension. It provides a mechanism to create a VkSurfaceKHR object (defined by the VK_KHR_surface extension) associated with an nn::vi::Layer.
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_VI_SURFACE_CREATE_INFO_NN
-
New Enums
None
New Structures
New Functions
Issues
1) Does VI need a way to query for compatibility between a particular physical device (and queue family?) and a specific VI display?
RESOLVED: No. It is currently always assumed that the device and display will always be compatible.
2) VkViSurfaceCreateInfoNN::pWindow is intended to store an nn::vi::NativeWindowHandle, but its declared type is a bare void* to store the window handle. Why the discrepancy?
RESOLVED: It is for C compatibility. The definition for the VI native window handle type is defined inside the nn::vi C++ namespace. This prevents its use in C source files. nn::vi::NativeWindowHandle is always defined to be void*, so this extension uses void* to match.
Version History
-
Revision 1, 2016-12-2 (Michael Chock)
-
Initial draft.
-
VK_NVX_device_generated_commands
- Name String
-
VK_NVX_device_generated_commands
- Extension Type
-
Device extension
- Registered Extension Number
-
87
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Christoph Kubisch pixeljetstream
-
- Last Modified Date
-
2017-07-25
- Contributors
-
-
Pierre Boudier, NVIDIA
-
Christoph Kubisch, NVIDIA
-
Mathias Schott, NVIDIA
-
Jeff Bolz, NVIDIA
-
Eric Werness, NVIDIA
-
Detlef Roettger, NVIDIA
-
Daniel Koch, NVIDIA
-
Chris Hebert, NVIDIA
-
This extension allows the device to generate a number of critical commands for command buffers.
When rendering a large number of objects, the device can be leveraged to implement a number of critical functions, like updating matrices, or implementing occlusion culling, frustum culling, front to back sorting, etc. Implementing those on the device does not require any special extension, since an application is free to define its own data structure, and just process them using shaders.
However, if the application desires to quickly kick off the rendering of the final stream of objects, then unextended Vulkan forces the application to read back the processed stream and issue graphics command from the host. For very large scenes, the synchronization overhead, and cost to generate the command buffer can become the bottleneck. This extension allows an application to generate a device side stream of state changes and commands, and convert it efficiently into a command buffer without having to read it back on the host.
Furthermore, it allows incremental changes to such command buffers by manipulating only partial sections of a command stream (for example, pipeline bindings). Unextended Vulkan requires re-creation of entire command buffers in such a scenario, or updates synchronized on the host.
The intended usage for this extension is for the application to:
-
create its objects as in unextended Vulkan
-
create a VkObjectTableNVX, and register the various Vulkan objects that are needed to evaluate the input parameters.
-
create a VkIndirectCommandsLayoutNVX, which lists the VkIndirectCommandsTokenTypeNVX it wants to dynamically change as atomic command sequence. This step likely involves some internal device code compilation, since the intent is for the GPU to generate the command buffer in the pipeline.
-
fill the input buffers with the data for each of the inputs it needs. Each input is an array that will be filled with an index in the object table, instead of using CPU pointers.
-
set up a target secondary command buffer
-
reserve command buffer space via vkCmdReserveSpaceForCommandsNVX in a target command buffer at the position you want the generated commands to be executed.
-
call vkCmdProcessCommandsNVX to create the actual device commands for all sequences based on the array contents into a provided target command buffer.
-
execute the target command buffer like a regular secondary command buffer
For each draw/dispatch, the following can be specified:
-
a different pipeline state object
-
a number of descriptor sets, with dynamic offsets
-
a number of vertex buffer bindings, with an optional dynamic offset
-
a different index buffer, with an optional dynamic offset
Applications should register a small number of objects, and use dynamic offsets whenever possible.
While the GPU can be faster than a CPU to generate the commands, it may not happen asynchronously, therefore the primary use-case is generating “less” total work (occlusion culling, classification to use specialized shaders, etc.).
New Object Types
New Flag Types
New Enum Constants
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_OBJECT_TABLE_CREATE_INFO_NVX
-
VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_CREATE_INFO_NVX
-
VK_STRUCTURE_TYPE_CMD_PROCESS_COMMANDS_INFO_NVX
-
VK_STRUCTURE_TYPE_CMD_RESERVE_SPACE_FOR_COMMANDS_INFO_NVX
-
VK_STRUCTURE_TYPE_DEVICE_GENERATED_COMMANDS_LIMITS_NVX
-
VK_STRUCTURE_TYPE_DEVICE_GENERATED_COMMANDS_FEATURES_NVX
Extending VkPipelineStageFlagBits:
-
VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
Extending VkAccessFlagBits:
-
VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX
-
VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX
New Enums
New Structures
New Functions
Issues
1) How to name this extension?
RESOLVED: VK_NVX_device_generated_commands
As usual, one of the hardest issues ;)
Alternatives: VK_gpu_commands, VK_execute_commands, VK_device_commands, VK_device_execute_commands, VK_device_execute, VK_device_created_commands, VK_device_recorded_commands, VK_device_generated_commands
2) Should we use serial tokens or redundant sequence description?
Similarly to VkPipeline, signatures have the most likelihood to be cross-vendor adoptable. They also benefit from being processable in parallel.
3) How to name the sequence description?
ExecuteCommandSignature is a bit long. Maybe just ExecuteSignature, or, following Vulkan nomenclature more closely: VkIndirectCommandsLayoutNVX.
4) Do we want to provide indirectCommands inputs with the layout, or at indirectCommands time?
Separate layout from data as Vulkan does. Provide full flexibility for indirectCommands.
5) Should the input be provided as SoA or AoS?
It is desirable for the application to reuse the list of objects and render them with some kind of an override. This can be done by just selecting a different input for a push constant or a descriptor set, if they are defined as independent arrays. If the data was interleaved, this would not be as easily possible.
Allowing input divisors can also reduce the conservative command buffer allocation.
6) How do we know the size of the GPU command buffer generated by vkCmdProcessCommandsNVX?
maxSequenceCount can give an upper estimate, even if the actual count is sourced from the GPU buffer at (buffer, countOffset). As such, maxSequenceCount must always be set correctly.
Developers are encouraged to make good use of VkIndirectCommandsLayoutNVX’s pTokens[].divisor, as divisors allow less conservative storage costs. Especially pipeline changes on a per-draw basis can be costly memory-wise.
7) How to deal with dynamic offsets in DescriptorSets?
Maybe an additional token VK_EXECUTE_DESCRIPTOR_SET_OFFSET_COMMAND_NVX that works for a “single dynamic buffer” descriptor set, and then use (32-bit tableEntry + 32-bit offset).
Added a dynamicCount field with variable-sized input instead.
8) Should we allow updates to the object table, similar to DescriptorSet?
Desired yes, people may change “material” shaders and not want to recreate the entire register table. However the developer must ensure to not overwrite a registered objectIndex while it is still being used.
9) Should we allow dynamic state changes?
Seems a bit excessive for “per-draw” type of scenario, but GPU could partition work itself with viewport/scissor…
10) How do we allow re-using already “filled” indirectCommands buffers?
Just use a VkCommandBuffer for the output; it can be reused easily.
11) How portable should such re-use be?
Same as secondary command buffer
12) Should sequenceOrdered be part of IndirectCommandsLayout or vkCmdProcessCommandsNVX?
Seems better for IndirectCommandsLayout, as that is when most heavy lifting in terms of internal device code generation is done.
13) Under which conditions is vkCmdProcessCommandsNVX legal?
Options:
a) On the host command buffer, like a regular draw call.
b) vkCmdProcessCommandsNVX makes use of VkCommandBufferBeginInfo and serves as vkBeginCommandBuffer / vkEndCommandBuffer implicitly.
c) The targetCommandBuffer must already be inside the “begin” state at the moment of being passed. This very likely suggests a new VkCommandBufferUsageFlags flag, VK_COMMAND_BUFFER_USAGE_DEVICE_GENERATED_BIT.
d) The targetCommandBuffer must reserve space via a new function.
Used a) and d).
14) What if different pipelines have different DescriptorSetLayouts at a certain set unit that mismatches in token.dynamicCount?
Considered legal, as long as the maximum dynamic count of all used DescriptorSetLayouts is provided.
15) Should we add “strides” to input arrays, so that “Array of Structures” type setups can be supported more easily? Maybe provide a usage flag for a packed token stream (all inputs from the same buffer, implicit stride).
No; performance testing showed this was worse.
16) Should we allow re-using the target command buffer directly, without need to reset command buffer?
YES: new api vkCmdReserveSpaceForCommandsNVX.
17) Is vkCmdProcessCommandsNVX copying the input data or referencing it?
There are multiple implementations possible:
-
one could have some emulation code that parse the inputs, and generates an output command buffer, therefore copying the inputs.
-
one could just reference the inputs, and have the processing done in pipe at execution time.
If the data is mandated to be copied, then it puts a penalty on implementations that could process the inputs directly in the pipe. If the data is “referenced”, then it allows both types of implementation.
The inputs are “referenced”, and should not be modified after the call to vkCmdProcessCommandsNVX and until after the rendering of the target command buffer is finished.
18) Why is this NVX and not NV?
To allow early experimentation and feedback. We expect that a version with a refined design as multi-vendor variant will follow up.
19) Should we make the availability for each token type a device limit?
Only distinguish between graphics/compute for now; further splitting may lead to too much fragmentation.
20) When can the objectTable
be modified?
Similar to the other inputs for vkCmdProcessCommandsNVX, only when all device access via vkCmdProcessCommandsNVX or execution of target command buffer has completed can an object at a given objectIndex be unregistered or re-registered again.
21) Which buffer usage flags are required for the buffers referenced by vkCmdProcessCommandsNVX?
Reuse the existing VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT for:
-
VkCmdProcessCommandsInfoNVX::sequencesCountBuffer
-
VkCmdProcessCommandsInfoNVX::sequencesIndexBuffer
-
VkIndirectCommandsTokenNVX::buffer
22) In which pipeline stage does the device-generated command expansion happen?
vkCmdProcessCommandsNVX is treated as if it occurs in a separate logical pipeline from either graphics or compute, and that pipeline only includes VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, a new stage VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX, and VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT.
This new stage has two corresponding new access types, VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX and VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX, used to synchronize reading the buffer inputs and writing the command buffer memory output.
The output written in the target command buffer is considered to be consumed by the VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT pipeline stage.
Thus, to synchronize from writing the input buffers to executing vkCmdProcessCommandsNVX, use:
-
dstStageMask = VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
-
dstAccessMask = VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX
To synchronize from executing vkCmdProcessCommandsNVX to executing the generated commands, use:
-
srcStageMask = VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
-
srcAccessMask = VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX
-
dstStageMask = VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
-
dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT
When vkCmdProcessCommandsNVX is used with a targetCommandBuffer of NULL, the generated commands are immediately executed and there is implicit synchronization between generation and execution.
23) What if most token data is “static”, but we frequently want to render a subsection?
Added sequencesIndexBuffer. This allows easier sorting and filtering of what should actually be processed.
Example Code
Open-Source samples illustrating the usage of the extension can be found at the following locations:
// setup secondary command buffer
vkBeginCommandBuffer(generatedCmdBuffer, &beginInfo);
... setup its state as usual
// insert the reservation (there can only be one per command buffer)
// where the generated calls should be filled into
VkCmdReserveSpaceForCommandsInfoNVX reserveInfo = { VK_STRUCTURE_TYPE_CMD_RESERVE_SPACE_FOR_COMMANDS_INFO_NVX };
reserveInfo.objectTable = objectTable;
reserveInfo.indirectCommandsLayout = deviceGeneratedLayout;
reserveInfo.maxSequencesCount = myCount;
vkCmdReserveSpaceForCommandsNVX(generatedCmdBuffer, &reserveInfo);
vkEndCommandBuffer(generatedCmdBuffer);
// trigger the generation at some point in another primary command buffer
VkCmdProcessCommandsInfoNVX processInfo = { VK_STRUCTURE_TYPE_CMD_PROCESS_COMMANDS_INFO_NVX };
processInfo.objectTable = objectTable;
processInfo.indirectCommandsLayout = deviceGeneratedLayout;
processInfo.maxSequencesCount = myCount;
// set the target of the generation (if null we would directly execute with mainCmd)
processInfo.targetCommandBuffer = generatedCmdBuffer;
// provide input data
processInfo.indirectCommandsTokenCount = 3;
processInfo.pIndirectCommandsTokens = myTokens;
// If you modify the input buffer data referenced by VkCmdProcessCommandsInfoNVX,
// ensure you have added the appropriate barriers prior to the generation process.
// When regenerating the content of the same reserved space, ensure prior operations have completed.
VkMemoryBarrier memoryBarrier = { VK_STRUCTURE_TYPE_MEMORY_BARRIER };
memoryBarrier.srcAccessMask = ...;
memoryBarrier.dstAccessMask = VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX;
vkCmdPipelineBarrier(mainCmd,
/*srcStageMask*/VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
/*dstStageMask*/VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX,
/*dependencyFlags*/0,
/*memoryBarrierCount*/1,
/*pMemoryBarriers*/&memoryBarrier,
...);
vkCmdProcessCommandsNVX(mainCmd, &processInfo);
...
// execute the secondary command buffer and ensure the processing that modifies command-buffer content
// has completed
memoryBarrier.srcAccessMask = VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX;
memoryBarrier.dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT;
vkCmdPipelineBarrier(mainCmd,
/*srcStageMask*/VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX,
/*dstStageMask*/VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT,
/*dependencyFlags*/0,
/*memoryBarrierCount*/1,
/*pMemoryBarriers*/&memoryBarrier,
...)
vkCmdExecuteCommands(mainCmd, 1, &generatedCmdBuffer);
Version History
-
Revision 3, 2017-07-25 (Chris Hebert)
-
Correction to specification of dynamicCount for push_constant token in VkIndirectCommandsLayoutNVX. Stride was incorrectly computed as dynamicCount was not treated as byte size.
-
-
Revision 2, 2017-06-01 (Christoph Kubisch)
-
header compatibility break: add missing _TYPE to VkIndirectCommandsTokenTypeNVX and VkObjectEntryTypeNVX enums to follow Vulkan naming convention
-
behavior clarification: only allow a single work provoking token per sequence when creating a VkIndirectCommandsLayoutNVX
-
-
Revision 1, 2016-10-31 (Christoph Kubisch)
-
Initial draft
-
VK_NVX_multiview_per_view_attributes
- Name String
-
VK_NVX_multiview_per_view_attributes
- Extension Type
-
Device extension
- Registered Extension Number
-
98
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_multiview
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-01-13
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires the SPV_NVX_multiview_per_view_attributes SPIR-V extension.
-
This extension requires the GL_NVX_multiview_per_view_attributes extension for GLSL source languages.
-
This extension interacts with
VK_NV_viewport_array2
.
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Daniel Koch, NVIDIA
-
This extension adds a new way to write shaders to be used with multiview subpasses, where the attributes for all views are written out by a single invocation of the vertex processing stages. Related SPIR-V and GLSL extensions SPV_NVX_multiview_per_view_attributes and GL_NVX_multiview_per_view_attributes introduce per-view position and viewport mask attributes arrays, and this extension defines how those per-view attribute arrays are interpreted by Vulkan. Pipelines using per-view attributes may only execute the vertex processing stages once for all views rather than once per-view, which reduces redundant shading work.
A subpass creation flag controls whether the subpass uses this extension. A subpass must either exclusively use this extension or not use it at all.
Some Vulkan implementations only support the position attribute varying between views in the X component. A subpass can declare via a second creation flag whether all pipelines compiled for this subpass will obey this restriction.
Shaders that use the new per-view outputs (e.g. gl_PositionPerViewNV) must also write the non-per-view output (gl_Position), and the values written must be such that gl_Position = gl_PositionPerViewNV[gl_ViewIndex] for all views in the subpass. Implementations are free to either use the per-view outputs or the non-per-view outputs, whichever would be more efficient.
If VK_NV_viewport_array2 is not also supported and enabled, the per-view viewport mask must not be used.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PER_VIEW_ATTRIBUTES_PROPERTIES_NVX
-
-
Extending VkSubpassDescriptionFlagBits
-
VK_SUBPASS_DESCRIPTION_PER_VIEW_ATTRIBUTES_BIT_NVX
-
VK_SUBPASS_DESCRIPTION_PER_VIEW_POSITION_X_ONLY_BIT_NVX
-
New Enums
None.
New Structures
New Functions
None.
New Built-In Variables
New SPIR-V Capabilities
Issues
None.
Examples
#version 450 core

#extension GL_KHX_multiview : enable
#extension GL_NVX_multiview_per_view_attributes : enable

layout(location = 0) in vec4 position;
layout(set = 0, binding = 0) uniform Block { mat4 mvpPerView[2]; } buf;

void main()
{
    // Output both per-view positions and gl_Position as a function
    // of gl_ViewIndex
    gl_PositionPerViewNV[0] = buf.mvpPerView[0] * position;
    gl_PositionPerViewNV[1] = buf.mvpPerView[1] * position;
    gl_Position             = buf.mvpPerView[gl_ViewIndex] * position;
}
Version History
-
Revision 1, 2017-01-13 (Jeff Bolz)
-
Internal revisions
-
VK_NV_clip_space_w_scaling
- Name String
-
VK_NV_clip_space_w_scaling
- Extension Type
-
Device extension
- Registered Extension Number
-
88
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Eric Werness ewerness-nv
-
- Last Modified Date
-
2017-02-15
- Contributors
-
-
Eric Werness, NVIDIA
-
Kedarnath Thangudu, NVIDIA
-
Virtual Reality (VR) applications often involve a post-processing step to apply a “barrel” distortion to the rendered image to correct the “pincushion” distortion introduced by the optics in a VR device. The barrel distorted image has lower resolution along the edges compared to the center. Since the original image is rendered at high resolution, which is uniform across the complete image, a lot of pixels towards the edges do not make it to the final post-processed image.
This extension provides a mechanism to render VR scenes at a non-uniform resolution, in particular a resolution that falls linearly from the center towards the edges. This is achieved by scaling the w coordinate of the vertices in the clip space before perspective divide. The clip space w coordinate of the vertices can be offset as a function of the x and y coordinates as follows:
w' = w + Ax + By
In the intended use case for viewport position scaling, an application should use a set of four viewports, one for each of the four quadrants of a Cartesian coordinate system. Each viewport is set to the dimension of the image, but is scissored to the quadrant it represents. The application should specify A and B coefficients of the w-scaling equation above, that have the same value, but different signs, for each of the viewports. The signs of A and B should match the signs of x and y for the quadrant that they represent such that the value of w' will always be greater than or equal to the original w value for the entire image. Since the offset to w, (Ax + By), is always positive, and increases with the absolute values of x and y, the effective resolution will fall off linearly from the center of the image to its edges.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_W_SCALING_STATE_CREATE_INFO_NV
-
-
Extending VkDynamicState:
-
VK_DYNAMIC_STATE_VIEWPORT_W_SCALING_NV
-
New Enums
None.
New Structures
New Functions
Issues
1) Is the pipeline struct name too long?
RESOLVED: It fits with the naming convention.
2) Separate W scaling section or fold into coordinate transformations?
RESOLVED: Leaving it as its own section for now.
Examples
VkViewport           viewports[4];
VkRect2D             scissors[4];
VkViewportWScalingNV scalings[4];

for (int i = 0; i < 4; i++) {
    int x = (i & 2) ? 0 : currentWindowWidth / 2;
    int y = (i & 1) ? 0 : currentWindowHeight / 2;

    viewports[i].x        = 0;
    viewports[i].y        = 0;
    viewports[i].width    = currentWindowWidth;
    viewports[i].height   = currentWindowHeight;
    viewports[i].minDepth = 0.0f;
    viewports[i].maxDepth = 1.0f;

    scissors[i].offset.x      = x;
    scissors[i].offset.y      = y;
    scissors[i].extent.width  = currentWindowWidth/2;
    scissors[i].extent.height = currentWindowHeight/2;

    const float factor = 0.15;
    scalings[i].xcoeff = ((i & 2) ? -1.0 : 1.0) * factor;
    scalings[i].ycoeff = ((i & 1) ? -1.0 : 1.0) * factor;
}

VkPipelineViewportWScalingStateCreateInfoNV vpWScalingStateInfo = { VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_W_SCALING_STATE_CREATE_INFO_NV };
vpWScalingStateInfo.viewportWScalingEnable = VK_TRUE;
vpWScalingStateInfo.viewportCount          = 4;
vpWScalingStateInfo.pViewportWScalings     = &scalings[0];

VkPipelineViewportStateCreateInfo vpStateInfo = { VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO };
vpStateInfo.viewportCount = 4;
vpStateInfo.pViewports    = &viewports[0];
vpStateInfo.scissorCount  = 4;
vpStateInfo.pScissors     = &scissors[0];
vpStateInfo.pNext         = &vpWScalingStateInfo;
Example shader to read from a w-scaled texture:
// Vertex Shader
// Draw a triangle that covers the whole screen
const vec4 positions[3] = vec4[3](vec4(-1, -1, 0, 1),
                                  vec4( 3, -1, 0, 1),
                                  vec4(-1,  3, 0, 1));
out vec2 uv;

void main()
{
    vec4 pos = positions[ gl_VertexID ];
    gl_Position = pos;
    uv = pos.xy;
}

// Fragment Shader
uniform sampler2D tex;
uniform float xcoeff;
uniform float ycoeff;
out vec4 Color;
in vec2 uv;

void main()
{
    // Handle uv as if upper right quadrant
    vec2 uvabs = abs(uv);

    // unscale: transform w-scaled image into an unscaled image
    // scale: transform unscaled image into a w-scaled image
    float unscale = 1.0 / (1 + xcoeff * uvabs.x + ycoeff * uvabs.y);
    //float scale  = 1.0 / (1 - xcoeff * uvabs.x - ycoeff * uvabs.y);

    vec2 P = vec2(unscale * uvabs.x, unscale * uvabs.y);

    // Go back to the right quadrant
    P *= sign(uv);

    Color = texture(tex, P * 0.5 + 0.5);
}
Version History
-
Revision 1, 2017-02-15 (Eric Werness)
-
Internal revisions
-
VK_NV_compute_shader_derivatives
- Name String
-
VK_NV_compute_shader_derivatives
- Extension Type
-
Device extension
- Registered Extension Number
-
202
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Pat Brown nvpbrown
-
- Last Modified Date
-
2018-07-19
- IP Status
-
No known IP claims.
- Contributors
-
-
Pat Brown, NVIDIA
-
This extension adds Vulkan support for the SPV_NV_compute_shader_derivatives SPIR-V extension.
The SPIR-V extension provides two new execution modes, both of which allow compute shaders to use built-ins that evaluate compute derivatives explicitly or implicitly. Derivatives will be computed via differencing over a 2x2 group of shader invocations.
The DerivativeGroupQuadsNV execution mode assembles shader invocations into 2x2 groups, where each group has x and y coordinates of the local invocation ID of the form (2m+{0,1}, 2n+{0,1}).
The DerivativeGroupLinearNV execution mode assembles shader invocations into 2x2 groups, where each group has local invocation index values of the form 4m+{0,1,2,3}.
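As a non-normative sketch, an application would typically check which of the two group modes the implementation supports before selecting an execution mode for its compute shaders; physicalDevice is assumed to exist (on Vulkan 1.0, the vkGetPhysicalDeviceFeatures2KHR alias from VK_KHR_get_physical_device_properties2 can be used instead):
extern VkPhysicalDevice physicalDevice;

VkPhysicalDeviceComputeShaderDerivativesFeaturesNV derivativeFeatures =
    { VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_COMPUTE_SHADER_DERIVATIVES_FEATURES_NV };

VkPhysicalDeviceFeatures2 features2 = { VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2 };
features2.pNext = &derivativeFeatures;
vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);

if (derivativeFeatures.computeDerivativeGroupQuads) {
    // Compute shaders may declare the DerivativeGroupQuadsNV execution mode.
}
if (derivativeFeatures.computeDerivativeGroupLinear) {
    // Compute shaders may declare the DerivativeGroupLinearNV execution mode.
}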
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_COMPUTE_SHADER_DERIVATIVES_FEATURES_NV
-
New Enums
None.
New Structures
New Functions
None.
New SPIR-V Capability
Issues
(1) Should we specify that the groups of four shader invocations used for derivatives in a compute shader are the same groups of four invocations that form a “quad” in shader subgroups?
RESOLVED: Yes.
Examples
None.
Version History
-
Revision 1, 2018-07-19 (Pat Brown)
-
Initial draft
-
VK_NV_corner_sampled_image
- Name String
-
VK_NV_corner_sampled_image
- Extension Type
-
Device extension
- Registered Extension Number
-
51
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2018-08-13
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Pat Brown, NVIDIA
-
Chris Lentini, NVIDIA
-
This extension adds support for a new image organization, which this extension refers to as “corner-sampled” images. A corner-sampled image differs from a conventional image in the following ways:
-
Texels are centered on integer coordinates. See Unnormalized Texel Coordinate Operations
-
Normalized coordinates are scaled using coord * (dim - 1) rather than coord * dim, where dim is the size of one dimension of the image. See normalized texel coordinate transform.
-
Partial derivatives are scaled using coord * (dim - 1) rather than coord * dim. See Scale Factor Operation.
-
Calculation of the next higher lod size goes according to ⌈dim / 2⌉ rather than ⌊dim / 2⌋. See Image Miplevel Sizing.
-
The minimum level size is 2x2 for 2D images and 2x2x2 for 3D images. See Image Miplevel Sizing.
This image organization is designed to facilitate a system like Ptex with separate textures for each face of a subdivision or polygon mesh. Placing sample locations at pixel corners allows applications to maintain continuity between adjacent patches by duplicating values along shared edges. Additionally, using the modified mipmapping logic along with texture dimensions of the form 2^n + 1 allows continuity across shared edges even if the adjacent patches use different level-of-detail values.
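A minimal, non-normative sketch of creating a 2D corner-sampled image follows; device is assumed to exist, and memory binding, format support queries and error handling are omitted:
extern VkDevice device;

VkImageCreateInfo imageInfo = { VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO };
imageInfo.flags         = VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV;
imageInfo.imageType     = VK_IMAGE_TYPE_2D;
imageInfo.format        = VK_FORMAT_R8G8B8A8_UNORM;
imageInfo.extent.width  = 257;   // 2^n + 1 sized, so texel corners line up across patches
imageInfo.extent.height = 257;
imageInfo.extent.depth  = 1;
// Under the corner-sampled mip sizing above, a full chain for 257 would be
// 257, 129, 65, 33, 17, 9, 5, 3, 2; a single level is used here for brevity.
imageInfo.mipLevels     = 1;
imageInfo.arrayLayers   = 1;
imageInfo.samples       = VK_SAMPLE_COUNT_1_BIT;
imageInfo.tiling        = VK_IMAGE_TILING_OPTIMAL;
imageInfo.usage         = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT;
imageInfo.sharingMode   = VK_SHARING_MODE_EXCLUSIVE;
imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;

VkImage image;
vkCreateImage(device, &imageInfo, NULL, &image);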
New Object Types
None.
New Enum Constants
-
Extending VkStructureType
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_CORNER_SAMPLED_IMAGE_FEATURES_NV
-
-
Extending VkImageCreateFlagBits
-
VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV
-
New Enums
None.
New Structures
New Functions
None.
New Built-In Variables
None.
New SPIR-V Capabilities
None.
Issues
-
What should this extension be named?
DISCUSSION: While naming this extension, we chose the most distinctive aspect of the image organization and referred to such images as “corner-sampled images”. As a result, we decided to name the extension NV_corner_sampled_image.
-
Do we need a format feature flag so formats can advertise if they support corner-sampling?
DISCUSSION: Currently NVIDIA supports this for all 2D and 3D formats, but not for cubemaps or depth-stencil formats. A format feature might be useful if other vendors would only support this on some formats.
-
Do integer texel coordinates have a different range for corner-sampled images?
RESOLVED: No, these are unchanged.
-
Do unnormalized sampler coordinates work with corner-sampled images? Are there any functional differences?
RESOLVED: Yes they work. Unnormalized coordinates are treated as already scaled for corner-sample usage.
-
Should we have a diagram in the “Image Operations” chapter demonstrating different texel sampling locations?
UNRESOLVED: Probably, but later.
Version History
-
Revision 1, 2018-08-14 (Daniel Koch)
-
Internal revisions
-
VK_NV_device_diagnostic_checkpoints
- Name String
-
VK_NV_device_diagnostic_checkpoints
- Extension Type
-
Device extension
- Registered Extension Number
-
207
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Nuno Subtil nsubtil
-
- Last Modified Date
-
2018-07-16
- Contributors
-
-
Oleg Kuznetsov, NVIDIA
-
Alex Dunn, NVIDIA
-
Jeff Bolz, NVIDIA
-
Eric Werness, NVIDIA
-
Daniel Koch, NVIDIA
-
This extension allows applications to insert markers in the command stream and associate them with custom data.
If a device lost error occurs, the application may then query the implementation for the last markers to cross specific implementation-defined pipeline stages, in order to narrow down which commands were executing at the time and might have caused the failure.
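The following non-normative sketch illustrates the intended flow; cmdBuf and queue are assumed to exist, the marker value is an arbitrary application-defined pointer, and real code would only perform the query after observing VK_ERROR_DEVICE_LOST:
extern VkCommandBuffer cmdBuf;
extern VkQueue         queue;

// While recording: associate subsequently recorded commands with a marker.
static const char drawSceneMarker[] = "draw scene";
vkCmdSetCheckpointNV(cmdBuf, drawSceneMarker);

// After a device-lost error: retrieve the last checkpoints that executed.
uint32_t checkpointCount = 0;
vkGetQueueCheckpointDataNV(queue, &checkpointCount, NULL);

VkCheckpointDataNV* checkpoints = malloc(checkpointCount * sizeof(VkCheckpointDataNV));
for (uint32_t i = 0; i < checkpointCount; ++i) {
    checkpoints[i].sType = VK_STRUCTURE_TYPE_CHECKPOINT_DATA_NV;
    checkpoints[i].pNext = NULL;
}
vkGetQueueCheckpointDataNV(queue, &checkpointCount, checkpoints);

for (uint32_t i = 0; i < checkpointCount; ++i) {
    printf("stage 0x%x reached marker %p\n",
           (unsigned)checkpoints[i].stage, checkpoints[i].pCheckpointMarker);
}
free(checkpoints);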
New Object Types
None.
New Enum Constants
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_CHECKPOINT_DATA_NV
-
VK_STRUCTURE_TYPE_QUEUE_FAMILY_CHECKPOINT_PROPERTIES_NV
New Enums
None.
New Structures
New Functions
Issues
None yet!
Version History
-
Revision 1, 2018-07-16 (Nuno Subtil)
-
Internal revisions
-
VK_NV_fill_rectangle
- Name String
-
VK_NV_fill_rectangle
- Extension Type
-
Device extension
- Registered Extension Number
-
154
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-05-22
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension adds a new VkPolygonMode enum value in which a triangle is rasterized by computing and filling its axis-aligned screen-space bounding box, disregarding the actual triangle edges. This can be useful for drawing a rectangle without it being split into two triangles with an internal edge. It is also useful to minimize the number of primitives that need to be drawn, particularly for a user interface.
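As a brief, non-normative illustration, only the rasterization state of a graphics pipeline needs to change to use the new mode; the remaining pipeline creation parameters are unchanged and omitted here:
VkPipelineRasterizationStateCreateInfo rasterInfo = { VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO };
rasterInfo.polygonMode = VK_POLYGON_MODE_FILL_RECTANGLE_NV;
rasterInfo.cullMode    = VK_CULL_MODE_NONE;
rasterInfo.frontFace   = VK_FRONT_FACE_COUNTER_CLOCKWISE;
rasterInfo.lineWidth   = 1.0f;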
New Object Types
None.
New Enum Constants
-
Extending VkPolygonMode
-
VK_POLYGON_MODE_FILL_RECTANGLE_NV
-
New Enums
None.
New Structures
None.
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2017-05-22 (Jeff Bolz)
-
Internal revisions
-
VK_NV_fragment_coverage_to_color
- Name String
-
VK_NV_fragment_coverage_to_color
- Extension Type
-
Device extension
- Registered Extension Number
-
150
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-05-21
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension allows the fragment coverage value, represented as an integer bitmask, to be substituted for a color output being written to a single-component color attachment with integer components (e.g. VK_FORMAT_R8_UINT).
The functionality provided by this extension is different from simply writing the SampleMask fragment shader output, in that the coverage value written to the framebuffer is taken after stencil test and depth test, as well as after fragment operations such as alpha-to-coverage.
This functionality may be useful for deferred rendering algorithms, where the second pass needs to know which samples belong to which original fragments.
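A non-normative sketch of enabling the functionality is shown below; the structure is chained into the pipeline's multisample state, and color attachment location 0 is assumed to use a single-component integer format such as VK_FORMAT_R8_UINT:
VkPipelineCoverageToColorStateCreateInfoNV coverageToColor = { VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_TO_COLOR_STATE_CREATE_INFO_NV };
coverageToColor.coverageToColorEnable   = VK_TRUE;
coverageToColor.coverageToColorLocation = 0;   // which color attachment receives the coverage mask

VkPipelineMultisampleStateCreateInfo msInfo = { VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO };
msInfo.pNext                = &coverageToColor;
msInfo.rasterizationSamples = VK_SAMPLE_COUNT_8_BIT;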
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_TO_COLOR_STATE_CREATE_INFO_NV
-
New Structures
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2017-05-21 (Jeff Bolz)
-
Internal revisions
-
VK_NV_fragment_shader_barycentric
- Name String
-
VK_NV_fragment_shader_barycentric
- Extension Type
-
Device extension
- Registered Extension Number
-
204
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Pat Brown nvpbrown
-
- Last Modified Date
-
2018-08-03
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_NV_fragment_shader_barycentric SPIR-V extension.
-
Requires the GL_NV_fragment_shader_barycentric extension for GLSL source languages.
-
- Contributors
-
-
Pat Brown, NVIDIA
-
Daniel Koch, NVIDIA
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_NV_fragment_shader_barycentric
The extension provides access to three additional fragment shader variable decorations in SPIR-V:
-
PerVertexNV, which indicates that a fragment shader input will not have interpolated values, but instead must be accessed with an extra array index that identifies one of the vertices of the primitive producing the fragment
-
BaryCoordNV, which indicates that the variable is a three-component floating-point vector holding barycentric weights for the fragment produced using perspective interpolation
-
BaryCoordNoPerspNV, which indicates that the variable is a three-component floating-point vector holding barycentric weights for the fragment produced using linear interpolation
When using GLSL source-based shader languages, the following variables from GL_NV_fragment_shader_barycentric map to these SPIR-V built-in decorations:
-
in vec3 gl_BaryCoordNV; → BaryCoordNV
-
in vec3 gl_BaryCoordNoPerspNV; → BaryCoordNoPerspNV
GLSL variables declared using the __pervertexNV GLSL qualifier are expected to be decorated with PerVertexNV in SPIR-V.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_SHADER_BARYCENTRIC_FEATURES_NV
-
New Enums
None.
New Structures
None.
New Functions
None.
New Built-In Variables
New SPIR-V Decorations
New SPIR-V Capabilities
Issues
(1) The AMD_shader_explicit_vertex_parameter extension provides similar functionality. Why write a new extension, and how is this extension different?
RESOLVED: For the purposes of Vulkan/SPIR-V, we chose to implement a separate extension due to several functional differences.
First, the hardware supporting this extension can provide a three-component barycentric weight vector for variables decorated with BaryCoordNV, while variables decorated with BaryCoordSmoothAMD provide only two components. In some cases, it may be more efficient to explicitly interpolate an attribute via:
float value = (baryCoordNV.x * v[0].attrib + baryCoordNV.y * v[1].attrib + baryCoordNV.z * v[2].attrib);
instead of
float value = (baryCoordSmoothAMD.x * (v[0].attrib - v[2].attrib) + baryCoordSmoothAMD.y * (v[1].attrib - v[2].attrib) + v[2].attrib);
Additionally, the semantics of the decoration BaryCoordPullModelAMD do not appear to map to anything supported by the initial hardware implementation of this extension.
This extension provides a smaller number of decorations than the AMD extension, as we expect that shaders could derive variables decorated with things like BaryCoordNoPerspCentroidAMD with explicit attribute interpolation instructions.
One other relevant difference is that explicit per-vertex attribute access using this extension does not require a constant vertex number.
(2) Why do the built-in SPIR-V decorations for this extension include two separate built-ins BaryCoordNV and BaryCoordNoPerspNV when a “no perspective” variable could be decorated with BaryCoordNV and NoPerspective?
RESOLVED: The SPIR-V extension for this feature chose to mirror the behavior of the GLSL extension, which provides two built-in variables. Additionally, it is not clear that it is a good idea (or even legal) to have two variables using the “same attribute”, but with different interpolation modifiers.
Version History
-
Revision 1, 2018-08-03 (Pat Brown)
-
Internal revisions
-
VK_NV_framebuffer_mixed_samples
- Name String
-
VK_NV_framebuffer_mixed_samples
- Extension Type
-
Device extension
- Registered Extension Number
-
153
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-06-04
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension allows multisample rendering with a raster and depth/stencil sample count that is larger than the color sample count. Rasterization and the results of the depth and stencil tests together determine the portion of a pixel that is “covered”. It can be useful to evaluate coverage at a higher frequency than color samples are stored. This coverage is then “reduced” to a collection of covered color samples, each having an opacity value corresponding to the fraction of the color sample covered. The opacity can optionally be blended into individual color samples.
Rendering with fewer color samples than depth/stencil samples greatly reduces the amount of memory and bandwidth consumed by the color buffer. However, converting the coverage values into opacity introduces artifacts where triangles share edges and may not be suitable for normal triangle mesh rendering.
One expected use case for this functionality is Stencil-then-Cover path rendering (similar to the OpenGL GL_NV_path_rendering extension). The stencil step determines the coverage (in the stencil buffer) for an entire path at the higher sample frequency, and then the cover step draws the path into the lower frequency color buffer using the coverage information to antialias path edges. With this two-step process, internal edges are fully covered when antialiasing is applied and there is no corruption on these edges.
The key features of this extension are:
-
It allows render pass and framebuffer objects to be created where the number of samples in the depth/stencil attachment in a subpass is a multiple of the number of samples in the color attachments in the subpass.
-
A coverage reduction step is added to Fragment Operations which converts a set of covered raster/depth/stencil samples to a set of color samples that perform blending and color writes. The coverage reduction step also includes an optional coverage modulation step, multiplying color values by a fractional opacity corresponding to the number of associated raster/depth/stencil samples covered.
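The coverage modulation step described above is configured through pipeline state. The following is a minimal C sketch, not part of the extension text; it assumes a graphics pipeline whose multisample state uses more raster/depth/stencil samples than color samples.
// Hedged sketch: modulate the RGB color components by the fraction of
// associated raster/depth/stencil samples covered, using the default table.
VkPipelineCoverageModulationStateCreateInfoNV coverageModulationState =
{
    VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_MODULATION_STATE_CREATE_INFO_NV, // sType
    NULL,                               // pNext
    0,                                  // flags
    VK_COVERAGE_MODULATION_MODE_RGB_NV, // coverageModulationMode
    VK_FALSE,                           // coverageModulationTableEnable
    0,                                  // coverageModulationTableCount
    NULL,                               // pCoverageModulationTable
};
// Chain into the multisample state used at graphics pipeline creation.
VkPipelineMultisampleStateCreateInfo multisampleState = { /* ... */ };
multisampleState.pNext = &coverageModulationState;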
New Object Types
None.
New Enum Constants
-
Extending VkStructureType
-
VK_STRUCTURE_TYPE_PIPELINE_COVERAGE_MODULATION_STATE_CREATE_INFO_NV
-
New Structures
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2017-06-04 (Jeff Bolz)
-
Internal revisions
-
VK_NV_geometry_shader_passthrough
- Name String
-
VK_NV_geometry_shader_passthrough
- Extension Type
-
Device extension
- Registered Extension Number
-
96
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2017-02-15
- Interactions and External Dependencies
-
-
This extension requires the SPV_NV_geometry_shader_passthrough SPIR-V extension.
-
This extension requires the GL_NV_geometry_shader_passthrough extension for GLSL source languages.
-
This extension requires the
geometryShader
feature.
-
- Contributors
-
-
Piers Daniell, NVIDIA
-
Jeff Bolz, NVIDIA
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_NV_geometry_shader_passthrough
Geometry shaders provide the ability for applications to process each
primitive sent through the graphics pipeline using a programmable shader.
However, one common use case treats them largely as a “passthrough”.
In this use case, the bulk of the geometry shader code simply copies inputs
from each vertex of the input primitive to corresponding outputs in the
vertices of the output primitive.
Such shaders might also compute values for additional built-in or
user-defined per-primitive attributes (e.g., Layer) to be assigned to
all the vertices of the output primitive.
This extension provides access to the PassthroughNV
decoration under
the GeometryShaderPassthroughNV
capability.
Adding this to a geometry shader input variable specifies that the values of
this input are copied to the corresponding vertex of the output primitive.
When using GLSL source-based shading languages, the passthrough
layout
qualifier from GL_NV_geometry_shader_passthrough maps to the
PassthroughNV
decoration.
To use the passthrough
layout, in GLSL the
GL_NV_geometry_shader_passthrough extension must be enabled.
Behaviour is described in the GL_NV_geometry_shader_passthrough extension
specification.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New Built-In Variables
None.
New Variable Decoration
New SPIR-V Capabilities
Issues
1) Should we require or allow a passthrough geometry shader to specify the output layout qualifiers for the output primitive type and maximum vertex count in the SPIR-V?
RESOLVED: Yes, they should be required in the SPIR-V. Per GL_NV_geometry_shader_passthrough they are not permitted in the GLSL source shader, but SPIR-V is lower-level. It is straightforward for the GLSL compiler to infer them from the input primitive type and to explicitly emit them in the SPIR-V according to the following table.
Input Layout | Implied Output Layout
---|---
points | layout(points, max_vertices = 1) out;
lines | layout(line_strip, max_vertices = 2) out;
triangles | layout(triangle_strip, max_vertices = 3) out;
2) How does interface matching work with passthrough geometry shaders?
RESOLVED: This is described in Passthrough Interface Matching.
In GL when using passthrough geometry shaders in separable mode, all inputs
must also be explicitly assigned location layout qualifiers.
In Vulkan all SPIR-V shader inputs (except built-ins) must also have
location decorations specified.
Redeclarations of built-in variables that add the passthrough layout
qualifier are exempted from the rule requiring location assignment because
built-in variables do not have locations and are matched by BuiltIn
decoration.
Sample Code
Consider the following simple geometry shader in unextended GLSL:
layout(triangles) in;
layout(triangle_strip) out;
layout(max_vertices=3) out;
in Inputs {
    vec2 texcoord;
    vec4 baseColor;
} v_in[];
out Outputs {
    vec2 texcoord;
    vec4 baseColor;
};
void main()
{
    int layer = compute_layer();
    for (int i = 0; i < 3; i++) {
        gl_Position = gl_in[i].gl_Position;
        texcoord = v_in[i].texcoord;
        baseColor = v_in[i].baseColor;
        gl_Layer = layer;
        EmitVertex();
    }
}
In this shader, the inputs gl_Position, Inputs.texcoord, and
Inputs.baseColor are simply copied from the input vertex to the
corresponding output vertex.
The only “interesting” work done by the geometry shader is computing and
emitting a gl_Layer
value for the primitive.
The following geometry shader, using this extension, is equivalent:
#extension GL_NV_geometry_shader_passthrough : require
layout(triangles) in;
// No output primitive layout qualifiers required.
// Redeclare gl_PerVertex to pass through "gl_Position".
layout(passthrough) in gl_PerVertex {
    vec4 gl_Position;
} gl_in[];
// Declare "Inputs" with "passthrough" to automatically copy members.
layout(passthrough) in Inputs {
    vec2 texcoord;
    vec4 baseColor;
} v_in[];
// No output block declaration required.
void main()
{
    // The shader simply computes and writes gl_Layer. We don't
    // loop over three vertices or call EmitVertex().
    gl_Layer = compute_layer();
}
Version History
-
Revision 1, 2017-02-15 (Daniel Koch)
-
Internal revisions
-
VK_NV_mesh_shader
- Name String
-
VK_NV_mesh_shader
- Extension Type
-
Device extension
- Registered Extension Number
-
203
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Christoph Kubisch pixeljetstream
-
- Last Modified Date
-
2018-07-19
- Contributors
-
-
Pat Brown, NVIDIA
-
Jeff Bolz, NVIDIA
-
Daniel Koch, NVIDIA
-
Piers Daniell, NVIDIA
-
Pierre Boudier, NVIDIA
-
This extension provides a new mechanism allowing applications to generate collections of geometric primitives via programmable mesh shading. It is an alternative to the existing programmable primitive shading pipeline, which relied on generating input primitives by a fixed function assembler as well as fixed function vertex fetch.
There are new programmable shader types — the task and mesh shader — to generate these collections to be processed by fixed-function primitive assembly and rasterization logic. When the task and mesh shaders are dispatched, they replace the standard programmable vertex processing pipeline, including vertex array attribute fetching, vertex shader processing, tessellation, and the geometry shader processing.
This extension also adds support for the following SPIR-V extension in Vulkan:
-
SPV_NV_mesh_shader
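As a brief illustration of the new drawing model, the following minimal C sketch (not part of the extension text) dispatches a pipeline built with a VK_SHADER_STAGE_MESH_BIT_NV stage; cmdBuffer and meshPipeline are assumed to be valid handles.
// Hedged sketch: launch 64 task/mesh workgroups instead of issuing a
// conventional vertex-based draw. No vertex input state is consumed.
vkCmdBindPipeline(cmdBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, meshPipeline);
vkCmdDrawMeshTasksNV(cmdBuffer,
                     64,  // taskCount: number of workgroups to launch
                     0);  // firstTask: ID of the first workgroup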
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_FEATURES_NV
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_PROPERTIES_NV
-
-
Extending VkShaderStageFlagBits
-
VK_SHADER_STAGE_TASK_BIT_NV
-
VK_SHADER_STAGE_MESH_BIT_NV
-
-
Extending VkPipelineStageFlagBits
-
VK_PIPELINE_STAGE_TASK_SHADER_BIT_NV
-
VK_PIPELINE_STAGE_MESH_SHADER_BIT_NV
-
New Enums
None.
New Structures
New or Modified Built-In Variables
-
(modified)
Position
-
(modified)
PointSize
-
(modified)
ClipDistance
-
(modified)
CullDistance
-
(modified)
PrimitiveId
-
(modified)
Layer
-
(modified)
ViewportIndex
-
(modified)
WorkgroupSize
-
(modified)
WorkgroupId
-
(modified)
LocalInvocationId
-
(modified)
GlobalInvocationId
-
(modified)
LocalInvocationIndex
-
(modified)
DrawIndex
-
(modified)
ViewportMaskNV
-
(modified)
PositionPerViewNV
-
(modified)
ViewportMaskPerViewNV
New SPIR-V Capability
Issues
-
How to name this extension?
RESOLVED: VK_NV_mesh_shader
Other options considered:
-
VK_NV_mesh_shading
-
VK_NV_programmable_mesh_shading
-
VK_NV_primitive_group_shading
-
VK_NV_grouped_drawing
-
-
Do we need a new VkPrimitiveTopology?
RESOLVED: NO, we skip the InputAssembler stage
-
Should we allow Instancing?
RESOLVED: NO, there is no fixed function input, other than the IDs. However, allow offsetting with a "first" value.
-
Should we use existing vkCmdDraw or introduce new functions?
RESOLVED: Introduce new functions.
New functions make it easier to separate from "programmable primitive shading" chapter, less "dual use" language about existing functions having alternative behavior. The text around the existing "draws" is heavily based around emitting vertices.
-
If new functions, how to name?
RESOLVED: CmdDrawMeshTasks*
Other options considered:
-
CmdDrawMeshed
-
CmdDrawTasked
-
CmdDrawGrouped
-
-
Should VK_SHADER_STAGE_ALL_GRAPHICS be updated to include the new stages?
RESOLVED: No. If an application were to be recompiled with headers that include additional shader stage bits in VK_SHADER_STAGE_ALL_GRAPHICS, then the previously valid application would no longer be valid on implementations that don’t support mesh or task shaders. This means the change would not be backwards compatible. It’s too bad VkShaderStageFlagBits doesn’t have a dedicated "all supported graphics stages" bit like VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT, which would have avoided this problem.
Version History
-
Revision 1, 2018-07-19 (Christoph Kubisch, Daniel Koch)
-
Internal revisions
-
VK_NV_ray_tracing
- Name String
-
VK_NV_ray_tracing
- Extension Type
-
Device extension
- Registered Extension Number
-
166
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_get_memory_requirements2
-
- Contact
-
-
Eric Werness ewerness
-
- Last Modified Date
-
2018-11-20
- Interactions and External Dependencies
-
-
This extension requires the SPV_NV_ray_tracing SPIR-V extension.
-
This extension requires the GL_NV_ray_tracing extension for GLSL source languages.
-
- Contributors
-
-
Eric Werness, NVIDIA
-
Ashwin Lele, NVIDIA
-
Robert Stepinski, NVIDIA
-
Nuno Subtil, NVIDIA
-
Christoph Kubisch, NVIDIA
-
Martin Stich, NVIDIA
-
Daniel Koch, NVIDIA
-
Jeff Bolz, NVIDIA
-
Joshua Barczak, Intel
-
Tobias Hector, AMD
-
Henrik Rydgard, NVIDIA
-
Pascal Gautron, NVIDIA
-
Rasterization has been the dominant method to produce interactive graphics, but increasing performance of graphics hardware has made ray tracing a viable option for interactive rendering. Being able to integrate ray tracing with traditional rasterization makes it easier for applications to incrementally add ray traced effects to existing applications or to do hybrid approaches with rasterization for primary visibility and ray tracing for secondary queries.
To enable ray tracing, this extension adds a few different categories of new functionality:
-
Acceleration structure objects and build commands
-
A new pipeline type with new shader domains
-
An indirection table to link shader groups with acceleration structure items
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_NV_ray_tracing
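As an informal illustration of how the shader group indirection table is consumed at dispatch time, the following is a hedged C sketch, not part of the extension text. It assumes cmdBuffer, rtPipeline, and sbtBuffer are valid handles, that raygenOffset, missOffset, and hitGroupOffset are hypothetical offsets laid out to honor the device's shaderGroupBaseAlignment, and that handleSize comes from VkPhysicalDeviceRayTracingPropertiesNV::shaderGroupHandleSize.
// Hedged sketch: bind the ray tracing pipeline and launch a width x height
// grid of ray generation shader invocations.
vkCmdBindPipeline(cmdBuffer, VK_PIPELINE_BIND_POINT_RAY_TRACING_NV, rtPipeline);
vkCmdTraceRaysNV(cmdBuffer,
                 sbtBuffer, raygenOffset,               // ray generation record
                 sbtBuffer, missOffset, handleSize,     // miss records
                 sbtBuffer, hitGroupOffset, handleSize, // hit group records
                 VK_NULL_HANDLE, 0, 0,                  // no callable records
                 width, height, 1);                     // launch dimensions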
New Object Types
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_RAY_TRACING_PIPELINE_CREATE_INFO_NV
-
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_CREATE_INFO_NV
-
VK_STRUCTURE_TYPE_GEOMETRY_NV
-
VK_STRUCTURE_TYPE_GEOMETRY_TRIANGLES_NV
-
VK_STRUCTURE_TYPE_GEOMETRY_AABB_NV
-
VK_STRUCTURE_TYPE_BIND_ACCELERATION_STRUCTURE_MEMORY_INFO_NV
-
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_ACCELERATION_STRUCTURE_NV
-
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_MEMORY_REQUIREMENTS_INFO_NV
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_RAY_TRACING_PROPERTIES_NV
-
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_INFO_NV
-
VK_STRUCTURE_TYPE_RAY_TRACING_SHADER_GROUP_CREATE_INFO_NV
-
-
Extending VkShaderStageFlagBits:
-
VK_SHADER_STAGE_RAYGEN_BIT_NV
-
VK_SHADER_STAGE_ANY_HIT_BIT_NV
-
VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV
-
VK_SHADER_STAGE_MISS_BIT_NV
-
VK_SHADER_STAGE_INTERSECTION_BIT_NV
-
VK_SHADER_STAGE_CALLABLE_BIT_NV
-
-
Extending VkPipelineStageFlagBits:
-
VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_NV
-
VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_NV
-
-
Extending VkBufferUsageFlagBits:
-
VK_BUFFER_USAGE_RAY_TRACING_BIT_NV
-
-
Extending VkPipelineBindPoint:
-
VK_PIPELINE_BIND_POINT_RAY_TRACING_NV
-
-
Extending VkDescriptorType:
-
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV
-
-
Extending VkAccessFlagBits:
-
VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_NV
-
VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_NV
-
-
Extending VkQueryType:
-
VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_NV
-
-
Extending VkPipelineCreateFlagBits:
-
VK_PIPELINE_CREATE_DEFER_COMPILE_BIT_NV
-
-
Extending VkIndexType:
-
VK_INDEX_TYPE_NONE_NV
-
New Enums
New Structures
New Functions
New or Modified Built-In Variables
New SPIR-V Capabilities
Issues
1) Are there issues?
RESOLVED: Yes.
Sample Code
Example ray generation GLSL shader
#version 450 core
#extension GL_NV_ray_tracing : require
layout(set = 0, binding = 0, rgba8) uniform image2D image;
layout(set = 0, binding = 1) uniform accelerationStructureNV as;
layout(location = 0) rayPayloadNV float payload;
void main()
{
    vec4 col = vec4(0, 0, 0, 1);
    vec3 origin = vec3(float(gl_LaunchIDNV.x) / float(gl_LaunchSizeNV.x),
                       float(gl_LaunchIDNV.y) / float(gl_LaunchSizeNV.y),
                       1.0);
    vec3 dir = vec3(0.0, 0.0, -1.0);
    traceNV(as, 0, 0xff, 0, 1, 0, origin, 0.0, dir, 1000.0, 0);
    col.y = payload;
    imageStore(image, ivec2(gl_LaunchIDNV.xy), col);
}
Version History
-
Revision 1, 2018-09-11 (Robert Stepinski, Nuno Subtil, Eric Werness)
-
Internal revisions
-
-
Revision 2, 2018-10-19 (Eric Werness)
-
rename to VK_NV_ray_tracing, add support for callables.
-
too many updates to list
-
-
Revision 3, 2018-11-20 (Daniel Koch)
-
update to use InstanceId instead of InstanceIndex as implemented.
-
VK_NV_representative_fragment_test
- Name String
-
VK_NV_representative_fragment_test
- Extension Type
-
Device extension
- Registered Extension Number
-
167
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Kedarnath Thangudu kthangudu
-
- Last Modified Date
-
2018-09-13
- Contributors
-
-
Kedarnath Thangudu, NVIDIA
-
Christoph Kubisch, NVIDIA
-
Pierre Boudier, NVIDIA
-
Pat Brown, NVIDIA
-
Jeff Bolz, NVIDIA
-
Eric Werness, NVIDIA
-
This extension provides a new representative fragment test that allows implementations to reduce the amount of rasterization and fragment processing work performed for each point, line, or triangle primitive. For any primitive that produces one or more fragments that pass all other early fragment tests, the implementation is permitted to choose one or more “representative” fragments for processing and discard all other fragments. For draw calls rendering multiple points, lines, or triangles arranged in lists, strips, or fans, the representative fragment test is performed independently for each of those primitives.
This extension is useful for applications that use an early render pass to determine the full set of primitives that would be visible in the final scene. In this render pass, such applications would set up a fragment shader that enables early fragment tests and writes to an image or shader storage buffer to record the ID of the primitive that generated the fragment. Without this extension, the shader would record the ID separately for each visible fragment of each primitive. With this extension, fewer stores will be performed, particularly for large primitives.
The representative fragment test has no effect if early fragment tests are not enabled via the fragment shader. The set of fragments discarded by the representative fragment test is implementation-dependent and may vary from frame to frame. In some cases, the representative fragment test may not discard any fragments for a given primitive.
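A minimal C sketch (not part of the extension text) of opting a graphics pipeline into this test follows; it assumes the representativeFragmentTest feature has been enabled on the device.
// Hedged sketch: enable the representative fragment test by chaining this
// structure into VkGraphicsPipelineCreateInfo::pNext.
VkPipelineRepresentativeFragmentTestStateCreateInfoNV representativeTestState =
{
    VK_STRUCTURE_TYPE_PIPELINE_REPRESENTATIVE_FRAGMENT_TEST_STATE_CREATE_INFO_NV, // sType
    NULL,    // pNext
    VK_TRUE, // representativeFragmentTestEnable
};
VkGraphicsPipelineCreateInfo pipelineCreateInfo = { /* ... */ };
pipelineCreateInfo.pNext = &representativeTestState;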
New Object Types
None.
New Enum Constants
-
Extending VkStructureType
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_REPRESENTATIVE_FRAGMENT_TEST_FEATURES_NV
-
VK_STRUCTURE_TYPE_PIPELINE_REPRESENTATIVE_FRAGMENT_TEST_STATE_CREATE_INFO_NV
-
New Enums
None.
New Structures
New Functions
None.
Issues
(1) Is the representative fragment test guaranteed to have any effect?
RESOLVED: No. As specified, we only guarantee that each primitive with at least one fragment that passes prior tests will have one fragment passing the representative fragment tests. We don’t guarantee that any particular fragment will fail the test.
In the initial implementation of this extension, the representative fragment test is treated as an optimization that may be completely disabled for some pipeline states. This feature was designed for a use case where the fragment shader records information on individual primitives using shader storage buffers or storage images, with no writes to color or depth buffers.
(2) Will the set of fragments that pass the representative fragment test be repeatable if you draw the same scene over and over again?
RESOLVED: No. The set of fragments that pass the representative fragment test is implementation-dependent and may vary due to the timing of operations performed by the GPU.
(3) What happens if you enable the representative fragment test with writes to color and/or depth render targets enabled?
RESOLVED: If writes to the color or depth buffer are enabled, they will be performed for any fragments that survive the relevant tests. Any fragments that fail the representative fragment test will not update color buffers. For the use cases intended for this feature, we don’t expect color or depth writes to be enabled.
(4) How do derivatives and automatic texture level of detail computations work with the representative fragment test enabled?
RESOLVED: If a fragment shader uses derivative functions or texture lookups using automatic level of detail computation, derivatives will be computed identically whether or not the representative fragment test is enabled. For the use cases intended for this feature, we don’t expect the use of derivatives in the fragment shader.
Version History
-
Revision 2, 2018-09-13 (pbrown)
-
Add issues.
-
-
Revision 1, 2018-08-22 (Kedarnath Thangudu)
-
Internal Revisions
-
VK_NV_sample_mask_override_coverage
- Name String
-
VK_NV_sample_mask_override_coverage
- Extension Type
-
Device extension
- Registered Extension Number
-
95
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2016-12-08
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires the SPV_NV_sample_mask_override_coverage SPIR-V extension.
-
This extension requires the GL_NV_sample_mask_override_coverage extension for GLSL source languages.
-
- Contributors
-
-
Daniel Koch, NVIDIA
-
Jeff Bolz, NVIDIA
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_NV_sample_mask_override_coverage
The extension provides access to the OverrideCoverageNV
decoration
under the SampleMaskOverrideCoverageNV
capability.
Adding this decoration to a variable with the SampleMask
builtin
decoration allows the shader to modify the coverage mask and affect which
samples are used to process the fragment.
When using GLSL source-based shader languages, the override_coverage
layout qualifier from GL_NV_sample_mask_override_coverage maps to the
OverrideCoverageNV
decoration.
To use the override_coverage
layout qualifier in GLSL the
GL_NV_sample_mask_override_coverage extension must be enabled.
Behavior is described in the GL_NV_sample_mask_override_coverage extension
spec.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New Built-In Variables
None.
New Variable Decoration
New SPIR-V Capabilities
Issues
None.
Version History
-
Revision 1, 2016-12-08 (Piers Daniell)
-
Internal revisions
-
VK_NV_scissor_exclusive
- Name String
-
VK_NV_scissor_exclusive
- Extension Type
-
Device extension
- Registered Extension Number
-
206
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Pat Brown nvpbrown
-
- Last Modified Date
-
2018-07-31
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
None
- Contributors
-
-
Pat Brown, NVIDIA
-
Jeff Bolz, NVIDIA
-
Piers Daniell, NVIDIA
-
Daniel Koch, NVIDIA
-
This extension adds support for an exclusive scissor test to Vulkan. The exclusive scissor test behaves like the scissor test, except that the exclusive scissor test fails for pixels inside the corresponding rectangle and passes for pixels outside the rectangle. If the same rectangle is used for both the scissor and exclusive scissor tests, the exclusive scissor test will pass if and only if the scissor test fails.
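A minimal C sketch (not part of the extension text) of updating the exclusive scissor rectangle dynamically follows; it assumes the pipeline was created with VK_DYNAMIC_STATE_EXCLUSIVE_SCISSOR_NV and that cmdBuffer is a valid command buffer.
// Hedged sketch: discard fragments inside a 256x256 region anchored at
// (128, 128) for viewport 0.
VkRect2D exclusiveScissor =
{
    { 128, 128 }, // offset
    { 256, 256 }, // extent
};
vkCmdSetExclusiveScissorNV(cmdBuffer,
                           0,                  // firstExclusiveScissor
                           1,                  // exclusiveScissorCount
                           &exclusiveScissor); // pExclusiveScissors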
New Object Types
None.
New Enum Constants
-
Extending VkStructureType
-
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_EXCLUSIVE_SCISSOR_STATE_CREATE_INFO_NV
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXCLUSIVE_SCISSOR_FEATURES_NV
-
-
Extending VkDynamicState
-
VK_DYNAMIC_STATE_EXCLUSIVE_SCISSOR_NV
-
New Enums
None.
New Structures
New Functions
New Built-In Variables
None.
New SPIR-V Capabilities
None.
Issues
1) For the scissor test, the viewport state must be created with a matching number of scissor and viewport rectangles. Should we have the same requirement for exclusive scissors?
RESOLVED: For exclusive scissors, we relax this requirement and allow an exclusive scissor rectangle count that is either zero or equal to the number of viewport rectangles. If you pass in an exclusive scissor count of zero, the exclusive scissor test is treated as disabled.
Version History
-
Revision 1, 2018-07-31 (Pat Brown)
-
Internal revisions
-
VK_NV_shader_image_footprint
- Name String
-
VK_NV_shader_image_footprint
- Extension Type
-
Device extension
- Registered Extension Number
-
205
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Pat Brown nvpbrown
-
- Last Modified Date
-
2018-09-13
- IP Status
-
No known IP claims.
- Contributors
-
-
Pat Brown, NVIDIA
-
Chris Lentini, NVIDIA
-
Daniel Koch, NVIDIA
-
Jeff Bolz, NVIDIA
-
This extension adds Vulkan support for the SPV_NV_shader_image_footprint
SPIR-V extension.
That SPIR-V extension provides a new instruction
OpImageSampleFootprintNV
allowing shaders to determine the set of
texels that would be accessed by an equivalent filtered texture lookup.
Instead of returning a filtered texture value, the instruction returns a structure that can be interpreted by shader code to determine the footprint of a filtered texture lookup. This structure includes integer values that identify a small neighborhood of texels in the image being accessed and a bitfield that indicates which texels in that neighborhood would be used. The structure also includes a bitfield where each bit identifies whether any texel in a small aligned block of texels would be fetched by the texture lookup. The size of each block is specified by an access granularity provided by the shader. The minimum granularity supported by this extension is 2x2 (for 2D textures) and 2x2x2 (for 3D textures); the maximum granularity is 256x256 (for 2D textures) or 64x32x32 (for 3D textures). Each footprint query returns the footprint from a single texture level. When using minification filters that combine accesses from multiple mipmap levels, shaders must perform separate queries for the two levels accessed (“fine” and “coarse”). The footprint query also returns a flag indicating if the texture lookup would access texels from only one mipmap level or from two neighboring levels.
This extension should be useful for multi-pass rendering operations that do an initial expensive rendering pass to produce a first image that is then used as a texture for a second pass. If the second pass ends up accessing only portions of the first image (e.g., due to visibility), the work spent rendering the non-accessed portion of the first image was wasted. With this feature, an application can limit this waste using an initial pass over the geometry in the second image that performs a footprint query for each visible pixel to determine the set of pixels that it needs from the first image. This pass would accumulate an aggregate footprint of all visible pixels into a separate “footprint image” using shader atomics. Then, when rendering the first image, the application can kill all shading work for pixels not in this aggregate footprint.
This extension has a number of limitations.
The OpImageSampleFootprintNV instruction only supports two- and
three-dimensional textures.
Footprint evaluation only supports the CLAMP_TO_EDGE wrap mode; results are
undefined for all other wrap modes.
Only a limited set of granularity values is supported, and that set does not
provide separate coverage information for each individual texel in the
original image.
When using SPIR-V generated from the OpenGL Shading Language, the new
instruction will be generated from code using the new textureFootprint*NV
built-in functions from the GL_NV_shader_texture_footprint shading language
extension.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_IMAGE_FOOTPRINT_FEATURES_NV
-
New Enums
None.
New Structures
New Functions
None.
New SPIR-V Capability
Issues
(1) The footprint returned by the SPIR-V instruction is a structure that includes an anchor, an offset, and a mask that represents an 8x8 or 4x4x4 neighborhood of texel groups. But the bits of the mask are not stored in simple pitch order. Why is the footprint built this way?
RESOLVED: We expect that applications using this feature will want to use a fixed granularity and accumulate coverage information from the returned footprints into an aggregate “footprint image” that tracks the portions of an image that would be needed by regular texture filtering. If an application is using a two-dimensional image with 4x4 pixel granularity, we expect that the footprint image will use 64-bit texels where each bit in an 8x8 array of bits corresponds to coverage for a 4x4 block in the original image. Texel (0,0) in the footprint image would correspond to texels (0,0) through (31,31) in the original image.
In the usual case, the footprint for a single access will be fully contained in a 32x32 aligned region of the original texture, which corresponds to a single 64-bit texel in the footprint image. In that case, the implementation will return an anchor coordinate pointing at the single footprint image texel, an offset vector of (0,0), and a mask whose bits are aligned with the bits in the footprint texel. For this case, the shader can simply atomically OR the mask bits into the contents of the footprint texel to accumulate footprint coverage.
In the worst case, the footprint for a single access spans multiple 32x32 aligned regions and may require updates to four separate footprint image texels. In this case, the implementation will return an anchor coordinate pointing at the lower right footprint image texel and an offset will identify how many “columns” and “rows” of the returned 8x8 mask correspond to footprint texels to the left and above the anchor texel. If the anchor is (2,3), the 64 bits of the returned mask are arranged spatially as follows, where each 4x4 block is assigned a bit number that matches its bit number in the footprint image texels:
+-------------------------+-------------------------+
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- 46 47 | 40 41 42 43 44 45 -- -- |
| -- -- -- -- -- -- 54 55 | 48 49 50 51 52 53 -- -- |
| -- -- -- -- -- -- 62 63 | 56 57 58 59 60 61 -- -- |
+-------------------------+-------------------------+
| -- -- -- -- -- -- 06 07 | 00 01 02 03 04 05 -- -- |
| -- -- -- -- -- -- 14 15 | 08 09 10 11 12 13 -- -- |
| -- -- -- -- -- -- 22 23 | 16 17 18 19 20 21 -- -- |
| -- -- -- -- -- -- 30 31 | 24 25 26 27 28 29 -- -- |
| -- -- -- -- -- -- 38 39 | 32 33 34 35 36 37 -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
+-------------------------+-------------------------+
To accumulate coverage for each of the four footprint image texels, a shader can AND the returned mask with simple masks derived from the x and y offset values and then atomically OR the updated mask bits into the contents of the corresponding footprint texel.
// Reassemble the 64-bit footprint mask returned as two 32-bit words.
uint64_t returnedMask = (uint64_t(footprint.mask.x) | (uint64_t(footprint.mask.y) << 32));
// Build column and row masks from the returned offset, then split the
// returned coverage into the up to four affected footprint image texels.
uint64_t rightMask = ((0xFF >> footprint.offset.x) * 0x0101010101010101UL);
uint64_t bottomMask = 0xFFFFFFFFFFFFFFFFUL >> (8 * footprint.offset.y);
uint64_t bottomRight = returnedMask & bottomMask & rightMask;
uint64_t bottomLeft = returnedMask & bottomMask & (~rightMask);
uint64_t topRight = returnedMask & (~bottomMask) & rightMask;
uint64_t topLeft = returnedMask & (~bottomMask) & (~rightMask);
(2) What should an application do to ensure maximum performance when accumulating footprints into an aggregate footprint image?
RESOLVED: We expect that the most common usage of this feature will be to accumulate aggregate footprint coverage, as described in the previous issue. Even if you ignore the anisotropic filtering case where the implementation may return a granularity larger than that requested by the caller, each shader invocation will need to use atomic functions to update up to four footprint image texels for each level of detail accessed. Having each active shader invocation perform multiple atomic operations can be expensive, particularly when neighboring invocations will want to update the same footprint image texels.
Techniques that can be used to reduce the number of atomic operations performed when accumulating coverage include:
-
Have logic that detects returned footprints where all components of the returned offset vector are zero. In that case, the mask returned by the footprint function is guaranteed to be aligned with the footprint image texels and affects only a single footprint image texel.
-
Have fragment shaders communicate using built-in functions from the VK_NV_shader_subgroup_partitioned extension or other shader subgroup extensions. If you have multiple invocations in a subgroup that need to update the same texel (x,y) in the footprint image, compute an aggregate footprint mask across all invocations in the subgroup updating that texel and have a single invocation perform an atomic operation using that aggregate mask.
-
When the returned footprint spans multiple texels in the footprint image, each invocation needs to perform four atomic operations. In the previous issue, we had an example that computed separate masks for “topLeft”, “topRight”, “bottomLeft”, and “bottomRight”. When the invocations in a subgroup have good locality, it might be the case that the “top left” texel for some invocations refers to footprint image texel (10,10), while neighbors might have their “top left” texels at (11,10), (10,11), and (11,11). If you compute separate masks for even/odd x and y values instead of left/right or top/bottom, the “odd/odd” mask for all invocations in the subgroup holds coverage for footprint image texel (11,11), which can be updated by a single atomic operation for the entire subgroup.
Examples
TBD
Version History
-
Revision 2, 2018-09-13 (Pat Brown)
-
Add issue (2) with performance tips.
-
-
Revision 1, 2018-08-12 (Pat Brown)
-
Initial draft
-
VK_NV_shader_subgroup_partitioned
- Name String
-
VK_NV_shader_subgroup_partitioned
- Extension Type
-
Device extension
- Registered Extension Number
-
199
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.1
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2018-03-17
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension enables support for a new class of subgroup operations via
the
GL_NV_shader_subgroup_partitioned
GLSL extension and
SPV_NV_shader_subgroup_partitioned
SPIR-V extension.
Support for these new operations is advertised via the
VK_SUBGROUP_FEATURE_PARTITIONED_BIT_NV
bit.
This extension requires Vulkan 1.1, for general subgroup support.
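Since support is advertised through the Vulkan 1.1 subgroup properties, the following minimal C sketch (not part of the extension text, assuming a valid physicalDevice) shows how an application might check for the new bit before compiling shaders that use GL_NV_shader_subgroup_partitioned.
// Hedged sketch: query the supported subgroup operations and test the
// partitioned bit added by this extension.
VkPhysicalDeviceSubgroupProperties subgroupProperties =
{
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES, // sType
    NULL,                                                  // pNext
};
VkPhysicalDeviceProperties2 properties2 =
{
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2, // sType
    &subgroupProperties,                            // pNext
};
vkGetPhysicalDeviceProperties2(physicalDevice, &properties2);
if (subgroupProperties.supportedOperations & VK_SUBGROUP_FEATURE_PARTITIONED_BIT_NV) {
    // Partitioned subgroup operations may be used.
}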
New Object Types
None.
New Enum Constants
-
Extending VkSubgroupFeatureFlagBits:
-
VK_SUBGROUP_FEATURE_PARTITIONED_BIT_NV
-
New Enums
None.
New Structures
None.
New Functions
None.
Issues
None.
Version History
-
Revision 1, 2018-03-17 (Jeff Bolz)
-
Internal revisions
-
VK_NV_shading_rate_image
- Name String
-
VK_NV_shading_rate_image
- Extension Type
-
Device extension
- Registered Extension Number
-
165
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Pat Brown nvpbrown
-
- Last Modified Date
-
2018-09-13
- Contributors
-
-
Pat Brown, NVIDIA
-
Carsten Rohde, NVIDIA
-
Jeff Bolz, NVIDIA
-
Daniel Koch, NVIDIA
-
Mathias Schott, NVIDIA
-
Matthew Netsch, Qualcomm Technologies, Inc.
-
This extension allows applications to use a variable shading rate when processing fragments of rasterized primitives. By default, Vulkan will spawn one fragment shader for each pixel covered by a primitive. In this extension, applications can bind a shading rate image that can be used to vary the number of fragment shader invocations across the framebuffer. Some portions of the screen may be configured to spawn up to 16 fragment shaders for each pixel, while other portions may use a single fragment shader invocation for a 4x4 block of pixels. This can be useful for use cases like eye tracking, where the portion of the framebuffer that the user is looking at directly can be processed at high frequency, while distant corners of the image can be processed at lower frequency. Each texel in the shading rate image represents a fixed-size rectangle in the framebuffer, covering 16x16 pixels in the initial implementation of this extension. When rasterizing a primitive covering one of these rectangles, the Vulkan implementation reads a texel in the bound shading rate image and looks up the fetched value in a palette to determine a base shading rate.
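As an informal illustration of the mechanism described above, the following is a minimal C sketch, not part of the extension text. It assumes the shadingRateImage feature is enabled, the bound pipeline was created with VK_DYNAMIC_STATE_VIEWPORT_SHADING_RATE_PALETTE_NV, and shadingRateView is an image view created with VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV.
// Hedged sketch: bind a shading rate image and a two-entry palette for
// viewport 0. Texel value 0 disables invocations; value 1 selects one
// invocation per 4x4 block of pixels.
VkShadingRatePaletteEntryNV paletteEntries[] =
{
    VK_SHADING_RATE_PALETTE_ENTRY_NO_INVOCATIONS_NV,
    VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X4_PIXELS_NV,
};
VkShadingRatePaletteNV palette =
{
    2,              // shadingRatePaletteEntryCount
    paletteEntries, // pShadingRatePaletteEntries
};
vkCmdBindShadingRateImageNV(cmdBuffer, shadingRateView,
                            VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV);
vkCmdSetViewportShadingRatePaletteNV(cmdBuffer, 0, 1, &palette);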
In addition to the API support controlling rasterization, this extension also adds Vulkan support for the SPV_NV_shading_rate extension to SPIR-V. That extension provides two fragment shader variable decorations that allow fragment shaders to determine the shading rate used for processing the fragment:
-
FragmentSizeNV, which indicates the width and height of the set of pixels processed by the fragment shader.
-
InvocationsPerPixelNV, which indicates the maximum number of fragment shader invocations that could be spawned for the pixel(s) covered by the fragment.
When using SPIR-V in conjunction with the OpenGL Shading Language (GLSL),
the fragment shader capabilities are provided by the
GL_NV_shading_rate_image language extension and correspond to the built-in
variables gl_FragmentSizeNV and gl_InvocationsPerPixelNV, respectively.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_SHADING_RATE_IMAGE_STATE_CREATE_INFO_NV
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADING_RATE_IMAGE_FEATURES_NV
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADING_RATE_IMAGE_PROPERTIES_NV
-
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_COARSE_SAMPLE_ORDER_STATE_CREATE_INFO_NV
-
-
Extending VkImageLayout:
-
VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV
-
-
Extending VkDynamicState:
-
VK_DYNAMIC_STATE_VIEWPORT_SHADING_RATE_PALETTE_NV
-
-
Extending VkAccessFlagBits:
-
VK_ACCESS_SHADING_RATE_IMAGE_READ_BIT_NV
-
-
Extending VkImageUsageFlagBits:
-
VK_IMAGE_USAGE_SHADING_RATE_IMAGE_BIT_NV
-
-
Extending VkPipelineStageFlagBits
-
VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV
-
New Enums
-
VkShadingRatePaletteEntryNV, containing the following constants:
-
VK_SHADING_RATE_PALETTE_ENTRY_NO_INVOCATIONS_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_16_INVOCATIONS_PER_PIXEL_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_8_INVOCATIONS_PER_PIXEL_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_4_INVOCATIONS_PER_PIXEL_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_2_INVOCATIONS_PER_PIXEL_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_PIXEL_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X1_PIXELS_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_1X2_PIXELS_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X2_PIXELS_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X2_PIXELS_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_2X4_PIXELS_NV
-
VK_SHADING_RATE_PALETTE_ENTRY_1_INVOCATION_PER_4X4_PIXELS_NV
-
New Structures
Issues
(1) When using shading rates that specify “coarse” fragments covering multiple pixels, we will generate a combined coverage mask that combines the coverage masks of all pixels covered by the fragment. By default, these masks are combined in an implementation-dependent order. Should we provide a mechanism allowing applications to query or specify an exact order?
RESOLVED: Yes, this feature is useful for cases where most of the fragment shader can be evaluated once for an entire coarse fragment, but where some per-pixel computations are also required. For example, a per-pixel alpha test may want to kill all the samples for some pixels in a coarse fragment. This sort of test can be implemented using an output sample mask, but such a shader would need to know which bit in the mask corresponds to each sample in the coarse fragment. We are including a mechanism to allow applications to specify the orders of coverage samples for each shading rate and sample count, either as static pipeline state or dynamically via a command buffer. This portion of the extension has its own feature bit.
We will not be providing a query to determine the implementation-dependent default ordering. The thinking here is that if an application cares enough about the coarse fragment sample ordering to perform such a query, it could instead just set its own order, also using custom per-pixel sample locations if required.
(2) For the pipeline stage
VK_PIPELINE_STAGE_SHADING_RATE_IMAGE_BIT_NV
, should we specify a
precise location in the pipeline the shading rate image is accessed
(after geometry shading, but before the early fragment tests) or leave
it under-specified in case there are other implementations that access
the image in a different pipeline location?
RESOLVED: We are specifying the pipeline stage to be between the final
stage used for vertex processing (VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT)
and the first stage used for fragment processing
(VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT), which seems to be the
natural place to access the shading rate image.
(3) How do centroid-sampled variables work with fragments larger than one pixel?
RESOLVED: For single-pixel fragments, fragment shader inputs decorated with
Centroid are sampled at an implementation-dependent location in the
intersection of the area of the primitive being rasterized and the area of
the pixel that corresponds to the fragment.
With multi-pixel fragments, we follow a similar pattern, using the
intersection of the primitive and the set of pixels corresponding to the
fragment.
One important thing to keep in mind when using such “coarse” shading rates is that fragment attributes are sampled at the center of the fragment by default, regardless of the set of pixels/samples covered by the fragment. For fragments with a size of 4x4 pixels, this center location will be more than two pixels (1.5 * sqrt(2)) away from the center of the pixels at the corners of the fragment. When rendering a primitive that covers only a small part of a coarse fragment, sampling a color outside the primitive can produce overly bright or dark color values if the color values have a large gradient. To deal with this, an application can use centroid sampling on attributes where “extrapolation” artifacts can lead to overly bright or dark pixels. Note that this same problem also exists for multisampling with single-pixel fragments, but is less severe because it only affects certain samples of a pixel and such bright/dark samples may be averaged with other samples that don’t have a similar problem.
Version History
-
Revision 2, 2018-09-13 (Pat Brown)
-
Miscellaneous edits preparing the specification for publication.
-
-
Revision 1, 2018-08-08 (Pat Brown)
-
Internal revisions
-
VK_NV_viewport_array2
- Name String
-
VK_NV_viewport_array2
- Extension Type
-
Device extension
- Registered Extension Number
-
97
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2017-02-15
- Interactions and External Dependencies
-
-
This extension requires the SPV_NV_viewport_array2 SPIR-V extension.
-
This extension requires the GL_NV_viewport_array2 extension for GLSL source languages.
-
This extension requires the geometryShader and multiViewport features.
-
This extension interacts with the tessellationShader feature.
-
- Contributors
-
-
Piers Daniell, NVIDIA
-
Jeff Bolz, NVIDIA
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_NV_viewport_array2
which allows a single primitive to be broadcast to multiple viewports and/or
multiple layers.
A new shader built-in output ViewportMaskNV
is provided, which allows a
single primitive to be output to multiple viewports simultaneously.
Also, a new SPIR-V decoration is added to control whether the effective
viewport index is added into the variable decorated with the Layer
built-in decoration.
These capabilities allow a single primitive to be output to multiple layers
simultaneously.
This extension allows variables decorated with the Layer
and
ViewportIndex
built-ins to be exported from vertex or tessellation
shaders, using the ShaderViewportIndexLayerNV
capability.
This extension adds a new ViewportMaskNV
built-in decoration that is
available for output variables in vertex, tessellation evaluation, and
geometry shaders, and a new ViewportRelativeNV
decoration that can be
added on variables decorated with Layer
when using the
ShaderViewportMaskNV
capability.
When using GLSL source-based shading languages, the gl_ViewportMask[]
built-in output variable and viewport_relative layout qualifier from
GL_NV_viewport_array2 map to the ViewportMaskNV and
ViewportRelativeNV decorations, respectively.
Behaviour is described in the GL_NV_viewport_array2 extension
specification.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New or Modified Built-In Variables
-
(modified)
Layer
-
(modified)
ViewportIndex
New Variable Decoration
New SPIR-V Capabilities
Issues
None yet!
Version History
-
Revision 1, 2017-02-15 (Daniel Koch)
-
Internal revisions
-
VK_NV_viewport_swizzle
- Name String
-
VK_NV_viewport_swizzle
- Extension Type
-
Device extension
- Registered Extension Number
-
99
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2016-12-22
- Interactions and External Dependencies
-
-
This extension requires the multiViewport and geometryShader features to be useful.
-
- Contributors
-
-
Daniel Koch, NVIDIA
-
Jeff Bolz, NVIDIA
-
This extension provides a new per-viewport swizzle that can modify the position of primitives sent to each viewport. New viewport swizzle state is added for each viewport, and a new position vector is computed for each vertex by selecting from and optionally negating any of the four components of the original position vector.
This new viewport swizzle is useful for a number of algorithms, including single-pass cubemap rendering (broadcasting a primitive to multiple faces and reorienting the vertex position for each face) and voxel rasterization. The per-viewport component remapping and negation provided by the swizzle allows application code to re-orient three-dimensional geometry with a view along any of the X, Y, or Z axes. If a perspective projection and depth buffering is required, 1/W buffering should be used, as described in the single-pass cubemap rendering example in the “Issues” section below.
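For the single-pass cubemap use case discussed in the “Issues” section below, the per-viewport swizzles are supplied at pipeline creation. The following is a minimal C sketch for a single viewport, not part of the extension text.
// Hedged sketch: pass the position through unchanged for viewport 0
// (an identity swizzle). A cubemap setup would instead supply six
// viewports, each with a face-specific swizzle.
VkViewportSwizzleNV identitySwizzle =
{
    VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_X_NV, // x
    VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Y_NV, // y
    VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_Z_NV, // z
    VK_VIEWPORT_COORDINATE_SWIZZLE_POSITIVE_W_NV, // w
};
VkPipelineViewportSwizzleStateCreateInfoNV swizzleState =
{
    VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_SWIZZLE_STATE_CREATE_INFO_NV, // sType
    NULL,             // pNext
    0,                // flags
    1,                // viewportCount (must match the viewport state)
    &identitySwizzle, // pViewportSwizzles
};
// Chain into the viewport state used at graphics pipeline creation.
VkPipelineViewportStateCreateInfo viewportState = { /* ... */ };
viewportState.pNext = &swizzleState;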
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_SWIZZLE_STATE_CREATE_INFO_NV
-
New Structures
New Functions
None.
Issues
1) Where does viewport swizzling occur in the pipeline?
RESOLVED: Despite being associated with the viewport, viewport swizzling must happen prior to the viewport transform. In particular, it needs to be performed before clipping and perspective division.
The viewport mask expansion (VK_NV_viewport_array2
) and the viewport
swizzle could potentially be performed before or after transform feedback,
but feeding back several viewports worth of primitives with different
swizzles doesn’t seem particularly useful.
This specification applies the viewport mask and swizzle after transform
feedback, and makes primitive queries only count each primitive once.
2) Any interesting examples of how this extension,
VK_NV_viewport_array2
, and VK_NV_geometry_shader_passthrough
can
be used together in practice?
RESOLVED: One interesting use case for this extension is for single-pass
rendering to a cubemap.
In this example, the application would attach a cubemap texture to a layered
FBO where the six cube faces are treated as layers.
Vertices are sent through the vertex shader without applying a projection
matrix, where the gl_Position
output is (x,y,z,1) and the center
of the cubemap is at (0,0,0).
With unextended Vulkan, one could have a conventional instanced geometry
shader that looks something like the following:
layout(invocations = 6) in; // separate invocation per face
layout(triangles) in;
layout(triangle_strip) out;
layout(max_vertices = 3) out;
in Inputs {
    vec2 texcoord;
    vec3 normal;
    vec4 baseColor;
} v[];
out Outputs {
    vec2 texcoord;
    vec3 normal;
    vec4 baseColor;
};
void main()
{
    int face = gl_InvocationID; // which face am I?

    // Project gl_Position for each vertex onto the cube map face.
    vec4 positions[3];
    for (int i = 0; i < 3; i++) {
        positions[i] = rotate(gl_in[i].gl_Position, face);
    }

    // If the primitive doesn't project onto this face, we're done.
    if (shouldCull(positions)) {
        return;
    }

    // Otherwise, emit a copy of the input primitive to the
    // appropriate face (using gl_Layer).
    for (int i = 0; i < 3; i++) {
        gl_Layer = face;
        gl_Position = positions[i];
        texcoord = v[i].texcoord;
        normal = v[i].normal;
        baseColor = v[i].baseColor;
        EmitVertex();
    }
}
With passthrough geometry shaders, this can be done using a much simpler shader:
layout(triangles) in;
layout(passthrough) in Inputs {
    vec2 texcoord;
    vec3 normal;
    vec4 baseColor;
};
layout(passthrough) in gl_PerVertex {
    vec4 gl_Position;
} gl_in[];
layout(viewport_relative) out int gl_Layer;
void main()
{
    // Figure out which faces the primitive projects onto and
    // generate a corresponding viewport mask.
    // (shouldCull() is assumed to test the primitive against face i.)
    uint mask = 0;
    for (int i = 0; i < 6; i++) {
        if (!shouldCull(i)) {
            mask |= 1U << i;
        }
    }
    gl_ViewportMask = mask;
    gl_Layer = 0;
}
The application code is set up so that each of the six cube faces has a
separate viewport (numbered 0 to 5).
Each face also has a separate swizzle, programmed via the
VkPipelineViewportSwizzleStateCreateInfoNV pipeline state.
The viewport swizzle feature performs the coordinate transformation handled
by the rotate() function in the original shader.
The viewport_relative
layout qualifier says that the viewport number (0
to 5) is added to the base gl_Layer
value of 0 to determine which layer
(cube face) the primitive should be sent to.
Note that the use of the passed through input normal
in this example
suggests that the fragment shader in this example would perform an operation
like per-fragment lighting.
The viewport swizzle would transform the position to be face-relative, but
normal
would remain in the original coordinate system.
It seems likely that the fragment shader in either version of the example
would want to perform lighting in the original coordinate system.
It would likely do this by reconstructing the position of the fragment in
the original coordinate system using gl_FragCoord
, a constant or
uniform holding the size of the cube face, and the input
gl_ViewportIndex
(or gl_Layer
), which identifies the cube face.
Since the value of normal
is in the original coordinate system, it
would not need to be modified as part of this coordinate transformation.
Note that while the rotate() operation in the regular geometry shader
above could include an arbitrary post-rotation projection matrix, the
viewport swizzle does not support arbitrary math.
To get proper projection, 1/W buffering should be used.
To do this:
-
Program the viewport swizzles to move the pre-projection W eye coordinate (typically 1.0) into the Z coordinate of the swizzle output and the eye coordinate component used for depth into the W coordinate. For example, the viewport corresponding to the +Z face might use a swizzle of (+X, -Y, +W, +Z). The Z normalized device coordinate computed after swizzling would then be z'/w' = 1/Z_eye.
-
On NVIDIA implementations supporting floating-point depth buffers with values outside [0,1], prevent unwanted near plane clipping by enabling depthClampEnable. Ensure that the depth clamp doesn’t mess up depth testing by programming the depth range to very large values, such as minDepthBounds = -z, maxDepthBounds = +z, where z = 2^127. It should be possible to use IEEE infinity encodings also (0xFF800000 for -INF, 0x7F800000 for +INF). Even when near/far clipping is disabled, primitives extending behind the eye will still be clipped because one or more vertices will have a negative W coordinate and fail X/Y clipping tests. On other implementations, scale X, Y, and Z eye coordinates so that vertices on the near plane have a post-swizzle W coordinate of 1.0. For example, if the near plane is at Z_eye = 1/256, scale X, Y, and Z by 256.
-
Adjust depth testing to reflect the fact that 1/W values are large near the eye and small away from the eye. Clear the depth buffer to zero (infinitely far away) and use a depth test of VK_COMPARE_OP_GREATER instead of VK_COMPARE_OP_LESS.
Version History
-
Revision 1, 2016-12-22 (Piers Daniell)
-
Internal revisions
-
List of Provisional Extensions
VK_KHR_vulkan_memory_model
- Name String
-
VK_KHR_vulkan_memory_model
- Extension Type
-
Device extension
- Registered Extension Number
-
212
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2018-02-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires SPV_KHR_vulkan_memory_model
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Alan Baker, Google
-
Tobias Hector, AMD
-
David Neto, Google
-
Robert Simpson, Qualcomm Technologies, Inc.
-
Brian Sumner, AMD
-
The VK_KHR_vulkan_memory_model
extension allows use of the
Vulkan Memory Model, which formally defines how to
synchronize memory accesses to the same memory locations performed by
multiple shader invocations.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_MEMORY_MODEL_FEATURES_KHR
-
New Structures
New SPIR-V Capabilities
Issues
Version History
-
Revision 1, 2018-06-24 (Jeff Bolz)
-
Initial draft
-
List of Deprecated Extensions
VK_KHR_16bit_storage
- Name String
-
VK_KHR_16bit_storage
- Extension Type
-
Device extension
- Registered Extension Number
-
84
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_storage_buffer_storage_class
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jan-Harald Fredriksen janharaldfredriksen-arm
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires SPV_KHR_16bit_storage
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Alexander Galazin, ARM
-
Jan-Harald Fredriksen, ARM
-
Joerg Wagner, ARM
-
Neil Henning, Codeplay
-
Jeff Bolz, Nvidia
-
Daniel Koch, Nvidia
-
David Neto, Google
-
John Kessenich, Google
-
The VK_KHR_16bit_storage
extension allows use of 16-bit types in shader
input and output interfaces, and push constant blocks.
This extension introduces several new optional features which map to SPIR-V
capabilities and allow access to 16-bit data in Block
-decorated objects
in the Uniform
and the StorageBuffer
storage classes, and objects
in the PushConstant
storage class.
This extension allows 16-bit variables to be declared and used as
user-defined shader inputs and outputs but does not change location
assignment and component assignment rules.
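The optional features mentioned above can be queried and then enabled at device creation. The following is a minimal C sketch, not part of the extension text; it assumes VK_KHR_get_physical_device_properties2 (for vkGetPhysicalDeviceFeatures2KHR) and a valid physicalDevice handle.
// Hedged sketch: query 16-bit storage support; the same structure can be
// chained into VkDeviceCreateInfo::pNext to enable the supported features.
VkPhysicalDevice16BitStorageFeaturesKHR storage16BitFeatures =
{
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES_KHR, // sType
    NULL,                                                         // pNext
};
VkPhysicalDeviceFeatures2KHR features2 =
{
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR, // sType
    &storage16BitFeatures,                            // pNext
};
vkGetPhysicalDeviceFeatures2KHR(physicalDevice, &features2);
if (storage16BitFeatures.storageBuffer16BitAccess) {
    // 16-bit types may be used in StorageBuffer blocks.
}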
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES_KHR
-
New Structures
New SPIR-V Capabilities
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
Version History
-
Revision 1, 2017-03-23 (Alexander Galazin)
-
Initial draft
-
VK_KHR_bind_memory2
- Name String
-
VK_KHR_bind_memory2
- Extension Type
-
Device extension
- Registered Extension Number
-
158
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Tobias Hector tobski
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Tobias Hector, Imagination Technologies
-
This extension provides versions of vkBindBufferMemory and vkBindImageMemory that allow multiple bindings to be performed at once, and are extensible.
This extension also introduces VK_IMAGE_CREATE_ALIAS_BIT_KHR
, which
allows “identical” images that alias the same memory to interpret the
contents consistently, even across image layout changes.
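A minimal C sketch (not part of the extension text) of the batched binding path follows, assuming image and memory were created and allocated elsewhere.
// Hedged sketch: bind one image with a single batched call. Additional
// VkBindImageMemoryInfoKHR entries could be appended to the same array.
VkBindImageMemoryInfoKHR bindInfo =
{
    VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO_KHR, // sType
    NULL,   // pNext
    image,  // image
    memory, // memory
    0,      // memoryOffset
};
VkResult result = vkBindImageMemory2KHR(device, 1, &bindInfo);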
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_INFO_KHR
-
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO_KHR
-
-
Extending VkImageCreateFlagBits:
-
VK_IMAGE_CREATE_ALIAS_BIT_KHR
-
New Enums
None.
New Structures
New Functions
New Built-In Variables
None.
New SPIR-V Capabilities
None.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Version History
-
Revision 1, 2017-05-19 (Tobias Hector)
-
Pulled bind memory functions into their own extension
-
VK_KHR_dedicated_allocation
- Name String
-
VK_KHR_dedicated_allocation
- Extension Type
-
Device extension
- Registered Extension Number
-
128
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_get_memory_requirements2
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Jason Ekstrand, Intel
-
This extension enables resources to be bound to a dedicated allocation,
rather than suballocated.
For any particular resource, applications can query whether a dedicated
allocation is recommended, in which case using a dedicated allocation may
improve the performance of access to that resource.
Normal device memory allocations must support multiple resources per
allocation, memory aliasing and sparse binding, which could interfere with
some optimizations.
Applications should query the implementation for when a dedicated allocation
may be beneficial by adding VkMemoryDedicatedRequirementsKHR
to the
pNext
chain of the VkMemoryRequirements2
structure passed as the
pMemoryRequirements
parameter to a call to
vkGetBufferMemoryRequirements2
or vkGetImageMemoryRequirements2
.
Certain external handle types and external images or buffers may also
depend on dedicated allocations on implementations that associate image or
buffer metadata with OS-level memory objects.
This extension adds two small structures to memory requirements querying and memory allocation: a new structure that flags whether an image/buffer should have a dedicated allocation, and a structure indicating the image or buffer that an allocation will be bound to.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS_KHR
-
VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO_KHR
-
New Enums
None.
New Structures
New Functions
None.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Examples
// Create an image with a dedicated allocation based on the
// implementation's preference

VkImageCreateInfo imageCreateInfo =
{
    // Image creation parameters
};

VkImage image;
VkResult result = vkCreateImage(
    device,
    &imageCreateInfo,
    NULL,                       // pAllocator
    &image);

VkMemoryDedicatedRequirementsKHR dedicatedRequirements =
{
    VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS_KHR,
    NULL,                       // pNext
};

VkMemoryRequirements2 memoryRequirements =
{
    VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2,
    &dedicatedRequirements,     // pNext
};

const VkImageMemoryRequirementsInfo2 imageRequirementsInfo =
{
    VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2,
    NULL,                       // pNext
    image
};

vkGetImageMemoryRequirements2(
    device,
    &imageRequirementsInfo,
    &memoryRequirements);

if (dedicatedRequirements.prefersDedicatedAllocation) {
    // Allocate memory with VkMemoryDedicatedAllocateInfoKHR::image
    // pointing to the image we are allocating the memory for

    VkMemoryDedicatedAllocateInfoKHR dedicatedInfo =
    {
        VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO_KHR,   // sType
        NULL,                                                   // pNext
        image,                                                  // image
        VK_NULL_HANDLE,                                         // buffer
    };

    VkMemoryAllocateInfo memoryAllocateInfo =
    {
        VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,                 // sType
        &dedicatedInfo,                                         // pNext
        memoryRequirements.memoryRequirements.size,             // allocationSize
        FindMemoryTypeIndex(memoryRequirements.memoryRequirements.memoryTypeBits), // memoryTypeIndex
    };

    VkDeviceMemory memory;
    vkAllocateMemory(
        device,
        &memoryAllocateInfo,
        NULL,                   // pAllocator
        &memory);

    // Bind the image to the memory
    vkBindImageMemory(
        device,
        image,
        memory,
        0);
} else {
    // Take the normal memory sub-allocation path
}
Version History
-
Revision 1, 2017-02-27 (James Jones)
-
Copy content from VK_NV_dedicated_allocation
-
Add some references to external object interactions to the overview.
-
-
Revision 2, 2017-03-27 (Jason Ekstrand)
-
Rework the extension to be query-based
-
-
Revision 3, 2017-07-31 (Jason Ekstrand)
-
Clarify that memory objects created with VkMemoryDedicatedAllocateInfoKHR can only have the specified resource bound and no others.
-
VK_KHR_descriptor_update_template
- Name String
-
VK_KHR_descriptor_update_template
- Extension Type
-
Device extension
- Registered Extension Number
-
86
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Markus Tavenrath mtavenrath
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Interacts with
VK_KHR_push_descriptor
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Michael Worcester, Imagination Technologies
-
Applications may wish to update a fixed set of descriptors in a large number of descriptor sets very frequently, e.g. during an initialization phase or when descriptor sets must be rebuilt for each frame. In those cases it is also likely that all of the information required to update a single descriptor set is stored in a single struct. This extension provides a way to update a fixed set of descriptors in a single VkDescriptorSet with a pointer to a user-defined data structure which describes the new descriptors.
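The following informal sketch shows the idea for a single uniform buffer binding; AppDescriptorData is a hypothetical application-defined struct, and device, descriptorSetLayout and descriptorSet are assumed to be valid:
// Hypothetical application-side storage for the descriptor data.
typedef struct AppDescriptorData {
    VkDescriptorBufferInfo uniformBuffer;   // binding 0
} AppDescriptorData;

// Describe how binding 0 is laid out inside AppDescriptorData
// (offsetof comes from <stddef.h>).
const VkDescriptorUpdateTemplateEntryKHR entry = {
    .dstBinding      = 0,
    .dstArrayElement = 0,
    .descriptorCount = 1,
    .descriptorType  = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
    .offset          = offsetof(AppDescriptorData, uniformBuffer),
    .stride          = sizeof(VkDescriptorBufferInfo),
};

const VkDescriptorUpdateTemplateCreateInfoKHR createInfo = {
    .sType                      = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR,
    .descriptorUpdateEntryCount = 1,
    .pDescriptorUpdateEntries   = &entry,
    .templateType               = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR,
    .descriptorSetLayout        = descriptorSetLayout,
};

VkDescriptorUpdateTemplateKHR updateTemplate;
vkCreateDescriptorUpdateTemplateKHR(device, &createInfo, NULL, &updateTemplate);

// Later, update the descriptor set directly from the application data.
AppDescriptorData data = { /* filled in by the application */ };
vkUpdateDescriptorSetWithTemplateKHR(device, descriptorSet, updateTemplate, &data);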
New Object Types
-
VkDescriptorUpdateTemplateKHR
New Enum Constants
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR
New Functions
Promotion to Vulkan 1.1
vkCmdPushDescriptorSetWithTemplateKHR is included as an interaction
with VK_KHR_push_descriptor
.
If Vulkan 1.1 and VK_KHR_push_descriptor are supported, this is included by
VK_KHR_push_descriptor
.
The base functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Version History
-
Revision 1, 2016-01-11 (Markus Tavenrath)
-
Initial draft
-
VK_KHR_device_group
- Name String
-
VK_KHR_device_group
- Extension Type
-
Device extension
- Registered Extension Number
-
61
- Revision
-
3
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_device_group_creation
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-10-06
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Tobias Hector, Imagination Technologies
-
This extension provides functionality to use a logical device that consists
of multiple physical devices, as created with the
VK_KHR_device_group_creation
extension.
A device group can allocate memory across the subdevices, bind memory from
one subdevice to a resource on another subdevice, record command buffers
where some work executes on an arbitrary subset of the subdevices, and
potentially present a swapchain image from one or more subdevices.
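For example, work can be restricted to a subset of the group at command buffer recording time; the following is an informal sketch assuming commandBuffer was allocated from a suitable pool:
// Record work for only the first two physical devices in the group.
const VkDeviceGroupCommandBufferBeginInfoKHR deviceGroupBegin = {
    .sType      = VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO_KHR,
    .deviceMask = 0x3, // bits 0 and 1: physical devices 0 and 1
};

const VkCommandBufferBeginInfo beginInfo = {
    .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
    .pNext = &deviceGroupBegin,
};

vkBeginCommandBuffer(commandBuffer, &beginInfo);

// Commands recorded here execute on devices 0 and 1; the mask can be
// narrowed further during recording.
vkCmdSetDeviceMaskKHR(commandBuffer, 0x1);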
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_FLAGS_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_RENDER_PASS_BEGIN_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_SUBMIT_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_BIND_SPARSE_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_CAPABILITIES_KHR
-
VK_STRUCTURE_TYPE_IMAGE_SWAPCHAIN_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_SWAPCHAIN_INFO_KHR
-
VK_STRUCTURE_TYPE_ACQUIRE_NEXT_IMAGE_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_SWAPCHAIN_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_BIND_BUFFER_MEMORY_DEVICE_GROUP_INFO_KHR
-
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_DEVICE_GROUP_INFO_KHR
-
-
Extending VkImageCreateFlagBits
-
VK_IMAGE_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT_KHR
-
-
Extending VkPipelineCreateFlagBits
-
VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT_KHR
-
VK_PIPELINE_CREATE_DISPATCH_BASE_KHR
-
-
Extending VkDependencyFlagBits
-
VK_DEPENDENCY_DEVICE_GROUP_BIT_KHR
-
-
Extending VkSwapchainCreateFlagBitsKHR
-
VK_SWAPCHAIN_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT_KHR
-
New Enums
New Structures
New Functions
New Built-In Variables
New SPIR-V Capabilities
Promotion to Vulkan 1.1
The following enums, types and commands are included as interactions with
VK_KHR_swapchain
:
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_CAPABILITIES_KHR
-
VK_STRUCTURE_TYPE_IMAGE_SWAPCHAIN_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_SWAPCHAIN_INFO_KHR
-
VK_STRUCTURE_TYPE_ACQUIRE_NEXT_IMAGE_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_PRESENT_INFO_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_SWAPCHAIN_CREATE_INFO_KHR
-
VK_SWAPCHAIN_CREATE_SPLIT_INSTANCE_BIND_REGIONS_BIT_KHR
If Vulkan 1.1 and VK_KHR_swapchain are supported, these are included by VK_KHR_swapchain.
The base functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Examples
TODO
Version History
-
Revision 1, 2016-10-19 (Jeff Bolz)
-
Internal revisions
-
-
Revision 2, 2017-05-19 (Tobias Hector)
-
Removed extended memory bind functions to VK_KHR_bind_memory2, added dependency on that extension, and device-group-specific structs for those functions.
-
-
Revision 3, 2017-10-06 (Ian Elliott)
-
Corrected Vulkan 1.1 interactions with the WSI extensions. All Vulkan 1.1 WSI interactions are with the VK_KHR_swapchain extension.
-
-
Revision 4, 2017-10-10 (Jeff Bolz)
-
Rename "SFR" bits and structure members to use the phrase "split instance bind regions".
-
VK_KHR_device_group_creation
- Name String
-
VK_KHR_device_group_creation
- Extension Type
-
Instance extension
- Registered Extension Number
-
71
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2016-10-19
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension provides instance-level commands to enumerate groups of
physical devices, and to create a logical device from a subset of one of
those groups.
Such a logical device can then be used with new features in the
VK_KHR_device_group
extension.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO_KHR
-
-
Extending VkMemoryHeapFlagBits
-
VK_MEMORY_HEAP_MULTI_INSTANCE_BIT_KHR
-
New Enums
None.
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Examples
VkDeviceCreateInfo devCreateInfo = { VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO };
// (not shown) fill out devCreateInfo as usual.
uint32_t deviceGroupCount = 0;
VkPhysicalDeviceGroupPropertiesKHR *props = NULL;

// Query the number of device groups
vkEnumeratePhysicalDeviceGroupsKHR(g_vkInstance, &deviceGroupCount, NULL);

// Allocate and initialize structures to query the device groups
props = (VkPhysicalDeviceGroupPropertiesKHR *)malloc(deviceGroupCount * sizeof(VkPhysicalDeviceGroupPropertiesKHR));
for (uint32_t i = 0; i < deviceGroupCount; ++i) {
    props[i].sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES_KHR;
    props[i].pNext = NULL;
}
vkEnumeratePhysicalDeviceGroupsKHR(g_vkInstance, &deviceGroupCount, props);

// If the first device group has more than one physical device, create
// a logical device using all of the physical devices.
VkDeviceGroupDeviceCreateInfoKHR deviceGroupInfo = { VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO_KHR };
if (props[0].physicalDeviceCount > 1) {
    deviceGroupInfo.physicalDeviceCount = props[0].physicalDeviceCount;
    deviceGroupInfo.pPhysicalDevices = props[0].physicalDevices;
    devCreateInfo.pNext = &deviceGroupInfo;
}

vkCreateDevice(props[0].physicalDevices[0], &devCreateInfo, NULL, &g_vkDevice);
free(props);
Version History
-
Revision 1, 2016-10-19 (Jeff Bolz)
-
Internal revisions
-
VK_KHR_external_fence
- Name String
-
VK_KHR_external_fence
- Extension Type
-
Device extension
- Registered Extension Number
-
114
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_fence_capabilities
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jesse Hall critsec
-
- Last Modified Date
-
2017-05-08
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Cass Everitt, Oculus
-
Contributors to
VK_KHR_external_semaphore
-
An application using external memory may wish to synchronize access to that memory using fences. This extension enables an application to create fences from which non-Vulkan handles that reference the underlying synchronization primitive can be exported.
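An informal sketch of creating an exportable fence follows; the opaque-FD handle type and the actual export call (e.g. via VK_KHR_external_fence_fd) come from related extensions and are assumed to be supported:
// Create a fence whose payload can later be exported as an opaque FD.
const VkExportFenceCreateInfoKHR exportInfo = {
    .sType       = VK_STRUCTURE_TYPE_EXPORT_FENCE_CREATE_INFO_KHR,
    .handleTypes = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
};

const VkFenceCreateInfo fenceInfo = {
    .sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO,
    .pNext = &exportInfo,
};

VkFence fence;
vkCreateFence(device, &fenceInfo, NULL, &fence);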
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_EXPORT_FENCE_CREATE_INFO_KHR
New Enums
New Structs
New Functions
None.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
This extension borrows concepts, semantics, and language from
VK_KHR_external_semaphore
.
That extension’s issues apply equally to this extension.
VK_KHR_external_fence_capabilities
- Name String
-
VK_KHR_external_fence_capabilities
- Extension Type
-
Instance extension
- Registered Extension Number
-
113
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jesse Hall critsec
-
- Last Modified Date
-
2017-05-08
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Cass Everitt, Oculus
-
Contributors to
VK_KHR_external_semaphore_capabilities
-
An application may wish to reference device fences in multiple Vulkan logical devices or instances, in multiple processes, and/or in multiple APIs. This extension provides a set of capability queries and handle definitions that allow an application to determine what types of “external” fence handles an implementation supports for a given set of use cases.
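For example, exportability of a given handle type might be checked as in the following informal sketch, using the opaque-FD handle type for illustration:
const VkPhysicalDeviceExternalFenceInfoKHR fenceInfo = {
    .sType      = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_FENCE_INFO_KHR,
    .handleType = VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
};

VkExternalFencePropertiesKHR fenceProperties = {
    .sType = VK_STRUCTURE_TYPE_EXTERNAL_FENCE_PROPERTIES_KHR,
};

vkGetPhysicalDeviceExternalFencePropertiesKHR(physicalDevice, &fenceInfo, &fenceProperties);

if (fenceProperties.externalFenceFeatures & VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT_KHR) {
    // Fences using this handle type can be exported on this implementation.
}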
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_FENCE_INFO_KHR
-
VK_STRUCTURE_TYPE_EXTERNAL_FENCE_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES_KHR
-
VK_LUID_SIZE_KHR
New Structs
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
VK_KHR_external_memory
- Name String
-
VK_KHR_external_memory
- Extension Type
-
Device extension
- Registered Extension Number
-
73
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_external_memory_capabilities
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-20
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Interacts with
VK_KHR_dedicated_allocation
-
Interacts with
VK_NV_dedicated_allocation
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jason Ekstrand, Intel
-
Ian Elliot, Google
-
Jesse Hall, Google
-
Tobias Hector, Imagination Technologies
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Matthew Netsch, Qualcomm Technologies, Inc.
-
Daniel Rakos, AMD
-
Carsten Rohde, NVIDIA
-
Ray Smith, ARM
-
Chad Versace, Google
-
An application may wish to reference device memory in multiple Vulkan logical devices or instances, in multiple processes, and/or in multiple APIs. This extension enables an application to export non-Vulkan handles from Vulkan memory objects such that the underlying resources can be referenced outside the scope of the Vulkan logical device that created them.
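As an informal sketch, an exportable memory allocation might be created as follows; allocationSize and memoryTypeIndex are assumed to have been chosen in the usual way, and the opaque-FD handle type is used for illustration:
const VkExportMemoryAllocateInfoKHR exportInfo = {
    .sType       = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR,
    .handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
};

const VkMemoryAllocateInfo allocateInfo = {
    .sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
    .pNext           = &exportInfo,
    .allocationSize  = allocationSize,   // from the resource's memory requirements
    .memoryTypeIndex = memoryTypeIndex,  // compatible with the chosen handle type
};

VkDeviceMemory memory;
vkAllocateMemory(device, &allocateInfo, NULL, &memory);

// Any image or buffer bound to this memory must itself be created with the
// same handle type via VkExternalMemoryImageCreateInfoKHR or
// VkExternalMemoryBufferCreateInfoKHR; the handle export call (e.g.
// vkGetMemoryFdKHR) is provided by separate extensions such as
// VK_KHR_external_memory_fd.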
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_BUFFER_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR
-
VK_QUEUE_FAMILY_EXTERNAL_KHR
-
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR
New Enums
None.
New Structs
New Functions
None.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
1) How do applications correlate two physical devices across process or Vulkan instance boundaries?
RESOLVED: New device ID fields have been introduced by
VK_KHR_external_memory_capabilities
.
These fields, combined with the existing
VkPhysicalDeviceProperties::driverVersion
field can be used to
identify compatible devices across processes, drivers, and APIs.
VkPhysicalDeviceProperties::pipelineCacheUUID
is not sufficient
for this purpose because despite its description in the specification, it
need only identify a unique pipeline cache format in practice.
Multiple devices may be able to use the same pipeline cache data, and hence
it would be desirable for all of them to have the same pipeline cache UUID.
However, only the same concrete physical device can be used when sharing
memory, so an actual unique device ID was introduced.
Further, the pipeline cache UUID was specific to Vulkan, but correlation
with other, non-extensible APIs is required to enable interoperation with
those APIs.
2) If memory objects are shared between processes and APIs, is this considered aliasing according to the rules outlined in the Memory Aliasing section?
RESOLVED: Yes. Applications must take care to obey all restrictions imposed on aliased resources when using memory across multiple Vulkan instances or other APIs.
3) Are new image layouts or metadata required to specify image layouts and layout transitions compatible with non-Vulkan APIs, or with other instances of the same Vulkan driver?
RESOLVED: Separate instances of the same Vulkan driver running on the same GPU should have identical internal layout semantics, so applications have the tools they need to ensure views of images are consistent between the two instances. Other APIs will fall into two categories: Those that are Vulkan-compatible, and those that are Vulkan-incompatible. Vulkan-incompatible APIs will require the image to be in the GENERAL layout whenever they are accessing them.
Note this does not attempt to address cross-device transitions, nor transitions to engines on the same device which are not visible within the Vulkan API. Both of these are beyond the scope of this extension.
4) Is a new barrier flag or operation of some type needed to prepare external memory for handoff to another Vulkan instance or API and/or receive it from another instance or API?
RESOLVED: Yes. Some implementations need to perform additional cache management when transitioning memory between address spaces, and other APIs, instances, or processes may operate in a separate address space. Options for defining this transition include:
-
A new structure that can be added to the
pNext
list in VkMemoryBarrier, VkBufferMemoryBarrier, and VkImageMemoryBarrier.
-
A new bit in VkAccessFlags that can be set to indicate an “external” access.
-
A new bit in VkDependencyFlags
-
A new special queue family that represents an “external” queue.
A new structure has the advantage that the type of external transition can
be described in as much detail as necessary.
However, there is not currently a known need for anything beyond
differentiating external vs.
internal accesses, so this is likely an over-engineered solution.
The access flag bit has the advantage that it can be applied at buffer,
image, or global granularity, and semantically it maps pretty well to the
operation being described.
Additionally, the API already includes VK_ACCESS_MEMORY_READ_BIT
and
VK_ACCESS_MEMORY_WRITE_BIT
which appear to be intended for this
purpose.
However, there is no obvious pipeline stage that would correspond to an
external access, and therefore no clear way to use
VK_ACCESS_MEMORY_READ_BIT
or VK_ACCESS_MEMORY_WRITE_BIT
.
VkDependencyFlags and VkPipelineStageFlags operate at command
granularity rather than image or buffer granularity, which would make an
entire pipeline barrier an internal→external or external→internal barrier.
This may not be a problem in practice, but seems like the wrong scope.
Another downside of VkDependencyFlags is that it lacks inherent
directionality: There are not src
and dst
variants of it in the
barrier or dependency description semantics, so two bits might need to be
added to describe both internal→external and external→internal
transitions.
Transitioning a resource to a special queue family corresponds well with the
operation of transitioning to a separate Vulkan instance, in that both
operations ideally include scheduling a barrier on both sides of the
transition: Both the releasing and the acquiring queue or process.
Using a special queue family requires adding an additional reserved queue
family index.
Re-using VK_QUEUE_FAMILY_IGNORED
would have left it unclear how to
transition a concurrent usage resource from one process to another, since
the semantics would have likely been equivalent to the currently-ignored
transition of
VK_QUEUE_FAMILY_IGNORED
→ VK_QUEUE_FAMILY_IGNORED
.
Fortunately, creating a new reserved queue family index is not invasive.
Based on the above analysis, the approach of transitioning to a special “external” queue family was chosen.
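For illustration, a release barrier to the external queue family might look like the following informal sketch (graphicsQueueFamilyIndex, buffer and commandBuffer are assumed valid; the matching acquire operation on the importing side is not shown):
// Release the buffer from the local graphics queue family to the special
// external queue family before handing the memory to another API or process.
const VkBufferMemoryBarrier releaseBarrier = {
    .sType               = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
    .srcAccessMask       = VK_ACCESS_SHADER_WRITE_BIT,
    .dstAccessMask       = 0,
    .srcQueueFamilyIndex = graphicsQueueFamilyIndex,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_EXTERNAL_KHR,
    .buffer              = buffer,
    .offset              = 0,
    .size                = VK_WHOLE_SIZE,
};

vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
                     VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                     0,
                     0, NULL,
                     1, &releaseBarrier,
                     0, NULL);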
5) Do internal driver memory arrangements and/or other internal driver image properties need to be exported and imported when sharing images across processes or APIs.
RESOLVED: Some vendors claim this is necessary on their implementations, but it was determined that the security risks of allowing opaque meta data to be passed from applications to the driver were too high. Therefore, implementations which require metadata will need to associate it with the objects represented by the external handles, and rely on the dedicated allocation mechanism to associate the exported and imported memory objects with a single image or buffer.
6) Most prior interoperation and cross-process sharing APIs have been based on image-level sharing. Should Vulkan sharing be based on memory-object sharing or image sharing?
RESOLVED: These extensions have assumed memory-level sharing is the correct granularity. Vulkan is a lower-level API than most prior APIs, and as such attempts to align closely with the underlying primitives of the hardware and system-level drivers it abstracts. In general, the resource that holds the backing store for both images and buffers of various types is memory. Images and buffers are merely metadata containing brief descriptions of the layout of bits within that memory.
Because memory object-based sharing is aligned with the overall Vulkan API design, it exposes the full power of Vulkan on external objects. External memory can be used as backing for sparse images, for example, whereas such usage would be awkward at best with a sharing mechanism based on higher-level primitives such as images. Further, aligning the mechanism with the API in this way provides some hope of trivial compatibility with future API enhancements. If new objects backed by memory objects are added to the API, they too can be used across processes with minimal additions to the base external memory APIs.
Earlier APIs implemented interop at a higher level, and this necessitated entirely separate sharing APIs for images and buffers. To co-exist and interoperate with those APIs, the Vulkan external sharing mechanism must accommodate their model. However, if it can be agreed that memory-based sharing is the more desirable and forward-looking design, legacy interoperation considerations can be considered another reason to favor memory-based sharing: While native and legacy driver primitives that may be used to implement sharing may not be as low-level as the API here suggests, raw memory is still the least common denominator among the types. Image-based sharing can be cleanly derived from a set of base memory-object sharing APIs with minimal effort, whereas image-based sharing does not generalize well to buffer or raw-memory sharing. Therefore, following the general Vulkan design principle of minimalism, it is better to expose even interoperability with image-based native and external primitives via the memory sharing API, and place sufficient limits on their usage to ensure they can be used only as backing for equivalent Vulkan images. This provides a consistent API for applications regardless of which platform or external API they are targeting, which makes development of multi-API and multi-platform applications simpler.
7) Should Vulkan define a common external handle type and provide Vulkan functions to facilitate cross-process sharing of such handles rather than relying on native handles to define the external objects?
RESOLVED: No. Cross-process sharing of resources is best left to native platforms. There are myriad security and extensibility issues with such a mechanism, and attempting to re-solve all those issues within Vulkan does not align with Vulkan’s purpose as a graphics API. If desired, such a mechanism could be built as a layer or helper library on top of the opaque native handle defined in this family of extensions.
8) Must implementations provide additional guarantees about state implicitly included in memory objects for those memory objects that may be exported?
RESOLVED: Implementations must ensure that sharing memory objects does not transfer any information between the exporting and importing instances and APIs other than that required to share the data contained in the memory objects explicitly shared. As specific examples, data from previously freed memory objects that used the same underlying physical memory, and data from memory objects using adjacent physical memory, must not be visible to applications importing an exported memory object.
9) Must implementations validate external handles the application provides as input to memory import operations?
RESOLVED: Implementations must return an error to the application if the provided memory handle cannot be used to complete the requested import operation. However, implementations need not validate handles are of the exact type specified by the application.
VK_KHR_external_memory_capabilities
- Name String
-
VK_KHR_external_memory_capabilities
- Extension Type
-
Instance extension
- Registered Extension Number
-
72
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-17
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Interacts with
VK_KHR_dedicated_allocation
-
Interacts with
VK_NV_dedicated_allocation
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Ian Elliot, Google
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
An application may wish to reference device memory in multiple Vulkan logical devices or instances, in multiple processes, and/or in multiple APIs. This extension provides a set of capability queries and handle definitions that allow an application to determine what types of “external” memory handles an implementation supports for a given set of use cases.
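For example, an application might check whether storage buffers can be exported through a particular handle type; the following informal sketch uses the opaque-FD handle type:
const VkPhysicalDeviceExternalBufferInfoKHR bufferInfo = {
    .sType      = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_BUFFER_INFO_KHR,
    .usage      = VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
    .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
};

VkExternalBufferPropertiesKHR bufferProperties = {
    .sType = VK_STRUCTURE_TYPE_EXTERNAL_BUFFER_PROPERTIES_KHR,
};

vkGetPhysicalDeviceExternalBufferPropertiesKHR(physicalDevice, &bufferInfo, &bufferProperties);

if (bufferProperties.externalMemoryProperties.externalMemoryFeatures &
    VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT_KHR) {
    // Buffers with this usage can be bound to exportable memory of this handle type.
}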
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO_KHR
-
VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_BUFFER_INFO_KHR
-
VK_STRUCTURE_TYPE_EXTERNAL_BUFFER_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES_KHR
-
VK_LUID_SIZE_KHR
New Structs
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
1) Why do so many external memory capabilities need to be queried on a per-memory-handle-type basis?
PROPOSED RESOLUTION: This is because some handle types are based on OS-native objects that have far more limited capabilities than the very generic Vulkan memory objects. Not all memory handle types can name memory objects that support 3D images, for example. Some handle types cannot even support the deferred image and memory binding behavior of Vulkan and require specifying the image when allocating or importing the memory object.
2) Do the VkExternalImageFormatPropertiesKHR and VkExternalBufferPropertiesKHR structs need to include a list of memory type bits that support the given handle type?
PROPOSED RESOLUTION: No. The memory types that don’t support the handle types will simply be filtered out of the results returned by vkGetImageMemoryRequirements and vkGetBufferMemoryRequirements when a set of handle types was specified at image or buffer creation time.
3) Should the non-opaque handle types be moved to their own extension?
PROPOSED RESOLUTION: Perhaps. However, defining the handle type bits does very little and does not require any platform-specific types on its own, and it’s easier to maintain the bitfield values in a single extension for now. Presumably more handle types could be added by separate extensions though, and it would be mildly weird to have some platform-specific ones defined in the core spec and some in extensions.
4) Do we need a D3D11_TILEPOOL
type?
PROPOSED RESOLUTION: No. This is technically possible, but the synchronization is awkward. D3D11 surfaces must be synchronized using shared mutexes, and these synchronization primitives are shared by the entire memory object, so D3D11 shared allocations divided among multiple buffer and image bindings may be difficult to synchronize.
5) Should the Windows 7-compatible handle types be named “KMT” handles or “GLOBAL_SHARE” handles?
PROPOSED RESOLUTION: KMT, simply because it is more concise.
6) How do applications identify compatible devices and drivers across instance, process, and API boundaries when sharing memory?
PROPOSED RESOLUTION: New device properties are exposed that allow applications to correctly correlate devices and drivers. A device and driver UUID that must both match to ensure sharing compatibility between two Vulkan instances, or a Vulkan instance and an extensible external API are added. To allow correlating with Direct3D devices, a device LUID is added that corresponds to a DXGI adapter LUID. A driver ID is not needed for Direct3D because mismatched driver component versions are not a currently supported configuration on the Windows OS. Should support for such configurations be introduced at the OS level, further Vulkan extensions would be needed to correlate userspace component builds.
VK_KHR_external_semaphore
- Name String
-
VK_KHR_external_semaphore
- Extension Type
-
Device extension
- Registered Extension Number
-
78
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-21
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jason Ekstrand, Intel
-
Jesse Hall, Google
-
Tobias Hector, Imagination Technologies
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
Matthew Netsch, Qualcomm Technologies, Inc.
-
Ray Smith, ARM
-
Chad Versace, Google
-
An application using external memory may wish to synchronize access to that memory using semaphores. This extension enables an application to create semaphores from which non-Vulkan handles that reference the underlying synchronization primitive can be exported.
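An informal sketch of creating an exportable semaphore follows; the opaque-FD handle type and the export call itself (e.g. via VK_KHR_external_semaphore_fd) come from related extensions and are assumed to be supported:
// Create a semaphore whose payload can later be exported as an opaque FD.
const VkExportSemaphoreCreateInfoKHR exportInfo = {
    .sType       = VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO_KHR,
    .handleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
};

const VkSemaphoreCreateInfo semaphoreInfo = {
    .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
    .pNext = &exportInfo,
};

VkSemaphore semaphore;
vkCreateSemaphore(device, &semaphoreInfo, NULL, &semaphore);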
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO_KHR
-
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR
New Enums
New Structs
New Functions
None.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
1) Should there be restrictions on what side effects can occur when waiting on imported semaphores that are in an invalid state?
RESOLVED: Yes. Normally, validating such state would be the responsibility of the application, and the implementation would be free to enter an undefined state if valid usage rules were violated. However, this could cause security concerns when using imported semaphores, as it would require the importing application to trust the exporting application to ensure the state is valid. Requiring this level of trust is undesirable for many potential use cases.
2) Must implementations validate external handles the application provides as input to semaphore state import operations?
RESOLVED: Implementations must return an error to the application if the provided semaphore state handle cannot be used to complete the requested import operation. However, implementations need not validate handles are of the exact type specified by the application.
VK_KHR_external_semaphore_capabilities
- Name String
-
VK_KHR_external_semaphore_capabilities
- Extension Type
-
Instance extension
- Registered Extension Number
-
77
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-10-20
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jesse Hall, Google
-
James Jones, NVIDIA
-
Jeff Juliano, NVIDIA
-
An application may wish to reference device semaphores in multiple Vulkan logical devices or instances, in multiple processes, and/or in multiple APIs. This extension provides a set of capability queries and handle definitions that allow an application to determine what types of “external” semaphore handles an implementation supports for a given set of use cases.
New Object Types
None.
New Enum Constants
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_SEMAPHORE_INFO_KHR
-
VK_STRUCTURE_TYPE_EXTERNAL_SEMAPHORE_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES_KHR
-
VK_LUID_SIZE_KHR
New Structs
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
VK_KHR_get_memory_requirements2
- Name String
-
VK_KHR_get_memory_requirements2
- Extension Type
-
Device extension
- Registered Extension Number
-
147
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jason Ekstrand jekstrand
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jason Ekstrand, Intel
-
Jeff Bolz, NVIDIA
-
Jesse Hall, Google
-
This extension provides new entry points to query memory requirements of
images and buffers in a way that can be easily extended by other extensions,
without introducing any further entry points.
The Vulkan 1.0 VkMemoryRequirements and VkSparseImageMemoryRequirements structures do not include sType/pNext members, so this extension wraps them in new structures with sType/pNext members. An application can then query a chain of memory requirements structures by constructing the chain and letting the implementation fill them in.
A new command is added for each vkGet*MemoryRequirements command in core Vulkan 1.0.
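For example, the buffer query might be used as in the following informal sketch (device and buffer are assumed to be valid handles):
const VkBufferMemoryRequirementsInfo2KHR info = {
    .sType  = VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2_KHR,
    .buffer = buffer,
};

VkMemoryRequirements2KHR requirements = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2_KHR,
    // Extension structures (e.g. VkMemoryDedicatedRequirementsKHR) can be
    // chained here via pNext.
};

vkGetBufferMemoryRequirements2KHR(device, &info, &requirements);

// requirements.memoryRequirements now holds the Vulkan 1.0 data.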
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2_KHR
-
VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2_KHR
-
VK_STRUCTURE_TYPE_IMAGE_SPARSE_MEMORY_REQUIREMENTS_INFO_2_KHR
-
VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2_KHR
-
VK_STRUCTURE_TYPE_SPARSE_IMAGE_MEMORY_REQUIREMENTS_2_KHR
-
New Enums
None.
New Structures
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Version History
-
Revision 1, 2017-03-23 (Jason Ekstrand)
-
Internal revisions
-
VK_KHR_get_physical_device_properties2
- Name String
-
VK_KHR_get_physical_device_properties2
- Extension Type
-
Instance extension
- Registered Extension Number
-
60
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
Ian Elliott, Google
-
This extension provides new entry points to query device features, device
properties, and format properties in a way that can be easily extended by
other extensions, without introducing any further entry points.
The Vulkan 1.0 feature/limit/formatproperty structures do not include
sType
/pNext
members.
This extension wraps them in new structures with sType
/pNext
members, so an application can query a chain of feature/limit/formatproperty
structures by constructing the chain and letting the implementation fill
them in.
A new command is added for each vkGetPhysicalDevice*
command in core
Vulkan 1.0.
The new feature structure (and a chain of extension structures) can also be
passed in to device creation to enable features.
This extension also allows applications to use the physical-device components of device extensions before vkCreateDevice is called.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR
-
VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_SPARSE_IMAGE_FORMAT_PROPERTIES_2_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SPARSE_IMAGE_FORMAT_INFO_2_KHR
-
New Enums
None.
New Structures
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Examples
// Get features with a hypothetical future extension.
VkHypotheticalExtensionFeaturesKHR hypotheticalFeatures =
{
    VK_STRUCTURE_TYPE_HYPOTHETICAL_FEATURES_KHR,        // sType
    NULL,                                               // pNext
};

VkPhysicalDeviceFeatures2KHR features =
{
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR,   // sType
    &hypotheticalFeatures,                              // pNext
};

// After this call, features and hypotheticalFeatures have been filled out.
vkGetPhysicalDeviceFeatures2KHR(physicalDevice, &features);

// Properties/limits can be chained and queried similarly.

// Enable some features:
VkHypotheticalExtensionFeaturesKHR enabledHypotheticalFeatures =
{
    VK_STRUCTURE_TYPE_HYPOTHETICAL_FEATURES_KHR,        // sType
    NULL,                                               // pNext
};

VkPhysicalDeviceFeatures2KHR enabledFeatures =
{
    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR,   // sType
    &enabledHypotheticalFeatures,                       // pNext
};

enabledFeatures.features.xyz = VK_TRUE;
enabledHypotheticalFeatures.abc = VK_TRUE;

VkDeviceCreateInfo deviceCreateInfo =
{
    VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,               // sType
    &enabledFeatures,                                   // pNext
    ...
    NULL,                                               // pEnabledFeatures
};

VkDevice device;
vkCreateDevice(physicalDevice, &deviceCreateInfo, NULL, &device);
Version History
-
Revision 1, 2016-09-12 (Jeff Bolz)
-
Internal revisions
-
-
Revision 2, 2016-11-02 (Ian Elliott)
-
Added ability for applications to use the physical-device components of device extensions before vkCreateDevice is called.
-
VK_KHR_maintenance1
- Name String
-
VK_KHR_maintenance1
- Extension Type
-
Device extension
- Registered Extension Number
-
70
- Revision
-
2
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2018-03-13
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Dan Ginsburg, Valve
-
Daniel Koch, NVIDIA
-
Daniel Rakos, AMD
-
Jan-Harald Fredriksen, ARM
-
Jason Ekstrand, Intel
-
Jeff Bolz, NVIDIA
-
Jesse Hall, Google
-
John Kessenich, Google
-
Michael Worcester, Imagination Technologies
-
Neil Henning, Codeplay Software Ltd.
-
Piers Daniell, NVIDIA
-
Slawomir Grajewski, Intel
-
Tobias Hector, Imagination Technologies
-
Tom Olson, ARM
-
VK_KHR_maintenance1
adds a collection of minor features that were
intentionally left out or overlooked from the original Vulkan 1.0 release.
The new features are as follows:
-
Allow 2D and 2D array image views to be created from 3D images, which can then be used as color framebuffer attachments. This allows applications to render to slices of a 3D image.
-
Support vkCmdCopyImage between 2D array layers and 3D slices. This extension allows copying from layers of a 2D array image to slices of a 3D image and vice versa.
-
Allow negative height to be specified in the VkViewport::
height
field to perform y-inversion of the clip-space to framebuffer-space transform. This allows apps to avoid having to use gl_Position.y = -gl_Position.y
in shaders also targeting other APIs (see the sketch after this list).
-
Allow implementations to express support for doing just transfers and clears of image formats that they otherwise support no other format features for. This is done by adding new format feature flags
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR
and VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR.
-
Support vkCmdFillBuffer on transfer-only queues. Previously vkCmdFillBuffer was defined to only work on command buffers allocated from command pools which support graphics or compute queues. It is now allowed on queues that just support transfer operations.
-
Fix the inconsistency of how error conditions are returned between the vkCreateGraphicsPipelines and vkCreateComputePipelines functions and the vkAllocateDescriptorSets and vkAllocateCommandBuffers functions.
-
Add new
VK_ERROR_OUT_OF_POOL_MEMORY_KHR
error so implementations can give a more precise reason for vkAllocateDescriptorSets failures.
-
Add a new command vkTrimCommandPoolKHR which gives the implementation an opportunity to release any unused command pool memory back to the system.
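The negative-height viewport mentioned above can be set up as in the following informal sketch, where width and height describe the framebuffer and commandBuffer is assumed to be valid:
// A y-inverted viewport covering the whole framebuffer.
const VkViewport viewport = {
    .x        = 0.0f,
    .y        = (float)height,   // origin moved to the bottom edge
    .width    = (float)width,
    .height   = -(float)height,  // negative height flips the y axis
    .minDepth = 0.0f,
    .maxDepth = 1.0f,
};

vkCmdSetViewport(commandBuffer, 0, 1, &viewport);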
New Object Types
None.
New Enum Constants
-
VK_ERROR_OUT_OF_POOL_MEMORY_KHR
-
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR
-
VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR
-
VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT_KHR
New Enums
None.
New Structures
None.
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
-
Are viewports with zero height allowed?
RESOLVED: Yes, although they have low utility.
Version History
-
Revision 1, 2016-10-26 (Piers Daniell)
-
Internal revisions
-
-
Revision 2, 2018-03-13 (Jon Leech)
-
Add issue for zero-height viewports
-
VK_KHR_maintenance2
- Name String
-
VK_KHR_maintenance2
- Extension Type
-
Device extension
- Registered Extension Number
-
118
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Michael Worcester michaelworcester
-
- Last Modified Date
-
2017-09-05
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Michael Worcester, Imagination Technologies
-
Stuart Smith, Imagination Technologies
-
Jeff Bolz, NVIDIA
-
Daniel Koch, NVIDIA
-
Jan-Harald Fredriksen, ARM
-
Daniel Rakos, AMD
-
Neil Henning, Codeplay
-
Piers Daniell, NVIDIA
-
VK_KHR_maintenance2
adds a collection of minor features that were
intentionally left out or overlooked from the original Vulkan 1.0 release.
The new features are as follows:
-
Allow the application to specify which aspect of an input attachment might be read for a given subpass.
-
Allow implementations to express the clipping behavior of points.
-
Allow creating images with usage flags that may not be supported for the base image’s format, but are supported for image views of the image that have a different but compatible format.
-
Allow creating uncompressed image views of compressed images.
-
Allow the application to select between an upper-left and lower-left origin for the tessellation domain space.
-
Adds two new image layouts for depth stencil images to allow either the depth or stencil aspect to be read-only while the other aspect is writable.
Input Attachment Specification
Input attachment specification allows an application to specify which aspect
of a multi-aspect image (e.g. a combined depth stencil format) will be
accessed via a subpassLoad
operation.
On some implementations there may be a performance penalty if the implementation does not know (at vkCreateRenderPass time) which aspect(s) of multi-aspect images can be accessed as input attachments.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_POINT_CLIPPING_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_DOMAIN_ORIGIN_STATE_CREATE_INFO_KHR
-
-
Extending VkImageCreateFlagBits:
-
VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT_KHR
-
VK_IMAGE_CREATE_EXTENDED_USAGE_BIT_KHR
-
-
Extending VkImageLayout
-
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL_KHR
-
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL_KHR
-
-
VK_POINT_CLIPPING_BEHAVIOR_ALL_CLIP_PLANES_KHR
-
VK_POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY_KHR
New Structures
New Functions
None.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Input Attachment Specification Example
Consider the case where a render pass has two subpasses and two attachments.
Attachment 0 has the format VK_FORMAT_D24_UNORM_S8_UINT
, attachment 1
has some color format.
Subpass 0 writes to attachment 0, subpass 1 reads only the depth information from attachment 0 (using inputAttachmentRead) and writes to attachment 1.
VkInputAttachmentAspectReferenceKHR references[] = {
    {
        .subpass = 1,
        .inputAttachmentIndex = 0,
        .aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT
    }
};

VkRenderPassInputAttachmentAspectCreateInfoKHR specifyAspects = {
    .sType = VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO_KHR,
    .pNext = NULL,
    .aspectReferenceCount = 1,
    .pAspectReferences = references
};

VkRenderPassCreateInfo createInfo = {
    ...
    .pNext = &specifyAspects,
    ...
};

vkCreateRenderPass(...);
Issues
1) What is the default tessellation domain origin?
RESOLVED: Vulkan 1.0 originally inadvertently documented a lower-left origin, but the conformance tests and all implementations implemented an upper-left origin. This extension adds a control to select between lower-left (for compatibility with OpenGL) and upper-left, and we retroactively fix unextended Vulkan to have a default of an upper-left origin.
Version History
-
Revision 1, 2017-04-28
VK_KHR_maintenance3
- Name String
-
VK_KHR_maintenance3
- Extension Type
-
Device extension
- Registered Extension Number
-
169
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Status
-
Draft
- Last Modified Date
-
2017-09-05
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
VK_KHR_maintenance3
adds a collection of minor features that were
intentionally left out or overlooked from the original Vulkan 1.0 release.
The new features are as follows:
-
A limit on the maximum number of descriptors that are supported in a single descriptor set layout. Some implementations have a limit on the total size of descriptors in a set, which can’t be expressed in terms of the limits in Vulkan 1.0 (a support query is sketched after this list).
-
A limit on the maximum size of a single memory allocation. Some platforms have kernel interfaces that limit the maximum size of an allocation.
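The descriptor set layout limit referenced above can be checked with the new query; the following is an informal sketch, assuming layoutCreateInfo is the VkDescriptorSetLayoutCreateInfo the application intends to pass to vkCreateDescriptorSetLayout:
VkDescriptorSetLayoutSupportKHR support = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT_KHR,
};

vkGetDescriptorSetLayoutSupportKHR(device, &layoutCreateInfo, &support);

if (!support.supported) {
    // The proposed layout exceeds implementation-specific limits; use fewer
    // or smaller descriptors, or split the bindings across multiple sets.
}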
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MAINTENANCE_3_PROPERTIES_KHR
-
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT_KHR
-
New Enums
None.
New Functions
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Version History
-
Revision 1, 2017-08-22
VK_KHR_multiview
- Name String
-
VK_KHR_multiview
- Extension Type
-
Device extension
- Registered Extension Number
-
54
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2016-10-28
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension has the same goal as the OpenGL ES GL_OVR_multiview extension - it enables rendering to multiple “views” by recording a single set of commands to be executed with slightly different behavior for each view. It includes a concise way to declare a render pass with multiple views, and gives implementations freedom to render the views in the most efficient way possible.
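As an informal sketch, a render pass covering two views might be declared as follows, assuming a single-subpass VkRenderPassCreateInfo has been filled out elsewhere:
// Render to views 0 and 1 in the single subpass.
const uint32_t viewMask        = 0x3;
const uint32_t correlationMask = 0x3;

const VkRenderPassMultiviewCreateInfoKHR multiviewInfo = {
    .sType                = VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO_KHR,
    .subpassCount         = 1,
    .pViewMasks           = &viewMask,
    .correlationMaskCount = 1,
    .pCorrelationMasks    = &correlationMask,
};

// Chain multiviewInfo into VkRenderPassCreateInfo::pNext before calling
// vkCreateRenderPass.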
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_FEATURES_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PROPERTIES_KHR
-
-
Extending VkDependencyFlagBits
-
VK_DEPENDENCY_VIEW_LOCAL_BIT_KHR
-
New Enums
None.
New Structures
New Functions
None.
New Built-In Variables
New SPIR-V Capabilities
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Issues
None.
Examples
None.
Version History
-
Revision 1, 2016-10-28 (Jeff Bolz)
-
Internal revisions
-
VK_KHR_relaxed_block_layout
- Name String
-
VK_KHR_relaxed_block_layout
- Extension Type
-
Device extension
- Registered Extension Number
-
145
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
John Kessenich johnkslang
-
- Last Modified Date
-
2017-03-26
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
John Kessenich, Google
-
The VK_KHR_relaxed_block_layout
extension allows implementations to
indicate they can support more variation in block Offset
decorations.
For example, placing a vector of three floats at an offset of 16*N + 4.
See Offset and Stride Assignment for details.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Version History
-
Revision 1, 2017-03-26 (JohnK)
VK_KHR_sampler_ycbcr_conversion
- Name String
-
VK_KHR_sampler_ycbcr_conversion
- Extension Type
-
Device extension
- Registered Extension Number
-
157
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_maintenance1
-
Requires
VK_KHR_bind_memory2
-
Requires
VK_KHR_get_memory_requirements2
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Andrew Garrard fluppeteer
-
- Last Modified Date
-
2017-08-11
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Promoted to Vulkan 1.1 Core
-
This extension interacts with
VK_EXT_debug_report
-
- Contributors
-
-
Andrew Garrard, Samsung Electronics
-
Tobias Hector, Imagination Technologies
-
James Jones, NVIDIA
-
Daniel Koch, NVIDIA
-
Daniel Rakos, AMD
-
Romain Guy, Google
-
Jesse Hall, Google
-
Tom Cooksey, ARM Ltd
-
Jeff Leger, Qualcomm Technologies, Inc
-
Jan-Harald Fredriksen, ARM Ltd
-
Jan Outters, Samsung Electronics
-
Alon Or-bach, Samsung Electronics
-
Michael Worcester, Imagination Technologies
-
Jeff Bolz, NVIDIA
-
Tony Zlatinski, NVIDIA
-
Matthew Netsch, Qualcomm Technologies, Inc
-
This extension provides the ability to perform specified color space conversions during texture sampling operations. It also adds a selection of multi-planar formats, including the ability to bind memory to the planes of an image collectively or separately.
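As an informal sketch, a conversion object for a 2-plane 4:2:0 format might be created and chained into sampler creation as follows; format and feature support must be queried first, and the BT.709 model with narrow range and midpoint chroma siting is chosen purely for illustration:
const VkSamplerYcbcrConversionCreateInfoKHR conversionInfo = {
    .sType         = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO_KHR,
    .format        = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM_KHR,
    .ycbcrModel    = VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709_KHR,
    .ycbcrRange    = VK_SAMPLER_YCBCR_RANGE_ITU_NARROW_KHR,
    .components    = { VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY,
                       VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY },
    .xChromaOffset = VK_CHROMA_LOCATION_MIDPOINT_KHR,
    .yChromaOffset = VK_CHROMA_LOCATION_MIDPOINT_KHR,
    .chromaFilter  = VK_FILTER_NEAREST,
    .forceExplicitReconstruction = VK_FALSE,
};

VkSamplerYcbcrConversionKHR conversion;
vkCreateSamplerYcbcrConversionKHR(device, &conversionInfo, NULL, &conversion);

const VkSamplerYcbcrConversionInfoKHR samplerConversionInfo = {
    .sType      = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO_KHR,
    .conversion = conversion,
};
// Chain samplerConversionInfo into VkSamplerCreateInfo::pNext, and also into
// the VkImageViewCreateInfo used for the sampled image.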
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO_KHR
-
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO_KHR
-
VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO_KHR
-
VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO_KHR
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_YCBCR_CONVERSION_FEATURES_KHR
-
-
Extending VkFormat:
-
VK_FORMAT_G8B8G8R8_422_UNORM_KHR
-
VK_FORMAT_B8G8R8G8_422_UNORM_KHR
-
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM_KHR
-
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM_KHR
-
VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM_KHR
-
VK_FORMAT_G8_B8R8_2PLANE_422_UNORM_KHR
-
VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM_KHR
-
VK_FORMAT_R10X6_UNORM_PACK16_KHR
-
VK_FORMAT_R10X6G10X6_UNORM_2PACK16_KHR
-
VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16_KHR
-
VK_FORMAT_G10X6B10X6G10X6R10X6_422_UNORM_4PACK16_KHR
-
VK_FORMAT_B10X6G10X6R10X6G10X6_422_UNORM_4PACK16_KHR
-
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16_KHR
-
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16_KHR
-
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16_KHR
-
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16_KHR
-
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16_KHR
-
VK_FORMAT_R12X4_UNORM_PACK16_KHR
-
VK_FORMAT_R12X4G12X4_UNORM_2PACK16_KHR
-
VK_FORMAT_R12X4G12X4B12X4A12X4_UNORM_4PACK16_KHR
-
VK_FORMAT_G12X4B12X4G12X4R12X4_422_UNORM_4PACK16_KHR
-
VK_FORMAT_B12X4G12X4R12X4G12X4_422_UNORM_4PACK16_KHR
-
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16_KHR
-
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16_KHR
-
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16_KHR
-
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16_KHR
-
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16_KHR
-
VK_FORMAT_G16B16G16R16_422_UNORM_KHR
-
VK_FORMAT_B16G16R16G16_422_UNORM_KHR
-
VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM_KHR
-
VK_FORMAT_G16_B16R16_2PLANE_420_UNORM_KHR
-
VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM_KHR
-
VK_FORMAT_G16_B16R16_2PLANE_422_UNORM_KHR
-
VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM_KHR
-
-
Extending VkImageAspectFlagBits:
-
VK_IMAGE_ASPECT_PLANE_0_BIT_KHR
-
VK_IMAGE_ASPECT_PLANE_1_BIT_KHR
-
VK_IMAGE_ASPECT_PLANE_2_BIT_KHR
-
-
Extending VkImageCreateFlagBits:
-
VK_IMAGE_CREATE_DISJOINT_BIT_KHR
-
-
Extending VkFormatFeatureFlagBits:
-
VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT_KHR
-
VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT_KHR
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT_KHR
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_SEPARATE_RECONSTRUCTION_FILTER_BIT_KHR
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT_KHR
-
VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_FORCEABLE_BIT_KHR
-
VK_FORMAT_FEATURE_DISJOINT_BIT_KHR
-
New Structures
New Objects
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted. The original type, enum and command names are still available as aliases of the core functionality.
Version History
-
Revision 1, 2017-01-24 (Andrew Garrard)
-
Initial draft
-
-
Revision 2, 2017-01-25 (Andrew Garrard)
-
After initial feedback
-
-
Revision 3, 2017-01-27 (Andrew Garrard)
-
Higher bit depth formats, renaming, swizzle
-
-
Revision 4, 2017-02-22 (Andrew Garrard)
-
Added query function, formats as RGB, clarifications
-
-
Revision 5, 2017-04 (Andrew Garrard)
-
Simplified query and removed output conversions
-
-
Version 6, 2017-04-24 (Andrew Garrard)
-
Tidying, incorporated new image query, restored transfer functions
-
-
Version 7, 2017-04-25 (Andrew Garrard)
-
Added cosited option/midpoint requirement for formats, "bypassConversion"
-
-
Version 8, 2017-04-25 (Andrew Garrard)
-
Simplified further
-
-
Version 9, 2017-04-27 (Andrew Garrard)
-
Disjoint no more
-
-
Version 10, 2017-04-28 (Andrew Garrard)
-
Restored disjoint
-
-
Version 11, 2017-04-29 (Andrew Garrard)
-
Now Ycbcr conversion, and KHR
-
-
Version 12, 2017-06-06 (Andrew Garrard)
-
Added conversion to image view creation
-
-
Version 13, 2017-07-13 (Andrew Garrard)
-
Allowed cosited-only chroma samples for formats
-
-
Version 14, 2017-08-11 (Andrew Garrard)
-
Reflected quantization changes in BT.2100-1
-
VK_KHR_shader_draw_parameters
- Name String
-
VK_KHR_shader_draw_parameters
- Extension Type
-
Device extension
- Registered Extension Number
-
64
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Daniel Koch dgkoch
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_KHR_shader_draw_parameters SPIR-V extension.
-
Requires GL_ARB_shader_draw_parameters for GLSL source languages.
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Daniel Koch, NVIDIA Corporation
-
Jeff Bolz, NVIDIA
-
Daniel Rakos, AMD
-
Jan-Harald Fredriksen, ARM
-
John Kessenich, Google
-
Stuart Smith, IMG
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_KHR_shader_draw_parameters
The extension provides access to three additional built-in shader variables in Vulkan:
- BaseInstance, which contains the firstInstance parameter passed to draw commands,
- BaseVertex, which contains the firstVertex/vertexOffset parameter passed to draw commands, and
- DrawIndex, which contains the index of the draw call currently being processed from an indirect draw call.
When using GLSL source-based shader languages, the following variables from GL_ARB_shader_draw_parameters can map to these SPIR-V built-in decorations:
- in int gl_BaseInstanceARB; → BaseInstance,
- in int gl_BaseVertexARB; → BaseVertex, and
- in int gl_DrawIDARB; → DrawIndex.
New Object Types
None.
New Enum Constants
None.
New Enums
None.
New Structures
None.
New Functions
None.
New Built-In Variables
New SPIR-V Capabilities
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1; however, a feature bit was added to distinguish whether it is actually available or not.
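As an informal sketch (not part of the original appendix), the feature bit can be queried in Vulkan 1.1 through the pNext chain of vkGetPhysicalDeviceFeatures2; the structure naming below follows the Vulkan 1.1 headers of this era.
extern VkPhysicalDevice physicalDevice;
VkPhysicalDeviceShaderDrawParameterFeatures drawParamFeatures =
    { VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_DRAW_PARAMETER_FEATURES };
VkPhysicalDeviceFeatures2 features2 = { VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2 };
features2.pNext = &drawParamFeatures;
vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);
if (drawParamFeatures.shaderDrawParameters) {
    // BaseVertex, BaseInstance and DrawIndex may be used by shader modules
}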
Issues
1) Is this the same functionality as GL_ARB_shader_draw_parameters?
RESOLVED: It’s actually a superset as it also adds in support for arrayed drawing commands.
In GL, for GL_ARB_shader_draw_parameters, gl_BaseVertexARB holds the integer value passed to the parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertexARB is zero. This means that gl_BaseVertexARB = baseVertex (for glDrawElements commands with baseVertex) or 0. In particular there are no glDrawArrays commands that take a baseVertex parameter.
Now in Vulkan, we have BaseVertex = vertexOffset (for indexed drawing commands) or firstVertex (for arrayed drawing commands), and so Vulkan’s version is really a superset of GL functionality.
Version History
-
Revision 1, 2016-10-05 (Daniel Koch)
-
Internal revisions
-
VK_KHR_storage_buffer_storage_class
- Name String
-
VK_KHR_storage_buffer_storage_class
- Extension Type
-
Device extension
- Registered Extension Number
-
132
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Alexander Galazin alegal-arm
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
This extension requires the SPV_KHR_storage_buffer_storage_class SPIR-V extension.
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
Alexander Galazin, ARM
-
David Neto, Google
-
This extension adds support for the following SPIR-V extension in Vulkan:
-
SPV_KHR_storage_buffer_storage_class
This extension provides a new SPIR-V StorageBuffer storage class. A Block-decorated object in this class is equivalent to a BufferBlock-decorated object in the Uniform storage class.
New Enum Constants
None.
New Structures
None.
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1.
Issues
None.
Version History
-
Revision 1, 2017-03-23 (Alexander Galazin)
-
Initial draft
-
VK_KHR_variable_pointers
- Name String
-
VK_KHR_variable_pointers
- Extension Type
-
Device extension
- Registered Extension Number
-
121
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_KHR_storage_buffer_storage_class
-
- Deprecation state
-
-
Promoted to Vulkan 1.1
-
- Contact
-
-
Jesse Hall critsec
-
- Last Modified Date
-
2017-09-05
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Requires the SPV_KHR_variable_pointers SPIR-V extension.
-
Promoted to Vulkan 1.1 Core
-
- Contributors
-
-
John Kessenich, Google
-
Neil Henning, Codeplay
-
David Neto, Google
-
Daniel Koch, Nvidia
-
Graeme Leese, Broadcom
-
Weifeng Zhang, Qualcomm
-
Stephen Clarke, Imagination Technologies
-
Jason Ekstrand, Intel
-
Jesse Hall, Google
-
The VK_KHR_variable_pointers extension allows implementations to indicate their level of support for the SPV_KHR_variable_pointers SPIR-V extension. The SPIR-V extension allows shader modules to use invocation-private pointers into uniform and/or storage buffers, where the pointer values can be dynamic and non-uniform.
The SPV_KHR_variable_pointers extension introduces two capabilities. The first, VariablePointersStorageBuffer, must be supported by all implementations of this extension. The second, VariablePointers, is optional.
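As an informal sketch (not part of the original appendix), both capability levels can be queried as features before the corresponding SPIR-V capabilities are used; vkGetPhysicalDeviceFeatures2KHR may be substituted on implementations that expose only the VK_KHR_get_physical_device_properties2 extension.
extern VkPhysicalDevice physicalDevice;
VkPhysicalDeviceVariablePointerFeaturesKHR vpFeatures =
    { VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES_KHR };
VkPhysicalDeviceFeatures2 features2 = { VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2 };
features2.pNext = &vpFeatures;
vkGetPhysicalDeviceFeatures2(physicalDevice, &features2);
if (vpFeatures.variablePointersStorageBuffer) {
    // Shaders may declare the VariablePointersStorageBuffer capability
}
if (vpFeatures.variablePointers) {
    // Shaders may additionally declare the full VariablePointers capability
}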
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES_KHR
-
New Structures
New SPIR-V Capabilities
Promotion to Vulkan 1.1
All functionality in this extension is included in core Vulkan 1.1, with the KHR suffix omitted; however, support for the variablePointersStorageBuffer feature is made optional. The original type, enum and command names are still available as aliases of the core functionality.
Issues
1) Do we need an optional property for the SPIR-V VariablePointersStorageBuffer capability, or should it be mandatory when this extension is advertised?
RESOLVED: Add it as a distinct feature, but make support mandatory. Adding it as a feature makes the extension easier to include in a future core API version. In the extension, the feature is mandatory, so that presence of the extension guarantees some functionality. When included in a core API version, the feature would be optional.
2) Can support for these capabilities vary between shader stages?
RESOLVED: No, if the capability is supported in any stage it must be supported in all stages.
3) Should the capabilities be features or limits?
RESOLVED: Features, primarily for consistency with other similar extensions.
Version History
-
Revision 1, 2017-03-14 (Jesse Hall and John Kessenich)
-
Internal revisions
-
VK_EXT_debug_marker
- Name String
-
VK_EXT_debug_marker
- Extension Type
-
Device extension
- Registered Extension Number
-
23
- Revision
-
4
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_EXT_debug_report
-
- Deprecation state
-
-
Promoted to VK_EXT_debug_utils extension
-
- Contact
-
-
Baldur Karlsson baldurk
-
- Last Modified Date
-
2017-01-31
- IP Status
-
No known IP claims.
- Contributors
-
-
Baldur Karlsson
-
Dan Ginsburg, Valve
-
Jon Ashburn, LunarG
-
Kyle Spagnoli, NVIDIA
-
The VK_EXT_debug_marker extension is a device extension. It introduces concepts of object naming and tagging, for better tracking of Vulkan objects, as well as additional commands for recording annotations of named sections of a workload to aid organization and offline analysis in external tools.
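The examples later in this section cover object naming and region annotation; as an informal complement (not part of the original appendix), the sketch below attaches an application-defined tag to a buffer, where the buffer handle, tag name and payload are hypothetical.
extern VkDevice device;
extern VkBuffer meshBuffer;    // hypothetical buffer being tagged
// Must call extension functions through a function pointer:
PFN_vkDebugMarkerSetObjectTagEXT pfnDebugMarkerSetObjectTagEXT =
    (PFN_vkDebugMarkerSetObjectTagEXT)vkGetDeviceProcAddr(device, "vkDebugMarkerSetObjectTagEXT");
// Application-defined payload, interpreted by the tool reading the tag
static const char meshSource[] = "mesh=house.obj;lod=2";
const VkDebugMarkerObjectTagInfoEXT bufferTagInfo =
{
    VK_STRUCTURE_TYPE_DEBUG_MARKER_OBJECT_TAG_INFO_EXT,  // sType
    NULL,                                                // pNext
    VK_DEBUG_REPORT_OBJECT_TYPE_BUFFER_EXT,              // objectType
    (uint64_t)meshBuffer,                                // object
    1,                                                   // tagName (application-defined)
    sizeof(meshSource),                                  // tagSize
    meshSource                                           // pTag
};
pfnDebugMarkerSetObjectTagEXT(device, &bufferTagInfo);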
New Object Types
None
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DEBUG_MARKER_OBJECT_NAME_INFO_EXT
-
VK_STRUCTURE_TYPE_DEBUG_MARKER_OBJECT_TAG_INFO_EXT
-
VK_STRUCTURE_TYPE_DEBUG_MARKER_MARKER_INFO_EXT
-
New Enums
None
New Structures
New Functions
Examples
Example 1
Associate a name with an image, for easier debugging in external tools or with validation layers that can print a friendly name when referring to objects in error messages.
extern VkDevice device;
extern VkImage image;
// Must call extension functions through a function pointer:
PFN_vkDebugMarkerSetObjectNameEXT pfnDebugMarkerSetObjectNameEXT = (PFN_vkDebugMarkerSetObjectNameEXT)vkGetDeviceProcAddr(device, "vkDebugMarkerSetObjectNameEXT");
// Set a name on the image
const VkDebugMarkerObjectNameInfoEXT imageNameInfo =
{
VK_STRUCTURE_TYPE_DEBUG_MARKER_OBJECT_NAME_INFO_EXT, // sType
NULL, // pNext
VK_DEBUG_REPORT_OBJECT_TYPE_IMAGE_EXT, // objectType
(uint64_t)image, // object
"Brick Diffuse Texture", // pObjectName
};
pfnDebugMarkerSetObjectNameEXT(device, &imageNameInfo);
// A subsequent error might print:
// Image 'Brick Diffuse Texture' (0xc0dec0dedeadbeef) is used in a
// command buffer with no memory bound to it.
Example 2
Annotating regions of a workload with naming information so that offline analysis tools can display a more usable visualisation of the commands submitted.
extern VkDevice device;
extern VkCommandBuffer commandBuffer;
// Must call extension functions through a function pointer:
PFN_vkCmdDebugMarkerBeginEXT pfnCmdDebugMarkerBeginEXT = (PFN_vkCmdDebugMarkerBeginEXT)vkGetDeviceProcAddr(device, "vkCmdDebugMarkerBeginEXT");
PFN_vkCmdDebugMarkerEndEXT pfnCmdDebugMarkerEndEXT = (PFN_vkCmdDebugMarkerEndEXT)vkGetDeviceProcAddr(device, "vkCmdDebugMarkerEndEXT");
PFN_vkCmdDebugMarkerInsertEXT pfnCmdDebugMarkerInsertEXT = (PFN_vkCmdDebugMarkerInsertEXT)vkGetDeviceProcAddr(device, "vkCmdDebugMarkerInsertEXT");
// Describe the area being rendered
const VkDebugMarkerMarkerInfoEXT houseMarker =
{
VK_STRUCTURE_TYPE_DEBUG_MARKER_MARKER_INFO_EXT, // sType
NULL, // pNext
"Brick House", // pMarkerName
{ 1.0f, 0.0f, 0.0f, 1.0f }, // color
};
// Start an annotated group of calls under the 'Brick House' name
pfnCmdDebugMarkerBeginEXT(commandBuffer, &houseMarker);
{
// A mutable structure for each part being rendered
VkDebugMarkerMarkerInfoEXT housePartMarker =
{
VK_STRUCTURE_TYPE_DEBUG_MARKER_MARKER_INFO_EXT, // sType
NULL, // pNext
NULL, // pMarkerName
{ 0.0f, 0.0f, 0.0f, 0.0f }, // color
};
// Set the name and insert the marker
housePartMarker.pMarkerName = "Walls";
pfnCmdDebugMarkerInsertEXT(commandBuffer, &housePartMarker);
// Insert the drawcall for the walls
vkCmdDrawIndexed(commandBuffer, 1000, 1, 0, 0, 0);
// Insert a recursive region for two sets of windows
housePartMarker.pMarkerName = "Windows";
pfnCmdDebugMarkerBeginEXT(commandBuffer, &housePartMarker);
{
vkCmdDrawIndexed(commandBuffer, 75, 6, 1000, 0, 0);
vkCmdDrawIndexed(commandBuffer, 100, 2, 1450, 0, 0);
}
pfnCmdDebugMarkerEndEXT(commandBuffer);
housePartMarker.pMarkerName = "Front Door";
pfnCmdDebugMarkerInsertEXT(commandBuffer, &housePartMarker);
vkCmdDrawIndexed(commandBuffer, 350, 1, 1650, 0, 0);
housePartMarker.pMarkerName = "Roof";
pfnCmdDebugMarkerInsertEXT(commandBuffer, &housePartMarker);
vkCmdDrawIndexed(commandBuffer, 500, 1, 2000, 0, 0);
}
// End the house annotation started above
pfnCmdDebugMarkerEndEXT(commandBuffer);
Issues
1) Should the tag or name for an object be specified using the pNext parameter in the object’s Vk*CreateInfo structure?
RESOLVED: No. While this fits with other Vulkan patterns and would allow more type safety and future proofing against future objects, it has notable downsides. In particular, passing the name at Vk*CreateInfo time does not allow renaming, prevents late binding of naming information, and does not allow naming of implicitly created objects such as queues and swapchain images.
2) Should the command annotation functions vkCmdDebugMarkerBeginEXT and vkCmdDebugMarkerEndEXT support the ability to specify a color?
RESOLVED: Yes. The functions have been expanded to take an optional color which can be used at will by implementations consuming the command buffer annotations in their visualisation.
3) Should the functions added in this extension accept an extensible structure as their parameter for a more flexible API, as opposed to direct function parameters? If so, which functions?
RESOLVED: Yes.
All functions have been modified to take a structure type with an extensible pNext pointer, to allow future extensions to add additional annotation information in the same commands.
Version History
-
Revision 1, 2016-02-24 (Baldur Karlsson)
-
Initial draft, based on LunarG marker spec
-
-
Revision 2, 2016-02-26 (Baldur Karlsson)
-
Renamed Dbg to DebugMarker in function names
-
Allow markers in secondary command buffers under certain circumstances
-
Minor language tweaks and edits
-
-
Revision 3, 2016-04-23 (Baldur Karlsson)
-
Reorganise spec layout to closer match desired organisation
-
Added optional color to markers (both regions and inserted labels)
-
Changed functions to take extensible structs instead of direct function parameters
-
-
Revision 4, 2017-01-31 (Baldur Karlsson)
-
Added explicit dependency on VK_EXT_debug_report
-
Moved definition of VkDebugReportObjectTypeEXT to debug report chapter.
-
Fixed typo in dates in revision history
-
VK_EXT_debug_report
- Name String
-
VK_EXT_debug_report
- Extension Type
-
Instance extension
- Registered Extension Number
-
12
- Revision
-
9
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Deprecated by VK_EXT_debug_utils extension
-
- Contact
-
-
Courtney Goeltzenleuchter courtney-g
-
- Last Modified Date
-
2017-09-12
- IP Status
-
No known IP claims.
- Contributors
-
-
Courtney Goeltzenleuchter, LunarG
-
Dan Ginsburg, Valve
-
Jon Ashburn, LunarG
-
Mark Lobodzinski, LunarG
-
Due to the nature of the Vulkan interface, there is very little error
information available to the developer and application.
By enabling optional validation layers and using the VK_EXT_debug_report
extension, developers can obtain much more detailed feedback on the
application’s use of Vulkan.
This extension defines a way for layers and the implementation to call back
to the application for events of interest to the application.
New Object Types
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT
-
-
Extending VkResult:
-
VK_ERROR_VALIDATION_FAILED_EXT
-
New Structures
New Functions
New Function Pointers
Examples
VK_EXT_debug_report allows an application to register multiple callbacks with the validation layers. Some callbacks may log the information to a file, others may cause a debug break point or other application defined behavior. An application can register callbacks even when no validation layers are enabled, but they will only be called for loader and, if implemented, driver events.
To capture events that occur while creating or destroying an instance, an application can link a VkDebugReportCallbackCreateInfoEXT structure to the pNext element of the VkInstanceCreateInfo structure given to vkCreateInstance. This callback is only valid for the duration of the vkCreateInstance and the vkDestroyInstance call. Use vkCreateDebugReportCallbackEXT to create persistent callback objects.
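As an informal sketch (not part of the original appendix), the chaining described above looks as follows; myInstanceDebugCallback is a hypothetical application-defined function matching PFN_vkDebugReportCallbackEXT.
extern VKAPI_ATTR VkBool32 VKAPI_CALL myInstanceDebugCallback(
    VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objectType,
    uint64_t object, size_t location, int32_t messageCode,
    const char* pLayerPrefix, const char* pMessage, void* pUserData);
VkDebugReportCallbackCreateInfoEXT instanceCallbackInfo =
{
    VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT,          // sType
    NULL,                                                             // pNext
    VK_DEBUG_REPORT_ERROR_BIT_EXT | VK_DEBUG_REPORT_WARNING_BIT_EXT,  // flags
    myInstanceDebugCallback,                                          // pfnCallback
    NULL                                                              // pUserData
};
VkInstanceCreateInfo instanceCreateInfo = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
instanceCreateInfo.pNext = &instanceCallbackInfo;
// Other members set as usual; VK_EXT_debug_report must also be enabled
VkInstance instance;
vkCreateInstance(&instanceCreateInfo, NULL, &instance);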
Example uses: Create three callback objects.
One will log errors and warnings to the debug console using Windows
OutputDebugString
.
The second will cause the debugger to break at that callback when an error
happens and the third will log warnings to stdout.
VkResult res;
VkDebugReportCallbackEXT cb1, cb2, cb3;
VkDebugReportCallbackCreateInfoEXT callback1 = {
VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT, // sType
NULL, // pNext
VK_DEBUG_REPORT_ERROR_BIT_EXT | // flags
VK_DEBUG_REPORT_WARNING_BIT_EXT,
myOutputDebugString, // pfnCallback
NULL // pUserData
};
res = vkCreateDebugReportCallbackEXT(instance, &callback1, NULL, &cb1);
if (res != VK_SUCCESS)
/* Do error handling for VK_ERROR_OUT_OF_MEMORY */
// Reuse the first create info structure for the second callback
callback1.flags = VK_DEBUG_REPORT_ERROR_BIT_EXT;
callback1.pfnCallback = myDebugBreak;
callback1.pUserData = NULL;
res = vkCreateDebugReportCallbackEXT(instance, &callback1, NULL, &cb2);
if (res != VK_SUCCESS)
/* Do error handling for VK_ERROR_OUT_OF_MEMORY */
VkDebugReportCallbackCreateInfoEXT callback3 = {
VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT, // sType
NULL, // pNext
VK_DEBUG_REPORT_WARNING_BIT_EXT, // flags
mystdOutLogger, // pfnCallback
NULL // pUserData
};
res = vkCreateDebugReportCallbackEXT(instance, &callback3, NULL, &cb3);
if (res != VK_SUCCESS)
/* Do error handling for VK_ERROR_OUT_OF_MEMORY */
...
/* remove callbacks when cleaning up */
vkDestroyDebugReportCallbackEXT(instance, cb1, NULL);
vkDestroyDebugReportCallbackEXT(instance, cb2, NULL);
vkDestroyDebugReportCallbackEXT(instance, cb3, NULL);
Note
In the initial release of the VK_EXT_debug_report extension, the token VK_STRUCTURE_TYPE_DEBUG_REPORT_CREATE_INFO_EXT was used; it has since been replaced by VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT, with the older enum retained as an alias for backwards compatibility.
Note
In the initial release of the VK_EXT_debug_report extension, the token VK_DEBUG_REPORT_OBJECT_TYPE_DEBUG_REPORT_EXT was used; it has since been replaced by VK_DEBUG_REPORT_OBJECT_TYPE_DEBUG_REPORT_CALLBACK_EXT_EXT, with the older enum retained as an alias for backwards compatibility.
Issues
1) What is the hierarchy / seriousness of the message flags? E.g. ERROR > WARN > PERF_WARN …
RESOLVED: There is no specific hierarchy. Each bit is independent and should be checked via bitwise AND. For example:
if (localFlags & VK_DEBUG_REPORT_ERROR_BIT_EXT) {
process error message
}
if (localFlags & VK_DEBUG_REPORT_DEBUG_BIT_EXT) {
process debug message
}
The validation layers do use them in a hierarchical way (ERROR > WARN > PERF, WARN > DEBUG > INFO) and they (at least at the time of this writing) only set one bit at a time. But it is not a requirement of this extension.
It is possible that a layer may intercept and change, or augment the flags with extension values the application’s debug report handler may not be familiar with, so it is important to treat each flag independently.
2) Should there be a VU requiring VkDebugReportCallbackCreateInfoEXT::flags to be non-zero?
RESOLVED: It may not be very useful, but we do not need a VU statement requiring VkDebugReportCallbackCreateInfoEXT::flags at create time to be non-zero. One can imagine that apps may prefer it, as it allows them to set the mask as desired - including nothing - at runtime without having to check.
3) What is the difference between VK_DEBUG_REPORT_DEBUG_BIT_EXT and VK_DEBUG_REPORT_INFORMATION_BIT_EXT?
RESOLVED: VK_DEBUG_REPORT_DEBUG_BIT_EXT specifies information that could be useful when debugging the Vulkan implementation itself.
Version History
-
Revision 1, 2015-05-20 (Courtney Goetzenleuchter)
-
Initial draft, based on LunarG KHR spec, other KHR specs
-
-
Revision 2, 2016-02-16 (Courtney Goetzenleuchter)
-
Update usage, documentation
-
-
Revision 3, 2016-06-14 (Courtney Goetzenleuchter)
-
Update VK_EXT_DEBUG_REPORT_SPEC_VERSION to indicate added support for vkCreateInstance and vkDestroyInstance
-
-
Revision 4, 2016-12-08 (Mark Lobodzinski)
-
Added Display_KHR, DisplayModeKHR extension objects
-
Added ObjectTable_NVX, IndirectCommandsLayout_NVX extension objects
-
Bumped spec revision
-
Retroactively added version history
-
-
Revision 5, 2017-01-31 (Baldur Karlsson)
-
Moved definition of VkDebugReportObjectTypeEXT from debug marker chapter
-
-
Revision 6, 2017-01-31 (Baldur Karlsson)
-
Added VK_DEBUG_REPORT_OBJECT_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_KHR_EXT
-
-
Revision 7, 2017-04-20 (Courtney Goeltzenleuchter)
-
Clarify wording and address questions from developers.
-
-
Revision 8, 2017-04-21 (Courtney Goeltzenleuchter)
-
Remove unused enum VkDebugReportErrorEXT
-
-
Revision 9, 2017-09-12 (Tobias Hector)
-
Added interactions with Vulkan 1.1
-
VK_AMD_draw_indirect_count
- Name String
-
VK_AMD_draw_indirect_count
- Extension Type
-
Device extension
- Registered Extension Number
-
34
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Promoted to VK_KHR_draw_indirect_count extension
-
- Contact
-
-
Daniel Rakos drakos-amd
-
- Last Modified Date
-
2016-08-23
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Derrick Owens, AMD
-
Graham Sellers, AMD
-
Daniel Rakos, AMD
-
Dominik Witczak, AMD
-
This extension allows an application to source the number of draw calls for indirect draw calls from a buffer. This enables applications to generate arbitrary amounts of draw commands and execute them without host intervention.
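As an informal sketch (not part of the original appendix), recording such a draw might look as follows; the buffers, offsets and maximum draw count are placeholders, and the count buffer holds a single uint32_t draw count.
extern VkCommandBuffer commandBuffer;
extern VkBuffer argumentBuffer;   // array of VkDrawIndexedIndirectCommand structures
extern VkBuffer countBuffer;      // holds the uint32_t draw count
vkCmdDrawIndexedIndirectCountAMD(
    commandBuffer,
    argumentBuffer, 0,                      // buffer, offset
    countBuffer, 0,                         // countBuffer, countBufferOffset
    1024,                                   // maxDrawCount
    sizeof(VkDrawIndexedIndirectCommand));  // stride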
New Functions
Version History
-
Revision 2, 2016-08-23 (Dominik Witczak)
-
Minor fixes
-
-
Revision 1, 2016-07-21 (Matthaeus Chajdas)
-
Initial draft
-
VK_AMD_negative_viewport_height
- Name String
-
VK_AMD_negative_viewport_height
- Extension Type
-
Device extension
- Registered Extension Number
-
36
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Obsoleted by VK_KHR_maintenance1 extension
-
Which in turn was promoted to Vulkan 1.1
-
-
- Contact
-
-
Matthaeus G. Chajdas anteru
-
- Last Modified Date
-
2016-09-02
- IP Status
-
No known IP claims.
- Contributors
-
-
Matthaeus G. Chajdas, AMD
-
Graham Sellers, AMD
-
Baldur Karlsson
-
- Interactions and External Dependencies
-
-
Obsoleted by
VK_KHR_maintenance1
-
Obsoleted by Vulkan 1.1
-
This extension allows an application to specify a negative viewport height. The result is that the viewport transformation will flip along the y-axis.
- Allow negative height to be specified in the VkViewport::height field to perform y-inversion of the clip-space to framebuffer-space transform. This allows apps to avoid having to use gl_Position.y = -gl_Position.y in shaders also targeting other APIs.
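As an informal sketch (not part of the original appendix), a flipped viewport for an assumed 1024x768 render area could be specified as below; the recipe shown follows the VK_KHR_maintenance1 / Vulkan 1.1 convention, and the exact origin handling of this extension differs slightly, as noted in the obsoletion section that follows.
VkViewport viewport;
viewport.x        = 0.0f;
viewport.y        = 768.0f;   // shifted down by the (positive) height
viewport.width    = 1024.0f;
viewport.height   = -768.0f;  // negative height performs the y-inversion
viewport.minDepth = 0.0f;
viewport.maxDepth = 1.0f;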
Obsoletion by VK_KHR_maintenance1 and Vulkan 1.1
Functionality in this extension is included in VK_KHR_maintenance1 and Vulkan 1.1. Due to some slight behavioral differences, this extension must not be enabled alongside VK_KHR_maintenance1, or in an instance created with version 1.1 or later requested in VkApplicationInfo::apiVersion.
Version History
-
Revision 1, 2016-09-02 (Matthaeus Chajdas)
-
Initial draft
-
VK_NV_dedicated_allocation
- Name String
-
VK_NV_dedicated_allocation
- Extension Type
-
Device extension
- Registered Extension Number
-
27
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Deprecated by VK_KHR_dedicated_allocation extension
-
Which in turn was promoted to Vulkan 1.1
-
-
- Contact
-
-
Jeff Bolz jeffbolznv
-
- Last Modified Date
-
2016-05-31
- IP Status
-
No known IP claims.
- Contributors
-
-
Jeff Bolz, NVIDIA
-
This extension allows device memory to be allocated for a particular buffer or image resource, which on some devices can significantly improve the performance of that resource. Normal device memory allocations must support memory aliasing and sparse binding, which could interfere with optimizations like framebuffer compression or efficient page table usage. This is important for render targets and very large resources, but need not (and probably should not) be used for smaller resources that can benefit from suballocation.
This extension adds a few small structures to resource creation and memory allocation: a new structure that flags whether an image/buffer will have a dedicated allocation, and a structure indicating the image or buffer that an allocation will be bound to.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_IMAGE_CREATE_INFO_NV
-
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_BUFFER_CREATE_INFO_NV
-
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_MEMORY_ALLOCATE_INFO_NV
-
New Enums
None.
New Structures
New Functions
None.
Issues
None.
Examples
// Create an image with
// VkDedicatedAllocationImageCreateInfoNV::dedicatedAllocation
// set to VK_TRUE
VkDedicatedAllocationImageCreateInfoNV dedicatedImageInfo =
{
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_IMAGE_CREATE_INFO_NV, // sType
NULL, // pNext
VK_TRUE, // dedicatedAllocation
};
VkImageCreateInfo imageCreateInfo =
{
VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, // sType
&dedicatedImageInfo // pNext
// Other members set as usual
};
VkImage image;
VkResult result = vkCreateImage(
device,
&imageCreateInfo,
NULL, // pAllocator
&image);
VkMemoryRequirements memoryRequirements;
vkGetImageMemoryRequirements(
device,
image,
&memoryRequirements);
// Allocate memory with VkDedicatedAllocationMemoryAllocateInfoNV::image
// pointing to the image we are allocating the memory for
VkDedicatedAllocationMemoryAllocateInfoNV dedicatedInfo =
{
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_MEMORY_ALLOCATE_INFO_NV, // sType
NULL, // pNext
image, // image
VK_NULL_HANDLE, // buffer
};
VkMemoryAllocateInfo memoryAllocateInfo =
{
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, // sType
&dedicatedInfo, // pNext
memoryRequirements.size, // allocationSize
FindMemoryTypeIndex(memoryRequirements.memoryTypeBits), // memoryTypeIndex
};
VkDeviceMemory memory;
vkAllocateMemory(
device,
&memoryAllocateInfo,
NULL, // pAllocator
&memory);
// Bind the image to the memory
vkBindImageMemory(
device,
image,
memory,
0);
Version History
-
Revision 1, 2016-05-31 (Jeff Bolz)
-
Internal revisions
-
VK_NV_external_memory
- Name String
-
VK_NV_external_memory
- Extension Type
-
Device extension
- Registered Extension Number
-
57
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_NV_external_memory_capabilities
-
- Deprecation state
-
-
Deprecated by VK_KHR_external_memory extension
-
Which in turn was promoted to Vulkan 1.1
-
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-08-19
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Carsten Rohde, NVIDIA
-
Applications may wish to export memory to other Vulkan instances or other APIs, or import memory from other Vulkan instances or other APIs to enable Vulkan workloads to be split up across application module, process, or API boundaries. This extension enables applications to create exportable Vulkan memory objects such that the underlying resources can be referenced outside the Vulkan instance that created them.
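As an informal sketch (not part of the original appendix; a fuller sequence appears under VK_NV_external_memory_win32 below), an exportable image and allocation are requested by chaining the two new structures, with the handle type and remaining members as placeholders.
VkExternalMemoryImageCreateInfoNV externalImageInfo =
{
    VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO_NV,  // sType
    NULL,                                                    // pNext
    VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV       // handleTypes
};
VkImageCreateInfo imageCreateInfo = { VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO };
imageCreateInfo.pNext = &externalImageInfo;
// Other image members set as usual
VkExportMemoryAllocateInfoNV exportAllocInfo =
{
    VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_NV,        // sType
    NULL,                                                    // pNext
    VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV       // handleTypes
};
VkMemoryAllocateInfo memoryAllocateInfo = { VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO };
memoryAllocateInfo.pNext = &exportAllocInfo;
// allocationSize and memoryTypeIndex taken from the image's memory requirements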
New Object Types
None.
New Enum Constants
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO_NV
-
VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_NV
New Enums
None.
New Structures
New Functions
None.
Issues
1) If memory objects are shared between processes and APIs, is this considered aliasing according to the rules outlined in the Memory Aliasing section?
RESOLVED: Yes, but strict exceptions to the rules are added to allow some forms of aliasing in these cases. Further, other extensions may build upon these new aliasing rules to define specific support usage within Vulkan for imported native memory objects, or memory objects from other APIs.
2) Are new image layouts or metadata required to specify image layouts and layout transitions compatible with non-Vulkan APIs, or with other instances of the same Vulkan driver?
RESOLVED: No.
Separate instances of the same Vulkan driver running on the same GPU should have identical internal layout semantics, so applications have the tools they need to ensure views of images are consistent between the two instances. Other APIs will fall into two categories: those that are Vulkan compatible (a term to be defined by subsequent interoperability extensions), or Vulkan incompatible. When sharing images with Vulkan incompatible APIs, the Vulkan image must be transitioned to the VK_IMAGE_LAYOUT_GENERAL layout before handing it off to the external API.
Note this does not attempt to address cross-device transitions, nor transitions to engines on the same device which are not visible within the Vulkan API. Both of these are beyond the scope of this extension.
Examples
// TODO: Write some sample code here.
Version History
-
Revision 1, 2016-08-19 (James Jones)
-
Initial draft
-
VK_NV_external_memory_capabilities
- Name String
-
VK_NV_external_memory_capabilities
- Extension Type
-
Instance extension
- Registered Extension Number
-
56
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Deprecated by VK_KHR_external_memory_capabilities extension
-
Which in turn was promoted to Vulkan 1.1
-
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-08-19
- IP Status
-
No known IP claims.
- Interactions and External Dependencies
-
-
Interacts with Vulkan 1.1.
-
Interacts with VK_KHR_dedicated_allocation.
-
Interacts with VK_NV_dedicated_allocation.
-
- Contributors
-
-
James Jones, NVIDIA
-
Applications may wish to import memory from the Direct3D API, or export memory to other Vulkan instances. This extension provides a set of capability queries that allow applications to determine what types of win32 memory handles an implementation supports for a given set of use cases.
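As an informal sketch (not part of the original appendix), the central query looks as follows; the format, usage and handle type are placeholder choices.
extern VkPhysicalDevice physicalDevice;
VkExternalImageFormatPropertiesNV properties;
VkResult result = vkGetPhysicalDeviceExternalImageFormatPropertiesNV(
    physicalDevice,
    VK_FORMAT_R8G8B8A8_UNORM,
    VK_IMAGE_TYPE_2D,
    VK_IMAGE_TILING_OPTIMAL,
    VK_IMAGE_USAGE_SAMPLED_BIT,
    0,                                                   // VkImageCreateFlags
    VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV,  // handle type of interest
    &properties);
if ((result == VK_SUCCESS) &&
    (properties.externalMemoryFeatures &
     VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT_NV)) {
    // Images of this format and usage can be exported with this handle type
}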
New Object Types
None.
New Enum Constants
None.
New Structs
New Functions
Issues
1) Why do so many external memory capabilities need to be queried on a per-memory-handle-type basis?
RESOLVED: This is because some handle types are based on OS-native objects that have far more limited capabilities than the very generic Vulkan memory objects. Not all memory handle types can name memory objects that support 3D images, for example. Some handle types cannot even support the deferred image and memory binding behavior of Vulkan and require specifying the image when allocating or importing the memory object.
2) Does the VkExternalImageFormatPropertiesNV struct need to include a list of memory type bits that support the given handle type?
RESOLVED: No. The memory types that do not support the handle types will simply be filtered out of the results returned by vkGetImageMemoryRequirements when a set of handle types was specified at image creation time.
3) Should the non-opaque handle types be moved to their own extension?
RESOLVED: Perhaps. However, defining the handle type bits does very little and does not require any platform-specific types on its own, and it is easier to maintain the bitmask values in a single extension for now. Presumably more handle types could be added by separate extensions though, and it would be mildly weird to have some platform-specific ones defined in the core spec and some in extensions.
VK_NV_external_memory_win32
- Name String
-
VK_NV_external_memory_win32
- Extension Type
-
Device extension
- Registered Extension Number
-
58
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_NV_external_memory
-
- Deprecation state
-
-
Deprecated by VK_KHR_external_memory_win32 extension
-
- Contact
-
-
James Jones cubanismo
-
- Last Modified Date
-
2016-08-19
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Carsten Rohde, NVIDIA
-
Applications may wish to export memory to other Vulkan instances or other APIs, or import memory from other Vulkan instances or other APIs to enable Vulkan workloads to be split up across application module, process, or API boundaries. This extension enables win32 applications to export win32 handles from Vulkan memory objects such that the underlying resources can be referenced outside the Vulkan instance that created them, and import win32 handles created in the Direct3D API to Vulkan memory objects.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_NV
-
VK_STRUCTURE_TYPE_EXPORT_MEMORY_WIN32_HANDLE_INFO_NV
-
New Enums
None.
New Structures
New Functions
Issues
1) If memory objects are shared between processes and APIs, is this considered aliasing according to the rules outlined in the Memory Aliasing section?
RESOLVED: Yes, but strict exceptions to the rules are added to allow some forms of aliasing in these cases. Further, other extensions may build upon these new aliasing rules to define specific support usage within Vulkan for imported native memory objects, or memory objects from other APIs.
2) Are new image layouts or metadata required to specify image layouts and layout transitions compatible with non-Vulkan APIs, or with other instances of the same Vulkan driver?
RESOLVED: No.
Separate instances of the same Vulkan driver running on the same GPU should have identical internal layout semantics, so applications have the tools they need to ensure views of images are consistent between the two instances. Other APIs will fall into two categories: those that are Vulkan compatible (a term to be defined by subsequent interoperability extensions), or Vulkan incompatible. When sharing images with Vulkan incompatible APIs, the Vulkan image must be transitioned to the VK_IMAGE_LAYOUT_GENERAL layout before handing it off to the external API.
Note this does not attempt to address cross-device transitions, nor transitions to engines on the same device which are not visible within the Vulkan API. Both of these are beyond the scope of this extension.
3) Do applications need to call CloseHandle() on the values returned from vkGetMemoryWin32HandleNV when handleType is VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV?
RESOLVED: Yes, unless it is passed back in to another driver instance to import the object. A successful get call transfers ownership of the handle to the application, while an import transfers ownership to the associated driver. Destroying the memory object will not destroy the handle or the handle’s reference to the underlying memory resource.
Examples
//
// Create an exportable memory object and export an external
// handle from it.
//
// Pick an external format and handle type.
static const VkFormat format = VK_FORMAT_R8G8B8A8_UNORM;
static const VkExternalMemoryHandleTypeFlagsNV handleType =
VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_NV;
extern VkPhysicalDevice physicalDevice;
extern VkDevice device;
VkPhysicalDeviceMemoryProperties memoryProperties;
VkExternalImageFormatPropertiesNV properties;
VkExternalMemoryImageCreateInfoNV externalMemoryImageCreateInfo;
VkDedicatedAllocationImageCreateInfoNV dedicatedImageCreateInfo;
VkImageCreateInfo imageCreateInfo;
VkImage image;
VkMemoryRequirements imageMemoryRequirements;
uint32_t numMemoryTypes;
uint32_t memoryType;
VkExportMemoryAllocateInfoNV exportMemoryAllocateInfo;
VkDedicatedAllocationMemoryAllocateInfoNV dedicatedAllocationInfo;
VkMemoryAllocateInfo memoryAllocateInfo;
VkDeviceMemory memory;
VkResult result;
HANDLE memoryHnd;
// Figure out how many memory types the device supports
vkGetPhysicalDeviceMemoryProperties(physicalDevice,
&memoryProperties);
numMemoryTypes = memoryProperties.memoryTypeCount;
// Check the external handle type capabilities for the chosen format
// Exportable 2D image support with at least 1 mip level, 1 array
// layer, and VK_SAMPLE_COUNT_1_BIT using optimal tiling and supporting
// texturing and color rendering is required.
result = vkGetPhysicalDeviceExternalImageFormatPropertiesNV(
physicalDevice,
format,
VK_IMAGE_TYPE_2D,
VK_IMAGE_TILING_OPTIMAL,
VK_IMAGE_USAGE_SAMPLED_BIT |
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
0,
handleType,
&properties);
if ((result != VK_SUCCESS) ||
!(properties.externalMemoryFeatures &
VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT_NV)) {
abort();
}
// Set up the external memory image creation info
memset(&externalMemoryImageCreateInfo,
0, sizeof(externalMemoryImageCreateInfo));
externalMemoryImageCreateInfo.sType =
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO_NV;
externalMemoryImageCreateInfo.handleTypes = handleType;
if (properties.externalMemoryFeatures &
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT_NV) {
memset(&dedicatedImageCreateInfo, 0, sizeof(dedicatedImageCreateInfo));
dedicatedImageCreateInfo.sType =
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_IMAGE_CREATE_INFO_NV;
dedicatedImageCreateInfo.dedicatedAllocation = VK_TRUE;
externalMemoryImageCreateInfo.pNext = &dedicatedImageCreateInfo;
}
// Set up the core image creation info
memset(&imageCreateInfo, 0, sizeof(imageCreateInfo));
imageCreateInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageCreateInfo.pNext = &externalMemoryImageCreateInfo;
imageCreateInfo.format = format;
imageCreateInfo.extent.width = 64;
imageCreateInfo.extent.height = 64;
imageCreateInfo.extent.depth = 1;
imageCreateInfo.mipLevels = 1;
imageCreateInfo.arrayLayers = 1;
imageCreateInfo.samples = VK_SAMPLE_COUNT_1_BIT;
imageCreateInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
imageCreateInfo.usage = VK_IMAGE_USAGE_SAMPLED_BIT |
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
imageCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
imageCreateInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
vkCreateImage(device, &imageCreateInfo, NULL, &image);
vkGetImageMemoryRequirements(device,
image,
&imageMemoryRequirements);
// For simplicity, just pick the first compatible memory type.
for (memoryType = 0; memoryType < numMemoryTypes; memoryType++) {
if ((1 << memoryType) & imageMemoryRequirements.memoryTypeBits) {
break;
}
}
// At least one memory type must be supported given the prior external
// handle capability check.
assert(memoryType < numMemoryTypes);
// Allocate the external memory object.
memset(&exportMemoryAllocateInfo, 0, sizeof(exportMemoryAllocateInfo));
exportMemoryAllocateInfo.sType =
VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_NV;
exportMemoryAllocateInfo.handleTypes = handleType;
if (properties.externalMemoryFeatures &
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT_NV) {
memset(&dedicatedAllocationInfo, 0, sizeof(dedicatedAllocationInfo));
dedicatedAllocationInfo.sType =
VK_STRUCTURE_TYPE_DEDICATED_ALLOCATION_MEMORY_ALLOCATE_INFO_NV;
dedicatedAllocationInfo.image = image;
exportMemoryAllocateInfo.pNext = &dedicatedAllocationInfo;
}
memset(&memoryAllocateInfo, 0, sizeof(memoryAllocateInfo));
memoryAllocateInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
memoryAllocateInfo.pNext = &exportMemoryAllocateInfo;
memoryAllocateInfo.allocationSize = imageMemoryRequirements.size;
memoryAllocateInfo.memoryTypeIndex = memoryType;
vkAllocateMemory(device, &memoryAllocateInfo, NULL, &memory);
if (!(properties.externalMemoryFeatures &
VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT_NV)) {
vkBindImageMemory(device, image, memory, 0);
}
// Get the external memory opaque win32 handle
vkGetMemoryWin32HandleNV(device, memory, &memoryHnd);
Version History
-
Revision 1, 2016-08-11 (James Jones)
-
Initial draft
-
VK_NV_glsl_shader
- Name String
-
VK_NV_glsl_shader
- Extension Type
-
Device extension
- Registered Extension Number
-
13
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
- Deprecation state
-
-
Deprecated without replacement
-
- Contact
-
-
Piers Daniell pdaniell-nv
-
- Last Modified Date
-
2016-02-14
- IP Status
-
No known IP claims.
- Contributors
-
-
Piers Daniell, NVIDIA
-
This extension allows GLSL shaders written to the GL_KHR_vulkan_glsl extension specification to be used instead of SPIR-V. The implementation will automatically detect whether the shader is SPIR-V or GLSL, and compile it appropriately.
New Object Types
New Enum Constants
-
Extending VkResult:
-
VK_ERROR_INVALID_SHADER_NV
-
New Enums
New Structures
New Functions
Issues
Examples
Example 1
Passing in GLSL code
char const vss[] =
"#version 450 core\n"
"layout(location = 0) in vec2 aVertex;\n"
"layout(location = 1) in vec4 aColor;\n"
"out vec4 vColor;\n"
"void main()\n"
"{\n"
" vColor = aColor;\n"
" gl_Position = vec4(aVertex, 0, 1);\n"
"}\n"
;
VkShaderModuleCreateInfo vertexShaderInfo = { VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO };
vertexShaderInfo.codeSize = sizeof vss;
vertexShaderInfo.pCode = (const uint32_t *)vss;
VkShaderModule vertexShader;
vkCreateShaderModule(device, &vertexShaderInfo, 0, &vertexShader);
Version History
-
Revision 1, 2016-02-14 (Piers Daniell)
-
Initial draft
-
VK_NV_win32_keyed_mutex
- Name String
-
VK_NV_win32_keyed_mutex
- Extension Type
-
Device extension
- Registered Extension Number
-
59
- Revision
-
1
- Extension and Version Dependencies
-
-
Requires Vulkan 1.0
-
Requires
VK_NV_external_memory_win32
-
- Deprecation state
-
-
Promoted to VK_KHR_win32_keyed_mutex extension
-
- Contact
-
-
Carsten Rohde crohde
-
- Last Modified Date
-
2016-08-19
- IP Status
-
No known IP claims.
- Contributors
-
-
James Jones, NVIDIA
-
Carsten Rohde, NVIDIA
-
Applications that wish to import Direct3D 11 memory objects into the Vulkan API may wish to use the native keyed mutex mechanism to synchronize access to the memory between Vulkan and Direct3D. This extension provides a way for an application to access the keyed mutex associated with an imported Vulkan memory object when submitting command buffers to a queue.
New Object Types
None.
New Enum Constants
-
Extending VkStructureType:
-
VK_STRUCTURE_TYPE_WIN32_KEYED_MUTEX_ACQUIRE_RELEASE_INFO_NV
-
New Enums
None.
New Structures
New Functions
None.
Issues
None.
Examples
//
// Import a memory object from Direct3D 11, and synchronize
// access to it in Vulkan using keyed mutex objects.
//
extern VkPhysicalDevice physicalDevice;
extern VkDevice device;
extern HANDLE sharedNtHandle;
static const VkFormat format = VK_FORMAT_R8G8B8A8_UNORM;
static const VkExternalMemoryHandleTypeFlagsNV handleType =
VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_IMAGE_BIT_NV;
VkPhysicalDeviceMemoryProperties memoryProperties;
VkExternalImageFormatPropertiesNV properties;
VkExternalMemoryImageCreateInfoNV externalMemoryImageCreateInfo;
VkImageCreateInfo imageCreateInfo;
VkImage image;
VkMemoryRequirements imageMemoryRequirements;
uint32_t numMemoryTypes;
uint32_t memoryType;
VkImportMemoryWin32HandleInfoNV importMemoryInfo;
VkMemoryAllocateInfo memoryAllocateInfo;
VkDeviceMemory mem;
VkResult result;
// Figure out how many memory types the device supports
vkGetPhysicalDeviceMemoryProperties(physicalDevice,
&memoryProperties);
numMemoryTypes = memoryProperties.memoryTypeCount;
// Check the external handle type capabilities for the chosen format
// Importable 2D image support with at least 1 mip level, 1 array
// layer, and VK_SAMPLE_COUNT_1_BIT using optimal tiling and supporting
// texturing and color rendering is required.
result = vkGetPhysicalDeviceExternalImageFormatPropertiesNV(
physicalDevice,
format,
VK_IMAGE_TYPE_2D,
VK_IMAGE_TILING_OPTIMAL,
VK_IMAGE_USAGE_SAMPLED_BIT |
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
0,
handleType,
&properties);
if ((result != VK_SUCCESS) ||
!(properties.externalMemoryFeatures &
VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT_NV)) {
abort();
}
// Set up the external memory image creation info
memset(&externalMemoryImageCreateInfo,
0, sizeof(externalMemoryImageCreateInfo));
externalMemoryImageCreateInfo.sType =
VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO_NV;
externalMemoryImageCreateInfo.handleTypes = handleType;
// Set up the core image creation info
memset(&imageCreateInfo, 0, sizeof(imageCreateInfo));
imageCreateInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageCreateInfo.pNext = &externalMemoryImageCreateInfo;
imageCreateInfo.format = format;
imageCreateInfo.extent.width = 64;
imageCreateInfo.extent.height = 64;
imageCreateInfo.extent.depth = 1;
imageCreateInfo.mipLevels = 1;
imageCreateInfo.arrayLayers = 1;
imageCreateInfo.samples = VK_SAMPLE_COUNT_1_BIT;
imageCreateInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
imageCreateInfo.usage = VK_IMAGE_USAGE_SAMPLED_BIT |
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
imageCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
imageCreateInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
vkCreateImage(device, &imageCreateInfo, NULL, &image);
vkGetImageMemoryRequirements(device,
image,
&imageMemoryRequirements);
// For simplicity, just pick the first compatible memory type.
for (memoryType = 0; memoryType < numMemoryTypes; memoryType++) {
if ((1 << memoryType) & imageMemoryRequirements.memoryTypeBits) {
break;
}
}
// At least one memory type must be supported given the prior external
// handle capability check.
assert(memoryType < numMemoryTypes);
// Import the external memory object from the shared D3D11 handle.
memset(&importMemoryInfo, 0, sizeof(importMemoryInfo));
importMemoryInfo.sType =
VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_NV;
importMemoryInfo.handleType = handleType;
importMemoryInfo.handle = sharedNtHandle;
memset(&memoryAllocateInfo, 0, sizeof(memoryAllocateInfo));
memoryAllocateInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
memoryAllocateInfo.pNext = &importMemoryInfo;
memoryAllocateInfo.allocationSize = imageMemoryRequirements.size;
memoryAllocateInfo.memoryTypeIndex = memoryType;
vkAllocateMemory(device, &memoryAllocateInfo, NULL, &mem);
vkBindImageMemory(device, image, mem, 0);
...
const uint64_t acquireKey = 1;
const uint32_t timeout = INFINITE;
const uint64_t releaseKey = 2;
VkWin32KeyedMutexAcquireReleaseInfoNV keyedMutex =
{ VK_STRUCTURE_TYPE_WIN32_KEYED_MUTEX_ACQUIRE_RELEASE_INFO_NV };
keyedMutex.acquireCount = 1;
keyedMutex.pAcquireSyncs = &mem;
keyedMutex.pAcquireKeys = &acquireKey;
keyedMutex.pAcquireTimeoutMilliseconds = &timeout;
keyedMutex.releaseCount = 1;
keyedMutex.pReleaseSyncs = &mem;
keyedMutex.pReleaseKeys = &releaseKey;
VkSubmitInfo submit_info = { VK_STRUCTURE_TYPE_SUBMIT_INFO, &keyedMutex };
submit_info.commandBufferCount = 1;
submit_info.pCommandBuffers = &cmd_buf;
vkQueueSubmit(queue, 1, &submit_info, VK_NULL_HANDLE);
Version History
-
Revision 2, 2016-08-11 (James Jones)
-
Updated sample code based on the NV external memory extensions.
-
Renamed from NVX to NV extension.
-
Added Overview and Description sections.
-
Updated sample code to use the NV external memory extensions.
-
-
Revision 1, 2016-06-14 (Carsten Rohde)
-
Initial draft.
-
Appendix F: API Boilerplate
This appendix defines Vulkan API features that are infrastructure required for a complete functional description of Vulkan, but do not logically belong elsewhere in the Specification.
Vulkan Header Files
Vulkan is defined as an API in the C99 language. Khronos provides a corresponding set of header files for applications using the API, which may be used in either C or C++ code. The interface descriptions in the specification are the same as the interfaces defined in these header files, and both are derived from the vk.xml XML API Registry, which is the canonical machine-readable description of the Vulkan API.
The Registry, scripts used for processing it into various forms, and documentation of the registry schema are available as described at https://www.khronos.org/registry/vulkan/#apiregistry .
Language bindings for other languages can be defined using the information in the Specification and the Registry. Khronos does not provide any such bindings, but third-party developers have created some additional bindings.
Vulkan Combined API Header vulkan.h (Informative)
Applications normally will include the header vulkan.h. In turn, vulkan.h always includes the following headers:
- vk_platform.h, defining platform-specific macros and headers.
- vulkan_core.h, defining APIs for the Vulkan core and all registered extensions other than window system-specific extensions.
In addition, specific preprocessor macros defined at the time vulkan.h is included cause header files for the corresponding window system-specific extension interfaces to be included.
Vulkan Platform-Specific Header vk_platform.h (Informative)
Platform-specific macros and interfaces are defined in vk_platform.h. These macros are used to control platform-dependent behavior, and their exact definitions are under the control of specific platforms and Vulkan implementations.
Platform-Specific Calling Conventions
On many platforms the following macros are empty strings, causing platform- and compiler-specific default calling conventions to be used.
VKAPI_ATTR is a macro placed before the return type in Vulkan API function declarations. This macro controls calling conventions for C++11 and GCC/Clang-style compilers.
VKAPI_CALL is a macro placed after the return type in Vulkan API function declarations. This macro controls calling conventions for MSVC-style compilers.
VKAPI_PTR is a macro placed between the '(' and '*' in Vulkan API function pointer declarations. This macro also controls calling conventions, and typically has the same definition as VKAPI_ATTR or VKAPI_CALL, depending on the compiler.
VKAPI_ATTR <return_type> VKAPI_CALL <command_name>(<command_parameters>);
Additionally, a Vulkan function pointer type declaration takes the form of:
typedef <return_type> (VKAPI_PTR *PFN_<command_name>)(<command_parameters>);
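For example, vulkan_core.h declares vkCreateInstance and its corresponding function pointer type using these macros:
VKAPI_ATTR VkResult VKAPI_CALL vkCreateInstance(
    const VkInstanceCreateInfo*                 pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkInstance*                                 pInstance);
typedef VkResult (VKAPI_PTR *PFN_vkCreateInstance)(
    const VkInstanceCreateInfo*                 pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkInstance*                                 pInstance);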
Platform-Specific Header Control
If the VK_NO_STDINT_H macro is defined by the application at compile time, extended integer types used by the Vulkan API, such as uint8_t, must also be defined by the application. Otherwise, the Vulkan headers will not compile. If VK_NO_STDINT_H is not defined, the system <stdint.h> is used to define these types. There is a fallback path when Microsoft Visual Studio version 2008 and earlier versions are detected at compile time.
Vulkan Core API Header vulkan_core.h
Applications that do not make use of window system-specific extensions may simply include vulkan_core.h instead of vulkan.h, although there is usually no reason to do so. In addition to the Vulkan API, vulkan_core.h also defines a small number of C preprocessor macros that are described below.
Vulkan Version Number Macros
API Version Numbers are packed into integers. These macros manipulate version numbers in useful ways.
VK_VERSION_MAJOR extracts the API major version number from a packed version number:
#define VK_VERSION_MAJOR(version) ((uint32_t)(version) >> 22)
VK_VERSION_MINOR extracts the API minor version number from a packed version number:
#define VK_VERSION_MINOR(version) (((uint32_t)(version) >> 12) & 0x3ff)
VK_VERSION_PATCH extracts the API patch version number from a packed version number:
#define VK_VERSION_PATCH(version) ((uint32_t)(version) & 0xfff)
VK_API_VERSION_1_0 returns the API version number for Vulkan 1.0. The patch version number in this macro will always be zero. The supported patch version for a physical device can be queried with vkGetPhysicalDeviceProperties.
// Vulkan 1.0 version number
#define VK_API_VERSION_1_0 VK_MAKE_VERSION(1, 0, 0) // Patch version should always be set to 0
VK_API_VERSION_1_1 returns the API version number for Vulkan 1.1. The patch version number in this macro will always be zero. The supported patch version for a physical device can be queried with vkGetPhysicalDeviceProperties.
// Vulkan 1.1 version number
#define VK_API_VERSION_1_1 VK_MAKE_VERSION(1, 1, 0) // Patch version should always be set to 0
VK_API_VERSION is now commented out of vulkan_core.h and cannot be used.
// DEPRECATED: This define has been removed. Specific version defines (e.g. VK_API_VERSION_1_0), or the VK_MAKE_VERSION macro, should be used instead.
//#define VK_API_VERSION VK_MAKE_VERSION(1, 0, 0) // Patch version should always be set to 0
VK_MAKE_VERSION constructs an API version number.
#define VK_MAKE_VERSION(major, minor, patch) \
(((major) << 22) | ((minor) << 12) | (patch))
- major is the major version number.
- minor is the minor version number.
- patch is the patch version number.
This macro can be used when constructing the VkApplicationInfo::apiVersion parameter passed to vkCreateInstance.
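As an informal sketch (not part of the original appendix), the version macros are typically used as follows; the application and engine names and versions are placeholders.
extern VkPhysicalDevice physicalDevice;
VkApplicationInfo appInfo = { VK_STRUCTURE_TYPE_APPLICATION_INFO };
appInfo.pApplicationName   = "ExampleApp";
appInfo.applicationVersion = VK_MAKE_VERSION(1, 0, 0);
appInfo.pEngineName        = "ExampleEngine";
appInfo.engineVersion      = VK_MAKE_VERSION(1, 0, 0);
appInfo.apiVersion         = VK_API_VERSION_1_1;
VkInstanceCreateInfo instanceCreateInfo = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
instanceCreateInfo.pApplicationInfo = &appInfo;
// The packed version reported by a physical device can be unpacked with the
// extraction macros described above:
VkPhysicalDeviceProperties deviceProperties;
vkGetPhysicalDeviceProperties(physicalDevice, &deviceProperties);
uint32_t apiMajor = VK_VERSION_MAJOR(deviceProperties.apiVersion);
uint32_t apiMinor = VK_VERSION_MINOR(deviceProperties.apiVersion);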
Vulkan Header File Version Number
VK_HEADER_VERSION is the version number of the vulkan_core.h header. This value is kept synchronized with the patch version of the released Specification.
// Version of this file
#define VK_HEADER_VERSION 96
Vulkan Handle Macros
VK_DEFINE_HANDLE defines a dispatchable handle type.
#define VK_DEFINE_HANDLE(object) typedef struct object##_T* object;
- object is the name of the resulting C type.
The only dispatchable handle types are those related to device and instance management, such as VkDevice.
VK_DEFINE_NON_DISPATCHABLE_HANDLE defines a non-dispatchable handle type.
#if !defined(VK_DEFINE_NON_DISPATCHABLE_HANDLE)
#if defined(__LP64__) || defined(_WIN64) || (defined(__x86_64__) && !defined(__ILP32__) ) || defined(_M_X64) || defined(__ia64) || defined (_M_IA64) || defined(__aarch64__) || defined(__powerpc64__)
#define VK_DEFINE_NON_DISPATCHABLE_HANDLE(object) typedef struct object##_T *object;
#else
#define VK_DEFINE_NON_DISPATCHABLE_HANDLE(object) typedef uint64_t object;
#endif
#endif
- object is the name of the resulting C type.
Most Vulkan handle types, such as VkBuffer, are non-dispatchable.
VK_NULL_HANDLE is a reserved value representing a non-valid object handle. It may be passed to and returned from Vulkan commands only when specifically allowed.
#define VK_NULL_HANDLE 0
Window System-Specific Header Control (Informative)
To use a Vulkan extension supporting a platform-specific window system, header files for that window system must be included at compile time, or platform-specific types must be forward-declared. The Vulkan header files cannot determine whether or not an external header is available at compile time, so platform-specific extensions are provided in separate headers from the core API and platform-independent extensions, allowing applications to decide which ones should be defined and how the external headers are included.
Extensions dependent on particular sets of platform headers, or that forward-declare platform-specific types, are declared in a header named for that platform. Before including these platform-specific Vulkan headers, applications must include both vulkan_core.h and any external native headers the platform extensions depend on.
As a convenience for applications that do not need the flexibility of separate platform-specific Vulkan headers, vulkan.h includes vulkan_core.h, and then conditionally includes platform-specific Vulkan headers and the external headers they depend on. Applications control which platform-specific headers are included by #defining macros before including vulkan.h.
The correspondence between platform-specific extensions, the external headers they require, the platform-specific header which declares them, and the preprocessor macros which enable inclusion by vulkan.h are shown in the following table.
Extension Name | Window System Name | Platform-specific Header | Required External Headers | Controlling vulkan.h Macro |
---|---|---|---|---|
VK_KHR_android_surface | Android | vulkan_android.h | None | VK_USE_PLATFORM_ANDROID_KHR |
VK_KHR_wayland_surface | Wayland | vulkan_wayland.h | <wayland-client.h> | VK_USE_PLATFORM_WAYLAND_KHR |
VK_KHR_win32_surface, VK_KHR_external_memory_win32, VK_KHR_win32_keyed_mutex, VK_KHR_external_semaphore_win32, VK_KHR_external_fence_win32, VK_NV_external_memory_win32, VK_NV_win32_keyed_mutex | Microsoft Windows | vulkan_win32.h | <windows.h> | VK_USE_PLATFORM_WIN32_KHR |
VK_KHR_xcb_surface | X11 Xcb | vulkan_xcb.h | <xcb/xcb.h> | VK_USE_PLATFORM_XCB_KHR |
VK_KHR_xlib_surface | X11 Xlib | vulkan_xlib.h | <X11/Xlib.h> | VK_USE_PLATFORM_XLIB_KHR |
VK_EXT_acquire_xlib_display | X11 XRAndR | vulkan_xlib_xrandr.h | <X11/Xlib.h>, <X11/extensions/Xrandr.h> | VK_USE_PLATFORM_XLIB_XRANDR_EXT |
VK_MVK_ios_surface | iOS | vulkan_ios.h | None | VK_USE_PLATFORM_IOS_MVK |
VK_MVK_macos_surface | macOS | vulkan_macos.h | None | VK_USE_PLATFORM_MACOS_MVK |
VK_NN_vi_surface | VI | vulkan_vi.h | None | VK_USE_PLATFORM_VI_NN |
Note
This section describes the purpose of the headers independently of the specific underlying functionality of the window system extensions themselves. Each extension name will only link to a description of that extension when viewing a specification built with that extension included.
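As an informative example (not part of the original appendix), an application targeting the Win32 window system can either rely on vulkan.h or include the platform-specific header directly:
// Convenience path: vulkan.h pulls in vulkan_win32.h and <windows.h>
#define VK_USE_PLATFORM_WIN32_KHR
#include <vulkan/vulkan.h>

// Separate-header path: the required external native header is included first
#include <windows.h>
#include <vulkan/vulkan_core.h>
#include <vulkan/vulkan_win32.h>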
Appendix G: Invariance
The Vulkan specification is not pixel exact. It therefore does not guarantee an exact match between images produced by different Vulkan implementations. However, the specification does specify exact matches, in some cases, for images produced by the same implementation. The purpose of this appendix is to identify and provide justification for those cases that require exact matches.
Repeatability
The obvious and most fundamental case is repeated issuance of a series of Vulkan commands. For any given Vulkan and framebuffer state vector, and for any Vulkan command, the resulting Vulkan and framebuffer state must be identical whenever the command is executed on that initial Vulkan and framebuffer state. This repeatability requirement does not apply when using shaders containing side effects (image and buffer variable stores and atomic operations), because these memory operations are not guaranteed to be processed in a defined order.
The repeatability requirement does not apply for rendering done using a graphics pipeline that uses VK_RASTERIZATION_ORDER_RELAXED_AMD.
One purpose of repeatability is avoidance of visual artifacts when a double-buffered scene is redrawn. If rendering is not repeatable, swapping between two buffers rendered with the same command sequence may result in visible changes in the image. Such false motion is distracting to the viewer. Another reason for repeatability is testability.
Repeatability, while important, is a weak requirement. Given only repeatability as a requirement, two scenes rendered with one (small) polygon changed in position might differ at every pixel. Such a difference, while within the law of repeatability, is certainly not within its spirit. Additional invariance rules are desirable to ensure useful operation.
Multi-pass Algorithms
Invariance is necessary for a whole set of useful multi-pass algorithms. Such algorithms render multiple times, each time with a different Vulkan mode vector, to eventually produce a result in the framebuffer. Examples of these algorithms include:
-
“Erasing” a primitive from the framebuffer by redrawing it, either in a different color or using the XOR logical operation.
-
Using stencil operations to compute capping planes.
Invariance Rules
For a given Vulkan device:
Rule 1 For any given Vulkan and framebuffer state vector, and for any given Vulkan command, the resulting Vulkan and framebuffer state must be identical each time the command is executed on that initial Vulkan and framebuffer state.
Rule 2 Changes to the following state values have no side effects (the use of any other state value is not affected by the change):
Required:
-
Color and depth/stencil attachment contents
-
Scissor parameters (other than enable)
-
Write masks (color, depth, stencil)
-
Clear values (color, depth, stencil)
Strongly suggested:
-
Stencil parameters (other than enable)
-
Depth test parameters (other than enable)
-
Blend parameters (other than enable)
-
Logical operation parameters (other than enable)
Corollary 1 Fragment generation is invariant with respect to the state values listed in Rule 2.
Rule 3 The arithmetic of each per-fragment operation is invariant except with respect to parameters that directly control it.
Corollary 2 Images rendered into different color attachments of the same framebuffer, either simultaneously or separately using the same command sequence, are pixel identical.
Rule 4 Identical pipelines will produce the same result when run multiple times with the same input. The wording “Identical pipelines” means VkPipeline objects that have been created with identical SPIR-V binaries and identical state, which are then used by commands executed using the same Vulkan state vector. Invariance is relaxed for shaders with side effects, such as performing stores or atomics.
Rule 5 All fragment shaders that either conditionally or unconditionally assign FragCoord.z to FragDepth are depth-invariant with respect to each other, for those fragments where the assignment to FragDepth actually is done.
If a sequence of Vulkan commands specifies primitives to be rendered with shaders containing side effects (image and buffer variable stores and atomic operations), invariance rules are relaxed. In particular, rule 1, corollary 2, and rule 4 do not apply in the presence of shader side effects.
The following weaker versions of rules 1 and 4 apply to Vulkan commands involving shader side effects:
Rule 6 For any given Vulkan and framebuffer state vector, and for any given Vulkan command, the contents of any framebuffer state not directly or indirectly affected by results of shader image or buffer variable stores or atomic operations must be identical each time the command is executed on that initial Vulkan and framebuffer state.
Rule 7 Identical pipelines will produce the same result when run multiple times with the same input as long as:
-
shader invocations do not use image atomic operations;
-
no framebuffer memory is written to more than once by image stores, unless all such stores write the same value; and
-
no shader invocation, or other operation performed to process the sequence of commands, reads memory written to by an image store.
Note
The OpenGL spec has the following invariance rule: Consider a primitive p' obtained by translating a primitive p through an offset (x, y) in window coordinates, where x and y are integers. As long as neither p' nor p is clipped, it must be the case that each fragment f' produced from p' is identical to a corresponding fragment f from p except that the center of f' is offset by (x, y) from the center of f. This rule does not apply to Vulkan and is an intentional difference from OpenGL.
When any sequence of Vulkan commands triggers shader invocations that perform image stores or atomic operations, and subsequent Vulkan commands read the memory written by those shader invocations, these operations must be explicitly synchronized.
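For example, such synchronization can be expressed with a pipeline barrier. The following sketch assumes the stores and the subsequent reads both occur in fragment shaders and that the barrier is recorded outside a render pass instance; cmdBuffer is a hypothetical, previously begun VkCommandBuffer:

```c
#include <stddef.h>
#include <vulkan/vulkan.h>

/* Record a global memory barrier making image-store writes from earlier
 * fragment shader work available and visible to shader reads in later
 * commands. */
static void syncImageStoresToShaderReads(VkCommandBuffer cmdBuffer)
{
    VkMemoryBarrier memoryBarrier = {
        .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT,
        .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
    };

    vkCmdPipelineBarrier(
        cmdBuffer,
        VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,  /* stages that performed the stores */
        VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,  /* stages that will read the data */
        0,                                      /* no dependency flags */
        1, &memoryBarrier,                      /* one global memory barrier */
        0, NULL,                                /* no buffer memory barriers */
        0, NULL);                               /* no image memory barriers */
}
```

Within a render pass instance, an equivalent dependency would instead be expressed through a subpass dependency or subpass self-dependency.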
Tessellation Invariance
When using a pipeline containing tessellation evaluation shaders, the fixed-function tessellation primitive generator consumes the input patch specified by an application and emits a new set of primitives. The following invariance rules are intended to provide repeatability guarantees. Additionally, they are intended to allow an application with a carefully crafted tessellation evaluation shader to ensure that the sets of triangles generated for two adjacent patches have identical vertices along shared patch edges, avoiding “cracks” caused by minor differences in the positions of vertices along shared edges.
Rule 1 When processing two patches with identical outer and inner tessellation levels, the tessellation primitive generator will emit an identical set of point, line, or triangle primitives as long as the pipeline used to process the patch primitives has tessellation evaluation shaders specifying the same tessellation mode, spacing, vertex order, and point mode decorations. Two sets of primitives are considered identical if and only if they contain the same number and type of primitives and the generated tessellation coordinates for the vertex numbered m of the primitive numbered n are identical for all values of m and n.
Rule 2 The set of vertices generated along the outer edge of the subdivided primitive in triangle and quad tessellation, and the tessellation coordinates of each, depends only on the corresponding outer tessellation level and the spacing decorations in the tessellation shaders of the pipeline.
Rule 3 The set of vertices generated when subdividing any outer primitive edge is always symmetric. For triangle tessellation, if the subdivision generates a vertex with tessellation coordinates of the form (0, x, 1-x), (x, 0, 1-x), or (x, 1-x, 0), it will also generate a vertex with coordinates of exactly (0, 1-x, x), (1-x, 0, x), or (1-x, x, 0), respectively. For quad tessellation, if the subdivision generates a vertex with coordinates of (x, 0) or (0, x), it will also generate a vertex with coordinates of exactly (1-x, 0) or (0, 1-x), respectively. For isoline tessellation, if it generates vertices at (0, x) and (1, x) where x is not zero, it will also generate vertices at exactly (0, 1-x) and (1, 1-x), respectively.
Rule 4 The set of vertices generated when subdividing outer edges in triangular and quad tessellation must be independent of the specific edge subdivided, given identical outer tessellation levels and spacing. For example, if vertices at (x, 1 - x, 0) and (1-x, x, 0) are generated when subdividing the w = 0 edge in triangular tessellation, vertices must be generated at (x, 0, 1-x) and (1-x, 0, x) when subdividing an otherwise identical v = 0 edge. For quad tessellation, if vertices at (x, 0) and (1-x, 0) are generated when subdividing the v = 0 edge, vertices must be generated at (0, x) and (0, 1-x) when subdividing an otherwise identical u = 0 edge.
Rule 5 When processing two patches that are identical in all respects enumerated in rule 1 except for vertex order, the set of triangles generated for triangle and quad tessellation must be identical except for vertex and triangle order. For each triangle n1 produced by processing the first patch, there must be a triangle n2 produced when processing the second patch each of whose vertices has the same tessellation coordinates as one of the vertices in n1.
Rule 6 When processing two patches that are identical in all respects enumerated in rule 1 other than matching outer tessellation levels and/or vertex order, the set of interior triangles generated for triangle and quad tessellation must be identical in all respects except for vertex and triangle order. For each interior triangle n1 produced by processing the first patch, there must be a triangle n2 produced when processing the second patch each of whose vertices has the same tessellation coordinates as one of the vertices in n1. A triangle produced by the tessellator is considered an interior triangle if none of its vertices lie on an outer edge of the subdivided primitive.
Rule 7 For quad and triangle tessellation, the set of triangles connecting an inner and outer edge depends only on the inner and outer tessellation levels corresponding to that edge and the spacing decorations.
Rule 8 The value of all defined components of TessCoord will be in the range [0, 1]. Additionally, for any defined component x of TessCoord, the results of computing 1.0-x in a tessellation evaluation shader will be exact. If any floating-point values in the range [0, 1] fail to satisfy this property, such values must not be used as tessellation coordinate components.
Glossary
The terms defined in this section are used consistently throughout this Specification and may be used with or without capitalization.
- Accessible (Descriptor Binding)
-
A descriptor binding is accessible to a shader stage if that stage is included in the stageFlags of the descriptor binding. Descriptors using that binding can only be used by stages in which they are accessible.
- Acquire Operation (Resource)
-
An operation that acquires ownership of an image subresource or buffer range.
- Active (Transform Feedback)
-
Transform feedback is made active after vkCmdBeginTransformFeedbackEXT executes and remains active until vkCmdEndTransformFeedbackEXT executes. While transform feedback is active, data written to variables in the output interface of the last vertex processing stage of the graphics pipeline are captured to the bound transform feedback buffers if those variables are decorated for transform feedback.
- Adjacent Vertex
-
A vertex in an adjacency primitive topology that is not part of a given primitive, but is accessible in geometry shaders.
- Advanced Blend Operation
-
Blending performed using one of the blend operation enums introduced by the VK_EXT_blend_operation_advanced extension. See Advanced Blending Operations.
- Aliased Range (Memory)
-
A range of a device memory allocation that is bound to multiple resources simultaneously.
- Allocation Scope
-
An association of a host memory allocation to a parent object or command, where the allocation’s lifetime ends before or at the same time as the parent object is freed or destroyed, or during the parent command.
- Aspect (Image)
-
An image may contain multiple kinds, or aspects, of data for each pixel, where each aspect is used in a particular way by the pipeline and may be stored differently or separately from other aspects. For example, the color components of an image format make up the color aspect of the image, and may be used as a framebuffer color attachment. Some operations, like depth testing, operate only on specific aspects of an image. Other operations, like image/buffer copies, only operate on one aspect at a time.
- Attachment (Render Pass)
-
A zero-based integer index name used in render pass creation to refer to a framebuffer attachment that is accessed by one or more subpasses. The index also refers to an attachment description which includes information about the properties of the image view that will later be attached.
- Availability Operation
-
An operation that causes the values generated by specified memory write accesses to become available for future access.
- Available
-
A state of values written to memory that allows them to be made visible.
- Axis-aligned Bounding Box
-
A box bounding a region in space defined by extents along each axis and thus representing a box where each edge is aligned to one of the major axes.
- Back-Facing
-
See Facingness.
- Batch
-
A single structure submitted to a queue as part of a queue submission command, describing a set of queue operations to execute.
- Backwards Compatibility
-
A given version of the API is backwards compatible with an earlier version if an application, relying only on valid behavior and functionality defined by the earlier specification, is able to correctly run against each version without any modification. This assumes no active attempt by that application to not run when it detects a different version.
- Full Compatibility
-
A given version of the API is fully compatible with another version if an application, relying only on valid behavior and functionality defined by either of those specifications, is able to correctly run against each version without any modification. This assumes no active attempt by that application to not run when it detects a different version.
- Binding (Memory)
-
An association established between a range of a resource object and a range of a memory object. These associations determine the memory locations affected by operations performed on elements of a resource object. Memory bindings are established using the vkBindBufferMemory command for non-sparse buffer objects, using the vkBindImageMemory command for non-sparse image objects, and using the vkQueueBindSparse command for sparse resources.
- Blend Constant
-
Four floating point (RGBA) values used as an input to blending.
- Blending
-
Arithmetic operations between a fragment color value and a value in a color attachment that produce a final color value to be written to the attachment.
- Buffer
-
A resource that represents a linear array of data in device memory. Represented by a VkBuffer object.
- Buffer View
-
An object that represents a range of a specific buffer, and state that controls how the contents are interpreted. Represented by a VkBufferView object.
- Built-In Variable
-
A variable decorated in a shader, where the decoration makes the variable take values provided by the execution environment or values that are generated by fixed-function pipeline stages.
- Built-In Interface Block
-
A block defined in a shader that contains only variables decorated with built-in decorations, and is used to match against other shader stages.
- Clip Coordinates
-
The homogeneous coordinate space that vertex positions (Position decoration) are written in by vertex processing stages.
- Clip Distance
-
A built-in output from vertex processing stages that defines a clip half-space against which the primitive is clipped.
- Clip Volume
-
The intersection of the view volume with all clip half-spaces.
- Color Attachment
-
A subpass attachment point, or image view, that is the target of fragment color outputs and blending.
- Color Fragment
-
A unique color value within a pixel of a multisampled color image. The fragment mask will contain indices to the color fragment.
- Color Renderable Format
-
A VkFormat where VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT is set in one of the following, depending on the image’s tiling:
-
VkFormatProperties::linearTilingFeatures
-
VkFormatProperties::optimalTilingFeatures
-
VkDrmFormatModifierPropertiesEXT::drmFormatModifierTilingFeatures
- Color Sample Mask
-
A bitfield associated with a fragment, with one bit for each sample in the color attachment(s). Samples are considered to be covered based on the result of the Coverage Reduction stage. Uncovered samples do not write to color attachments.
- Combined Image Sampler
-
A descriptor type that includes both a sampled image and a sampler.
- Command Buffer
-
An object that records commands to be submitted to a queue. Represented by a VkCommandBuffer object.
- Command Pool
-
An object that command buffer memory is allocated from, and that owns that memory. Command pools aid multithreaded performance by enabling different threads to use different allocators, without internal synchronization on each use. Represented by a VkCommandPool object.
- Compatible Allocator
-
When allocators are compatible, allocations from each allocator can be freed by the other allocator.
- Compatible Image Formats
-
When formats are compatible, images created with one of the formats can have image views created from it using any of the compatible formats. Also see Size-Compatible Image Formats.
- Compatible Queues
-
Queues within a queue family. Compatible queues have identical properties.
- Complete Mipmap Chain
-
The entire set of miplevels that can be provided for an image, from the largest application specified miplevel size down to the minimum miplevel size. See Image Miplevel Sizing.
- Component (Format)
-
A distinct part of a format. Depth, stencil, and color channels (e.g. R, G, B, A) are all separate components.
- Compressed Texel Block
-
An element of an image having a block-compressed format, comprising a rectangular block of texel values that are encoded as a single value in memory. Compressed texel blocks of a particular block-compressed format have a corresponding width, height, and depth that define the dimensions of these elements in units of texels, and a size in bytes of the encoding in memory.
- Corner-Sampled Image
-
A VkImage where unnormalized texel coordinates are centered on integer values instead of half-integer values. Specified by setting the VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV bit on VkImageCreateInfo::flags at image creation.
- Coverage
-
A bitfield associated with a fragment, where each bit is associated to a rasterization sample. Samples are initially considered to be covered based on the result of rasterization, and then coverage can subsequently be turned on or off by other fragment operations or the fragment shader. Uncovered samples do not write to framebuffer attachments.
- Cull Distance
-
A built-in output from vertex processing stages that defines a cull half-space where the primitive is rejected if all vertices have a negative value for the same cull distance.
- Cull Volume
-
The intersection of the view volume with all cull half-spaces.
- Decoration (SPIR-V)
-
Auxiliary information such as built-in variables, stream numbers, invariance, interpolation type, relaxed precision, etc., added to variables or structure-type members through decorations.
- Deprecated
-
A feature is deprecated if it is no longer recommended as the correct or best way to achieve its intended purpose. Generally a newer feature will have been created that solves the same problem - in cases where no newer alternative feature exists, justification should be provided.
- Depth/Stencil Attachment
-
A subpass attachment point, or image view, that is the target of depth and/or stencil test operations and writes.
- Depth/Stencil Format
-
A VkFormat that includes depth and/or stencil components.
- Depth/Stencil Image (or ImageView)
-
A VkImage (or VkImageView) with a depth/stencil format.
- Derivative Group
-
A set of fragment or compute shader invocations that cooperate to compute derivatives, including implicit derivatives for sampled image operations.
- Descriptor
-
Information about a resource or resource view written into a descriptor set that is used to access the resource or view from a shader.
- Descriptor Binding
-
An entry in a descriptor set layout corresponding to zero or more descriptors of a single descriptor type in a set. Defined by a VkDescriptorSetLayoutBinding structure.
- Descriptor Pool
-
An object that descriptor sets are allocated from, and that owns the storage of those descriptor sets. Descriptor pools aid multithreaded performance by enabling different threads to use different allocators, without internal synchronization on each use. Represented by a VkDescriptorPool object.
- Descriptor Set
-
An object that resource descriptors are written into via the API, and that can be bound to a command buffer such that the descriptors contained within it can be accessed from shaders. Represented by a VkDescriptorSet object.
- Descriptor Set Layout
-
An object that defines the set of resources (types and counts) and their relative arrangement (in the binding namespace) within a descriptor set. Used when allocating descriptor sets and when creating pipeline layouts. Represented by a VkDescriptorSetLayout object.
- Device
-
The processor(s) and execution environment that perform tasks requested by the application via the Vulkan API.
- Device Group
-
A set of physical devices that support accessing each other’s memory and recording a single command buffer that can be executed on all the physical devices.
- Device Index
-
A zero-based integer that identifies one physical device from a logical device. A device index is valid if it is less than the number of physical devices in the logical device.
- Device Mask
-
A bitmask where each bit represents one device index. A device mask value is valid if every bit that is set in the mask is at a bit position that is less than the number of physical devices in the logical device.
- Device Memory
-
Memory accessible to the device. Represented by a VkDeviceMemory object.
- Device-Level Command
-
Any command that is dispatched from a logical device, or from a child object of a logical device.
- Device-Level Functionality
-
All device-level commands and objects, and their structures, enumerated types, and enumerants.
- Device-Level Object
-
Logical device objects and their child objects. For example, VkDevice, VkQueue, and VkCommandBuffer objects are device-level objects.
- Device-Local Memory
-
Memory that is connected to the device, and may be more performant for device access than host-local memory.
- Direct Drawing Commands
-
Drawing commands that take all their parameters as direct arguments to the command (and not sourced via structures in buffer memory as the indirect drawing commands). Includes vkCmdDrawMeshTasksNV, vkCmdDraw, and vkCmdDrawIndexed.
- Disjoint
-
Disjoint planes are image planes to which memory is bound independently.
A disjoint image consists of multiple disjoint planes, and is created with the VK_IMAGE_CREATE_DISJOINT_BIT bit set.
- Dispatchable Handle
-
A handle of a pointer handle type which may be used by layers as part of intercepting API commands. The first argument to each Vulkan command is a dispatchable handle type.
- Dispatching Commands
-
Commands that provoke work using a compute pipeline. Includes vkCmdDispatch and vkCmdDispatchIndirect.
- Drawing Commands
-
Commands that provoke work using a graphics pipeline. Includes vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR, vkCmdDrawIndirectCountAMD, vkCmdDrawIndexedIndirectCountAMD, vkCmdDrawMeshTasksNV, vkCmdDrawMeshTasksIndirectNV, vkCmdDrawMeshTasksIndirectCountNV, vkCmdDrawIndirect, and vkCmdDrawIndexedIndirect.
- Duration (Command)
-
The duration of a Vulkan command refers to the interval between calling the command and its return to the caller.
- Dynamic Storage Buffer
-
A storage buffer whose offset is specified each time the storage buffer is bound to a command buffer via a descriptor set.
- Dynamic Uniform Buffer
-
A uniform buffer whose offset is specified each time the uniform buffer is bound to a command buffer via a descriptor set.
- Dynamically Uniform
-
See Dynamically Uniform in section 2.2 “Terms” of the Khronos SPIR-V Specification.
- Element
-
Arrays are composed of multiple elements, where each element exists at a unique index within that array. Used primarily to describe data passed to or returned from the Vulkan API.
- Explicitly-Enabled Layer
-
A layer enabled by the application by adding it to the enabled layer list in vkCreateInstance or vkCreateDevice.
- Event
-
A synchronization primitive that is signaled when execution of previous commands complete through a specified set of pipeline stages. Events can be waited on by the device and polled by the host. Represented by a VkEvent object.
- Executable State (Command Buffer)
-
A command buffer that has ended recording commands and can be executed. See also Initial State and Recording State.
- Execution Dependency
-
A dependency that guarantees that certain pipeline stages’ work for a first set of commands has completed execution before certain pipeline stages’ work for a second set of commands begins execution. This is accomplished via pipeline barriers, subpass dependencies, events, or implicit ordering operations.
- Execution Dependency Chain
-
A sequence of execution dependencies that transitively act as a single execution dependency.
- Explicit chroma reconstruction
-
An implementation of sampler Y’CBCR conversion which reconstructs reduced-resolution chroma samples to luma resolution and then separately performs texture sample interpolation. This is distinct from an implicit implementation, which incorporates chroma sample reconstruction into texture sample interpolation.
- Extension Scope
-
The set of objects and commands that can be affected by an extension. Extensions are either device scope or instance scope.
- External Handle
-
A resource handle which has meaning outside of a specific Vulkan device or its parent instance. External handles may be used to share resources between multiple Vulkan devices in different instances, or between Vulkan and other APIs. Some external handle types correspond to platform-defined handles, in which case the resource may outlive any particular Vulkan device or instance and may be transferred between processes, or otherwise manipulated via functionality defined by the platform for that handle type.
- External synchronization
-
A type of synchronization required of the application, where parameters defined to be externally synchronized must not be used simultaneously in multiple threads.
- Facingness (Polygon)
-
A classification of a polygon as either front-facing or back-facing, depending on the orientation (winding order) of its vertices.
- Facingness (Fragment)
-
A fragment is either front-facing or back-facing, depending on the primitive it was generated from. If the primitive was a polygon (regardless of polygon mode), the fragment inherits the facingness of the polygon. All other fragments are front-facing.
- Fence
-
A synchronization primitive that is signaled when a set of batches or sparse binding operations complete execution on a queue. Fences can be waited on by the host. Represented by a VkFence object.
- Flat Shading
-
A property of a vertex attribute that causes the value from a single vertex (the provoking vertex) to be used for all vertices in a primitive, and for interpolation of that attribute to return that single value unaltered.
- Fragment
-
A rectangular framebuffer region with associated data produced by rasterization and processed by fragment operations including the fragment shader.
- Fragment Area
-
The width and height, in pixels, of a fragment.
- Fragment Density
-
The ratio of fragments per framebuffer area in the x and y direction.
- Fragment Density Texel Size
-
The (w,h) framebuffer region in pixels that each texel in a fragment density map applies to.
- Fragment Input Attachment Interface
-
Variables with UniformConstant storage class and a decoration of InputAttachmentIndex that are statically used by a fragment shader’s entry point, which receive values from input attachments.
- Fragment Mask
-
A lookup table that associates color samples with color fragment values.
- Fragment Output Interface
-
A fragment shader entry point’s variables with Output storage class, which output to color and/or depth/stencil attachments.
- Framebuffer
-
A collection of image views and a set of dimensions that, in conjunction with a render pass, define the inputs and outputs used by drawing commands. Represented by a VkFramebuffer object.
- Framebuffer Attachment
-
One of the image views used in a framebuffer.
- Framebuffer Coordinates
-
A coordinate system in which adjacent pixels’ coordinates differ by 1 in x and/or y, with (0,0) in the upper left corner and pixel centers at half-integers.
- Framebuffer-Space
-
Operating with respect to framebuffer coordinates.
- Framebuffer-Local
-
A framebuffer-local dependency guarantees that only for a single framebuffer region, the first set of operations happens-before the second set of operations.
- Framebuffer-Global
-
A framebuffer-global dependency guarantees that for all framebuffer regions, the first set of operations happens-before the second set of operations.
- Framebuffer Region
-
A framebuffer region is a set of sample (x, y, layer, sample) coordinates that is a subset of the entire framebuffer.
- Front-Facing
-
See Facingness.
- Global Workgroup
-
A collection of local workgroups dispatched by a single dispatch command. In addition to the compute dispatch, a single mesh task draw command can also generate such a collection.
- Handle
-
An opaque integer or pointer value used to refer to a Vulkan object. Each object type has a unique handle type.
- Happen-after
-
A transitive, irreflexive and antisymmetric ordering relation between operations. An execution dependency with a source of A and a destination of B enforces that B happens-after A. The inverse relation of happens-before.
- Happen-before
-
A transitive, irreflexive and antisymmetric ordering relation between operations. An execution dependency with a source of A and a destination of B enforces that A happens-before B. The inverse relation of happens-after.
- Helper Invocation
-
A fragment shader invocation that is created solely for the purposes of evaluating derivatives for use in non-helper fragment shader invocations, and which does not have side effects.
- Host
-
The processor(s) and execution environment that the application runs on, and that the Vulkan API is exposed on.
- Host Mapped Device Memory
-
Device memory that is mapped for host access using vkMapMemory.
- Host Mapped Foreign Memory
-
Memory owned by a foreign device that is mapped for host access.
- Host Memory
-
Memory not accessible to the device, used to store implementation data structures.
- Host-Accessible Subresource
-
A buffer, or a linear image subresource in either the VK_IMAGE_LAYOUT_PREINITIALIZED or VK_IMAGE_LAYOUT_GENERAL layout. Host-accessible subresources have a well-defined addressing scheme which can be used by the host.
- Host-Local Memory
-
Memory that is not local to the device, and may be less performant for device access than device-local memory.
- Host-Visible Memory
-
Device memory that can be mapped on the host and can be read and written by the host.
- Identically Defined Objects
-
Objects of the same type where all arguments to their creation or allocation functions, with the exception of pAllocator, are:
-
Vulkan handles which refer to the same object or
-
identical scalar or enumeration values or
-
Host pointers which point to an array of values or structures which also satisfy these three constraints.
- Image
-
A resource that represents a multi-dimensional formatted interpretation of device memory. Represented by a VkImage object.
- Image Subresource
-
A specific mipmap level and layer of an image.
- Image Subresource Range
-
A set of image subresources that are contiguous mipmap levels and layers.
- Image View
-
An object that represents an image subresource range of a specific image, and state that controls how the contents are interpreted. Represented by a VkImageView object.
- Immutable Sampler
-
A sampler descriptor provided at descriptor set layout creation time, and that is used for that binding in all descriptor sets allocated from the layout, and cannot be changed.
- Implicit chroma reconstruction
-
An implementation of sampler Y’CBCR conversion which reconstructs the reduced-resolution chroma samples directly at the sample point, as part of the normal texture sampling operation. This is distinct from an explicit chroma reconstruction implementation, which reconstructs the reduced-resolution chroma samples to the resolution of the luma samples, then filters the result as part of texture sample interpolation.
- Implicitly-Enabled Layer
-
A layer enabled by a loader-defined mechanism outside the Vulkan API, rather than explicitly by the application during instance or device creation.
- Index Buffer
-
A buffer bound via vkCmdBindIndexBuffer which is the source of index values used to fetch vertex attributes for a vkCmdDrawIndexed or vkCmdDrawIndexedIndirect command.
- Indexed Drawing Commands
-
Drawing commands which use an index buffer as the source of index values used to fetch vertex attributes for a drawing command. Includes vkCmdDrawIndexed, vkCmdDrawIndexedIndirectCountKHR, vkCmdDrawIndexedIndirectCountAMD, and vkCmdDrawIndexedIndirect.
- Indirect Commands
-
Drawing or dispatching commands that source some of their parameters from structures in buffer memory. Includes vkCmdDrawIndirect, vkCmdDrawIndexedIndirect, vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR, vkCmdDrawIndirectCountAMD, vkCmdDrawIndexedIndirectCountAMD, vkCmdDrawMeshTasksIndirectNV, vkCmdDrawMeshTasksIndirectCountNV, and vkCmdDispatchIndirect.
- Indirect Commands Layout
-
A definition of a sequence of commands, that are generated on the device via vkCmdProcessCommandsNVX. Each sequence is comprised of multiple VkIndirectCommandsTokenTypeNVX, which represent a subset of traditional command buffer commands. Represented as VkIndirectCommandsLayoutNVX.
- Indirect Drawing Commands
-
Drawing commands that source some of their parameters from structures in buffer memory. Includes vkCmdDrawIndirect, vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR, vkCmdDrawIndirectCountAMD, vkCmdDrawIndexedIndirectCountAMD, vkCmdDrawMeshTasksIndirectNV, vkCmdDrawMeshTasksIndirectCountNV, and vkCmdDrawIndexedIndirect.
- Initial State (Command Buffer)
-
A command buffer that has not begun recording commands. See also Recorded State and Executable State.
- Inline Uniform Block
-
A descriptor type that represents uniform data stored directly in descriptor sets, and supports read-only access in a shader.
- Input Attachment
-
A descriptor type that represents an image view, and supports unfiltered read-only access in a shader, only at the fragment’s location in the view.
- Instance
-
The top-level Vulkan object, which represents the application’s connection to the implementation. Represented by a VkInstance object.
- Instance-Level Command
-
Any command that is dispatched from an instance, or from a child object of an instance, except for physical devices and their children.
- Instance-Level Functionality
-
All instance-level commands and objects, and their structures, enumerated types, and enumerants.
- Instance-Level Object
-
High-level Vulkan objects, which are not physical devices, nor children of physical devices. For example, VkInstance is an instance-level object.
- Instance (Memory)
-
In a logical device representing more than one physical device, some device memory allocations have the requested amount of memory allocated multiple times, once for each physical device in a device mask. Each such replicated allocation is an instance of the device memory.
- Instance (Resource)
-
In a logical device representing more than one physical device, buffer and image resources exist on all physical devices but can be bound to memory differently on each. Each such replicated resource is an instance of the resource.
- Internal Synchronization
-
A type of synchronization required of the implementation, where parameters not defined to be externally synchronized may require internal mutexing to avoid multithreaded race conditions.
- Invocation (Shader)
-
A single execution of an entry point in a SPIR-V module. For example, a single vertex’s execution of a vertex shader or a single fragment’s execution of a fragment shader.
- Invocation Group
-
A set of shader invocations that are executed in parallel and that must execute the same control flow path in order for control flow to be considered dynamically uniform.
- Linear Resource
-
A resource is linear if it is one of the following:
-
a VkBuffer
-
a VkImage created with VK_IMAGE_TILING_LINEAR
-
a VkImage created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT and whose Linux DRM format modifier is DRM_FORMAT_MOD_LINEAR
A resource is non-linear if it is one of the following:
-
a VkImage created with VK_IMAGE_TILING_OPTIMAL
-
a VkImage created with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT and whose Linux DRM format modifier is not DRM_FORMAT_MOD_LINEAR
- Linux DRM Format Modifier
-
A 64-bit, vendor-prefixed, semi-opaque unsigned integer that describes vendor-specific details of an image’s memory layout. In Linux graphics APIs, modifiers are commonly used to specify the memory layout of externally shared images. An image has a modifier if and only if it is created with tiling equal to VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT. For more details, refer to the appendix for extension VK_EXT_image_drm_format_modifier.
- Local Workgroup
-
A collection of compute shader invocations invoked by a single dispatch command, which share data via WorkgroupLocal variables and can synchronize with each other.
- Logical Device
-
An object that represents the application’s interface to the physical device. The logical device is the parent of most Vulkan objects. Represented by a VkDevice object.
- Logical Operation
-
Bitwise operations between a fragment color value and a value in a color attachment, that produce a final color value to be written to the attachment.
- Lost Device
-
A state that a logical device may be in as a result of unrecoverable implementation errors, or other exceptional conditions.
- Mappable
-
See Host-Visible Memory.
- Memory Dependency
-
A memory dependency is an execution dependency which includes availability and visibility operations such that:
-
The first set of operations happens-before the availability operation
-
The availability operation happens-before the visibility operation
-
The visibility operation happens-before the second set of operations
-
- Memory Domain
-
A memory domain is an abstract place to which memory writes are made available by availability operations and memory domain operations. The memory domains correspond to the set of agents that the write can then be made visible to. The memory domains are host, device, shader, workgroup instance (for workgroup instance there is a unique domain for each compute workgroup) and subgroup instance (for subgroup instance there is a unique domain for each subgroup).
- Memory Domain Operation
-
An operation that makes the writes that are available to one memory domain available to another memory domain.
- Memory Heap
-
A region of memory from which device memory allocations can be made.
- Memory Type
-
An index used to select a set of memory properties (e.g. mappable, cached) for a device memory allocation.
- Mesh Shading Pipeline
-
A graphics pipeline where the primitives are assembled explicitly in the shader stages, in contrast to the primitive shading pipeline, where input primitives are assembled by fixed function processing.
- Mesh Tasks Drawing Commands
-
Drawing commands which create shader invocations organized in workgroups for drawing mesh tasks. Includes vkCmdDrawMeshTasksNV, vkCmdDrawMeshTasksIndirectNV, and vkCmdDrawMeshTasksIndirectCountNV.
- Minimum Miplevel Size
-
The smallest size that is permitted for a miplevel. For conventional images this is 1x1x1. For corner-sampled images, this is 2x2x2. See Image Miplevel Sizing.
- Mip Tail Region
-
The set of mipmap levels of a sparse residency texture that are too small to fill a sparse block, and that must all be bound to memory collectively and opaquely.
- Multi-planar
-
A multi-planar format (or “planar format”) is an image format consisting of more than one plane, identifiable with a _2PLANE or _3PLANE component to the format name and listed in Formats requiring sampler Y’CBCR conversion for VK_IMAGE_ASPECT_COLOR_BIT image views. A multi-planar image (or “planar image”) is an image of a multi-planar format.
- Non-Dispatchable Handle
-
A handle of an integer handle type. Handle values may not be unique, even for two objects of the same type.
- Non-Indexed Drawing Commands
-
Drawing commands for which the vertex attributes are sourced in linear order from the vertex input attributes for a drawing command (i.e. they do not use an index buffer). Includes vkCmdDraw, vkCmdDrawIndirectCountKHR, vkCmdDrawIndirectCountAMD, and vkCmdDrawIndirect.
- Normalized
-
A value that is interpreted as being in the range [0,1] as a result of being implicitly divided by some other value.
- Normalized Device Coordinates
-
A coordinate space after perspective division is applied to clip coordinates, and before the viewport transformation converts to framebuffer coordinates.
- Object Table
-
A binding table for various resources (VkPipeline, VkBuffer, VkDescriptorSet), so that they can be referenced in device-generated command processing. Represented as VkObjectTableNVX. Entries are registered or unregistered via uint32_t indices.
- Obsoleted
-
A feature is obsolete if it can no longer be used. For core features, making one obsolete would be in violation of the compatibility rules, so must not be done. However extensions do not have these guarantees, and can be made obsolete by a newer core version or extension.
- Overlapped Range (Aliased Range)
-
The aliased range of a device memory allocation that intersects a given image subresource of an image or range of a buffer.
- Ownership (Resource)
-
If an entity (e.g. a queue family) has ownership of a resource, access to that resource is well-defined for access by that entity.
- Packed Format
-
A format whose components are stored as a single texel block in memory, with their relative locations defined within that element.
- Passthrough Geometry Shader
-
A geometry shader which uses the PassthroughNV decoration on a variable in its input interface. Output primitives in a passthrough geometry shader always have the same topology as the input primitive and are not produced by emitting vertices.
- Payload
-
Importable or exportable reference to the internal data of an object in Vulkan.
- Per-View
-
A variable that has an array of values which are output, one for each view that is being generated, such as a mesh shader output variable that uses the PerViewNV decoration.
- Peer Memory
-
An instance of memory corresponding to a different physical device than the physical device performing the memory access, in a logical device that represents multiple physical devices.
- Physical Device
-
An object that represents a single device in the system. Represented by a VkPhysicalDevice object.
- Physical-Device-Level Command
-
Any command that is dispatched from a physical device.
- Physical-Device-Level Functionality
-
All physical-device-level commands and objects, and their structures, enumerated types, and enumerants.
- Physical-Device-Level Object
-
Physical device objects. For example, VkPhysicalDevice is a physical-device-level object.
- Pipeline
-
An object that controls how graphics or compute work is executed on the device. A pipeline includes one or more shaders, as well as state controlling any non-programmable stages of the pipeline. Represented by a VkPipeline object.
- Pipeline Barrier
-
An execution and/or memory dependency recorded as an explicit command in a command buffer, that forms a dependency between the previous and subsequent commands.
- Pipeline Cache
-
An object that can be used to collect and retrieve information from pipelines as they are created, and can be populated with previously retrieved information in order to accelerate pipeline creation. Represented by a VkPipelineCache object.
- Pipeline Layout
-
An object that defines the set of resources (via a collection of descriptor set layouts) and push constants used by pipelines that are created using the layout. Used when creating a pipeline and when binding descriptor sets and setting push constant values. Represented by a VkPipelineLayout object.
- Pipeline Stage
-
A logically independent execution unit that performs some of the operations defined by an action command.
- pNext Chain
-
A set of structures chained together through their pNext members.
- Planar
-
See multi-planar.
- Plane
-
An image plane is part of the representation of an image, containing a subset of the color channels required to represent the texels in the image and with a contiguous mapping of coordinates to bound memory. Most images consist only of a single plane, but some formats spread the channels across multiple image planes. The host-accessible properties of each image plane are accessed in a linear layout using vkGetImageSubresourceLayout. If a multi-planar image is created with the VK_IMAGE_CREATE_DISJOINT_BIT bit set, the image is described as disjoint, and its planes are therefore bound to memory independently.
- Point Sampling (Rasterization)
-
A rule that determines whether a fragment sample location is covered by a polygon primitive by testing whether the sample location is in the interior of the polygon in framebuffer-space, or on the boundary of the polygon according to the tie-breaking rules.
- Presentable image
-
A VkImage object obtained from a VkSwapchainKHR used to present to a VkSurfaceKHR object.
- Preserve Attachment
-
One of a list of attachments in a subpass description that is not read or written by the subpass, but that is read or written on earlier and later subpasses and whose contents must be preserved through this subpass.
- Primary Command Buffer
-
A command buffer that can execute secondary command buffers, and can be submitted directly to a queue.
- Primitive Shading Pipeline
-
A graphics pipeline where input primitives are assembled by fixed function processing. It is the counterpart to mesh shading.
- Primitive Topology
-
State that controls how vertices are assembled into primitives, e.g. as lists of triangles, strips of lines, etc.
- Promoted
-
A feature is promoted if it is taken from an older extension and made available as part of a new core version of the API, or a newer extension that is considered to be either as widely supported or more so. A promoted feature may have minor differences from the original such as:
-
It may be renamed
-
A small number of non-intrusive parameters may have been added
-
The feature may be advertised differently by device features
-
The author ID suffixes will be changed or removed as appropriate
-
- Protected Buffer
-
A buffer to which protected device memory can be bound.
- Protected-capable Device Queue
-
A device queue to which protected command buffers can be submitted.
- Protected Command Buffer
-
A command buffer which can be submitted to a protected-capable device queue.
- Protected Device Memory
-
Device memory which can be visible to the device but must not be visible to the host.
- Protected Image
-
An image to which protected device memory can be bound.
- Provisional
-
A feature is released provisionally in order to get wider feedback on the functionality before it is finalized. Provisional features may change in ways that break backwards compatibility, and thus are not recommended for use in production applications.
- Provoking Vertex
-
The vertex in a primitive from which flat shaded attribute values are taken. This is generally the “first” vertex in the primitive, and depends on the primitive topology.
- Push Constants
-
A small bank of values writable via the API and accessible in shaders. Push constants allow the application to set values used in shaders without creating buffers or modifying and binding descriptor sets for each update.
- Push Constant Interface
-
The set of variables with PushConstant storage class that are statically used by a shader entry point, and which receive values from push constant commands.
- Push Descriptors
-
Descriptors that are written directly into a command buffer rather than into a descriptor set. Push descriptors allow the application to set descriptors used in shaders without allocating or modifying descriptor sets for each update.
- Descriptor Update Template
-
An object that specifies a mapping from descriptor update information in host memory to elements in a descriptor set, which helps enable more efficient descriptor set updates.
- Query Pool
-
An object that contains a number of query entries and their associated state and results. Represented by a VkQueryPool object.
- Queue
-
An object that executes command buffers and sparse binding operations on a device. Represented by a VkQueue object.
- Queue Family
-
A set of queues that have common properties and support the same functionality, as advertised in VkQueueFamilyProperties.
- Queue Operation
-
A unit of work to be executed by a specific queue on a device, submitted via a queue submission command. Each queue submission command details the specific queue operations that occur as a result of calling that command. Queue operations typically include work that is specific to each command, and synchronization tasks.
- Queue Submission
-
Zero or more batches and an optional fence to be signaled, passed to a command for execution on a queue. See the Devices and Queues chapter for more information.
- Recording State (Command Buffer)
-
A command buffer that is ready to record commands. See also Initial State and Executable State.
- Release Operation (Resource)
-
An operation that releases ownership of an image subresource or buffer range.
- Render Pass
-
An object that represents a set of framebuffer attachments and phases of rendering using those attachments. Represented by a VkRenderPass object.
- Render Pass Instance
-
A use of a render pass in a command buffer.
- Required Extensions
-
Extensions that must be enabled alongside extensions dependent on them (see Extension Dependencies).
- Reset (Command Buffer)
-
Resetting a command buffer discards any previously recorded commands and puts a command buffer in the initial state.
- Residency Code
-
An integer value returned by sparse image instructions, indicating whether any sparse unbound texels were accessed.
- Resolve Attachment
-
A subpass attachment point, or image view, that is the target of a multisample resolve operation from the corresponding color attachment at the end of the subpass.
- Retired Swapchain
-
A swapchain that has been used as the oldSwapchain parameter to vkCreateSwapchainKHR. Images cannot be acquired from a retired swapchain; however, images that were acquired (but not presented) before the swapchain was retired can be presented.
- Sample Shading
-
Invoking the fragment shader multiple times per fragment, with the covered samples partitioned among the invocations.
- Sampled Image
-
A descriptor type that represents an image view, and supports filtered (sampled) and unfiltered read-only access in a shader.
- Sampler
-
An object that contains state that controls how sampled image data is sampled (or filtered) when accessed in a shader. Also a descriptor type describing the object. Represented by a VkSampler object.
- Secondary Command Buffer
-
A command buffer that can be executed by a primary command buffer, and must not be submitted directly to a queue.
- Self-Dependency
-
A subpass dependency from a subpass to itself, i.e. with srcSubpass equal to dstSubpass. A self-dependency is not automatically performed during a render pass instance; rather, a subset of it can be performed via vkCmdPipelineBarrier during the subpass.
- Semaphore
-
A synchronization primitive that supports signal and wait operations, and can be used to synchronize operations within a queue or across queues. Represented by a VkSemaphore object.
- Shader
-
Instructions selected (via an entry point) from a shader module, which are executed in a shader stage.
- Shader Code
-
A stream of instructions used to describe the operation of a shader.
- Shader Module
-
A collection of shader code, potentially including several functions and entry points, that is used to create shaders in pipelines. Represented by a VkShaderModule object.
- Shader Stage
-
A stage of the graphics or compute pipeline that executes shader code.
- Shading Rate
-
The ratio of the number of fragment shader invocations generated in a fully covered framebuffer region to the size (in pixels) of that region.
- Shading Rate Image
-
An image used to establish the shading rate for a framebuffer region, where each pixel controls the shading rate for a corresponding framebuffer region.
- Shared presentable image
-
A presentable image created from a swapchain with VkPresentModeKHR set to either VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR or VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR.
- Side Effect
-
A store to memory or atomic operation on memory from a shader invocation.
- Single-plane format
-
A format that is not multi-planar.
- Size-Compatible Image Formats
-
When a compressed image format and an uncompressed image format are size-compatible, it means that the texel block size of the uncompressed format must equal the texel block size of the compressed format.
- Sparse Block
-
An element of a sparse resource that can be independently bound to memory. Sparse blocks of a particular sparse resource have a corresponding size in bytes that they use in the bound memory.
- Sparse Image Block
-
A sparse block in a sparse partially-resident image. In addition to the sparse block size in bytes, sparse image blocks have a corresponding width, height, and depth that define the dimensions of these elements in units of texels or compressed texel blocks, the latter being used in case of sparse images having a block-compressed format.
- Sparse Unbound Texel
-
A texel read from a region of a sparse texture that does not have memory bound to it.
- Static Use
-
An object in a shader is statically used by a shader entry point if any function in the entry point’s call tree contains an instruction using the object. Static use is used to constrain the set of descriptors used by a shader entry point.
- Storage Buffer
-
A descriptor type that represents a buffer, and supports reads, writes, and atomics in a shader.
- Storage Image
-
A descriptor type that represents an image view, and supports unfiltered loads, stores, and atomics in a shader.
- Storage Texel Buffer
-
A descriptor type that represents a buffer view, and supports unfiltered, formatted reads, writes, and atomics in a shader.
- Subgroup
-
A set of shader invocations that can synchronize and share data with each other efficiently. In compute shaders, the local workgroup is a superset of the subgroup.
- Subgroup Mask
-
A bitmask for all invocations in the current subgroup with one bit per invocation, starting with the least significant bit in the first vector component, continuing to the last bit (less than SubgroupSize) in the last required vector component.
- Subpass
-
A phase of rendering within a render pass, that reads and writes a subset of the attachments.
- Subpass Dependency
-
An execution and/or memory dependency between two subpasses described as part of render pass creation, and automatically performed between subpasses in a render pass instance. A subpass dependency limits the overlap of execution of the pair of subpasses, and can provide guarantees of memory coherence between accesses in the subpasses.
- Subpass Description
-
Lists of attachment indices for input attachments, color attachments, depth/stencil attachment, resolve attachments, and preserve attachments used by the subpass in a render pass.
- Subset (Self-Dependency)
-
A subset of a self-dependency is a pipeline barrier performed during the subpass of the self-dependency, and whose stage masks and access masks each contain a subset of the bits set in the identically named mask in the self-dependency.
- Texel Block
-
A single addressable element of an image with an uncompressed VkFormat, or a single compressed block of an image with a compressed VkFormat.
- Texel Block Size
-
The size (in bytes) used to store a texel block of a compressed or uncompressed image.
- Texel Coordinate System
-
One of three coordinate systems (normalized, unnormalized, integer) that define how texel coordinates are interpreted in an image or a specific mipmap level of an image.
- Uniform Texel Buffer
-
A descriptor type that represents a buffer view, and supports unfiltered, formatted, read-only access in a shader.
- Uniform Buffer
-
A descriptor type that represents a buffer, and supports read-only access in a shader.
- Units in the Last Place (ULP)
-
A measure of floating-point error loosely defined as the smallest representable step in a floating-point format near a given value. For the precise definition see Precision and Operation of SPIR-V instructions or Jean-Michel Muller, “On the definition of ulp(x)”, RR-5504, INRIA. Other sources may also use the term “unit of least precision”.
- Unnormalized
-
A value that is interpreted according to its conventional interpretation, and is not normalized.
- Unprotected Buffer
-
A buffer to which unprotected device memory can be bound.
- Unprotected Command Buffer
-
A command buffer which can be submitted to an unprotected device queue or a protected-capable device queue.
- Unprotected Device Memory
-
Device memory which can be visible to the device and can be visible to the host.
- Unprotected Image
-
An image to which unprotected device memory can be bound.
- User-Defined Variable Interface
-
A shader entry point’s variables with Input or Output storage class that are not built-in variables.
- Vertex Input Attribute
-
A graphics pipeline resource that produces input values for the vertex shader by reading data from a vertex input binding and converting it to the attribute’s format.
- Vertex Stream
-
A vertex stream is where the last vertex processing stage outputs vertex data, which then goes to the rasterizer, is captured to a transform feedback buffer, or both. Geometry shaders can emit primitives to multiple independent vertex streams. Each vertex emitted by the geometry shader is directed at one of the vertex streams.
- Validation Cache
-
An object that can be used to collect and retrieve validation results from the validation layers, and can be populated with previously retrieved results in order to accelerate the validation process. Represented by a VkValidationCacheEXT object.
- Vertex Input Binding
-
A graphics pipeline resource that is bound to a buffer and includes state that affects addressing calculations within that buffer.
- Vertex Input Interface
-
A vertex shader entry point’s variables with Input storage class, which receive values from vertex input attributes.
- Vertex Processing Stages
-
A set of shader stages that comprises the vertex shader, tessellation control shader, tessellation evaluation shader, and geometry shader stages. The task and mesh shader stages also belong to this group.
- View Mask
-
When multiview is enabled, a view mask is a property of a subpass controlling which views the rendering commands are broadcast to.
- View Volume
-
A subspace in homogeneous coordinates, corresponding to post-projection x and y values between -1 and +1, and z values between 0 and +1.
- Viewport Transformation
-
A transformation from normalized device coordinates to framebuffer coordinates, based on a viewport rectangle and depth range. An informative sketch of this transformation follows the glossary.
- Visibility Operation
-
An operation that causes available values to become visible to specified memory accesses.
- Visible
-
A state of values written to memory that allows them to be accessed by a set of operations.
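The following informative sketches illustrate several of the entries above; they are not part of the API. First, a minimal C sketch of the Subgroup Mask layout, assuming the mask is exposed as four 32-bit unsigned integer components (as it is for the SPIR-V SubgroupEqMask family of built-ins); the helper below is hypothetical and not a Vulkan function:

    #include <stdint.h>

    /* Hypothetical helper (not part of Vulkan): the subgroup mask is assumed
     * to be four 32-bit components, with bits packed least significant bit
     * first, starting in component 0. 'id' must be less than SubgroupSize. */
    static int subgroupMaskBitIsSet(const uint32_t mask[4], uint32_t id)
    {
        uint32_t component = id / 32u;   /* which vector component holds the bit */
        uint32_t bit       = id % 32u;   /* bit position within that component   */
        return (int)((mask[component] >> bit) & 1u);
    }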
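Next, a sketch of a subpass description and a subpass dependency as used at render pass creation. The structures and enumerants are core Vulkan; the attachment index, the choice of stages and accesses, and the assumption of two subpasses (subpass 0 writing a color attachment that subpass 1 later reads as an input attachment) are arbitrary example values:

    #include <vulkan/vulkan.h>

    /* Color attachment written by subpass 0 (example values only). */
    VkAttachmentReference colorRef = {
        .attachment = 0,    /* index into VkRenderPassCreateInfo::pAttachments */
        .layout     = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
    };

    VkSubpassDescription subpass0 = {
        .pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS,
        .colorAttachmentCount = 1,
        .pColorAttachments    = &colorRef,   /* no input/resolve/preserve attachments here */
    };

    /* Execution and memory dependency: color attachment writes in subpass 0
     * are made available and visible to fragment shader input attachment
     * reads in subpass 1, limited to the same framebuffer region. */
    VkSubpassDependency dependency = {
        .srcSubpass      = 0,
        .dstSubpass      = 1,
        .srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
        .dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
        .srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
        .dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
        .dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT,
    };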
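Finally, a sketch of the viewport transformation, written against the members of VkViewport. The FbCoord type is a hypothetical helper, and (xd, yd, zd) are normalized device coordinates:

    #include <vulkan/vulkan.h>

    typedef struct { float x, y, z; } FbCoord;   /* hypothetical helper type */

    /* Maps normalized device coordinates to framebuffer coordinates using a
     * viewport rectangle and depth range, mirroring fixed-function vertex
     * post-processing. */
    static FbCoord viewportTransform(VkViewport vp, float xd, float yd, float zd)
    {
        FbCoord f;
        f.x = (vp.width  * 0.5f) * xd + (vp.x + vp.width  * 0.5f);
        f.y = (vp.height * 0.5f) * yd + (vp.y + vp.height * 0.5f);
        f.z = (vp.maxDepth - vp.minDepth) * zd + vp.minDepth;
        return f;
    }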
Common Abbreviations
Abbreviations and acronyms are sometimes used in the Specification and the API where they are considered clear and commonplace, and are defined here:
- Src
-
Source
- Dst
-
Destination
- Min
-
Minimum
- Max
-
Maximum
- Rect
-
Rectangle
- Info
-
Information
- LOD
-
Level of Detail
- ID
-
Identifier
- UUID
-
Universally Unique Identifier
- Op
-
Operation
- R
-
Red color component
- G
-
Green color component
- B
-
Blue color component
- A
-
Alpha color component
Prefixes
Prefixes are used in the API to denote specific semantic meaning of Vulkan names, or as a label to avoid name clashes, and are explained here; an informative sketch illustrating these conventions follows the list:
- VK/Vk/vk
-
Vulkan namespace
All types, commands, enumerants and defines in this specification are prefixed with these two characters.
- PFN/pfn
-
Function Pointer
Denotes that a type is a function pointer, or that a variable is of a pointer type.
- p
-
Pointer
Variable is a pointer.
- vkCmd
-
Commands that record commands in command buffers
These API commands do not result in immediate processing on the device. Instead, they record the requested action in a command buffer for execution when the command buffer is submitted to a queue.
- s
-
Structure
Used to denote the VK_STRUCTURE_TYPE* member of each structure in sType.
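As an informative illustration of these conventions (not additional API), the following C sketch assumes a device, a command buffer, and a viewport created elsewhere, and touches each prefix: a Vk-prefixed structure with its sType member, p-prefixed pointer members, vkCmd commands recorded into a command buffer, and a PFN function pointer type:

    #include <vulkan/vulkan.h>
    #include <stddef.h>

    /* Hypothetical helper: the device, command buffer and viewport are assumed
     * to have been created by the application beforehand. */
    static void recordExample(VkDevice device, VkCommandBuffer commandBuffer,
                              VkViewport viewport)
    {
        /* Vk / sType / p : a structure in the Vulkan namespace, its
         * VK_STRUCTURE_TYPE_* identifier, and pointer-typed members. */
        VkCommandBufferBeginInfo beginInfo = {
            .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
            .pNext = NULL,
            .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
            .pInheritanceInfo = NULL,
        };

        /* vkCmd : records the action into the command buffer; the device only
         * executes it when the command buffer is submitted to a queue. */
        vkBeginCommandBuffer(commandBuffer, &beginInfo);
        vkCmdSetViewport(commandBuffer, 0, 1, &viewport);
        vkEndCommandBuffer(commandBuffer);

        /* PFN/pfn : a function pointer type and a pointer variable. */
        PFN_vkCmdSetViewport pfnCmdSetViewport =
            (PFN_vkCmdSetViewport)vkGetDeviceProcAddr(device, "vkCmdSetViewport");
        (void)pfnCmdSetViewport;   /* illustration only */
    }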
Appendix H: Credits (Informative)
Vulkan 1.1 is the result of contributions from many people and companies participating in the Khronos Vulkan Working Group, as well as input from the Vulkan Advisory Panel.
Members of the Working Group, including the company that they represented at the time of their most recent contribution, are listed in the following sections. Some specific contributions made by individuals are listed together with their name.
Working Group Contributors to Vulkan 1.1 and 1.0
-
Adam Jackson, Red Hat
-
Alexander Galazin, Arm
-
Alex Bourd, Qualcomm Technologies, Inc.
-
Alon Or-bach, Samsung Electronics (WSI technical sub-group chair)
-
Andrew Garrard, Samsung Electronics (format wrangler)
-
Andrew Woloszyn, Google
-
Antoine Labour, Google
-
Bill Licea-Kane, Qualcomm Technologies, Inc.
-
Cass Everitt, Oculus VR
-
Chad Versace, Google
-
Christophe Riccio, Unity Technologies
-
Dan Baker, Oxide Games
-
Dan Ginsburg, Valve Software
-
Daniel Johnston, Intel
-
Daniel Koch, NVIDIA (Shader Interfaces; Features, Limits, and Formats)
-
Daniel Rakos, AMD
-
David Airlie, Red Hat
-
David Miller, Miller & Mattson (Vulkan reference card)
-
David Neto, Google
-
Dominik Witczak, AMD
-
Graeme Leese, Broadcom
-
Graham Sellers, AMD
-
Ian Romanick, Intel
-
James Jones, NVIDIA
-
Jan-Harald Fredriksen, Arm
-
Jan Hermes, Continental Corporation
-
Jason Ekstrand, Intel
-
Jeff Bolz, NVIDIA (extensive contributions, exhaustive review and rewrites for technical correctness)
-
Jeff Juliano, NVIDIA
-
Jesse Barker, Unity Technologies
-
Jesse Hall, Google
-
Johannes van Waveren, Oculus VR
-
John Kessenich, Google (SPIR-V and GLSL for Vulkan spec author)
-
John McDonald, Valve Software
-
Jonas Gustavsson, Samsung Electronics
-
Jon Ashburn, LunarG
-
Jon Leech, Independent (XML toolchain, normative language, release wrangler)
-
Jungwoo Kim, Samsung Electronics
-
Kathleen Mattson, Miller & Mattson (Vulkan reference card)
-
Kenneth Benzie, Codeplay Software Ltd.
-
Kerch Holt, NVIDIA (SPIR-V technical sub-group chair)
-
Kristian Kristensen, Intel
-
Mark Lobodzinski, LunarG
-
Mathias Heyer, NVIDIA
-
Mathias Schott, NVIDIA
-
Maurice Ribble, Qualcomm Technologies, Inc.
-
Michael Worcester, Imagination Technologies
-
Mika Isojarvi, Google
-
Mitch Singer, AMD
-
Neil Henning, Codeplay Software Ltd.
-
Neil Trevett, NVIDIA
-
Norbert Nopper, Independent
-
Pierre Boudier, NVIDIA
-
Pierre-Loup Griffais, Valve Software
-
Piers Daniell, NVIDIA (dynamic state, copy commands, memory types)
-
Pyry Haulos, Google (Vulkan conformance test subcommittee chair)
-
Ray Smith, Arm
-
Robert Simpson, Qualcomm Technologies, Inc.
-
Rolando Caloca Olivares, Epic Games
-
Sean Harmer, KDAB Group
-
Shannon Woods, Google
-
Slawomir Cygan, Intel
-
Slawomir Grajewski, Intel
-
Stuart Smith, Imagination Technologies
-
Timothy Lottes, AMD
-
Tobias Hector, Imagination Technologies (validity language and toolchain)
-
Tom Olson, Arm (working group chair)
-
Tony Barbour, LunarG
-
Yanjun Zhang, VeriSilicon
Working Group Contributors to Vulkan 1.1
-
Aaron Greig, Codeplay Software Ltd.
-
Aaron Hagan, AMD
-
Alan Ward, Google
-
Alejandro Piñeiro, Igalia
-
Andres Gomez, Igalia
-
Baldur Karlsson, Independent
-
Barthold Lichtenbelt, NVIDIA
-
Bas Nieuwenhuizen, Google
-
Bill Hollings, Brenwill
-
Colin Riley, AMD
-
Cort Stratton, Google
-
Courtney Goeltzenleuchter, Google
-
Dae Kim, Imagination Technologies
-
Daniel Stone, Collabora
-
David Pinedo, LunarG
-
Dejan Mircevski, Google
-
Dzmitry Malyshau, Mozilla
-
Erika Johnson, LunarG
-
Greg Fischer, LunarG
-
Hans-Kristian Arntzen, Arm
-
Iago Toral, Igalia
-
Ian Elliott, Google
-
Jeff Leger, Qualcomm Technologies, Inc.
-
Jeff Vigil, Samsung Electronics
-
Jens Owen, Google
-
Joe Davis, Samsung Electronics
-
John Zulauf, LunarG
-
Jordan Justen, Intel
-
Jörg Wagner, Arm
-
Kalle Raita, Google
-
Karen Ghavam, LunarG
-
Karl Schultz, LunarG
-
Kenneth Russell, Google
-
Kevin O’Neil, AMD
-
Lauri Ilola, Nokia
-
Lenny Komow, LunarG
-
Lionel Landwerlin, Intel
-
Maciej Jesionowski, AMD
-
Mais Alnasser, AMD
-
Marcin Rogucki, Mobica
-
Mark Callow, Independent
-
Mark Kilgard, NVIDIA
-
Markus Tavenrath, NVIDIA
-
Mark Young, LunarG
-
Matthäus Chajdas, AMD
-
Matt Netsch, Qualcomm Technologies, Inc.
-
Michael O’Hara, AMD
-
Michael Wong, Codeplay Software Ltd.
-
Mike Schuchardt, LunarG
-
Mike Weiblen, LunarG
-
Nicolai Hähnle, AMD
-
Nuno Subtil, NVIDIA
-
Patrick Cozzi, Independent
-
Petros Bantolas, Imagination Technologies
-
Ralph Potter, Codeplay Software Ltd.
-
Rob Barris, NVIDIA
-
Ruihao Zhang, Qualcomm Technologies, Inc.
-
Sorel Bosan, AMD
-
Stephen Huang, Mediatek
-
Tilmann Scheller, Samsung Electronics
-
Tomasz Bednarz, Independent
-
Victor Eruhimov, ???
-
Wolfgang Engel, ???
Working Group Contributors to Vulkan 1.0
-
Adam Śmigielski, Mobica
-
Allen Hux, Intel
-
Andrew Cox, Samsung Electronics
-
Andrew Poole, Samsung Electronics
-
Andrew Rafter, Samsung Electronics
-
Andrew Richards, Codeplay Software Ltd.
-
Aras Pranckevičius, Unity Technologies
-
Ashwin Kolhe, NVIDIA
-
Ben Bowman, Imagination Technologies
-
Benj Lipchak
-
Bill Hollings, The Brenwill Workshop
-
Brent E. Insko, Intel
-
Brian Ellis, Qualcomm Technologies, Inc.
-
Cemil Azizoglu, Canonical
-
Chang-Hyo Yu, Samsung Electronics
-
Chia-I Wu, LunarG
-
Chris Frascati, Qualcomm Technologies, Inc.
-
Cody Northrop, LunarG
-
Courtney Goeltzenleuchter, LunarG
-
Damien Leone, NVIDIA
-
David Mao, AMD
-
David Yu, Pixar
-
Frank (LingJun) Chen, Qualcomm Technologies, Inc.
-
Fred Liao, Mediatek
-
Gabe Dagani, Freescale
-
Graham Connor, Imagination Technologies
-
Hwanyong Lee, Kyungpook National University
-
Ian Elliott, LunarG
-
James Hughes, Oculus VR
-
Jeff Vigil, Qualcomm Technologies, Inc.
-
Jens Owen, LunarG
-
Jeremy Hayes, LunarG
-
Jonathan Hamilton, Imagination Technologies
-
Krzysztof Iwanicki, Samsung Electronics
-
Larry Seiler, Intel
-
Lutz Latta, Lucasfilm
-
Maria Rovatsou, Codeplay Software Ltd.
-
Mark Callow
-
Mateusz Przybylski, Intel
-
Maxim Lukyanov, Samsung Electronics
-
Michael Lentine, Google
-
Michal Pietrasiuk, Intel
-
Mike Stroyan, LunarG
-
Minyoung Son, Samsung Electronics
-
Mythri Venugopal, Samsung Electronics
-
Naveen Leekha, Google
-
Nick Penwarden, Epic Games
-
Niklas Smedberg, Unity Technologies
-
Pat Brown, NVIDIA
-
Patrick Doane, Blizzard Entertainment
-
Peter Lohrmann, Valve Software
-
Piotr Bialecki, Intel
-
Prabindh Sundareson, Samsung Electronics
-
Rob Stepinski, Transgaming
-
Roy Ju, Mediatek
-
Rufus Hamade, Imagination Technologies
-
Sean Ellis, Arm
-
Stefanus Du Toit, Google
-
Steve Hill, Broadcom
-
Steve Viggers, Core Avionics & Industrial Inc.
-
Tim Foley, Intel
-
Timo Suoranta, AMD
-
Tobin Ehlis, LunarG
-
Tomasz Kubale, Intel
-
Wayne Lister, Imagination Technologies
Other Credits
The Vulkan Advisory Panel members provided important real-world usage information and advice that helped guide design decisions.
The wider Vulkan community has provided useful feedback, questions and spec changes via GitHub that have helped improve the quality of the Specification.
Administrative support to the Working Group for Vulkan 1.1 was provided by Khronos staff including Angela Cheng, Ann Thorsnes, Emily Stearns, Liz Maitral, and Dominic Agoro-Ombaka; and by Alex Crabb of Caster Communications.
Administrative support for Vulkan 1.0 was provided by Andrew Riegel, Elizabeth Riegel, Glenn Fredericks, Kathleen Mattson and Michelle Clark of Gold Standard Group.
Technical support was provided by James Riordon, webmaster of Khronos.org and OpenGL.org.