39. Video Decode and Encode Operations
Vulkan implementations can expose video decode and encode engines, which are independent from the graphics and compute engines. Video decode and encode is performed by recording video operations and submitting them to video decode and encode queues. Vulkan provides core support for video decode and encode and can support a variety of video codecs through individual extensions built on the core video support.
The subsections below detail the fundamental components and operation of Vulkan video.
39.1. Technical Terminology and Semantics
39.1.1. Video Picture Resources
Video Picture Resources contain format information, can be multidimensional and may have associated metadata. The metadata can include implementation-private details required for the decode or encode operations and application managed color-space related information.
In Vulkan, a Video Picture Resource is represented by a VkImage. The VkImageView, representing the VkImage, is used with the decode operations as Output and Decoded Picture Buffer (DPB), and with the encode operation as Input and Reconstructed Video Picture Resource.
39.1.2. Reference Picture
A Video Reference Picture is a Video Picture Resource that can be used in the video decode or encode process to provide predictions of the values of samples in subsequently decoded or encoded pictures.
39.1.3. Decoded Output Picture
The pixels resulting from the video decoding process are stored in a Decoded Output Picture, represented by a VkImageView. This can be shared with the Encoder Reconstructed or Decoder DPB Video Picture Resources. It can also be used as an input for Video Encode, Graphics, Compute processing, or WSI presentation.
39.1.4. Input Picture to Encode
The primary source of input pixels for the video encoding process is the Input Picture to Encode, represented by a VkImageView. This can be shared with the Encoder Reconstructed or Decoder DPB Video Picture Resources. It can be a direct target of Video Decode, Graphics, Compute processing, or WSI presentation.
39.1.5. Decoded Picture Buffer (DPB)
Previously decoded pictures are used by video codecs to provide predictions of the values of samples in the subsequently decoded pictures. At the decoder, such Video Picture Resources are stored in a Decoded Picture Buffer (DPB) as an indexed set of Reference Pictures.
39.1.6. Reconstructed Pictures
An integral part of the video decoding pipeline is the reconstruction of pictures from the compressed stream. A similar stage exists in the video encoding pipeline as well. Such reconstructed pictures may be used as Reference Pictures for subsequently decoded or encoded pictures. The correct use of such Reference Pictures is driven by the video compression standard, the implementation, and the application-specific use cases.
This specification refers to the collection of the Decoded Picture Buffer and Reconstructed Pictures as the Decoded Picture Buffer (DPB) Set, or simply the DPB.
39.1.7. Decoded Picture Buffer (DPB) Slot
A Decoded Picture Buffer (DPB) Slot represents a single or multi-layer indexed Reference Picture’s entry within the Video Session’s DPB Set. Valid DPB Slot indices range from zero to N - 1, where N is the number of Reference Picture entries requested for a Video Session.
39.1.8. Reference Picture Metadata
The opaque DPB Slot state managed by the implementation may contain Reference Picture Metadata, present when the picture resource associated with the DPB Slot is used as a reference picture in one or more video decode or encode operations.
An implementation or application may have other Picture Metadata related to the Video Picture Resource or the DPB Slot, but such data is outside the scope of this specification.
Note:
The video decode or encode implementation does not maintain internal references to the Reference Pictures, beyond the Reference Picture Metadata. It is the responsibility of the Vulkan Application to create, manage, and destroy, as well as to provide those Video Picture Resources, when required, during the decoding or encoding process.
39.1.9. Color Space Metadata
Color Space Metadata is the additional static or dynamic state associated with a Video Picture Resource specifying the color volume (the color primaries, white point, and luminance range) of the display that was used in mastering the video content. The use of Color Space Metadata is outside the scope of the current version of the video core specification.
39.2. Introduction
This chapter discusses extensions supporting Video Decode or Encode operations.
Video Decode and Encode operations are supported by queues with an advertised queue capability of VK_QUEUE_VIDEO_DECODE_BIT_KHR and VK_QUEUE_VIDEO_ENCODE_BIT_KHR, respectively.
Video Decode or Encode queue operation support allows Vulkan applications to combine video operations closely and efficiently with other graphics or compute operations, improving overall application performance.
39.2.1. Video Decode Queue
VK_KHR_video_decode_queue adds a video decode queue type bit VK_QUEUE_VIDEO_DECODE_BIT_KHR to VkQueueFlagBits.
As in the case of other queue types, an application must use vkGetPhysicalDeviceQueueFamilyProperties to query whether the physical device has support for the Video Decode Queue.
When the implementation reports the VK_QUEUE_VIDEO_DECODE_BIT_KHR bit for a queue family, it advertises general support for Vulkan queue operations described in Devices and Queues.
39.2.2. Video Encode Queue
VK_KHR_video_encode_queue adds a video encode queue type bit VK_QUEUE_VIDEO_ENCODE_BIT_KHR to VkQueueFlagBits.
As in the case of other queue types, an application must use vkGetPhysicalDeviceQueueFamilyProperties to query whether the physical device has support for the Video Encode Queue.
When the implementation reports the VK_QUEUE_VIDEO_ENCODE_BIT_KHR bit for a queue family, it advertises general support for Vulkan queue operations described in Devices and Queues.
The rest of the chapter focuses, specifically, on Video Decode and Encode queue operations.
39.2.3. Video Session
Before performing any video decoding or encoding operations, the application must create a Video Session instance, of type VkVideoSessionKHR. A Video Session instance is an immutable object and supports a single compression standard (for example, H.264, H.265, VP9, or AV1). The implementation uses the VkVideoSessionKHR object to maintain the video state for the video decode or video encode operation. A Video Session instance is created specifically:
-
For a particular video compression standard;
-
For video decoding or video encoding;
-
With maximum supported decoded or encoded picture width/height;
-
With the maximum number of supported DPB or Reconstructed Pictures slots that can be allocated;
-
With the maximum number of Reference Pictures that can be used simultaneously for video decode or encode operations;
-
With a codec color and features profile;
-
With a Color Space format description (not supported with this version of the specification).
VkVideoSessionKHR represents a single video decode or encode stream. For each concurrently used stream, a separate instance of VkVideoSessionKHR is required. After the application has finished with the processing of a stream, it can reuse the Video Session instance for another, provided that the configuration parameters between the two usages are compatible (as determined by the video compression standard in use). Once the VkVideoSessionKHR instance has been created, the video compression standard and profiles, Input / Output / DPB formats, and the settings like the maximum extent cannot be changed.
The values of the following VkVideoSessionKHR parameters can be updated each frame, subject to the restrictions imposed on parameter updates by the video compression standard in use:
-
decoded or encoded picture size
-
number of active DPB or Reconstructed Picture slots
-
number of Reference Pictures in use
-
color space and color space metadata.
The updated parameters must not exceed the maximum limits specified when creating the VkVideoSessionKHR instance.
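The limit check described above can be sketched on the application side as follows. This is a minimal illustration only: the SessionLimits and FrameParams structures and the frame_params_within_limits helper are hypothetical names introduced here, not part of the Vulkan API, and an application would fill them from the values it passed at session creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the maximums fixed when the session was created. */
typedef struct SessionLimits {
    uint32_t max_width;
    uint32_t max_height;
    uint32_t max_dpb_slots;          /* maximum DPB or Reconstructed slots */
    uint32_t max_active_references;  /* maximum references per operation */
} SessionLimits;

/* Hypothetical per-frame values the application wants to use. */
typedef struct FrameParams {
    uint32_t width;
    uint32_t height;
    uint32_t active_dpb_slots;
    uint32_t references_in_use;
} FrameParams;

/* Returns true when every per-frame value stays within the session maximums. */
static bool frame_params_within_limits(const SessionLimits *limits,
                                       const FrameParams *frame)
{
    assert(limits != NULL && frame != NULL);
    return frame->width <= limits->max_width &&
           frame->height <= limits->max_height &&
           frame->active_dpb_slots <= limits->max_dpb_slots &&
           frame->references_in_use <= limits->max_active_references;
}
```

An application might run such a check whenever the stream signals a parameter change, and create a new Video Session when the check fails.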
39.2.4. Video Session Device Memory Heaps
After creating a Video Session instance, and before the object can be used for any of the decode or encode operations, the application must allocate and bind device memory resources to the Video Session object. An implementation may require one or more device memory heaps of different memory types, as reported by the vkGetVideoSessionMemoryRequirementsKHR function, to be bound with the vkBindVideoSessionMemoryKHR function to the Video Session. For more information about the Video Session Device Memory, please refer to the Binding the Session Object Device Memory section, below.
39.2.5. Video Session Parameters
Many codec standards require parameters that are in use for the entire video stream. For example, H.264/AVC and HEVC standards require sequence and picture parameter sets (SPS and PPS) that apply to multiple Video Decode and Encode frames, layers, and sub-layers. Vulkan Video uses Video Session Parameters objects to store such standard parameters. The application creates one or more Video Session Parameters Objects against a Video Session, with a set of common Video Parameters that are required for the processing of the video content. During the object creation, the implementation stores the parameters to the created instance. During command buffer recording, it is the responsibility of the application to provide the Video Session Parameters object containing the parameters that are necessary for processing the portion of the stream under consideration.
39.2.6. Video Picture Subresources
For Video Picture Resources, an application has the option to use single or multi-layer images for image views.
The layer to be used during decode or encode operations can be specified when the image view is being created with the VkImageSubresourceRange::baseArrayLayer parameter, and/or within the resource binding operations in the command buffer by using the VkVideoPictureResourceKHR::baseArrayLayer parameter.
Note:
Both Video Decode and Encode operations only work with a single layer at a time.
The image views representing the Input / Output / DPB Video Picture Resources could have been created with sizes bigger than the coded size that is used with Video Decode and Encode operations.
This allows for the same Video Picture Resources to be reused when there is a change in the input video content resolution.
The effective coded size of the Video Picture Resources used for Video Decode and Encode operations is provided with the VkVideoPictureResourceKHR::extent parameter of each resource in use.
Note:
Many codec standards require the coded and Video Picture Resources' sizes to match.
Video Session DPB and Reconstructed Video Picture Resources
The video compression standard chosen may require the use of Reference Pictures. In Vulkan Video, like any other Video Picture Resources, the Reference Pictures are represented with Image Views.
When an application requires Reference Picture Resources, it creates and then associates image views, representing these resources, with Video Session DPB or Reconstructed slots while recording the command buffer.
Decoded output pictures may be used as reference pictures in future video decode operations. The same pictures may be used in texture sampling operations or in the (WSI) presentation pipeline. Representing the DPB’s Video Picture Resources by image views makes it possible to accommodate all these use cases in a “zero-copy” fashion. Also, it provides more fine-grained control of the application over the efficient usage of the DPB and Reconstructed Device Memory Resources.
Video Session DPB and Reconstructed Slot Resource Management
Before Video Picture Resources can be used as Reference Picture Resources, Video Session DPB or Reconstructed Slots must be associated with those resources.
The application allocates a DPB or Reconstructed Slot and associates it with a Video Picture Resource and then sets up the resource as a target of a decode or encode operation. After successfully decoding or encoding a picture with the targeted DPB or Reconstructed Slot, in addition to the Reference Picture pixel data, the implementation may generate opaque Reference Picture Metadata for that video session Slot and its associated Video Picture Resource.
Subsequently, one or more DPB or Reconstructed video session Slots, along with their associated Video Picture Resources, can be used as Reference Picture sources for the video decode or encode operations.
If Reference Pictures are required for decoding or encoding of the video bitstream, VkVideoSessionCreateInfoKHR::maxReferencePicturesSlotsCount must be set to a value greater than 0 when the instance of the Video Session object is created.
Up to VkVideoSessionCreateInfoKHR::maxReferencePicturesSlotsCount slots can be activated with Video Picture Resources for a video session, and up to VkVideoSessionCreateInfoKHR::maxReferencePicturesActiveCount active slots can be used as DPB or Reconstructed Reference Pictures within a single decode or encode operation.
When the implementation is associating Reference Picture Metadata with the Video Picture Resources themselves, such data must be independent of the Video Session to allow for those Video Picture Resources to be shared with other Video Session instances. All of the Video Session-dependent Reference Picture Metadata must only be associated with the Video Session DPB or Reconstructed Slots.
The application, with the help of the implementation, is responsible for managing the individual DPB or Reconstructed Slots that belong to a single Video Session DPB set:
-
The application maintains the Slot allocation and per-slot Reference Picture Resources;
-
The implementation maintains global and per-slot opaque Reference Picture Metadata.
The application also manages the mapping between the codec-specific picture IDs and DPB Slots.
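One way an application could keep the mapping between codec-specific picture IDs and DPB Slots is a small fixed-size table, sketched below. The DpbSlotMap type, the dpb_map_* helpers, the MAX_DPB_SLOTS bound, and the assumption that picture IDs are non-negative integers are all hypothetical choices made for this illustration; they are not defined by the specification.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DPB_SLOTS 16      /* hypothetical upper bound on the slot count N */
#define SLOT_FREE     (-1)    /* marks a slot with no picture assigned */

/* Hypothetical application-side table: slot index -> codec picture ID. */
typedef struct DpbSlotMap {
    int32_t  picture_id[MAX_DPB_SLOTS];
    uint32_t slot_count;      /* N: valid slot indices are 0 .. N-1 */
} DpbSlotMap;

static void dpb_map_init(DpbSlotMap *map, uint32_t slot_count)
{
    assert(slot_count <= MAX_DPB_SLOTS);
    map->slot_count = slot_count;
    for (uint32_t i = 0; i < MAX_DPB_SLOTS; ++i)
        map->picture_id[i] = SLOT_FREE;
}

/* Returns the slot holding picture_id, or -1 if it is not in the DPB. */
static int32_t dpb_map_find(const DpbSlotMap *map, int32_t picture_id)
{
    for (uint32_t i = 0; i < map->slot_count; ++i)
        if (map->picture_id[i] == picture_id)
            return (int32_t)i;
    return -1;
}

/* Assigns picture_id to the first free slot; returns its index, or -1 if full. */
static int32_t dpb_map_assign(DpbSlotMap *map, int32_t picture_id)
{
    int32_t slot = dpb_map_find(map, SLOT_FREE);
    if (slot >= 0)
        map->picture_id[slot] = picture_id;
    return slot;
}

/* Marks a slot as free again, for example after the slot is deactivated. */
static void dpb_map_release(DpbSlotMap *map, uint32_t slot)
{
    assert(slot < map->slot_count);
    map->picture_id[slot] = SLOT_FREE;
}
```

The table only tracks which picture occupies which Slot index; the Reference Picture Metadata for each Slot remains opaque and is maintained by the implementation.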
When a Video Picture is decoded and is set as a Reference Picture against a Video Session DPB Slot, or is encoded and a Reconstructed Video Picture Resource is associated with a Video Session DPB Slot then:
-
The Video Picture Resource associated with the Slot is filled with the decoded or reconstructed pixel data;
-
The implementation generates the DPB Slot’s Reference Picture Metadata;
When a DPB Slot is deactivated, or a different Video Picture Resource is used with the Slot, or the content of the Video Picture Resource is modified, the Reference Picture Metadata associated with the DPB Slot gets invalidated by the implementation. Subsequent attempts to use such an invalidated DPB Slot as a Reference source would produce undefined results.
Video Session DPB Slot subresources
The coded width and height of a DPB Reference Picture can change dynamically via VkVideoPictureResourceKHR::extent and the picture parameters from the codec-specific extensions.
When a DPB Slot is activated as a Reference Picture and a decode or encode operation is performed against that slot, the implementation can record the coded extent in the corresponding DPB Slot’s metadata state.
Subsequently, when the Reference Pictures are used with the decoded Output or encoded Input Picture, their coded extents can differ.
Decoding or encoding pictures with sizes that differ from those of previously produced Reference Pictures should be done with care, so as not to conflict with the codec standard or with the implementation’s support for it.
It is the responsibility of the application to ensure that a valid DPB Set of Reference Pictures is in use, according to the codec standard.
In addition, the extent of the Video Picture Resources cannot exceed VkVideoSessionCreateInfoKHR::maxCodedExtent.
Note:
Coding Standards such as VP9 and AV1 allow for images with different sizes to be used as Reference Pictures. Others, like H.264 and H.265, do not support Reference Pictures with different sizes. Using Reference Pictures with incompatible sizes with such standards would render undefined results.
The application is in control of the allocation and use of system resources
In Vulkan Video, the application has complete control over how and when system resources are used. The Vulkan Video framework provides the following tools to ensure that device and host memory resources are used in an optimal way:
-
The video application can allocate or destroy Output or Input Pictures, and can grow or shrink the DPB set of Reference Pictures dynamically, based on the changing video content requirements.
-
Reference Pictures can be shared with the decoded Output or encoded Input pictures.
-
The application can use sparse memory for the images representing Video Picture Resources. The use of sparse memory would allow the application to remove the Device Memory backing of the image resources when the DPB Slot is not in active use. Furthermore, if the sparse residency feature is supported by the implementation (see Sparse Resources), then images can be partially bound with the resource memory. This feature is particularly important when using video content with a significant change of decoded or encoded resolution.
-
If the implementation supports image arrays, and sparse memory resources, then the application can remove the Device Memory backing of image array layers that are not used by any DPB Slots.
Using DPB and Reconstructed Slot’s Associated Resources
Before a DPB Slot is to become Valid for use with a Reference Picture, it requires memory resources to be bound to it.
Some of the memory resources required for the DPB Slot are opaquely managed by the implementation and internally allocated from the Session’s Device Memory Heaps.
The application provides the image resources of one or more Reference Pictures in VkVideoBeginCodingInfoKHR::pReferenceSlots as part of the vkCmdBeginVideoCodingKHR command.
If a DPB Slot was already used with an image view, and a new image view or a VK_NULL_HANDLE handle is used with that Slot, then the DPB Slot’s state will be invalidated by the implementation. If a DPB Slot were to be reused with the same image view, the state of the Slot would not change.
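The rebinding rule above can be modeled with a few lines of application-side bookkeeping. The sketch below is only a mental model of the behavior described in this section: SlotBinding, slot_bind_view, and NULL_VIEW are hypothetical names, a plain 64-bit integer stands in for a VkImageView handle, and zero stands in for VK_NULL_HANDLE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NULL_VIEW 0u  /* stand-in for VK_NULL_HANDLE in this sketch */

/* Hypothetical shadow copy of one DPB Slot's binding. */
typedef struct SlotBinding {
    uint64_t image_view;  /* stand-in for a VkImageView handle */
    bool     valid;       /* true while the slot holds a usable reference */
} SlotBinding;

/* Models rebinding: a different view (or the null handle) invalidates the
 * slot state, while reusing the same view leaves the state unchanged. */
static void slot_bind_view(SlotBinding *slot, uint64_t image_view)
{
    assert(slot != NULL);
    if (slot->image_view == image_view)
        return;
    slot->image_view = image_view;
    slot->valid = false;
}
```

Keeping such a shadow state lets an application detect, before recording a command, that a Slot it intends to use as a reference has been invalidated by an earlier rebinding.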
Video Session Activating DPB Slot as a Reference
Before a DPB Slot can be used as a Reference Picture index, it must be activated.
The activation of a DPB Slot is done through the VkVideoDecodeInfoKHR::slotIndex field of the vkCmdDecodeVideoKHR command for decode operations, and through the VkVideoEncodeInfoKHR::slotIndex field of the vkCmdEncodeVideoKHR command for encode operations.
When a Slot is activated for the DPB, it must already have an associated image view within VkVideoBeginCodingInfoKHR::pReferenceSlots of the vkCmdBeginVideoCodingKHR command, and the Device Memory backing of the image resources must be resident.
When a DPB Slot is to be activated, VkVideoDecodeInfoKHR::slotIndex for decode, or VkVideoEncodeInfoKHR::slotIndex for encode, must be set to the index of the DPB Slot allocated by the application.
When activating a DPB Slot, the application performs a decode or encode operation against the Slot’s index in order to change its state to Valid Picture Reference.
If a DPB Slot is activated, but a decode or encode operation is not performed against that Slot’s index, or the decode or encode operation was unsuccessful, then the DPB Slot remains in the Invalid Picture Reference state (see DPB Slot States below).
Merely providing a Video Picture Resource for a DPB Slot within VkVideoBeginCodingInfoKHR::pReferenceSlots, without successfully performing a decode or encode operation against that Slot, does not change the DPB Slot’s state to Valid Picture Reference.
If a DPB Slot is already in the Valid Picture Reference state, and there is no Video Picture Resource associated with the DPB Slot for a decode or encode operation, the state of the DPB Slot does not change.
However, if an application is referring to a valid DPB Slot in its current decode or encode operations, then a valid image view must be provided for that Slot within VkVideoPictureResourceKHR::imageViewBinding for that decode or encode operation.
Video Session Invalidating DPB Slot’s Reference State
When a DPB Slot is invalidated, its state is set to Invalid Picture Reference. Using a DPB Slot as a Reference Picture index for video decode or encode operations while the Slot is in Invalid Picture Reference state would render undefined results.
Video Session DPB Slot States
To help understand the valid use of the Video Session DPB and its resource management, this section aims to explain the different states and correct usage of DPB Slots.
There are four (4) states that a DPB Slot could be in:
-
Picture Reference Unused;
-
Invalid Picture Reference;
-
Updating Picture Reference;
-
Valid Picture Reference;
The different states are outlined within the DPB Slot States and DPB Slot States Flow Diagram below.
All DPB Slot management operations are performed via the VkVideoDecodeInfoKHR::slotIndex or VkVideoEncodeInfoKHR::slotIndex field.
All DPB resource binding, invalidating, and activating Slot management operations are performed by the implementation before the decoding or encoding commands, based on the VkVideoDecodeInfoKHR::slotIndex or VkVideoEncodeInfoKHR::slotIndex field and the entries from VkVideoBeginCodingInfoKHR::pReferenceSlots.
The application cannot move a DPB Slot from the Picture Reference Unused state to the Updating Picture Reference state implicitly within a decode or encode command operation.
Such a DPB Slot must first be transitioned to the Invalid Picture Reference state using VkVideoDecodeInfoKHR::slotIndex or VkVideoEncodeInfoKHR::slotIndex, as part of a decode command.
For more details, see Video Picture Decode Modes.
When using sparse memory resources, it would be acceptable and valid behavior for the application to unbind the memory while the DPB Slot is in any of the DPB Slot states, provided that command buffers in a pending state do not reference any such Video Picture Resources.
Accessing unbound regions of the sparse memory resources by the decoder or encoder, regardless of whether those are used as Output, Input, DPB or Reconstructed Video Picture Resources, would render undefined results.
The VkPhysicalDeviceSparseProperties::residencyNonResidentStrict property reported by the implementation does not offer guarantees on the behavior of decode or encode operations when it comes to accessing unbound regions.
However, both reads and writes are still considered safe and will not affect other resources or populated regions of the image.
DPB Slot State | Moving to DPB Slot State | Exiting DPB Slot State | Retain Video Picture Resource Memory |
---|---|---|---|
Picture Reference Unused | Bind Device Memory | Activate Reference DPB Slot → Invalid Picture Reference | Application Controlled |
Invalid Picture Reference | Activate Reference DPB Slot | Start decode or encode operation with an active Reference DPB Slot target → Updating Picture Reference | Application Controlled |
Updating Picture Reference | Start decode or encode operation with an active Reference DPB Slot target | Decode or encode operation with an active Reference DPB Slot target Completed Successfully → Valid Picture Reference | Yes |
Valid Picture Reference | Video decode or encode operation with an active Reference DPB Slot target Completed Successfully | Replace Reference DPB Slot → Invalid Picture Reference | Yes |
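The transitions listed in the table above can be expressed as a small state machine. The following is a sketch under stated assumptions: the DpbSlotState and DpbSlotEvent enumerations and the dpb_slot_transition function are hypothetical application-side names, and only the four transitions that appear in the table are modeled. Any other combination is rejected and leaves the state unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the four DPB Slot states. */
typedef enum DpbSlotState {
    DPB_SLOT_UNUSED,     /* Picture Reference Unused */
    DPB_SLOT_INVALID,    /* Invalid Picture Reference */
    DPB_SLOT_UPDATING,   /* Updating Picture Reference */
    DPB_SLOT_VALID,      /* Valid Picture Reference */
    DPB_SLOT_STATE_COUNT
} DpbSlotState;

/* Hypothetical names for the events that appear in the table. */
typedef enum DpbSlotEvent {
    DPB_EVENT_ACTIVATE,             /* Activate Reference DPB Slot */
    DPB_EVENT_START_OPERATION,      /* decode or encode started on the slot */
    DPB_EVENT_OPERATION_SUCCEEDED,  /* operation completed successfully */
    DPB_EVENT_REPLACE               /* Replace Reference DPB Slot */
} DpbSlotEvent;

/* Applies one event; returns false and keeps the state if the table
 * lists no such transition. */
static bool dpb_slot_transition(DpbSlotState *state, DpbSlotEvent event)
{
    assert(state != NULL && *state < DPB_SLOT_STATE_COUNT);
    switch (*state) {
    case DPB_SLOT_UNUSED:
        if (event == DPB_EVENT_ACTIVATE) { *state = DPB_SLOT_INVALID; return true; }
        break;
    case DPB_SLOT_INVALID:
        if (event == DPB_EVENT_START_OPERATION) { *state = DPB_SLOT_UPDATING; return true; }
        break;
    case DPB_SLOT_UPDATING:
        if (event == DPB_EVENT_OPERATION_SUCCEEDED) { *state = DPB_SLOT_VALID; return true; }
        break;
    case DPB_SLOT_VALID:
        if (event == DPB_EVENT_REPLACE) { *state = DPB_SLOT_INVALID; return true; }
        break;
    default:
        break;
    }
    return false;
}
```

Note that the sketch refuses to start an operation on an unused Slot, which reflects the rule that a Slot cannot move from Picture Reference Unused to Updating Picture Reference implicitly.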
39.3. Video Physical Device Capabilities
39.3.1. Supported Video Codec Operations Enumeration
The structure VkVideoQueueFamilyProperties2KHR may be chained to VkQueueFamilyProperties2 when calling vkGetPhysicalDeviceQueueFamilyProperties2 to retrieve the video codec operations supported for the physical device queue family index.
The VkVideoQueueFamilyProperties2KHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoQueueFamilyProperties2KHR {
VkStructureType sType;
void* pNext;
VkVideoCodecOperationFlagsKHR videoCodecOperations;
} VkVideoQueueFamilyProperties2KHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
videoCodecOperations is a bitmask of VkVideoCodecOperationFlagBitsKHR specifying supported video codec operation(s).
The codec operations are defined with the VkVideoCodecOperationFlagBitsKHR enum:
// Provided by VK_KHR_video_queue
typedef enum VkVideoCodecOperationFlagBitsKHR {
VK_VIDEO_CODEC_OPERATION_INVALID_BIT_KHR = 0,
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_EXT_video_encode_h264
VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT = 0x00010000,
#endif
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_EXT_video_encode_h265
VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT = 0x00020000,
#endif
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_EXT_video_decode_h264
VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_EXT = 0x00000001,
#endif
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_EXT_video_decode_h265
VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_EXT = 0x00000002,
#endif
} VkVideoCodecOperationFlagBitsKHR;
Each decode or encode codec-specific extension extends this enumeration with the appropriate bit corresponding to the extension’s codec operation:
-
VK_VIDEO_CODEC_OPERATION_INVALID_BIT_KHR - No video operations are supported for this queue family.
-
VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT - H.264 video encode operations are supported by this queue family.
-
VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT - H.265 video encode operations are supported by this queue family.
-
VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_EXT - H.264 video decode operations are supported by this queue family.
-
VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_EXT - H.265 video decode operations are supported by this queue family.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCodecOperationFlagsKHR;
VkVideoCodecOperationFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCodecOperationFlagBitsKHR.
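As an illustration of how such a bitmask might be tested, the sketch below searches a list of per-queue-family masks for one that supports every requested codec operation. To keep the example independent of any particular header version, it uses local stand-in constants whose values are copied from the enumeration above; codec_ops_supported and find_queue_family_for_ops are hypothetical helper names. In a real application the masks would come from the videoCodecOperations member returned for each queue family.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in constants with the bit values listed in the enumeration above. */
enum {
    CODEC_OP_DECODE_H264 = 0x00000001,
    CODEC_OP_DECODE_H265 = 0x00000002,
    CODEC_OP_ENCODE_H264 = 0x00010000,
    CODEC_OP_ENCODE_H265 = 0x00020000
};

/* True when every bit in wanted is also set in supported. */
static bool codec_ops_supported(uint32_t supported, uint32_t wanted)
{
    assert(wanted != 0);
    return (supported & wanted) == wanted;
}

/* Returns the index of the first queue family whose mask contains all
 * wanted operations, or -1 if there is none. */
static int32_t find_queue_family_for_ops(const uint32_t *family_ops,
                                         uint32_t family_count,
                                         uint32_t wanted)
{
    assert(family_ops != NULL || family_count == 0);
    for (uint32_t i = 0; i < family_count; ++i)
        if (codec_ops_supported(family_ops[i], wanted))
            return (int32_t)i;
    return -1;
}
```

A mask of zero corresponds to VK_VIDEO_CODEC_OPERATION_INVALID_BIT_KHR, so a queue family reporting it never matches a non-empty request.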
39.3.2. Video Profiles
A video profile is defined by the VkVideoProfileKHR structure as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoProfileKHR {
VkStructureType sType;
void* pNext;
VkVideoCodecOperationFlagBitsKHR videoCodecOperation;
VkVideoChromaSubsamplingFlagsKHR chromaSubsampling;
VkVideoComponentBitDepthFlagsKHR lumaBitDepth;
VkVideoComponentBitDepthFlagsKHR chromaBitDepth;
} VkVideoProfileKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
videoCodecOperation is a VkVideoCodecOperationFlagBitsKHR value specifying a video codec operation.
-
chromaSubsampling is a bitmask of VkVideoChromaSubsamplingFlagBitsKHR specifying video chroma subsampling information.
-
lumaBitDepth is a bitmask of VkVideoComponentBitDepthFlagBitsKHR specifying video luma bit depth information.
-
chromaBitDepth is a bitmask of VkVideoComponentBitDepthFlagBitsKHR specifying video chroma bit depth information.
The video format chroma subsampling is defined with the following enums:
// Provided by VK_KHR_video_queue
typedef enum VkVideoChromaSubsamplingFlagBitsKHR {
VK_VIDEO_CHROMA_SUBSAMPLING_INVALID_BIT_KHR = 0,
VK_VIDEO_CHROMA_SUBSAMPLING_MONOCHROME_BIT_KHR = 0x00000001,
VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR = 0x00000002,
VK_VIDEO_CHROMA_SUBSAMPLING_422_BIT_KHR = 0x00000004,
VK_VIDEO_CHROMA_SUBSAMPLING_444_BIT_KHR = 0x00000008,
} VkVideoChromaSubsamplingFlagBitsKHR;
-
VK_VIDEO_CHROMA_SUBSAMPLING_MONOCHROME_BIT_KHR - the format is monochrome.
-
VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR - the format is 4:2:0 chroma subsampled. The two chroma components are each subsampled at a factor of 2 both horizontally and vertically.
-
VK_VIDEO_CHROMA_SUBSAMPLING_422_BIT_KHR - the format is 4:2:2 chroma subsampled. The two chroma components are sampled at half the sample rate of luma. The horizontal chroma resolution is halved.
-
VK_VIDEO_CHROMA_SUBSAMPLING_444_BIT_KHR - the format is 4:4:4 chroma sampled. Each of the three YCbCr components has the same sample rate, thus there is no chroma subsampling.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoChromaSubsamplingFlagsKHR;
VkVideoChromaSubsamplingFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoChromaSubsamplingFlagBitsKHR.
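The effect of each subsampling mode on the size of a chroma plane can be sketched as follows. The constants are local stand-ins carrying the bit values from the enumeration above, and Extent2D and chroma_plane_extent are hypothetical names. Two choices in the sketch are assumptions rather than statements of the specification: odd luma dimensions are rounded up when halved, and a monochrome format is reported as having a 0x0 chroma plane.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in constants with the bit values listed in the enumeration above. */
enum {
    CHROMA_MONOCHROME = 0x00000001,
    CHROMA_420        = 0x00000002,
    CHROMA_422        = 0x00000004,
    CHROMA_444        = 0x00000008
};

typedef struct Extent2D {
    uint32_t width;
    uint32_t height;
} Extent2D;

/* Returns the extent of one chroma plane for a given luma extent.
 * Expects exactly one subsampling bit. */
static Extent2D chroma_plane_extent(Extent2D luma, uint32_t subsampling)
{
    Extent2D chroma = luma;
    switch (subsampling) {
    case CHROMA_MONOCHROME:            /* no chroma data at all */
        chroma.width = 0;
        chroma.height = 0;
        break;
    case CHROMA_420:                   /* halved in both directions */
        chroma.width = (luma.width + 1) / 2;
        chroma.height = (luma.height + 1) / 2;
        break;
    case CHROMA_422:                   /* halved horizontally only */
        chroma.width = (luma.width + 1) / 2;
        break;
    case CHROMA_444:                   /* same sample rate as luma */
        break;
    default:
        assert(!"exactly one chroma subsampling bit is expected");
        break;
    }
    return chroma;
}
```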
The video format component bit depth is defined with the following enums:
// Provided by VK_KHR_video_queue
typedef enum VkVideoComponentBitDepthFlagBitsKHR {
VK_VIDEO_COMPONENT_BIT_DEPTH_INVALID_KHR = 0,
VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR = 0x00000001,
VK_VIDEO_COMPONENT_BIT_DEPTH_10_BIT_KHR = 0x00000004,
VK_VIDEO_COMPONENT_BIT_DEPTH_12_BIT_KHR = 0x00000010,
} VkVideoComponentBitDepthFlagBitsKHR;
-
VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR - the format component bit depth is 8 bits.
-
VK_VIDEO_COMPONENT_BIT_DEPTH_10_BIT_KHR - the format component bit depth is 10 bits.
-
VK_VIDEO_COMPONENT_BIT_DEPTH_12_BIT_KHR - the format component bit depth is 12 bits.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoComponentBitDepthFlagsKHR;
VkVideoComponentBitDepthFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoComponentBitDepthFlagBitsKHR.
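Because the bit depth flags are not numerically equal to the depths they name, an application typically needs a small translation step. The sketch below shows one possible form, using local stand-in constants with the values from the enumeration above; bit_depth_bits and highest_bit_depth are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in constants with the bit values listed in the enumeration above. */
enum {
    BIT_DEPTH_8  = 0x00000001,
    BIT_DEPTH_10 = 0x00000004,
    BIT_DEPTH_12 = 0x00000010
};

/* Converts a single bit depth flag to a number of bits; returns 0 for
 * the invalid value or for anything that is not exactly one known flag. */
static uint32_t bit_depth_bits(uint32_t flag)
{
    switch (flag) {
    case BIT_DEPTH_8:  return 8;
    case BIT_DEPTH_10: return 10;
    case BIT_DEPTH_12: return 12;
    default:           return 0;
    }
}

/* Returns the largest depth present in a mask of flags, or 0 if the
 * mask is empty. */
static uint32_t highest_bit_depth(uint32_t mask)
{
    const uint32_t known = BIT_DEPTH_8 | BIT_DEPTH_10 | BIT_DEPTH_12;
    assert((mask & ~known) == 0);  /* only bits from the enumeration are expected */
    if (mask & BIT_DEPTH_12) return 12;
    if (mask & BIT_DEPTH_10) return 10;
    if (mask & BIT_DEPTH_8)  return 8;
    return 0;
}
```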
A video profile is provided when querying capabilities or image formats for video using vkGetPhysicalDeviceVideoCapabilitiesKHR or vkGetPhysicalDeviceVideoFormatPropertiesKHR, respectively.
A video profile is also provided when creating resources (images, video sessions, etc.) used by video queues.
Each instance of VkVideoProfileKHR must chain a codec-operation specific video profile extension structure, corresponding to the codec-operation specified in VkVideoProfileKHR::videoCodecOperation.
Additional information is provided in each codec-operation-specific video extension.
39.3.3. Supported Video Decode or Encode Capabilities
To query video decode or encode capabilities for a specific codec, call:
// Provided by VK_KHR_video_queue
VkResult vkGetPhysicalDeviceVideoCapabilitiesKHR(
VkPhysicalDevice physicalDevice,
const VkVideoProfileKHR* pVideoProfile,
VkVideoCapabilitiesKHR* pCapabilities);
-
physicalDevice is the physical device whose video decode or encode capabilities will be queried.
-
pVideoProfile is a pointer to a VkVideoProfileKHR structure with a chained codec-operation specific video profile structure.
-
pCapabilities is a pointer to a VkVideoCapabilitiesKHR structure in which the capabilities are returned.
If pVideoProfile and its chained codec-operation specific profile are not supported, VK_ERROR_FORMAT_NOT_SUPPORTED is returned.
Otherwise, the implementation will fill pCapabilities with the capabilities associated with this video profile.
The VkVideoCapabilitiesKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoCapabilitiesKHR {
VkStructureType sType;
void* pNext;
VkVideoCapabilityFlagsKHR capabilityFlags;
VkDeviceSize minBitstreamBufferOffsetAlignment;
VkDeviceSize minBitstreamBufferSizeAlignment;
VkExtent2D videoPictureExtentGranularity;
VkExtent2D minExtent;
VkExtent2D maxExtent;
uint32_t maxReferencePicturesSlotsCount;
uint32_t maxReferencePicturesActiveCount;
} VkVideoCapabilitiesKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
capabilityFlags
is a bitmask of VkVideoCapabilityFlagBitsKHR specifying capability flags. -
minBitstreamBufferOffsetAlignment
is the minimum alignment for the input or output bitstream buffer offset. -
minBitstreamBufferSizeAlignment
is the minimum alignment for the input or output bitstream buffer size -
videoPictureExtentGranularity
is the minimum size alignment of the extent with the required padding for the decoded or encoded video images. -
minExtent
is the minimum width and height of the decoded or encoded video. -
maxExtent
is the maximum width and height of the decoded or encoded video. -
maxReferencePicturesSlotsCount
is the maximum number of DPB Slots supported by the implementation for a single video session instance. -
maxReferencePicturesActiveCount
is the maximum number of DPB Slots that can be used as Reference Pictures with a single decode or encode operation.
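As a rough illustration of how an application might apply these limits, the sketch below rounds a bitstream offset or size up to a reported alignment, checks a coded extent against minExtent and maxExtent, and pads an extent to videoPictureExtentGranularity. The `Caps` and `Extent2D` types and all function names are hypothetical stand-ins that mirror a few members of VkVideoCapabilitiesKHR, so the example compiles without the Vulkan headers. It assumes the reported alignments and granularity are nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring a few VkVideoCapabilitiesKHR members. */
typedef struct { uint32_t width, height; } Extent2D;
typedef struct {
    uint64_t minBitstreamBufferOffsetAlignment;
    uint64_t minBitstreamBufferSizeAlignment;
    Extent2D videoPictureExtentGranularity;
    Extent2D minExtent;
    Extent2D maxExtent;
} Caps;

/* Round value up to the next multiple of alignment (alignment must be nonzero). */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* True if the extent lies within [minExtent, maxExtent] in both dimensions. */
bool extent_in_range(const Caps *caps, Extent2D e)
{
    return e.width  >= caps->minExtent.width  && e.width  <= caps->maxExtent.width &&
           e.height >= caps->minExtent.height && e.height <= caps->maxExtent.height;
}

/* Pad the extent up to the reported picture extent granularity. */
Extent2D padded_extent(const Caps *caps, Extent2D e)
{
    Extent2D out;
    out.width  = (uint32_t)align_up(e.width,  caps->videoPictureExtentGranularity.width);
    out.height = (uint32_t)align_up(e.height, caps->videoPictureExtentGranularity.height);
    return out;
}
```

In real code the `Caps` values would come from vkGetPhysicalDeviceVideoCapabilitiesKHR rather than being filled in by hand.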
The VkVideoCapabilitiesKHR flags are defined with the following enumeration:
// Provided by VK_KHR_video_queue
typedef enum VkVideoCapabilityFlagBitsKHR {
VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR = 0x00000001,
VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR = 0x00000002,
} VkVideoCapabilityFlagBitsKHR;
-
VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR
- the decode or encode session supports protected content. -
VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR
- the DPB or Reconstructed Video Picture Resources for the video session may be created as a separate VkImage for each DPB picture. If not supported, the DPB must be created as a single multi-layered image where each layer represents one of the DPB Video Picture Resources.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCapabilityFlagsKHR;
VkVideoCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCapabilityFlagBitsKHR.
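The effect of VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR on DPB allocation can be sketched as a small planning helper. The `DpbImagePlan` type, the `plan_dpb_images` function, and the `preferSeparate` parameter are hypothetical and only illustrate the rule stated above. The bit value is copied from the enumeration listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR above. */
#define CAP_SEPARATE_REFERENCE_IMAGES_BIT 0x00000002u

/* Hypothetical description of how the DPB images would be created. */
typedef struct {
    uint32_t imageCount;      /* number of VkImage objects to create */
    uint32_t layersPerImage;  /* array layers in each image */
} DpbImagePlan;

/* Choose between one image per DPB picture and a single layered image.
 * Separate images are optional even when supported, hence preferSeparate. */
DpbImagePlan plan_dpb_images(uint32_t capabilityFlags, uint32_t slotCount, bool preferSeparate)
{
    assert(slotCount > 0);
    DpbImagePlan plan;
    if ((capabilityFlags & CAP_SEPARATE_REFERENCE_IMAGES_BIT) != 0 && preferSeparate) {
        plan.imageCount = slotCount;
        plan.layersPerImage = 1;
    } else {
        /* Not supported (or not wanted): one image, one layer per DPB picture. */
        plan.imageCount = 1;
        plan.layersPerImage = slotCount;
    }
    return plan;
}
```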
39.3.4. Enumeration of Supported Video Output, Input and DPB Formats
To enumerate the supported output, input and DPB image formats for a specific codec operation and video profile, call:
// Provided by VK_KHR_video_queue
VkResult vkGetPhysicalDeviceVideoFormatPropertiesKHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceVideoFormatInfoKHR* pVideoFormatInfo,
uint32_t* pVideoFormatPropertyCount,
VkVideoFormatPropertiesKHR* pVideoFormatProperties);
-
physicalDevice
is the physical device being queried. -
pVideoFormatInfo
is a pointer to a VkPhysicalDeviceVideoFormatInfoKHR structure specifying the codec and video profile for which information is returned. -
pVideoFormatPropertyCount
is a pointer to an integer related to the number of video format properties available or queried, as described below. -
pVideoFormatProperties
is a pointer to an array of VkVideoFormatPropertiesKHR structures in which supported formats are returned.
If pVideoFormatProperties
is NULL
, then the number of video format
properties supported for the given physicalDevice
is returned in
pVideoFormatPropertyCount
.
Otherwise, pVideoFormatPropertyCount
must point to a variable set by
the user to the number of elements in the pVideoFormatProperties
array, and on return the variable is overwritten with the number of values
actually written to pVideoFormatProperties
.
If the value of pVideoFormatPropertyCount
is less than the number of
video format properties supported, at most pVideoFormatPropertyCount
values will be written to pVideoFormatProperties
, and
VK_INCOMPLETE
will be returned instead of VK_SUCCESS
, to
indicate that not all the available values were returned.
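The count-then-fetch behavior described above is usually consumed with a two-call pattern. The sketch below shows that pattern against a mock function, since the real entry point needs a physical device. `mock_get_video_formats`, `query_all_video_formats`, the `Result` enum, and the integer format identifiers are all hypothetical stand-ins. The mock simply pretends that three formats are supported.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the VkResult values used by this idiom. */
typedef enum {
    RESULT_SUCCESS = 0,
    RESULT_INCOMPLETE = 5,
    RESULT_OUT_OF_HOST_MEMORY = -1
} Result;

/* Hypothetical mock of vkGetPhysicalDeviceVideoFormatPropertiesKHR semantics. */
Result mock_get_video_formats(uint32_t *pCount, int *pFormats)
{
    static const int supported[] = { 101, 102, 103 };
    const uint32_t available = 3;
    if (pFormats == NULL) {          /* first call: report the count only */
        *pCount = available;
        return RESULT_SUCCESS;
    }
    uint32_t n = *pCount < available ? *pCount : available;
    for (uint32_t i = 0; i < n; ++i)
        pFormats[i] = supported[i];
    *pCount = n;                     /* number of values actually written */
    return n < available ? RESULT_INCOMPLETE : RESULT_SUCCESS;
}

/* Two-call idiom: query the count, allocate, then fetch.
 * On success the caller owns *outFormats and must free() it. */
Result query_all_video_formats(int **outFormats, uint32_t *outCount)
{
    assert(outFormats != NULL && outCount != NULL);
    *outFormats = NULL;
    *outCount = 0;

    uint32_t count = 0;
    Result r = mock_get_video_formats(&count, NULL);
    if (r != RESULT_SUCCESS || count == 0)
        return r;

    int *formats = malloc(count * sizeof *formats);
    if (formats == NULL)
        return RESULT_OUT_OF_HOST_MEMORY;

    r = mock_get_video_formats(&count, formats);
    *outFormats = formats;
    *outCount = count;
    return r;
}
```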
If an implementation reports
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR
is
supported but
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR
is not
supported in VkVideoDecodeCapabilitiesKHR::flags
, then to query
for video format properties for decode DPB or output, imageUsage
must
have both VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
and
VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
set.
Otherwise, the call will fail with VK_ERROR_FORMAT_NOT_SUPPORTED
.
If an implementation reports
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR
is
supported but
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR
is not
supported in VkVideoDecodeCapabilitiesKHR::flags
, then to query
for video format properties for decode DPB, imageUsage
must have
VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
set and
VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
not set.
Otherwise, the call will fail with VK_ERROR_FORMAT_NOT_SUPPORTED
.
Similarly, to query for video format properties for decode output,
imageUsage
must have VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
set and VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
not set.
Otherwise, the call will fail with VK_ERROR_FORMAT_NOT_SUPPORTED
.
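The imageUsage rules for decode DPB and output queries can be summarized as a predicate. This is a minimal sketch under stated assumptions: the bit constants are local stand-ins rather than the values from the Vulkan headers, and `decode_query_usage_ok` is a hypothetical helper. Capability combinations that the text above does not constrain are accepted by the sketch without further checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in bits; real code would use the Vulkan header constants. */
enum { CAP_COINCIDE = 0x1, CAP_DISTINCT = 0x2 };          /* decode capability flags */
enum { USAGE_DECODE_DST = 0x1, USAGE_DECODE_DPB = 0x2 };  /* image usage flags */

/* True if imageUsage is an acceptable query for decode DPB or output
 * under the rules stated above. */
bool decode_query_usage_ok(uint32_t decodeCaps, uint32_t imageUsage)
{
    bool coincide = (decodeCaps & CAP_COINCIDE) != 0;
    bool distinct = (decodeCaps & CAP_DISTINCT) != 0;
    bool dst = (imageUsage & USAGE_DECODE_DST) != 0;
    bool dpb = (imageUsage & USAGE_DECODE_DPB) != 0;

    if (!dst && !dpb)
        return true;         /* not a decode DPB or output query */
    if (coincide && !distinct)
        return dst && dpb;   /* both usage bits are required */
    if (distinct && !coincide)
        return dst != dpb;   /* exactly one of the two bits */
    return true;             /* combination not constrained by the text above */
}
```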
Note:
For most use cases, only decode or encode related usage flags are going to be specified. For a use case such as transcode, if the image were to be shared between decode and encode session(s), then both decode and encode related usage flags can be set.
The VkPhysicalDeviceVideoFormatInfoKHR input structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkPhysicalDeviceVideoFormatInfoKHR {
VkStructureType sType;
void* pNext;
VkImageUsageFlags imageUsage;
const VkVideoProfilesKHR* pVideoProfiles;
} VkPhysicalDeviceVideoFormatInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
imageUsage
is a bitmask of VkImageUsageFlagBits specifying intended video image usages. -
pVideoProfiles
is a pointer to a VkVideoProfilesKHR structure providing the video profile(s) of video session(s) that will use the image. For most use cases, the image is used by a single video session and a single video profile is provided. For a use case such as transcode, where a decode session output image may be used as encode input for one or more encode sessions, multiple video profiles representing the video sessions that will share the image may be provided.
The VkVideoProfilesKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoProfilesKHR {
VkStructureType sType;
void* pNext;
uint32_t profileCount;
const VkVideoProfileKHR* pProfiles;
} VkVideoProfilesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
profileCount
is an integer which holds the number of video profiles included in pProfiles
. -
pProfiles
is a pointer to an array of VkVideoProfileKHR structures. Each VkVideoProfileKHR structure must chain the corresponding codec-operation specific extension video profile structure.
The VkVideoFormatPropertiesKHR output structure for vkGetPhysicalDeviceVideoFormatPropertiesKHR is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoFormatPropertiesKHR {
VkStructureType sType;
void* pNext;
VkFormat format;
} VkVideoFormatPropertiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
format
is one of the supported formats reported by the implementation.
If the pVideoProfiles or imageUsage provided in the input structure pVideoFormatInfo is not supported, VK_ERROR_FORMAT_NOT_SUPPORTED is returned.
Before creating an image, the application should obtain the supported image creation parameters by querying with vkGetPhysicalDeviceFormatProperties2 or vkGetPhysicalDeviceImageFormatProperties2 using one of the reported formats and adding VkVideoProfilesKHR to the pNext chain of VkFormatProperties2.
39.4. Video Session Objects
39.4.1. Video Session
Video session objects are abstracted and represented by VkVideoSessionKHR handles:
// Provided by VK_KHR_video_queue
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkVideoSessionKHR)
Creating a Video Session
To create a video session object, call:
// Provided by VK_KHR_video_queue
VkResult vkCreateVideoSessionKHR(
VkDevice device,
const VkVideoSessionCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkVideoSessionKHR* pVideoSession);
-
device
is the logical device that creates the decode or encode session object. -
pCreateInfo
is a pointer to a VkVideoSessionCreateInfoKHR structure containing parameters specifying the creation of the decode or encode session. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pVideoSession
is a pointer to a VkVideoSessionKHR handle in which the created decode or encode video session object is returned when this function returns VK_SUCCESS.
The VkVideoSessionCreateInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionCreateInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t queueFamilyIndex;
VkVideoSessionCreateFlagsKHR flags;
const VkVideoProfileKHR* pVideoProfile;
VkFormat pictureFormat;
VkExtent2D maxCodedExtent;
VkFormat referencePicturesFormat;
uint32_t maxReferencePicturesSlotsCount;
uint32_t maxReferencePicturesActiveCount;
} VkVideoSessionCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
queueFamilyIndex
is the queue family of the created video session. -
flags
is a bitmask of VkVideoSessionCreateFlagBitsKHR specifying creation flags. -
pVideoProfile
is a pointer to a VkVideoProfileKHR structure. -
pictureFormat
is the format of the image views representing decoded Output or encoded Input pictures. -
maxCodedExtent
is the maximum width and height of the coded pictures that this instance will be able to support. -
referencePicturesFormat
is the format of the DPB image views representing the Reference Pictures. -
maxReferencePicturesSlotsCount
is the maximum number of DPB Slots that can be activated with associated Video Picture Resources for the created video session. -
maxReferencePicturesActiveCount
is the maximum number of active DPB Slots that can be used as DPB or Reconstructed Reference Pictures within a single decode or encode operation for the created video session.
The decode or encode session creation flags are defined with the following enumeration:
// Provided by VK_KHR_video_queue
typedef enum VkVideoSessionCreateFlagBitsKHR {
VK_VIDEO_SESSION_CREATE_DEFAULT_KHR = 0,
VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR = 0x00000001,
} VkVideoSessionCreateFlagBitsKHR;
-
VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR
- create the video session for use with protected video content.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoSessionCreateFlagsKHR;
VkVideoSessionCreateFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoSessionCreateFlagBitsKHR.
39.4.2. Destroying a Video Session
To destroy a video session object, call:
// Provided by VK_KHR_video_queue
void vkDestroyVideoSessionKHR(
VkDevice device,
VkVideoSessionKHR videoSession,
const VkAllocationCallbacks* pAllocator);
-
device
is the device that was used for the creation of the video session. -
videoSession
is the decode or encode video session to be destroyed. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
39.4.3. Video Session Memory Resource Management
Obtaining the Video Session Object Device Memory Requirements
To get memory requirements for a video session, call:
// Provided by VK_KHR_video_queue
VkResult vkGetVideoSessionMemoryRequirementsKHR(
VkDevice device,
VkVideoSessionKHR videoSession,
uint32_t* pVideoSessionMemoryRequirementsCount,
VkVideoGetMemoryPropertiesKHR* pVideoSessionMemoryRequirements);
-
device
is the logical device that owns the video session. -
videoSession
is the video session to query. -
pVideoSessionMemoryRequirementsCount
is a pointer to an integer related to the number of memory heap requirements available or queried, as described below. -
pVideoSessionMemoryRequirements
is NULL
or a pointer to an array of VkVideoGetMemoryPropertiesKHR structures in which the memory heap requirements of the video session are returned.
If pVideoSessionMemoryRequirements
is NULL
, then the number of
memory heap types required for the video session is returned in
pVideoSessionMemoryRequirementsCount
.
Otherwise, pVideoSessionMemoryRequirementsCount
must point to a
variable set by the user with the number of elements in the
pVideoSessionMemoryRequirements
array, and on return the variable is
overwritten with the number of elements actually written to
pVideoSessionMemoryRequirements
.
If pVideoSessionMemoryRequirementsCount
is less than the number of
memory heap types required for the video session, then at most
pVideoSessionMemoryRequirementsCount
elements will be written to
pVideoSessionMemoryRequirements
, and VK_INCOMPLETE
will be
returned, instead of VK_SUCCESS
, to indicate that not all required
memory heap types were returned.
The VkVideoGetMemoryPropertiesKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoGetMemoryPropertiesKHR {
VkStructureType sType;
const void* pNext;
uint32_t memoryBindIndex;
VkMemoryRequirements2* pMemoryRequirements;
} VkVideoGetMemoryPropertiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
memoryBindIndex
is the memory bind index of the memory heap type described by the information returned in pMemoryRequirements
. -
pMemoryRequirements
is a pointer to a VkMemoryRequirements2 structure in which the requested memory heap requirements for the heap with index memoryBindIndex
are returned.
Binding the Session Object Device Memory
To attach memory to a video session object, call:
// Provided by VK_KHR_video_queue
VkResult vkBindVideoSessionMemoryKHR(
VkDevice device,
VkVideoSessionKHR videoSession,
uint32_t videoSessionBindMemoryCount,
const VkVideoBindMemoryKHR* pVideoSessionBindMemories);
-
device
is the logical device that owns the video session’s memory. -
videoSession
is the video session to be bound with device memory. -
videoSessionBindMemoryCount
is the number of elements in pVideoSessionBindMemories
to be bound. -
pVideoSessionBindMemories
is a pointer to an array of VkVideoBindMemoryKHR structures specifying memory regions to be bound to a device memory heap.
The VkVideoBindMemoryKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoBindMemoryKHR {
VkStructureType sType;
const void* pNext;
uint32_t memoryBindIndex;
VkDeviceMemory memory;
VkDeviceSize memoryOffset;
VkDeviceSize memorySize;
} VkVideoBindMemoryKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
memoryBindIndex
is the index of the device memory heap returned in VkVideoGetMemoryPropertiesKHR::memoryBindIndex
from vkGetVideoSessionMemoryRequirementsKHR. -
memory
is the allocated device memory to be bound to the video session’s heap with index memoryBindIndex
. -
memoryOffset
is the start offset of the region of memory
which is to be bound. -
memorySize
is the size in bytes of the region of memory
, starting from memoryOffset
bytes, to be bound.
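One way to turn the queried requirements into bind entries is to pack every region into a single allocation at aligned offsets. The sketch below does only that arithmetic. `SessionMemReq` and `SessionMemBind` are hypothetical stand-ins for the size and alignment reported through VkVideoGetMemoryPropertiesKHR and for VkVideoBindMemoryKHR. It assumes, without the text above guaranteeing it, that all requirements can be satisfied from one memory type and that the alignments are nonzero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one queried requirement. */
typedef struct {
    uint32_t memoryBindIndex;
    uint64_t size;
    uint64_t alignment;
} SessionMemReq;

/* Hypothetical stand-in for the offset and size fields of VkVideoBindMemoryKHR. */
typedef struct {
    uint32_t memoryBindIndex;
    uint64_t memoryOffset;
    uint64_t memorySize;
} SessionMemBind;

/* Assign each requirement an aligned region inside one allocation.
 * Returns the total number of bytes the allocation must provide. */
uint64_t layout_session_memory(const SessionMemReq *reqs, uint32_t count, SessionMemBind *binds)
{
    uint64_t cursor = 0;
    for (uint32_t i = 0; i < count; ++i) {
        assert(reqs[i].alignment != 0);
        /* Round the running offset up to this region's alignment. */
        cursor = (cursor + reqs[i].alignment - 1) / reqs[i].alignment * reqs[i].alignment;
        binds[i].memoryBindIndex = reqs[i].memoryBindIndex;
        binds[i].memoryOffset = cursor;
        binds[i].memorySize = reqs[i].size;
        cursor += reqs[i].size;
    }
    return cursor;
}
```

An application whose requirements map to different memory types would instead make one allocation per type and run the same layout per allocation.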
39.4.4. Video Session Parameters
This specification supports several classes of preprocessed parameters stored in Video Session Parameters objects. The Video Session Parameters objects reduce the number of parameters being dispatched and then processed by the implementation while recording video command buffers.
39.4.5. Creating Video Session Parameters
Video session parameter objects are represented by VkVideoSessionParametersKHR handles:
// Provided by VK_KHR_video_queue
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkVideoSessionParametersKHR)
To create a video session parameters object, call:
// Provided by VK_KHR_video_queue
VkResult vkCreateVideoSessionParametersKHR(
VkDevice device,
const VkVideoSessionParametersCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkVideoSessionParametersKHR* pVideoSessionParameters);
-
device
is the logical device that was used for the creation of the video session object. -
pCreateInfo
is a pointer to a VkVideoSessionParametersCreateInfoKHR structure specifying the video session parameters. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter. -
pVideoSessionParameters
is a pointer to a VkVideoSessionParametersKHR handle in which the video session parameters object is returned.
The VkVideoSessionParametersCreateInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionParametersCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoSessionParametersKHR videoSessionParametersTemplate;
VkVideoSessionKHR videoSession;
} VkVideoSessionParametersCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
videoSessionParametersTemplate
is VK_NULL_HANDLE or a valid handle to a VkVideoSessionParametersKHR object. If this parameter represents a valid handle, then the underlying Video Session Parameters object will be used as a template for constructing the new video session parameters object. All of the template object’s current parameters will be inherited by the new object in such a case. Optionally, some of the template’s parameters can be updated or new parameters added to the newly constructed object via the extension-specific parameters. -
videoSession
is the video session object against which the video session parameters object is going to be created.
39.4.6. Updating the parameters of the Video Session Parameters object
To update, add, or remove video session parameters state, call:
// Provided by VK_KHR_video_queue
VkResult vkUpdateVideoSessionParametersKHR(
VkDevice device,
VkVideoSessionParametersKHR videoSessionParameters,
const VkVideoSessionParametersUpdateInfoKHR* pUpdateInfo);
-
device
is the logical device that was used for the creation of the video session object. -
videoSessionParameters
is the video session parameters object that is going to be updated. -
pUpdateInfo
is a pointer to a VkVideoSessionParametersUpdateInfoKHR
structure containing the session parameters update information.
The VkVideoSessionParametersUpdateInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionParametersUpdateInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t updateSequenceCount;
} VkVideoSessionParametersUpdateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
updateSequenceCount
is the sequence number of the object update with parameters, starting from 1
and incrementing the value by one with each subsequent update.
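An application has to remember how many updates it has applied in order to supply the next updateSequenceCount. A minimal sketch of such bookkeeping follows. The `ParamsUpdateTracker` type and both functions are hypothetical. The counter is advanced only after an update has been accepted, on the assumption that a failed update does not consume a sequence number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side bookkeeping for one video session parameters object. */
typedef struct {
    uint32_t applied;   /* number of updates applied so far; 0 for a new object */
} ParamsUpdateTracker;

/* Value to place in updateSequenceCount for the next update (first update is 1). */
uint32_t params_next_sequence(const ParamsUpdateTracker *t)
{
    return t->applied + 1;
}

/* Record that the update carrying sequence number seq succeeded. */
void params_commit(ParamsUpdateTracker *t, uint32_t seq)
{
    assert(seq == t->applied + 1);   /* updates must be applied in order */
    t->applied = seq;
}
```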
39.4.7. Destroying Video Session Parameters
To destroy a video session parameters object, call:
// Provided by VK_KHR_video_queue
void vkDestroyVideoSessionParametersKHR(
VkDevice device,
VkVideoSessionParametersKHR videoSessionParameters,
const VkAllocationCallbacks* pAllocator);
-
device
is the device the video session was created with. -
videoSessionParameters
is the video session parameters object to be destroyed. -
pAllocator
controls host memory allocation as described in the Memory Allocation chapter.
39.4.8. Video Encode and Decode commands
To start video decode or encode operations, call:
// Provided by VK_KHR_video_queue
void vkCmdBeginVideoCodingKHR(
VkCommandBuffer commandBuffer,
const VkVideoBeginCodingInfoKHR* pBeginInfo);
-
commandBuffer
is the command buffer to be used when recording commands for the video decode or encode operations. -
pBeginInfo
is a pointer to a VkVideoBeginCodingInfoKHR structure.
The VkVideoBeginCodingInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoBeginCodingInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoBeginCodingFlagsKHR flags;
VkVideoCodingQualityPresetFlagsKHR codecQualityPreset;
VkVideoSessionKHR videoSession;
VkVideoSessionParametersKHR videoSessionParameters;
uint32_t referenceSlotCount;
const VkVideoReferenceSlotKHR* pReferenceSlots;
} VkVideoBeginCodingInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
flags
is reserved for future use. -
codecQualityPreset
is a bitmask of VkVideoCodingQualityPresetFlagBitsKHR specifying the Video Decode or Encode quality preset. -
videoSession
is the video session object to be bound for the processing of the video commands. -
videoSessionParameters
is VK_NULL_HANDLE or a handle of a VkVideoSessionParametersKHR object to be used for the processing of the video commands. If VK_NULL_HANDLE, then no video session parameters apply to this command buffer context. -
referenceSlotCount
is the number of reference slot entries provided in pReferenceSlots
. -
pReferenceSlots
is a pointer to an array of VkVideoReferenceSlotKHR structures specifying reference slots, used within the video command context between this vkCmdBeginVideoCodingKHR command and the vkCmdEndVideoCodingKHR command that follows. Each reference slot provides a slot index and the VkVideoPictureResourceKHR specifying the reference picture resource bound to this slot index. A slot index must not appear more than once in pReferenceSlots
in a given command.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoBeginCodingFlagsKHR;
VkVideoBeginCodingFlagsKHR
is a bitmask type for setting a mask, but
is currently reserved for future use.
The coding quality preset types are defined with the following enumeration:
// Provided by VK_KHR_video_queue
typedef enum VkVideoCodingQualityPresetFlagBitsKHR {
VK_VIDEO_CODING_QUALITY_PRESET_NORMAL_BIT_KHR = 0x00000001,
VK_VIDEO_CODING_QUALITY_PRESET_POWER_BIT_KHR = 0x00000002,
VK_VIDEO_CODING_QUALITY_PRESET_QUALITY_BIT_KHR = 0x00000004,
} VkVideoCodingQualityPresetFlagBitsKHR;
-
VK_VIDEO_CODING_QUALITY_PRESET_NORMAL_BIT_KHR
defines the normal decode case. -
VK_VIDEO_CODING_QUALITY_PRESET_POWER_BIT_KHR
defines the power-efficient case. -
VK_VIDEO_CODING_QUALITY_PRESET_QUALITY_BIT_KHR
defines the quality-focused case.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCodingQualityPresetFlagsKHR;
VkVideoCodingQualityPresetFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCodingQualityPresetFlagBitsKHR.
The VkVideoReferenceSlotKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoReferenceSlotKHR {
VkStructureType sType;
const void* pNext;
int8_t slotIndex;
const VkVideoPictureResourceKHR* pPictureResource;
} VkVideoReferenceSlotKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
slotIndex
is the unique reference slot index used for the encode or decode operation. -
pPictureResource
is a pointer to a VkVideoPictureResourceKHR structure describing the picture resource bound to this slot index.
The VkVideoPictureResourceKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoPictureResourceKHR {
VkStructureType sType;
const void* pNext;
VkOffset2D codedOffset;
VkExtent2D codedExtent;
uint32_t baseArrayLayer;
VkImageView imageViewBinding;
} VkVideoPictureResourceKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
codedOffset
is the offset to be used for the picture resource. -
codedExtent
is the extent to be used for the picture resource. -
baseArrayLayer
is the first array layer to be accessed for the Decode or Encode Operations. -
imageViewBinding
is a VkImageView image view representing this picture resource.
39.4.9. End of the Video Session
To end video decode or encode operations, call:
// Provided by VK_KHR_video_queue
void vkCmdEndVideoCodingKHR(
VkCommandBuffer commandBuffer,
const VkVideoEndCodingInfoKHR* pEndCodingInfo);
-
commandBuffer
is the command buffer to be filled by this function. -
pEndCodingInfo
is a pointer to a VkVideoEndCodingInfoKHR structure.
The VkVideoEndCodingInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoEndCodingInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoEndCodingFlagsKHR flags;
} VkVideoEndCodingInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
flags
is reserved for future use.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoEndCodingFlagsKHR;
VkVideoEndCodingFlagsKHR
is a bitmask type for setting a mask, but is
currently reserved for future use.
39.4.10. Video Session Control Command
To apply dynamic controls to video decode or video encode operations, call:
// Provided by VK_KHR_video_queue
void vkCmdControlVideoCodingKHR(
VkCommandBuffer commandBuffer,
const VkVideoCodingControlInfoKHR* pCodingControlInfo);
-
commandBuffer
is the command buffer to be filled by this function. -
pCodingControlInfo
is a pointer to a VkVideoCodingControlInfoKHR structure.
The settings provided in this call are applied to the video stream at the
time of queue submission and are in effect until the submission of a
subsequent vkCmdControlVideoCodingKHR
.
The VkVideoCodingControlInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoCodingControlInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoCodingControlFlagsKHR flags;
} VkVideoCodingControlInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
flags
is a bitmask of VkVideoCodingControlFlagBitsKHR specifying control flags.
The vkCmdControlVideoCodingKHR flags are defined with the following enumeration:
// Provided by VK_KHR_video_queue
typedef enum VkVideoCodingControlFlagBitsKHR {
VK_VIDEO_CODING_CONTROL_DEFAULT_KHR = 0,
VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR = 0x00000001,
} VkVideoCodingControlFlagBitsKHR;
-
VK_VIDEO_CODING_CONTROL_DEFAULT_KHR
indicates a request for the coding control parameters to be applied to the current state of the bound video session. -
VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR
indicates a request for the bound video session device context to be reset before the coding control parameters are applied.
A newly created video session must be reset before use for video decode or encode operations. The reset operation returns all session DPB slots to the unused state (see DPB Slot States). For encode sessions, the reset operation returns rate control configuration to implementation default settings. After decode or encode operations are performed on a session, the reset operation may be used to return the video session device context to the same initial state as after the reset of a newly created video session. This may be used when different video sequences are processed with the same session.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCodingControlFlagsKHR;
VkVideoCodingControlFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCodingControlFlagBitsKHR.
39.5. Video Decode Operations
Before the application can start recording Vulkan command buffers for the Video Decode Operations, it must do the following:
-
Ensure that the implementation can decode the Video Content by querying the supported codec operations and profiles using vkGetPhysicalDeviceQueueFamilyProperties2.
-
By using vkGetPhysicalDeviceVideoFormatPropertiesKHR and providing one or more video profiles, choose the Vulkan formats supported by the implementation. The formats for output and reference pictures must be queried and chosen separately. Refer to the section on enumeration of supported video formats.
-
Before creating an image to be used as a video picture resource, obtain the supported image creation parameters by querying with vkGetPhysicalDeviceFormatProperties2 and vkGetPhysicalDeviceImageFormatProperties2 using one of the reported formats and adding VkVideoProfilesKHR to the
pNext
chain of VkFormatProperties2. When querying the parameters with vkGetPhysicalDeviceImageFormatProperties2 for images targeting decoded output and reference (DPB) pictures, the VkPhysicalDeviceImageFormatInfo2::usage
field should contain VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
and VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
, respectively. -
Create none, some, or all of the required images for the decoded output and reference pictures. More Video Picture Resources can be created at some later point if needed while processing the decoded content. Also, if the decoded picture size is expected to change, the images can be created based on the maximum decoded content size required.
-
Create the video session to be used for video decode operations. Before creating the Decode Video Session, the decode capabilities should be queried with vkGetPhysicalDeviceVideoCapabilitiesKHR to obtain the limits of the parameters allowed by the implementation for a particular codec profile.
-
Bind memory resources with the decode video session by calling vkBindVideoSessionMemoryKHR. The video session cannot be used until memory resources are allocated and bound to it. In order to determine the required memory sizes and heap types of the device memory allocations, vkGetVideoSessionMemoryRequirementsKHR should be called.
-
Create one or more Session Parameter objects for use across command buffer recording operations, if required by the codec extension in use. These objects must be created against a video session with the parameters required by the codec. Each Session Parameter object created is a child object of the associated Session object and cannot be bound in the command buffer with any other Session Object.
The recording of Video Decode Commands against a Vulkan command buffer consists of the following sequence:
-
vkCmdBeginVideoCodingKHR starts the recording of one or more Video Decode operations in the command buffer. For each Video Decode Command operation, a Video Session must be bound to the command buffer within this command. This command establishes a Vulkan Video Decode Context that consists of the bound Video Session Object, Session Parameters Object, and the required Video Picture Resources. The established Video Decode Context is in effect until the vkCmdEndVideoCodingKHR command is recorded. If more Video Decode operations are required after the vkCmdEndVideoCodingKHR command, another Video Decode Context can be started with the vkCmdBeginVideoCodingKHR command.
-
vkCmdDecodeVideoKHR specifies one or more compressed data buffers to be decoded. The VkVideoDecodeInfoKHR parameters, and the codec extension structures chained to this, specify the details of the decode operation.
-
vkCmdControlVideoCodingKHR records operations against the decoded data, decoding device, or the Video Session state.
-
vkCmdEndVideoCodingKHR signals the end of the recording of the Vulkan Video Decode Context, as established by vkCmdBeginVideoCodingKHR.
In addition to the above, the following commands can be recorded between vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR:
-
Query operations
-
Global Memory Barriers
-
Buffer Memory Barriers
-
Image Memory Barriers (these must be used to transition the Video Picture Resources to the proper
VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR
and VK_IMAGE_LAYOUT_VIDEO_DECODE_DST_KHR
layouts). -
Pipeline Barriers
-
Events
-
Timestamps
-
Device Groups (device mask)
The following Video Decode related commands must be recorded outside the Vulkan Video Decode Context established with the vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR commands:
-
Sparse Memory Binding
-
Copy Commands
-
Clear Commands
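The scoping rules above (decode and control commands inside the context, copy and clear commands outside it) can be expressed as a small checker over a recorded command sequence. The `CmdKind` enum and `decode_recording_valid` function are hypothetical, and only a few representative command kinds are modeled. Treating a nested begin as invalid is an assumption of this sketch, since the text above does not address it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for a few kinds of recorded commands. */
typedef enum {
    CMD_BEGIN_VIDEO_CODING,
    CMD_DECODE_VIDEO,
    CMD_CONTROL_VIDEO_CODING,
    CMD_END_VIDEO_CODING,
    CMD_PIPELINE_BARRIER,
    CMD_COPY,
    CMD_CLEAR
} CmdKind;

/* True if the sequence respects the begin/end scoping described above. */
bool decode_recording_valid(const CmdKind *cmds, size_t count)
{
    bool inside = false;   /* currently between begin and end? */
    assert(cmds != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        switch (cmds[i]) {
        case CMD_BEGIN_VIDEO_CODING:
            if (inside) return false;    /* assumption: no nested contexts */
            inside = true;
            break;
        case CMD_END_VIDEO_CODING:
            if (!inside) return false;   /* end without a matching begin */
            inside = false;
            break;
        case CMD_DECODE_VIDEO:
        case CMD_CONTROL_VIDEO_CODING:
            if (!inside) return false;   /* only valid inside the context */
            break;
        case CMD_COPY:
        case CMD_CLEAR:
            if (inside) return false;    /* must be recorded outside the context */
            break;
        case CMD_PIPELINE_BARRIER:
            break;                       /* accepted in either position by this sketch */
        }
    }
    return !inside;                      /* every begin needs a matching end */
}
```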
39.5.1. Video Picture Decode Modes
There are a few ways that the vkCmdDecodeVideoKHR command can be configured for the Video Picture Decode Operations, based on:
-
whether the output resource needs to be used as a Reference Picture for subsequent decode operations; and
-
whether DPB Slots are required for use as Reference Picture indexes.
The most basic Video Picture Decode operation with the vkCmdDecodeVideoKHR command is to output the decoded pixel data without using any DPB Reference Pictures and without updating any DPB Slot’s indexes.
In this case, the following VkVideoDecodeInfoKHR parameters must be set:
-
VkVideoDecodeInfoKHR::
pSetupReferenceSlot->pPictureResource->imageViewBinding
must be a valid VkImageView. This VkImageView represents the output resource where the decoded pixels will be populated after a successful decode operation. -
VkVideoDecodeInfoKHR::
pSetupReferenceSlot->slotIndex
must be an invalid DPB Slot index (-1) since the decoded picture is not intended to be used as a reference picture with subsequent video decode operations. -
The value of the VkVideoDecodeInfoKHR::
referenceSlotCount
can be 0
and VkVideoDecodeInfoKHR::pReferenceSlots
can be NULL
. -
If VkVideoDecodeInfoKHR::
pReferenceSlots
is not NULL
, it can still have entries representing DPB Slot indexes with a Valid Picture Reference. The codec extension selects the actual use of the Reference Pictures by referring to a DPB Slot index with a Valid Picture Reference.
Video Picture Decode operations with the vkCmdDecodeVideoKHR command that require one or more Reference Pictures to predict the sample values of the decoded output picture need DPB Slots with a Valid Picture Reference.
In this case, the following VkVideoDecodeInfoKHR parameters must be set:
-
VkVideoDecodeInfoKHR::pSetupReferenceSlot->pPictureResource->imageViewBinding must be a valid VkImageView. This VkImageView represents the output resource where the decoded pixels will be populated after a successful decode operation.
-
VkVideoDecodeInfoKHR::pSetupReferenceSlot->slotIndex must be an invalid DPB Slot index (-1), since the decoded picture is not intended to be used as a reference picture with subsequent video decode operations.
-
VkVideoDecodeInfoKHR::referenceSlotCount must not be 0, and VkVideoDecodeInfoKHR::pReferenceSlots should represent at least the number of reference slots required for the decode operation. The codec extension selects the actual use of the Reference Pictures by referring to a DPB Slot index with a Valid Picture Reference. If the implementation does not use an opaque DPB, each DPB slot representing a reference picture must refer to a valid image view. The image views must represent the same image resources that were used to create the reference picture for the corresponding DPB Slot index.
-
VkVideoDecodeInfoKHR::pReferenceSlots can still have entries representing DPB Slot indexes with a Valid Picture Reference.
After the vkCmdDecodeVideoKHR operation is completed successfully, the
VkVideoDecodeInfoKHR::pSetupReferenceSlot->pPictureResource->imageViewBinding
pixel data will be updated with the decoded content.
The operation will not update any DPB Slot with
Reference Pictures data.
However, any DPB Slot activation, invalidation, or deactivation
operations requested via VkVideoDecodeInfoKHR::pReferenceSlots
are still going to be performed.
Video Picture Decode with a Reference Picture slot update and using optional Reference Pictures
When it is known that the picture to be decoded will be used as a reference picture for subsequent decode operations, one of the available DPB Slots needs to be selected for activation and update operations as part of the vkCmdDecodeVideoKHR command.
Based on whether a decode operation with reference pictures or without reference pictures is required, the vkCmdDecodeVideoKHR should be configured with parameters as described in the previous sections. In addition, one of the available DPB Slots must be selected by the application, activated with resources and then set-up for an update with the decode operation.
In this case, the following VkVideoDecodeInfoKHR parameters must be set:
-
VkVideoDecodeInfoKHR::pSetupReferenceSlot->pPictureResource->imageViewBinding must be a valid VkImageView. This VkImageView represents the output resource where the decoded pixels will be populated after a successful decode operation. If the implementation does not use an opaque DPB, the output and reference picture resources coincide.
-
VkVideoDecodeInfoKHR::pSetupReferenceSlot->slotIndex must be a valid DPB Slot index selected by the application, based on the currently available slots.
-
VkVideoDecodeInfoKHR::pReferenceSlots can still have entries representing DPB Slot indexes with a Valid Picture Reference.
After the vkCmdDecodeVideoKHR operation has completed successfully, the decoded content will be available in the resource provided for VkVideoDecodeInfoKHR::pSetupReferenceSlot->pPictureResource->imageViewBinding.
In addition, this operation will update the selected DPB Slot with Reference Picture data.
Any other DPB Slot activation, invalidation, or deactivation operations requested via VkVideoDecodeInfoKHR::pReferenceSlots are performed as well.
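The three configurations described in this section differ mainly in the setup slot index and in whether reference slots are supplied. The following is a minimal sketch of that decision, not the real API: the Sketch* types and sketch_configure_decode are hypothetical stand-ins that model only the fields discussed above, so that the example builds without the Vulkan video headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins, NOT the real Vulkan definitions. Only the fields
 * discussed in this section are modelled. */
typedef struct SketchPictureResource {
    const void *imageViewBinding;      /* stands in for a VkImageView */
} SketchPictureResource;

typedef struct SketchReferenceSlot {
    int8_t slotIndex;                  /* -1 marks an invalid DPB Slot index */
    const SketchPictureResource *pPictureResource;
} SketchReferenceSlot;

typedef struct SketchDecodeInfo {
    const SketchReferenceSlot *pSetupReferenceSlot;
    uint32_t referenceSlotCount;
    const SketchReferenceSlot *pReferenceSlots;
} SketchDecodeInfo;

/* Configure one decode operation.
 * updateSlot < 0  : output only; no DPB Slot is updated.
 * updateSlot >= 0 : the decoded picture also updates that DPB Slot.
 * refs/refCount   : optional Reference Pictures; pass NULL/0 for none. */
static void sketch_configure_decode(SketchDecodeInfo *info,
                                    SketchReferenceSlot *setup,
                                    const SketchPictureResource *output,
                                    int updateSlot,
                                    const SketchReferenceSlot *refs,
                                    uint32_t refCount)
{
    int haveRefs = (refs != NULL && refCount > 0);

    /* The output image view must be valid in every mode. */
    assert(info != NULL && setup != NULL);
    assert(output != NULL && output->imageViewBinding != NULL);
    assert(updateSlot <= INT8_MAX);

    setup->pPictureResource = output;
    setup->slotIndex = (int8_t)(updateSlot < 0 ? -1 : updateSlot);

    info->pSetupReferenceSlot = setup;
    info->referenceSlotCount = haveRefs ? refCount : 0u;
    info->pReferenceSlots = haveRefs ? refs : NULL;
}
```

Passing a negative updateSlot with no references corresponds to the most basic mode. Passing a negative updateSlot with references corresponds to decoding with Reference Pictures but without a slot update. A non-negative updateSlot corresponds to the slot-update mode.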
39.5.2. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with
pVideoProfile->videoCodecOperation
specified as one of the decode
operation bits, the VkVideoDecodeCapabilitiesKHR structure must be
included in the pNext
chain of the VkVideoCapabilitiesKHR
structure to retrieve capabilities specific to video decoding.
The VkVideoDecodeCapabilitiesKHR
structure is defined as:
// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeCapabilitiesKHR {
VkStructureType sType;
void* pNext;
VkVideoDecodeCapabilityFlagsKHR flags;
} VkVideoDecodeCapabilitiesKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is a bitmask of VkVideoDecodeCapabilityFlagBitsKHR describing supported decoding features.
// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeCapabilityFlagsKHR;
VkVideoDecodeCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoDecodeCapabilityFlagBitsKHR.
Bits which may be set in VkVideoDecodeCapabilitiesKHR::flags, indicating the decoding features supported, are:
// Provided by VK_KHR_video_decode_queue
typedef enum VkVideoDecodeCapabilityFlagBitsKHR {
VK_VIDEO_DECODE_CAPABILITY_DEFAULT_KHR = 0,
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR = 0x00000001,
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR = 0x00000002,
} VkVideoDecodeCapabilityFlagBitsKHR;
-
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR reports that the implementation supports using the same Video Picture Resource for decode DPB and decode output.
-
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR reports that the implementation supports using distinct Video Picture Resources for decode DPB and decode output.
An implementation must report at least one of VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR or VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR as supported.
Note:
If both |
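As a rough illustration of how an application might act on these capability bits, the sketch below picks a DPB mode from a flags value. The bit values are copied from the enumeration above. The SKETCH_ names, the SketchDpbMode type, and the preferDistinct tie-break are assumptions made for this example only and are not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the two capability bits listed above, so the example
 * builds without the Vulkan video headers. */
#define SKETCH_DPB_AND_OUTPUT_COINCIDE_BIT 0x00000001u
#define SKETCH_DPB_AND_OUTPUT_DISTINCT_BIT 0x00000002u

typedef enum SketchDpbMode {
    SKETCH_DPB_MODE_COINCIDE,   /* one resource serves as DPB and output */
    SKETCH_DPB_MODE_DISTINCT    /* separate DPB and output resources */
} SketchDpbMode;

/* Returns 1 and writes the chosen mode, or 0 if neither bit is reported.
 * preferDistinct is an application-side preference (an assumption of this
 * sketch) used only when both bits are reported. */
static int sketch_pick_dpb_mode(uint32_t flags, int preferDistinct,
                                SketchDpbMode *mode)
{
    int coincide = (flags & SKETCH_DPB_AND_OUTPUT_COINCIDE_BIT) != 0;
    int distinct = (flags & SKETCH_DPB_AND_OUTPUT_DISTINCT_BIT) != 0;

    assert(mode != NULL);

    if (!coincide && !distinct)
        return 0;               /* violates the "at least one" requirement */

    if (coincide && distinct)
        *mode = preferDistinct ? SKETCH_DPB_MODE_DISTINCT
                               : SKETCH_DPB_MODE_COINCIDE;
    else
        *mode = distinct ? SKETCH_DPB_MODE_DISTINCT
                         : SKETCH_DPB_MODE_COINCIDE;
    return 1;
}
```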
39.5.3. Video Decode Command Buffer Commands
To decode a frame, call:
// Provided by VK_KHR_video_decode_queue
void vkCmdDecodeVideoKHR(
VkCommandBuffer commandBuffer,
const VkVideoDecodeInfoKHR* pFrameInfo);
-
commandBuffer is the command buffer in which the decode command will be recorded.
-
pFrameInfo is a pointer to a VkVideoDecodeInfoKHR structure.
The VkVideoDecodeInfoKHR structure is defined as:
// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoDecodeFlagsKHR flags;
VkOffset2D codedOffset;
VkExtent2D codedExtent;
VkBuffer srcBuffer;
VkDeviceSize srcBufferOffset;
VkDeviceSize srcBufferRange;
VkVideoPictureResourceKHR dstPictureResource;
const VkVideoReferenceSlotKHR* pSetupReferenceSlot;
uint32_t referenceSlotCount;
const VkVideoReferenceSlotKHR* pReferenceSlots;
} VkVideoDecodeInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure. All codec-specific structures related to each frame (picture parameters, quantization matrix, etc.) must be chained here and passed to the decode session with the vkCmdDecodeVideoKHR call.
-
flags is a bitmask of VkVideoDecodeFlagBitsKHR specifying decode flags, reserved for future versions of this specification.
-
codedOffset is the coded offset of the decode operations. The purpose of this field is interpreted based on the codec extension. When decoding content in H.264 field mode, codedOffset specifies the line or picture field’s offset within the image.
-
codedExtent is the coded size of the decode operations.
-
srcBuffer is the source buffer that holds the encoded bitstream.
-
srcBufferOffset is the buffer offset where the valid encoded bitstream starts in srcBuffer. It must meet the alignment requirement minBitstreamBufferOffsetAlignment within VkVideoCapabilitiesKHR queried with the vkGetPhysicalDeviceVideoCapabilitiesKHR function.
-
srcBufferRange is the size of the region of srcBuffer that holds the valid encoded bitstream, starting from srcBufferOffset. It must meet the alignment requirement minBitstreamBufferSizeAlignment within VkVideoCapabilitiesKHR queried with the vkGetPhysicalDeviceVideoCapabilitiesKHR function.
-
dstPictureResource is the destination Decoded Output Picture Resource.
-
pSetupReferenceSlot is NULL or a pointer to a VkVideoReferenceSlotKHR structure used for generating a DPB reference slot and Picture Resource. pSetupReferenceSlot->slotIndex specifies the slot index number to use as a target for producing the DPB data. slotIndex must reference a valid entry as specified in VkVideoBeginCodingInfoKHR via the pReferenceSlots within the vkCmdBeginVideoCodingKHR command that established the Vulkan Video Decode Context for this command.
-
referenceSlotCount is the number of the DPB Reference Pictures that will be used when this decoding operation is executing.
-
pReferenceSlots is a pointer to an array of VkVideoReferenceSlotKHR structures specifying the DPB Reference pictures that will be used when this decoding operation is executing.
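The alignment requirements on srcBufferOffset and srcBufferRange can be checked, or satisfied by rounding up, with simple integer arithmetic. The helpers below are a hedged sketch: the function names are hypothetical, uint64_t stands in for VkDeviceSize, and the alignments are treated as arbitrary non-zero values rather than assumed to be powers of two. How any bytes added by rounding up the range must be filled is not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
 * Works for any non-zero alignment, not only powers of two. */
static uint64_t sketch_align_up(uint64_t value, uint64_t alignment)
{
    uint64_t rem;

    assert(alignment != 0);
    rem = value % alignment;
    return rem == 0 ? value : value + (alignment - rem);
}

/* Returns 1 if the offset and range satisfy the two queried alignments
 * (minBitstreamBufferOffsetAlignment and minBitstreamBufferSizeAlignment),
 * 0 otherwise. */
static int sketch_bitstream_range_ok(uint64_t srcBufferOffset,
                                     uint64_t srcBufferRange,
                                     uint64_t offsetAlignment,
                                     uint64_t sizeAlignment)
{
    assert(offsetAlignment != 0 && sizeAlignment != 0);
    return (srcBufferOffset % offsetAlignment == 0) &&
           (srcBufferRange % sizeAlignment == 0);
}
```

For example, with a size alignment of 256, a 1000-byte bitstream would need a srcBufferRange of sketch_align_up(1000, 256), which is 1024.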
The vkCmdDecodeVideoKHR flags are defined with the following enumeration:
// Provided by VK_KHR_video_decode_queue
typedef enum VkVideoDecodeFlagBitsKHR {
VK_VIDEO_DECODE_DEFAULT_KHR = 0,
VK_VIDEO_DECODE_RESERVED_0_BIT_KHR = 0x00000001,
} VkVideoDecodeFlagBitsKHR;
-
VK_VIDEO_DECODE_RESERVED_0_BIT_KHR is reserved for future use by the current version of the specification.
// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeFlagsKHR;
VkVideoDecodeFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoDecodeFlagBitsKHR.
39.6. Video Decode of AVC (ITU-T H.264)
This extension adds the H.264 codec-specific structures that a decode session needs to execute decode jobs, including the H.264 sequence header, picture parameter header, and quantization matrix. Unless otherwise noted, all references to the H.264 specification are to the 2010 edition published by the ITU-T, dated March 2010. This specification is available at https://www.itu.int/rec/T-REC-H.264.
39.6.1. H.264 decode profile
An H.264 decode profile is specified using VkVideoDecodeH264ProfileEXT chained to VkVideoProfileKHR when the codec operation in VkVideoProfileKHR is VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_EXT.
The VkVideoDecodeH264ProfileEXT structure is defined as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264ProfileEXT {
VkStructureType sType;
const void* pNext;
StdVideoH264ProfileIdc stdProfileIdc;
VkVideoDecodeH264PictureLayoutFlagsEXT pictureLayout;
} VkVideoDecodeH264ProfileEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
stdProfileIdc is a StdVideoH264ProfileIdc value specifying the H.264 codec profile IDC.
-
pictureLayout is a bitmask of VkVideoDecodeH264PictureLayoutFlagBitsEXT specifying the layout of the decoded picture’s contents depending on the nature (progressive vs. interlaced) of the input content.
Note
When passing |
// Provided by VK_EXT_video_decode_h264
typedef VkFlags VkVideoDecodeH264PictureLayoutFlagsEXT;
VkVideoDecodeH264PictureLayoutFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoDecodeH264PictureLayoutFlagBitsEXT.
The H.264 video decode picture layout flags are defined with the following enum:
// Provided by VK_EXT_video_decode_h264
typedef enum VkVideoDecodeH264PictureLayoutFlagBitsEXT {
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_EXT = 0,
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_EXT = 0x00000001,
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_EXT = 0x00000002,
} VkVideoDecodeH264PictureLayoutFlagBitsEXT;
-
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_EXT specifies support for progressive content. This flag has the value 0.
-
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_EXT specifies support for or use of a picture layout for interlaced content where all lines belonging to the first field are decoded to the even-numbered lines within the picture resource, and all lines belonging to the second field are decoded to the odd-numbered lines within the picture resource.
-
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_EXT specifies support for or use of a picture layout for interlaced content where all lines belonging to the first field are grouped together in a single plane, followed by another plane containing all lines belonging to the second field.
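The interleaved-lines layout amounts to a simple mapping between a line within a field and a row within the picture resource. The helpers below sketch that arithmetic only. The function names are hypothetical, and rows are assumed to be numbered from 0 so that the first field lands on even rows. The separate-planes layout is not modelled, because how the two planes are addressed is not spelled out in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row in the picture resource for line `fieldLine` of a field, in the
 * interleaved-lines layout: first field on even rows, second on odd rows. */
static uint32_t sketch_interleaved_row(uint32_t fieldLine, int secondField)
{
    return 2u * fieldLine + (secondField ? 1u : 0u);
}

/* Inverse mapping: which field and which line within that field a given
 * picture row belongs to. */
static void sketch_interleaved_split(uint32_t row, uint32_t *fieldLine,
                                     int *secondField)
{
    assert(fieldLine != NULL && secondField != NULL);
    *fieldLine = row / 2u;
    *secondField = (int)(row & 1u);
}
```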
39.6.2. Selecting an H.264 decode profile
When using vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for the input pVideoProfile with videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_EXT, a VkVideoDecodeH264ProfileEXT structure must be chained to VkVideoProfileKHR to select an H.264 decode profile.
If supported, the implementation returns the capabilities associated with
the specified H.264 decode profile.
The requirement is similar when querying supported image formats using
vkGetPhysicalDeviceVideoFormatPropertiesKHR.
A supported H.264 decode profile must be selected when creating a video session by chaining VkVideoDecodeH264ProfileEXT to the VkVideoProfileKHR field of VkVideoSessionCreateInfoKHR.
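"Chaining" a codec-specific profile means linking it through the pNext member of the structure it extends. The sketch below illustrates that mechanism with hypothetical stand-in types (SketchVideoProfile, SketchDecodeH264Profile) and made-up structure-type tags. Real code would use the Vulkan structures and their VkStructureType values instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure-type tags; real code uses VkStructureType values. */
enum { SKETCH_TYPE_VIDEO_PROFILE = 1, SKETCH_TYPE_DECODE_H264_PROFILE = 2 };

/* Leading members shared by every chainable structure in this sketch. */
typedef struct SketchBaseIn {
    uint32_t sType;
    const void *pNext;
} SketchBaseIn;

typedef struct SketchDecodeH264Profile {
    uint32_t sType;
    const void *pNext;
    uint32_t stdProfileIdc;
    uint32_t pictureLayout;
} SketchDecodeH264Profile;

typedef struct SketchVideoProfile {
    uint32_t sType;
    const void *pNext;
    uint32_t videoCodecOperation;
} SketchVideoProfile;

/* Link `h264` directly after `profile`, keeping anything already chained. */
static void sketch_chain_h264_profile(SketchVideoProfile *profile,
                                      SketchDecodeH264Profile *h264)
{
    assert(profile != NULL && h264 != NULL);
    profile->sType = SKETCH_TYPE_VIDEO_PROFILE;
    h264->sType = SKETCH_TYPE_DECODE_H264_PROFILE;
    h264->pNext = profile->pNext;
    profile->pNext = h264;
}

/* Walk a pNext chain and return the first structure with the given tag,
 * or NULL if the chain does not contain one. */
static const void *sketch_find_in_chain(const void *chain, uint32_t sType)
{
    const SketchBaseIn *p;

    for (p = (const SketchBaseIn *)chain; p != NULL;
         p = (const SketchBaseIn *)p->pNext) {
        if (p->sType == sType)
            return p;
    }
    return NULL;
}
```

The same linked profile would then be used for the capabilities query, the format query, and video session creation, as described above.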
39.6.3. Capabilities
The VkVideoDecodeH264CapabilitiesEXT
structure is defined as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264CapabilitiesEXT {
VkStructureType sType;
void* pNext;
uint32_t maxLevel;
VkOffset2D fieldOffsetGranularity;
VkExtensionProperties stdExtensionVersion;
} VkVideoDecodeH264CapabilitiesEXT;
When using vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for the input pVideoProfile with videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_EXT, a VkVideoDecodeH264CapabilitiesEXT structure must be chained to VkVideoCapabilitiesKHR to retrieve the capabilities specific to the H.264 decode profile.
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
maxLevel is the maximum H.264 level supported by the device.
-
fieldOffsetGranularity is the maximum field offset granularity supported for the picture resource, if Interlaced Video Content is supported.
-
stdExtensionVersion is a VkExtensionProperties structure specifying the H.264 extension name and version supported by this implementation.
39.6.4. Create Information
The VkVideoDecodeH264SessionCreateInfoEXT
structure is defined as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264SessionCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkVideoDecodeH264CreateFlagsEXT flags;
const VkExtensionProperties* pStdExtensionVersion;
} VkVideoDecodeH264SessionCreateInfoEXT;
A VkVideoDecodeH264SessionCreateInfoEXT
structure can be chained to
VkVideoSessionCreateInfoKHR when the function
vkCreateVideoSessionKHR is called to create a video session for H.264
decode.
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is reserved for future use.
-
pStdExtensionVersion is a pointer to a VkExtensionProperties structure specifying the H.264 codec extensions defined in StdVideoH264Extensions.
// Provided by VK_EXT_video_decode_h264
typedef VkFlags VkVideoDecodeH264CreateFlagsEXT;
VkVideoDecodeH264CreateFlagsEXT
is a bitmask type for setting a mask,
but is currently reserved for future use.
39.6.5. Decoder Parameter Sets
To reduce parameter traffic during decoding, the decoder parameter set object supports storing H.264 SPS/PPS parameter sets that may be later referenced during decoding.
The VkVideoDecodeH264SessionParametersCreateInfoEXT
structure is
defined as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264SessionParametersCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t maxSpsStdCount;
uint32_t maxPpsStdCount;
const VkVideoDecodeH264SessionParametersAddInfoEXT* pParametersAddInfo;
} VkVideoDecodeH264SessionParametersCreateInfoEXT;
A VkVideoDecodeH264SessionParametersCreateInfoEXT structure holding one H.264 SPS and at least one H.264 PPS parameter set must be chained to VkVideoSessionParametersCreateInfoKHR when calling vkCreateVideoSessionParametersKHR to store these parameter set(s) with the decoder parameter set object for later reference. The provided H.264 SPS/PPS parameters must be within the limits specified during decoder creation for the decoder specified in VkVideoSessionParametersCreateInfoKHR.
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
maxSpsStdCount is the maximum number of SPS parameters that the VkVideoSessionParametersKHR can contain.
-
maxPpsStdCount is the maximum number of PPS parameters that the VkVideoSessionParametersKHR can contain.
-
pParametersAddInfo is NULL or a pointer to a VkVideoDecodeH264SessionParametersAddInfoEXT structure specifying H.264 parameters to add upon object creation.
The VkVideoDecodeH264SessionParametersAddInfoEXT
structure is defined
as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264SessionParametersAddInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t spsStdCount;
const StdVideoH264SequenceParameterSet* pSpsStd;
uint32_t ppsStdCount;
const StdVideoH264PictureParameterSet* pPpsStd;
} VkVideoDecodeH264SessionParametersAddInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
spsStdCount is the number of SPS elements in pSpsStd. Its value must be less than or equal to the value of maxSpsStdCount.
-
pSpsStd is a pointer to an array of StdVideoH264SequenceParameterSet structures representing H.264 sequence parameter sets. Each element of the array must have a unique H.264 SPS ID.
-
ppsStdCount is the number of PPS provided in pPpsStd. Its value must be less than or equal to the value of maxPpsStdCount.
-
pPpsStd is a pointer to an array of StdVideoH264PictureParameterSet structures representing H.264 picture parameter sets. Each element of the array must have a unique H.264 SPS-PPS ID pair.
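The count limits and uniqueness rules above can be checked on the application side before the parameter sets are submitted. The following is a sketch under stated assumptions: SketchSps and SketchPps are hypothetical stand-ins that carry only the ID fields, named here after the H.264 syntax elements seq_parameter_set_id and pic_parameter_set_id, and sketch_add_info_ok is not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins carrying only the IDs needed for the uniqueness checks. */
typedef struct SketchSps {
    uint8_t seq_parameter_set_id;
} SketchSps;

typedef struct SketchPps {
    uint8_t seq_parameter_set_id;   /* SPS this PPS refers to */
    uint8_t pic_parameter_set_id;
} SketchPps;

/* Returns 1 if the counts respect the maxima and every SPS ID and every
 * SPS-PPS ID pair is unique within its array, 0 otherwise. */
static int sketch_add_info_ok(const SketchSps *sps, uint32_t spsStdCount,
                              uint32_t maxSpsStdCount,
                              const SketchPps *pps, uint32_t ppsStdCount,
                              uint32_t maxPpsStdCount)
{
    uint32_t i, j;

    assert(spsStdCount == 0 || sps != NULL);
    assert(ppsStdCount == 0 || pps != NULL);

    if (spsStdCount > maxSpsStdCount || ppsStdCount > maxPpsStdCount)
        return 0;

    /* Quadratic scans are fine for the small counts involved here. */
    for (i = 0; i < spsStdCount; ++i)
        for (j = 0; j < i; ++j)
            if (sps[i].seq_parameter_set_id == sps[j].seq_parameter_set_id)
                return 0;

    for (i = 0; i < ppsStdCount; ++i)
        for (j = 0; j < i; ++j)
            if (pps[i].seq_parameter_set_id == pps[j].seq_parameter_set_id &&
                pps[i].pic_parameter_set_id == pps[j].pic_parameter_set_id)
                return 0;

    return 1;
}
```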
39.6.6. Picture Decoding
To decode a picture, the structure VkVideoDecodeH264PictureInfoEXT may be chained to VkVideoDecodeInfoKHR when calling vkCmdDecodeVideoKHR.
The VkVideoDecodeH264PictureInfoEXT structure represents a picture decode operation and is defined as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264PictureInfoEXT {
VkStructureType sType;
const void* pNext;
const StdVideoDecodeH264PictureInfo* pStdPictureInfo;
uint32_t slicesCount;
const uint32_t* pSlicesDataOffsets;
} VkVideoDecodeH264PictureInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
pStdPictureInfo is a pointer to a StdVideoDecodeH264PictureInfo structure specifying the codec standard specific picture information from the H.264 specification.
-
slicesCount is the number of slices in this picture.
-
pSlicesDataOffsets is a pointer to an array of slicesCount offsets indicating the start offset of each slice within the bitstream buffer.
The VkVideoDecodeH264DpbSlotInfoEXT structure correlates a DPB Slot index with codec-specific information and is defined as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264DpbSlotInfoEXT {
VkStructureType sType;
const void* pNext;
const StdVideoDecodeH264ReferenceInfo* pStdReferenceInfo;
} VkVideoDecodeH264DpbSlotInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
pStdReferenceInfo is a pointer to a StdVideoDecodeH264ReferenceInfo structure specifying the codec standard specific picture reference information from the H.264 specification.
The VkVideoDecodeH264MvcEXT
structure is defined as:
// Provided by VK_EXT_video_decode_h264
typedef struct VkVideoDecodeH264MvcEXT {
VkStructureType sType;
const void* pNext;
const StdVideoDecodeH264Mvc* pStdMvc;
} VkVideoDecodeH264MvcEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
pStdMvc is a pointer to a StdVideoDecodeH264Mvc structure specifying H.264 codec specification information for MVC.
When the content type is H.264 MVC, a VkVideoDecodeH264MvcEXT structure must be chained to VkVideoDecodeH264PictureInfoEXT.
39.7. Video Decode of HEVC (ITU-T H.265)
This extension adds the H.265 codec-specific structures that a decode session needs to execute decode jobs, including the H.265 sequence header, picture parameter header, and quantization matrix. Unless otherwise noted, all references to the H.265 specification are to the 2013 edition published by the ITU-T, dated April 2013. This specification is available at https://www.itu.int/rec/T-REC-H.265.
39.7.1. H.265 decode profile
An H.265 decode profile is specified using VkVideoDecodeH265ProfileEXT chained to VkVideoProfileKHR when the codec operation in VkVideoProfileKHR is VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_EXT.
The VkVideoDecodeH265ProfileEXT structure is defined as:
// Provided by VK_EXT_video_decode_h265
typedef struct VkVideoDecodeH265ProfileEXT {
VkStructureType sType;
const void* pNext;
StdVideoH265ProfileIdc stdProfileIdc;
} VkVideoDecodeH265ProfileEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
stdProfileIdc is a StdVideoH265ProfileIdc value specifying the H.265 codec profile IDC.
39.7.2. Selecting an H.265 Profile
When using vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for the input pVideoProfile with videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_EXT, a VkVideoDecodeH265ProfileEXT structure must be chained to VkVideoProfileKHR to select an H.265 decode profile.
If supported, the implementation returns the capabilities associated with
the specified H.265 decode profile.
The requirement is similar when querying supported image formats using
vkGetPhysicalDeviceVideoFormatPropertiesKHR.
A supported H.265 decode profile must be selected when creating a video session by chaining VkVideoDecodeH265ProfileEXT to the VkVideoProfileKHR field of VkVideoSessionCreateInfoKHR.
39.7.3. Capabilities
The VkVideoDecodeH265CapabilitiesEXT
structure is defined as:
// Provided by VK_EXT_video_decode_h265
typedef struct VkVideoDecodeH265CapabilitiesEXT {
VkStructureType sType;
void* pNext;
uint32_t maxLevel;
VkExtensionProperties stdExtensionVersion;
} VkVideoDecodeH265CapabilitiesEXT;
When using vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities with videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_EXT, a VkVideoDecodeH265CapabilitiesEXT structure can be chained to VkVideoCapabilitiesKHR to return the capabilities specific to the H.265 extension.
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
maxLevel is the maximum H.265 level supported by the device.
-
stdExtensionVersion is a VkExtensionProperties structure specifying the H.265 extension name and version supported by this implementation.
39.7.4. Create Information
The VkVideoDecodeH265SessionCreateInfoEXT
structure is defined as:
// Provided by VK_EXT_video_decode_h265
typedef struct VkVideoDecodeH265SessionCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkVideoDecodeH265CreateFlagsEXT flags;
const VkExtensionProperties* pStdExtensionVersion;
} VkVideoDecodeH265SessionCreateInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is reserved for future use.
-
pStdExtensionVersion is a pointer to a VkExtensionProperties structure specifying H.265 codec extensions.
A VkVideoDecodeH265SessionCreateInfoEXT structure can be chained to VkVideoSessionCreateInfoKHR when the function vkCreateVideoSessionKHR is called to create a video session for H.265 decode operations.
// Provided by VK_EXT_video_decode_h265
typedef VkFlags VkVideoDecodeH265CreateFlagsEXT;
VkVideoDecodeH265CreateFlagsEXT
is a bitmask type for setting a mask,
but is currently reserved for future use.
39.7.5. Decoder Parameter Sets
To reduce parameter traffic during decoding, the decoder parameter set object supports storing H.265 SPS/PPS parameter sets that may be later referenced during decoding.
The VkVideoDecodeH265SessionParametersCreateInfoEXT
structure is
defined as:
// Provided by VK_EXT_video_decode_h265
typedef struct VkVideoDecodeH265SessionParametersCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t maxSpsStdCount;
uint32_t maxPpsStdCount;
const VkVideoDecodeH265SessionParametersAddInfoEXT* pParametersAddInfo;
} VkVideoDecodeH265SessionParametersCreateInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
maxSpsStdCount is the maximum number of SPS parameters that the VkVideoSessionParametersKHR can contain.
-
maxPpsStdCount is the maximum number of PPS parameters that the VkVideoSessionParametersKHR can contain.
-
pParametersAddInfo is NULL or a pointer to a VkVideoDecodeH265SessionParametersAddInfoEXT structure specifying H.265 parameters to add upon object creation.
A VkVideoDecodeH265SessionParametersCreateInfoEXT structure holding one H.265 SPS and at least one H.265 PPS parameter set must be chained to VkVideoSessionParametersCreateInfoKHR when calling vkCreateVideoSessionParametersKHR to store these parameter set(s) with the decoder parameter set object for later reference.
The provided H.265 SPS/PPS parameters must be within the limits specified during decoder creation for the decoder specified in VkVideoSessionParametersCreateInfoKHR.
The VkVideoDecodeH265SessionParametersAddInfoEXT
structure is defined
as:
// Provided by VK_EXT_video_decode_h265
typedef struct VkVideoDecodeH265SessionParametersAddInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t spsStdCount;
const StdVideoH265SequenceParameterSet* pSpsStd;
uint32_t ppsStdCount;
const StdVideoH265PictureParameterSet* pPpsStd;
} VkVideoDecodeH265SessionParametersAddInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
spsStdCount is the number of SPS elements in pSpsStd. Its value must be less than or equal to the value of maxSpsStdCount.
-
pSpsStd is a pointer to an array of StdVideoH265SequenceParameterSet structures representing H.265 sequence parameter sets. Each element of the array must have a unique H.265 VPS-SPS ID pair.
-
ppsStdCount is the number of PPS provided in pPpsStd. Its value must be less than or equal to the value of maxPpsStdCount.
-
pPpsStd is a pointer to an array of StdVideoH265PictureParameterSet structures representing H.265 picture parameter sets. Each element of the array must have a unique H.265 VPS-SPS-PPS ID tuple.
39.7.6. Picture Parameters
The VkVideoDecodeH265PictureInfoEXT
structure is defined as:
// Provided by VK_EXT_video_decode_h265
typedef struct VkVideoDecodeH265PictureInfoEXT {
VkStructureType sType;
const void* pNext;
StdVideoDecodeH265PictureInfo* pStdPictureInfo;
uint32_t slicesCount;
const uint32_t* pSlicesDataOffsets;
} VkVideoDecodeH265PictureInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
pStdPictureInfo is a pointer to a StdVideoDecodeH265PictureInfo structure specifying codec standard specific picture information from the H.265 specification.
-
slicesCount is the number of slices in this picture.
-
pSlicesDataOffsets is a pointer to an array of slicesCount offsets indicating the start offset of each slice within the bitstream buffer.
The VkVideoDecodeH265DpbSlotInfoEXT
structure is defined as:
// Provided by VK_EXT_video_decode_h265
typedef struct VkVideoDecodeH265DpbSlotInfoEXT {
VkStructureType sType;
const void* pNext;
const StdVideoDecodeH265ReferenceInfo* pStdReferenceInfo;
} VkVideoDecodeH265DpbSlotInfoEXT;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
pStdReferenceInfo is a pointer to a StdVideoDecodeH265ReferenceInfo structure specifying the codec standard specific picture reference information from the H.265 specification.
39.8. Video Encode Operations
Before the application can start recording Vulkan command buffers for Video Encode operations, it must do the following:
-
Ensure that the implementation can encode the Video Content by querying the supported codec operations and profiles using vkGetPhysicalDeviceQueueFamilyProperties2.
-
By using vkGetPhysicalDeviceVideoFormatPropertiesKHR and providing one or more video profiles, choose the Vulkan formats supported by the implementation. The formats for input and reference pictures must be queried and chosen separately. Refer to the section on enumeration of supported video formats.
-
Before creating an image to be used as a video picture resource, obtain the supported image creation parameters by querying with vkGetPhysicalDeviceFormatProperties2 and vkGetPhysicalDeviceImageFormatProperties2 using one of the reported formats and adding VkVideoProfilesKHR to the pNext chain of VkFormatProperties2. When querying the parameters with vkGetPhysicalDeviceImageFormatProperties2 for images targeting input and reference (DPB) pictures, the VkPhysicalDeviceImageFormatInfo2::usage field should contain VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR and VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR, respectively.
-
Create none, some, or all of the required images for the input and reference pictures. More Video Picture Resources can be created at some later point if needed while processing the content to be encoded. Also, if the size of the picture to be encoded is expected to change, the images can be created based on the maximum expected content size.
-
Create the video session to be used for video encode operations. Before creating the Encode Video Session, the encode capabilities should be queried with vkGetPhysicalDeviceVideoCapabilitiesKHR to obtain the limits of the parameters allowed by the implementation for a particular codec profile.
-
Bind memory resources with the encode video session by calling vkBindVideoSessionMemoryKHR. The video session cannot be used until memory resources are allocated and bound to it. In order to determine the required memory sizes and heap types of the device memory allocations, vkGetVideoSessionMemoryRequirementsKHR should be called.
-
Create one or more Session Parameter objects for use across command buffer recording operations, if required by the codec extension in use. These objects must be created against a video session with the parameters required by the codec. Each Session Parameter object created is a child object of the associated Session object and cannot be bound in the command buffer with any other Session Object.
The recording of Video Encode Commands against a Vulkan Command Buffer consists of the following sequence:
-
vkCmdBeginVideoCodingKHR starts the recording of one or more Video Encode operations in the command buffer. For each Video Encode Command operation, a Video Session must be bound to the command buffer within this command. This command establishes a Vulkan Video Encode Context that consists of the bound Video Session Object, Session Parameters Object, and the required Video Picture Resources. The established Video Encode Context is in effect until the vkCmdEndVideoCodingKHR command is recorded. If more Video Encode operations are required after the vkCmdEndVideoCodingKHR command, another Video Encode Context can be started with the vkCmdBeginVideoCodingKHR command.
-
vkCmdEncodeVideoKHR specifies one or more frames to be encoded. The VkVideoEncodeInfoKHR parameters, and the codec extension structures chained to this, specify the details of the encode operation.
-
vkCmdControlVideoCodingKHR records operations against the encoded data, encoding device, or the Video Session state.
-
vkCmdEndVideoCodingKHR signals the end of the recording of the Vulkan Video Encode Context, as established by vkCmdBeginVideoCodingKHR.
In addition to the above, the following commands can be recorded between vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR:
-
Query operations
-
Global Memory Barriers
-
Buffer Memory Barriers
-
Image Memory Barriers (these must be used to transition the Video Picture Resources to the proper VK_IMAGE_LAYOUT_VIDEO_ENCODE_SRC_KHR and VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR layouts).
-
Pipeline Barriers
-
Events
-
Timestamps
-
Device Groups (device mask)
The following Video Encode related commands must be recorded outside the Vulkan Video Encode Context established with the vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR commands:
- Sparse Memory Binding
- Copy Commands
- Clear Commands
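The recording rules above can be sketched as a small ordering check. This is only an illustration under stated assumptions: the CmdKind enumeration and encode_sequence_is_valid are hypothetical names that model a recorded command stream, not part of the Vulkan API, and only a subset of the listed command categories is modeled. The sketch also assumes that a Video Encode Context cannot be started while another one is still in effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command categories standing in for recorded Vulkan commands. */
typedef enum CmdKind {
    CMD_BEGIN_VIDEO_CODING,   /* vkCmdBeginVideoCodingKHR */
    CMD_ENCODE_VIDEO,         /* vkCmdEncodeVideoKHR */
    CMD_CONTROL_VIDEO_CODING, /* vkCmdControlVideoCodingKHR */
    CMD_END_VIDEO_CODING,     /* vkCmdEndVideoCodingKHR */
    CMD_PIPELINE_BARRIER,     /* allowed inside or outside the context */
    CMD_QUERY,                /* allowed inside or outside the context */
    CMD_COPY,                 /* must be outside the context */
    CMD_CLEAR                 /* must be outside the context */
} CmdKind;

/* Returns true if the sequence respects the begin/end scoping rules. */
static bool encode_sequence_is_valid(const CmdKind *cmds, size_t count)
{
    assert(cmds != NULL || count == 0);
    bool in_scope = false;
    for (size_t i = 0; i < count; ++i) {
        switch (cmds[i]) {
        case CMD_BEGIN_VIDEO_CODING:
            if (in_scope) return false; /* assumption: no nested contexts */
            in_scope = true;
            break;
        case CMD_END_VIDEO_CODING:
            if (!in_scope) return false;
            in_scope = false;
            break;
        case CMD_ENCODE_VIDEO:
        case CMD_CONTROL_VIDEO_CODING:
            if (!in_scope) return false; /* only valid inside a context */
            break;
        case CMD_COPY:
        case CMD_CLEAR:
            if (in_scope) return false;  /* only valid outside a context */
            break;
        case CMD_PIPELINE_BARRIER:
        case CMD_QUERY:
            break;                       /* valid in both places */
        }
    }
    return !in_scope; /* every context must be closed */
}
```

A sequence such as copy, begin, control, barrier, encode, end, clear passes this check, while a copy recorded between begin and end does not.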
39.8.1. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with pVideoProfile->videoCodecOperation specified as one of the encode operation bits, the VkVideoEncodeCapabilitiesKHR structure must be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve capabilities specific to video encoding.
The VkVideoEncodeCapabilitiesKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeCapabilitiesKHR {
VkStructureType sType;
const void* pNext;
VkVideoEncodeCapabilityFlagsKHR flags;
VkVideoEncodeRateControlModeFlagsKHR rateControlModes;
uint8_t rateControlLayerCount;
uint8_t qualityLevelCount;
VkExtent2D inputImageDataFillAlignment;
} VkVideoEncodeCapabilitiesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- flags is a bitmask of VkVideoEncodeCapabilityFlagBitsKHR describing supported encoding features.
- rateControlModes is a bitmask of VkVideoEncodeRateControlModeFlagBitsKHR describing supported rate control modes. All implementations must support VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
- rateControlLayerCount reports the maximum number of rate control layers supported. Implementations must report at least 1.
- qualityLevelCount is the number of discrete quality levels supported. Implementations must report at least 1.
- inputImageDataFillAlignment reports alignment of data that should be filled in the input image horizontally and vertically in pixels before encode operations are performed on the input image.
The input content and encode resolution (specified in VkVideoEncodeInfoKHR::codedExtent) may not be aligned with the codec-specific coding block size. For example, the input content may be 1920x1080 and the coding block size may be 16x16 pixel blocks. In this example, the content is horizontally aligned with the coding block size, but not vertically aligned with the coding block size. Encoding of the last row of blocks may be impacted by contents of the input image in pixel rows 1081 to 1088 (the next vertical alignment with the coding block size). In general, to ensure efficient encoding for the last row/column of blocks, and/or to ensure consistent encoding results between repeated encoding of the same input content, these extra pixel rows/columns should be filled to known values up to the coding block size alignment before encoding operations are performed.
Some implementations support performing auto-fill of unaligned pixels beyond a specific alignment, which is reported in inputImageDataFillAlignment. For example, if an implementation reports 1x1 in inputImageDataFillAlignment, then the implementation will perform auto-fill for any unaligned pixels beyond the encode resolution up to the next coding block size. For a coding block size of 16x16, if the implementation reports 16x16 in inputImageDataFillAlignment, then it is the application’s responsibility to fill any unaligned pixels, if desired. If not, it may impact the encoding efficiency, but it will not affect the validity of the generated bitstream. If the implementation reports 8x8 in inputImageDataFillAlignment, then for the 1920x1080 example, since the content is aligned to 8 pixels vertically, the implementation will auto-fill pixel rows 1081 to 1088 (up to the 16x16 coding block size in the example).
The auto-fill value(s) are implementation-specific. The auto-fill value(s) are not written to the input image memory, but are used as part of the encoding operation on the input image.
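The fill-alignment arithmetic can be illustrated with a minimal sketch. The helper names align_up and app_fill_pixels are hypothetical, and the sketch reflects one reading of the text: the implementation auto-fills from the next inputImageDataFillAlignment boundary up to the next coding block boundary, and the application is responsible for any pixels before that boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (alignment must be non-zero). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/*
 * Hypothetical helper: number of pixels along one dimension that the
 * application would need to fill itself before encoding.
 *   coded      - encode resolution along this dimension (from codedExtent)
 *   block      - codec coding block size along this dimension (for example 16)
 *   fill_align - value reported in inputImageDataFillAlignment
 */
static uint32_t app_fill_pixels(uint32_t coded, uint32_t block, uint32_t fill_align)
{
    uint32_t block_end = align_up(coded, block);            /* end of last coding block */
    uint32_t auto_fill_start = align_up(coded, fill_align); /* where auto-fill begins */
    if (auto_fill_start > block_end)
        auto_fill_start = block_end;
    return auto_fill_start - coded;
}
```

For the 1080-row example with 16x16 blocks, this yields 0 rows for a reported alignment of 1 or 8 and 8 rows for a reported alignment of 16, which matches the three cases described in the text.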
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeCapabilityFlagsKHR;
VkVideoEncodeCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeCapabilityFlagBitsKHR.
Bits which may be set in VkVideoEncodeCapabilitiesKHR::flags, indicating the encoding tools supported, are:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeCapabilityFlagBitsKHR {
VK_VIDEO_ENCODE_CAPABILITY_DEFAULT_KHR = 0,
VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR = 0x00000001,
} VkVideoEncodeCapabilityFlagBitsKHR;
- VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR reports that the implementation supports use of VkVideoEncodeInfoKHR::precedingExternallyEncodedBytes.
39.8.2. Video Encode Vulkan Command Buffer Commands
To launch an encode operation that results in bitstream generation, call:
// Provided by VK_KHR_video_encode_queue
void vkCmdEncodeVideoKHR(
VkCommandBuffer commandBuffer,
const VkVideoEncodeInfoKHR* pEncodeInfo);
- commandBuffer is the command buffer in which the encode operation that generates the bitstream is recorded.
- pEncodeInfo is a pointer to a VkVideoEncodeInfoKHR structure.
The VkVideoEncodeInfoKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoEncodeFlagsKHR flags;
uint32_t qualityLevel;
VkExtent2D codedExtent;
VkBuffer dstBitstreamBuffer;
VkDeviceSize dstBitstreamBufferOffset;
VkDeviceSize dstBitstreamBufferMaxRange;
VkVideoPictureResourceKHR srcPictureResource;
const VkVideoReferenceSlotKHR* pSetupReferenceSlot;
uint32_t referenceSlotCount;
const VkVideoReferenceSlotKHR* pReferenceSlots;
uint32_t precedingExternallyEncodedBytes;
} VkVideoEncodeInfoKHR;
- sType is the type of this structure.
- pNext is a pointer to a structure extending this structure. A codec-specific extension structure must be chained to specify what bitstream unit to generate with this encode operation.
- flags is a bitmask of VkVideoEncodeFlagBitsKHR specifying encode flags, and is reserved for future versions of this specification.
- qualityLevel is the coding quality level of the encoding. It is defined by the codec-specific extensions.
- codedExtent is the coded size of the encode operations.
- dstBitstreamBuffer is the buffer where the encoded bitstream output will be produced.
- dstBitstreamBufferOffset is the offset in the dstBitstreamBuffer where the encoded bitstream output will start. dstBitstreamBufferOffset’s value must be aligned to VkVideoCapabilitiesKHR::minBitstreamBufferOffsetAlignment, as reported by the implementation.
- dstBitstreamBufferMaxRange is the maximum size of the dstBitstreamBuffer that can be used while the encoded bitstream output is produced. dstBitstreamBufferMaxRange’s value must be aligned to VkVideoCapabilitiesKHR::minBitstreamBufferSizeAlignment, as reported by the implementation.
- srcPictureResource is the Picture Resource of the Input Picture to be encoded by the operation.
- pSetupReferenceSlot is a pointer to a VkVideoReferenceSlotKHR structure used for generating a reconstructed reference slot and Picture Resource. pSetupReferenceSlot->slotIndex specifies the slot index number to use as a target for producing the Reconstructed (DPB) data. pSetupReferenceSlot must be one of the entries provided in VkVideoBeginCodingInfoKHR via the pReferenceSlots within the vkCmdBeginVideoCodingKHR command that established the Vulkan Video Encode Context for this command.
- referenceSlotCount is the number of Reconstructed Reference Pictures that will be used when this encoding operation is executing.
- pReferenceSlots is NULL or a pointer to an array of VkVideoReferenceSlotKHR structures that will be used when this encoding operation is executing. Each entry in pReferenceSlots must be one of the entries provided in VkVideoBeginCodingInfoKHR via the pReferenceSlots within the vkCmdBeginVideoCodingKHR command that established the Vulkan Video Encode Context for this command.
- precedingExternallyEncodedBytes is the number of bytes externally encoded for insertion in the active video encode session overall bitstream prior to the bitstream that will be generated by the implementation for this instance of VkVideoEncodeInfoKHR. Valid when VkVideoEncodeRateControlInfoKHR::rateControlMode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR. The value provided is used to update the implementation’s rate control algorithm for the rate control layer this instance of VkVideoEncodeInfoKHR belongs to, by accounting for the bitrate budget consumed by these externally encoded bytes. See VkVideoEncodeRateControlInfoKHR for additional information about encode rate control.
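The alignment requirements on dstBitstreamBufferOffset and dstBitstreamBufferMaxRange can be satisfied with simple rounding. The following is a minimal sketch, assuming the two alignment values have already been read from VkVideoCapabilitiesKHR; the BitstreamRange type and pick_bitstream_range function are hypothetical names, and plain uint64_t stands in for VkDeviceSize.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment. */
static uint64_t align_up_u64(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Round value down to the previous multiple of alignment. */
static uint64_t align_down_u64(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0);
    return value / alignment * alignment;
}

/* Hypothetical result type: values for the offset and max range members. */
typedef struct BitstreamRange {
    uint64_t offset;    /* candidate dstBitstreamBufferOffset */
    uint64_t max_range; /* candidate dstBitstreamBufferMaxRange */
} BitstreamRange;

/*
 * Picks the first aligned offset at or after wanted_offset and the largest
 * aligned range that still fits inside a buffer of buffer_size bytes.
 * max_range is 0 if nothing fits.
 */
static BitstreamRange pick_bitstream_range(uint64_t wanted_offset,
                                           uint64_t buffer_size,
                                           uint64_t offset_alignment,
                                           uint64_t size_alignment)
{
    BitstreamRange r;
    r.offset = align_up_u64(wanted_offset, offset_alignment);
    r.max_range = 0;
    if (r.offset < buffer_size)
        r.max_range = align_down_u64(buffer_size - r.offset, size_alignment);
    return r;
}
```

Rounding the offset up and the range down keeps both values aligned while never extending past the end of the buffer.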
Multiple vkCmdEncodeVideoKHR commands may be recorded within a Vulkan Video Encode Context. The execution of each vkCmdEncodeVideoKHR command will result in generating codec-specific bitstream units. These bitstream units are generated consecutively into the bitstream buffer specified in dstBitstreamBuffer of the VkVideoEncodeInfoKHR structure passed to each vkCmdEncodeVideoKHR command recorded within the Vulkan Video Encode Context established by vkCmdBeginVideoCodingKHR. The produced bitstream is the concatenation of all these bitstream units, including any padding between the bitstream units. Any bitstream padding must be filled with data compliant to the codec standard so as not to cause any syntax errors during decoding of the bitstream units with the padding included. The range of the bitstream buffer written can be queried via video encode bitstream buffer range queries.
The vkCmdEncodeVideoKHR flags are defined with the following enumeration:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeFlagBitsKHR {
VK_VIDEO_ENCODE_DEFAULT_KHR = 0,
VK_VIDEO_ENCODE_RESERVED_0_BIT_KHR = 0x00000001,
} VkVideoEncodeFlagBitsKHR;
- VK_VIDEO_ENCODE_RESERVED_0_BIT_KHR is reserved for future use by the current version of the specification.
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeFlagsKHR;
VkVideoEncodeFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeFlagBitsKHR.
The VkVideoEncodeRateControlInfoKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeRateControlInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoEncodeRateControlFlagsKHR flags;
VkVideoEncodeRateControlModeFlagBitsKHR rateControlMode;
uint8_t layerCount;
const VkVideoEncodeRateControlLayerInfoKHR* pLayerConfigs;
} VkVideoEncodeRateControlInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- flags is a bitmask of VkVideoEncodeRateControlFlagBitsKHR specifying encode rate control flags.
- rateControlMode is a VkVideoEncodeRateControlModeFlagBitsKHR value specifying the encode stream rate control mode.
- layerCount specifies the number of rate control layers in the video encode stream.
- pLayerConfigs is a pointer to an array of VkVideoEncodeRateControlLayerInfoKHR structures specifying the rate control configurations of layerCount rate control layers.
In order to provide video encode stream rate control settings, add a VkVideoEncodeRateControlInfoKHR structure to the pNext chain of the VkVideoCodingControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command.
A codec-specific extension structure for further encode stream rate control parameter settings may be chained to VkVideoEncodeRateControlInfoKHR.
To ensure that the video session is properly initialized with stream-level rate control settings, the application must call vkCmdControlVideoCodingKHR with stream-level rate control settings at least once in execution order before the first vkCmdEncodeVideoKHR command that is executed after video session reset. If not provided, default implementation-specific stream rate control settings will be used.
Stream rate control settings can also be re-initialized during an active video encoding session. The re-initialization takes effect whenever the VkVideoEncodeRateControlInfoKHR structure is included in the pNext chain of the VkVideoCodingControlInfoKHR structure in the call to vkCmdControlVideoCodingKHR, and only impacts vkCmdEncodeVideoKHR operations that follow in execution order.
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeRateControlFlagsKHR;
VkVideoEncodeRateControlFlagsKHR is a bitmask type for setting a mask, but is currently reserved for future use.
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeRateControlFlagBitsKHR {
VK_VIDEO_ENCODE_RATE_CONTROL_DEFAULT_KHR = 0,
VK_VIDEO_ENCODE_RATE_CONTROL_RESERVED_0_BIT_KHR = 0x00000001,
} VkVideoEncodeRateControlFlagBitsKHR;
VkVideoEncodeRateControlFlagBitsKHR defines bits which may be set in a VkVideoEncodeRateControlFlagsKHR value, but is currently unused.
The rate control modes are defined with the following enums:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeRateControlModeFlagBitsKHR {
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR = 0,
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR = 1,
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR = 2,
} VkVideoEncodeRateControlModeFlagBitsKHR;
- VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR disables rate control.
- VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR selects constant bitrate rate control mode.
- VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR selects variable bitrate rate control mode.
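Because the NONE mode has the value 0 in the enumeration above, a plain non-zero test against VkVideoEncodeCapabilitiesKHR::rateControlModes cannot detect it. The following minimal sketch shows one way an application might test for a mode; the RC_MODE_* constants are local copies of the enumerant values and rate_control_mode_supported is a hypothetical helper, not an API function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the VkVideoEncodeRateControlModeFlagBitsKHR values. */
enum {
    RC_MODE_NONE = 0,
    RC_MODE_CBR  = 1,
    RC_MODE_VBR  = 2
};

/*
 * Returns true if mode is present in the reported rateControlModes mask.
 * Comparing (mask & mode) against mode, rather than against zero, makes
 * RC_MODE_NONE (value 0) always supported, which is consistent with the
 * requirement that all implementations support it.
 */
static bool rate_control_mode_supported(uint32_t rate_control_modes, uint32_t mode)
{
    assert(mode == RC_MODE_NONE || mode == RC_MODE_CBR || mode == RC_MODE_VBR);
    return (rate_control_modes & mode) == mode;
}
```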
The VkVideoEncodeRateControlLayerInfoKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeRateControlLayerInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t averageBitrate;
uint32_t maxBitrate;
uint32_t frameRateNumerator;
uint32_t frameRateDenominator;
uint32_t virtualBufferSizeInMs;
uint32_t initialVirtualBufferSizeInMs;
} VkVideoEncodeRateControlLayerInfoKHR;
- sType is the type of this structure.
- pNext is a pointer to a structure extending this structure.
- averageBitrate is the average bitrate in bits/second. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
- maxBitrate is the peak bitrate in bits/second. Valid when rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR.
- frameRateNumerator is the numerator of the frame rate. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
- frameRateDenominator is the denominator of the frame rate. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
- virtualBufferSizeInMs is the leaky bucket model virtual buffer size in milliseconds, with respect to peak bitrate. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR. For example, virtual buffer size is (virtualBufferSizeInMs * maxBitrate / 1000).
- initialVirtualBufferSizeInMs is the initial occupancy in milliseconds of the virtual buffer in the leaky bucket model. Valid when the rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
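The virtual buffer size formula (virtualBufferSizeInMs * maxBitrate / 1000) overflows 32-bit arithmetic for realistic bitrates, so a worked example is useful. The following is a minimal sketch with hypothetical helper names; the per-frame budget helper is an additional illustration derived from averageBitrate and the frame rate fraction, not something defined by the structure itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Virtual buffer size in bits for the leaky bucket model:
 * virtualBufferSizeInMs * maxBitrate / 1000, widened to 64 bits so that
 * the intermediate product cannot overflow.
 */
static uint64_t virtual_buffer_size_bits(uint32_t virtual_buffer_size_in_ms,
                                         uint32_t max_bitrate)
{
    return (uint64_t)virtual_buffer_size_in_ms * max_bitrate / 1000u;
}

/*
 * Average number of bits available per frame:
 * averageBitrate / (frameRateNumerator / frameRateDenominator).
 */
static uint64_t average_bits_per_frame(uint32_t average_bitrate,
                                       uint32_t frame_rate_numerator,
                                       uint32_t frame_rate_denominator)
{
    assert(frame_rate_numerator != 0);
    return (uint64_t)average_bitrate * frame_rate_denominator / frame_rate_numerator;
}
```

For instance, a 2000 ms buffer at a peak bitrate of 5,000,000 bits/second corresponds to 10,000,000 bits, and an average bitrate of 6,000,000 bits/second at 30/1 frames per second leaves 200,000 bits per frame on average.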
A codec-specific structure specifying additional per-layer rate control settings must be chained to VkVideoEncodeRateControlLayerInfoKHR. If multiple rate control layers are enabled (VkVideoEncodeRateControlInfoKHR::layerCount is greater than 1), then the chained codec-specific extension structure also identifies the specific video coding layer its parent VkVideoEncodeRateControlLayerInfoKHR applies to. If multiple rate control layers are enabled, the number of rate control layers must match the number of video coding layers. The specification for an encode codec-specific extension would describe how multiple video coding layers are enabled for the corresponding codec.
Per-layer rate control settings for all enabled rate control layers must be initialized or re-initialized whenever stream rate control settings are provided via VkVideoEncodeRateControlInfoKHR. This is done by specifying settings for all enabled rate control layers in VkVideoEncodeRateControlInfoKHR::pLayerConfigs.
To adjust rate control settings for an individual layer at runtime, add a VkVideoEncodeRateControlLayerInfoKHR structure to the pNext chain of the VkVideoCodingControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command. This adjustment only impacts the specified layer without impacting the rate control settings or implementation rate control algorithm behavior for any other enabled rate control layers. The adjustment takes effect whenever the corresponding vkCmdControlVideoCodingKHR is executed, and only impacts vkCmdEncodeVideoKHR operations pertaining to the corresponding video coding layer that follow in execution order.
It is possible for an application to enable multiple video coding layers (via codec-specific extensions to encoding operations) while only enabling a single layer of rate control for the entire video stream. To achieve this, layerCount in VkVideoEncodeRateControlInfoKHR must be set to 1, and the single VkVideoEncodeRateControlLayerInfoKHR provided in pLayerConfigs would apply to all encoded segments of the video stream, regardless of which codec-defined video coding layer they belong to. In this case, the implementation decides bitrate distribution across video coding layers (if applicable to the specified stream rate control mode).
39.9. Encode H.264
This extension adds H.264 codec specific structures/types needed to support H.264 encoding. Unless otherwise noted, all references to the H.264 specification are to the 2010 edition published by the ITU-T, dated March 2010. This specification is available at https://www.itu.int/rec/T-REC-H.264.
39.9.1. H.264 encode profile
The VkVideoEncodeH264ProfileEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264ProfileEXT {
VkStructureType sType;
const void* pNext;
StdVideoH264ProfileIdc stdProfileIdc;
} VkVideoEncodeH264ProfileEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- stdProfileIdc is a StdVideoH264ProfileIdc value specifying the H.264 codec profile IDC.
An H.264 encode profile is specified by including a VkVideoEncodeH264ProfileEXT structure in the pNext chain of the VkVideoProfileKHR structure when VkVideoProfileKHR::videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT.
39.9.2. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with pVideoProfile->videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT, the VkVideoEncodeH264CapabilitiesEXT structure must be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve more capabilities specific to H.264 video encoding.
The VkVideoEncodeH264CapabilitiesEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264CapabilitiesEXT {
VkStructureType sType;
const void* pNext;
VkVideoEncodeH264CapabilityFlagsEXT flags;
VkVideoEncodeH264InputModeFlagsEXT inputModeFlags;
VkVideoEncodeH264OutputModeFlagsEXT outputModeFlags;
uint8_t maxPPictureL0ReferenceCount;
uint8_t maxBPictureL0ReferenceCount;
uint8_t maxL1ReferenceCount;
VkBool32 motionVectorsOverPicBoundariesFlag;
uint32_t maxBytesPerPicDenom;
uint32_t maxBitsPerMbDenom;
uint32_t log2MaxMvLengthHorizontal;
uint32_t log2MaxMvLengthVertical;
VkExtensionProperties stdExtensionVersion;
} VkVideoEncodeH264CapabilitiesEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- flags is a bitmask of VkVideoEncodeH264CapabilityFlagBitsEXT describing supported encoding tools.
- inputModeFlags is a bitmask of VkVideoEncodeH264InputModeFlagBitsEXT describing supported command buffer input granularities/modes.
- outputModeFlags is a bitmask of VkVideoEncodeH264OutputModeFlagBitsEXT describing supported output (bitstream size reporting) granularities/modes.
- maxPPictureL0ReferenceCount reports the maximum number of reference pictures the implementation supports in the reference list L0 for P pictures.
- maxBPictureL0ReferenceCount reports the maximum number of reference pictures the implementation supports in the reference list L0 for B pictures. The reported value is 0 if encoding of B pictures is not supported.
- maxL1ReferenceCount reports the maximum number of reference pictures the implementation supports in the reference list L1 if encoding of B pictures is supported. The reported value is 0 if encoding of B pictures is not supported.
- motionVectorsOverPicBoundariesFlag, if VK_TRUE, indicates motion_vectors_over_pic_boundaries_flag will be enabled if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags.
- maxBytesPerPicDenom reports the value that will be used for max_bytes_per_pic_denom if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags.
- maxBitsPerMbDenom reports the value that will be used for max_bits_per_mb_denom if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags.
- log2MaxMvLengthHorizontal reports the value that will be used for log2_max_mv_length_horizontal if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags.
- log2MaxMvLengthVertical reports the value that will be used for log2_max_mv_length_vertical if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags.
- stdExtensionVersion is the specific H.264 extension name and version supported by this implementation.
When vkGetPhysicalDeviceVideoCapabilitiesKHR is called to query the capabilities with parameter videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT, a VkVideoEncodeH264CapabilitiesEXT structure can be chained to VkVideoCapabilitiesKHR to retrieve H.264 extension specific capabilities.
// Provided by VK_EXT_video_encode_h264
typedef VkFlags VkVideoEncodeH264CapabilityFlagsEXT;
VkVideoEncodeH264CapabilityFlagsEXT is a bitmask type for setting a mask of zero or more VkVideoEncodeH264CapabilityFlagBitsEXT.
Bits which may be set in VkVideoEncodeH264CapabilitiesEXT::flags, indicating the encoding tools supported, are:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264CapabilityFlagBitsEXT {
VK_VIDEO_ENCODE_H264_CAPABILITY_DIRECT_8X8_INFERENCE_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H264_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H264_CAPABILITY_QPPRIME_Y_ZERO_TRANSFORM_BYPASS_BIT_EXT = 0x00000004,
VK_VIDEO_ENCODE_H264_CAPABILITY_SCALING_LISTS_BIT_EXT = 0x00000008,
VK_VIDEO_ENCODE_H264_CAPABILITY_HRD_COMPLIANCE_BIT_EXT = 0x00000010,
VK_VIDEO_ENCODE_H264_CAPABILITY_CHROMA_QP_OFFSET_BIT_EXT = 0x00000020,
VK_VIDEO_ENCODE_H264_CAPABILITY_SECOND_CHROMA_QP_OFFSET_BIT_EXT = 0x00000040,
VK_VIDEO_ENCODE_H264_CAPABILITY_PIC_INIT_QP_MINUS26_BIT_EXT = 0x00000080,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_BIT_EXT = 0x00000100,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_EXPLICIT_BIT_EXT = 0x00000200,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_IMPLICIT_BIT_EXT = 0x00000400,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT = 0x00000800,
VK_VIDEO_ENCODE_H264_CAPABILITY_TRANSFORM_8X8_BIT_EXT = 0x00001000,
VK_VIDEO_ENCODE_H264_CAPABILITY_CABAC_BIT_EXT = 0x00002000,
VK_VIDEO_ENCODE_H264_CAPABILITY_CAVLC_BIT_EXT = 0x00004000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_DISABLED_BIT_EXT = 0x00008000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_ENABLED_BIT_EXT = 0x00010000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_PARTIAL_BIT_EXT = 0x00020000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DISABLE_DIRECT_SPATIAL_MV_PRED_BIT_EXT = 0x00040000,
VK_VIDEO_ENCODE_H264_CAPABILITY_MULTIPLE_SLICE_PER_FRAME_BIT_EXT = 0x00080000,
VK_VIDEO_ENCODE_H264_CAPABILITY_SLICE_MB_COUNT_BIT_EXT = 0x00100000,
VK_VIDEO_ENCODE_H264_CAPABILITY_ROW_UNALIGNED_SLICE_BIT_EXT = 0x00200000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT = 0x00400000,
} VkVideoEncodeH264CapabilityFlagBitsEXT;
- VK_VIDEO_ENCODE_H264_CAPABILITY_DIRECT_8X8_INFERENCE_BIT_EXT reports if enabling direct_8x8_inference_flag in StdVideoH264SpsFlags is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT reports if enabling separate_colour_plane_flag in StdVideoH264SpsFlags is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_QPPRIME_Y_ZERO_TRANSFORM_BYPASS_BIT_EXT reports if enabling qpprime_y_zero_transform_bypass_flag in StdVideoH264SpsFlags is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_SCALING_LISTS_BIT_EXT reports if enabling seq_scaling_matrix_present_flag in StdVideoH264SpsFlags or pic_scaling_matrix_present_flag in StdVideoH264PpsFlags is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_HRD_COMPLIANCE_BIT_EXT reports if the implementation guarantees generating an HRD-compliant bitstream if nal_hrd_parameters_present_flag or vcl_hrd_parameters_present_flag is enabled in StdVideoH264SpsVuiFlags.
- VK_VIDEO_ENCODE_H264_CAPABILITY_CHROMA_QP_OFFSET_BIT_EXT reports if setting non-zero chroma_qp_index_offset in StdVideoH264PictureParameterSet is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_SECOND_CHROMA_QP_OFFSET_BIT_EXT reports if setting non-zero second_chroma_qp_index_offset in StdVideoH264PictureParameterSet is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_PIC_INIT_QP_MINUS26_BIT_EXT reports if setting non-zero pic_init_qp_minus26 in StdVideoH264PictureParameterSet is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_BIT_EXT reports if enabling weighted_pred_flag in StdVideoH264PpsFlags is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_EXPLICIT_BIT_EXT reports if using STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_EXPLICIT from StdVideoH264WeightedBipredIdc is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_IMPLICIT_BIT_EXT reports if using STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_IMPLICIT from StdVideoH264WeightedBipredIdc is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT reports that when weighted_pred_flag is enabled or STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_EXPLICIT from StdVideoH264WeightedBipredIdc is used, the implementation is able to internally decide syntax for pred_weight_table.
- VK_VIDEO_ENCODE_H264_CAPABILITY_TRANSFORM_8X8_BIT_EXT reports if enabling transform_8x8_mode_flag in StdVideoH264PpsFlags is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_CABAC_BIT_EXT reports if CABAC entropy coding is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_CAVLC_BIT_EXT reports if CAVLC entropy coding is supported. An implementation must support at least one entropy coding mode.
- VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_DISABLED_BIT_EXT reports if using STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_DISABLED from StdVideoH264DisableDeblockingFilterIdc is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_ENABLED_BIT_EXT reports if using STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_ENABLED from StdVideoH264DisableDeblockingFilterIdc is supported.
- VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_PARTIAL_BIT_EXT reports if using STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_PARTIAL from StdVideoH264DisableDeblockingFilterIdc is supported. An implementation must support at least one deblocking filter mode.
- VK_VIDEO_ENCODE_H264_CAPABILITY_DISABLE_DIRECT_SPATIAL_MV_PRED_BIT_EXT reports if disabling StdVideoEncodeH264SliceHeaderFlags::direct_spatial_mv_pred_flag is supported when it is present in the slice header.
- VK_VIDEO_ENCODE_H264_CAPABILITY_MULTIPLE_SLICE_PER_FRAME_BIT_EXT reports if encoding multiple slices per frame is supported. If not set, the implementation is only able to encode a single slice for the entire frame.
- VK_VIDEO_ENCODE_H264_CAPABILITY_SLICE_MB_COUNT_BIT_EXT reports support for configuring VkVideoEncodeH264NaluSliceEXT::mbCount and first_mb_in_slice in StdVideoEncodeH264SliceHeader for each slice in a frame with multiple slices. If not supported, the implementation decides the number of macroblocks in each slice based on VkVideoEncodeH264VclFrameInfoEXT::naluSliceEntryCount.
- VK_VIDEO_ENCODE_H264_CAPABILITY_ROW_UNALIGNED_SLICE_BIT_EXT reports that each slice in a frame with multiple slices may begin or finish at any offset in a macroblock row. If not supported, all slices in the frame must begin at the start of a macroblock row (and hence each slice must finish at the end of a macroblock row).
- VK_VIDEO_ENCODE_H264_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT reports that when VK_VIDEO_ENCODE_H264_CAPABILITY_MULTIPLE_SLICE_PER_FRAME_BIT_EXT is supported and a frame is encoded with multiple slices, the implementation allows encoding each slice with a different StdVideoEncodeH264SliceHeader::slice_type. If not supported, all slices of the frame must be encoded with the same slice_type which corresponds to the picture type of the frame. For example, all slices of a P-frame would be encoded as P-slices.
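Two of the descriptions above state minimum guarantees: at least one entropy coding mode and at least one deblocking filter mode are reported. A minimal sketch of how an application might sanity-check and use these bits follows. The CAP_* macros are local copies of the enumerant values listed above, and both functions are hypothetical helpers; preferring CABAC over CAVLC is an illustrative policy, not something the text requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from VkVideoEncodeH264CapabilityFlagBitsEXT. */
#define CAP_CABAC            0x00002000u
#define CAP_CAVLC            0x00004000u
#define CAP_DEBLOCK_DISABLED 0x00008000u
#define CAP_DEBLOCK_ENABLED  0x00010000u
#define CAP_DEBLOCK_PARTIAL  0x00020000u

/* Checks the two "at least one" guarantees on a reported flags value. */
static bool h264_caps_are_consistent(uint32_t flags)
{
    const uint32_t entropy = CAP_CABAC | CAP_CAVLC;
    const uint32_t deblock = CAP_DEBLOCK_DISABLED | CAP_DEBLOCK_ENABLED |
                             CAP_DEBLOCK_PARTIAL;
    return (flags & entropy) != 0 && (flags & deblock) != 0;
}

/* Illustrative policy: use CABAC when reported, otherwise fall back to CAVLC. */
static uint32_t pick_entropy_mode(uint32_t flags)
{
    assert((flags & (CAP_CABAC | CAP_CAVLC)) != 0);
    return (flags & CAP_CABAC) ? CAP_CABAC : CAP_CAVLC;
}
```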
// Provided by VK_EXT_video_encode_h264
typedef VkFlags VkVideoEncodeH264InputModeFlagsEXT;
VkVideoEncodeH264InputModeFlagsEXT is a bitmask type for setting a mask of zero or more VkVideoEncodeH264InputModeFlagBitsEXT.
The inputModeFlags field reports the various command buffer input granularities supported by the implementation as follows:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264InputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H264_INPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH264InputModeFlagBitsEXT;
- VK_VIDEO_ENCODE_H264_INPUT_MODE_FRAME_BIT_EXT indicates that a single command buffer must at least encode an entire frame. Any non-VCL NALUs must be encoded using the same command buffer as the frame if VK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT is not supported.
- VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT indicates that a single command buffer must at least encode a single slice. Any non-VCL NALUs must be encoded using the same command buffer as the first slice of the frame if VK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT is not supported.
- VK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT indicates that a single command buffer may encode a non-VCL NALU by itself.
An implementation must support at least one of VK_VIDEO_ENCODE_H264_INPUT_MODE_FRAME_BIT_EXT or VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT.
If VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT is not supported, the following two additional restrictions apply for frames encoded with multiple slices. First, all frame slices must have the same pRefList0ModOperations and the same pRefList1ModOperations. Second, the order in which slices appear in VkVideoEncodeH264VclFrameInfoEXT::pNaluSliceEntries or in the command buffer must match the placement order of the slices in the frame.
// Provided by VK_EXT_video_encode_h264
typedef VkFlags VkVideoEncodeH264OutputModeFlagsEXT;
VkVideoEncodeH264OutputModeFlagsEXT is a bitmask type for setting a mask of zero or more VkVideoEncodeH264OutputModeFlagBitsEXT.
Bits which may be set in VkVideoEncodeH264CapabilitiesEXT::outputModeFlags, indicating the minimum bitstream generation commands that must be included between each vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR pair (henceforth simply begin/end pair), are:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264OutputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_SLICE_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH264OutputModeFlagBitsEXT;
- VK_VIDEO_ENCODE_H264_OUTPUT_MODE_FRAME_BIT_EXT indicates that calls to generate all NALUs of a frame must be included within a single begin/end pair. Any non-VCL NALUs must be encoded within the same begin/end pair if VK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT is not supported.
- VK_VIDEO_ENCODE_H264_OUTPUT_MODE_SLICE_BIT_EXT indicates that each begin/end pair must encode at least one slice. Any non-VCL NALUs must be encoded within the same begin/end pair as the first slice of the frame if VK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT is not supported.
- VK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT indicates that each begin/end pair may encode only a non-VCL NALU by itself.
An implementation must support at least one of VK_VIDEO_ENCODE_H264_OUTPUT_MODE_FRAME_BIT_EXT or VK_VIDEO_ENCODE_H264_OUTPUT_MODE_SLICE_BIT_EXT.
A single begin/end pair must not encode more than a single frame.
The bitstreams of NALUs generated within a single begin/end pair are written contiguously into the same bitstream buffer (any padding between the NALUs must comply with the H.264 standard).
The supported input modes must be coarser than or equal to the supported output modes. For example, it is illegal to report that slice input is supported when only frame output is supported.
An implementation must report one of the following combinations of input/output modes:
- Input: Frame, Output: Frame
- Input: Frame, Output: Frame and Non-VCL
- Input: Frame, Output: Slice
- Input: Frame, Output: Slice and Non-VCL
- Input: Slice, Output: Slice
- Input: Slice, Output: Slice and Non-VCL
- Input: Frame and Non-VCL, Output: Frame and Non-VCL
- Input: Frame and Non-VCL, Output: Slice and Non-VCL
- Input: Slice and Non-VCL, Output: Slice and Non-VCL
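The list of permitted combinations can be checked mechanically. The following is a minimal sketch, not part of the API: the MODE_* constants, the ModePair type, and is_allowed_mode_pair are hypothetical names. The output bit values follow the VkVideoEncodeH264OutputModeFlagBitsEXT enum above, and the input bits are assumed to use the same layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local stand-ins for the mode bits. The output values match
 * the output mode enum above; the input values are ASSUMED to use the same
 * layout (frame = 0x1, slice = 0x2, non-VCL = 0x4). */
enum {
    MODE_FRAME   = 0x1,
    MODE_SLICE   = 0x2,
    MODE_NON_VCL = 0x4
};

typedef struct ModePair {
    uint32_t input;
    uint32_t output;
} ModePair;

/* The nine input/output combinations from the list, in the same order. */
static const ModePair kAllowedModePairs[] = {
    { MODE_FRAME,                MODE_FRAME },
    { MODE_FRAME,                MODE_FRAME | MODE_NON_VCL },
    { MODE_FRAME,                MODE_SLICE },
    { MODE_FRAME,                MODE_SLICE | MODE_NON_VCL },
    { MODE_SLICE,                MODE_SLICE },
    { MODE_SLICE,                MODE_SLICE | MODE_NON_VCL },
    { MODE_FRAME | MODE_NON_VCL, MODE_FRAME | MODE_NON_VCL },
    { MODE_FRAME | MODE_NON_VCL, MODE_SLICE | MODE_NON_VCL },
    { MODE_SLICE | MODE_NON_VCL, MODE_SLICE | MODE_NON_VCL },
};

/* Returns true if the reported flags form one of the listed pairs. */
static bool is_allowed_mode_pair(uint32_t inputModeFlags, uint32_t outputModeFlags)
{
    const size_t count = sizeof(kAllowedModePairs) / sizeof(kAllowedModePairs[0]);
    for (size_t i = 0; i < count; ++i) {
        if (kAllowedModePairs[i].input == inputModeFlags &&
            kAllowedModePairs[i].output == outputModeFlags) {
            return true;
        }
    }
    return false;
}
```

An application could run such a check on the values returned in VkVideoEncodeH264CapabilitiesEXT as a debugging aid. Slice input with frame-only output is rejected, matching the example given earlier.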
39.9.3. Create Information
The VkVideoEncodeH264SessionCreateInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264SessionCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkVideoEncodeH264CreateFlagsEXT flags;
VkExtent2D maxPictureSizeInMbs;
const VkExtensionProperties* pStdExtensionVersion;
} VkVideoEncodeH264SessionCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- flags is a bitmask of VkVideoEncodeH264CreateFlagBitsEXT specifying H.264 encoder creation flags.
- maxPictureSizeInMbs specifies the maximum picture size in macroblocks: its width corresponds to the syntax element pic_width_in_mbs_minus1 + 1 and its height to the syntax element pic_height_in_map_units_minus1 + 1.
- pStdExtensionVersion is a pointer to a VkExtensionProperties structure specifying H.264 codec extensions.
A VkVideoEncodeH264SessionCreateInfoEXT structure must be chained to VkVideoSessionCreateInfoKHR when the function vkCreateVideoSessionKHR is called with videoCodecOperation in VkVideoSessionCreateInfoKHR set to VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT.
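As an illustration of how maxPictureSizeInMbs relates to pixel dimensions, the sketch below rounds a picture size up to whole macroblocks. It assumes 16x16 macroblocks and a stream with frame_mbs_only_flag equal to 1, so that map units coincide with macroblocks. Extent2D and picture_size_in_mbs are hypothetical names used so the example compiles without the Vulkan headers.

```c
#include <stdint.h>

/* Hypothetical stand-in for VkExtent2D. */
typedef struct Extent2D {
    uint32_t width;
    uint32_t height;
} Extent2D;

/* An H.264 macroblock covers 16x16 luma samples. */
#define MB_SIZE 16u

/* Rounds pixel dimensions up to whole macroblocks.
 * ASSUMPTION: frame_mbs_only_flag = 1, so map units equal macroblocks. */
static Extent2D picture_size_in_mbs(uint32_t widthInPixels, uint32_t heightInPixels)
{
    Extent2D sizeInMbs;
    sizeInMbs.width  = (widthInPixels  + MB_SIZE - 1u) / MB_SIZE;
    sizeInMbs.height = (heightInPixels + MB_SIZE - 1u) / MB_SIZE;
    return sizeInMbs;
}
```

Under these assumptions a 1920x1080 picture maps to 120x68 macroblocks, because 1080 is not a multiple of 16 and the height is rounded up.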
// Provided by VK_EXT_video_encode_h264
typedef VkFlags VkVideoEncodeH264CreateFlagsEXT;
VkVideoEncodeH264CreateFlagsEXT is a bitmask type for setting a mask of zero or more VkVideoEncodeH264CreateFlagBitsEXT.
Bits which can be set in VkVideoEncodeH264SessionCreateInfoEXT::flags are:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264CreateFlagBitsEXT {
VK_VIDEO_ENCODE_H264_CREATE_DEFAULT_EXT = 0,
VK_VIDEO_ENCODE_H264_CREATE_RESERVED_0_BIT_EXT = 0x00000001,
} VkVideoEncodeH264CreateFlagBitsEXT;
- VK_VIDEO_ENCODE_H264_CREATE_DEFAULT_EXT is 0, and specifies no additional creation flags.
- VK_VIDEO_ENCODE_H264_CREATE_RESERVED_0_BIT_EXT is reserved for future use by the current version of the specification.
39.9.4. Encoder Parameter Sets
To reduce parameter traffic during encoding, the encoder parameter set object supports storing H.264 SPS/PPS parameter sets that may be later referenced during encoding.
The VkVideoEncodeH264SessionParametersCreateInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t maxSpsStdCount;
uint32_t maxPpsStdCount;
const VkVideoEncodeH264SessionParametersAddInfoEXT* pParametersAddInfo;
} VkVideoEncodeH264SessionParametersCreateInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- maxSpsStdCount is the maximum number of SPS parameters that the VkVideoSessionParametersKHR can contain.
- maxPpsStdCount is the maximum number of PPS parameters that the VkVideoSessionParametersKHR can contain.
- pParametersAddInfo is NULL or a pointer to a VkVideoEncodeH264SessionParametersAddInfoEXT structure specifying H.264 parameters to add upon object creation.
A VkVideoEncodeH264SessionParametersCreateInfoEXT structure holding one H.264 SPS and at least one H.264 PPS parameter set must be chained to VkVideoSessionParametersCreateInfoKHR when calling vkCreateVideoSessionParametersKHR to store these parameter set(s) with the encoder parameter set object for later reference. The provided H.264 SPS/PPS parameters must be within the limits specified during encoder creation for the encoder specified in VkVideoSessionParametersCreateInfoKHR.
The VkVideoEncodeH264SessionParametersAddInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersAddInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t spsStdCount;
const StdVideoH264SequenceParameterSet* pSpsStd;
uint32_t ppsStdCount;
const StdVideoH264PictureParameterSet* pPpsStd;
} VkVideoEncodeH264SessionParametersAddInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- spsStdCount is the number of SPS elements in pSpsStd. Its value must be less than or equal to the value of maxSpsStdCount.
- pSpsStd is a pointer to an array of StdVideoH264SequenceParameterSet structures representing H.264 sequence parameter sets. Each element of the array must have a unique H.264 SPS ID.
- ppsStdCount is the number of PPS elements in pPpsStd. Its value must be less than or equal to the value of maxPpsStdCount.
- pPpsStd is a pointer to an array of StdVideoH264PictureParameterSet structures representing H.264 picture parameter sets. Each element of the array must have a unique H.264 SPS-PPS ID pair.
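The count and uniqueness rules above lend themselves to a small application-side check before the parameters are submitted. The sketch below is illustrative only: SpsIds, PpsIds, and validate_parameters_add_info are hypothetical names, and the two structs are reduced stand-ins that keep only the ID values of the real StdVideoH264SequenceParameterSet and StdVideoH264PictureParameterSet structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced stand-ins that carry only the parameter set IDs. */
typedef struct SpsIds { uint8_t spsId; } SpsIds;
typedef struct PpsIds { uint8_t spsId; uint8_t ppsId; } PpsIds;

/* Checks the rules stated for the add-info members:
 *  - counts must not exceed maxSpsStdCount / maxPpsStdCount,
 *  - every SPS ID is unique,
 *  - every (SPS ID, PPS ID) pair is unique. */
static bool validate_parameters_add_info(const SpsIds *pSps, uint32_t spsCount, uint32_t maxSpsCount,
                                         const PpsIds *pPps, uint32_t ppsCount, uint32_t maxPpsCount)
{
    if (spsCount > maxSpsCount || ppsCount > maxPpsCount) {
        return false;
    }
    for (uint32_t i = 0; i < spsCount; ++i) {
        for (uint32_t j = i + 1; j < spsCount; ++j) {
            if (pSps[i].spsId == pSps[j].spsId) {
                return false; /* duplicate SPS ID */
            }
        }
    }
    for (uint32_t i = 0; i < ppsCount; ++i) {
        for (uint32_t j = i + 1; j < ppsCount; ++j) {
            if (pPps[i].spsId == pPps[j].spsId && pPps[i].ppsId == pPps[j].ppsId) {
                return false; /* duplicate SPS-PPS ID pair */
            }
        }
    }
    return true;
}
```

The quadratic comparison is acceptable here because the number of parameter sets is small; a larger table could use a bitmap keyed by ID instead.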
39.9.5. Frame Encoding
In order to encode a frame, add a VkVideoEncodeH264VclFrameInfoEXT structure to the pNext chain of the VkVideoEncodeInfoKHR structure passed to the vkCmdEncodeVideoKHR command.
The VkVideoEncodeH264VclFrameInfoEXT structure representing a frame encode operation is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264VclFrameInfoEXT {
VkStructureType sType;
const void* pNext;
const VkVideoEncodeH264ReferenceListsEXT* pReferenceFinalLists;
uint32_t naluSliceEntryCount;
const VkVideoEncodeH264NaluSliceEXT* pNaluSliceEntries;
const StdVideoEncodeH264PictureInfo* pCurrentPictureInfo;
} VkVideoEncodeH264VclFrameInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- pReferenceFinalLists is NULL or a pointer to a VkVideoEncodeH264ReferenceListsEXT structure specifying the reference lists to be used for the current picture.
- naluSliceEntryCount is the number of slice NALUs in the frame.
- pNaluSliceEntries is a pointer to an array of naluSliceEntryCount VkVideoEncodeH264NaluSliceEXT structures specifying the division of the current picture into slices and the properties of these slices. This is an ordered sequence; the NALUs are generated consecutively in VkVideoEncodeInfoKHR::dstBitstreamBuffer in the same order as in this array.
- pCurrentPictureInfo is a pointer to a StdVideoEncodeH264PictureInfo structure specifying the syntax and other codec-specific information from the H.264 specification associated with this picture. The information provided must reflect the decoded picture marking operations that are applicable to this frame.
The VkVideoEncodeH264NaluSliceEXT structure representing a slice is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264NaluSliceEXT {
VkStructureType sType;
const void* pNext;
uint32_t mbCount;
const VkVideoEncodeH264ReferenceListsEXT* pReferenceFinalLists;
const StdVideoEncodeH264SliceHeader* pSliceHeaderStd;
} VkVideoEncodeH264NaluSliceEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- mbCount is the number of macroblocks in this slice.
- pReferenceFinalLists is NULL or a pointer to a VkVideoEncodeH264ReferenceListsEXT structure specifying the reference lists to be used for the current slice. If pReferenceFinalLists is not NULL, these reference lists override the reference lists provided in VkVideoEncodeH264VclFrameInfoEXT::pReferenceFinalLists.
- pSliceHeaderStd is a pointer to a StdVideoEncodeH264SliceHeader structure specifying the slice header for the current slice.
The VkVideoEncodeH264DpbSlotInfoEXT structure, representing a reconstructed picture that is being used as a reference picture, is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264DpbSlotInfoEXT {
VkStructureType sType;
const void* pNext;
int8_t slotIndex;
const StdVideoEncodeH264ReferenceInfo* pStdReferenceInfo;
} VkVideoEncodeH264DpbSlotInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- slotIndex is the DPB Slot index for this picture. slotIndex must match the slotIndex in pSetupReferenceSlot of VkVideoEncodeInfoKHR in the command used to encode the corresponding picture.
- pStdReferenceInfo is a pointer to a StdVideoEncodeH264ReferenceInfo structure specifying the syntax and other codec-specific information from the H.264 specification associated with this reference picture.
The VkVideoEncodeH264ReferenceListsEXT structure representing reference lists is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264ReferenceListsEXT {
VkStructureType sType;
const void* pNext;
uint8_t referenceList0EntryCount;
const VkVideoEncodeH264DpbSlotInfoEXT* pReferenceList0Entries;
uint8_t referenceList1EntryCount;
const VkVideoEncodeH264DpbSlotInfoEXT* pReferenceList1Entries;
const StdVideoEncodeH264RefMemMgmtCtrlOperations* pMemMgmtCtrlOperations;
} VkVideoEncodeH264ReferenceListsEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- referenceList0EntryCount is the number of reference pictures in reference list L0 and is identical to StdVideoEncodeH264SliceHeader::num_ref_idx_l0_active_minus1 + 1.
- pReferenceList0Entries is a pointer to an array of referenceList0EntryCount VkVideoEncodeH264DpbSlotInfoEXT structures specifying the reference list L0 entries for the current picture. The entries provided must be ordered after all reference list L0 modification operations are applied (i.e. final list order). The entries provided must not reflect decoded picture marking operations in this frame that are applicable to references; the impact of such operations must be reflected in future frame encode commands. The slot index in each entry must match one of the slot indexes provided in the pReferenceSlots of the parent VkVideoEncodeInfoKHR structure.
- referenceList1EntryCount is the number of reference pictures in reference list L1 and is identical to StdVideoEncodeH264SliceHeader::num_ref_idx_l1_active_minus1 + 1.
- pReferenceList1Entries is a pointer to an array of referenceList1EntryCount VkVideoEncodeH264DpbSlotInfoEXT structures specifying the reference list L1 entries for the current picture. The entries provided must be ordered after all reference list L1 modification operations are applied (i.e. final list order). The entries provided must not reflect decoded picture marking operations in this frame that are applicable to references; the impact of such operations must be reflected in future frame encode commands. The slot index in each entry must match one of the slot indexes provided in the pReferenceSlots of the parent VkVideoEncodeInfoKHR structure.
- pMemMgmtCtrlOperations is a pointer to a StdVideoEncodeH264RefMemMgmtCtrlOperations structure specifying reference lists modifications and decoded picture marking operations.
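Two of the rules above can be checked before recording an encode command: the entry count must equal num_ref_idx_lX_active_minus1 + 1, and every slot index in the list must also appear among the reference slots of the parent encode info. The sketch below is an application-side illustration; validate_reference_list is a hypothetical helper that works on plain arrays of slot indices rather than on the real structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper operating on bare slot indices.
 *  pListSlotIndices      - slotIndex of each reference list entry, in final list order
 *  entryCount            - referenceList0EntryCount or referenceList1EntryCount
 *  numRefIdxActiveMinus1 - the matching num_ref_idx_lX_active_minus1 slice header value
 *  pReferenceSlotIndices - slot indexes taken from the parent pReferenceSlots array */
static bool validate_reference_list(const int8_t *pListSlotIndices, uint8_t entryCount,
                                    uint8_t numRefIdxActiveMinus1,
                                    const int8_t *pReferenceSlotIndices, uint32_t referenceSlotCount)
{
    /* The entry count is described as identical to num_ref_idx_lX_active_minus1 + 1. */
    if ((uint32_t)entryCount != (uint32_t)numRefIdxActiveMinus1 + 1u) {
        return false;
    }
    /* Each list entry must refer to a slot that the parent structure provides. */
    for (uint8_t i = 0; i < entryCount; ++i) {
        bool found = false;
        for (uint32_t j = 0; j < referenceSlotCount; ++j) {
            if (pListSlotIndices[i] == pReferenceSlotIndices[j]) {
                found = true;
                break;
            }
        }
        if (!found) {
            return false;
        }
    }
    return true;
}
```

The sketch applies to lists that are actually in use; how an empty list for an intra picture relates to the slice header value is not covered here.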
The VkVideoEncodeH264EmitPictureParametersEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264EmitPictureParametersEXT {
VkStructureType sType;
const void* pNext;
uint8_t spsId;
VkBool32 emitSpsEnable;
uint32_t ppsIdEntryCount;
const uint8_t* ppsIdEntries;
} VkVideoEncodeH264EmitPictureParametersEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- spsId is the H.264 SPS ID for the H.264 SPS to insert in the bitstream. The SPS ID must match the ID of an SPS provided in pSpsStd of VkVideoEncodeH264SessionParametersAddInfoEXT. The SPS is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR.
- emitSpsEnable enables the emitting of the SPS structure with the ID given by spsId.
- ppsIdEntryCount is the number of entries in ppsIdEntries. If this parameter is 0, no PPS entries are emitted in the bitstream.
- ppsIdEntries is a pointer to an array of H.264 PPS IDs for the H.264 PPS to insert in the bitstream. Each PPS ID must match one of the IDs of the PPS(s) provided in pPpsStd of VkVideoEncodeH264SessionParametersAddInfoEXT to identify the PPS parameter set to insert in the bitstream. The PPS is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR.
39.9.6. Rate control
The VkVideoEncodeH264RateControlInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264RateControlInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t gopFrameCount;
uint32_t idrPeriod;
uint32_t consecutiveBFrameCount;
VkVideoEncodeH264RateControlStructureFlagBitsEXT rateControlStructure;
uint8_t temporalLayerCount;
} VkVideoEncodeH264RateControlInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- gopFrameCount is the number of frames contained within the group of pictures (GOP), starting from an intra frame and until the next intra frame. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX, the GOP length is treated as infinite.
- idrPeriod is the interval, in terms of number of frames, between two IDR frames. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX, the IDR period is treated as infinite.
- consecutiveBFrameCount is the number of consecutive B-frames between I- and/or P-frames within the GOP.
- rateControlStructure is a VkVideoEncodeH264RateControlStructureFlagBitsEXT value specifying the expected encode stream reference structure, to aid in rate control calculations.
- temporalLayerCount specifies the number of temporal layers enabled in the stream.
In order to provide H.264-specific stream rate control parameters, add a VkVideoEncodeH264RateControlInfoEXT structure to the pNext chain of the VkVideoEncodeRateControlInfoKHR structure in the pNext chain of the VkVideoCodingControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command.
The parameters from this structure act as guidance for implementations to apply various rate control heuristics.
It is possible to infer the picture type to be used when encoding a frame, on the basis of the values provided for consecutiveBFrameCount, idrPeriod, and gopFrameCount, but this inferred picture type will not be used by implementations to override the picture type provided in vkCmdEncodeVideoKHR. Additionally, it is not required for the video session to be reset if the inferred picture type does not match the actual picture type.
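The text above does not prescribe how a picture type is inferred. The following sketch shows one plausible scheme, offered purely as an assumption: frames are indexed in display order starting at 0, an IDR frame opens every IDR period, an I-frame opens every GOP within that period, and each run of consecutiveBFrameCount B-frames is followed by a P-frame. InferredPictureType and infer_picture_type are hypothetical names.

```c
#include <stdint.h>

typedef enum InferredPictureType {
    INFERRED_UNKNOWN,
    INFERRED_IDR,
    INFERRED_I,
    INFERRED_P,
    INFERRED_B
} InferredPictureType;

/* ASSUMED inference scheme; implementations are free to use a different one. */
static InferredPictureType infer_picture_type(uint32_t frameIndex,
                                              uint32_t gopFrameCount,
                                              uint32_t idrPeriod,
                                              uint32_t consecutiveBFrameCount)
{
    /* A value of 0 lets the implementation choose, so nothing can be inferred. */
    if (gopFrameCount == 0u || idrPeriod == 0u) {
        return INFERRED_UNKNOWN;
    }

    /* UINT32_MAX means "infinite": only frame 0 starts the period. */
    const uint32_t posInIdrPeriod =
        (idrPeriod == UINT32_MAX) ? frameIndex : frameIndex % idrPeriod;
    if (posInIdrPeriod == 0u) {
        return INFERRED_IDR;
    }

    const uint32_t posInGop =
        (gopFrameCount == UINT32_MAX) ? posInIdrPeriod : posInIdrPeriod % gopFrameCount;
    if (posInGop == 0u) {
        return INFERRED_I;
    }

    /* Widen before adding 1 so a count of UINT32_MAX cannot wrap to zero. */
    const uint64_t pFrameInterval = (uint64_t)consecutiveBFrameCount + 1u;
    return (posInGop % pFrameInterval == 0u) ? INFERRED_P : INFERRED_B;
}
```

With gopFrameCount = 6, idrPeriod = 12, and consecutiveBFrameCount = 2, this scheme yields the display-order pattern IDR B B P B B I B B P B B, then repeats. As stated above, such an inferred type never overrides the picture type supplied to vkCmdEncodeVideoKHR.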
The rateControlStructure in VkVideoEncodeH264RateControlInfoEXT specifies one of the following video stream reference structures as a hint for the rate control implementation:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264RateControlStructureFlagBitsEXT {
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT = 0,
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_FLAT_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_DYADIC_BIT_EXT = 0x00000002,
} VkVideoEncodeH264RateControlStructureFlagBitsEXT;
- VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT is 0, and specifies a reference structure unknown at the time of stream rate control configuration.
- VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_FLAT_BIT_EXT specifies a flat reference structure.
- VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_DYADIC_BIT_EXT specifies a dyadic reference structure.
The VkVideoEncodeH264RateControlLayerInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264RateControlLayerInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t temporalLayerId;
VkBool32 useInitialRcQp;
VkVideoEncodeH264QpEXT initialRcQp;
VkBool32 useMinQp;
VkVideoEncodeH264QpEXT minQp;
VkBool32 useMaxQp;
VkVideoEncodeH264QpEXT maxQp;
VkBool32 useMaxFrameSize;
VkVideoEncodeH264FrameSizeEXT maxFrameSize;
} VkVideoEncodeH264RateControlLayerInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- temporalLayerId specifies the H.264 temporal layer ID of the video coding layer that settings provided in this structure and its parent VkVideoEncodeRateControlLayerInfoKHR structure apply to.
- useInitialRcQp indicates whether the values within initialRcQp should be used by the implementation.
- initialRcQp provides the QP values for each picture type, to be used in rate control calculations at the start of video encode operations on a newly-created video session, or immediately after a session reset. These values are ignored when VkVideoEncodeRateControlInfoKHR::rateControlMode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
- useMinQp indicates whether the values within minQp should be used by the implementation. When it is set to VK_FALSE, the implementation ignores the values in minQp and chooses suitable values.
- minQp provides the lower bound on the QP values for each picture type, to be used in rate control calculations.
- useMaxQp indicates whether the values within maxQp should be used by the implementation. When it is set to VK_FALSE, the implementation ignores the values in maxQp and chooses suitable values.
- maxQp provides the upper bound on the QP values for each picture type, to be used in rate control calculations.
- useMaxFrameSize indicates whether the values within maxFrameSize should be used by the implementation.
- maxFrameSize provides the upper bound on the encoded frame size for each picture type. The implementation does not guarantee that the encoded frame sizes will be within the specified limits; however, these limits may be used as a guide in rate control calculations. If useMaxQp is enabled and maxQp is not set properly, the maxQp limit may prevent the implementation from respecting the maxFrameSize limit.
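Because minQp and maxQp are enabled independently, an application may want to confirm that the two bounds do not contradict each other when both are in use. The sketch below is an application-side sanity check under that assumption; the text above does not itself state an ordering requirement. H264Qp mirrors the three members of VkVideoEncodeH264QpEXT, and qp_bounds_consistent is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the three per-picture-type QP members. */
typedef struct H264Qp {
    int32_t qpI;
    int32_t qpP;
    int32_t qpB;
} H264Qp;

/* Returns false only when both bounds are enabled and some lower bound
 * exceeds the matching upper bound. ASSUMPTION: this ordering is treated
 * as an application-level expectation, not a documented valid-usage rule. */
static bool qp_bounds_consistent(bool useMinQp, const H264Qp *pMinQp,
                                 bool useMaxQp, const H264Qp *pMaxQp)
{
    /* A disabled bound is ignored, so the two cannot conflict. */
    if (!useMinQp || !useMaxQp) {
        return true;
    }
    return pMinQp->qpI <= pMaxQp->qpI &&
           pMinQp->qpP <= pMaxQp->qpP &&
           pMinQp->qpB <= pMaxQp->qpB;
}
```

A similar comparison could be extended to initialRcQp, checking that it falls between the enabled bounds.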
H.264-specific per-layer rate control parameters must be specified by adding a VkVideoEncodeH264RateControlLayerInfoEXT structure to the pNext chain of each VkVideoEncodeRateControlLayerInfoKHR structure in a call to the vkCmdControlVideoCodingKHR command, when the command buffer context has an active video encode H.264 session.
The VkVideoEncodeH264QpEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264QpEXT {
int32_t qpI;
int32_t qpP;
int32_t qpB;
} VkVideoEncodeH264QpEXT;
- qpI is the QP to be used for I-frames.
- qpP is the QP to be used for P-frames.
- qpB is the QP to be used for B-frames.
The VkVideoEncodeH264FrameSizeEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264FrameSizeEXT {
uint32_t frameISize;
uint32_t framePSize;
uint32_t frameBSize;
} VkVideoEncodeH264FrameSizeEXT;
- frameISize is the size in bytes to be used for I-frames.
- framePSize is the size in bytes to be used for P-frames.
- frameBSize is the size in bytes to be used for B-frames.
39.10. Encode H.265
This extension adds H.265 codec-specific structures/types needed to support H.265 video encoding. Unless otherwise noted, all references to the H.265 specification are to the 2013 edition published by the ITU-T, dated April 2013. This specification is available at https://www.itu.int/rec/T-REC-H.265.
39.10.1. H.265 encode profile
An H.265 encode profile is specified by including the VkVideoEncodeH265ProfileEXT structure in the pNext chain of the VkVideoProfileKHR structure when VkVideoProfileKHR::videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT.
The VkVideoEncodeH265ProfileEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265ProfileEXT {
VkStructureType sType;
const void* pNext;
StdVideoH265ProfileIdc stdProfileIdc;
} VkVideoEncodeH265ProfileEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- stdProfileIdc is a StdVideoH265ProfileIdc value specifying the H.265 codec profile IDC.
39.10.2. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with pVideoProfile->videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT, the VkVideoEncodeH265CapabilitiesEXT structure must be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve more capabilities specific to H.265 video encoding.
The VkVideoEncodeH265CapabilitiesEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265CapabilitiesEXT {
VkStructureType sType;
const void* pNext;
VkVideoEncodeH265CapabilityFlagsEXT flags;
VkVideoEncodeH265InputModeFlagsEXT inputModeFlags;
VkVideoEncodeH265OutputModeFlagsEXT outputModeFlags;
VkVideoEncodeH265CtbSizeFlagsEXT ctbSizes;
VkVideoEncodeH265TransformBlockSizeFlagsEXT transformBlockSizes;
uint8_t maxPPictureL0ReferenceCount;
uint8_t maxBPictureL0ReferenceCount;
uint8_t maxL1ReferenceCount;
uint8_t maxSubLayersCount;
uint8_t minLog2MinLumaCodingBlockSizeMinus3;
uint8_t maxLog2MinLumaCodingBlockSizeMinus3;
uint8_t minLog2MinLumaTransformBlockSizeMinus2;
uint8_t maxLog2MinLumaTransformBlockSizeMinus2;
uint8_t minMaxTransformHierarchyDepthInter;
uint8_t maxMaxTransformHierarchyDepthInter;
uint8_t minMaxTransformHierarchyDepthIntra;
uint8_t maxMaxTransformHierarchyDepthIntra;
uint8_t maxDiffCuQpDeltaDepth;
uint8_t minMaxNumMergeCand;
uint8_t maxMaxNumMergeCand;
VkExtensionProperties stdExtensionVersion;
} VkVideoEncodeH265CapabilitiesEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- flags is a bitmask of VkVideoEncodeH265CapabilityFlagBitsEXT describing supported encoding tools.
- inputModeFlags is a bitmask of VkVideoEncodeH265InputModeFlagBitsEXT describing the command buffer input granularities/modes supported by the implementation.
- outputModeFlags is a bitmask of VkVideoEncodeH265OutputModeFlagBitsEXT describing the output (bitstream size reporting) granularities/modes supported by the implementation.
- ctbSizes is a bitmask of VkVideoEncodeH265CtbSizeFlagBitsEXT describing the supported CTB sizes.
- transformBlockSizes is a bitmask of VkVideoEncodeH265TransformBlockSizeFlagBitsEXT describing the supported transform block sizes.
- maxPPictureL0ReferenceCount reports the maximum number of reference pictures the implementation supports in the reference list L0 for P pictures.
- maxBPictureL0ReferenceCount reports the maximum number of reference pictures the implementation supports in the reference list L0 for B pictures. The reported value is 0 if encoding of B pictures is not supported.
- maxL1ReferenceCount reports the maximum number of reference pictures the implementation supports in the reference list L1 if encoding of B pictures is supported. The reported value is 0 if encoding of B pictures is not supported.
- maxSubLayersCount reports the maximum number of sublayers.
- minLog2MinLumaCodingBlockSizeMinus3 reports the minimum value that may be set for log2_min_luma_coding_block_size_minus3 in StdVideoH265SequenceParameterSet.
- maxLog2MinLumaCodingBlockSizeMinus3 reports the maximum value that may be set for log2_min_luma_coding_block_size_minus3 in StdVideoH265SequenceParameterSet.
- minLog2MinLumaTransformBlockSizeMinus2 reports the minimum value that may be set for log2_min_luma_transform_block_size_minus2 in StdVideoH265SequenceParameterSet.
- maxLog2MinLumaTransformBlockSizeMinus2 reports the maximum value that may be set for log2_min_luma_transform_block_size_minus2 in StdVideoH265SequenceParameterSet.
- minMaxTransformHierarchyDepthInter reports the minimum value that may be set for max_transform_hierarchy_depth_inter in StdVideoH265SequenceParameterSet.
- maxMaxTransformHierarchyDepthInter reports the maximum value that may be set for max_transform_hierarchy_depth_inter in StdVideoH265SequenceParameterSet.
- minMaxTransformHierarchyDepthIntra reports the minimum value that may be set for max_transform_hierarchy_depth_intra in StdVideoH265SequenceParameterSet.
- maxMaxTransformHierarchyDepthIntra reports the maximum value that may be set for max_transform_hierarchy_depth_intra in StdVideoH265SequenceParameterSet.
- maxDiffCuQpDeltaDepth reports the maximum value that may be set for diff_cu_qp_delta_depth in StdVideoH265PictureParameterSet.
- minMaxNumMergeCand reports the minimum value that may be set for MaxNumMergeCand in StdVideoEncodeH265SliceHeader.
- maxMaxNumMergeCand reports the maximum value that may be set for MaxNumMergeCand in StdVideoEncodeH265SliceHeader.
- stdExtensionVersion is a VkExtensionProperties structure in which the H.265 extension name and version supported by the implementation are returned.
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265CapabilityFlagsEXT;
VkVideoEncodeH265CapabilityFlagsEXT is a bitmask type for setting a mask of zero or more VkVideoEncodeH265CapabilityFlagBitsEXT.
Bits which may be set in VkVideoEncodeH265CapabilitiesEXT::flags, indicating the encoding tools supported, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265CapabilityFlagBitsEXT {
VK_VIDEO_ENCODE_H265_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_CAPABILITY_SCALING_LISTS_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_CAPABILITY_SAMPLE_ADAPTIVE_OFFSET_ENABLED_BIT_EXT = 0x00000004,
VK_VIDEO_ENCODE_H265_CAPABILITY_PCM_ENABLE_BIT_EXT = 0x00000008,
VK_VIDEO_ENCODE_H265_CAPABILITY_SPS_TEMPORAL_MVP_ENABLED_BIT_EXT = 0x00000010,
VK_VIDEO_ENCODE_H265_CAPABILITY_HRD_COMPLIANCE_BIT_EXT = 0x00000020,
VK_VIDEO_ENCODE_H265_CAPABILITY_INIT_QP_MINUS26_BIT_EXT = 0x00000040,
VK_VIDEO_ENCODE_H265_CAPABILITY_LOG2_PARALLEL_MERGE_LEVEL_MINUS2_BIT_EXT = 0x00000080,
VK_VIDEO_ENCODE_H265_CAPABILITY_SIGN_DATA_HIDING_ENABLED_BIT_EXT = 0x00000100,
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_ENABLED_BIT_EXT = 0x00000200,
VK_VIDEO_ENCODE_H265_CAPABILITY_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_BIT_EXT = 0x00000400,
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_BIT_EXT = 0x00000800,
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_BIPRED_BIT_EXT = 0x00001000,
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT = 0x00002000,
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSQUANT_BYPASS_ENABLED_BIT_EXT = 0x00004000,
VK_VIDEO_ENCODE_H265_CAPABILITY_ENTROPY_CODING_SYNC_ENABLED_BIT_EXT = 0x00008000,
VK_VIDEO_ENCODE_H265_CAPABILITY_DEBLOCKING_FILTER_OVERRIDE_ENABLED_BIT_EXT = 0x00010000,
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_FRAME_BIT_EXT = 0x00020000,
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_PER_TILE_BIT_EXT = 0x00040000,
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_SLICE_BIT_EXT = 0x00080000,
VK_VIDEO_ENCODE_H265_CAPABILITY_SLICE_SEGMENT_CTB_COUNT_BIT_EXT = 0x00100000,
VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_EXT = 0x00200000,
VK_VIDEO_ENCODE_H265_CAPABILITY_DEPENDENT_SLICE_SEGMENT_BIT_EXT = 0x00400000,
VK_VIDEO_ENCODE_H265_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT = 0x00800000,
} VkVideoEncodeH265CapabilityFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT
reports if enabling separate_colour_plane_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SCALING_LISTS_BIT_EXT
reports if enabling scaling_list_enabled_flag and sps_scaling_list_data_present_flag in StdVideoH265SpsFlags, or enabling pps_scaling_list_data_present_flag in StdVideoH265PpsFlags are supproted. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SAMPLE_ADAPTIVE_OFFSET_ENABLED_BIT_EXT
reports if enabling sample_adaptive_offset_enabled_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_PCM_ENABLE_BIT_EXT
reports if enabling pcm_enable_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SPS_TEMPORAL_MVP_ENABLED_BIT_EXT
reports if enabling sps_temporal_mvp_enabled_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_HRD_COMPLIANCE_BIT_EXT
reports if the implementation guarantees generating a HRD compliant bitstream if nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag, or sub_pic_hrd_params_present_flag are enabled in StdVideoH265HrdFlags, or vui_hrd_parameters_present_flag is enabled in StdVideoH265SpsVuiFlags. -
VK_VIDEO_ENCODE_H265_CAPABILITY_INIT_QP_MINUS26_BIT_EXT
reports if setting non-zero init_qp_minus26 in StdVideoH265PictureParameterSet is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_LOG2_PARALLEL_MERGE_LEVEL_MINUS2_BIT_EXT
reports if setting non-zero value for log2_parallel_merge_level_minus2 in StdVideoH265PictureParameterSet is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SIGN_DATA_HIDING_ENABLED_BIT_EXT
reports if enabling sign_data_hiding_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_ENABLED_BIT_EXT
reports if enabling transform_skip_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_BIT_EXT
reports if enabling pps_slice_chroma_qp_offsets_present_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_BIT_EXT
reports if enabling weighted_pred_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_BIPRED_BIT_EXT
reports if enabling weighted_bipred_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT
reports that when weighted_pred_flag or weighted_bipred_flag in StdVideoH265PpsFlags are enabled, the implementation is able to internally decide syntax for pred_weight_table. -
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSQUANT_BYPASS_ENABLED_BIT_EXT
reports if enabling transquant_bypass_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_ENTROPY_CODING_SYNC_ENABLED_BIT_EXT
reports if enabling entropy_coding_sync_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_DEBLOCKING_FILTER_OVERRIDE_ENABLED_BIT_EXT
reports if enabling deblocking_filter_override_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_FRAME_BIT_EXT
reports if encoding multiple tiles per frame is supported. If not set, the implementation is only able to encode a single tile for each frame. -
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_PER_TILE_BIT_EXT
reports if encoding multiple slices per tile is supported. If not set, the implementation is only able to encode a single slice for each tile. -
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_SLICE_BIT_EXT
reports if encoding multiple tiles per slice is supported. If not set, the implementation is only able to encode a single tile for each slice. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SLICE_SEGMENT_CTB_COUNT_BIT_EXT
reports support for configuring VkVideoEncodeH265NaluSliceSegmentEXT::ctbCount
and slice_segment_address in StdVideoEncodeH265SliceSegmentHeader for each slice segment in a frame with multiple slice segments. If not supported, the implementation decides the number of CTBs in each slice segment based on VkVideoEncodeH265VclFrameInfoEXT::naluSliceSegmentEntryCount
. -
VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_EXT
reports that each slice segment in a frame with a single or multiple tiles per slice may begin or finish at any offset in a CTB row. If not supported, all slice segments in such a frame must begin at the start of a CTB row (and hence each slice segment must finish at the end of a CTB row). Also reports that each slice segment in a frame with multiple slices per tile may begin or finish at any offset within the enclosing tile’s CTB row. If not supported, slice segments in such a frame must begin at the start of the enclosing tile’s CTB row (and hence each slice segment must finish at the end of the enclosing tile’s CTB row). -
VK_VIDEO_ENCODE_H265_CAPABILITY_DEPENDENT_SLICE_SEGMENT_BIT_EXT
reports if enabling dependent_slice_segment_flag in StdVideoEncodeH265SliceHeaderFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT
reports that when VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_PER_TILE_BIT_EXT
is supported and a frame is encoded with multiple slices, the implementation allows encoding each slice segment with a different StdVideoEncodeH265SliceSegmentHeader::slice_type.
If not supported, all slice segments of the frame must be encoded with the same slice_type,
which corresponds to the picture type of the frame. For example, all slice segments of a P-frame would be encoded as P-slices.
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265InputModeFlagsEXT;
VkVideoEncodeH265InputModeFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH265InputModeFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::inputModeFlags
, indicating the
command buffer input granularities supported by the implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265InputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_INPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH265InputModeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_INPUT_MODE_FRAME_BIT_EXT
indicates that a single command buffer must at least encode an entire frame. Any non-VCL NALUs must be encoded using the same command buffer as the frame if VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT
indicates that a single command buffer must at least encode a single slice segment. Any non-VCL NALUs must be encoded using the same command buffer as the first slice segment of the frame if VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT
indicates that a single command buffer may encode a non-VCL NALU by itself.
An implementation must support at least one of
VK_VIDEO_ENCODE_H265_INPUT_MODE_FRAME_BIT_EXT
or
VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT
.
If VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT
is not
supported, the following two additional restrictions apply for frames
encoded with multiple slice segments.
First, all frame slice segments must have the same pReferenceFinalLists.
Second, the order in which slice segments appear in
VkVideoEncodeH265VclFrameInfoEXT::pNaluSliceSegmentEntries
or in
the command buffer must match the placement order of the slice segments in
the frame.
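The input mode rules above can be sketched as application-side checks. This is a minimal illustration, not part of the API: the constants are local stand-ins that mirror the enum values listed above, both function names are hypothetical, and comparing pReferenceFinalLists by pointer identity is an assumption (a conservative reading of "the same").

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring VkVideoEncodeH265InputModeFlagBitsEXT values. */
#define INPUT_MODE_FRAME_BIT         0x00000001u
#define INPUT_MODE_SLICE_SEGMENT_BIT 0x00000002u
#define INPUT_MODE_NON_VCL_BIT       0x00000004u

/* An implementation must report FRAME, SLICE_SEGMENT, or both. */
static bool input_modes_well_formed(uint32_t inputModeFlags)
{
    return (inputModeFlags &
            (INPUT_MODE_FRAME_BIT | INPUT_MODE_SLICE_SEGMENT_BIT)) != 0;
}

/* Hypothetical check for the first restriction: when slice segment input is
 * not supported, every slice segment of the frame must use the same
 * pReferenceFinalLists. segmentLists[i] stands for that pointer in entry i.
 * Assumption: "same" is tested by pointer identity, which can reject lists
 * that are equal in content but stored separately. */
static bool slice_segments_share_reference_lists(uint32_t inputModeFlags,
                                                 const void *const *segmentLists,
                                                 size_t count)
{
    if (inputModeFlags & INPUT_MODE_SLICE_SEGMENT_BIT)
        return true; /* restriction does not apply */
    for (size_t i = 1; i < count; ++i)
        if (segmentLists[i] != segmentLists[0])
            return false;
    return true;
}
```

An application could run such checks once after querying the capabilities and again before recording each multi-segment frame.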
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265OutputModeFlagsEXT;
VkVideoEncodeH265OutputModeFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH265OutputModeFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::outputModeFlags
, indicating
the minimum bitstream generation commands that must be included between
each vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR pair
(henceforth simply begin/end pair), are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265OutputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_SLICE_SEGMENT_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH265OutputModeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_FRAME_BIT_EXT
indicates that calls to generate all NALUs of a frame must be included within a single begin/end pair. Any non-VCL NALUs must be encoded within the same begin/end pair if VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_SLICE_SEGMENT_BIT_EXT
indicates that each begin/end pair must encode at least one slice segment. Any non-VCL NALUs must be encoded within the same begin/end pair as the first slice segment of the frame if VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT
indicates that each begin/end pair may encode only a non-VCL NALU by itself.
An implementation must support at least one of
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_FRAME_BIT_EXT
or
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_SLICE_SEGMENT_BIT_EXT
.
A single begin/end pair must not encode more than a single frame.
The bitstreams of NALUs generated within a single begin/end pair are written continuously into the same bitstream buffer (any padding between the NALUs must be compliant to the H.265 standard).
The supported input modes must be coarser than or equal to the supported output modes. For example, it is illegal to report that slice segment input is supported when only frame output is supported.
An implementation must report one of the following combinations of input/output modes:
-
Input: Frame, Output: Frame
-
Input: Frame, Output: Frame and Non-VCL
-
Input: Frame, Output: Slice Segment
-
Input: Frame, Output: Slice Segment and Non-VCL
-
Input: Slice Segment, Output: Slice Segment
-
Input: Slice Segment, Output: Slice Segment and Non-VCL
-
Input: Frame and Non-VCL, Output: Frame and Non-VCL
-
Input: Frame and Non-VCL, Output: Slice Segment and Non-VCL
-
Input: Slice Segment and Non-VCL, Output: Slice Segment and Non-VCL
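The nine permitted combinations can be expressed as a lookup table. The sketch below is illustrative only: the MODE_* constants are local stand-ins (the input and output mode enums listed above use the same bit values for Frame, Slice Segment, and Non-VCL), and h265_mode_combination_allowed is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the input and output enums share these bit values. */
#define MODE_FRAME         0x1u
#define MODE_SLICE_SEGMENT 0x2u
#define MODE_NON_VCL       0x4u

typedef struct {
    uint32_t input;
    uint32_t output;
} ModePair;

/* The nine input/output combinations listed above, in the same order. */
static const ModePair kAllowedModePairs[] = {
    { MODE_FRAME,                        MODE_FRAME },
    { MODE_FRAME,                        MODE_FRAME | MODE_NON_VCL },
    { MODE_FRAME,                        MODE_SLICE_SEGMENT },
    { MODE_FRAME,                        MODE_SLICE_SEGMENT | MODE_NON_VCL },
    { MODE_SLICE_SEGMENT,                MODE_SLICE_SEGMENT },
    { MODE_SLICE_SEGMENT,                MODE_SLICE_SEGMENT | MODE_NON_VCL },
    { MODE_FRAME | MODE_NON_VCL,         MODE_FRAME | MODE_NON_VCL },
    { MODE_FRAME | MODE_NON_VCL,         MODE_SLICE_SEGMENT | MODE_NON_VCL },
    { MODE_SLICE_SEGMENT | MODE_NON_VCL, MODE_SLICE_SEGMENT | MODE_NON_VCL },
};

/* Returns true if the reported flag pair is one of the listed combinations. */
static bool h265_mode_combination_allowed(uint32_t inputModeFlags,
                                          uint32_t outputModeFlags)
{
    const size_t n = sizeof kAllowedModePairs / sizeof kAllowedModePairs[0];
    for (size_t i = 0; i < n; ++i) {
        if (kAllowedModePairs[i].input == inputModeFlags &&
            kAllowedModePairs[i].output == outputModeFlags)
            return true;
    }
    return false;
}
```

Slice segment input paired with frame-only output is absent from the table, which matches the example of an illegal report given above.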
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265CtbSizeFlagsEXT;
VkVideoEncodeH265CtbSizeFlagsEXT
is a bitmask type for setting a mask
of zero or more VkVideoEncodeH265CtbSizeFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::ctbSizes
, indicating the CTB
sizes supported by the implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265CtbSizeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_CTB_SIZE_16_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_CTB_SIZE_32_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_CTB_SIZE_64_BIT_EXT = 0x00000004,
} VkVideoEncodeH265CtbSizeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_CTB_SIZE_16_BIT_EXT
specifies that a CTB size of 16x16 is supported. -
VK_VIDEO_ENCODE_H265_CTB_SIZE_32_BIT_EXT
specifies that a CTB size of 32x32 is supported. -
VK_VIDEO_ENCODE_H265_CTB_SIZE_64_BIT_EXT
specifies that a CTB size of 64x64 is supported.
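As an illustration of how the ctbSizes mask might be consumed, the sketch below picks a CTB size and computes how many CTBs cover a picture. The constants are local stand-ins mirroring the enum values above, the function names are hypothetical, and preferring the largest supported size is only one possible policy, not something the specification requires.

```c
#include <stdint.h>

/* Local stand-ins mirroring VkVideoEncodeH265CtbSizeFlagBitsEXT values. */
#define CTB_SIZE_16_BIT 0x00000001u
#define CTB_SIZE_32_BIT 0x00000002u
#define CTB_SIZE_64_BIT 0x00000004u

/* Returns the largest CTB edge length present in the mask, or 0 if none. */
static uint32_t largest_ctb_size(uint32_t ctbSizes)
{
    if (ctbSizes & CTB_SIZE_64_BIT) return 64;
    if (ctbSizes & CTB_SIZE_32_BIT) return 32;
    if (ctbSizes & CTB_SIZE_16_BIT) return 16;
    return 0;
}

/* Number of CTBs needed to cover a width x height picture. Partial CTBs at
 * the right and bottom edges count as whole CTBs. ctbSize must be non-zero. */
static uint32_t ctb_count(uint32_t width, uint32_t height, uint32_t ctbSize)
{
    uint32_t cols = (width + ctbSize - 1) / ctbSize;
    uint32_t rows = (height + ctbSize - 1) / ctbSize;
    return cols * rows;
}
```

For a 1920x1080 picture with 64x64 CTBs this gives 30 columns and 17 rows, so 510 CTBs in total.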
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265TransformBlockSizeFlagsEXT;
VkVideoEncodeH265TransformBlockSizeFlagsEXT
is a bitmask type for
setting a mask of zero or more
VkVideoEncodeH265TransformBlockSizeFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::transformBlockSizes
,
indicating the transform block sizes supported by the implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265TransformBlockSizeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_4_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_8_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_16_BIT_EXT = 0x00000004,
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_32_BIT_EXT = 0x00000008,
} VkVideoEncodeH265TransformBlockSizeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_4_BIT_EXT
specifies that a transform block size of 4x4 is supported. -
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_8_BIT_EXT
specifies that a transform block size of 8x8 is supported. -
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_16_BIT_EXT
specifies that a transform block size of 16x16 is supported. -
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_32_BIT_EXT
specifies that a transform block size of 32x32 is supported.
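The transformBlockSizes mask can be reduced to a minimum and maximum edge length, for example when an application chooses transform size limits for a sequence. This is a sketch under the assumption that the four bits map to 4, 8, 16, and 32 in ascending bit order, as in the enum above; the helper names are hypothetical.

```c
#include <stdint.h>

/* Bit i of the mask (i = 0..3) stands for a transform block of (4 << i)
 * pixels per edge, mirroring the enum values listed above. */

/* Smallest supported transform block edge length, or 0 for an empty mask. */
static uint32_t min_transform_block_size(uint32_t transformBlockSizes)
{
    for (uint32_t bit = 0; bit < 4; ++bit)
        if (transformBlockSizes & (1u << bit))
            return 4u << bit;
    return 0;
}

/* Largest supported transform block edge length, or 0 for an empty mask. */
static uint32_t max_transform_block_size(uint32_t transformBlockSizes)
{
    for (uint32_t bit = 4; bit-- > 0; )
        if (transformBlockSizes & (1u << bit))
            return 4u << bit;
    return 0;
}
```

Whether an implementation also supports every size between the reported minimum and maximum is not implied by this helper; the mask itself remains the authoritative answer.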
39.10.3. Create Information
When creating a Video Session object with
VkVideoSessionCreateInfoKHR::pVideoProfile->videoCodecOperation
specified as VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT
, add a
VkVideoEncodeH265SessionCreateInfoEXT structure to the pNext
chain of the VkVideoSessionCreateInfoKHR structure passed to
vkCreateVideoSessionKHR in order to specify the H.265-specific video
encoder session creation parameters.
The VkVideoEncodeH265SessionCreateInfoEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265SessionCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkVideoEncodeH265CreateFlagsEXT flags;
const VkExtensionProperties* pStdExtensionVersion;
} VkVideoEncodeH265SessionCreateInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
flags
is reserved for future use. -
pStdExtensionVersion
is a pointer to a VkExtensionProperties structure specifying the H.265 codec extension version.
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265CreateFlagsEXT;
VkVideoEncodeH265CreateFlagsEXT
is a bitmask type for setting a mask,
but is currently reserved for future use.
39.10.4. Encoder H.265 Video Session Parameters Object
When creating a Video Session Parameters object, add a
VkVideoEncodeH265SessionParametersCreateInfoEXT structure to the
pNext
chain of the VkVideoSessionParametersCreateInfoKHR
structure passed to vkCreateVideoSessionParametersKHR in order to
specify the H.265-specific video encoder session parameters.
The VkVideoEncodeH265SessionParametersCreateInfoEXT
structure is
defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t maxVpsStdCount;
uint32_t maxSpsStdCount;
uint32_t maxPpsStdCount;
const VkVideoEncodeH265SessionParametersAddInfoEXT* pParametersAddInfo;
} VkVideoEncodeH265SessionParametersCreateInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
maxVpsStdCount
is the maximum number of entries of type StdVideoH265VideoParameterSet
within VkVideoSessionParametersKHR
. -
maxSpsStdCount
is the maximum number of entries of type StdVideoH265SequenceParameterSet
within VkVideoSessionParametersKHR
. -
maxPpsStdCount
is the maximum number of entries of type StdVideoH265PictureParameterSet
within VkVideoSessionParametersKHR
. -
pParametersAddInfo
is NULL
or a pointer to a VkVideoEncodeH265SessionParametersAddInfoEXT structure specifying the video session parameters to add upon creation of this object.
When a VkVideoSessionParametersKHR object contains
maxVpsStdCount
StdVideoH265VideoParameterSet
entries, no
additional StdVideoH265VideoParameterSet
entries can be added to it,
and VK_ERROR_TOO_MANY_OBJECTS
will be returned if an attempt is made
to add these entries.
When a VkVideoSessionParametersKHR object contains
maxSpsStdCount
StdVideoH265SequenceParameterSet
entries, no
additional StdVideoH265SequenceParameterSet
entries can be added to it,
and VK_ERROR_TOO_MANY_OBJECTS
will be returned if an attempt is made
to add these entries.
When a VkVideoSessionParametersKHR object contains
maxPpsStdCount
StdVideoH265PictureParameterSet
entries, no
additional StdVideoH265PictureParameterSet
entries can be added to it,
and VK_ERROR_TOO_MANY_OBJECTS
will be returned if an attempt is made
to add these entries.
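An application can track the remaining capacity itself so that it never triggers the error described above. The following is a minimal sketch: ParamSetBudget and budget_try_add are hypothetical names, the return codes are local stand-ins rather than VkResult values, and the sketch assumes every added entry consumes one slot (it does not model how entries with an already-used ID are treated).

```c
#include <stdint.h>

/* Local stand-ins for the result codes; not the real VkResult enum. */
enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR_TOO_MANY_OBJECTS = -10
};

typedef struct {
    uint32_t maxVps, maxSps, maxPps; /* limits fixed at creation time */
    uint32_t vps, sps, pps;          /* entries currently stored      */
} ParamSetBudget;

/* Accounts for an add request. The request is rejected as a whole, leaving
 * the counters unchanged, if any of the three limits would be exceeded. */
static int budget_try_add(ParamSetBudget *b,
                          uint32_t addVps, uint32_t addSps, uint32_t addPps)
{
    if (addVps > b->maxVps - b->vps ||
        addSps > b->maxSps - b->sps ||
        addPps > b->maxPps - b->pps)
        return SKETCH_ERROR_TOO_MANY_OBJECTS;

    b->vps += addVps;
    b->sps += addSps;
    b->pps += addPps;
    return SKETCH_SUCCESS;
}
```

The subtraction form of the comparison avoids unsigned overflow, since the stored counts never exceed their limits.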
The VkVideoEncodeH265SessionParametersAddInfoEXT
structure is defined
as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersAddInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t vpsStdCount;
const StdVideoH265VideoParameterSet* pVpsStd;
uint32_t spsStdCount;
const StdVideoH265SequenceParameterSet* pSpsStd;
uint32_t ppsStdCount;
const StdVideoH265PictureParameterSet* pPpsStd;
} VkVideoEncodeH265SessionParametersAddInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
vpsStdCount
is the number of VPS elements in pVpsStd
. -
pVpsStd
is a pointer to an array of vpsStdCount
StdVideoH265VideoParameterSet
structures representing H.265 video parameter sets. -
spsStdCount
is the number of SPS elements in pSpsStd
. -
pSpsStd
is a pointer to an array of spsStdCount
StdVideoH265SequenceParameterSet
structures representing H.265 sequence parameter sets. -
ppsStdCount
is the number of PPS elements in pPpsStd
. -
pPpsStd
is a pointer to an array of ppsStdCount
StdVideoH265PictureParameterSet
structures representing H.265 picture parameter sets.
39.10.5. Frame Encoding
In order to encode a frame, add a VkVideoEncodeH265VclFrameInfoEXT
structure to the pNext
chain of the VkVideoEncodeInfoKHR
structure passed to the vkCmdEncodeVideoKHR command.
The VkVideoEncodeH265VclFrameInfoEXT structure representing a frame encode operation is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265VclFrameInfoEXT {
VkStructureType sType;
const void* pNext;
const VkVideoEncodeH265ReferenceListsEXT* pReferenceFinalLists;
uint32_t naluSliceSegmentEntryCount;
const VkVideoEncodeH265NaluSliceSegmentEXT* pNaluSliceSegmentEntries;
const StdVideoEncodeH265PictureInfo* pCurrentPictureInfo;
} VkVideoEncodeH265VclFrameInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
pReferenceFinalLists
is NULL
or a pointer to a VkVideoEncodeH265ReferenceListsEXT structure specifying the reference lists to be used for the current picture. -
naluSliceSegmentEntryCount
is the number of slice segment NALUs in the frame. -
pNaluSliceSegmentEntries
is a pointer to an array of VkVideoEncodeH265NaluSliceSegmentEXT structures specifying the division of the current picture into slice segments and the properties of these slice segments. -
pCurrentPictureInfo
is a pointer to a StdVideoEncodeH265PictureInfo
structure specifying the syntax and other codec-specific information from the H.265 specification, associated with this picture.
The VkVideoEncodeH265NaluSliceSegmentEXT structure representing a slice segment is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265NaluSliceSegmentEXT {
VkStructureType sType;
const void* pNext;
uint32_t ctbCount;
const VkVideoEncodeH265ReferenceListsEXT* pReferenceFinalLists;
const StdVideoEncodeH265SliceSegmentHeader* pSliceSegmentHeaderStd;
} VkVideoEncodeH265NaluSliceSegmentEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
ctbCount
is the number of CTBs in this slice segment. -
pReferenceFinalLists
is NULL
or a pointer to a VkVideoEncodeH265ReferenceListsEXT structure specifying the reference lists to be used for the current slice segment. If pReferenceFinalLists
is not NULL
, these reference lists override the reference lists provided in VkVideoEncodeH265VclFrameInfoEXT::pReferenceFinalLists
. -
pSliceSegmentHeaderStd
is a pointer to a StdVideoEncodeH265SliceSegmentHeader
structure specifying the slice segment header for the current slice segment.
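When the application configures ctbCount itself, one simple layout is to give each slice segment a whole number of CTB rows, which also satisfies implementations that do not report VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_EXT. The sketch below assumes a picture with a single tile, so that the address of a slice segment's first CTB is its raster-scan index. SegmentLayout and layout_row_aligned_segments are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef struct {
    uint32_t ctbCount;            /* candidate for ...NaluSliceSegmentEXT::ctbCount */
    uint32_t sliceSegmentAddress; /* candidate for slice_segment_address            */
} SegmentLayout;

/* Splits a single-tile picture of ctbCols x ctbRows CTBs into segmentCount
 * slice segments, each covering whole CTB rows. Rows are spread as evenly
 * as possible; earlier segments receive the leftover rows.
 * Returns 0 on success, -1 if the split is impossible. */
static int layout_row_aligned_segments(uint32_t ctbCols, uint32_t ctbRows,
                                       uint32_t segmentCount,
                                       SegmentLayout *out)
{
    if (ctbCols == 0 || segmentCount == 0 || segmentCount > ctbRows)
        return -1;

    uint32_t baseRows = ctbRows / segmentCount;
    uint32_t extraRows = ctbRows % segmentCount;
    uint32_t nextAddress = 0;

    for (uint32_t i = 0; i < segmentCount; ++i) {
        uint32_t rows = baseRows + (i < extraRows ? 1u : 0u);
        out[i].ctbCount = rows * ctbCols;
        out[i].sliceSegmentAddress = nextAddress;
        nextAddress += out[i].ctbCount;
    }
    return 0;
}
```

For a picture of 30 x 17 CTBs split into four segments, this yields CTB counts of 150, 120, 120, and 120, starting at addresses 0, 150, 270, and 390.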
The VkVideoEncodeH265DpbSlotInfoEXT structure, representing a reconstructed picture that is being used as a reference picture, is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265DpbSlotInfoEXT {
VkStructureType sType;
const void* pNext;
int8_t slotIndex;
const StdVideoEncodeH265ReferenceInfo* pStdReferenceInfo;
} VkVideoEncodeH265DpbSlotInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
slotIndex
is the DPB Slot index for this picture. -
pStdReferenceInfo
is a pointer to a StdVideoEncodeH265ReferenceInfo
structure specifying the syntax and other codec-specific information from the H.265 specification, associated with this reference picture.
The VkVideoEncodeH265ReferenceListsEXT structure representing reference lists is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265ReferenceListsEXT {
VkStructureType sType;
const void* pNext;
uint8_t referenceList0EntryCount;
const VkVideoEncodeH265DpbSlotInfoEXT* pReferenceList0Entries;
uint8_t referenceList1EntryCount;
const VkVideoEncodeH265DpbSlotInfoEXT* pReferenceList1Entries;
const StdVideoEncodeH265ReferenceModifications* pReferenceModifications;
} VkVideoEncodeH265ReferenceListsEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
referenceList0EntryCount
is the number of reference pictures in reference list L0 and is identical to StdVideoEncodeH265SliceSegmentHeader::num_ref_idx_l0_active_minus1
+ 1. -
pReferenceList0Entries
is a pointer to an array of referenceList0EntryCount
VkVideoEncodeH265DpbSlotInfoEXT structures specifying the reference list L0 entries for the current picture. -
referenceList1EntryCount
is the number of reference pictures in reference list L1 and is identical to StdVideoEncodeH265SliceSegmentHeader::num_ref_idx_l1_active_minus1
+ 1. -
pReferenceList1Entries
is a pointer to an array of referenceList1EntryCount
VkVideoEncodeH265DpbSlotInfoEXT structures specifying the reference list L1 entries for the current picture. -
pReferenceModifications
is a pointer to a StdVideoEncodeH265ReferenceModifications
structure specifying reference list modifications.
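Two rules about reference lists lend themselves to small helpers: a slice segment's own pReferenceFinalLists, when not NULL, takes precedence over the frame-level one, and each entry count equals the corresponding num_ref_idx_lN_active_minus1 value plus one. The sketch below uses a reduced stand-in structure (RefListsSketch) that carries only the two counts; the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for VkVideoEncodeH265ReferenceListsEXT. */
typedef struct {
    uint8_t referenceList0EntryCount;
    uint8_t referenceList1EntryCount;
} RefListsSketch;

/* Returns the lists that apply to a slice segment: its own lists if
 * provided, otherwise the lists given for the whole frame. */
static const RefListsSketch *
effective_reference_lists(const RefListsSketch *frameLists,
                          const RefListsSketch *sliceSegmentLists)
{
    return sliceSegmentLists != NULL ? sliceSegmentLists : frameLists;
}

/* Checks that an entry count agrees with the matching
 * num_ref_idx_lN_active_minus1 field of the slice segment header. */
static bool ref_count_matches_header(uint8_t entryCount,
                                     uint8_t numRefIdxActiveMinus1)
{
    return (unsigned)entryCount == (unsigned)numRefIdxActiveMinus1 + 1u;
}
```

A validation pass could resolve the effective lists for each slice segment first and then compare both counts against that segment's header.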
The VkVideoEncodeH265EmitPictureParametersEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265EmitPictureParametersEXT {
VkStructureType sType;
const void* pNext;
uint8_t vpsId;
uint8_t spsId;
VkBool32 emitVpsEnable;
VkBool32 emitSpsEnable;
uint32_t ppsIdEntryCount;
const uint8_t* ppsIdEntries;
} VkVideoEncodeH265EmitPictureParametersEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
vpsId
is the H.265 VPS ID for the H.265 VPS to insert in the bitstream. The VPS ID must match the VPS provided in vpsStd
of VkVideoEncodeH265SessionParametersCreateInfoEXT. This is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR. -
spsId
is the H.265 SPS ID for the H.265 SPS to insert in the bitstream. The SPS ID must match one of the IDs of the SPS(s) provided in pSpsStd
of VkVideoEncodeH265SessionParametersCreateInfoEXT to identify the SPS parameter set to insert in the bitstream. This is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR. -
emitVpsEnable
enables the emitting of the VPS structure with an ID of vpsId
. -
emitSpsEnable
enables the emitting of the SPS structure with an ID of spsId
. -
ppsIdEntryCount
is the number of entries in ppsIdEntries
. If this parameter is 0
, no PPS entries are emitted in the bitstream. -
ppsIdEntries
is a pointer to an array of H.265 PPS IDs identifying the PPSs to insert in the bitstream. Each PPS ID must match one of the IDs of the PPS(s) provided in pPpsStd
of VkVideoEncodeH265SessionParametersCreateInfoEXT to identify the PPS parameter set to insert in the bitstream. This is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR.
39.10.6. Rate control
The VkVideoEncodeH265RateControlInfoEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265RateControlInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t gopFrameCount;
uint32_t idrPeriod;
uint32_t consecutiveBFrameCount;
VkVideoEncodeH265RateControlStructureFlagBitsEXT rateControlStructure;
uint8_t subLayerCount;
} VkVideoEncodeH265RateControlInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
gopFrameCount
is the number of frames contained within the group of pictures (GOP), starting from an intra frame and until the next intra frame. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX
, the GOP length is treated as infinite. -
idrPeriod
is the interval, in terms of number of frames, between two IDR frames. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX
, the IDR period is treated as infinite. -
consecutiveBFrameCount
is the number of consecutive B-frames between I- and/or P-frames within the GOP. -
rateControlStructure
is a VkVideoEncodeH265RateControlStructureFlagBitsEXT value specifying the expected encode stream reference structure, to aid in rate control calculations. -
subLayerCount
specifies the number of sub layers enabled in the stream.
In order to provide H.265-specific stream rate control parameters, add a
VkVideoEncodeH265RateControlInfoEXT
structure to the pNext
chain
of the VkVideoEncodeRateControlInfoKHR structure in the pNext
chain of the VkVideoCodingControlInfoKHR structure passed to the
vkCmdControlVideoCodingKHR command.
The parameters from this structure act as guidance for implementations to apply various rate control heuristics.
It is possible to infer the picture type to be used when encoding a frame,
on the basis of the values provided for consecutiveBFrameCount
,
idrPeriod
, and gopFrameCount
, but this inferred picture type
will not be used by implementations to override the picture type provided in
vkCmdEncodeVideoKHR.
Additionally, it is not required for the video session to be reset if the
inferred picture type does not match the actual picture type.
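The inference mentioned above can be sketched as follows. This is one plausible convention, not a rule stated by the specification: it assumes frames are numbered from 0 in display order, that GOP positions restart at each IDR frame, and that every run of consecutiveBFrameCount B-frames is followed by a P-frame. PictureTypeSketch and infer_picture_type are hypothetical names, and a value of 0 for gopFrameCount or idrPeriod (implementation chooses) is reported as unknown because the application cannot see the chosen value.

```c
#include <stdint.h>

typedef enum {
    PIC_UNKNOWN, /* a period of 0 was given, so the implementation decides */
    PIC_IDR,
    PIC_I,
    PIC_P,
    PIC_B
} PictureTypeSketch;

/* Infers a picture type for frameIndex from the rate control hints.
 * UINT32_MAX is treated as an infinite period, as described above. */
static PictureTypeSketch infer_picture_type(uint64_t frameIndex,
                                            uint32_t gopFrameCount,
                                            uint32_t idrPeriod,
                                            uint32_t consecutiveBFrameCount)
{
    if (gopFrameCount == 0 || idrPeriod == 0)
        return PIC_UNKNOWN;

    /* Position relative to the most recent IDR frame. */
    uint64_t posInIdr = (idrPeriod == UINT32_MAX)
                            ? frameIndex
                            : frameIndex % idrPeriod;
    if (posInIdr == 0)
        return PIC_IDR;

    /* Position relative to the most recent intra frame. */
    uint64_t posInGop = (gopFrameCount == UINT32_MAX)
                            ? posInIdr
                            : posInIdr % gopFrameCount;
    if (posInGop == 0)
        return PIC_I;

    /* Each group of B-frames is closed by a P-frame. */
    uint64_t groupLength = (uint64_t)consecutiveBFrameCount + 1;
    return (posInGop % groupLength == 0) ? PIC_P : PIC_B;
}
```

Such a helper is useful only for checking that the hints are consistent with the picture types the application actually submits; as noted above, the picture type provided to vkCmdEncodeVideoKHR is what the implementation uses.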
Possible values of
VkVideoEncodeH265RateControlInfoEXT::rateControlStructure
,
specifying a video stream reference structure as a hint for the rate control
implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265RateControlStructureFlagBitsEXT {
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT = 0,
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_FLAT_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_DYADIC_BIT_EXT = 0x00000002,
} VkVideoEncodeH265RateControlStructureFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT
is 0
, and specifies a reference structure unknown at the time of stream rate control configuration. -
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_FLAT_BIT_EXT
specifies a flat reference structure. -
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_DYADIC_BIT_EXT
specifies a dyadic reference structure.
The VkVideoEncodeH265RateControlLayerInfoEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265RateControlLayerInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t temporalId;
VkBool32 useInitialRcQp;
VkVideoEncodeH265QpEXT initialRcQp;
VkBool32 useMinQp;
VkVideoEncodeH265QpEXT minQp;
VkBool32 useMaxQp;
VkVideoEncodeH265QpEXT maxQp;
VkBool32 useMaxFrameSize;
VkVideoEncodeH265FrameSizeEXT maxFrameSize;
} VkVideoEncodeH265RateControlLayerInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
temporalId
specifies the H.265 temporal ID of the video coding layer that the settings provided in this structure and its parent VkVideoEncodeRateControlLayerInfoKHR structure apply to. -
useInitialRcQp
indicates whether the values within initialRcQp
should be used by the implementation. -
initialRcQp
provides the QP values for each picture type, to be used in rate control calculations at the start of video encode operations on a newly-created video session, or immediately after a session reset. These values are ignored when VkVideoEncodeRateControlInfoKHR::rateControlMode
is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. -
useMinQp
indicates whether the values within minQp
should be used by the implementation. When it is set to VK_FALSE
, the implementation ignores the values in minQp
and chooses suitable values. -
minQp
provides the lower bound on the QP values for each picture type, to be used in rate control calculations. -
useMaxQp
indicates whether the values within maxQp
should be used by the implementation. When it is set to VK_FALSE
, the implementation ignores the values in maxQp
and chooses suitable values. -
maxQp
provides the upper bound on the QP values for each picture type, to be used in rate control calculations. -
useMaxFrameSize
indicates whether the values within maxFrameSize
should be used by the implementation. -
maxFrameSize
provides the upper bound on the encoded frame size for each picture type. The implementation does not guarantee the encoded frame sizes will be within the specified limits, however these limits may be used as a guide in rate control calculations. If enabled and not set properly, the maxQp
limit may prevent the implementation from respecting the maxFrameSize
limit.
H.265-specific per-layer rate control parameters must be specified by
adding a VkVideoEncodeH265RateControlLayerInfoEXT
structure to the
pNext
chain of each VkVideoEncodeRateControlLayerInfoKHR
structure in a call to the vkCmdControlVideoCodingKHR command, when the
command buffer context has an active video encode H.265 session.
The VkVideoEncodeH265QpEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265QpEXT {
int32_t qpI;
int32_t qpP;
int32_t qpB;
} VkVideoEncodeH265QpEXT;
-
qpI
is the QP to be used for I-frames. -
qpP
is the QP to be used for P-frames. -
qpB
is the QP to be used for B-frames.
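To illustrate how per-picture-type bounds such as minQp and maxQp relate to a candidate QP, the sketch below clamps a QP triple against optional limits. It is not a description of how any implementation performs rate control: QpSketch is a local stand-in with the same three members as VkVideoEncodeH265QpEXT, apply_qp_bounds is a hypothetical helper, and leaving a side unbounded when its use flag is false is a simplification (a real implementation chooses its own values in that case).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in with the same members as VkVideoEncodeH265QpEXT. */
typedef struct {
    int32_t qpI;
    int32_t qpP;
    int32_t qpB;
} QpSketch;

/* Clamps one value; a bound is applied only when its flag is set.
 * If both are set and lo > hi, the upper bound wins. */
static int32_t clamp_optional(int32_t v, bool useLo, int32_t lo,
                              bool useHi, int32_t hi)
{
    if (useLo && v < lo) v = lo;
    if (useHi && v > hi) v = hi;
    return v;
}

/* Applies the optional lower and upper bounds per picture type. */
static QpSketch apply_qp_bounds(QpSketch qp,
                                bool useMinQp, QpSketch minQp,
                                bool useMaxQp, QpSketch maxQp)
{
    QpSketch r;
    r.qpI = clamp_optional(qp.qpI, useMinQp, minQp.qpI, useMaxQp, maxQp.qpI);
    r.qpP = clamp_optional(qp.qpP, useMinQp, minQp.qpP, useMaxQp, maxQp.qpP);
    r.qpB = clamp_optional(qp.qpB, useMinQp, minQp.qpB, useMaxQp, maxQp.qpB);
    return r;
}
```

With bounds of 20 and 40 for every picture type, a candidate of (10, 30, 60) becomes (20, 30, 40); with only the upper bound enabled it becomes (10, 30, 40).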
The VkVideoEncodeH265FrameSizeEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265FrameSizeEXT {
uint32_t frameISize;
uint32_t framePSize;
uint32_t frameBSize;
} VkVideoEncodeH265FrameSizeEXT;
-
frameISize
is the size in bytes to be used for I-frames. -
framePSize
is the size in bytes to be used for P-frames. -
frameBSize
is the size in bytes to be used for B-frames.