42. Video Coding
Vulkan implementations may expose one or more queue families supporting video coding operations. These operations are performed by recording them into a command buffer within a video coding scope, and submitting them to queues with compatible video coding capabilities.
The Vulkan video functionalities are designed to be made available through a set of APIs built on top of each other, consisting of:
- A core API providing common video coding functionalities,
- APIs providing codec-independent video decode and video encode related functionalities, respectively,
- Additional codec-specific APIs built on top of those.
This chapter details the fundamental components and operations of these.
42.1. Video Picture Resources
In the context of video coding, multidimensional arrays of image data that can be used as the source or target of video coding operations are referred to as video picture resources. They may store additional metadata that includes implementation-private information used during the execution of video coding operations, as discussed later.
Video picture resources are backed by VkImage objects. Individual subregions of VkImageView objects created from such resources can be used as decode output pictures, encode input pictures, reconstructed pictures, and/or reference pictures.
The parameters of a video picture resource are specified using a VkVideoPictureResourceInfoKHR structure.
The VkVideoPictureResourceInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoPictureResourceInfoKHR {
VkStructureType sType;
const void* pNext;
VkOffset2D codedOffset;
VkExtent2D codedExtent;
uint32_t baseArrayLayer;
VkImageView imageViewBinding;
} VkVideoPictureResourceInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- codedOffset is the offset in texels of the image subregion to use.
- codedExtent is the size in pixels of the coded image data.
- baseArrayLayer is the array layer of the image view specified in imageViewBinding to use as the video picture resource.
- imageViewBinding is an image view representing the video picture resource.
The image subresource referred to by such a structure is defined as the image array layer index specified in baseArrayLayer relative to the image subresource range the image view specified in imageViewBinding was created with.
The meaning of codedOffset and codedExtent depends on the command and context the video picture resource is used in, as well as on the used video profile and corresponding codec-specific semantics, as described later.
A video picture resource is uniquely defined by the image subresource referred to by an instance of this structure, together with the codedOffset and codedExtent members, which identify the image subregion within that image subresource that corresponds to the video picture resource according to the particular codec-specific semantics.
Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. As a result, given an effective image subregion corresponding to a video picture resource, the actual image subregion accessed may be larger than that as it may include additional padding texels due to the picture access granularity. Any writes performed by video coding operations to such padding texels will result in undefined texel values.
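As a rough illustration of the padding described above, the following C sketch grows a coded region outward to the picture access granularity. The types Offset2D, Extent2D, and Rect2D and all function names are hypothetical stand-ins that only mirror the shape of VkOffset2D and VkExtent2D, and the outward-rounding rule is an assumption about how the accessed region might be widened, not something this section specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins that mirror the shape of VkOffset2D and VkExtent2D. */
typedef struct { int32_t x, y; } Offset2D;
typedef struct { uint32_t width, height; } Extent2D;
typedef struct { Offset2D offset; Extent2D extent; } Rect2D;

/* Round v down to a multiple of g. */
uint32_t align_down(uint32_t v, uint32_t g)
{
    assert(g != 0);
    return v - (v % g);
}

/* Round v up to a multiple of g (overflow near UINT32_MAX is not handled). */
uint32_t align_up(uint32_t v, uint32_t g)
{
    assert(g != 0);
    return align_down(v + (g - 1), g);
}

/* Assumed rule: grow the coded region outward until every edge sits on a
 * multiple of the picture access granularity. */
Rect2D padded_access_region(Offset2D codedOffset, Extent2D codedExtent,
                            Extent2D granularity)
{
    assert(codedOffset.x >= 0 && codedOffset.y >= 0);
    uint32_t x0 = align_down((uint32_t)codedOffset.x, granularity.width);
    uint32_t y0 = align_down((uint32_t)codedOffset.y, granularity.height);
    uint32_t x1 = align_up((uint32_t)codedOffset.x + codedExtent.width,
                           granularity.width);
    uint32_t y1 = align_up((uint32_t)codedOffset.y + codedExtent.height,
                           granularity.height);
    Rect2D r = { { (int32_t)x0, (int32_t)y0 }, { x1 - x0, y1 - y0 } };
    return r;
}
```

Under this assumed rule, a 1920x1080 coded extent at offset (0, 0) with a 16x16 granularity gives a 1920x1088 region, so eight rows of padding texels could be touched.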
Two video picture resources match if they refer to the same image subresource and they specify identical codedOffset and codedExtent values.
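The matching rule can be sketched as a simple comparison. The PictureResource type below is a hypothetical, trimmed-down mirror of VkVideoPictureResourceInfoKHR with an integer standing in for the VkImageView handle. The sketch assumes that the same image view and array layer identify the same image subresource; it does not detect two different image views that alias the same subresource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t x, y; } Offset2D;
typedef struct { uint32_t width, height; } Extent2D;

/* Hypothetical, trimmed-down mirror of VkVideoPictureResourceInfoKHR. */
typedef struct {
    Offset2D codedOffset;
    Extent2D codedExtent;
    uint32_t baseArrayLayer;
    uint64_t imageViewBinding; /* stand-in for a VkImageView handle */
} PictureResource;

bool picture_resources_match(const PictureResource *a, const PictureResource *b)
{
    assert(a != NULL && b != NULL);
    /* Assumption: same view + same layer means same image subresource. */
    return a->imageViewBinding == b->imageViewBinding
        && a->baseArrayLayer == b->baseArrayLayer
        && a->codedOffset.x == b->codedOffset.x
        && a->codedOffset.y == b->codedOffset.y
        && a->codedExtent.width == b->codedExtent.width
        && a->codedExtent.height == b->codedExtent.height;
}
```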
42.2. Decoded Picture Buffer
An integral part of video coding pipelines is the reconstruction of pictures from a compressed video bitstream. A reconstructed picture is a video picture resource resulting from this process.
Such reconstructed pictures can be used as reference pictures in subsequent video coding operations to provide predictions of the values of samples of subsequently decoded or encoded pictures. The correct use of such reconstructed pictures as reference pictures is driven by the video compression standard, the implementation, and the application-specific use cases.
The list of reference pictures used to provide such predictions within a single video coding operation is referred to as the list of active reference pictures.
The decoded picture buffer (DPB) is an indexed data structure that maintains the set of reference pictures available to be used in video coding operations. Individual indexed entries of the DPB are referred to as the decoded picture buffer (DPB) slots. The range of valid DPB slot indices is between zero and N-1, where N is the capacity of the DPB. Each DPB slot can refer to a reference picture containing a video frame or can refer to up to two reference pictures containing the top and/or bottom fields that, when both present, together represent a full video frame.
In Vulkan, the state and the backing store of the DPB are separated as follows:
- The state of individual DPB slots is maintained by video session objects.
- The backing store of DPB slots is provided by subregions of VkImage objects used as video picture resources.
In addition, the implementation may also maintain opaque metadata associated with DPB slots, including reference picture metadata.
Such metadata may be stored by the implementation as part of the DPB slot state maintained by the video session, or as part of the video picture resource backing the DPB slot.
Any metadata stored in the video picture resources backing DPB slots is independent of the video session used to store it, hence such video picture resources can be shared with other video sessions. Correspondingly, any metadata that is dependent on the video session will always be stored as part of the DPB slot state maintained by that video session.
The responsibility of managing the DPB is split between the application and the implementation as follows:
- The application maintains the association between DPB slot indices and corresponding video picture resources.
- The implementation maintains global and per-slot opaque reference picture metadata.
In addition, the application is also responsible for managing the mapping between the codec-specific picture IDs and DPB slots, and any other codec-specific states unless otherwise specified.
42.2.1. DPB Slot States
At a given time, each DPB slot is either in active or inactive state. Initially, all DPB slots managed by a video session are in inactive state.
A DPB slot can be activated by using it as the target of picture reconstruction within a video coding operation, changing its state to active.
As part of the picture reconstruction, the implementation may also generate reference picture metadata.
If such a video coding operation completes successfully, the activated DPB slot will have a valid picture reference and the reconstructed picture is associated with the DPB slot. This is true even if the DPB slot is used as the target of a picture reconstruction that only sets up a top field or bottom field reference picture and thus does not yet refer to a complete frame. However, if any data provided as input to such a video coding operation is not compliant with the video compression standard used, that video coding operation may complete unsuccessfully, in which case the activated DPB slot will have an invalid picture reference. This is true even if the DPB slot previously had a valid picture reference to a top field or bottom field reference picture, but the reconstruction of the other field corresponding to the DPB slot failed.
The application can use queries to get feedback about the outcome of video coding operations and use the resulting VkQueryResultStatusKHR value to determine whether the video coding operation completed successfully (result status is positive) or unsuccessfully (result status is negative).
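The sign convention described above can be captured in a small helper. This is a minimal sketch: the CodingOutcome type and the function name are hypothetical, the status is passed as a plain integer rather than as a VkQueryResultStatusKHR value, and treating zero as "result not yet available" is an assumption that the text above does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical application-side classification of a result status value. */
typedef enum {
    OUTCOME_FAILED,
    OUTCOME_PENDING,
    OUTCOME_SUCCEEDED
} CodingOutcome;

CodingOutcome classify_result_status(int32_t status)
{
    if (status > 0) {
        return OUTCOME_SUCCEEDED; /* positive: completed successfully */
    }
    if (status < 0) {
        return OUTCOME_FAILED;    /* negative: completed unsuccessfully */
    }
    return OUTCOME_PENDING;       /* zero: assumed to mean "not ready yet" */
}
```

An application could use the failed outcome to mark the DPB slot targeted by the operation as holding an invalid picture reference in its own bookkeeping.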
Using a reference picture associated with a DPB slot that has an invalid picture reference as an active reference picture in subsequent video coding operations is legal. However, the contents of the outputs of such operations are undefined, and any DPB slots activated by such video coding operations will also have an invalid picture reference. This is true even if such video coding operations may otherwise complete successfully.
A DPB slot can also be deactivated by the application, changing its state to inactive and invalidating any picture references and reference picture metadata associated with the DPB slot.
A DPB slot can be activated with a new frame even if it is already active. In this case all previous associations of the DPB slot with reference pictures are replaced with an association with the reconstructed picture used to activate it. If an already active DPB slot is activated with a reconstructed field picture, then the behavior is as follows:
- If the DPB slot is currently associated with a frame, then that association is replaced with an association with the reconstructed field picture used to activate it.
- If the DPB slot is not currently associated with a top field picture and the DPB slot is activated with a top field picture, or if the DPB slot is not currently associated with a bottom field picture and the DPB slot is activated with a bottom field picture, then the DPB slot is associated with the reconstructed field picture used to activate it, without disturbing the other field picture association, if any.
- If the DPB slot is currently associated with a top field picture and the DPB slot is activated with a new top field picture, or if the DPB slot is currently associated with a bottom field picture and the DPB slot is activated with a new bottom field picture, then that association is replaced with an association with the reconstructed field picture used to activate it, without disturbing the other field picture association, if any.
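The activation rules above can be modeled on the application side with a small state record per DPB slot. The following C sketch is one possible way to track the associations; the DpbSlot type, the PictureKind enum, and the integer picture IDs are all hypothetical, and the model deliberately ignores whether a picture reference is valid or invalid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { PIC_FRAME, PIC_TOP_FIELD, PIC_BOTTOM_FIELD } PictureKind;

/* Hypothetical application-side record of what one DPB slot refers to. */
typedef struct {
    bool active;
    bool hasFrame, hasTopField, hasBottomField;
    int  frameId, topId, bottomId; /* application-chosen picture IDs */
} DpbSlot;

/* Deactivation drops every association held by the slot. */
void dpb_slot_deactivate(DpbSlot *s)
{
    assert(s != NULL);
    *s = (DpbSlot){0};
}

void dpb_slot_activate(DpbSlot *s, PictureKind kind, int pictureId)
{
    assert(s != NULL);
    /* A new frame replaces all previous associations, and a field
     * replaces an existing frame association. */
    if (kind == PIC_FRAME || s->hasFrame) {
        *s = (DpbSlot){0};
    }
    s->active = true;
    switch (kind) {
    case PIC_FRAME:
        s->hasFrame = true;
        s->frameId = pictureId;
        break;
    case PIC_TOP_FIELD:
        /* Sets or replaces the top field; the bottom field is untouched. */
        s->hasTopField = true;
        s->topId = pictureId;
        break;
    case PIC_BOTTOM_FIELD:
        /* Sets or replaces the bottom field; the top field is untouched. */
        s->hasBottomField = true;
        s->bottomId = pictureId;
        break;
    }
}
```

Keeping the two field associations separate from the frame association makes each of the three cases in the list a single assignment, at the cost of a slightly larger record.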
42.3. Video Profiles
The VkVideoProfileInfoKHR structure is defined as follows:
// Provided by VK_KHR_video_queue
typedef struct VkVideoProfileInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoCodecOperationFlagBitsKHR videoCodecOperation;
VkVideoChromaSubsamplingFlagsKHR chromaSubsampling;
VkVideoComponentBitDepthFlagsKHR lumaBitDepth;
VkVideoComponentBitDepthFlagsKHR chromaBitDepth;
} VkVideoProfileInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- videoCodecOperation is a VkVideoCodecOperationFlagBitsKHR value specifying a video codec operation.
- chromaSubsampling is a bitmask of VkVideoChromaSubsamplingFlagBitsKHR specifying video chroma subsampling information.
- lumaBitDepth is a bitmask of VkVideoComponentBitDepthFlagBitsKHR specifying video luma bit depth information.
- chromaBitDepth is a bitmask of VkVideoComponentBitDepthFlagBitsKHR specifying video chroma bit depth information.
Video profiles are provided as input to video capability queries such as vkGetPhysicalDeviceVideoCapabilitiesKHR or vkGetPhysicalDeviceVideoFormatPropertiesKHR, as well as when creating resources to be used by video coding operations such as images, buffers, query pools, and video sessions.
The full description of a video profile is specified by an instance of this structure, and the codec-specific and auxiliary structures provided in its pNext chain.
When this structure is specified as an input parameter to vkGetPhysicalDeviceVideoCapabilitiesKHR, or through the pProfiles member of a VkVideoProfileListInfoKHR structure in the pNext chain of the input parameter of a query command such as vkGetPhysicalDeviceVideoFormatPropertiesKHR or vkGetPhysicalDeviceImageFormatProperties2, the following error codes indicate specific causes of the failure of the query operation:
- VK_ERROR_VIDEO_PICTURE_LAYOUT_NOT_SUPPORTED_KHR indicates that the requested video picture layout (e.g. through the pictureLayout member of a VkVideoDecodeH264ProfileInfoKHR structure included in the pNext chain of VkVideoProfileInfoKHR) is not supported.
- VK_ERROR_VIDEO_PROFILE_OPERATION_NOT_SUPPORTED_KHR indicates that a video profile operation specified by videoCodecOperation is not supported.
- VK_ERROR_VIDEO_PROFILE_FORMAT_NOT_SUPPORTED_KHR indicates that video format parameters specified by chromaSubsampling, lumaBitDepth, or chromaBitDepth are not supported.
- VK_ERROR_VIDEO_PROFILE_CODEC_NOT_SUPPORTED_KHR indicates that the codec-specific parameters corresponding to the video codec operation are not supported.
Possible values of VkVideoProfileInfoKHR::videoCodecOperation, specifying the type of video coding operation and video compression standard used by a video profile, are:
// Provided by VK_KHR_video_queue
typedef enum VkVideoCodecOperationFlagBitsKHR {
VK_VIDEO_CODEC_OPERATION_NONE_KHR = 0,
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_EXT_video_encode_h264
VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT = 0x00010000,
#endif
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_EXT_video_encode_h265
VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT = 0x00020000,
#endif
// Provided by VK_KHR_video_decode_h264
VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR = 0x00000001,
// Provided by VK_KHR_video_decode_h265
VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR = 0x00000002,
} VkVideoCodecOperationFlagBitsKHR;
- VK_VIDEO_CODEC_OPERATION_NONE_KHR indicates no support for any video codec operations.
- VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR specifies support for H.264 video decode operations.
- VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR specifies support for H.265 video decode operations.
- VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT specifies support for H.264 video encode operations.
- VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT specifies support for H.265 video encode operations.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCodecOperationFlagsKHR;
VkVideoCodecOperationFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCodecOperationFlagBitsKHR.
The video format chroma subsampling is defined with the following enums:
// Provided by VK_KHR_video_queue
typedef enum VkVideoChromaSubsamplingFlagBitsKHR {
VK_VIDEO_CHROMA_SUBSAMPLING_INVALID_KHR = 0,
VK_VIDEO_CHROMA_SUBSAMPLING_MONOCHROME_BIT_KHR = 0x00000001,
VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR = 0x00000002,
VK_VIDEO_CHROMA_SUBSAMPLING_422_BIT_KHR = 0x00000004,
VK_VIDEO_CHROMA_SUBSAMPLING_444_BIT_KHR = 0x00000008,
} VkVideoChromaSubsamplingFlagBitsKHR;
- VK_VIDEO_CHROMA_SUBSAMPLING_MONOCHROME_BIT_KHR specifies that the format is monochrome.
- VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR specifies that the format is 4:2:0 chroma subsampled, i.e. the two chroma components are sampled horizontally and vertically at half the sample rate of the luma component.
- VK_VIDEO_CHROMA_SUBSAMPLING_422_BIT_KHR specifies that the format is 4:2:2 chroma subsampled, i.e. the two chroma components are sampled horizontally at half the sample rate of the luma component.
- VK_VIDEO_CHROMA_SUBSAMPLING_444_BIT_KHR specifies that the format is 4:4:4 chroma sampled, i.e. all three components of the Y′CBCR format are sampled at the same rate, thus there is no chroma subsampling.
Chroma subsampling is described in more detail in the Chroma Reconstruction section.
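To make the sampling ratios concrete, the following C sketch derives the size of each chroma plane from the luma plane size. The bit values are copied from the VkVideoChromaSubsamplingFlagBitsKHR definition above, while the constant names, the Extent2D stand-in, and the function are hypothetical. Rounding odd luma dimensions up is an assumption of this sketch, and reporting a zero-sized chroma plane for monochrome is a convention chosen here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from VkVideoChromaSubsamplingFlagBitsKHR. */
enum {
    CHROMA_MONOCHROME = 0x00000001,
    CHROMA_420        = 0x00000002,
    CHROMA_422        = 0x00000004,
    CHROMA_444        = 0x00000008
};

/* Hypothetical stand-in mirroring the shape of VkExtent2D. */
typedef struct { uint32_t width, height; } Extent2D;

/* Size of each chroma plane for a luma plane of the given size.
 * Expects exactly one subsampling bit. */
Extent2D chroma_plane_extent(uint32_t subsampling, Extent2D luma)
{
    Extent2D c = { 0, 0 };
    switch (subsampling) {
    case CHROMA_420: /* halved horizontally and vertically */
        c.width  = (luma.width + 1) / 2;
        c.height = (luma.height + 1) / 2;
        break;
    case CHROMA_422: /* halved horizontally only */
        c.width  = (luma.width + 1) / 2;
        c.height = luma.height;
        break;
    case CHROMA_444: /* same rate as luma */
        c = luma;
        break;
    case CHROMA_MONOCHROME: /* no chroma data */
        break;
    default:
        assert(!"exactly one chroma subsampling bit expected");
        break;
    }
    return c;
}
```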
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoChromaSubsamplingFlagsKHR;
VkVideoChromaSubsamplingFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoChromaSubsamplingFlagBitsKHR.
Possible values for the video format component bit depth are:
// Provided by VK_KHR_video_queue
typedef enum VkVideoComponentBitDepthFlagBitsKHR {
VK_VIDEO_COMPONENT_BIT_DEPTH_INVALID_KHR = 0,
VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR = 0x00000001,
VK_VIDEO_COMPONENT_BIT_DEPTH_10_BIT_KHR = 0x00000004,
VK_VIDEO_COMPONENT_BIT_DEPTH_12_BIT_KHR = 0x00000010,
} VkVideoComponentBitDepthFlagBitsKHR;
- VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR specifies a component bit depth of 8 bits.
- VK_VIDEO_COMPONENT_BIT_DEPTH_10_BIT_KHR specifies a component bit depth of 10 bits.
- VK_VIDEO_COMPONENT_BIT_DEPTH_12_BIT_KHR specifies a component bit depth of 12 bits.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoComponentBitDepthFlagsKHR;
VkVideoComponentBitDepthFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoComponentBitDepthFlagBitsKHR.
Additional information about the video decode use case can be provided by adding a VkVideoDecodeUsageInfoKHR structure to the pNext chain of VkVideoProfileInfoKHR.
The VkVideoDecodeUsageInfoKHR structure is defined as:
// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeUsageInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoDecodeUsageFlagsKHR videoUsageHints;
} VkVideoDecodeUsageInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- videoUsageHints is a bitmask of VkVideoDecodeUsageFlagBitsKHR specifying hints about the intended use of the video decode profile.
The following bits can be specified in VkVideoDecodeUsageInfoKHR::videoUsageHints as a hint about the video decode use case:
// Provided by VK_KHR_video_decode_queue
typedef enum VkVideoDecodeUsageFlagBitsKHR {
VK_VIDEO_DECODE_USAGE_DEFAULT_KHR = 0,
VK_VIDEO_DECODE_USAGE_TRANSCODING_BIT_KHR = 0x00000001,
VK_VIDEO_DECODE_USAGE_OFFLINE_BIT_KHR = 0x00000002,
VK_VIDEO_DECODE_USAGE_STREAMING_BIT_KHR = 0x00000004,
} VkVideoDecodeUsageFlagBitsKHR;
- VK_VIDEO_DECODE_USAGE_TRANSCODING_BIT_KHR specifies that video decoding is intended to be used in conjunction with video encoding to transcode a video bitstream with the same and/or different codecs.
- VK_VIDEO_DECODE_USAGE_OFFLINE_BIT_KHR specifies that video decoding is intended to be used to consume a local video bitstream.
- VK_VIDEO_DECODE_USAGE_STREAMING_BIT_KHR specifies that video decoding is intended to be used to consume a video bitstream received as a continuous flow over network.
Note
There are no restrictions on the combination of bits that can be specified by the application. However, applications should use reasonable combinations in order for the implementation to be able to select the most appropriate mode of operation for the particular use case.
// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeUsageFlagsKHR;
VkVideoDecodeUsageFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoDecodeUsageFlagBitsKHR.
Additional information about the video encode use case can be provided by adding a VkVideoEncodeUsageInfoKHR structure to the pNext chain of VkVideoProfileInfoKHR.
The VkVideoEncodeUsageInfoKHR structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeUsageInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoEncodeUsageFlagsKHR videoUsageHints;
VkVideoEncodeContentFlagsKHR videoContentHints;
VkVideoEncodeTuningModeKHR tuningMode;
} VkVideoEncodeUsageInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- videoUsageHints is a bitmask of VkVideoEncodeUsageFlagBitsKHR specifying hints about the intended use of the video encode profile.
- videoContentHints is a bitmask of VkVideoEncodeContentFlagBitsKHR specifying hints about the content to be encoded using the video encode profile.
- tuningMode is a VkVideoEncodeTuningModeKHR value specifying the tuning mode to use when encoding with the video profile.
The following bits can be specified in VkVideoEncodeUsageInfoKHR::videoUsageHints as a hint about the video encode use case:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeUsageFlagBitsKHR {
VK_VIDEO_ENCODE_USAGE_DEFAULT_KHR = 0,
VK_VIDEO_ENCODE_USAGE_TRANSCODING_BIT_KHR = 0x00000001,
VK_VIDEO_ENCODE_USAGE_STREAMING_BIT_KHR = 0x00000002,
VK_VIDEO_ENCODE_USAGE_RECORDING_BIT_KHR = 0x00000004,
VK_VIDEO_ENCODE_USAGE_CONFERENCING_BIT_KHR = 0x00000008,
} VkVideoEncodeUsageFlagBitsKHR;
- VK_VIDEO_ENCODE_USAGE_TRANSCODING_BIT_KHR specifies that video encoding is intended to be used in conjunction with video decoding to transcode a video bitstream with the same and/or different codecs.
- VK_VIDEO_ENCODE_USAGE_STREAMING_BIT_KHR specifies that video encoding is intended to be used to produce a video bitstream that is expected to be sent as a continuous flow over network.
- VK_VIDEO_ENCODE_USAGE_RECORDING_BIT_KHR specifies that video encoding is intended to be used for real-time recording for offline consumption.
- VK_VIDEO_ENCODE_USAGE_CONFERENCING_BIT_KHR specifies that video encoding is intended to be used in a video conferencing scenario.
Note
There are no restrictions on the combination of bits that can be specified by the application. However, applications should use reasonable combinations in order for the implementation to be able to select the most appropriate mode of operation for the particular use case.
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeUsageFlagsKHR;
VkVideoEncodeUsageFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeUsageFlagBitsKHR.
The following bits can be specified in VkVideoEncodeUsageInfoKHR::videoContentHints as a hint about the encoded video content:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeContentFlagBitsKHR {
VK_VIDEO_ENCODE_CONTENT_DEFAULT_KHR = 0,
VK_VIDEO_ENCODE_CONTENT_CAMERA_BIT_KHR = 0x00000001,
VK_VIDEO_ENCODE_CONTENT_DESKTOP_BIT_KHR = 0x00000002,
VK_VIDEO_ENCODE_CONTENT_RENDERED_BIT_KHR = 0x00000004,
} VkVideoEncodeContentFlagBitsKHR;
- VK_VIDEO_ENCODE_CONTENT_CAMERA_BIT_KHR specifies that video encoding is intended to be used to encode camera content.
- VK_VIDEO_ENCODE_CONTENT_DESKTOP_BIT_KHR specifies that video encoding is intended to be used to encode desktop content.
- VK_VIDEO_ENCODE_CONTENT_RENDERED_BIT_KHR specifies that video encoding is intended to be used to encode rendered (e.g. game) content.
Note
There are no restrictions on the combination of bits that can be specified by the application. However, applications should use reasonable combinations in order for the implementation to be able to select the most appropriate mode of operation for the particular content type.
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeContentFlagsKHR;
VkVideoEncodeContentFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeContentFlagBitsKHR.
Possible video encode tuning mode values are as follows:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeTuningModeKHR {
VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR = 0,
VK_VIDEO_ENCODE_TUNING_MODE_HIGH_QUALITY_KHR = 1,
VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR = 2,
VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR = 3,
VK_VIDEO_ENCODE_TUNING_MODE_LOSSLESS_KHR = 4,
} VkVideoEncodeTuningModeKHR;
- VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR specifies the default tuning mode.
- VK_VIDEO_ENCODE_TUNING_MODE_HIGH_QUALITY_KHR specifies that video encoding is tuned for high quality. When using this tuning mode, the implementation may compromise the latency of video encoding operations to improve quality.
- VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR specifies that video encoding is tuned for low latency. When using this tuning mode, the implementation may compromise quality to increase the performance and lower the latency of video encode operations.
- VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR specifies that video encoding is tuned for ultra-low latency. When using this tuning mode, the implementation may compromise quality to maximize the performance and minimize the latency of video encoding operations.
- VK_VIDEO_ENCODE_TUNING_MODE_LOSSLESS_KHR specifies that video encoding is tuned for lossless encoding. When using this tuning mode, video encode operations produce lossless output.
The VkVideoProfileListInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoProfileListInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t profileCount;
const VkVideoProfileInfoKHR* pProfiles;
} VkVideoProfileListInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- profileCount is the number of elements in the pProfiles array.
- pProfiles is a pointer to an array of VkVideoProfileInfoKHR structures.
Note:
Video transcoding is an example of a use case that necessitates the specification of multiple profiles in various contexts.
When the application provides a video decode profile and one or more video encode profiles in the profile list, the implementation ensures that any capabilities returned or resources created are suitable for the video transcoding use cases without the need for manual data transformations.
42.4. Video Capabilities
42.4.1. Video Coding Capabilities
To query video coding capabilities for a specific video profile, call:
// Provided by VK_KHR_video_queue
VkResult vkGetPhysicalDeviceVideoCapabilitiesKHR(
VkPhysicalDevice physicalDevice,
const VkVideoProfileInfoKHR* pVideoProfile,
VkVideoCapabilitiesKHR* pCapabilities);
- physicalDevice is the physical device from which to query the video decode or encode capabilities.
- pVideoProfile is a pointer to a VkVideoProfileInfoKHR structure.
- pCapabilities is a pointer to a VkVideoCapabilitiesKHR structure in which the capabilities are returned.
If the video profile described by pVideoProfile is supported by the implementation, then this command returns VK_SUCCESS and pCapabilities is filled with the capabilities supported with the specified video profile. Otherwise, one of the video-profile-specific error codes is returned.
The VkVideoCapabilitiesKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoCapabilitiesKHR {
VkStructureType sType;
void* pNext;
VkVideoCapabilityFlagsKHR flags;
VkDeviceSize minBitstreamBufferOffsetAlignment;
VkDeviceSize minBitstreamBufferSizeAlignment;
VkExtent2D pictureAccessGranularity;
VkExtent2D minCodedExtent;
VkExtent2D maxCodedExtent;
uint32_t maxDpbSlots;
uint32_t maxActiveReferencePictures;
VkExtensionProperties stdHeaderVersion;
} VkVideoCapabilitiesKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- flags is a bitmask of VkVideoCapabilityFlagBitsKHR specifying capability flags.
- minBitstreamBufferOffsetAlignment is the minimum alignment for bitstream buffer offsets.
- minBitstreamBufferSizeAlignment is the minimum alignment for bitstream buffer range sizes.
- pictureAccessGranularity is the granularity at which image access to video picture resources happens.
- minCodedExtent is the minimum width and height of the coded frames.
- maxCodedExtent is the maximum width and height of the coded frames.
- maxDpbSlots is the maximum number of DPB slots supported by a single video session.
- maxActiveReferencePictures is the maximum number of active reference pictures a single video coding operation can use.
- stdHeaderVersion is a VkExtensionProperties structure reporting the Video Std header name and version supported for the video profile.
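An application would typically compare its own parameters against these limits before creating resources. The following C sketch shows two such checks; VideoCaps is a hypothetical subset of VkVideoCapabilitiesKHR using plain integer types, and both function names are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the shape of VkExtent2D. */
typedef struct { uint32_t width, height; } Extent2D;

/* Hypothetical subset of VkVideoCapabilitiesKHR. */
typedef struct {
    uint64_t minBitstreamBufferOffsetAlignment;
    uint64_t minBitstreamBufferSizeAlignment;
    Extent2D minCodedExtent;
    Extent2D maxCodedExtent;
} VideoCaps;

/* True if the coded extent lies within [minCodedExtent, maxCodedExtent]. */
bool coded_extent_supported(const VideoCaps *caps, Extent2D e)
{
    assert(caps != NULL);
    return e.width  >= caps->minCodedExtent.width
        && e.width  <= caps->maxCodedExtent.width
        && e.height >= caps->minCodedExtent.height
        && e.height <= caps->maxCodedExtent.height;
}

/* True if a bitstream buffer range honors both minimum alignments. */
bool bitstream_range_aligned(const VideoCaps *caps, uint64_t offset,
                             uint64_t size)
{
    assert(caps != NULL);
    assert(caps->minBitstreamBufferOffsetAlignment != 0);
    assert(caps->minBitstreamBufferSizeAlignment != 0);
    return offset % caps->minBitstreamBufferOffsetAlignment == 0
        && size % caps->minBitstreamBufferSizeAlignment == 0;
}
```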
Note:
It is common for video compression standards to allow using all reference
pictures associated with active DPB slots as active reference pictures,
hence for video decode profiles the values returned in
Bits which can be set in VkVideoCapabilitiesKHR::flags are:
// Provided by VK_KHR_video_queue
typedef enum VkVideoCapabilityFlagBitsKHR {
VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR = 0x00000001,
VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR = 0x00000002,
} VkVideoCapabilityFlagBitsKHR;
- VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR specifies that video sessions support producing and consuming protected content.
- VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR specifies that the video picture resources associated with the DPB slots of a video session can be backed by separate VkImage objects. If this capability flag is not present, then all DPB slots of a video session must be associated with video picture resources backed by the same VkImage object (e.g. using different layers of the same image).
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCapabilityFlagsKHR;
VkVideoCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCapabilityFlagBitsKHR.
42.4.2. Video Format Capabilities
To enumerate the supported output, input and DPB image formats and corresponding capabilities for a specific video profile, call:
// Provided by VK_KHR_video_queue
VkResult vkGetPhysicalDeviceVideoFormatPropertiesKHR(
VkPhysicalDevice physicalDevice,
const VkPhysicalDeviceVideoFormatInfoKHR* pVideoFormatInfo,
uint32_t* pVideoFormatPropertyCount,
VkVideoFormatPropertiesKHR* pVideoFormatProperties);
- physicalDevice is the physical device from which to query the video format properties.
- pVideoFormatInfo is a pointer to a VkPhysicalDeviceVideoFormatInfoKHR structure specifying the usage and video profiles for which supported image formats and capabilities are returned.
- pVideoFormatPropertyCount is a pointer to an integer related to the number of video format properties available or queried, as described below.
- pVideoFormatProperties is a pointer to an array of VkVideoFormatPropertiesKHR structures in which supported image formats and capabilities are returned.
If pVideoFormatProperties is NULL, then the number of video format properties supported for the given physicalDevice is returned in pVideoFormatPropertyCount. Otherwise, pVideoFormatPropertyCount must point to a variable set by the user to the number of elements in the pVideoFormatProperties array, and on return the variable is overwritten with the number of values actually written to pVideoFormatProperties. If the value of pVideoFormatPropertyCount is less than the number of video format properties supported, at most pVideoFormatPropertyCount values will be written to pVideoFormatProperties, and VK_INCOMPLETE will be returned instead of VK_SUCCESS, to indicate that not all the available values were returned.
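The count-then-fill behavior described above follows the usual two-call enumeration idiom. The C sketch below demonstrates it against a mock: mock_get_format_properties is a hypothetical function that only imitates the described semantics with three made-up entries, FormatProps is a stand-in for VkVideoFormatPropertiesKHR, and the result constants are assumed to mirror VK_SUCCESS and VK_INCOMPLETE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed to mirror VK_SUCCESS and VK_INCOMPLETE. */
enum { RESULT_SUCCESS = 0, RESULT_INCOMPLETE = 5 };

/* Hypothetical stand-in for VkVideoFormatPropertiesKHR. */
typedef struct { int format; } FormatProps;

/* Mock that imitates the described count/fill semantics. */
int mock_get_format_properties(uint32_t *pCount, FormatProps *pProps)
{
    static const FormatProps available[] = { {1}, {2}, {3} };
    const uint32_t total = 3;

    if (pProps == NULL) {          /* first call: report the count only */
        *pCount = total;
        return RESULT_SUCCESS;
    }
    uint32_t n = (*pCount < total) ? *pCount : total;
    for (uint32_t i = 0; i < n; ++i) {
        pProps[i] = available[i];
    }
    int result = (*pCount < total) ? RESULT_INCOMPLETE : RESULT_SUCCESS;
    *pCount = n;                   /* number of entries actually written */
    return result;
}

/* Caller side: query the count, allocate, then fetch the entries.
 * Returns a malloc'd array (caller frees) or NULL on failure. */
FormatProps *enumerate_formats(uint32_t *outCount)
{
    uint32_t count = 0;
    *outCount = 0;
    if (mock_get_format_properties(&count, NULL) != RESULT_SUCCESS || count == 0) {
        return NULL;
    }
    FormatProps *props = malloc(count * sizeof *props);
    if (props == NULL) {
        return NULL;
    }
    if (mock_get_format_properties(&count, props) != RESULT_SUCCESS) {
        free(props);
        return NULL;
    }
    *outCount = count;
    return props;
}
```

This sketch treats an incomplete result on the second call as a failure for simplicity; an application could instead retry with a larger array.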
Video format properties are always queried with respect to a specific set of video profiles. These are specified by chaining the VkVideoProfileListInfoKHR structure to pVideoFormatInfo.
For most use cases, the images are used by a single video session and a single video profile is provided. For a use case such as video transcoding, where a decode session output image can be used as encode input in one or more encode sessions, multiple video profiles corresponding to the video sessions that will share the image must be provided.
If any of the video profiles specified via VkVideoProfileListInfoKHR::pProfiles is not supported, then this command returns one of the video-profile-specific error codes. Furthermore, if VkPhysicalDeviceVideoFormatInfoKHR::imageUsage includes any image usage flags not supported by the specified video profiles, then this command returns VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR.
This command also returns VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR if VkPhysicalDeviceVideoFormatInfoKHR::imageUsage does not include the appropriate flags as dictated by the decode capability flags returned in VkVideoDecodeCapabilitiesKHR::flags for any of the profiles specified in the VkVideoProfileListInfoKHR structure provided in the pNext chain of pVideoFormatInfo.
If the decode capability flags include
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR
but not
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR
, then in
order to query video format properties for decode DPB and output usage,
VkPhysicalDeviceVideoFormatInfoKHR::imageUsage
must include
both VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
and
VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
.
Otherwise, the call will fail with
VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR
.
If the decode capability flags include
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR
but not
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR
, then in
order to query video format properties for decode DPB usage,
VkPhysicalDeviceVideoFormatInfoKHR::imageUsage
must include
VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
, but not
VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
.
Otherwise, the call will fail with
VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR
.
Similarly, to query video format properties for decode output usage,
VkPhysicalDeviceVideoFormatInfoKHR::imageUsage
must include
VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
, but not
VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
.
Otherwise, the call will fail with
VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR
.
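The two decode rules above can be condensed into a small predicate. The following is a minimal sketch rather than part of the API: the SKETCH_* constants and the function name are hypothetical stand-ins for the corresponding Vulkan capability and usage bits (their values are arbitrary), so that the snippet builds without the Vulkan headers. It assumes the query is for decode DPB and/or output usage, and it only covers the cases where exactly one of the two capability bits is reported.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the decode capability bits; values are arbitrary. */
enum {
    SKETCH_CAP_DPB_AND_OUTPUT_COINCIDE = 1u << 0,
    SKETCH_CAP_DPB_AND_OUTPUT_DISTINCT = 1u << 1
};

/* Hypothetical stand-ins for the decode image usage bits; values are arbitrary. */
enum {
    SKETCH_USAGE_DECODE_DST = 1u << 0,
    SKETCH_USAGE_DECODE_DPB = 1u << 1
};

/* Returns true if `usage` is acceptable for a decode format query under the
 * rules above. A false result models VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR. */
bool sketch_decode_usage_is_queryable(uint32_t caps, uint32_t usage)
{
    const uint32_t both = SKETCH_USAGE_DECODE_DST | SKETCH_USAGE_DECODE_DPB;
    const bool coincide = (caps & SKETCH_CAP_DPB_AND_OUTPUT_COINCIDE) != 0;
    const bool distinct = (caps & SKETCH_CAP_DPB_AND_OUTPUT_DISTINCT) != 0;
    const uint32_t decode_bits = usage & both;

    if (coincide && !distinct) {
        /* DPB and output share one image: both usage bits are required. */
        return decode_bits == both;
    }
    if (distinct && !coincide) {
        /* Separate images: each query carries exactly one of the two bits. */
        return decode_bits == SKETCH_USAGE_DECODE_DPB ||
               decode_bits == SKETCH_USAGE_DECODE_DST;
    }
    /* Both or neither capability bit set: not covered by the rules above,
     * so this sketch does not reject anything. */
    return true;
}
```

An application could run such a check before calling the query command to avoid a round trip that is known to fail.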
The imageUsage
member of the VkPhysicalDeviceVideoFormatInfoKHR
structure specifies the expected video usage flags that the returned video
formats must support.
Correspondingly, the imageUsageFlags
member of each
VkVideoFormatPropertiesKHR structure returned will contain at least
the same set of image usage flags.
If the implementation supports using video input, output, or DPB images of a
particular format in operations other than video decode/encode then the
imageUsageFlags
member of the corresponding
VkVideoFormatPropertiesKHR structure returned will include additional
image usage flags indicating that.
Note:
For most use cases, only decode or encode related usage flags are going to be specified. For a use case such as transcode, if the image were to be shared between decode and encode session(s), then both decode and encode related usage flags can be set.
Multiple VkVideoFormatPropertiesKHR
entries may be returned with the
same format
member with different componentMapping
,
imageType
, or imageTiling
values, as described later.
In addition, a different set of VkVideoFormatPropertiesKHR
entries
may be returned depending on the imageUsage
member of the
VkPhysicalDeviceVideoFormatInfoKHR
structure, even for the same set of
video profiles, for example, based on whether encode input, encode DPB,
decode output, and/or decode DPB usage is requested.
The application can select the parameters returned in the
VkVideoFormatPropertiesKHR
entries and use compatible parameters when
creating the input, output, and DPB images.
The implementation will report all image creation and usage flags that are
valid for images used with the requested video profiles but applications
should create images only with those that are necessary for the particular
use case.
Before creating an image, the application can obtain the complete set of
supported image format features by calling
vkGetPhysicalDeviceImageFormatProperties2 using parameters derived
from the members of one of the reported VkVideoFormatPropertiesKHR
entries and adding the same VkVideoProfileListInfoKHR structure to the
pNext
chain of VkPhysicalDeviceImageFormatInfo2.
The componentMapping
member of VkVideoFormatPropertiesKHR
defines the ordering of the Y′CBCR color channels from the perspective of
the video codec operations specified in VkVideoProfileListInfoKHR.
For example, if the implementation produces video decode output with the
format VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
where the blue and red
chrominance channels are swapped then the componentMapping
member of
the corresponding VkVideoFormatPropertiesKHR
structure will have the
following member values:
components.r = VK_COMPONENT_SWIZZLE_B; // Cb component
components.g = VK_COMPONENT_SWIZZLE_IDENTITY; // Y component
components.b = VK_COMPONENT_SWIZZLE_R; // Cr component
components.a = VK_COMPONENT_SWIZZLE_IDENTITY; // unused, defaults to 1.0
The VkPhysicalDeviceVideoFormatInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkPhysicalDeviceVideoFormatInfoKHR {
VkStructureType sType;
const void* pNext;
VkImageUsageFlags imageUsage;
} VkPhysicalDeviceVideoFormatInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
imageUsage is a bitmask of VkImageUsageFlagBits specifying the intended usage of the video images.
The VkVideoFormatPropertiesKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoFormatPropertiesKHR {
VkStructureType sType;
void* pNext;
VkFormat format;
VkComponentMapping componentMapping;
VkImageCreateFlags imageCreateFlags;
VkImageType imageType;
VkImageTiling imageTiling;
VkImageUsageFlags imageUsageFlags;
} VkVideoFormatPropertiesKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
format is a VkFormat that specifies the format that can be used with the specified video profiles and image usages.
-
componentMapping defines the color channel order used for the format. Together, format and componentMapping describe how the color channels are ordered when producing video decoder output or are expected to be ordered in video encoder input, when applicable. If the format reported does not require component swizzling then all members of componentMapping will be set to VK_COMPONENT_SWIZZLE_IDENTITY.
-
imageCreateFlags is a bitmask of VkImageCreateFlagBits specifying the supported image creation flags for the format.
-
imageType is a VkImageType that specifies the image type the format can be used with.
-
imageTiling is a VkImageTiling that specifies the image tiling the format can be used with.
-
imageUsageFlags is a bitmask of VkImageUsageFlagBits specifying the supported image usage flags for the format.
42.5. Video Sessions
Video sessions are objects that represent and maintain the state needed to perform video decode or encode operations using a specific video profile.
Video sessions are represented by VkVideoSessionKHR
handles:
// Provided by VK_KHR_video_queue
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkVideoSessionKHR)
42.5.1. Creating a Video Session
To create a video session object, call:
// Provided by VK_KHR_video_queue
VkResult vkCreateVideoSessionKHR(
VkDevice device,
const VkVideoSessionCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkVideoSessionKHR* pVideoSession);
-
device is the logical device that creates the video session.
-
pCreateInfo is a pointer to a VkVideoSessionCreateInfoKHR structure containing parameters to be used to create the video session.
-
pAllocator controls host memory allocation as described in the Memory Allocation chapter.
-
pVideoSession is a pointer to a VkVideoSessionKHR handle in which the resulting video session object is returned.
The resulting video session object is said to be created with the video
codec operation specified in
pCreateInfo->pVideoProfile->videoCodecOperation
.
The name and version of the codec-specific Video Std header to be used with
the video session is specified by the VkExtensionProperties structure
pointed to by pCreateInfo->pStdHeaderVersion
.
If a non-existent or unsupported Video Std header version is specified in
pCreateInfo->pStdHeaderVersion->specVersion
, then this command returns
VK_ERROR_VIDEO_STD_VERSION_NOT_SUPPORTED_KHR
.
Video session objects are created in the uninitialized state.
In order to transition the video session into the initial state, the application must issue a vkCmdControlVideoCodingKHR command with VkVideoCodingControlInfoKHR::flags including VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR.
Video session objects also maintain the
state of the DPB.
The number of DPB slots usable with the created video session is specified
in pCreateInfo->maxDpbSlots
, and each slot is initially in the
inactive state.
Each DPB slot maintained by the created video session can refer to a reference picture representing a video frame.
In addition, if the videoCodecOperation
member of the
VkVideoProfileInfoKHR structure pointed to by
pCreateInfo->pVideoProfile
is
VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR
and the
pictureLayout
member of the VkVideoDecodeH264ProfileInfoKHR
structure provided in the VkVideoProfileInfoKHR::pNext
chain is
not VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR
, then the
created video session supports interlaced frames and each DPB
slot maintained by the created video session can instead refer to
separate top field and bottom field reference pictures
that together can represent a full video frame.
In this case, it is up to the application, driven by the video content,
whether it associates any individual DPB slot with separate top and/or
bottom field pictures or a single picture representing a full frame.
The created video session can be used to perform video coding operations
using video frames up to the maximum size specified in
pCreateInfo->maxCodedExtent
.
The minimum frame size allowed is implicitly derived from
VkVideoCapabilitiesKHR::minCodedExtent
, as returned by
vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile
specified by pCreateInfo->pVideoProfile
.
Accordingly, the created video session is said to be created with a
minCodedExtent
equal to that.
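The resulting size constraint can be expressed as a simple range check. The following is a minimal sketch in which SketchExtent2D is a hypothetical stand-in for VkExtent2D, so that the snippet builds without the Vulkan headers. It assumes the limits apply independently to width and height, which this section does not state explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VkExtent2D. */
typedef struct SketchExtent2D {
    uint32_t width;
    uint32_t height;
} SketchExtent2D;

/* Returns true if `coded` lies within [min_extent, max_extent] in both
 * dimensions, where min_extent models the session's minCodedExtent and
 * max_extent models pCreateInfo->maxCodedExtent. */
bool sketch_coded_extent_in_range(SketchExtent2D coded,
                                  SketchExtent2D min_extent,
                                  SketchExtent2D max_extent)
{
    return coded.width  >= min_extent.width  &&
           coded.height >= min_extent.height &&
           coded.width  <= max_extent.width  &&
           coded.height <= max_extent.height;
}
```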
The VkVideoSessionCreateInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionCreateInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t queueFamilyIndex;
VkVideoSessionCreateFlagsKHR flags;
const VkVideoProfileInfoKHR* pVideoProfile;
VkFormat pictureFormat;
VkExtent2D maxCodedExtent;
VkFormat referencePictureFormat;
uint32_t maxDpbSlots;
uint32_t maxActiveReferencePictures;
const VkExtensionProperties* pStdHeaderVersion;
} VkVideoSessionCreateInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
queueFamilyIndex is the index of the queue family the created video session will be used with.
-
flags is a bitmask of VkVideoSessionCreateFlagBitsKHR specifying creation flags.
-
pVideoProfile is a pointer to a VkVideoProfileInfoKHR structure specifying the video profile the created video session will be used with.
-
pictureFormat is the image format the created video session will be used with. If pVideoProfile->videoCodecOperation specifies a decode operation, then pictureFormat is the image format of decode output pictures usable with the created video session. If pVideoProfile->videoCodecOperation specifies an encode operation, then pictureFormat is the image format of encode input pictures usable with the created video session.
-
maxCodedExtent is the maximum width and height of the coded frames the created video session will be used with.
-
referencePictureFormat is the image format of reference pictures stored in the DPB the created video session will be used with.
-
maxDpbSlots is the maximum number of DPB slots that can be used with the created video session.
-
maxActiveReferencePictures is the maximum number of active reference pictures that can be used in a single video coding operation using the created video session.
-
pStdHeaderVersion is a pointer to a VkExtensionProperties structure requesting the Video Std header version to use for the videoCodecOperation specified in pVideoProfile.
Bits which can be set in VkVideoSessionCreateInfoKHR::flags
are:
// Provided by VK_KHR_video_queue
typedef enum VkVideoSessionCreateFlagBitsKHR {
VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR = 0x00000001,
} VkVideoSessionCreateFlagBitsKHR;
-
VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR specifies that the video session uses protected video content.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoSessionCreateFlagsKHR;
VkVideoSessionCreateFlagsKHR
is a bitmask type for setting a mask of
zero or more VkVideoSessionCreateFlagBitsKHR.
42.5.2. Destroying a Video Session
To destroy a video session, call:
// Provided by VK_KHR_video_queue
void vkDestroyVideoSessionKHR(
VkDevice device,
VkVideoSessionKHR videoSession,
const VkAllocationCallbacks* pAllocator);
-
device is the logical device that destroys the video session.
-
videoSession is the video session to destroy.
-
pAllocator controls host memory allocation as described in the Memory Allocation chapter.
42.5.3. Video Session Memory Association
After creating a video session object, and before the object can be used to record video coding operations into command buffers, the application must allocate and bind device memory to the video session. Device memory is allocated separately (see Device Memory) and then associated with the video session.
Video sessions may have multiple memory bindings identified by unique unsigned integer values. Appropriate device memory must be bound to each such memory binding before the video session is used to record commands into command buffers.
To determine the memory requirements for a video session object, call:
// Provided by VK_KHR_video_queue
VkResult vkGetVideoSessionMemoryRequirementsKHR(
VkDevice device,
VkVideoSessionKHR videoSession,
uint32_t* pMemoryRequirementsCount,
VkVideoSessionMemoryRequirementsKHR* pMemoryRequirements);
-
device is the logical device that owns the video session.
-
videoSession is the video session to query.
-
pMemoryRequirementsCount is a pointer to an integer related to the number of memory binding requirements available or queried, as described below.
-
pMemoryRequirements is NULL or a pointer to an array of VkVideoSessionMemoryRequirementsKHR structures in which the memory binding requirements of the video session are returned.
If pMemoryRequirements
is NULL
, then the number of memory bindings
required for the video session is returned in
pMemoryRequirementsCount
.
Otherwise, pMemoryRequirementsCount must point to a variable set by the application to the number of elements in the pMemoryRequirements array, and on return the variable is overwritten with the number of memory binding requirements actually written to pMemoryRequirements.
If the value referenced by pMemoryRequirementsCount is less than the number of memory bindings required for the video session, then at most that many elements will be written to pMemoryRequirements, and VK_INCOMPLETE will be returned, instead of VK_SUCCESS, to indicate that not all memory binding requirements were returned.
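The count and array contract described above can be modeled with a small stand-in function. The following is a minimal sketch, not the real entry point: SketchResult, SketchMemoryRequirements, and sketch_get_memory_requirements are hypothetical names chosen so that the snippet builds without the Vulkan headers, and the `available` array plays the role of the implementation's internal list of bindings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkResult and VkVideoSessionMemoryRequirementsKHR. */
typedef enum { SKETCH_SUCCESS = 0, SKETCH_INCOMPLETE = 1 } SketchResult;

typedef struct SketchMemoryRequirements {
    uint32_t memoryBindIndex;
    uint64_t size;
    uint64_t alignment;
} SketchMemoryRequirements;

/* Models the contract of the query for a session needing `available_count`
 * memory bindings. */
SketchResult sketch_get_memory_requirements(
    const SketchMemoryRequirements *available, uint32_t available_count,
    uint32_t *pCount, SketchMemoryRequirements *pRequirements)
{
    if (pRequirements == NULL) {
        /* First call: only report how many bindings exist. */
        *pCount = available_count;
        return SKETCH_SUCCESS;
    }
    /* Second call: write at most *pCount elements. */
    uint32_t n = (*pCount < available_count) ? *pCount : available_count;
    for (uint32_t i = 0; i < n; ++i) {
        pRequirements[i] = available[i];
    }
    *pCount = n; /* number of elements actually written */
    return (n < available_count) ? SKETCH_INCOMPLETE : SKETCH_SUCCESS;
}
```

A caller would typically invoke the query once with a NULL array to obtain the count, allocate an array of that size, and then call it again to fill the array.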
The VkVideoSessionMemoryRequirementsKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionMemoryRequirementsKHR {
VkStructureType sType;
void* pNext;
uint32_t memoryBindIndex;
VkMemoryRequirements memoryRequirements;
} VkVideoSessionMemoryRequirementsKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
memoryBindIndex is the index of the memory binding.
-
memoryRequirements is a VkMemoryRequirements structure in which the requested memory binding requirements for the binding index specified by memoryBindIndex are returned.
To attach memory to a video session object, call:
// Provided by VK_KHR_video_queue
VkResult vkBindVideoSessionMemoryKHR(
VkDevice device,
VkVideoSessionKHR videoSession,
uint32_t bindSessionMemoryInfoCount,
const VkBindVideoSessionMemoryInfoKHR* pBindSessionMemoryInfos);
-
device is the logical device that owns the video session.
-
videoSession is the video session to be bound with device memory.
-
bindSessionMemoryInfoCount is the number of elements in pBindSessionMemoryInfos.
-
pBindSessionMemoryInfos is a pointer to an array of bindSessionMemoryInfoCount VkBindVideoSessionMemoryInfoKHR structures specifying memory regions to be bound to specific memory bindings of the video session.
The valid usage statements below refer to the VkMemoryRequirements
structure corresponding to a specific element of
pBindSessionMemoryInfos
, which is defined as follows:
-
If the memoryBindIndex member of the element of pBindSessionMemoryInfos in question matches the memoryBindIndex member of one of the elements returned in pMemoryRequirements when vkGetVideoSessionMemoryRequirementsKHR is called with the same videoSession and with pMemoryRequirementsCount equal to bindSessionMemoryInfoCount, then the memoryRequirements member of that element of pMemoryRequirements is the VkMemoryRequirements structure corresponding to the element of pBindSessionMemoryInfos in question.
-
Otherwise the element of pBindSessionMemoryInfos in question is said to not have a corresponding VkMemoryRequirements structure.
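The correspondence defined above amounts to a lookup keyed on memoryBindIndex. The following is a minimal sketch, using a hypothetical SketchRequirement type in place of VkVideoSessionMemoryRequirementsKHR so that it builds without the Vulkan headers; the function name is likewise an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one returned memory requirement entry. */
typedef struct SketchRequirement {
    uint32_t memoryBindIndex;
    uint64_t size;
} SketchRequirement;

/* Returns the entry whose memoryBindIndex matches the one in the bind info,
 * or NULL if the bind info has no corresponding requirement. */
const SketchRequirement *sketch_find_requirement(const SketchRequirement *reqs,
                                                 uint32_t count,
                                                 uint32_t memoryBindIndex)
{
    for (uint32_t i = 0; i < count; ++i) {
        if (reqs[i].memoryBindIndex == memoryBindIndex) {
            return &reqs[i];
        }
    }
    return NULL;
}
```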
The VkBindVideoSessionMemoryInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkBindVideoSessionMemoryInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t memoryBindIndex;
VkDeviceMemory memory;
VkDeviceSize memoryOffset;
VkDeviceSize memorySize;
} VkBindVideoSessionMemoryInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
memoryBindIndex is the memory binding index to bind memory to.
-
memory is the allocated device memory to be bound to the video session’s memory binding with index memoryBindIndex.
-
memoryOffset is the start offset of the region of memory which is to be bound.
-
memorySize is the size in bytes of the region of memory, starting from memoryOffset bytes, to be bound.
42.6. Video Profile Compatibility
Resources and query pools used with a particular video session must be compatible with the video profile the video session was created with.
A VkBuffer is compatible with a video profile if it was created with the VkBufferCreateInfo::pNext chain including a VkVideoProfileListInfoKHR structure with its pProfiles member containing an element matching the VkVideoProfileInfoKHR structure chain describing the video profile, and VkBufferCreateInfo::usage including at least one of the following bits specific to video coding usage:
-
VK_BUFFER_USAGE_VIDEO_DECODE_SRC_BIT_KHR
-
VK_BUFFER_USAGE_VIDEO_DECODE_DST_BIT_KHR
-
VK_BUFFER_USAGE_VIDEO_ENCODE_SRC_BIT_KHR
-
VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR
A VkImage is compatible with a video profile if it was created with the VkImageCreateInfo::pNext chain including a VkVideoProfileListInfoKHR structure with its pProfiles member containing an element matching the VkVideoProfileInfoKHR structure chain describing the video profile, and VkImageCreateInfo::usage including at least one of the following bits specific to video coding usage:
-
VK_IMAGE_USAGE_VIDEO_DECODE_SRC_BIT_KHR
-
VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR
-
VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR
-
VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR
-
VK_IMAGE_USAGE_VIDEO_ENCODE_DST_BIT_KHR
-
VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR
A VkImageView is compatible with a video profile if the VkImage it was created from is also compatible with that video profile.
A VkQueryPool is compatible with a video profile if it was created
with the VkQueryPoolCreateInfo::pNext
chain including a
VkVideoProfileInfoKHR structure chain describing the same video
profile, and VkQueryPoolCreateInfo::queryType
having one of the
following values:
-
VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR
-
VK_QUERY_TYPE_VIDEO_ENCODE_BITSTREAM_BUFFER_RANGE_KHR
42.7. Video Session Parameters
Video session parameters objects can store preprocessed codec-specific parameters used with a compatible video session, and enable reducing the number of parameters needed to be provided and processed by the implementation while recording video coding operations into command buffers.
Parameters stored in such objects are immutable to facilitate the concurrent use of the stored parameters in multiple threads. At the same time, new parameters can be added to existing objects using the vkUpdateVideoSessionParametersKHR command.
In order to support concurrent use of the stored immutable parameters while
also allowing the video session parameters object to be extended with new
parameters, each video session parameters object maintains an update
sequence counter that is set to 0
at object creation time and must be
incremented by each subsequent update operation.
Certain video sequences that adhere to particular video compression standards permit updating previously supplied parameters. If a parameter update is necessary, the application has the following options:
-
Cache the set of parameters on the application side and create a new video session parameters object adding all the parameters with appropriate changes, as necessary; or
-
Create a new video session parameters object providing only the updated parameters and the previously used object as the template, which ensures that parameters not specified at creation time will be copied unmodified from the template object.
The types of parameters that can be stored, the capacity for each parameter type, and the methods of initializing, updating, and referring to individual parameters are specific to the video codec operation the video session parameters object was created with.
-
For VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR these are defined in the H.264 Decode Parameter Sets section.
-
For VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR these are defined in the H.265 Decode Parameter Sets section.
Video session parameters are represented by
VkVideoSessionParametersKHR
handles:
// Provided by VK_KHR_video_queue
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkVideoSessionParametersKHR)
42.7.1. Creating Video Session Parameters
To create a video session parameters object, call:
// Provided by VK_KHR_video_queue
VkResult vkCreateVideoSessionParametersKHR(
VkDevice device,
const VkVideoSessionParametersCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkVideoSessionParametersKHR* pVideoSessionParameters);
-
device is the logical device that creates the video session parameters object.
-
pCreateInfo is a pointer to a VkVideoSessionParametersCreateInfoKHR structure containing parameters to be used to create the video session parameters object.
-
pAllocator controls host memory allocation as described in the Memory Allocation chapter.
-
pVideoSessionParameters is a pointer to a VkVideoSessionParametersKHR handle in which the resulting video session parameters object is returned.
The resulting video session parameters object is said to be created with the
video codec operation pCreateInfo->videoSession
was created with.
If pCreateInfo->videoSessionParametersTemplate
is not
VK_NULL_HANDLE
, then it will be used as a template for constructing
the new video session parameters object.
This happens by first adding any parameters according to the additional
creation parameters provided in the pCreateInfo->pNext
chain, followed
by adding any parameters from the template object that have a key that does
not match the key of any of the already added parameters.
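The construction order described above (explicitly provided entries first, then template entries with keys not yet present) can be illustrated as follows. This is a minimal sketch under stated assumptions: SketchParam is a hypothetical keyed entry standing in for a codec-specific parameter set (for an H.264 SPS the key would be seq_parameter_set_id), keys are assumed to be unique within each input list, and the caller is assumed to provide an output array with room for add_count + tmpl_count entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical keyed parameter entry; `payload` stands for the stored data. */
typedef struct SketchParam {
    uint32_t key;
    int payload;
} SketchParam;

/* Returns true if `set` already holds an entry with the given key. */
static bool sketch_contains(const SketchParam *set, size_t count, uint32_t key)
{
    for (size_t i = 0; i < count; ++i) {
        if (set[i].key == key) {
            return true;
        }
    }
    return false;
}

/* Fills `out` with the explicitly added entries, followed by every template
 * entry whose key is not already present. Returns the number of entries
 * written to `out`. */
size_t sketch_build_from_template(const SketchParam *add, size_t add_count,
                                  const SketchParam *tmpl, size_t tmpl_count,
                                  SketchParam *out)
{
    size_t n = 0;
    /* Step 1: entries provided at creation time are added first. */
    for (size_t i = 0; i < add_count; ++i) {
        out[n++] = add[i];
    }
    /* Step 2: template entries are copied only if their key is unused. */
    for (size_t i = 0; i < tmpl_count; ++i) {
        if (!sketch_contains(out, n, tmpl[i].key)) {
            out[n++] = tmpl[i];
        }
    }
    return n;
}
```

With an added entry for key 1 and a template holding keys 0 and 1, the result holds the added entry for key 1 and the template entry for key 0; the template's entry for key 1 is not copied.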
If pCreateInfo->videoSession
was created with the video codec
operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR
, then the
created video session parameters object will initially contain the following
sets of parameter entries:
-
StdVideoH264SequenceParameterSet structures representing H.264 SPS entries, as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH264SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added first;
  -
  If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264SequenceParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same seq_parameter_set_id.
-
StdVideoH264PictureParameterSet structures representing H.264 PPS entries, as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH264PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added first;
  -
  If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264PictureParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same seq_parameter_set_id and pic_parameter_set_id.
If pCreateInfo->videoSession
was created with the video codec
operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
, then the
created video session parameters object will initially contain the following
sets of parameter entries:
-
StdVideoH265VideoParameterSet structures representing H.265 VPS entries, as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265VideoParameterSet entries specified in pParametersAddInfo->pStdVPSs are added first;
  -
  If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265VideoParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same vps_video_parameter_set_id.
-
StdVideoH265SequenceParameterSet structures representing H.265 SPS entries, as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added first;
  -
  If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265SequenceParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same sps_video_parameter_set_id and sps_seq_parameter_set_id.
-
StdVideoH265PictureParameterSet structures representing H.265 PPS entries, as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added first;
  -
  If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265PictureParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id.
The VkVideoSessionParametersCreateInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionParametersCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoSessionParametersCreateFlagsKHR flags;
VkVideoSessionParametersKHR videoSessionParametersTemplate;
VkVideoSessionKHR videoSession;
} VkVideoSessionParametersCreateInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is reserved for future use.
-
videoSessionParametersTemplate is VK_NULL_HANDLE or a valid handle to a VkVideoSessionParametersKHR object used as a template for constructing the new video session parameters object.
-
videoSession is the video session object against which the video session parameters object is going to be created.
Limiting values are defined below that are referenced by the relevant valid usage statements of this structure.
-
If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then let StdVideoH264SequenceParameterSet spsAddList[] be the list of H.264 SPS entries to add to the created video session parameters object, defined as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH264SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added to spsAddList;
  -
  If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264SequenceParameterSet entry stored in it with seq_parameter_set_id not matching any of the entries already in spsAddList is added to spsAddList.
-
If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then let StdVideoH264PictureParameterSet ppsAddList[] be the list of H.264 PPS entries to add to the created video session parameters object, defined as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH264PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added to ppsAddList;
  -
  If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264PictureParameterSet entry stored in it with seq_parameter_set_id or pic_parameter_set_id not matching any of the entries already in ppsAddList is added to ppsAddList.
-
If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then let StdVideoH265VideoParameterSet vpsAddList[] be the list of H.265 VPS entries to add to the created video session parameters object, defined as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265VideoParameterSet entries specified in pParametersAddInfo->pStdVPSs are added to vpsAddList;
  -
  If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265VideoParameterSet entry stored in it with vps_video_parameter_set_id not matching any of the entries already in vpsAddList is added to vpsAddList.
-
If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then let StdVideoH265SequenceParameterSet spsAddList[] be the list of H.265 SPS entries to add to the created video session parameters object, defined as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added to spsAddList;
  -
  If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265SequenceParameterSet entry stored in it with sps_video_parameter_set_id or sps_seq_parameter_set_id not matching any of the entries already in spsAddList is added to spsAddList.
-
If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then let StdVideoH265PictureParameterSet ppsAddList[] be the list of H.265 PPS entries to add to the created video session parameters object, defined as follows:
  -
  If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added to ppsAddList;
  -
  If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265PictureParameterSet entry stored in it with sps_video_parameter_set_id, pps_seq_parameter_set_id, or pps_pic_parameter_set_id not matching any of the entries already in ppsAddList is added to ppsAddList.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoSessionParametersCreateFlagsKHR;
VkVideoSessionParametersCreateFlagsKHR
is a bitmask type for setting a
mask, but is currently reserved for future use.
42.7.2. Destroying Video Session Parameters
To destroy a video session parameters object, call:
// Provided by VK_KHR_video_queue
void vkDestroyVideoSessionParametersKHR(
VkDevice device,
VkVideoSessionParametersKHR videoSessionParameters,
const VkAllocationCallbacks* pAllocator);
-
device is the logical device that destroys the video session parameters object.
-
videoSessionParameters is the video session parameters object to destroy.
-
pAllocator controls host memory allocation as described in the Memory Allocation chapter.
42.7.3. Updating Video Session Parameters
To update a video session parameters object with new parameters, call:
// Provided by VK_KHR_video_queue
VkResult vkUpdateVideoSessionParametersKHR(
VkDevice device,
VkVideoSessionParametersKHR videoSessionParameters,
const VkVideoSessionParametersUpdateInfoKHR* pUpdateInfo);
-
device is the logical device that updates the video session parameters.
-
videoSessionParameters is the video session parameters object to update.
-
pUpdateInfo is a pointer to a VkVideoSessionParametersUpdateInfoKHR structure specifying the parameter update information.
After a successful call to this command, the
update sequence counter of
videoSessionParameters
is changed to the value specified in
pUpdateInfo->updateSequenceCount
.
Note:
As each update issued to a video session parameters object needs to specify the next available update sequence count value, concurrent updates of the same video session parameters object are inherently disallowed. However, recording video coding operations to command buffers referring to parameters previously added to the video session parameters object is allowed, even if there is a concurrent update in progress adding some new entries to the object.
If videoSessionParameters
was created with the video codec operation
VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR
and the
pUpdateInfo->pNext
chain includes a
VkVideoDecodeH264SessionParametersAddInfoKHR structure, then this
command adds the following parameter entries to
videoSessionParameters
:
-
The H.264 SPS entries specified in VkVideoDecodeH264SessionParametersAddInfoKHR::pStdSPSs.
-
The H.264 PPS entries specified in VkVideoDecodeH264SessionParametersAddInfoKHR::pStdPPSs.
If videoSessionParameters
was created with the video codec operation
VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
and the
pUpdateInfo->pNext
chain includes a
VkVideoDecodeH265SessionParametersAddInfoKHR structure, then this
command adds the following parameter entries to
videoSessionParameters
:
-
The H.265 VPS entries specified in VkVideoDecodeH265SessionParametersAddInfoKHR::pStdVPSs.
-
The H.265 SPS entries specified in VkVideoDecodeH265SessionParametersAddInfoKHR::pStdSPSs.
-
The H.265 PPS entries specified in VkVideoDecodeH265SessionParametersAddInfoKHR::pStdPPSs.
The VkVideoSessionParametersUpdateInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionParametersUpdateInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t updateSequenceCount;
} VkVideoSessionParametersUpdateInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
updateSequenceCount is the new update sequence count to set for the video session parameters object.
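An application may find it convenient to track the update sequence counter on the host so that it always knows which updateSequenceCount to pass next. The sketch below is a hypothetical helper, not an API type. It assumes that the counter of a newly created object starts at 0 and that the "next available" value is the current counter plus one; both are assumptions that are not stated in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side mirror of the update sequence counter of one
 * video session parameters object; not a Vulkan type. */
typedef struct {
    uint32_t updateSequenceCounter; /* assumed to be 0 for a new object */
} ParamsUpdateTracker;

/* Value to place in VkVideoSessionParametersUpdateInfoKHR::updateSequenceCount
 * for the next update (assumed here to be the current counter plus one). */
uint32_t next_update_sequence_count(const ParamsUpdateTracker *t) {
    return t->updateSequenceCounter + 1u;
}

/* Call after vkUpdateVideoSessionParametersKHR succeeds: the counter
 * becomes the value that was specified in the update info. */
void on_update_succeeded(ParamsUpdateTracker *t, uint32_t usedCount) {
    assert(usedCount == t->updateSequenceCounter + 1u);
    t->updateSequenceCounter = usedCount;
}
```

Because each update consumes the next value, access to such a tracker has to be serialized by the application, which matches the note that concurrent updates of the same object are disallowed.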
42.8. Video Coding Scope
Applications can record video coding commands for a video session only within a video coding scope.
To begin a video coding scope, call:
// Provided by VK_KHR_video_queue
void vkCmdBeginVideoCodingKHR(
VkCommandBuffer commandBuffer,
const VkVideoBeginCodingInfoKHR* pBeginInfo);
-
commandBuffer
is the command buffer in which to record the command. -
pBeginInfo
is a pointer to a VkVideoBeginCodingInfoKHR structure specifying the parameters of the video coding scope, including the video session and video session parameters object to use.
After beginning a video coding scope, the video session object specified in
pBeginInfo->videoSession
is bound to the command buffer, and the
command buffer is ready to record video coding operations.
Similarly, if pBeginInfo->videoSessionParameters
is not
VK_NULL_HANDLE
, it is also bound to the command buffer, and video
coding operations can refer to the codec-specific parameters stored in it.
This command also establishes the set of bound reference picture resources that can be used as reconstructed pictures or reference pictures within the video coding scope. Each element of this set consists of a video picture resource and the DPB slot index associated with it, if there is one.
The set of bound reference picture resources is immutable within a video coding scope; however, the DPB slot index associated with any of the bound reference picture resources can change during the video coding scope in response to video coding operations.
The VkVideoReferenceSlotInfoKHR structures provided as the elements of
pBeginInfo->pReferenceSlots
are interpreted by this command as
follows:
-
If slotIndex is non-negative and pPictureResource is not NULL, then the video picture resource defined by the VkVideoPictureResourceInfoKHR structure pointed to by pPictureResource is added to the set of bound reference picture resources and is associated with the DPB slot index specified in slotIndex.
-
If slotIndex is non-negative and pPictureResource is NULL, then the DPB slot with index slotIndex is deactivated by this command.
-
If slotIndex is negative and pPictureResource is not NULL, then the video picture resource defined by the VkVideoPictureResourceInfoKHR structure pointed to by pPictureResource is added to the set of bound reference picture resources without an associated DPB slot. Such a picture resource can be subsequently used as a reconstructed picture to associate it with a DPB slot.
-
If slotIndex is negative and pPictureResource is NULL, then the element is ignored.
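The four cases above amount to a two-by-two decision on the sign of slotIndex and on whether pPictureResource is NULL. A minimal sketch of that decision follows; RefSlot, SlotAction, and classify_reference_slot are hypothetical names used only for illustration, with RefSlot carrying just the two relevant members of VkVideoReferenceSlotInfoKHR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction of VkVideoReferenceSlotInfoKHR to the two members
 * that drive the interpretation. */
typedef struct {
    int32_t slotIndex;
    const void *pPictureResource;
} RefSlot;

typedef enum {
    SLOT_BIND_WITH_DPB_INDEX,    /* resource bound and associated with slotIndex */
    SLOT_DEACTIVATE,             /* DPB slot slotIndex is deactivated */
    SLOT_BIND_WITHOUT_DPB_INDEX, /* resource bound, no DPB slot associated */
    SLOT_IGNORED                 /* element has no effect */
} SlotAction;

/* Returns how vkCmdBeginVideoCodingKHR interprets one pReferenceSlots element. */
SlotAction classify_reference_slot(const RefSlot *slot) {
    assert(slot != NULL);
    if (slot->slotIndex >= 0) {
        return slot->pPictureResource != NULL ? SLOT_BIND_WITH_DPB_INDEX
                                              : SLOT_DEACTIVATE;
    }
    return slot->pPictureResource != NULL ? SLOT_BIND_WITHOUT_DPB_INDEX
                                          : SLOT_IGNORED;
}
```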
Note:
It is possible for multiple bound reference picture resources to be associated with the same DPB slot index, or for a single bound reference picture to refer to multiple separate reference pictures. For example, in case of an H.264 decode profile with interlaced frame support a single DPB slot can refer to two separate pictures for the top and bottom fields. Depending on the picture layout used by the H.264 decode profile, the following special cases may arise:
|
All non-negative slotIndex
values specified in the elements of
pBeginInfo->pReferenceSlots
must identify DPB slots of the video
session that are in the active state at the time this
command is executed on the device.
Note:
The application does not have to specify an entry in
|
The VkVideoBeginCodingInfoKHR structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoBeginCodingInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoBeginCodingFlagsKHR flags;
VkVideoSessionKHR videoSession;
VkVideoSessionParametersKHR videoSessionParameters;
uint32_t referenceSlotCount;
const VkVideoReferenceSlotInfoKHR* pReferenceSlots;
} VkVideoBeginCodingInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is reserved for future use.
-
videoSession is the video session object to be bound for the processing of the video commands.
-
videoSessionParameters is VK_NULL_HANDLE or a handle of a VkVideoSessionParametersKHR object to be used for the processing of the video commands. If VK_NULL_HANDLE, then no video session parameters object is bound for the duration of the video coding scope.
-
referenceSlotCount is the number of elements in the pReferenceSlots array.
-
pReferenceSlots is a pointer to an array of VkVideoReferenceSlotInfoKHR structures specifying the information used to determine the set of bound reference picture resources for the video coding scope and their initial association with DPB slot indices.
Limiting values are defined below that are referenced by the relevant valid usage statements of this structure.
-
Let VkOffset2D codedOffsetGranularity be the minimum alignment requirement for the coded offset of video picture resources. Unless otherwise defined, the value of the x and y members of codedOffsetGranularity are 0.
  -
  If videoSession was created with an H.264 decode profile with a VkVideoDecodeH264ProfileInfoKHR::pictureLayout of VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR, then codedOffsetGranularity is equal to VkVideoDecodeH264CapabilitiesKHR::fieldOffsetGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for that video profile.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoBeginCodingFlagsKHR;
VkVideoBeginCodingFlagsKHR
is a bitmask type for setting a mask, but
is currently reserved for future use.
The VkVideoReferenceSlotInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoReferenceSlotInfoKHR {
VkStructureType sType;
const void* pNext;
int32_t slotIndex;
const VkVideoPictureResourceInfoKHR* pPictureResource;
} VkVideoReferenceSlotInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
slotIndex is the index of the DPB slot or a negative integer value.
-
pPictureResource is NULL or a pointer to a VkVideoPictureResourceInfoKHR structure describing the video picture resource associated with the DPB slot index specified by slotIndex.
To end a video coding scope, call:
// Provided by VK_KHR_video_queue
void vkCmdEndVideoCodingKHR(
VkCommandBuffer commandBuffer,
const VkVideoEndCodingInfoKHR* pEndCodingInfo);
-
commandBuffer
is the command buffer in which to record the command. -
pEndCodingInfo
is a pointer to a VkVideoEndCodingInfoKHR structure specifying the parameters for ending the video coding scope.
After ending a video coding scope, the video session object, the optional video session parameters object, and all reference picture resources previously bound by the corresponding vkCmdBeginVideoCodingKHR command are unbound.
The VkVideoEndCodingInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoEndCodingInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoEndCodingFlagsKHR flags;
} VkVideoEndCodingInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is reserved for future use.
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoEndCodingFlagsKHR;
VkVideoEndCodingFlagsKHR
is a bitmask type for setting a mask, but is
currently reserved for future use.
42.9. Video Coding Control
To apply dynamic controls to the currently bound video session object, call:
// Provided by VK_KHR_video_queue
void vkCmdControlVideoCodingKHR(
VkCommandBuffer commandBuffer,
const VkVideoCodingControlInfoKHR* pCodingControlInfo);
-
commandBuffer
is the command buffer in which to record the command. -
pCodingControlInfo
is a pointer to a VkVideoCodingControlInfoKHR structure specifying the control parameters.
The control parameters provided in this call are applied to the video session at the time the command executes on the device and are in effect until a subsequent call to this command with the same video session bound changes the corresponding control parameters.
A newly created video session must be reset before performing video coding
operations using it by including VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR
in pCodingControlInfo->flags
.
The reset operation also returns all DPB slots of the video session to the
inactive state.
Correspondingly, any DPB slot index associated with the
bound reference picture resources is
removed.
For encode sessions, the reset operation returns rate control configuration to implementation default settings.
After video coding operations are performed using a video session, the reset operation can be used to return the video session to the same initial state as after the reset of a newly created video session. This can be used, for example, when different video sequences need to be processed with the same video session object.
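The effect of the reset operation on DPB slot state can be modeled on the host as in the sketch below. SessionModel, apply_control, and can_record_coding_ops are hypothetical names, MAX_SLOTS is an arbitrary bound chosen for the example, and the model covers only the behavior described above (slots returning to the inactive state, and a new session requiring a reset before use). It does not model rate control.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SLOTS 16 /* hypothetical bound for this sketch */

/* Same value as VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR. */
#define CONTROL_RESET_BIT 0x00000001u

/* Hypothetical host-side model of a video session's state. */
typedef struct {
    bool initialized;           /* true once the session has been reset */
    uint32_t slotCount;         /* number of DPB slots in the session */
    bool slotActive[MAX_SLOTS]; /* active state of each DPB slot */
} SessionModel;

/* Models the state change caused by vkCmdControlVideoCodingKHR. */
void apply_control(SessionModel *s, uint32_t flags) {
    assert(s->slotCount <= MAX_SLOTS);
    if (flags & CONTROL_RESET_BIT) {
        /* Reset returns every DPB slot to the inactive state. */
        for (uint32_t i = 0; i < s->slotCount; ++i) {
            s->slotActive[i] = false;
        }
        s->initialized = true;
    }
}

/* A newly created session must be reset before video coding operations. */
bool can_record_coding_ops(const SessionModel *s) {
    return s->initialized;
}
```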
The VkVideoCodingControlInfoKHR
structure is defined as:
// Provided by VK_KHR_video_queue
typedef struct VkVideoCodingControlInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoCodingControlFlagsKHR flags;
} VkVideoCodingControlInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is a bitmask of VkVideoCodingControlFlagsKHR specifying control flags.
Bits which can be set in VkVideoCodingControlInfoKHR::flags
,
specifying the video coding control parameters to be modified, are:
// Provided by VK_KHR_video_queue
typedef enum VkVideoCodingControlFlagBitsKHR {
VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR = 0x00000001,
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_KHR_video_encode_queue
VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR = 0x00000002,
#endif
#ifdef VK_ENABLE_BETA_EXTENSIONS
// Provided by VK_KHR_video_encode_queue
VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_LAYER_BIT_KHR = 0x00000004,
#endif
} VkVideoCodingControlFlagBitsKHR;
-
VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR
indicates a request for the bound video session to be reset before other coding control parameters are applied. -
VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR
indicates that the coding control parameters include video encode rate control parameters (see VkVideoEncodeRateControlInfoKHR). -
VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_LAYER_BIT_KHR
indicates that the coding control parameters include video encode rate control layer parameters (see VkVideoEncodeRateControlLayerInfoKHR).
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCodingControlFlagsKHR;
VkVideoCodingControlFlagsKHR
is a bitmask type for setting a mask of
zero or more VkVideoCodingControlFlagBitsKHR.
42.10. Video Decode Operations
Video decode operations consume compressed video data from a video bitstream buffer and zero or more reference pictures, and produce a decode output picture and an optional reconstructed picture.
Note:
Such decode output pictures can be shared with the Decoded Picture Buffer, and can also be used as the input of video encode operations, with graphics or compute operations, or with Window System Integration APIs, depending on the capabilities of the implementation.
Video decode operations may access the following resources in the
VK_PIPELINE_STAGE_2_VIDEO_DECODE_BIT_KHR
stage:
-
The source video bitstream buffer range and the image subregions corresponding to the list of active reference pictures with access
VK_ACCESS_2_VIDEO_DECODE_READ_BIT_KHR
. -
The image subregions corresponding to the target decode output picture and reconstructed picture with access
VK_ACCESS_2_VIDEO_DECODE_WRITE_BIT_KHR
.
The image subresource of each video picture resource accessed by the video coding operation is specified using a corresponding VkVideoPictureResourceInfoKHR structure. Each such image subresource must be in the appropriate image layout as follows:
-
If the image subresource is used in the video decode operation only as decode output picture, then it must be in the
VK_IMAGE_LAYOUT_VIDEO_DECODE_DST_KHR
layout. -
If the image subresource is used in the video decode operation both as decode output picture and reconstructed picture, then it must be in the
VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR
layout. -
If the image subresource is used in the video decode operation only as reconstructed picture, then it must be in the
VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR
layout. -
If the image subresource is used in the video decode operation as a reference picture, then it must be in the
VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR
layout.
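The layout rules above reduce to a simple test: only a picture used purely as decode output uses the DST layout, and every other listed role uses the DPB layout. The following sketch expresses that; DecodeLayout and required_decode_layout are hypothetical names, and the enumerants merely stand in for the corresponding VkImageLayout values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two VkImageLayout values named above. */
typedef enum {
    LAYOUT_VIDEO_DECODE_DST, /* VK_IMAGE_LAYOUT_VIDEO_DECODE_DST_KHR */
    LAYOUT_VIDEO_DECODE_DPB  /* VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR */
} DecodeLayout;

/* Returns the layout an image subresource must be in, given the roles it
 * plays in a single video decode operation. */
DecodeLayout required_decode_layout(bool isOutput, bool isReconstructed,
                                    bool isReference) {
    /* The subresource has to be used in at least one role. */
    assert(isOutput || isReconstructed || isReference);
    if (isOutput && !isReconstructed && !isReference) {
        return LAYOUT_VIDEO_DECODE_DST;
    }
    /* Reconstructed and reference use both require the DPB layout. */
    return LAYOUT_VIDEO_DECODE_DPB;
}
```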
A video decode operation may complete unsuccessfully. In this case the decode output picture will have undefined contents. Similarly, if a reconstructed picture is specified, it will also have undefined contents, and the activated DPB slot will have an invalid picture reference.
42.10.1. Codec-Specific Semantics
The following aspects of video decode operations are codec-specific:
-
The interpretation of the contents of the source video bitstream buffer range.
-
The construction and interpretation of the list of active reference pictures and the interpretation of the picture data referred to by the corresponding image subregions.
-
The construction and interpretation of information related to the decode output picture and the generation of picture data to the corresponding image subregion.
-
The construction and interpretation of information related to the optional reconstructed picture and the generation of picture data to the corresponding image subregion.
These codec-specific behaviors are defined for each video codec operation separately.
-
If the used video codec operation is
VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR
, then the codec-specific aspects of the video decoding process are performed as defined in the H.264 Decode Operations section. -
If the used video codec operation is
VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
, then the codec-specific aspects of the video decoding process are performed as defined in the H.265 Decode Operations section.
42.10.2. Video Decode Operation Steps
Each video decode operation performs the following steps in the
VK_PIPELINE_STAGE_2_VIDEO_DECODE_BIT_KHR
stage:
-
Reads the encoded video data from the source video bitstream buffer range;
-
Performs picture reconstruction of the encoded video data according to the codec-specific semantics, applying any prediction data read from the active reference pictures in the process;
-
Writes the decoded picture data to the decode output picture, and to the reconstructed picture, if one is specified and is different from the decode output picture, according to the codec-specific semantics;
-
When reconstructed picture information is provided, the requested DPB slot is activated with the specified picture and the DPB slot index is associated with the corresponding bound reference picture resource.
42.10.3. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with
pVideoProfile->videoCodecOperation
specifying a decode operation, the
VkVideoDecodeCapabilitiesKHR
structure must be included in the
pNext
chain of the VkVideoCapabilitiesKHR structure to retrieve
capabilities specific to video decoding.
The VkVideoDecodeCapabilitiesKHR
structure is defined as:
// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeCapabilitiesKHR {
VkStructureType sType;
void* pNext;
VkVideoDecodeCapabilityFlagsKHR flags;
} VkVideoDecodeCapabilitiesKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is a bitmask of VkVideoDecodeCapabilityFlagBitsKHR describing the supported video decoding capabilities.
Bits which may be set in VkVideoDecodeCapabilitiesKHR::flags
,
indicating the decoding capabilities supported, are:
// Provided by VK_KHR_video_decode_queue
typedef enum VkVideoDecodeCapabilityFlagBitsKHR {
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR = 0x00000001,
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR = 0x00000002,
} VkVideoDecodeCapabilityFlagBitsKHR;
-
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR
indicates support for using the same video picture resource as the reconstructed picture and decode output picture in a video decode operation. -
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR
indicates support for using distinct video picture resources as the reconstructed picture and decode output picture in a video decode operation.
Implementations are only required to support one of
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR
and
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR
.
Accordingly, applications should handle both cases to maximize portability.
Note:
If both |
// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeCapabilityFlagsKHR;
VkVideoDecodeCapabilityFlagsKHR
is a bitmask type for setting a mask
of zero or more VkVideoDecodeCapabilityFlagBitsKHR.
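Since an implementation may report only one of the two capability bits, a portable application needs a code path for each. One possible selection routine is sketched below. The names are hypothetical, the macro values mirror the enumerants listed above, and preferring the coincide mode when both bits are reported is an arbitrary choice made for this example rather than a recommendation of the specification.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the VkVideoDecodeCapabilityFlagBitsKHR enumerants. */
#define CAP_DPB_AND_OUTPUT_COINCIDE 0x00000001u
#define CAP_DPB_AND_OUTPUT_DISTINCT 0x00000002u

typedef enum {
    DPB_OUTPUT_COINCIDE, /* one resource is both reconstructed and output picture */
    DPB_OUTPUT_DISTINCT  /* separate resources for reconstructed and output picture */
} DpbOutputMode;

/* Picks a mode from VkVideoDecodeCapabilitiesKHR::flags. */
DpbOutputMode choose_dpb_output_mode(uint32_t capsFlags) {
    /* At least one of the two bits is expected to be reported. */
    assert((capsFlags & (CAP_DPB_AND_OUTPUT_COINCIDE |
                         CAP_DPB_AND_OUTPUT_DISTINCT)) != 0u);
    /* Arbitrary preference in this sketch: coincide when available. */
    if (capsFlags & CAP_DPB_AND_OUTPUT_COINCIDE) {
        return DPB_OUTPUT_COINCIDE;
    }
    return DPB_OUTPUT_DISTINCT;
}
```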
42.10.4. Video Decode Commands
To launch video decode operations, call:
// Provided by VK_KHR_video_decode_queue
void vkCmdDecodeVideoKHR(
VkCommandBuffer commandBuffer,
const VkVideoDecodeInfoKHR* pDecodeInfo);
-
commandBuffer
is the command buffer in which to record the command. -
pDecodeInfo
is a pointer to a VkVideoDecodeInfoKHR structure specifying the parameters of the video decode operations.
Each call issues one or more video decode operations.
The implicit parameter opCount
corresponds to the number of video
decode operations issued by the command.
After calling this command, the
active query index of each
active query is incremented by opCount
.
Currently each call to this command results in the issue of a single video decode operation.
Active Reference Picture Information
The list of active reference pictures used by a video decode operation is a list of image subregions used as the source of reference picture data and related parameters, and is derived from the VkVideoReferenceSlotInfoKHR structures provided as the elements of the pDecodeInfo->pReferenceSlots array. For each element of pDecodeInfo->pReferenceSlots, one or more elements are added to the active reference picture list, as defined by the codec-specific semantics. Each element of this list contains the following information:
-
The image subregion within the image subresource referred to by the video picture resource used as the reference picture.
-
The DPB slot index the reference picture is associated with.
-
The codec-specific reference information related to the reference picture.
Reconstructed Picture Information
Information related to the optional reconstructed picture used by a video decode operation is derived from the VkVideoReferenceSlotInfoKHR structure pointed to by pDecodeInfo->pSetupReferenceSlot, if not NULL, as defined by the codec-specific semantics, and consists of the following:
-
The image subregion within the image subresource referred to by the video picture resource used as the reconstructed picture.
-
The DPB slot index to activate with the reconstructed picture.
-
The codec-specific reference information related to the reconstructed picture.
Decode Output Picture Information
Information related to the decode output picture used by a video decode operation is derived from pDecodeInfo->dstPictureResource and any codec-specific parameters provided in the pDecodeInfo->pNext chain, as defined by the codec-specific semantics, and consists of the following:
-
The image subregion within the image subresource referred to by the video picture resource used as the decode output picture.
-
The codec-specific picture information related to the decode output picture.
Several limiting values are defined below that are referenced by the relevant valid usage statements of this command.
-
Let uint32_t activeReferencePictureCount be the size of the list of active reference pictures used by the video decode operation. Unless otherwise defined, activeReferencePictureCount is set to the value of pDecodeInfo->referenceSlotCount.
  -
  If the bound video session was created with an H.264 decode profile, then let activeReferencePictureCount be the value of pDecodeInfo->referenceSlotCount plus the number of elements of the pDecodeInfo->pReferenceSlots array that have a VkVideoDecodeH264DpbSlotInfoKHR structure included in their pNext chain with both pStdReferenceInfo->flags.top_field_flag and pStdReferenceInfo->flags.bottom_field_flag set.
  Note:
  This means that the elements of pDecodeInfo->pReferenceSlots that include both a top and bottom field reference are counted as two separate active reference pictures, as described in the active reference picture list construction rules for H.264 decode operations.
-
Let VkOffset2D codedOffsetGranularity be the minimum alignment requirement for the coded offset of video picture resources. Unless otherwise defined, the value of the x and y members of codedOffsetGranularity are 0.
  -
  If the bound video session was created with an H.264 decode profile with a VkVideoDecodeH264ProfileInfoKHR::pictureLayout of VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR, then codedOffsetGranularity is equal to VkVideoDecodeH264CapabilitiesKHR::fieldOffsetGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for that video profile.
-
Let uint32_t dpbFrameUseCount[] be an array of size maxDpbSlots, where maxDpbSlots is the VkVideoSessionCreateInfoKHR::maxDpbSlots the bound video session was created with, with each element indicating the number of times a frame associated with the corresponding DPB slot index is referred to by the video coding operation. Let the initial value of each element of the array be 0.
  -
  If pDecodeInfo->pSetupReferenceSlot is not NULL, then dpbFrameUseCount[i] is incremented by one, where i equals pDecodeInfo->pSetupReferenceSlot->slotIndex. If the bound video session object was created with an H.264 decode profile, then dpbFrameUseCount[i] is decremented by one if either pStdReferenceInfo->flags.top_field_flag or pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pDecodeInfo->pSetupReferenceSlot->pNext chain.
  -
  For each element of pDecodeInfo->pReferenceSlots, dpbFrameUseCount[i] is incremented by one, where i equals the slotIndex member of the corresponding element. If the bound video session object was created with an H.264 decode profile, then dpbFrameUseCount[i] is decremented by one if either pStdReferenceInfo->flags.top_field_flag or pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pNext chain of the corresponding element of pDecodeInfo->pReferenceSlots.
-
Let uint32_t dpbTopFieldUseCount[] and uint32_t dpbBottomFieldUseCount[] be arrays of size maxDpbSlots, where maxDpbSlots is the VkVideoSessionCreateInfoKHR::maxDpbSlots the bound video session was created with, with each element indicating the number of times the top field or the bottom field, respectively, associated with the corresponding DPB slot index is referred to by the video coding operation. Let the initial value of each element of the arrays be 0.
  -
  If the bound video session object was created with an H.264 decode profile and pDecodeInfo->pSetupReferenceSlot is not NULL, then perform the following:
    -
    If pStdReferenceInfo->flags.top_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pDecodeInfo->pSetupReferenceSlot->pNext chain, then dpbTopFieldUseCount[i] is incremented by one, where i equals pDecodeInfo->pSetupReferenceSlot->slotIndex.
    -
    If pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pDecodeInfo->pSetupReferenceSlot->pNext chain, then dpbBottomFieldUseCount[i] is incremented by one, where i equals pDecodeInfo->pSetupReferenceSlot->slotIndex.
  -
  If the bound video session object was created with an H.264 decode profile, then perform the following for each element of pDecodeInfo->pReferenceSlots:
    -
    If pStdReferenceInfo->flags.top_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pNext chain of the element, then dpbTopFieldUseCount[i] is incremented by one, where i equals the slotIndex member of the element.
    -
    If pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pNext chain of the element, then dpbBottomFieldUseCount[i] is incremented by one, where i equals the slotIndex member of the element.
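For an H.264 decode profile, the limiting values defined above can be computed as in the following sketch. H264RefSlot, DecodeLimits, and compute_h264_decode_limits are hypothetical names, MAX_DPB_SLOTS is an arbitrary bound for the example, and H264RefSlot flattens the slotIndex of a VkVideoReferenceSlotInfoKHR together with the two field flags from its VkVideoDecodeH264DpbSlotInfoKHR. The increment followed by a conditional decrement of dpbFrameUseCount is written here as a single conditional increment, which has the same net effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DPB_SLOTS 17 /* hypothetical bound for this sketch */

/* Hypothetical flattening of a reference slot and its H.264 field flags. */
typedef struct {
    int32_t slotIndex;
    bool top_field_flag;
    bool bottom_field_flag;
} H264RefSlot;

typedef struct {
    uint32_t activeReferencePictureCount;
    uint32_t dpbFrameUseCount[MAX_DPB_SLOTS];
    uint32_t dpbTopFieldUseCount[MAX_DPB_SLOTS];
    uint32_t dpbBottomFieldUseCount[MAX_DPB_SLOTS];
} DecodeLimits;

/* Adds one slot reference to the per-slot use counts. */
static void count_slot(DecodeLimits *out, const H264RefSlot *s) {
    assert(s->slotIndex >= 0 && s->slotIndex < MAX_DPB_SLOTS);
    uint32_t i = (uint32_t)s->slotIndex;
    /* Net effect of "increment, then decrement if a field flag is set". */
    if (!s->top_field_flag && !s->bottom_field_flag) out->dpbFrameUseCount[i]++;
    if (s->top_field_flag) out->dpbTopFieldUseCount[i]++;
    if (s->bottom_field_flag) out->dpbBottomFieldUseCount[i]++;
}

/* setup may be NULL, modeling a NULL pSetupReferenceSlot. */
DecodeLimits compute_h264_decode_limits(const H264RefSlot *setup,
                                        const H264RefSlot *refs,
                                        uint32_t refCount) {
    DecodeLimits out = { 0 };
    out.activeReferencePictureCount = refCount;
    if (setup != NULL) count_slot(&out, setup);
    for (uint32_t r = 0; r < refCount; ++r) {
        /* A slot referencing both fields counts as two active references. */
        if (refs[r].top_field_flag && refs[r].bottom_field_flag) {
            out.activeReferencePictureCount++;
        }
        count_slot(&out, &refs[r]);
    }
    return out;
}
```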
The VkVideoDecodeInfoKHR
structure is defined as:
// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoDecodeFlagsKHR flags;
VkBuffer srcBuffer;
VkDeviceSize srcBufferOffset;
VkDeviceSize srcBufferRange;
VkVideoPictureResourceInfoKHR dstPictureResource;
const VkVideoReferenceSlotInfoKHR* pSetupReferenceSlot;
uint32_t referenceSlotCount;
const VkVideoReferenceSlotInfoKHR* pReferenceSlots;
} VkVideoDecodeInfoKHR;
-
sType is the type of this structure.
-
pNext is NULL or a pointer to a structure extending this structure.
-
flags is reserved for future use.
-
srcBuffer is the source video bitstream buffer to read the encoded bitstream from.
-
srcBufferOffset is the starting offset in bytes from the start of srcBuffer to read the encoded bitstream from.
-
srcBufferRange is the size in bytes of the encoded bitstream to decode from srcBuffer, starting from srcBufferOffset.
-
dstPictureResource is the video picture resource to use as the decode output picture.
-
pSetupReferenceSlot is NULL or a pointer to a VkVideoReferenceSlotInfoKHR structure describing the DPB slot to activate and the video picture resource to use as the reconstructed picture to activate the DPB slot with.
-
referenceSlotCount is the number of elements in the pReferenceSlots array.
-
pReferenceSlots is a pointer to an array of VkVideoReferenceSlotInfoKHR structures describing the DPB slots and corresponding reference picture resources to use in this video decode operation (the set of active reference pictures).
// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeFlagsKHR;
VkVideoDecodeFlagsKHR
is a bitmask type for setting a mask, but is
currently reserved for future use.
42.11. H.264 Decode Operations
Video decode operations using an H.264 decode profile can be used to decode elementary video stream sequences compliant to the ITU-T H.264 Specification.
Note:
Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.
This process is performed according to the video decode operation steps with the codec-specific semantics defined in section 8 of the ITU-T H.264 Specification as follows:
-
Syntax elements, derived values, and other parameters are applied from the following structures:
-
The
StdVideoH264SequenceParameterSet
structure corresponding to the active SPS specifying the H.264 sequence parameter set. -
The
StdVideoH264PictureParameterSet
structure corresponding to the active PPS specifying the H.264 picture parameter set. -
The
StdVideoDecodeH264PictureInfo
structure specifying the H.264 picture information. -
The
StdVideoDecodeH264ReferenceInfo
structures specifying the H.264 reference information corresponding to the optional reconstructed picture and any active reference pictures.
-
-
The contents of the provided video bitstream buffer range are interpreted as defined in the H.264 Decode Bitstream Data Access section.
-
Picture data in the video picture resources corresponding to the used active reference pictures, decode output picture, and optional reconstructed picture is accessed as defined in the H.264 Decode Picture Data Access section.
If the parameters and the bitstream adhere to the syntactic and semantic requirements defined in the corresponding sections of the ITU-T H.264 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video decode operation will complete successfully. Otherwise, the video decode operation may complete unsuccessfully.
42.11.1. H.264 Decode Bitstream Data Access
If the target decode output picture is a frame, then the video bitstream buffer range should contain a VCL NAL unit comprised of the slice headers and data of a picture representing an entire frame, as defined in sections 7.3.3 and 7.3.4, and this data is interpreted as defined in sections 7.4.3 and 7.4.4 of the ITU-T H.264 Specification, respectively.
If the target decode output picture is a field, then the video bitstream buffer range should contain a VCL NAL unit comprised of the slice headers and data of a picture representing a field, as defined in sections 7.3.3 and 7.3.4, and this data is interpreted as defined in sections 7.4.3 and 7.4.4 of the ITU-T H.264 Specification, respectively.
The offsets provided in
VkVideoDecodeH264PictureInfoKHR::pSliceOffsets
should specify
the starting offsets corresponding to each slice header within the video
bitstream buffer range.
42.11.2. H.264 Decode Picture Data Access
The effective imageOffset
and imageExtent
corresponding to a
decode output picture,
reference picture, or
reconstructed picture used in video decode
operations with an H.264 decode profile are defined
as follows:
-
imageOffset is (codedOffset.x, codedOffset.y) and imageExtent is (codedExtent.width, codedExtent.height), if the picture represents a frame.
-
imageOffset is (codedOffset.x, codedOffset.y) and imageExtent is (codedExtent.width, codedExtent.height), if the picture represents a field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR.
-
imageOffset is (codedOffset.x, codedOffset.y) and imageExtent is (codedExtent.width, codedExtent.height / 2), if the picture represents a field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR.
Where codedOffset
and codedExtent
are the members of the
VkVideoPictureResourceInfoKHR structure corresponding to the picture.
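As a non-normative illustration, the three cases above can be sketched in C as follows. The `Offset2D` and `Extent2D` types, the two enums, and `effective_region` are hypothetical stand-ins introduced for this example so that it builds without the Vulkan headers; they are not part of the API.

```c
#include <stdint.h>

/* Local stand-ins for VkOffset2D / VkExtent2D (assumption: same member layout). */
typedef struct { int32_t x, y; } Offset2D;
typedef struct { uint32_t width, height; } Extent2D;

/* Hypothetical enums mirroring the cases listed in the text. */
typedef enum { PICTURE_FRAME, PICTURE_FIELD } PictureKind;
typedef enum {
    LAYOUT_PROGRESSIVE,
    LAYOUT_INTERLEAVED_LINES,
    LAYOUT_SEPARATE_PLANES
} PictureLayout;

typedef struct {
    Offset2D imageOffset;
    Extent2D imageExtent;
} EffectiveRegion;

/* Derive the effective imageOffset / imageExtent from codedOffset / codedExtent. */
static EffectiveRegion effective_region(Offset2D codedOffset, Extent2D codedExtent,
                                        PictureKind kind, PictureLayout layout)
{
    EffectiveRegion r = { codedOffset, codedExtent };
    /* Only a field stored with the separate planes layout covers half the coded height. */
    if (kind == PICTURE_FIELD && layout == LAYOUT_SEPARATE_PLANES) {
        r.imageExtent.height = codedExtent.height / 2;
    }
    return r;
}
```

Frames and interleaved-lines fields share the same effective region, so the sketch needs only one special case.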
However, accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile.
This means that the complete image subregion accessed by video coding operations using an H.264 decode profile for the video picture resource is defined as the set of texels within the coordinate range:
- ([startX, endX), [startY, endY))
Where:
- startX equals imageOffset.x rounded down to the nearest integer multiple of pictureAccessGranularity.width;
- endX equals imageOffset.x + imageExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;
- startY equals imageOffset.y rounded down to the nearest integer multiple of pictureAccessGranularity.height;
- endY equals imageOffset.y + imageExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure.
In case of video decode operations using an H.264 decode profile, any access to a picture at the coordinates (x, y), as defined by the ITU-T H.264 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates specified below:
- (x, y), if the accessed picture represents a frame.
- (x, y × 2), if the accessed picture represents a top field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR.
- (x, y × 2 + 1), if the accessed picture represents a bottom field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR.
- (x, y), if the accessed picture represents a top field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR.
- (codedOffset.x + x, codedOffset.y + y), if the accessed picture represents a bottom field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR.
Where codedOffset is the member of the corresponding VkVideoPictureResourceInfoKHR structure.
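The coordinate mapping above can be expressed as a single function. The following C sketch is non-normative; `AccessedPicture`, `FieldLayout`, `Texel`, and `texel_coords` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical enums mirroring the cases in the list above. */
typedef enum { ACCESS_FRAME, ACCESS_TOP_FIELD, ACCESS_BOTTOM_FIELD } AccessedPicture;
typedef enum { FIELD_LAYOUT_INTERLEAVED_LINES, FIELD_LAYOUT_SEPARATE_PLANES } FieldLayout;

typedef struct { int32_t x, y; } Texel;

/* Map picture coordinates (x, y) to texel coordinates in the image subresource.
 * The layout argument is ignored when the accessed picture is a frame. */
static Texel texel_coords(int32_t x, int32_t y,
                          AccessedPicture pic, FieldLayout layout,
                          int32_t codedOffsetX, int32_t codedOffsetY)
{
    Texel t = { x, y };
    if (pic == ACCESS_FRAME) {
        return t;
    }
    if (layout == FIELD_LAYOUT_INTERLEAVED_LINES) {
        /* Top field lines land on even rows, bottom field lines on odd rows. */
        t.y = (pic == ACCESS_TOP_FIELD) ? y * 2 : y * 2 + 1;
    } else if (pic == ACCESS_BOTTOM_FIELD) {
        /* Separate planes: the bottom field is addressed relative to codedOffset. */
        t.x = codedOffsetX + x;
        t.y = codedOffsetY + y;
    }
    return t;
}
```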
42.11.3. H.264 Decode Profile
A video profile supporting H.264 video decode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and adding a VkVideoDecodeH264ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.
The VkVideoDecodeH264ProfileInfoKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264ProfileInfoKHR {
VkStructureType sType;
const void* pNext;
StdVideoH264ProfileIdc stdProfileIdc;
VkVideoDecodeH264PictureLayoutFlagBitsKHR pictureLayout;
} VkVideoDecodeH264ProfileInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
stdProfileIdc
is aStdVideoH264ProfileIdc
value specifying the H.264 codec profile IDC, as defined in section A.2 of the ITU-T H.264 Specification. -
pictureLayout
is a VkVideoDecodeH264PictureLayoutFlagBitsKHR value specifying the picture layout used by the H.264 video sequence to be decoded.
The H.264 video decode picture layout flags are defined as follows:
// Provided by VK_KHR_video_decode_h264
typedef enum VkVideoDecodeH264PictureLayoutFlagBitsKHR {
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR = 0,
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR = 0x00000001,
VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR = 0x00000002,
} VkVideoDecodeH264PictureLayoutFlagBitsKHR;
- VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR specifies support for progressive content. This flag has the value 0.
- VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR specifies support for or use of a picture layout for interlaced content where all lines belonging to the top field are decoded to the even-numbered lines within the picture resource, and all lines belonging to the bottom field are decoded to the odd-numbered lines within the picture resource.
- VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR specifies support for or use of a picture layout for interlaced content where all lines belonging to a field are grouped together in a single image subregion, and the two fields comprising the frame can be stored in separate image subregions of the same image subresource or in separate image subresources.
// Provided by VK_KHR_video_decode_h264
typedef VkFlags VkVideoDecodeH264PictureLayoutFlagsKHR;
VkVideoDecodeH264PictureLayoutFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoDecodeH264PictureLayoutFlagBitsKHR.
42.11.4. H.264 Decode Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an H.264 decode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoDecodeH264CapabilitiesKHR structure that will be filled with the profile-specific capabilities.
The VkVideoDecodeH264CapabilitiesKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264CapabilitiesKHR {
VkStructureType sType;
void* pNext;
StdVideoH264LevelIdc maxLevelIdc;
VkOffset2D fieldOffsetGranularity;
} VkVideoDecodeH264CapabilitiesKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
maxLevelIdc
is aStdVideoH264LevelIdc
value specifying the maximum H.264 level supported by the profile, as defined in section A.3 of the ITU-T H.264 Specification. -
fieldOffsetGranularity
is the minimum alignment for VkVideoPictureResourceInfoKHR::codedOffset
specified for a video picture resource when using the picture layoutVK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR
.
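As a non-normative sketch, an application might validate a candidate codedOffset against fieldOffsetGranularity before using the separate planes layout. The sketch assumes that "minimum alignment" means each offset component is an integer multiple of the corresponding granularity component, and it treats a non-positive granularity as unusable. `coded_offset_is_aligned` is a hypothetical helper, not an API function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if (offsetX, offsetY) is a multiple of (granX, granY) per component.
 * Assumption: a non-positive granularity is rejected rather than divided by. */
static bool coded_offset_is_aligned(int32_t offsetX, int32_t offsetY,
                                    int32_t granX, int32_t granY)
{
    if (granX <= 0 || granY <= 0) {
        return false;
    }
    return (offsetX % granX == 0) && (offsetY % granY == 0);
}
```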
42.11.5. H.264 Decode Parameter Sets
Video session parameters objects created with
the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR
can contain the following types of parameters:
- H.264 Sequence Parameter Sets (SPS)
-
Represented by
StdVideoH264SequenceParameterSet
structures and interpreted as follows:-
reserved1
andreserved2
are used only for padding purposes and are otherwise ignored; -
seq_parameter_set_id
is used as the key of the SPS entry; -
if
flags.seq_scaling_matrix_present_flag
is set, then theStdVideoH264ScalingLists
structure pointed to bypScalingLists
is interpreted as follows:-
scaling_list_present_mask
is a bitmask where bit index i corresponds toseq_scaling_list_present_flag[i]
as defined in section 7.4.2.1 of the ITU-T H.264 Specification; -
use_default_scaling_matrix_mask
is a bitmask where bit index i corresponds toUseDefaultScalingMatrix4x4Flag[i]
, when i < 6, or corresponds toUseDefaultScalingMatrix8x8Flag[i-6]
, otherwise, as defined in section 7.3.2.1 of the ITU-T H.264 Specification; -
ScalingList4x4
andScalingList8x8
correspond to the identically named syntax elements defined in section 7.3.2.1 of the ITU-T H.264 Specification;
-
-
if
flags.vui_parameters_present_flag
is set, thenpSequenceParameterSetVui
points to aStdVideoH264SequenceParameterSetVui
structure that is interpreted as follows:-
reserved1
is used only for padding purposes and is otherwise ignored; -
if
flags.nal_hrd_parameters_present_flag
orflags.vcl_hrd_parameters_present_flag
is set, then theStdVideoH264HrdParameters
structure pointed to bypHrdParameters
is interpreted as follows:-
reserved1
is used only for padding purposes and is otherwise ignored; -
all other members of
StdVideoH264HrdParameters
are interpreted as defined in section E.2.2 of the ITU-T H.264 Specification;
-
-
all other members of
StdVideoH264SequenceParameterSetVui
are interpreted as defined in section E.2.1 of the ITU-T H.264 Specification;
-
-
all other members of
StdVideoH264SequenceParameterSet
are interpreted as defined in section 7.4.2.1 of the ITU-T H.264 Specification.
-
- H.264 Picture Parameter Sets (PPS)
-
Represented by
StdVideoH264PictureParameterSet
structures and interpreted as follows:-
the pair constructed from
seq_parameter_set_id
andpic_parameter_set_id
is used as the key of the PPS entry; -
if
flags.pic_scaling_matrix_present_flag
is set, then theStdVideoH264ScalingLists
structure pointed to bypScalingLists
is interpreted as follows:-
scaling_list_present_mask
is a bitmask where bit index i corresponds topic_scaling_list_present_flag[i]
as defined in section 7.4.2.2 of the ITU-T H.264 Specification; -
use_default_scaling_matrix_mask
is a bitmask where bit index i corresponds toUseDefaultScalingMatrix4x4Flag[i]
, when i < 6, or corresponds toUseDefaultScalingMatrix8x8Flag[i-6]
, otherwise, as defined in section 7.3.2.2 of the ITU-T H.264 Specification; -
ScalingList4x4
andScalingList8x8
correspond to the identically named syntax elements defined in section 7.3.2.2 of the ITU-T H.264 Specification;
-
-
all other members of
StdVideoH264PictureParameterSet
are interpreted as defined in section 7.4.2.2 of the ITU-T H.264 Specification.
-
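The bit layout of scaling_list_present_mask and use_default_scaling_matrix_mask described above can be illustrated with a short C sketch. `ScalingListRef`, `scaling_list_for_bit`, and `mask_bit_set` are hypothetical helpers. The mask is passed as a `uint32_t` for simplicity, which is an assumption about its width rather than a statement about the Video Std headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Identifies which flag array a mask bit refers to, and the index within it. */
typedef struct {
    bool     is8x8;  /* false: UseDefaultScalingMatrix4x4Flag, true: ...8x8Flag */
    uint32_t index;  /* i for 4x4 lists, i - 6 for 8x8 lists */
} ScalingListRef;

/* Map bit index i of use_default_scaling_matrix_mask to the flag it represents. */
static ScalingListRef scaling_list_for_bit(uint32_t i)
{
    ScalingListRef r;
    r.is8x8 = (i >= 6);
    r.index = r.is8x8 ? i - 6 : i;
    return r;
}

/* Test whether bit index i is set in a mask. */
static bool mask_bit_set(uint32_t mask, uint32_t i)
{
    return ((mask >> i) & 1u) != 0;
}
```

With this mapping, bit 5 refers to the last 4x4 list and bit 6 to the first 8x8 list.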
When a video session parameters object is created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, the VkVideoSessionParametersCreateInfoKHR::pNext chain must include a VkVideoDecodeH264SessionParametersCreateInfoKHR structure specifying the capacity and initial contents of the object.
The VkVideoDecodeH264SessionParametersCreateInfoKHR
structure is
defined as:
// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264SessionParametersCreateInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t maxStdSPSCount;
uint32_t maxStdPPSCount;
const VkVideoDecodeH264SessionParametersAddInfoKHR* pParametersAddInfo;
} VkVideoDecodeH264SessionParametersCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
maxStdSPSCount
is the maximum number of H.264 SPS entries the createdVkVideoSessionParametersKHR
can contain. -
maxStdPPSCount
is the maximum number of H.264 PPS entries the createdVkVideoSessionParametersKHR
can contain. -
pParametersAddInfo
isNULL
or a pointer to a VkVideoDecodeH264SessionParametersAddInfoKHR structure specifying H.264 parameters to add upon object creation.
The VkVideoDecodeH264SessionParametersAddInfoKHR
structure is defined
as:
// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264SessionParametersAddInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t stdSPSCount;
const StdVideoH264SequenceParameterSet* pStdSPSs;
uint32_t stdPPSCount;
const StdVideoH264PictureParameterSet* pStdPPSs;
} VkVideoDecodeH264SessionParametersAddInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
stdSPSCount
is the number of elements in thepStdSPSs
array. -
pStdSPSs
is a pointer to an array ofStdVideoH264SequenceParameterSet
structures describing the H.264 SPS entries to add. -
stdPPSCount
is the number of elements in thepStdPPSs
array. -
pStdPPSs
is a pointer to an array ofStdVideoH264PictureParameterSet
structures describing the H.264 PPS entries to add.
This structure can be specified in the following places:
-
In the
pParametersAddInfo
member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure specified in thepNext
chain of VkVideoSessionParametersCreateInfoKHR used to create a video session parameters object. In this case, if the video codec operation the video session parameters object is created with isVK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR
, then it defines the set of initial parameters to add to the created object (see Creating Video Session Parameters). -
In the
pNext
chain of VkVideoSessionParametersUpdateInfoKHR. In this case, if the video codec operation the video session parameters object to be updated was created with isVK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR
, then it defines the set of parameters to add to it (see Updating Video Session Parameters).
42.11.6. H.264 Decoding Parameters
The VkVideoDecodeH264PictureInfoKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264PictureInfoKHR {
VkStructureType sType;
const void* pNext;
const StdVideoDecodeH264PictureInfo* pStdPictureInfo;
uint32_t sliceCount;
const uint32_t* pSliceOffsets;
} VkVideoDecodeH264PictureInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
pStdPictureInfo
is a pointer to aStdVideoDecodeH264PictureInfo
structure specifying H.264 picture information. -
sliceCount
is the number of elements inpSliceOffsets
. -
pSliceOffsets
is a pointer to an array ofsliceCount
offsets specifying the start offset of the slices of the picture within the video bitstream buffer range specified in VkVideoDecodeInfoKHR.
This structure is specified in the pNext chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR to specify the codec-specific picture information for an H.264 decode operation.
- Decode Output Picture Information
-
When this structure is specified in the
pNext
chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR, the information related to the decode output picture is defined as follows:-
If
pStdPictureInfo->flags.field_pic_flag
is not set, then the picture represents a frame. -
If
pStdPictureInfo->flags.field_pic_flag
is set, then the picture represents a field. Specifically:-
If
pStdPictureInfo->flags.bottom_field_flag
is not set, then the picture represents the top field of the frame. -
If
pStdPictureInfo->flags.bottom_field_flag
is set, then the picture represents the bottom field of the frame.
-
-
The image subregion used is determined according to the H.264 Decode Picture Data Access section.
-
The decode output picture is associated with the H.264 picture information provided in
pStdPictureInfo
.
-
- Std Picture Information
-
The members of the StdVideoDecodeH264PictureInfo structure pointed to by pStdPictureInfo are interpreted as follows:
- reserved1 and reserved2 are used only for padding purposes and are otherwise ignored;
- flags.is_intra is interpreted as defined in section 3.73 of the ITU-T H.264 Specification;
- flags.is_reference is interpreted as defined in section 3.136 of the ITU-T H.264 Specification;
- flags.complementary_field_pair is interpreted as defined in section 3.35 of the ITU-T H.264 Specification;
- seq_parameter_set_id and pic_parameter_set_id are used to identify the active parameter sets, as described below;
- all other members are interpreted as defined in section 7.4.3 of the ITU-T H.264 Specification.
-
- Active Parameter Sets
-
The members of the
StdVideoDecodeH264PictureInfo
structure pointed to bypStdPictureInfo
are used to select the active parameter sets to use from the bound video session parameters object, as follows:-
The active SPS is the SPS identified by the key specified in
StdVideoDecodeH264PictureInfo
::seq_parameter_set_id
. -
The active PPS is the PPS identified by the key specified by the pair constructed from
StdVideoDecodeH264PictureInfo
::seq_parameter_set_id
andStdVideoDecodeH264PictureInfo
::pic_parameter_set_id
.
-
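The decode output picture classification and the active parameter set selection described above can be sketched as follows. This is a non-normative illustration: `Sps` and `Pps` are hypothetical, trimmed-down stand-ins that carry only the key members, and the linear search is not meant to suggest how an implementation stores parameter sets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins holding only the members used as keys. */
typedef struct { uint8_t seq_parameter_set_id; } Sps;
typedef struct { uint8_t seq_parameter_set_id; uint8_t pic_parameter_set_id; } Pps;

typedef enum { OUTPUT_FRAME, OUTPUT_TOP_FIELD, OUTPUT_BOTTOM_FIELD } OutputPicture;

/* field_pic_flag selects frame vs. field; bottom_field_flag selects which field. */
static OutputPicture classify_output(int field_pic_flag, int bottom_field_flag)
{
    if (!field_pic_flag) {
        return OUTPUT_FRAME;
    }
    return bottom_field_flag ? OUTPUT_BOTTOM_FIELD : OUTPUT_TOP_FIELD;
}

/* The SPS key is seq_parameter_set_id. Returns NULL if no entry matches. */
static const Sps *find_active_sps(const Sps *spss, size_t count, uint8_t spsId)
{
    for (size_t i = 0; i < count; ++i) {
        if (spss[i].seq_parameter_set_id == spsId) {
            return &spss[i];
        }
    }
    return NULL;
}

/* The PPS key is the pair (seq_parameter_set_id, pic_parameter_set_id). */
static const Pps *find_active_pps(const Pps *ppss, size_t count,
                                  uint8_t spsId, uint8_t ppsId)
{
    for (size_t i = 0; i < count; ++i) {
        if (ppss[i].seq_parameter_set_id == spsId &&
            ppss[i].pic_parameter_set_id == ppsId) {
            return &ppss[i];
        }
    }
    return NULL;
}
```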
The VkVideoDecodeH264DpbSlotInfoKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264DpbSlotInfoKHR {
VkStructureType sType;
const void* pNext;
const StdVideoDecodeH264ReferenceInfo* pStdReferenceInfo;
} VkVideoDecodeH264DpbSlotInfoKHR;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- pStdReferenceInfo is a pointer to a StdVideoDecodeH264ReferenceInfo structure specifying H.264 reference information.
This structure is specified in the pNext chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot, if not NULL, and the pNext chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots to specify the codec-specific reference picture information for an H.264 decode operation.
- Active Reference Picture Information
-
When this structure is specified in the
pNext
chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots
, one or two elements are added to the list of active reference pictures used by the video decode operation for each element of VkVideoDecodeInfoKHR::pReferenceSlots
as follows:-
If neither
pStdReferenceInfo->flags.top_field_flag
norpStdReferenceInfo->flags.bottom_field_flag
is set, then the picture is added as a frame reference to the list of active reference pictures. -
If
pStdReferenceInfo->flags.top_field_flag
is set, then the picture is added as a top field reference to the list of active reference pictures. -
If
pStdReferenceInfo->flags.bottom_field_flag
is set, then the picture is added as a bottom field reference to the list of active reference pictures. -
For each added reference picture, the corresponding image subregion used is determined according to the H.264 Decode Picture Data Access section.
-
Each added reference picture is associated with the DPB slot index specified in the
slotIndex
member of the corresponding element of VkVideoDecodeInfoKHR::pReferenceSlots
. -
Each added reference picture is associated with the H.264 reference information provided in
pStdReferenceInfo
.
-
Note
When both the top and bottom fields of an interlaced frame currently
associated with a DPB slot are intended to be used as active reference
pictures and both fields are stored in the same image subregion (which is the
case when using
|
- Reconstructed Picture Information
-
When this structure is specified in the
pNext
chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot
, the information related to the reconstructed picture is defined as follows:-
If neither
pStdReferenceInfo->flags.top_field_flag
norpStdReferenceInfo->flags.bottom_field_flag
is set, then the picture represents a frame. -
If
pStdReferenceInfo->flags.top_field_flag
is set, then the picture represents a field, specifically, the top field of the frame. -
If
pStdReferenceInfo->flags.bottom_field_flag
is set, then the picture represents a field, specifically, the bottom field of the frame. -
The image subregion used is determined according to the H.264 Decode Picture Data Access section.
-
The reconstructed picture is used to activate the DPB slot with the index specified in VkVideoDecodeInfoKHR::
pSetupReferenceSlot->slotIndex
. -
The reconstructed picture is associated with the H.264 reference information provided in
pStdReferenceInfo
.
-
- Std Reference Information
-
The members of the
StdVideoDecodeH264ReferenceInfo
structure pointed to bypStdReferenceInfo
are interpreted as follows:-
flags.top_field_flag
is used to indicate whether the reference is used as top field reference; -
flags.bottom_field_flag
is used to indicate whether the reference is used as bottom field reference; -
flags.used_for_long_term_reference
is used to indicate whether the picture is marked as “used for long-term reference” as defined in section 8.2.5.1 of the ITU-T H.264 Specification; -
flags.is_non_existing
is used to indicate whether the picture is marked as “non-existing” as defined in section 8.2.5.2 of the ITU-T H.264 Specification; -
all other members are interpreted as defined in section 8.2 of the ITU-T H.264 Specification.
-
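The way top_field_flag and bottom_field_flag determine which entries are added to the list of active reference pictures can be sketched as below. This is a non-normative illustration and `AddedReferences` and its helpers are hypothetical names.

```c
/* Which reference entries a single pReferenceSlots element contributes. */
typedef struct {
    int frame;        /* 1 if added as a frame reference */
    int topField;     /* 1 if added as a top field reference */
    int bottomField;  /* 1 if added as a bottom field reference */
} AddedReferences;

static AddedReferences added_references(int top_field_flag, int bottom_field_flag)
{
    AddedReferences r = { 0, 0, 0 };
    if (!top_field_flag && !bottom_field_flag) {
        r.frame = 1;
    }
    if (top_field_flag) {
        r.topField = 1;
    }
    if (bottom_field_flag) {
        r.bottomField = 1;
    }
    return r;
}

/* One or two elements are added per slot, depending on the flags. */
static int added_reference_count(AddedReferences r)
{
    return r.frame + r.topField + r.bottomField;
}
```

Setting both flags yields two entries (a top field and a bottom field reference) for the same DPB slot.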
42.11.7. H.264 Decode Requirements
This section describes the required H.264 decoding capabilities for physical devices that have at least one queue family that supports the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations.
Video Std Header Name | Version |
---|---|
|
1.0.0 |
42.12. H.265 Decode Operations
Video decode operations using an H.265 decode profile can be used to decode elementary video stream sequences compliant to the ITU-T H.265 Specification.
Note
Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.
This process is performed according to the video decode operation steps with the codec-specific semantics defined in section 8 of ITU-T H.265 Specification:
-
Syntax elements, derived values, and other parameters are applied from the following structures:
-
The
StdVideoH265VideoParameterSet
structure corresponding to the active VPS specifying the H.265 video parameter set. -
The
StdVideoH265SequenceParameterSet
structure corresponding to the active SPS specifying the H.265 sequence parameter set. -
The
StdVideoH265PictureParameterSet
structure corresponding to the active PPS specifying the H.265 picture parameter set. -
The
StdVideoDecodeH265PictureInfo
structure specifying the H.265 picture information. -
The
StdVideoDecodeH265ReferenceInfo
structures specifying the H.265 reference information corresponding to the optional reconstructed picture and any active reference pictures.
-
-
The contents of the provided video bitstream buffer range are interpreted as defined in the H.265 Decode Bitstream Data Access section.
-
Picture data in the video picture resources corresponding to the used active reference pictures, decode output picture, and optional reconstructed picture is accessed as defined in the H.265 Decode Picture Data Access section.
If the parameters and the bitstream adhere to the syntactic and semantic requirements defined in the corresponding sections of the ITU-T H.265 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video decode operation will complete successfully. Otherwise, the video decode operation may complete unsuccessfully.
42.12.1. H.265 Decode Bitstream Data Access
The video bitstream buffer range should contain a VCL NAL unit comprised of the slice segment headers and data of a picture representing a frame, as defined in sections 7.3.6 and 7.3.8, and this data is interpreted as defined in sections 7.4.7 and 7.4.9 of the ITU-T H.265 Specification, respectively.
The offsets provided in VkVideoDecodeH265PictureInfoKHR::pSliceSegmentOffsets should specify the starting offsets corresponding to each slice segment header within the video bitstream buffer range.
42.12.2. H.265 Decode Picture Data Access
Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile.
Accordingly, the complete image subregion of a decode output picture, reference picture, or reconstructed picture accessed by video coding operations using an H.265 decode profile is defined as the set of texels within the coordinate range:
- ([0, endX), [0, endY))
Where:
- endX equals codedExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;
- endY equals codedExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure.
Where codedExtent is the member of the VkVideoPictureResourceInfoKHR structure corresponding to the picture.
In case of video decode operations using an H.265 decode profile, any access to a picture at the coordinates (x, y), as defined by the ITU-T H.265 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates (x, y).
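For H.265 the accessed range always starts at the origin, so only the end coordinates need to be computed. The following C sketch is non-normative; `H265AccessRange` and `h265_accessed_range` are hypothetical names, and a non-zero granularity is assumed.

```c
#include <stdint.h>

/* End coordinates of the half-open range ([0, endX), [0, endY)). */
typedef struct { uint32_t endX, endY; } H265AccessRange;

static uint32_t round_up_multiple(uint32_t v, uint32_t multiple)
{
    return ((v + multiple - 1) / multiple) * multiple;
}

/* coded*:  codedExtent of the picture resource.
 * gran*:   pictureAccessGranularity (assumed non-zero).
 * subres*: size of the image subresource used for clamping. */
static H265AccessRange h265_accessed_range(uint32_t codedW, uint32_t codedH,
                                           uint32_t granW, uint32_t granH,
                                           uint32_t subresW, uint32_t subresH)
{
    H265AccessRange r;
    uint32_t upX = round_up_multiple(codedW, granW);
    uint32_t upY = round_up_multiple(codedH, granH);
    r.endX = upX < subresW ? upX : subresW;
    r.endY = upY < subresH ? upY : subresH;
    return r;
}
```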
42.12.3. H.265 Decode Profile
A video profile supporting H.265 video decode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR and adding a VkVideoDecodeH265ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.
The VkVideoDecodeH265ProfileInfoKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265ProfileInfoKHR {
VkStructureType sType;
const void* pNext;
StdVideoH265ProfileIdc stdProfileIdc;
} VkVideoDecodeH265ProfileInfoKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
stdProfileIdc
is aStdVideoH265ProfileIdc
value specifying the H.265 codec profile IDC, as defined in section A.3 of the ITU-T H.265 Specification.
42.12.4. H.265 Decode Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an H.265 decode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoDecodeH265CapabilitiesKHR structure that will be filled with the profile-specific capabilities.
The VkVideoDecodeH265CapabilitiesKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265CapabilitiesKHR {
VkStructureType sType;
void* pNext;
StdVideoH265LevelIdc maxLevelIdc;
} VkVideoDecodeH265CapabilitiesKHR;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
maxLevelIdc
is aStdVideoH265LevelIdc
value specifying the maximum H.265 level supported by the profile, as defined in section A.4 of the ITU-T H.265 Specification.
42.12.5. H.265 Decode Parameter Sets
Video session parameters objects created with
the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
can contain the following types of parameters:
- H.265 Video Parameter Sets (VPS)
-
Represented by
StdVideoH265VideoParameterSet
structures and interpreted as follows:-
reserved1
,reserved2
, andreserved3
are used only for padding purposes and are otherwise ignored; -
vps_video_parameter_set_id
is used as the key of the VPS entry; -
the
max_latency_increase_plus1
,max_dec_pic_buffering_minus1
, andmax_num_reorder_pics
members of theStdVideoH265DecPicBufMgr
structure pointed to bypDecPicBufMgr
correspond tovps_max_latency_increase_plus1
,vps_max_dec_pic_buffering_minus1
, andvps_max_num_reorder_pics
, respectively, as defined in section 7.4.3.1 of the ITU-T H.265 Specification; -
the
StdVideoH265HrdParameters
structure pointed to bypHrdParameters
is interpreted as follows:-
reserved
is used only for padding purposes and is otherwise ignored; -
flags.fixed_pic_rate_general_flag
is a bitmask where bit index i corresponds tofixed_pic_rate_general_flag[i]
as defined in section E.3.2 of the ITU-T H.265 Specification; -
flags.fixed_pic_rate_within_cvs_flag
is a bitmask where bit index i corresponds tofixed_pic_rate_within_cvs_flag[i]
as defined in section E.3.2 of the ITU-T H.265 Specification; -
flags.low_delay_hrd_flag
is a bitmask where bit index i corresponds tolow_delay_hrd_flag[i]
as defined in section E.3.2 of the ITU-T H.265 Specification; -
if
flags.nal_hrd_parameters_present_flag
is set, thenpSubLayerHrdParametersNal
points to an array ofvps_max_sub_layers_minus1
+ 1 number ofStdVideoH265SubLayerHrdParameters
structures wherevps_max_sub_layers_minus1
is the corresponding member of the encompassingStdVideoH265VideoParameterSet
structure and each element is interpreted as follows:-
cbr_flag
is a bitmask where bit index i corresponds tocbr_flag[i]
as defined in section E.3.3 of the ITU-T H.265 Specification; -
all other members of the
StdVideoH265SubLayerHrdParameters
structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;
-
-
if
flags.vcl_hrd_parameters_present_flag
is set, thenpSubLayerHrdParametersVcl
points to an array ofvps_max_sub_layers_minus1
+ 1 number ofStdVideoH265SubLayerHrdParameters
structures wherevps_max_sub_layers_minus1
is the corresponding member of the encompassingStdVideoH265VideoParameterSet
structure and each element is interpreted as follows:-
cbr_flag
is a bitmask where bit index i corresponds tocbr_flag[i]
as defined in section E.3.3 of the ITU-T H.265 Specification; -
all other members of the
StdVideoH265SubLayerHrdParameters
structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;
-
-
all other members of
StdVideoH265HrdParameters
are interpreted as defined in section E.3.2 of the ITU-T H.265 Specification;
-
-
the members of the
StdVideoH265ProfileTierLevel
structure pointed to bypProfileTierLevel
are interpreted as defined in section 7.4.4 of the ITU-T H.265 Specification; -
all other members of
StdVideoH265VideoParameterSet
are interpreted as defined in section 7.4.3.1 of the ITU-T H.265 Specification.
-
- H.265 Sequence Parameter Sets (SPS)
-
Represented by
StdVideoH265SequenceParameterSet
structures and interpreted as follows:-
reserved1
andreserved2
are used only for padding purposes and are otherwise ignored; -
the pair constructed from
sps_video_parameter_set_id
andsps_seq_parameter_set_id
is used as the key of the SPS entry; -
the members of the
StdVideoH265ProfileTierLevel
structure pointed to bypProfileTierLevel
are interpreted as defined in section 7.4.4 of the ITU-T H.265 Specification; -
the
max_latency_increase_plus1
,max_dec_pic_buffering_minus1
, andmax_num_reorder_pics
members of theStdVideoH265DecPicBufMgr
structure pointed to bypDecPicBufMgr
correspond tosps_max_latency_increase_plus1
,sps_max_dec_pic_buffering_minus1
, andsps_max_num_reorder_pics
, respectively, as defined in section 7.4.3.2 of the ITU-T H.265 Specification; -
if
flags.sps_scaling_list_data_present_flag
is set, then theStdVideoH265ScalingLists
structure pointed to bypScalingLists
is interpreted as follows:-
ScalingList4x4
,ScalingList8x8
,ScalingList16x16
, andScalingList32x32
correspond toScalingList[0]
,ScalingList[1]
,ScalingList[2]
, andScalingList[3]
, respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification; -
ScalingListDCCoef16x16
andScalingListDCCoef32x32
correspond toscaling_list_dc_coef_minus8[0]
andscaling_list_dc_coef_minus8[1]
, respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;
-
-
pShortTermRefPicSet
points to an array ofnum_short_term_ref_pic_sets
number ofStdVideoH265ShortTermRefPicSet
structures where each element is interpreted as follows:-
reserved1
,reserved2
, andreserved3
are used only for padding purposes and are otherwise ignored; -
used_by_curr_pic_flag
is a bitmask where bit index i corresponds toused_by_curr_pic_flag[i]
as defined in section 7.4.8 of the ITU-T H.265 Specification; -
use_delta_flag
is a bitmask where bit index i corresponds touse_delta_flag[i]
as defined in section 7.4.8 of the ITU-T H.265 Specification; -
used_by_curr_pic_s0_flag
is a bitmask where bit index i corresponds toused_by_curr_pic_s0_flag[i]
as defined in section 7.4.8 of the ITU-T H.265 Specification; -
used_by_curr_pic_s1_flag
is a bitmask where bit index i corresponds toused_by_curr_pic_s1_flag[i]
as defined in section 7.4.8 of the ITU-T H.265 Specification; -
all other members of
StdVideoH265ShortTermRefPicSet
are interpreted as defined in section 7.4.8 of the ITU-T H.265 Specification;
-
-
if
flags.long_term_ref_pics_present_flag
is set then theStdVideoH265LongTermRefPicsSps
structure pointed to bypLongTermRefPicsSps
is interpreted as follows:-
used_by_curr_pic_lt_sps_flag
is a bitmask where bit index i corresponds toused_by_curr_pic_lt_sps_flag[i]
as defined in section 7.4.3.2 of the ITU-T H.265 Specification; -
all other members of
StdVideoH265LongTermRefPicsSps
are interpreted as defined in section 7.4.3.2 of the ITU-T H.265 Specification;
-
-
if
flags.vui_parameters_present_flag
is set, then theStdVideoH265SequenceParameterSetVui
structure pointed to bypSequenceParameterSetVui
is interpreted as follows:-
reserved1
,reserved2
, andreserved3
are used only for padding purposes and are otherwise ignored; -
the
StdVideoH265HrdParameters
structure pointed to by pHrdParameters
is interpreted as follows:-
flags.fixed_pic_rate_general_flag
is a bitmask where bit index i corresponds to fixed_pic_rate_general_flag[i]
as defined in section E.3.2 of the ITU-T H.265 Specification; -
flags.fixed_pic_rate_within_cvs_flag
is a bitmask where bit index i corresponds to fixed_pic_rate_within_cvs_flag[i]
as defined in section E.3.2 of the ITU-T H.265 Specification; -
flags.low_delay_hrd_flag
is a bitmask where bit index i corresponds to low_delay_hrd_flag[i]
as defined in section E.3.2 of the ITU-T H.265 Specification; -
if
flags.nal_hrd_parameters_present_flag
is set, then pSubLayerHrdParametersNal
points to an array of sps_max_sub_layers_minus1
+ 1 number of StdVideoH265SubLayerHrdParameters
structures where sps_max_sub_layers_minus1
is the corresponding member of the encompassing StdVideoH265SequenceParameterSet
structure and each element is interpreted as follows:-
cbr_flag
is a bitmask where bit index i corresponds to cbr_flag[i]
as defined in section E.3.3 of the ITU-T H.265 Specification; -
all other members of the
StdVideoH265SubLayerHrdParameters
structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;
-
-
if
flags.vcl_hrd_parameters_present_flag
is set, then pSubLayerHrdParametersVcl
points to an array of sps_max_sub_layers_minus1
+ 1 number of StdVideoH265SubLayerHrdParameters
structures where sps_max_sub_layers_minus1
is the corresponding member of the encompassing StdVideoH265SequenceParameterSet
structure and each element is interpreted as follows:-
cbr_flag
is a bitmask where bit index i corresponds to cbr_flag[i]
as defined in section E.3.3 of the ITU-T H.265 Specification; -
all other members of the
StdVideoH265SubLayerHrdParameters
structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;
-
-
all other members of
StdVideoH265HrdParameters
are interpreted as defined in section E.3.2 of the ITU-T H.265 Specification;
-
-
all other members of
pSequenceParameterSetVui
are interpreted as defined in section E.3.1 of the ITU-T H.265 Specification;
-
-
if
flags.sps_palette_predictor_initializer_present_flag
is set, then the PredictorPaletteEntries
member of the StdVideoH265PredictorPaletteEntries
structure pointed to by pPredictorPaletteEntries
is interpreted as defined in section 7.4.9.13 of the ITU-T H.265 Specification; -
all other members of
StdVideoH265SequenceParameterSet
are interpreted as defined in section 7.4.3.1 of the ITU-T H.265 Specification.
-
- H.265 Picture Parameter Sets (PPS)
-
Represented by
StdVideoH265PictureParameterSet
structures and interpreted as follows:-
reserved1
, reserved2
, and reserved3
are used only for padding purposes and are otherwise ignored; -
the triplet constructed from
sps_video_parameter_set_id
, pps_seq_parameter_set_id
, and pps_pic_parameter_set_id
is used as the key of the PPS entry; -
if
flags.pps_scaling_list_data_present_flag
is set, then the StdVideoH265ScalingLists
structure pointed to by pScalingLists
is interpreted as follows:-
ScalingList4x4
, ScalingList8x8
, ScalingList16x16
, and ScalingList32x32
correspond to ScalingList[0]
, ScalingList[1]
, ScalingList[2]
, and ScalingList[3]
, respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification; -
ScalingListDCCoef16x16
and ScalingListDCCoef32x32
correspond to scaling_list_dc_coef_minus8[0]
and scaling_list_dc_coef_minus8[1]
, respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;
-
-
if
flags.pps_palette_predictor_initializer_present_flag
is set, then the PredictorPaletteEntries
member of the StdVideoH265PredictorPaletteEntries
structure pointed to by pPredictorPaletteEntries
is interpreted as defined in section 7.4.9.13 of the ITU-T H.265 Specification; -
all other members of
StdVideoH265PictureParameterSet
are interpreted as defined in section 7.4.3.3 of the ITU-T H.265 Specification.
-
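Several of the members described above (for example used_by_curr_pic_flag, use_delta_flag, and cbr_flag) are bitmasks in which bit index i carries the value of the per-index syntax element. The following is a minimal sketch of that convention. The helper names pack_flags_u16 and flag_at are hypothetical, and the 16-bit width is an assumption of this sketch rather than something stated in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Pack up to 16 per-index syntax flags into a bitmask: bit i <- flags[i].
 * The 16-bit width is an assumption made for this illustration. */
static uint16_t pack_flags_u16(const uint8_t *flags, unsigned count)
{
    assert(count <= 16);
    uint16_t mask = 0;
    for (unsigned i = 0; i < count; ++i) {
        if (flags[i]) {
            mask |= (uint16_t)(1u << i);
        }
    }
    return mask;
}

/* Read the flag with index i back out of the bitmask. */
static int flag_at(uint16_t mask, unsigned i)
{
    assert(i < 16);
    return (int)((mask >> i) & 1u);
}
```

With per-index flags {1, 0, 1, 1}, bits 0, 2, and 3 are set, so the packed value is 13.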
When a video session parameters object is
created with the codec operation
VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
, the
VkVideoSessionParametersCreateInfoKHR::pNext
chain must include
a VkVideoDecodeH265SessionParametersCreateInfoKHR
structure specifying
the capacity and initial contents of the object.
The VkVideoDecodeH265SessionParametersCreateInfoKHR
structure is
defined as:
// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265SessionParametersCreateInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t maxStdVPSCount;
uint32_t maxStdSPSCount;
uint32_t maxStdPPSCount;
const VkVideoDecodeH265SessionParametersAddInfoKHR* pParametersAddInfo;
} VkVideoDecodeH265SessionParametersCreateInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
maxStdVPSCount
is the maximum number of H.265 VPS entries the created VkVideoSessionParametersKHR
can contain. -
maxStdSPSCount
is the maximum number of H.265 SPS entries the created VkVideoSessionParametersKHR
can contain. -
maxStdPPSCount
is the maximum number of H.265 PPS entries the created VkVideoSessionParametersKHR
can contain. -
pParametersAddInfo
is NULL
or a pointer to a VkVideoDecodeH265SessionParametersAddInfoKHR structure specifying H.265 parameters to add upon object creation.
The VkVideoDecodeH265SessionParametersAddInfoKHR
structure is defined
as:
// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265SessionParametersAddInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t stdVPSCount;
const StdVideoH265VideoParameterSet* pStdVPSs;
uint32_t stdSPSCount;
const StdVideoH265SequenceParameterSet* pStdSPSs;
uint32_t stdPPSCount;
const StdVideoH265PictureParameterSet* pStdPPSs;
} VkVideoDecodeH265SessionParametersAddInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
stdVPSCount
is the number of elements in the pStdVPSs
array. -
pStdVPSs
is a pointer to an array of StdVideoH265VideoParameterSet
structures describing the H.265 VPS entries to add. -
stdSPSCount
is the number of elements in the pStdSPSs
array. -
pStdSPSs
is a pointer to an array of StdVideoH265SequenceParameterSet
structures describing the H.265 SPS entries to add. -
stdPPSCount
is the number of elements in the pStdPPSs
array. -
pStdPPSs
is a pointer to an array of StdVideoH265PictureParameterSet
structures describing the H.265 PPS entries to add.
This structure can be specified in the following places:
-
In the
pParametersAddInfo
member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure specified in the pNext
chain of VkVideoSessionParametersCreateInfoKHR used to create a video session parameters object. In this case, if the video codec operation the video session parameters object is created with is VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
, then it defines the set of initial parameters to add to the created object (see Creating Video Session Parameters). -
In the
pNext
chain of VkVideoSessionParametersUpdateInfoKHR. In this case, if the video codec operation the video session parameters object to be updated was created with is VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
, then it defines the set of parameters to add to it (see Updating Video Session Parameters).
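Because maxStdVPSCount, maxStdSPSCount, and maxStdPPSCount bound how many entries the object can contain, an application may want to check a batch of additions against the remaining capacity before submitting it. The sketch below illustrates one such check. The types H265ParamCapacity and H265ParamCounts and the function add_fits are hypothetical mirrors of the relevant fields, not part of the API, and the sketch assumes that every added entry uses a key not already stored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of maxStdVPSCount / maxStdSPSCount / maxStdPPSCount. */
typedef struct { uint32_t max_vps, max_sps, max_pps; } H265ParamCapacity;

/* Hypothetical mirror of stdVPSCount / stdSPSCount / stdPPSCount. */
typedef struct { uint32_t vps, sps, pps; } H265ParamCounts;

/* Returns true if adding `add` entries to an object that already stores
 * `stored` entries stays within `cap`. Assumes all added keys are new. */
static bool add_fits(const H265ParamCapacity *cap,
                     const H265ParamCounts *stored,
                     const H265ParamCounts *add)
{
    /* The stored counts can never exceed the capacity of the object. */
    assert(stored->vps <= cap->max_vps);
    assert(stored->sps <= cap->max_sps);
    assert(stored->pps <= cap->max_pps);

    /* Compare against the remaining room to avoid unsigned overflow. */
    return add->vps <= cap->max_vps - stored->vps &&
           add->sps <= cap->max_sps - stored->sps &&
           add->pps <= cap->max_pps - stored->pps;
}
```

For creation-time additions the stored counts are all zero, so the check reduces to comparing each count in the add info with the corresponding maximum.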
42.12.6. H.265 Decoding Parameters
The VkVideoDecodeH265PictureInfoKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265PictureInfoKHR {
VkStructureType sType;
const void* pNext;
StdVideoDecodeH265PictureInfo* pStdPictureInfo;
uint32_t sliceSegmentCount;
const uint32_t* pSliceSegmentOffsets;
} VkVideoDecodeH265PictureInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
pStdPictureInfo
is a pointer to a StdVideoDecodeH265PictureInfo
structure specifying H.265 picture information. -
sliceSegmentCount
is the number of elements in pSliceSegmentOffsets
. -
pSliceSegmentOffsets
is a pointer to an array of sliceSegmentCount
offsets specifying the start offset of the slice segments of the picture within the video bitstream buffer range specified in VkVideoDecodeInfoKHR.
This structure is specified in the pNext
chain of the
VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR to
specify the codec-specific picture information for an H.265
decode operation.
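As an illustration of how pSliceSegmentOffsets relates to the bitstream buffer range, the sketch below sanity-checks an offset array and derives the size of each slice segment. The function names are hypothetical. The sketch assumes that offsets are strictly increasing and that segments are stored back to back up to the end of the range; neither assumption is stated in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if every offset lies inside a bitstream range of range_size
 * bytes and the offsets are strictly increasing (assumption of this sketch). */
static bool slice_offsets_valid(const uint32_t *offsets, uint32_t count,
                                uint64_t range_size)
{
    for (uint32_t i = 0; i < count; ++i) {
        if (offsets[i] >= range_size) {
            return false;
        }
        if (i > 0 && offsets[i] <= offsets[i - 1]) {
            return false;
        }
    }
    return true;
}

/* Size in bytes of slice segment i, assuming segments are contiguous and the
 * last one ends at the end of the range. */
static uint64_t slice_segment_size(const uint32_t *offsets, uint32_t count,
                                   uint64_t range_size, uint32_t i)
{
    assert(i < count);
    uint64_t end = (i + 1 < count) ? offsets[i + 1] : range_size;
    return end - offsets[i];
}
```

For offsets {0, 1200, 3000} in a 4096-byte range, the second segment is 1800 bytes long and the last one is 1096 bytes long.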
- Decode Output Picture Information
-
When this structure is specified in the
pNext
chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR, the information related to the decode output picture is defined as follows:-
The image subregion used is determined according to the H.265 Decode Picture Data Access section.
-
The decode output picture is associated with the H.265 picture information provided in
pStdPictureInfo
.
-
- Std Picture Information
-
The members of the
StdVideoDecodeH265PictureInfo
structure pointed to by pStdPictureInfo
are interpreted as follows:-
reserved
is used only for padding purposes and is otherwise ignored; -
flags.IrapPicFlag
as defined in section 3.73 of the ITU-T H.265 Specification; -
flags.IdrPicFlag
as defined in section 3.67 of the ITU-T H.265 Specification; -
flags.IsReference
as defined in section 3.132 of the ITU-T H.265 Specification; -
sps_video_parameter_set_id
, pps_seq_parameter_set_id
, and pps_pic_parameter_set_id
are used to identify the active parameter sets, as described below; -
PicOrderCntVal
as defined in section 8.3.1 of the ITU-T H.265 Specification; -
NumBitsForSTRefPicSetInSlice
is the number of bits used in st_ref_pic_set
when short_term_ref_pic_set_sps_flag
is 0
, or 0
otherwise, as defined in sections 7.4.7 and 7.4.8 of the ITU-T H.265 Specification; -
NumDeltaPocsOfRefRpsIdx
is the value of NumDeltaPocs[RefRpsIdx]
when short_term_ref_pic_set_sps_flag
is 1
, or 0
otherwise, as defined in sections 7.4.7 and 7.4.8 of the ITU-T H.265 Specification; -
RefPicSetStCurrBefore
, RefPicSetStCurrAfter
, and RefPicSetLtCurr
are interpreted as defined in section 8.3.2 of the ITU-T H.265 Specification where each element of these arrays either identifies an active reference picture using its DPB slot index or contains the value 0xFF to indicate "no reference picture"; -
all other members are interpreted as defined in section 8.3.2 of the ITU-T H.265 Specification.
-
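The RefPicSetStCurrBefore, RefPicSetStCurrAfter, and RefPicSetLtCurr arrays described above mix DPB slot indices with the value 0xFF meaning "no reference picture". The following sketch shows one way to extract only the active slot indices from such an array. The macro and function names are hypothetical, and the array length is passed in by the caller rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The "no reference picture" marker named in the text above. */
#define NO_REFERENCE_PICTURE 0xFFu

/* Copies the DPB slot indices of all active entries of ref_pic_set into
 * out_slots (which must hold at least len elements) and returns how many
 * were copied. Entries equal to 0xFF are skipped. */
static size_t collect_active_slots(const uint8_t *ref_pic_set, size_t len,
                                   uint8_t *out_slots)
{
    assert(ref_pic_set != NULL && out_slots != NULL);
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) {
        if (ref_pic_set[i] != NO_REFERENCE_PICTURE) {
            out_slots[n++] = ref_pic_set[i];
        }
    }
    return n;
}
```

An array such as {2, 0xFF, 5, 0xFF} yields two active references, using DPB slots 2 and 5.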
- Active Parameter Sets
-
The members of the
StdVideoDecodeH265PictureInfo
structure pointed to by pStdPictureInfo
are used to select the active parameter sets to use from the bound video session parameters object, as follows:-
The active VPS is the VPS identified by the key specified in
StdVideoDecodeH265PictureInfo
::sps_video_parameter_set_id
. -
The active SPS is the SPS identified by the key specified by the pair constructed from
StdVideoDecodeH265PictureInfo
::sps_video_parameter_set_id
and StdVideoDecodeH265PictureInfo
::pps_seq_parameter_set_id
. -
The active PPS is the PPS identified by the key specified by the triplet constructed from
StdVideoDecodeH265PictureInfo
::sps_video_parameter_set_id
, StdVideoDecodeH265PictureInfo
::pps_seq_parameter_set_id
, and StdVideoDecodeH265PictureInfo
::pps_pic_parameter_set_id
.
-
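The key-based selection described above can be pictured as a lookup in a table of stored entries. The sketch below models only the PPS case, where the key is the triplet of VPS, SPS, and PPS identifiers. The PpsKey type and the find_pps function are hypothetical and say nothing about how an implementation actually stores parameter sets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key of a stored PPS entry: the triplet described above. */
typedef struct {
    uint8_t vps_id; /* sps_video_parameter_set_id */
    uint8_t sps_id; /* pps_seq_parameter_set_id   */
    uint8_t pps_id; /* pps_pic_parameter_set_id   */
} PpsKey;

/* Returns the index of the entry whose key matches all three identifiers,
 * or -1 if no such entry is stored. */
static int find_pps(const PpsKey *entries, size_t count,
                    uint8_t vps_id, uint8_t sps_id, uint8_t pps_id)
{
    assert(entries != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (entries[i].vps_id == vps_id &&
            entries[i].sps_id == sps_id &&
            entries[i].pps_id == pps_id) {
            return (int)i;
        }
    }
    return -1;
}
```

The same pattern applies to the SPS (keyed by the first two identifiers) and the VPS (keyed by the first identifier alone).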
The VkVideoDecodeH265DpbSlotInfoKHR
structure is defined as:
// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265DpbSlotInfoKHR {
VkStructureType sType;
const void* pNext;
const StdVideoDecodeH265ReferenceInfo* pStdReferenceInfo;
} VkVideoDecodeH265DpbSlotInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
pStdReferenceInfo
is a pointer to a StdVideoDecodeH265ReferenceInfo
structure specifying reference picture information described in section 8.3 of the ITU-T H.265 Specification.
This structure is specified in the pNext
chain of
VkVideoDecodeInfoKHR::pSetupReferenceSlot
, if not NULL
, and
the pNext
chain of the elements of
VkVideoDecodeInfoKHR::pReferenceSlots
to specify the
codec-specific reference picture information for an H.265
decode operation.
- Active Reference Picture Information
-
When this structure is specified in the
pNext
chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots
, one element is added to the list of active reference pictures used by the video decode operation for each element of VkVideoDecodeInfoKHR::pReferenceSlots
as follows:-
The image subregion used is determined according to the H.265 Decode Picture Data Access section.
-
The reference picture is associated with the DPB slot index specified in the
slotIndex
member of the corresponding element of VkVideoDecodeInfoKHR::pReferenceSlots
. -
The reference picture is associated with the H.265 reference information provided in
pStdReferenceInfo
.
-
- Reconstructed Picture Information
-
When this structure is specified in the
pNext
chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot
, the information related to the reconstructed picture is defined as follows:-
The image subregion used is determined according to the H.265 Decode Picture Data Access section.
-
The reconstructed picture is used to activate the DPB slot with the index specified in VkVideoDecodeInfoKHR::
pSetupReferenceSlot->slotIndex
. -
The reconstructed picture is associated with the H.265 reference information provided in
pStdReferenceInfo
.
-
- Std Reference Information
-
The members of the
StdVideoDecodeH265ReferenceInfo
structure pointed to by pStdReferenceInfo
are interpreted as follows:-
flags.used_for_long_term_reference
is used to indicate whether the picture is marked as “used for long-term reference” as defined in section 8.3.2 of the ITU-T H.265 Specification; -
flags.unused_for_reference
is used to indicate whether the picture is marked as “unused for reference” as defined in section 8.3.2 of the ITU-T H.265 Specification; -
all other members are interpreted as defined in section 8.3 of the ITU-T H.265 Specification.
-
42.12.7. H.265 Decode Requirements
This section describes the required H.265 decoding capabilities for
physical devices that have at least one queue family that supports the video
codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR
, as
returned by vkGetPhysicalDeviceQueueFamilyProperties2 in
VkQueueFamilyVideoPropertiesKHR::videoCodecOperations
.
Video Std Header Name | Version |
---|---|
|
1.0.0 |
42.13. Video Encode Operations
Before the application can start recording Vulkan command buffers for Video Encode Operations, it must do the following:
-
Ensure that the implementation can encode the Video Content by querying the supported codec operations and profiles using vkGetPhysicalDeviceQueueFamilyProperties2.
-
By using vkGetPhysicalDeviceVideoFormatPropertiesKHR and providing one or more video profiles, choose the Vulkan formats supported by the implementation. The formats for input and reference pictures must be queried and chosen separately. Refer to the section on Video Format Capabilities.
-
Before creating an image to be used as a video picture resource, obtain the supported image creation parameters by querying with vkGetPhysicalDeviceFormatProperties2 and vkGetPhysicalDeviceImageFormatProperties2 using one of the reported formats and adding VkVideoProfileListInfoKHR to the
pNext
chain of VkFormatProperties2. When querying the parameters with vkGetPhysicalDeviceImageFormatProperties2 for images targeting input and reference (DPB) pictures, the VkPhysicalDeviceImageFormatInfo2::usage
field should contain VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR
and VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR
, respectively. -
Create none, some, or all of the required images for the input and reference pictures. More Video Picture Resources can be created at some later point if needed while processing the content to be encoded. Also, if the size of the picture to be encoded is expected to change, the images can be created based on the maximum expected content size.
-
Create the video session to be used for video encode operations. Before creating the Encode Video Session, the encode capabilities should be queried with vkGetPhysicalDeviceVideoCapabilitiesKHR to obtain the limits of the parameters allowed by the implementation for a particular codec profile.
-
Bind memory resources with the encode video session by calling vkBindVideoSessionMemoryKHR. The video session cannot be used until memory resources are allocated and bound to it. In order to determine the required memory sizes and heap types of the device memory allocations, vkGetVideoSessionMemoryRequirementsKHR should be called.
-
Create one or more Video Session Parameter objects for use across command buffer recording operations, if required by the codec extension in use. These objects must be created against a video session with the parameters required by the codec. Each Video Session Parameter object created is a child object of the associated Session object and cannot be bound in the command buffer with any other Session Object.
The recording of Video Encode Commands against a Vulkan Command Buffer consists of the following sequence:
-
vkCmdBeginVideoCodingKHR starts the recording of one or more Video Encode operations in the command buffer. For each Video Encode Command operation, a Video Session must be bound to the command buffer within this command. This command establishes a Vulkan Video Encode Context that consists of the bound Video Session Object, Session Parameters Object, and the required Video Picture Resources. The established Video Encode Context is in effect until the vkCmdEndVideoCodingKHR command is recorded. If more Video Encode operations are required after the vkCmdEndVideoCodingKHR command, another Video Encode Context can be started with the vkCmdBeginVideoCodingKHR command.
-
vkCmdEncodeVideoKHR specifies one or more frames to be encoded. The VkVideoEncodeInfoKHR parameters, and the codec extension structures chained to this, specify the details of the encode operation.
-
vkCmdControlVideoCodingKHR records operations against the encoded data, encoding device, or the Video Session state.
-
vkCmdEndVideoCodingKHR signals the end of the recording of the Vulkan Video Encode Context, as established by vkCmdBeginVideoCodingKHR.
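The ordering of the four commands above can be modeled as a small state machine: encode and control commands are only valid between a begin and its matching end. The sketch below checks a recorded sequence against that rule. The VideoCmd enumeration and the sequence_is_well_formed function are hypothetical names used only for this illustration, and the model ignores every other command that may appear in a command buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tags for the four commands listed above. */
typedef enum {
    CMD_BEGIN_VIDEO_CODING,   /* vkCmdBeginVideoCodingKHR   */
    CMD_ENCODE_VIDEO,         /* vkCmdEncodeVideoKHR        */
    CMD_CONTROL_VIDEO_CODING, /* vkCmdControlVideoCodingKHR */
    CMD_END_VIDEO_CODING      /* vkCmdEndVideoCodingKHR     */
} VideoCmd;

/* Returns true if every encode/control command sits inside a begin/end pair
 * and every begin is closed by an end. */
static bool sequence_is_well_formed(const VideoCmd *cmds, size_t count)
{
    assert(cmds != NULL || count == 0);
    bool in_context = false;
    for (size_t i = 0; i < count; ++i) {
        switch (cmds[i]) {
        case CMD_BEGIN_VIDEO_CODING:
            if (in_context) return false; /* contexts do not nest */
            in_context = true;
            break;
        case CMD_ENCODE_VIDEO:
        case CMD_CONTROL_VIDEO_CODING:
            if (!in_context) return false;
            break;
        case CMD_END_VIDEO_CODING:
            if (!in_context) return false;
            in_context = false;
            break;
        }
    }
    return !in_context;
}
```

A sequence of begin, control, encode, encode, end is accepted, while an encode command recorded before any begin is rejected.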
In addition to the above, the following commands can be recorded between vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR:
-
Query operations
-
Global Memory Barriers
-
Buffer Memory Barriers
-
Image Memory Barriers (these must be used to transition the Video Picture Resources to the proper
VK_IMAGE_LAYOUT_VIDEO_ENCODE_SRC_KHR
and VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR
layouts). -
Pipeline Barriers
-
Events
-
Timestamps
-
Device Groups (device mask)
The following Video Encode related commands must be recorded outside the Vulkan Video Encode Context established with the vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR commands:
-
Sparse Memory Binding
-
Copy Commands
-
Clear Commands
42.13.1. Encode Input Picture
The primary source of input pixels for the video encoding process is the Encode Input Picture, represented by a VkImageView. It may also be a direct target of video decode, graphics, or compute operations, or be used with Window System Integration APIs.
42.13.2. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with
pVideoProfile->videoCodecOperation
specified as one of the encode
operation bits, the VkVideoEncodeCapabilitiesKHR structure must be
included in the pNext
chain of the VkVideoCapabilitiesKHR
structure to retrieve capabilities specific to video encoding.
The VkVideoEncodeCapabilitiesKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeCapabilitiesKHR {
VkStructureType sType;
void* pNext;
VkVideoEncodeCapabilityFlagsKHR flags;
VkVideoEncodeRateControlModeFlagsKHR rateControlModes;
uint8_t rateControlLayerCount;
uint8_t qualityLevelCount;
VkExtent2D inputImageDataFillAlignment;
} VkVideoEncodeCapabilitiesKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
flags
is a bitmask of VkVideoEncodeCapabilityFlagBitsKHR describing supported encoding features. -
rateControlModes
is a bitmask of VkVideoEncodeRateControlModeFlagBitsKHR describing supported rate control modes. All implementations must support VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. -
rateControlLayerCount
reports the maximum number of rate control layers supported. Implementations must report at least 1. -
qualityLevelCount
is the number of discrete quality levels supported. Implementations must report at least 1. -
inputImageDataFillAlignment
reports the alignment, in pixels, of the data that should be filled in the input image horizontally and vertically before encode operations are performed on the input image.
The input content and encode resolution (specified in
VkVideoEncodeInfoKHR::codedExtent
) may not be aligned with the
codec-specific coding block size.
For example, the input content may be 1920x1080 and the coding block size
may be 16x16 pixel blocks.
In this example, the content is horizontally aligned with the coding block
size, but not vertically aligned with the coding block size.
Encoding of the last row of blocks may be impacted by contents of the input
image in pixel rows 1081 to 1088 (the next vertical alignment with the
coding block size).
In general, to ensure efficient encoding for the last row/column of blocks,
and/or to ensure consistent encoding results between repeated encoding of
the same input content, these extra pixel rows/columns should be filled to
known values up to the coding block size alignment before encoding
operations are performed.
Some implementations support performing auto-fill of unaligned pixels beyond
a specific alignment, which is reported in
inputImageDataFillAlignment
.
For example, if an implementation reports 1x1 in
inputImageDataFillAlignment
, then the implementation will perform
auto-fill for any unaligned pixels beyond the encode resolution up to the
next coding block size.
For a coding block size of 16x16, if the implementation reports 16x16 in
inputImageDataFillAlignment
, then it is the application’s
responsibility to fill any unaligned pixels, if desired.
If not, it may impact the encoding efficiency, but it will not affect the
validity of the generated bitstream.
If the implementation reports 8x8 in inputImageDataFillAlignment
, then
for the 1920x1080 example, since the content is aligned to 8 pixels
vertically, the implementation will auto-fill pixel rows 1081 to 1088 (up to
the 16x16 coding block size in the example).
The auto-fill value(s) are implementation-specific.
The auto-fill value(s) are not written to the input image memory, but are
used as part of the encoding operation on the input image.
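The arithmetic behind the examples above can be sketched as follows. This is one reading of the text, in which the application is responsible for the pixels between the coded size and the next multiple of inputImageDataFillAlignment, and the implementation auto-fills the remainder up to the coding block boundary. The function names are hypothetical, and the coding block size is supplied by the caller because it is codec-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (alignment > 0). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0);
    return ((value + alignment - 1u) / alignment) * alignment;
}

/* Rows or columns the application may want to fill itself: from the coded
 * size up to the next multiple of the reported fill alignment. */
static uint32_t app_fill_count(uint32_t coded, uint32_t fill_alignment)
{
    return align_up(coded, fill_alignment) - coded;
}

/* Rows or columns left to the implementation's auto-fill: from the fill
 * alignment boundary up to the coding block boundary. */
static uint32_t auto_fill_count(uint32_t coded, uint32_t fill_alignment,
                                uint32_t block_size)
{
    uint32_t from = align_up(coded, fill_alignment);
    uint32_t to = align_up(coded, block_size);
    return to > from ? to - from : 0u;
}
```

For a coded height of 1080 and a 16-pixel coding block, a reported alignment of 16 leaves 8 rows to the application and none to the implementation, while reported alignments of 8 or 1 leave all 8 rows to the implementation.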
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeCapabilityFlagsKHR;
VkVideoEncodeCapabilityFlagsKHR
is a bitmask type for setting a mask
of zero or more VkVideoEncodeCapabilityFlagBitsKHR.
Bits which may be set in VkVideoEncodeCapabilitiesKHR::flags
,
indicating the encoding tools supported, are:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeCapabilityFlagBitsKHR {
VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR = 0x00000001,
} VkVideoEncodeCapabilityFlagBitsKHR;
-
VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR
reports that the implementation supports use of VkVideoEncodeInfoKHR::precedingExternallyEncodedBytes
.
42.13.3. Video Encode Commands
To launch video encode operations, call:
// Provided by VK_KHR_video_encode_queue
void vkCmdEncodeVideoKHR(
VkCommandBuffer commandBuffer,
const VkVideoEncodeInfoKHR* pEncodeInfo);
-
commandBuffer
is the command buffer in which the encode command, which generates a bitstream, will be recorded. -
pEncodeInfo
is a pointer to a VkVideoEncodeInfoKHR structure.
Each call issues one or more video encode operations.
The implicit parameter opCount
corresponds to the number of video
encode operations issued by the command.
After calling this command, the
active query index of each
active query is incremented by opCount
.
Currently each call to this command results in the issue of a single video encode operation.
The VkVideoEncodeInfoKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoEncodeFlagsKHR flags;
uint32_t qualityLevel;
VkBuffer dstBitstreamBuffer;
VkDeviceSize dstBitstreamBufferOffset;
VkDeviceSize dstBitstreamBufferMaxRange;
VkVideoPictureResourceInfoKHR srcPictureResource;
const VkVideoReferenceSlotInfoKHR* pSetupReferenceSlot;
uint32_t referenceSlotCount;
const VkVideoReferenceSlotInfoKHR* pReferenceSlots;
uint32_t precedingExternallyEncodedBytes;
} VkVideoEncodeInfoKHR;
-
sType
is the type of this structure. -
pNext
is a pointer to a structure extending this structure. A codec-specific extension structure must be chained to specify what bitstream unit to generate with this encode operation. -
flags
is reserved for future use. -
qualityLevel
is the coding quality level of the encoding. It is defined by the codec-specific extensions. -
dstBitstreamBuffer
is the buffer where the encoded bitstream output will be produced. -
dstBitstreamBufferOffset
is the offset in the dstBitstreamBuffer
where the encoded bitstream output will start. dstBitstreamBufferOffset
’s value must be aligned to VkVideoCapabilitiesKHR::minBitstreamBufferOffsetAlignment
, as reported by the implementation. -
dstBitstreamBufferMaxRange
is the maximum size of the dstBitstreamBuffer
that can be used while the encoded bitstream output is produced. dstBitstreamBufferMaxRange
’s value must be aligned to VkVideoCapabilitiesKHR::minBitstreamBufferSizeAlignment
, as reported by the implementation. -
srcPictureResource
is the Picture Resource of the Input Picture to be encoded by the operation. -
pSetupReferenceSlot
is a pointer to a VkVideoReferenceSlotInfoKHR structure used for generating a reconstructed reference slot and Picture Resource. pSetupReferenceSlot->slotIndex
specifies the slot index number to use as a target for producing the Reconstructed (DPB) data. pSetupReferenceSlot
must be one of the entries provided in VkVideoBeginCodingInfoKHR via the pReferenceSlots
within the vkCmdBeginVideoCodingKHR command that established the Vulkan Video Encode Context for this command. -
referenceSlotCount
is the number of Reconstructed Reference Pictures that will be used when this encoding operation is executing. -
pReferenceSlots
is NULL
or a pointer to an array of VkVideoReferenceSlotInfoKHR structures that will be used when this encoding operation is executing. Each entry in pReferenceSlots
must be one of the entries provided in VkVideoBeginCodingInfoKHR via the pReferenceSlots
within the vkCmdBeginVideoCodingKHR command that established the Vulkan Video Encode Context for this command. -
precedingExternallyEncodedBytes
is the number of bytes externally encoded for insertion in the active video encode session overall bitstream prior to the bitstream that will be generated by the implementation for this instance of VkVideoEncodeInfoKHR
. Valid when VkVideoEncodeRateControlInfoKHR::rateControlMode
is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. The value provided is used to update the implementation’s rate control algorithm for the rate control layer this instance of VkVideoEncodeInfoKHR
belongs to, by accounting for the bitrate budget consumed by these externally encoded bytes. See VkVideoEncodeRateControlInfoKHR for additional information about encode rate control.
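Since dstBitstreamBufferOffset and dstBitstreamBufferMaxRange must each be aligned to a value reported by the implementation, an application typically rounds its preferred offset up and the remaining space down. The sketch below shows that arithmetic. The BitstreamRange type and the pick_range function are hypothetical, and the sketch does not assume the alignments are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (alignment > 0). */
static uint64_t align_up_u64(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0);
    return ((value + alignment - 1u) / alignment) * alignment;
}

/* Round value down to the previous multiple of alignment (alignment > 0). */
static uint64_t align_down_u64(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0);
    return (value / alignment) * alignment;
}

/* Hypothetical pair of values for dstBitstreamBufferOffset and
 * dstBitstreamBufferMaxRange. */
typedef struct { uint64_t offset, max_range; } BitstreamRange;

/* Picks an aligned offset at or after wanted_offset and the largest aligned
 * range that still fits inside a buffer of buffer_size bytes. */
static BitstreamRange pick_range(uint64_t buffer_size, uint64_t wanted_offset,
                                 uint64_t offset_alignment,
                                 uint64_t size_alignment)
{
    BitstreamRange r;
    r.offset = align_up_u64(wanted_offset, offset_alignment);
    assert(r.offset <= buffer_size);
    r.max_range = align_down_u64(buffer_size - r.offset, size_alignment);
    return r;
}
```

With a 1,000,000-byte buffer, a preferred offset of 1000, and both alignments equal to 256, the offset becomes 1024 and the usable range becomes 998,912 bytes.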
The coded size of the encode operation is specified in codedExtent
of
srcPictureResource
.
Multiple vkCmdEncodeVideoKHR commands may be recorded within a Vulkan
Video Encode Context.
The execution of each vkCmdEncodeVideoKHR command will result in
generating codec-specific bitstream units.
These bitstream units are generated consecutively into the bitstream buffer
specified in dstBitstreamBuffer
of the VkVideoEncodeInfoKHR
structure passed to each vkCmdEncodeVideoKHR command recorded after the vkCmdBeginVideoCodingKHR command.
The produced bitstream is the sum of all these bitstream units, including
any padding between the bitstream units.
Any bitstream padding must be filled with data compliant to the codec
standard so as not to cause any syntax errors during decoding of the
bitstream units with the padding included.
The range of the bitstream buffer written can be queried via
video encode bitstream buffer
range queries.
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeFlagsKHR;
VkVideoEncodeFlagsKHR is a bitmask type for setting a mask, but is currently reserved for future use.
The VkVideoEncodeRateControlInfoKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeRateControlInfoKHR {
VkStructureType sType;
const void* pNext;
VkVideoEncodeRateControlFlagsKHR flags;
VkVideoEncodeRateControlModeFlagBitsKHR rateControlMode;
uint8_t layerCount;
const VkVideoEncodeRateControlLayerInfoKHR* pLayerConfigs;
} VkVideoEncodeRateControlInfoKHR;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
flags
is reserved for future use. -
rateControlMode
is a VkVideoEncodeRateControlModeFlagBitsKHR value specifying the encode stream rate control mode. -
layerCount
specifies the number of rate control layers in the video encode stream. -
pLayerConfigs
is a pointer to an array of VkVideoEncodeRateControlLayerInfoKHR structures specifying the rate control configurations of layerCount
rate control layers.
Including this structure in the pNext
chain of
VkVideoCodingControlInfoKHR and including
VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR
in
VkVideoCodingControlInfoKHR::flags
will define stream rate
control settings for video encoding.
Additional structures providing codec-specific rate control parameters may
need to be included in the pNext
chain of
VkVideoCodingControlInfoKHR
depending on the codec profile the bound
video session was created with and the parameters specified in
VkVideoEncodeRateControlInfoKHR
(see Video
Coding Control).
To ensure that the video session is properly initialized with stream-level rate control settings, the application must call vkCmdControlVideoCodingKHR with stream-level rate control settings at least once in execution order before the first vkCmdEncodeVideoKHR command that is executed after video session reset. If not provided, default implementation-specific stream rate control settings will be used.
Stream rate control settings can also be re-initialized during an active
video encoding session.
The re-initialization takes effect whenever the
VkVideoEncodeRateControlInfoKHR
structure is included in the
pNext
chain of the VkVideoCodingControlInfoKHR structure in the
call to vkCmdControlVideoCodingKHR, and only impacts
vkCmdEncodeVideoKHR operations that follow in execution order.
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeRateControlFlagsKHR;
VkVideoEncodeRateControlFlagsKHR
is a bitmask type for setting a mask,
but is currently reserved for future use.
The rate control modes are defined with the following enums:
// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeRateControlModeFlagBitsKHR {
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR = 0,
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR = 1,
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR = 2,
} VkVideoEncodeRateControlModeFlagBitsKHR;
-
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
for disabling rate control. -
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR
for constant bitrate rate control mode. -
VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR
for variable bitrate rate control mode.
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeRateControlModeFlagsKHR;
VkVideoEncodeRateControlModeFlagsKHR
is a bitmask type for setting a
mask of zero or more VkVideoEncodeRateControlModeFlagBitsKHR.
The VkVideoEncodeRateControlLayerInfoKHR
structure is defined as:
// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeRateControlLayerInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t averageBitrate;
uint32_t maxBitrate;
uint32_t frameRateNumerator;
uint32_t frameRateDenominator;
uint32_t virtualBufferSizeInMs;
uint32_t initialVirtualBufferSizeInMs;
} VkVideoEncodeRateControlLayerInfoKHR;
-
sType
is the type of this structure. -
pNext
is a pointer to a structure extending this structure. -
averageBitrate
is the average bitrate in bits/second. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. -
maxBitrate
is the peak bitrate in bits/second. Valid when rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR
. -
frameRateNumerator
is the numerator of the frame rate. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. -
frameRateDenominator
is the denominator of the frame rate. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. -
virtualBufferSizeInMs
is the leaky bucket model virtual buffer size in milliseconds, with respect to peak bitrate. Valid when rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. For example, the virtual buffer size in bits is (virtualBufferSizeInMs × maxBitrate / 1000). -
initialVirtualBufferSizeInMs
is the initial occupancy in milliseconds of the virtual buffer in the leaky bucket model. Valid when the rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
.
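The leaky bucket arithmetic described for these members can be sketched in C. This is only an illustration: both helper names are hypothetical and are not part of the Vulkan API, the 64-bit intermediate is an assumption made here to avoid overflow when two 32-bit members are multiplied, and the per-frame budget is simply bits/second divided by frames/second.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the Vulkan API): converts the virtual
 * buffer duration into a size in bits, following the formula
 * virtualBufferSizeInMs * maxBitrate / 1000 given in the member list. */
static uint64_t virtual_buffer_size_bits(uint32_t virtualBufferSizeInMs,
                                         uint32_t maxBitrate)
{
    /* Widen before multiplying: the product of two 32-bit values can
     * exceed UINT32_MAX. */
    return (uint64_t)virtualBufferSizeInMs * (uint64_t)maxBitrate / 1000u;
}

/* Hypothetical helper: average bit budget per frame, derived from
 * averageBitrate (bits/second) and the frame rate fraction
 * frameRateNumerator / frameRateDenominator (frames/second).
 * Returns 0 when the numerator is 0, since no frame rate is defined. */
static uint64_t average_bits_per_frame(uint32_t averageBitrate,
                                       uint32_t frameRateNumerator,
                                       uint32_t frameRateDenominator)
{
    if (frameRateNumerator == 0u) {
        return 0u;
    }
    return (uint64_t)averageBitrate * frameRateDenominator / frameRateNumerator;
}
```

For instance, a 2000 ms buffer at a peak bitrate of 5,000,000 bits/second corresponds to 10,000,000 bits.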
A codec-specific structure specifying additional per-layer rate control
settings must be chained to VkVideoEncodeRateControlLayerInfoKHR
.
If multiple rate control layers are enabled
(VkVideoEncodeRateControlInfoKHR::layerCount
is greater than 1),
then the chained codec-specific extension structure also identifies the
specific video coding layer its parent
VkVideoEncodeRateControlLayerInfoKHR
applies to.
If multiple rate control layers are enabled, the number of rate control
layers must match the number of video coding layers.
The specification for an encode codec-specific extension would describe how
multiple video coding layers are enabled for the corresponding codec.
Per-layer rate control settings for all enabled rate control layers must be
initialized or re-initialized whenever stream rate control settings are
provided via VkVideoEncodeRateControlInfoKHR.
This is done by specifying settings for all enabled rate control layers in
VkVideoEncodeRateControlInfoKHR::pLayerConfigs
.
Including a VkVideoEncodeRateControlLayerInfoKHR structure in the pNext
chain of
VkVideoCodingControlInfoKHR and including
VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_LAYER_BIT_KHR
in
VkVideoCodingControlInfoKHR::flags
will define stream rate
control settings for individual layers during video encoding.
This adjustment only impacts the specified layer without impacting the rate
control settings or implementation rate control algorithm behavior for any
other enabled rate control layers.
The adjustment takes effect whenever the corresponding
vkCmdControlVideoCodingKHR is executed, and only impacts
vkCmdEncodeVideoKHR operations pertaining to the corresponding video
coding layer that follow in execution order.
It is possible for an application to enable multiple video coding layers
(via codec-specific extensions to encoding operations) while only enabling a
single layer of rate control for the entire video stream.
To achieve this, layerCount
in VkVideoEncodeRateControlInfoKHR
must be set to 1, and the single VkVideoEncodeRateControlLayerInfoKHR
provided in pLayerConfigs
would apply to all encoded segments of the
video stream, regardless of which codec-defined video coding layer they
belong to.
In this case, the implementation decides bitrate distribution across video
coding layers (if applicable to the specified stream rate control mode).
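The layer-count rule described above can be written as a small predicate. This is a sketch under stated assumptions: the function name is hypothetical, it is not a Vulkan entry point, and the number of video coding layers is assumed to be known to the application from its codec-specific configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check (not a Vulkan entry point) of the layer-count rule:
 * with more than one rate control layer, the count must equal the number
 * of video coding layers; a single rate control layer may cover a stream
 * with any number of video coding layers. */
static bool rate_control_layer_count_ok(uint32_t layerCount,
                                        uint32_t videoCodingLayerCount)
{
    if (layerCount > 1u) {
        return layerCount == videoCodingLayerCount;
    }
    /* layerCount == 1: one set of settings applies to the whole stream.
     * layerCount == 0 is not addressed by the text above; this sketch
     * accepts it and leaves that case to other validation. */
    return true;
}
```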
42.14. Encode H.264
The VK_EXT_video_encode_h264
extension adds the H.264 codec-specific
structures and types needed to support H.264 encoding.
Unless otherwise noted, all references to the H.264 specification are to the
2010 edition published by the ITU-T, dated March 2010.
This specification is available at https://www.itu.int/rec/T-REC-H.264.
Note
Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.
42.14.1. H.264 encode profile
The VkVideoEncodeH264ProfileInfoEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264ProfileInfoEXT {
VkStructureType sType;
const void* pNext;
StdVideoH264ProfileIdc stdProfileIdc;
} VkVideoEncodeH264ProfileInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
stdProfileIdc
is aStdVideoH264ProfileIdc
value specifying the H.264 codec profile IDC.
An H.264 encode profile is specified by including a
VkVideoEncodeH264ProfileInfoEXT
structure in the pNext
chain of
the VkVideoProfileInfoKHR structure when
VkVideoProfileInfoKHR::videoCodecOperation
is
VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT
.
42.14.2. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with
pVideoProfile->videoCodecOperation
specified as
VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT
, the
VkVideoEncodeH264CapabilitiesEXT structure must be included in the
pNext
chain of the VkVideoCapabilitiesKHR structure to retrieve
more capabilities specific to H.264 video encoding.
The VkVideoEncodeH264CapabilitiesEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264CapabilitiesEXT {
VkStructureType sType;
void* pNext;
VkVideoEncodeH264CapabilityFlagsEXT flags;
VkVideoEncodeH264InputModeFlagsEXT inputModeFlags;
VkVideoEncodeH264OutputModeFlagsEXT outputModeFlags;
uint8_t maxPPictureL0ReferenceCount;
uint8_t maxBPictureL0ReferenceCount;
uint8_t maxL1ReferenceCount;
VkBool32 motionVectorsOverPicBoundariesFlag;
uint32_t maxBytesPerPicDenom;
uint32_t maxBitsPerMbDenom;
uint32_t log2MaxMvLengthHorizontal;
uint32_t log2MaxMvLengthVertical;
} VkVideoEncodeH264CapabilitiesEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
flags
is a bitmask of VkVideoEncodeH264CapabilityFlagBitsEXT describing supported encoding tools. -
inputModeFlags
is a bitmask of VkVideoEncodeH264InputModeFlagBitsEXT describing supported command buffer input granularities/modes. -
outputModeFlags
is a bitmask of VkVideoEncodeH264OutputModeFlagBitsEXT describing supported output (bitstream size reporting) granularities/modes. -
maxPPictureL0ReferenceCount
reports the maximum number of reference pictures the implementation supports in the reference list L0 for P pictures. -
maxBPictureL0ReferenceCount
reports the maximum number of reference pictures the implementation supports in the reference list L0 for B pictures. The reported value is0
if encoding of B pictures is not supported. -
maxL1ReferenceCount
reports the maximum number of reference pictures the implementation supports in the reference list L1 if encoding of B pictures is supported. The reported value is0
if encoding of B pictures is not supported. -
motionVectorsOverPicBoundariesFlag
ifVK_TRUE
, indicates motion_vectors_over_pic_boundaries_flag will be enabled if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags. -
maxBytesPerPicDenom
reports the value that will be used for max_bytes_per_pic_denom if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags. -
maxBitsPerMbDenom
reports the value that will be used for max_bits_per_mb_denom if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags. -
log2MaxMvLengthHorizontal
reports the value that will be used for log2_max_mv_length_horizontal if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags. -
log2MaxMvLengthVertical
reports the value that will be used for log2_max_mv_length_vertical if bitstream_restriction_flag is enabled in StdVideoH264SpsVuiFlags.
When vkGetPhysicalDeviceVideoCapabilitiesKHR is called to query the
capabilities with parameter videoCodecOperation
specified as
VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_EXT
, a
VkVideoEncodeH264CapabilitiesEXT
structure can be chained to
VkVideoCapabilitiesKHR to retrieve H.264 extension specific
capabilities.
// Provided by VK_EXT_video_encode_h264
typedef VkFlags VkVideoEncodeH264CapabilityFlagsEXT;
VkVideoEncodeH264CapabilityFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH264CapabilityFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH264CapabilitiesEXT::flags
, indicating the encoding
tools supported, are:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264CapabilityFlagBitsEXT {
VK_VIDEO_ENCODE_H264_CAPABILITY_DIRECT_8X8_INFERENCE_ENABLED_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H264_CAPABILITY_DIRECT_8X8_INFERENCE_DISABLED_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H264_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT = 0x00000004,
VK_VIDEO_ENCODE_H264_CAPABILITY_QPPRIME_Y_ZERO_TRANSFORM_BYPASS_BIT_EXT = 0x00000008,
VK_VIDEO_ENCODE_H264_CAPABILITY_SCALING_LISTS_BIT_EXT = 0x00000010,
VK_VIDEO_ENCODE_H264_CAPABILITY_HRD_COMPLIANCE_BIT_EXT = 0x00000020,
VK_VIDEO_ENCODE_H264_CAPABILITY_CHROMA_QP_OFFSET_BIT_EXT = 0x00000040,
VK_VIDEO_ENCODE_H264_CAPABILITY_SECOND_CHROMA_QP_OFFSET_BIT_EXT = 0x00000080,
VK_VIDEO_ENCODE_H264_CAPABILITY_PIC_INIT_QP_MINUS26_BIT_EXT = 0x00000100,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_BIT_EXT = 0x00000200,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_EXPLICIT_BIT_EXT = 0x00000400,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_IMPLICIT_BIT_EXT = 0x00000800,
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT = 0x00001000,
VK_VIDEO_ENCODE_H264_CAPABILITY_TRANSFORM_8X8_BIT_EXT = 0x00002000,
VK_VIDEO_ENCODE_H264_CAPABILITY_CABAC_BIT_EXT = 0x00004000,
VK_VIDEO_ENCODE_H264_CAPABILITY_CAVLC_BIT_EXT = 0x00008000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_DISABLED_BIT_EXT = 0x00010000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_ENABLED_BIT_EXT = 0x00020000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_PARTIAL_BIT_EXT = 0x00040000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DISABLE_DIRECT_SPATIAL_MV_PRED_BIT_EXT = 0x00080000,
VK_VIDEO_ENCODE_H264_CAPABILITY_MULTIPLE_SLICE_PER_FRAME_BIT_EXT = 0x00100000,
VK_VIDEO_ENCODE_H264_CAPABILITY_SLICE_MB_COUNT_BIT_EXT = 0x00200000,
VK_VIDEO_ENCODE_H264_CAPABILITY_ROW_UNALIGNED_SLICE_BIT_EXT = 0x00400000,
VK_VIDEO_ENCODE_H264_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT = 0x00800000,
VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_EXT = 0x01000000,
} VkVideoEncodeH264CapabilityFlagBitsEXT;
-
VK_VIDEO_ENCODE_H264_CAPABILITY_DIRECT_8X8_INFERENCE_ENABLED_BIT_EXT
reports if enabling direct_8x8_inference_flag in StdVideoH264SpsFlags is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_DIRECT_8X8_INFERENCE_DISABLED_BIT_EXT
reports if disabling direct_8x8_inference_flag in StdVideoH264SpsFlags is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT
reports if enabling separate_colour_plane_flag in StdVideoH264SpsFlags is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_QPPRIME_Y_ZERO_TRANSFORM_BYPASS_BIT_EXT
reports if enabling qpprime_y_zero_transform_bypass_flag in StdVideoH264SpsFlags is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_SCALING_LISTS_BIT_EXT
reports if enabling seq_scaling_matrix_present_flag in StdVideoH264SpsFlags or pic_scaling_matrix_present_flag in StdVideoH264PpsFlags is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_HRD_COMPLIANCE_BIT_EXT
reports if the implementation guarantees generating an HRD-compliant bitstream when nal_hrd_parameters_present_flag or vcl_hrd_parameters_present_flag is enabled in StdVideoH264SpsVuiFlags. -
VK_VIDEO_ENCODE_H264_CAPABILITY_CHROMA_QP_OFFSET_BIT_EXT
reports if setting non-zero chroma_qp_index_offset in StdVideoH264PictureParameterSet is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_SECOND_CHROMA_QP_OFFSET_BIT_EXT
reports if setting non-zero second_chroma_qp_index_offset in StdVideoH264PictureParameterSet is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_PIC_INIT_QP_MINUS26_BIT_EXT
reports if setting non-zero pic_init_qp_minus26 in StdVideoH264PictureParameterSet is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_BIT_EXT
reports if enabling weighted_pred_flag in StdVideoH264PpsFlags is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_EXPLICIT_BIT_EXT
reports if using STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_EXPLICIT from StdVideoH264WeightedBipredIdc is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_BIPRED_IMPLICIT_BIT_EXT
reports if using STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_IMPLICIT from StdVideoH264WeightedBipredIdc is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT
reports that when weighted_pred_flag is enabled or STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_EXPLICIT from StdVideoH264WeightedBipredIdc is used, the implementation is able to internally decide syntax for pred_weight_table. -
VK_VIDEO_ENCODE_H264_CAPABILITY_TRANSFORM_8X8_BIT_EXT
reports if enabling transform_8x8_mode_flag in StdVideoH264PpsFlags is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_CABAC_BIT_EXT
reports if CABAC entropy coding is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_CAVLC_BIT_EXT
reports if CAVLC entropy coding is supported. An implementation must support at least one entropy coding mode. -
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_DISABLED_BIT_EXT
reports if using STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_DISABLED from StdVideoH264DisableDeblockingFilterIdc is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_ENABLED_BIT_EXT
reports if using STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_ENABLED from StdVideoH264DisableDeblockingFilterIdc is supported. -
VK_VIDEO_ENCODE_H264_CAPABILITY_DEBLOCKING_FILTER_PARTIAL_BIT_EXT
reports if using STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_PARTIAL from StdVideoH264DisableDeblockingFilterIdc is supported. An implementation must support at least one deblocking filter mode. -
VK_VIDEO_ENCODE_H264_CAPABILITY_DISABLE_DIRECT_SPATIAL_MV_PRED_BIT_EXT
reports if disablingStdVideoEncodeH264SliceHeaderFlags
::direct_spatial_mv_pred_flag is supported when it is present in the slice header. -
VK_VIDEO_ENCODE_H264_CAPABILITY_MULTIPLE_SLICE_PER_FRAME_BIT_EXT
reports if encoding multiple slices per frame is supported. If not set, the implementation is only able to encode a single slice for the entire frame. -
VK_VIDEO_ENCODE_H264_CAPABILITY_SLICE_MB_COUNT_BIT_EXT
reports support for configuring VkVideoEncodeH264NaluSliceInfoEXT::mbCount
and first_mb_in_slice in StdVideoEncodeH264SliceHeader for each slice in a frame with multiple slices. If not supported, the implementation decides the number of macroblocks in each slice based on VkVideoEncodeH264VclFrameInfoEXT::naluSliceEntryCount
. -
VK_VIDEO_ENCODE_H264_CAPABILITY_ROW_UNALIGNED_SLICE_BIT_EXT
reports that each slice in a frame with multiple slices may begin or finish at any offset in a macroblock row. If not supported, all slices in the frame must begin at the start of a macroblock row (and hence each slice must finish at the end of a macroblock row). -
VK_VIDEO_ENCODE_H264_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT
reports that whenVK_VIDEO_ENCODE_H264_CAPABILITY_MULTIPLE_SLICE_PER_FRAME_BIT_EXT
is supported and a frame is encoded with multiple slices, the implementation allows encoding each slice with a differentStdVideoEncodeH264SliceHeader
::slice_type. If not supported, all slices of the frame must be encoded with the sameslice_type
which corresponds to the picture type of the frame. For example, all slices of a P-frame would be encoded as P-slices. -
VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_EXT
reports support for using a B frame as L1 reference.
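As an illustration of how an application might consume these capability bits, the following sketch tests a few of them. The bit values are copied from the enumeration above and redefined locally so the example compiles without Vulkan headers; the helper names, the local enum, and the policy of preferring CABAC over CAVLC are assumptions made for this example, not requirements of the extension.

```c
#include <stdint.h>

/* Bit values copied from VkVideoEncodeH264CapabilityFlagBitsEXT as listed
 * above; redefined locally so the sketch compiles without Vulkan headers. */
#define SKETCH_CAP_CABAC_BIT                    0x00004000u
#define SKETCH_CAP_CAVLC_BIT                    0x00008000u
#define SKETCH_CAP_MULTIPLE_SLICE_PER_FRAME_BIT 0x00100000u

/* Hypothetical application-side enum, not defined by the extension. */
typedef enum SketchEntropyMode {
    SKETCH_ENTROPY_NONE = 0,
    SKETCH_ENTROPY_CAVLC = 1,
    SKETCH_ENTROPY_CABAC = 2
} SketchEntropyMode;

/* Picks an entropy coding mode from the reported flags. Preferring CABAC
 * is an application policy assumed for this sketch. */
static SketchEntropyMode choose_entropy_mode(uint32_t capabilityFlags)
{
    if ((capabilityFlags & SKETCH_CAP_CABAC_BIT) != 0u) {
        return SKETCH_ENTROPY_CABAC;
    }
    if ((capabilityFlags & SKETCH_CAP_CAVLC_BIT) != 0u) {
        return SKETCH_ENTROPY_CAVLC;
    }
    /* Conformant implementations report at least one mode, so this
     * branch only guards against an unfilled capability structure. */
    return SKETCH_ENTROPY_NONE;
}

/* Limits a requested slice count to 1 when the implementation can only
 * encode a single slice for the entire frame. */
static uint32_t clamp_slice_count(uint32_t capabilityFlags, uint32_t requested)
{
    if ((capabilityFlags & SKETCH_CAP_MULTIPLE_SLICE_PER_FRAME_BIT) == 0u) {
        return 1u;
    }
    return requested == 0u ? 1u : requested;
}
```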
// Provided by VK_EXT_video_encode_h264
typedef VkFlags VkVideoEncodeH264InputModeFlagsEXT;
VkVideoEncodeH264InputModeFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH264InputModeFlagBitsEXT.
The inputModeFlags
field reports the various command buffer input
granularities supported by the implementation as follows:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264InputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H264_INPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH264InputModeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H264_INPUT_MODE_FRAME_BIT_EXT
indicates that a single command buffer must at least encode an entire frame. Any non-VCL NALUs must be encoded using the same command buffer as the frame ifVK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT
indicates that a single command buffer must at least encode a single slice. Any non-VCL NALUs must be encoded using the same command buffer as the first slice of the frame ifVK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H264_INPUT_MODE_NON_VCL_BIT_EXT
indicates that a single command buffer may encode a non-VCL NALU by itself.
An implementation must support at least one of
VK_VIDEO_ENCODE_H264_INPUT_MODE_FRAME_BIT_EXT
or
VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT
.
If VK_VIDEO_ENCODE_H264_INPUT_MODE_SLICE_BIT_EXT
is not supported, the
following two additional restrictions apply for frames encoded with multiple
slices.
First, all frame slices must have the same pRefList0ModOperations and the
same pRefList1ModOperations.
Second, the order in which slices appear in
VkVideoEncodeH264VclFrameInfoEXT::pNaluSliceEntries
or in the
command buffer must match the placement order of the slices in the frame.
// Provided by VK_EXT_video_encode_h264
typedef VkFlags VkVideoEncodeH264OutputModeFlagsEXT;
VkVideoEncodeH264OutputModeFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH264OutputModeFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH264CapabilitiesEXT::outputModeFlags
, indicating
the minimum bitstream generation commands that must be included between
each vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR pair
(henceforth simply begin/end pair), are:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264OutputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_SLICE_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH264OutputModeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_FRAME_BIT_EXT
indicates that calls to generate all NALUs of a frame must be included within a single begin/end pair. Any non-VCL NALUs must be encoded within the same begin/end pair ifVK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_SLICE_BIT_EXT
indicates that each begin/end pair must encode at least one slice. Any non-VCL NALUs must be encoded within the same begin/end pair as the first slice of the frame ifVK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_NON_VCL_BIT_EXT
indicates that each begin/end pair may encode only a non-VCL NALU by itself.
An implementation must support at least one of
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_FRAME_BIT_EXT
or
VK_VIDEO_ENCODE_H264_OUTPUT_MODE_SLICE_BIT_EXT
.
A single begin/end pair must not encode more than a single frame.
The bitstreams of NALUs generated within a single begin/end pair are written continuously into the same bitstream buffer (any padding between the NALUs must be compliant to the H.264 standard).
The supported input modes must be at least as coarse as the supported output modes. For example, an implementation must not report support for slice input while supporting only frame output.
An implementation must report one of the following combinations of input/output modes:
-
Input: Frame, Output: Frame
-
Input: Frame, Output: Frame and Non-VCL
-
Input: Frame, Output: Slice
-
Input: Frame, Output: Slice and Non-VCL
-
Input: Slice, Output: Slice
-
Input: Slice, Output: Slice and Non-VCL
-
Input: Frame and Non-VCL, Output: Frame and Non-VCL
-
Input: Frame and Non-VCL, Output: Slice and Non-VCL
-
Input: Slice and Non-VCL, Output: Slice and Non-VCL
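The nine permitted combinations can be checked with a table lookup. The sketch below is illustrative only: the function and macro names are hypothetical, and the bit values are copied from the input and output mode enumerations above (which share the same numeric values) so that the example compiles without Vulkan headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values shared by VkVideoEncodeH264InputModeFlagBitsEXT and
 * VkVideoEncodeH264OutputModeFlagBitsEXT as listed above. */
#define SKETCH_MODE_FRAME_BIT   0x00000001u
#define SKETCH_MODE_SLICE_BIT   0x00000002u
#define SKETCH_MODE_NON_VCL_BIT 0x00000004u

/* Hypothetical helper: returns true if the (input, output) pair is one of
 * the nine combinations enumerated in the list above. */
static bool io_mode_combination_is_listed(uint32_t inputModeFlags,
                                          uint32_t outputModeFlags)
{
    const uint32_t F = SKETCH_MODE_FRAME_BIT;
    const uint32_t S = SKETCH_MODE_SLICE_BIT;
    const uint32_t N = SKETCH_MODE_NON_VCL_BIT;
    /* Each row is {input flags, output flags}, in the order of the list. */
    const uint32_t allowed[9][2] = {
        { F,     F     }, { F,     F | N }, { F,     S     },
        { F,     S | N }, { S,     S     }, { S,     S | N },
        { F | N, F | N }, { F | N, S | N }, { S | N, S | N },
    };
    for (size_t i = 0; i < 9; ++i) {
        if (allowed[i][0] == inputModeFlags && allowed[i][1] == outputModeFlags) {
            return true;
        }
    }
    return false;
}
```

Slice input paired with frame-only output is rejected, which matches the coarseness rule stated before the list.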
42.14.3. Encoder Parameter Sets
To reduce parameter traffic during encoding, the encoder parameter set object supports storing H.264 SPS/PPS parameter sets that may be later referenced during encoding.
The VkVideoEncodeH264SessionParametersCreateInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t maxStdSPSCount;
uint32_t maxStdPPSCount;
const VkVideoEncodeH264SessionParametersAddInfoEXT* pParametersAddInfo;
} VkVideoEncodeH264SessionParametersCreateInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
maxStdSPSCount
is the maximum number of SPS parameters that theVkVideoSessionParametersKHR
can contain. -
maxStdPPSCount
is the maximum number of PPS parameters that theVkVideoSessionParametersKHR
can contain. -
pParametersAddInfo
isNULL
or a pointer to aVkVideoEncodeH264SessionParametersAddInfoEXT
structure specifying H.264 parameters to add upon object creation.
A VkVideoEncodeH264SessionParametersCreateInfoEXT
structure holding
one H.264 SPS and at least one H.264 PPS parameter set must be chained to
VkVideoSessionParametersCreateInfoKHR when calling
vkCreateVideoSessionParametersKHR to store these parameter set(s) with
the encoder parameter set object for later reference.
The provided H.264 SPS/PPS parameters must be within the limits specified
during encoder creation for the encoder specified in
VkVideoSessionParametersCreateInfoKHR.
The VkVideoEncodeH264SessionParametersAddInfoEXT
structure is defined
as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersAddInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t stdSPSCount;
const StdVideoH264SequenceParameterSet* pStdSPSs;
uint32_t stdPPSCount;
const StdVideoH264PictureParameterSet* pStdPPSs;
} VkVideoEncodeH264SessionParametersAddInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
stdSPSCount
is the number of SPS elements in the pStdSPSs
array. Its value must be less than or equal to the value of maxStdSPSCount
. -
pStdSPSs
is a pointer to an array of StdVideoH264SequenceParameterSet
structures representing H.264 sequence parameter sets. Each element of the array must have a unique H.264 SPS ID. -
stdPPSCount
is the number of PPS elements in the pStdPPSs
array. Its value must be less than or equal to the value of maxStdPPSCount
. -
pStdPPSs
is a pointer to an array of StdVideoH264PictureParameterSet
structures representing H.264 picture parameter sets. Each element of the array must have a unique H.264 SPS-PPS ID pair.
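The count and uniqueness requirements above could be checked on the application side before creating the parameters object. The sketch below is a minimal illustration: SketchSps and SketchPps are hypothetical stand-ins that carry only the ID fields (the real StdVideoH264SequenceParameterSet and StdVideoH264PictureParameterSet structures contain many more members), and the function is not part of any Vulkan API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins holding only the identifiers used below. */
typedef struct SketchSps { uint8_t seq_parameter_set_id; } SketchSps;
typedef struct SketchPps {
    uint8_t seq_parameter_set_id;
    uint8_t pic_parameter_set_id;
} SketchPps;

/* Returns true if the counts respect the maximums and every SPS ID and
 * every SPS-PPS ID pair is unique, as required by the member list. */
static bool add_info_is_consistent(const SketchSps *spss, uint32_t stdSPSCount,
                                   uint32_t maxStdSPSCount,
                                   const SketchPps *ppss, uint32_t stdPPSCount,
                                   uint32_t maxStdPPSCount)
{
    /* A non-zero count requires a valid array. */
    assert(stdSPSCount == 0u || spss != NULL);
    assert(stdPPSCount == 0u || ppss != NULL);

    if (stdSPSCount > maxStdSPSCount || stdPPSCount > maxStdPPSCount) {
        return false;
    }
    for (uint32_t i = 0; i < stdSPSCount; ++i) {
        for (uint32_t j = i + 1u; j < stdSPSCount; ++j) {
            if (spss[i].seq_parameter_set_id == spss[j].seq_parameter_set_id) {
                return false; /* duplicate SPS ID */
            }
        }
    }
    for (uint32_t i = 0; i < stdPPSCount; ++i) {
        for (uint32_t j = i + 1u; j < stdPPSCount; ++j) {
            if (ppss[i].seq_parameter_set_id == ppss[j].seq_parameter_set_id &&
                ppss[i].pic_parameter_set_id == ppss[j].pic_parameter_set_id) {
                return false; /* duplicate SPS-PPS ID pair */
            }
        }
    }
    return true;
}
```

The quadratic comparison is chosen for brevity; parameter set counts are small enough in practice that this is unlikely to matter.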
42.14.4. Frame Encoding
In order to encode a frame, add a VkVideoEncodeH264VclFrameInfoEXT
structure to the pNext
chain of the VkVideoEncodeInfoKHR
structure passed to the vkCmdEncodeVideoKHR command.
The VkVideoEncodeH264VclFrameInfoEXT structure representing a frame encode operation is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264VclFrameInfoEXT {
VkStructureType sType;
const void* pNext;
const VkVideoEncodeH264ReferenceListsInfoEXT* pReferenceFinalLists;
uint32_t naluSliceEntryCount;
const VkVideoEncodeH264NaluSliceInfoEXT* pNaluSliceEntries;
const StdVideoEncodeH264PictureInfo* pCurrentPictureInfo;
} VkVideoEncodeH264VclFrameInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
pReferenceFinalLists
isNULL
or a pointer to a VkVideoEncodeH264ReferenceListsInfoEXT structure specifying the reference lists to be used for the current picture. -
naluSliceEntryCount
is the number of slice NALUs in the frame. -
pNaluSliceEntries
is a pointer to an array ofnaluSliceEntryCount
VkVideoEncodeH264NaluSliceInfoEXT structures specifying the division of the current picture into slices and the properties of these slices. This is an ordered sequence; the NALUs are generated consecutively in VkVideoEncodeInfoKHR::dstBitstreamBuffer
in the same order as in this array. -
pCurrentPictureInfo
is a pointer to aStdVideoEncodeH264PictureInfo
structure specifying the syntax and other codec-specific information from the H.264 specification associated with this picture. The information provided must reflect the decoded picture marking operations that are applicable to this frame.
The VkVideoEncodeH264NaluSliceInfoEXT structure representing a slice is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264NaluSliceInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t mbCount;
const VkVideoEncodeH264ReferenceListsInfoEXT* pReferenceFinalLists;
const StdVideoEncodeH264SliceHeader* pSliceHeaderStd;
} VkVideoEncodeH264NaluSliceInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
mbCount
is the number of macroblocks in this slice. -
pReferenceFinalLists
isNULL
or a pointer to a VkVideoEncodeH264ReferenceListsInfoEXT structure specifying the reference lists to be used for the current slice. IfpReferenceFinalLists
is notNULL
, these reference lists override the reference lists provided in VkVideoEncodeH264VclFrameInfoEXT::pReferenceFinalLists
. -
pSliceHeaderStd
is a pointer to aStdVideoEncodeH264SliceHeader
structure specifying the slice header for the current slice.
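When row-unaligned slices are not supported, each slice has to cover whole macroblock rows. The following sketch computes mbCount and a starting macroblock address for such a partition. It is illustrative only: the helper is hypothetical, it assumes 16x16 macroblocks in raster-scan order with frame coding and no MBAFF (so the starting address can serve as first_mb_in_slice), and distributing leftover rows to the first slices is an arbitrary choice made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MB_SIZE 16u /* H.264 macroblocks cover 16x16 luma samples */

/* Hypothetical helper: splits a frame into at most sliceCount slices that
 * each start at the beginning of a macroblock row. Writes the macroblock
 * count and first macroblock address of each slice and returns the number
 * of slices actually produced (limited by the number of rows). */
static uint32_t split_into_row_aligned_slices(uint32_t codedWidth,
                                              uint32_t codedHeight,
                                              uint32_t sliceCount,
                                              uint32_t *mbCounts,
                                              uint32_t *firstMbs)
{
    assert(mbCounts != NULL && firstMbs != NULL);

    uint32_t mbCols = (codedWidth + SKETCH_MB_SIZE - 1u) / SKETCH_MB_SIZE;
    uint32_t mbRows = (codedHeight + SKETCH_MB_SIZE - 1u) / SKETCH_MB_SIZE;
    if (sliceCount == 0u || mbCols == 0u || mbRows == 0u) {
        return 0u;
    }
    if (sliceCount > mbRows) {
        sliceCount = mbRows; /* at least one full row per slice */
    }
    uint32_t baseRows = mbRows / sliceCount;
    uint32_t extraRows = mbRows % sliceCount;
    uint32_t nextMb = 0u;
    for (uint32_t i = 0; i < sliceCount; ++i) {
        /* The first extraRows slices receive one additional row. */
        uint32_t rows = baseRows + (i < extraRows ? 1u : 0u);
        firstMbs[i] = nextMb;
        mbCounts[i] = rows * mbCols;
        nextMb += mbCounts[i];
    }
    return sliceCount;
}
```

For a 1920x1080 picture (120 by 68 macroblocks) split into three slices, this yields macroblock counts of 2760, 2760, and 2640.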
The VkVideoEncodeH264DpbSlotInfoEXT structure, representing a reconstructed picture that is being used as a reference picture, is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264DpbSlotInfoEXT {
VkStructureType sType;
const void* pNext;
int8_t slotIndex;
const StdVideoEncodeH264ReferenceInfo* pStdReferenceInfo;
} VkVideoEncodeH264DpbSlotInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
slotIndex
is the DPB Slot index for this picture.slotIndex
must match theslotIndex
inpSetupReferenceSlot
of VkVideoEncodeInfoKHR in the command used to encode the corresponding picture. -
pStdReferenceInfo
is a pointer to aStdVideoEncodeH264ReferenceInfo
structure specifying the syntax and other codec-specific information from the H.264 specification associated with this reference picture.
The VkVideoEncodeH264ReferenceListsInfoEXT structure representing reference lists is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264ReferenceListsInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t referenceList0EntryCount;
const VkVideoEncodeH264DpbSlotInfoEXT* pReferenceList0Entries;
uint8_t referenceList1EntryCount;
const VkVideoEncodeH264DpbSlotInfoEXT* pReferenceList1Entries;
const StdVideoEncodeH264RefMemMgmtCtrlOperations* pMemMgmtCtrlOperations;
} VkVideoEncodeH264ReferenceListsInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
referenceList0EntryCount
is the number of reference pictures in reference list L0 and is identical toStdVideoEncodeH264SliceHeader
::num_ref_idx_l0_active_minus1
+ 1. -
pReferenceList0Entries
is a pointer to an array ofreferenceList0EntryCount
VkVideoEncodeH264DpbSlotInfoEXT structures specifying the reference list L0 entries for the current picture. The entries provided must be ordered after all reference list L0 modification operations are applied (i.e. final list order). The entries provided must not reflect decoded picture marking operations in this frame that are applicable to references; the impact of such operations must be reflected in future frame encode commands. The slot index in each entry must match one of the slot indexes provided in thepReferenceSlots
of the parent VkVideoEncodeInfoKHR structure. -
referenceList1EntryCount
is the number of reference pictures in reference list L1 and is identical toStdVideoEncodeH264SliceHeader
::num_ref_idx_l1_active_minus1
+ 1. -
pReferenceList1Entries
is a pointer to an array ofreferenceList1EntryCount
VkVideoEncodeH264DpbSlotInfoEXT structures specifying the reference list L1 entries for the current picture. The entries provided must be ordered after all reference list L1 modification operations are applied (i.e. final list order). The entries provided must not reflect decoded picture marking operations in this frame that are applicable to references; the impact of such operations must be reflected in future frame encode commands. The slot index in each entry must match one of the slot indexes provided in thepReferenceSlots
of the parent VkVideoEncodeInfoKHR structure. -
pMemMgmtCtrlOperations
is a pointer to aStdVideoEncodeH264RefMemMgmtCtrlOperations
structure specifying reference lists modifications and decoded picture marking operations.
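The relationship between the entry counts, the slice header syntax elements, and the reported capability limits can be sketched as follows. Both helpers are hypothetical and not part of the Vulkan API; the limits are assumed to come from the maxPPictureL0ReferenceCount, maxBPictureL0ReferenceCount, and maxL1ReferenceCount members of VkVideoEncodeH264CapabilitiesEXT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: derives num_ref_idx_lX_active_minus1 from an entry
 * count, using count == num_ref_idx_lX_active_minus1 + 1. For an empty
 * list the syntax element is not meaningful; 0 is returned as a
 * placeholder. */
static uint8_t num_ref_idx_active_minus1(uint8_t entryCount)
{
    return (uint8_t)(entryCount > 0u ? entryCount - 1u : 0u);
}

/* Hypothetical helper: checks the list sizes against the capability
 * limits for the picture type being encoded. It does not check whether
 * list L1 is used at all for a non-B picture. */
static bool ref_list_counts_within_caps(uint8_t referenceList0EntryCount,
                                        uint8_t referenceList1EntryCount,
                                        bool isBPicture,
                                        uint8_t maxPPictureL0ReferenceCount,
                                        uint8_t maxBPictureL0ReferenceCount,
                                        uint8_t maxL1ReferenceCount)
{
    uint8_t maxL0 = isBPicture ? maxBPictureL0ReferenceCount
                               : maxPPictureL0ReferenceCount;
    return referenceList0EntryCount <= maxL0 &&
           referenceList1EntryCount <= maxL1ReferenceCount;
}
```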
The VkVideoEncodeH264EmitPictureParametersInfoEXT
structure is defined
as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264EmitPictureParametersInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t spsId;
VkBool32 emitSpsEnable;
uint32_t ppsIdEntryCount;
const uint8_t* ppsIdEntries;
} VkVideoEncodeH264EmitPictureParametersInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
spsId
is the H.264 SPS ID for the H.264 SPS to insert in the bitstream. The SPS ID must match the ID of one of the SPSs provided in pStdSPSs
of VkVideoEncodeH264SessionParametersAddInfoEXT. The SPS is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR. -
emitSpsEnable
enables the emitting of the SPS structure with id ofspsId
. -
ppsIdEntryCount
is the number of entries in the ppsIdEntries
array. If this parameter is 0
, no PPS entries are emitted in the bitstream. -
ppsIdEntries
is a pointer to an array of H.264 PPS IDs for the H.264 PPSs to insert in the bitstream. Each PPS ID must match the ID of one of the PPSs provided in pStdPPSs
of VkVideoEncodeH264SessionParametersAddInfoEXT, which identifies the PPS parameter set to insert in the bitstream. The PPS is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR.
42.14.5. Rate control
The VkVideoEncodeH264RateControlInfoEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264RateControlInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t gopFrameCount;
uint32_t idrPeriod;
uint32_t consecutiveBFrameCount;
VkVideoEncodeH264RateControlStructureEXT rateControlStructure;
uint8_t temporalLayerCount;
} VkVideoEncodeH264RateControlInfoEXT;
-
sType
is the type of this structure. -
pNext
isNULL
or a pointer to a structure extending this structure. -
gopFrameCount
is the number of frames contained within the group of pictures (GOP), starting from an intra frame and until the next intra frame. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX
, the GOP length is treated as infinite. -
idrPeriod
is the interval, in terms of number of frames, between two IDR frames. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX
, the IDR period is treated as infinite. -
consecutiveBFrameCount
is the number of consecutive B-frames between I- and/or P-frames within the GOP. -
rateControlStructure
is a VkVideoEncodeH264RateControlStructureEXT value specifying the expected encode stream reference structure, to aid in rate control calculations. -
temporalLayerCount
specifies the number of temporal layers enabled in the stream.
In order to provide H.264-specific stream rate control parameters, add a
VkVideoEncodeH264RateControlInfoEXT
structure to the pNext
chain
of the VkVideoEncodeRateControlInfoKHR structure in the pNext
chain of the VkVideoCodingControlInfoKHR structure passed to the
vkCmdControlVideoCodingKHR command.
The parameters from this structure act as a guidance for implementations to apply various rate control heuristics.
It is possible to infer the picture type to be used when encoding a frame,
on the basis of the values provided for consecutiveBFrameCount
,
idrPeriod
, and gopFrameCount
, but this inferred picture type
will not be used by implementations to override the picture type provided in
vkCmdEncodeVideoKHR.
Additionally, it is not required for the video session to be reset if the
inferred picture type does not match the actual picture type.
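As an illustration of the inference mentioned above, the following is a minimal sketch of how an application might derive an expected picture type from gopFrameCount, idrPeriod, and consecutiveBFrameCount. The exact inference rule is not defined in this section, so the sketch assumes a conventional display-order pattern in which each GOP starts with an I-frame and every run of consecutiveBFrameCount B-frames is followed by a P-frame. The enum and the function name are hypothetical and are not part of the Vulkan API.

```c
#include <stdint.h>

/* Hypothetical picture-type labels used only by this sketch. */
typedef enum ExamplePictureType {
    EXAMPLE_PICTURE_TYPE_UNKNOWN = 0, /* a parameter of 0 leaves the choice to the implementation */
    EXAMPLE_PICTURE_TYPE_IDR,
    EXAMPLE_PICTURE_TYPE_I,
    EXAMPLE_PICTURE_TYPE_P,
    EXAMPLE_PICTURE_TYPE_B
} ExamplePictureType;

/* Infers a picture type for the frame at display-order index frameIndex.
 * Assumption: GOP positions are counted from the most recent IDR frame. */
static ExamplePictureType example_infer_picture_type(uint32_t frameIndex,
                                                     uint32_t gopFrameCount,
                                                     uint32_t idrPeriod,
                                                     uint32_t consecutiveBFrameCount)
{
    /* 0 means "implementation chooses", so nothing can be inferred. */
    if (gopFrameCount == 0 || idrPeriod == 0) {
        return EXAMPLE_PICTURE_TYPE_UNKNOWN;
    }

    /* UINT32_MAX means an infinite period: only frame 0 starts it. */
    uint32_t sinceIdr = (idrPeriod == UINT32_MAX) ? frameIndex
                                                  : frameIndex % idrPeriod;
    if (sinceIdr == 0) {
        return EXAMPLE_PICTURE_TYPE_IDR;
    }

    uint32_t posInGop = (gopFrameCount == UINT32_MAX) ? sinceIdr
                                                      : sinceIdr % gopFrameCount;
    if (posInGop == 0) {
        return EXAMPLE_PICTURE_TYPE_I;
    }

    /* Assumed pattern: B ... B P, repeated; 64-bit to avoid overflow on +1. */
    uint64_t cycle = (uint64_t)consecutiveBFrameCount + 1u;
    return (posInGop % cycle == 0) ? EXAMPLE_PICTURE_TYPE_P
                                   : EXAMPLE_PICTURE_TYPE_B;
}
```

Whatever rule is used, the result is only a hint: the picture type actually provided in vkCmdEncodeVideoKHR takes precedence.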
The rateControlStructure
in VkVideoEncodeH264RateControlInfoEXT
specifies one of the following video stream reference structures as a hint
for the rate control implementation:
// Provided by VK_EXT_video_encode_h264
typedef enum VkVideoEncodeH264RateControlStructureEXT {
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT = 0,
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_FLAT_EXT = 1,
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_DYADIC_EXT = 2,
} VkVideoEncodeH264RateControlStructureEXT;
-
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT
specifies a reference structure unknown at the time of stream rate control configuration. -
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_FLAT_EXT
specifies a flat reference structure. -
VK_VIDEO_ENCODE_H264_RATE_CONTROL_STRUCTURE_DYADIC_EXT
specifies a dyadic reference structure.
The VkVideoEncodeH264RateControlLayerInfoEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264RateControlLayerInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t temporalLayerId;
VkBool32 useInitialRcQp;
VkVideoEncodeH264QpEXT initialRcQp;
VkBool32 useMinQp;
VkVideoEncodeH264QpEXT minQp;
VkBool32 useMaxQp;
VkVideoEncodeH264QpEXT maxQp;
VkBool32 useMaxFrameSize;
VkVideoEncodeH264FrameSizeEXT maxFrameSize;
} VkVideoEncodeH264RateControlLayerInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
temporalLayerId
specifies the H.264 temporal layer ID of the video coding layer that settings provided in this structure and its parent VkVideoEncodeRateControlLayerInfoKHR structure apply to. -
useInitialRcQp
indicates whether the values within initialRcQp
should be used by the implementation. -
initialRcQp
provides the QP values for each picture type, to be used in rate control calculations at the start of video encode operations on a newly-created video session, or immediately after a session reset. These values are ignored when VkVideoEncodeRateControlInfoKHR::rateControlMode
is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR
. -
useMinQp
indicates whether the values within minQp
should be used by the implementation. When it is set to VK_FALSE
, the implementation ignores the values in minQp
and chooses suitable values. -
minQp
provides the lower bound on the QP values for each picture type, to be used in rate control calculations. -
useMaxQp
indicates whether the values within maxQp
should be used by the implementation. When it is set to VK_FALSE
, the implementation ignores the values in maxQp
and chooses suitable values. -
maxQp
provides the upper bound on the QP values for each picture type, to be used in rate control calculations. -
useMaxFrameSize
indicates whether the values within maxFrameSize
should be used by the implementation. -
maxFrameSize
provides the upper bound on the encoded frame size for each picture type. The implementation does not guarantee that the encoded frame sizes will be within the specified limits; however, these limits may be used as a guide in rate control calculations. If enabled and not set properly, the maxQp
limit may prevent the implementation from respecting the maxFrameSize
limit.
H.264-specific per-layer rate control parameters must be specified by
adding a VkVideoEncodeH264RateControlLayerInfoEXT
structure to the
pNext
chain of each VkVideoEncodeRateControlLayerInfoKHR
structure in a call to the vkCmdControlVideoCodingKHR command, when the
command buffer context has an active video encode H.264 session.
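Before filling in the per-layer QP bounds, an application may wish to check that they are mutually consistent. The following is a minimal sketch of such a check. It assumes, which this section does not state as a requirement, that the lower bound should not exceed the upper bound for any picture type when both useMinQp and useMaxQp are enabled. ExampleQp is a hypothetical stand-in that mirrors the qpI, qpP, and qpB members, so that the sketch does not depend on any particular header version.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the qpI/qpP/qpB members; not the Vulkan type. */
typedef struct ExampleQp {
    int32_t qpI;
    int32_t qpP;
    int32_t qpB;
} ExampleQp;

/* Returns true when the enabled bounds do not contradict each other.
 * A bound whose use flag is false is ignored, mirroring the statement that
 * the implementation ignores minQp/maxQp when the matching flag is VK_FALSE. */
static bool example_qp_bounds_consistent(bool useMinQp, ExampleQp minQp,
                                         bool useMaxQp, ExampleQp maxQp)
{
    if (!useMinQp || !useMaxQp) {
        return true; /* at most one bound is active, so nothing can conflict */
    }
    return minQp.qpI <= maxQp.qpI &&
           minQp.qpP <= maxQp.qpP &&
           minQp.qpB <= maxQp.qpB;
}
```

A check of this kind only guards against contradictory input; it does not address the interaction between maxQp and maxFrameSize described above.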
The VkVideoEncodeH264QpEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264QpEXT {
int32_t qpI;
int32_t qpP;
int32_t qpB;
} VkVideoEncodeH264QpEXT;
-
qpI
is the QP to be used for I-frames. -
qpP
is the QP to be used for P-frames. -
qpB
is the QP to be used for B-frames.
The VkVideoEncodeH264FrameSizeEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h264
typedef struct VkVideoEncodeH264FrameSizeEXT {
uint32_t frameISize;
uint32_t framePSize;
uint32_t frameBSize;
} VkVideoEncodeH264FrameSizeEXT;
-
frameISize
is the size in bytes to be used for I-frames. -
framePSize
is the size in bytes to be used for P-frames. -
frameBSize
is the size in bytes to be used for B-frames.
42.15. Encode H.265
The VK_EXT_video_encode_h265
extension adds H.265 codec-specific
structures/types needed to support H.265 video encoding.
Unless otherwise noted, all references to the H.265 specification are to the
2013 edition published by the ITU-T, dated April 2013.
This specification is available at https://www.itu.int/rec/T-REC-H.265.
Note
Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.
42.15.1. H.265 encode profile
An H.265 encode profile is specified by including the
VkVideoEncodeH265ProfileInfoEXT structure in the pNext
chain of
the VkVideoProfileInfoKHR structure when
VkVideoProfileInfoKHR::videoCodecOperation
is
VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT
.
The VkVideoEncodeH265ProfileInfoEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265ProfileInfoEXT {
VkStructureType sType;
const void* pNext;
StdVideoH265ProfileIdc stdProfileIdc;
} VkVideoEncodeH265ProfileInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
stdProfileIdc
is a StdVideoH265ProfileIdc
value specifying the H.265 codec profile IDC.
42.15.2. Capabilities
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with
pVideoProfile->videoCodecOperation
specified as
VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_EXT
, the
VkVideoEncodeH265CapabilitiesEXT structure must be included in the
pNext
chain of the VkVideoCapabilitiesKHR structure to retrieve
more capabilities specific to H.265 video encoding.
The VkVideoEncodeH265CapabilitiesEXT
structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265CapabilitiesEXT {
VkStructureType sType;
void* pNext;
VkVideoEncodeH265CapabilityFlagsEXT flags;
VkVideoEncodeH265InputModeFlagsEXT inputModeFlags;
VkVideoEncodeH265OutputModeFlagsEXT outputModeFlags;
VkVideoEncodeH265CtbSizeFlagsEXT ctbSizes;
VkVideoEncodeH265TransformBlockSizeFlagsEXT transformBlockSizes;
uint8_t maxPPictureL0ReferenceCount;
uint8_t maxBPictureL0ReferenceCount;
uint8_t maxL1ReferenceCount;
uint8_t maxSubLayersCount;
uint8_t minLog2MinLumaCodingBlockSizeMinus3;
uint8_t maxLog2MinLumaCodingBlockSizeMinus3;
uint8_t minLog2MinLumaTransformBlockSizeMinus2;
uint8_t maxLog2MinLumaTransformBlockSizeMinus2;
uint8_t minMaxTransformHierarchyDepthInter;
uint8_t maxMaxTransformHierarchyDepthInter;
uint8_t minMaxTransformHierarchyDepthIntra;
uint8_t maxMaxTransformHierarchyDepthIntra;
uint8_t maxDiffCuQpDeltaDepth;
uint8_t minMaxNumMergeCand;
uint8_t maxMaxNumMergeCand;
} VkVideoEncodeH265CapabilitiesEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
flags
is a bitmask of VkVideoEncodeH265CapabilityFlagBitsEXT describing supported encoding tools. -
inputModeFlags
is a bitmask of VkVideoEncodeH265InputModeFlagBitsEXT describing the command buffer input granularities/modes supported by the implementation. -
outputModeFlags
is a bitmask of VkVideoEncodeH265OutputModeFlagBitsEXT describing the output (bitstream size reporting) granularities/modes supported by the implementation. -
ctbSizes
is a bitmask of VkVideoEncodeH265CtbSizeFlagBitsEXT describing the supported CTB sizes. -
transformBlockSizes
is a bitmask of VkVideoEncodeH265TransformBlockSizeFlagBitsEXT describing the supported transform block sizes. -
maxPPictureL0ReferenceCount
reports the maximum number of reference pictures the implementation supports in the reference list L0 for P pictures. -
maxBPictureL0ReferenceCount
reports the maximum number of reference pictures the implementation supports in the reference list L0 for B pictures. The reported value is 0
if encoding of B pictures is not supported. -
maxL1ReferenceCount
reports the maximum number of reference pictures the implementation supports in the reference list L1 if encoding of B pictures is supported. The reported value is 0
if encoding of B pictures is not supported. -
maxSubLayersCount
reports the maximum number of sublayers. -
minLog2MinLumaCodingBlockSizeMinus3
reports the minimum value that may be set for log2_min_luma_coding_block_size_minus3 in StdVideoH265SequenceParameterSet. -
maxLog2MinLumaCodingBlockSizeMinus3
reports the maximum value that may be set for log2_min_luma_coding_block_size_minus3 in StdVideoH265SequenceParameterSet. -
minLog2MinLumaTransformBlockSizeMinus2
reports the minimum value that may be set for log2_min_luma_transform_block_size_minus2 in StdVideoH265SequenceParameterSet. -
maxLog2MinLumaTransformBlockSizeMinus2
reports the maximum value that may be set for log2_min_luma_transform_block_size_minus2 in StdVideoH265SequenceParameterSet. -
minMaxTransformHierarchyDepthInter
reports the minimum value that may be set for max_transform_hierarchy_depth_inter in StdVideoH265SequenceParameterSet. -
maxMaxTransformHierarchyDepthInter
reports the maximum value that may be set for max_transform_hierarchy_depth_inter in StdVideoH265SequenceParameterSet. -
minMaxTransformHierarchyDepthIntra
reports the minimum value that may be set for max_transform_hierarchy_depth_intra in StdVideoH265SequenceParameterSet. -
maxMaxTransformHierarchyDepthIntra
reports the maximum value that may be set for max_transform_hierarchy_depth_intra in StdVideoH265SequenceParameterSet. -
maxDiffCuQpDeltaDepth
reports the maximum value that may be set for diff_cu_qp_delta_depth in StdVideoH265PictureParameterSet. -
minMaxNumMergeCand
reports the minimum value that may be set for MaxNumMergeCand in StdVideoEncodeH265SliceHeader. -
maxMaxNumMergeCand
reports the maximum value that may be set for MaxNumMergeCand in StdVideoEncodeH265SliceHeader.
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265CapabilityFlagsEXT;
VkVideoEncodeH265CapabilityFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH265CapabilityFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::flags
, indicating the encoding
tools supported, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265CapabilityFlagBitsEXT {
VK_VIDEO_ENCODE_H265_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_CAPABILITY_SCALING_LISTS_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_CAPABILITY_SAMPLE_ADAPTIVE_OFFSET_ENABLED_BIT_EXT = 0x00000004,
VK_VIDEO_ENCODE_H265_CAPABILITY_PCM_ENABLE_BIT_EXT = 0x00000008,
VK_VIDEO_ENCODE_H265_CAPABILITY_SPS_TEMPORAL_MVP_ENABLED_BIT_EXT = 0x00000010,
VK_VIDEO_ENCODE_H265_CAPABILITY_HRD_COMPLIANCE_BIT_EXT = 0x00000020,
VK_VIDEO_ENCODE_H265_CAPABILITY_INIT_QP_MINUS26_BIT_EXT = 0x00000040,
VK_VIDEO_ENCODE_H265_CAPABILITY_LOG2_PARALLEL_MERGE_LEVEL_MINUS2_BIT_EXT = 0x00000080,
VK_VIDEO_ENCODE_H265_CAPABILITY_SIGN_DATA_HIDING_ENABLED_BIT_EXT = 0x00000100,
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_ENABLED_BIT_EXT = 0x00000200,
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_DISABLED_BIT_EXT = 0x00000400,
VK_VIDEO_ENCODE_H265_CAPABILITY_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_BIT_EXT = 0x00000800,
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_BIT_EXT = 0x00001000,
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_BIPRED_BIT_EXT = 0x00002000,
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT = 0x00004000,
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSQUANT_BYPASS_ENABLED_BIT_EXT = 0x00008000,
VK_VIDEO_ENCODE_H265_CAPABILITY_ENTROPY_CODING_SYNC_ENABLED_BIT_EXT = 0x00010000,
VK_VIDEO_ENCODE_H265_CAPABILITY_DEBLOCKING_FILTER_OVERRIDE_ENABLED_BIT_EXT = 0x00020000,
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_FRAME_BIT_EXT = 0x00040000,
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_PER_TILE_BIT_EXT = 0x00080000,
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_SLICE_BIT_EXT = 0x00100000,
VK_VIDEO_ENCODE_H265_CAPABILITY_SLICE_SEGMENT_CTB_COUNT_BIT_EXT = 0x00200000,
VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_EXT = 0x00400000,
VK_VIDEO_ENCODE_H265_CAPABILITY_DEPENDENT_SLICE_SEGMENT_BIT_EXT = 0x00800000,
VK_VIDEO_ENCODE_H265_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT = 0x01000000,
VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_EXT = 0x02000000,
} VkVideoEncodeH265CapabilityFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_CAPABILITY_SEPARATE_COLOUR_PLANE_BIT_EXT
reports if enabling separate_colour_plane_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SCALING_LISTS_BIT_EXT
reports if enabling scaling_list_enabled_flag and sps_scaling_list_data_present_flag in StdVideoH265SpsFlags, or enabling pps_scaling_list_data_present_flag in StdVideoH265PpsFlags are supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SAMPLE_ADAPTIVE_OFFSET_ENABLED_BIT_EXT
reports if enabling sample_adaptive_offset_enabled_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_PCM_ENABLE_BIT_EXT
reports if enabling pcm_enable_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SPS_TEMPORAL_MVP_ENABLED_BIT_EXT
reports if enabling sps_temporal_mvp_enabled_flag in StdVideoH265SpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_HRD_COMPLIANCE_BIT_EXT
reports if the implementation guarantees generating an HRD-compliant bitstream if nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag, or sub_pic_hrd_params_present_flag are enabled in StdVideoH265HrdFlags, or vui_hrd_parameters_present_flag is enabled in StdVideoH265SpsVuiFlags. -
VK_VIDEO_ENCODE_H265_CAPABILITY_INIT_QP_MINUS26_BIT_EXT
reports if setting non-zero init_qp_minus26 in StdVideoH265PictureParameterSet is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_LOG2_PARALLEL_MERGE_LEVEL_MINUS2_BIT_EXT
reports if setting non-zero value for log2_parallel_merge_level_minus2 in StdVideoH265PictureParameterSet is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SIGN_DATA_HIDING_ENABLED_BIT_EXT
reports if enabling sign_data_hiding_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_ENABLED_BIT_EXT
reports if enabling transform_skip_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_DISABLED_BIT_EXT
reports if disabling transform_skip_enabled_flag in StdVideoH265PpsFlags is supported. Implementations must report at least one of VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_ENABLED_BIT_EXT
and VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSFORM_SKIP_DISABLED_BIT_EXT
as supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_BIT_EXT
reports if enabling pps_slice_chroma_qp_offsets_present_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_BIT_EXT
reports if enabling weighted_pred_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_BIPRED_BIT_EXT
reports if enabling weighted_bipred_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_WEIGHTED_PRED_NO_TABLE_BIT_EXT
reports that when weighted_pred_flag or weighted_bipred_flag in StdVideoH265PpsFlags are enabled, the implementation is able to internally decide syntax for pred_weight_table. -
VK_VIDEO_ENCODE_H265_CAPABILITY_TRANSQUANT_BYPASS_ENABLED_BIT_EXT
reports if enabling transquant_bypass_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_ENTROPY_CODING_SYNC_ENABLED_BIT_EXT
reports if enabling entropy_coding_sync_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_DEBLOCKING_FILTER_OVERRIDE_ENABLED_BIT_EXT
reports if enabling deblocking_filter_override_enabled_flag in StdVideoH265PpsFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_FRAME_BIT_EXT
reports if encoding multiple tiles per frame is supported. If not set, the implementation is only able to encode a single tile for each frame. -
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_PER_TILE_BIT_EXT
reports if encoding multiple slices per tile is supported. If not set, the implementation is only able to encode a single slice for each tile. -
VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILE_PER_SLICE_BIT_EXT
reports if encoding multiple tiles per slice is supported. If not set, the implementation is only able to encode a single tile for each slice. -
VK_VIDEO_ENCODE_H265_CAPABILITY_SLICE_SEGMENT_CTB_COUNT_BIT_EXT
reports support for configuring VkVideoEncodeH265NaluSliceSegmentInfoEXT::ctbCount
and slice_segment_address in StdVideoEncodeH265SliceSegmentHeader for each slice segment in a frame with multiple slice segments. If not supported, the implementation decides the number of CTBs in each slice segment based on VkVideoEncodeH265VclFrameInfoEXT::naluSliceSegmentEntryCount
. -
VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_EXT
reports that each slice segment in a frame with a single or multiple tiles per slice may begin or finish at any offset in a CTB row. If not supported, all slice segments in such a frame must begin at the start of a CTB row (and hence each slice segment must finish at the end of a CTB row). Also reports that each slice segment in a frame with multiple slices per tile may begin or finish at any offset within the enclosing tile’s CTB row. If not supported, slice segments in such a frame must begin at the start of the enclosing tile’s CTB row (and hence each slice segment must finish at the end of the enclosing tile’s CTB row). -
VK_VIDEO_ENCODE_H265_CAPABILITY_DEPENDENT_SLICE_SEGMENT_BIT_EXT
reports if enabling dependent_slice_segment_flag in StdVideoEncodeH265SliceHeaderFlags is supported. -
VK_VIDEO_ENCODE_H265_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_EXT
reports that when VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_PER_TILE_BIT_EXT
is supported and a frame is encoded with multiple slices, the implementation allows encoding each slice segment with a different StdVideoEncodeH265SliceSegmentHeader
::slice_type. If not supported, all slice segments of the frame must be encoded with the same slice_type
which corresponds to the picture type of the frame. For example, all slice segments of a P-frame would be encoded as P-slices. -
VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_EXT
reports support for using a B frame as L1 reference.
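Because enabling and disabling transform_skip_enabled_flag are reported through two separate capability bits, an application has to test the bit that matches the value it intends to set. The following is a minimal sketch of that test. The two constants copy the bit values listed in the enumeration above; the EXAMPLE_ names and the function are hypothetical and exist only so that the sketch compiles without the extension headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from VkVideoEncodeH265CapabilityFlagBitsEXT as listed above. */
#define EXAMPLE_CAP_TRANSFORM_SKIP_ENABLED_BIT  0x00000200u
#define EXAMPLE_CAP_TRANSFORM_SKIP_DISABLED_BIT 0x00000400u

/* Returns true if the implementation reports support for encoding with
 * transform_skip_enabled_flag set to the requested value. */
static bool example_transform_skip_value_supported(uint32_t capabilityFlags,
                                                   bool transformSkipEnabledFlag)
{
    uint32_t requiredBit = transformSkipEnabledFlag
                               ? EXAMPLE_CAP_TRANSFORM_SKIP_ENABLED_BIT
                               : EXAMPLE_CAP_TRANSFORM_SKIP_DISABLED_BIT;
    return (capabilityFlags & requiredBit) != 0;
}
```

Here capabilityFlags stands for the value of VkVideoEncodeH265CapabilitiesEXT::flags. The same pattern applies to the other capability bits, which gate a single flag value rather than a pair.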
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265InputModeFlagsEXT;
VkVideoEncodeH265InputModeFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH265InputModeFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::inputModeFlags
, indicating the
command buffer input granularities supported by the implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265InputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_INPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH265InputModeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_INPUT_MODE_FRAME_BIT_EXT
indicates that a single command buffer must at least encode an entire frame. Any non-VCL NALUs must be encoded using the same command buffer as the frame if VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT
indicates that a single command buffer must at least encode a single slice segment. Any non-VCL NALUs must be encoded using the same command buffer as the first slice segment of the frame if VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_INPUT_MODE_NON_VCL_BIT_EXT
indicates that a single command buffer may encode a non-VCL NALU by itself.
An implementation must support at least one of
VK_VIDEO_ENCODE_H265_INPUT_MODE_FRAME_BIT_EXT
or
VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT
.
If VK_VIDEO_ENCODE_H265_INPUT_MODE_SLICE_SEGMENT_BIT_EXT
is not
supported, the following two additional restrictions apply for frames
encoded with multiple slice segments.
First, all frame slice segments must have the same pReferenceFinalLists.
Second, the order in which slice segments appear in
VkVideoEncodeH265VclFrameInfoEXT::pNaluSliceSegmentEntries
or in
the command buffer must match the placement order of the slice segments in
the frame.
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265OutputModeFlagsEXT;
VkVideoEncodeH265OutputModeFlagsEXT
is a bitmask type for setting a
mask of zero or more VkVideoEncodeH265OutputModeFlagBitsEXT.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::outputModeFlags
, indicating
the minimum bitstream generation commands that must be included between
each vkCmdBeginVideoCodingKHR and vkCmdEndVideoCodingKHR pair
(henceforth simply begin/end pair), are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265OutputModeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_FRAME_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_SLICE_SEGMENT_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT = 0x00000004,
} VkVideoEncodeH265OutputModeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_FRAME_BIT_EXT
indicates that calls to generate all NALUs of a frame must be included within a single begin/end pair. Any non-VCL NALUs must be encoded within the same begin/end pair if VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_SLICE_SEGMENT_BIT_EXT
indicates that each begin/end pair must encode at least one slice segment. Any non-VCL NALUs must be encoded within the same begin/end pair as the first slice segment of the frame if VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT
is not supported. -
VK_VIDEO_ENCODE_H265_OUTPUT_MODE_NON_VCL_BIT_EXT
indicates that each begin/end pair may encode only a non-VCL NALU by itself. An implementation must support at least one of VK_VIDEO_ENCODE_H265_OUTPUT_MODE_FRAME_BIT_EXT
or VK_VIDEO_ENCODE_H265_OUTPUT_MODE_SLICE_SEGMENT_BIT_EXT
.
A single begin/end pair must not encode more than a single frame.
The bitstreams of NALUs generated within a single begin/end pair are written continuously into the same bitstream buffer (any padding between the NALUs must be compliant with the H.265 standard).
The supported input modes must be coarser than or equal to the supported output modes. For example, it is illegal to report that slice segment input is supported when only frame output is supported.
An implementation must report one of the following combinations of input/output modes:
-
Input: Frame, Output: Frame
-
Input: Frame, Output: Frame and Non-VCL
-
Input: Frame, Output: Slice Segment
-
Input: Frame, Output: Slice Segment and Non-VCL
-
Input: Slice Segment, Output: Slice Segment
-
Input: Slice Segment, Output: Slice Segment and Non-VCL
-
Input: Frame and Non-VCL, Output: Frame and Non-VCL
-
Input: Frame and Non-VCL, Output: Slice Segment and Non-VCL
-
Input: Slice Segment and Non-VCL, Output: Slice Segment and Non-VCL
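The list of permitted combinations above can be expressed as a small lookup table. The following is a minimal sketch that checks a reported pair of inputModeFlags and outputModeFlags values against the nine combinations. It assumes each entry in the list denotes an exact set of bits rather than a minimum set. The EXAMPLE_MODE_ constants reuse the bit values shown in the two enumerations above, which are the same for input and output modes; the names themselves and the function are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values shared by the input-mode and output-mode enumerations above. */
#define EXAMPLE_MODE_FRAME         0x00000001u
#define EXAMPLE_MODE_SLICE_SEGMENT 0x00000002u
#define EXAMPLE_MODE_NON_VCL       0x00000004u

/* Returns true if (inputModes, outputModes) is one of the nine listed pairs. */
static bool example_mode_combination_allowed(uint32_t inputModes,
                                             uint32_t outputModes)
{
    static const struct {
        uint32_t input;
        uint32_t output;
    } allowed[] = {
        { EXAMPLE_MODE_FRAME, EXAMPLE_MODE_FRAME },
        { EXAMPLE_MODE_FRAME, EXAMPLE_MODE_FRAME | EXAMPLE_MODE_NON_VCL },
        { EXAMPLE_MODE_FRAME, EXAMPLE_MODE_SLICE_SEGMENT },
        { EXAMPLE_MODE_FRAME, EXAMPLE_MODE_SLICE_SEGMENT | EXAMPLE_MODE_NON_VCL },
        { EXAMPLE_MODE_SLICE_SEGMENT, EXAMPLE_MODE_SLICE_SEGMENT },
        { EXAMPLE_MODE_SLICE_SEGMENT, EXAMPLE_MODE_SLICE_SEGMENT | EXAMPLE_MODE_NON_VCL },
        { EXAMPLE_MODE_FRAME | EXAMPLE_MODE_NON_VCL,
          EXAMPLE_MODE_FRAME | EXAMPLE_MODE_NON_VCL },
        { EXAMPLE_MODE_FRAME | EXAMPLE_MODE_NON_VCL,
          EXAMPLE_MODE_SLICE_SEGMENT | EXAMPLE_MODE_NON_VCL },
        { EXAMPLE_MODE_SLICE_SEGMENT | EXAMPLE_MODE_NON_VCL,
          EXAMPLE_MODE_SLICE_SEGMENT | EXAMPLE_MODE_NON_VCL },
    };

    for (size_t i = 0; i < sizeof allowed / sizeof allowed[0]; ++i) {
        if (allowed[i].input == inputModes && allowed[i].output == outputModes) {
            return true;
        }
    }
    return false;
}
```

A table is used instead of a derived rule so that the sketch stays a direct transcription of the list and does not encode an interpretation of "coarser".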
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265CtbSizeFlagsEXT;
VkVideoEncodeH265CtbSizeFlagsEXT
is a bitmask type for setting a mask
of zero or more VkVideoEncodeH265CtbSizeFlagBitsEXT.
Implementations must set at least one of
VkVideoEncodeH265CtbSizeFlagBitsEXT
.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::ctbSizes
, indicating the CTB
sizes supported by the implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265CtbSizeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_CTB_SIZE_16_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_CTB_SIZE_32_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_CTB_SIZE_64_BIT_EXT = 0x00000004,
} VkVideoEncodeH265CtbSizeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_CTB_SIZE_16_BIT_EXT
specifies that a CTB size of 16x16 is supported. -
VK_VIDEO_ENCODE_H265_CTB_SIZE_32_BIT_EXT
specifies that a CTB size of 32x32 is supported. -
VK_VIDEO_ENCODE_H265_CTB_SIZE_64_BIT_EXT
specifies that a CTB size of 64x64 is supported.
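An application typically inspects ctbSizes to choose the CTB size it will signal in the sequence parameter set. The following is a minimal sketch that returns the largest supported CTB size in luma samples. Preferring the largest size is an arbitrary policy chosen for illustration. The constants copy the bit values listed above, and the EXAMPLE_ names and the function are hypothetical.

```c
#include <stdint.h>

/* Bit values copied from VkVideoEncodeH265CtbSizeFlagBitsEXT as listed above. */
#define EXAMPLE_CTB_SIZE_16_BIT 0x00000001u
#define EXAMPLE_CTB_SIZE_32_BIT 0x00000002u
#define EXAMPLE_CTB_SIZE_64_BIT 0x00000004u

/* Returns the largest CTB size (16, 32, or 64) present in ctbSizes,
 * or 0 if none of the known bits is set. */
static uint32_t example_largest_ctb_size(uint32_t ctbSizes)
{
    if (ctbSizes & EXAMPLE_CTB_SIZE_64_BIT) {
        return 64u;
    }
    if (ctbSizes & EXAMPLE_CTB_SIZE_32_BIT) {
        return 32u;
    }
    if (ctbSizes & EXAMPLE_CTB_SIZE_16_BIT) {
        return 16u;
    }
    return 0u; /* not expected: implementations must set at least one bit */
}
```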
// Provided by VK_EXT_video_encode_h265
typedef VkFlags VkVideoEncodeH265TransformBlockSizeFlagsEXT;
VkVideoEncodeH265TransformBlockSizeFlagsEXT
is a bitmask type for
setting a mask of zero or more
VkVideoEncodeH265TransformBlockSizeFlagBitsEXT.
Implementations must set at least one of
VkVideoEncodeH265TransformBlockSizeFlagBitsEXT
.
Bits which may be set in
VkVideoEncodeH265CapabilitiesEXT::transformBlockSizes
,
indicating the transform block sizes supported by the implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265TransformBlockSizeFlagBitsEXT {
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_4_BIT_EXT = 0x00000001,
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_8_BIT_EXT = 0x00000002,
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_16_BIT_EXT = 0x00000004,
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_32_BIT_EXT = 0x00000008,
} VkVideoEncodeH265TransformBlockSizeFlagBitsEXT;
-
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_4_BIT_EXT
specifies that a transform block size of 4x4 is supported. -
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_8_BIT_EXT
specifies that a transform block size of 8x8 is supported. -
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_16_BIT_EXT
specifies that a transform block size of 16x16 is supported. -
VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_32_BIT_EXT
specifies that a transform block size of 32x32 is supported.
42.15.3. Encoder H.265 Video Session Parameters Object
When creating a Video Session Parameters object, add a
VkVideoEncodeH265SessionParametersCreateInfoEXT structure to the
pNext
chain of the VkVideoSessionParametersCreateInfoKHR
structure passed to vkCreateVideoSessionParametersKHR in order to
specify the H.265-specific video encoder session parameters.
The VkVideoEncodeH265SessionParametersCreateInfoEXT
structure is
defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersCreateInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t maxStdVPSCount;
uint32_t maxStdSPSCount;
uint32_t maxStdPPSCount;
const VkVideoEncodeH265SessionParametersAddInfoEXT* pParametersAddInfo;
} VkVideoEncodeH265SessionParametersCreateInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
maxStdVPSCount
is the maximum number of entries of type StdVideoH265VideoParameterSet
within VkVideoSessionParametersKHR
. -
maxStdSPSCount
is the maximum number of entries of type StdVideoH265SequenceParameterSet
within VkVideoSessionParametersKHR
. -
maxStdPPSCount
is the maximum number of entries of type StdVideoH265PictureParameterSet
within VkVideoSessionParametersKHR
. -
pParametersAddInfo
is NULL
or a pointer to a VkVideoEncodeH265SessionParametersAddInfoEXT structure specifying the video session parameters to add upon creation of this object.
When a VkVideoSessionParametersKHR object contains
maxStdVPSCount
StdVideoH265VideoParameterSet
entries, no
additional StdVideoH265VideoParameterSet
entries can be added to it,
and VK_ERROR_TOO_MANY_OBJECTS
will be returned if an attempt is made
to add these entries.
When a VkVideoSessionParametersKHR object contains
maxStdSPSCount
StdVideoH265SequenceParameterSet
entries, no
additional StdVideoH265SequenceParameterSet
entries can be added to it,
and VK_ERROR_TOO_MANY_OBJECTS
will be returned if an attempt is made
to add these entries.
When a VkVideoSessionParametersKHR object contains
maxStdPPSCount
StdVideoH265PictureParameterSet
entries, no
additional StdVideoH265PictureParameterSet
entries can be added to it,
and VK_ERROR_TOO_MANY_OBJECTS
will be returned if an attempt is made
to add these entries.
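The three capacity rules above share one shape: an add request fails once it would exceed the maximum fixed at creation time. The following is a minimal sketch of how an application could track its own counts and predict that outcome before submitting an update. ExampleResult and the function are hypothetical; the sketch does not call any Vulkan entry point, and its error value is a placeholder rather than the numeric value of VK_ERROR_TOO_MANY_OBJECTS.

```c
#include <stdint.h>

/* Hypothetical result codes used only by this sketch. */
typedef enum ExampleResult {
    EXAMPLE_SUCCESS = 0,
    EXAMPLE_ERROR_TOO_MANY_OBJECTS = -1 /* placeholder, not the Vulkan value */
} ExampleResult;

/* Predicts whether addCount more entries of one parameter-set type fit into
 * an object that already holds currentCount entries and was created with
 * maxCount (for example maxStdVPSCount, maxStdSPSCount, or maxStdPPSCount). */
static ExampleResult example_check_parameter_add(uint32_t currentCount,
                                                 uint32_t maxCount,
                                                 uint32_t addCount)
{
    /* Guard against inconsistent bookkeeping before subtracting. */
    if (currentCount > maxCount) {
        return EXAMPLE_ERROR_TOO_MANY_OBJECTS;
    }
    /* Written as a subtraction so the comparison cannot overflow. */
    if (addCount > maxCount - currentCount) {
        return EXAMPLE_ERROR_TOO_MANY_OBJECTS;
    }
    return EXAMPLE_SUCCESS;
}
```

The check would be applied once per parameter-set type, since each of VPS, SPS, and PPS has its own limit.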
The VkVideoEncodeH265SessionParametersAddInfoEXT
structure is defined
as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersAddInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t stdVPSCount;
const StdVideoH265VideoParameterSet* pStdVPSs;
uint32_t stdSPSCount;
const StdVideoH265SequenceParameterSet* pStdSPSs;
uint32_t stdPPSCount;
const StdVideoH265PictureParameterSet* pStdPPSs;
} VkVideoEncodeH265SessionParametersAddInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
stdVPSCount
is the number of VPS elements in pStdVPSs
. -
pStdVPSs
is a pointer to an array of stdVPSCount
StdVideoH265VideoParameterSet
structures representing H.265 video parameter sets. -
stdSPSCount
is the number of SPS elements in pStdSPSs
. -
pStdSPSs
is a pointer to an array of stdSPSCount
StdVideoH265SequenceParameterSet
structures representing H.265 sequence parameter sets. -
stdPPSCount
is the number of PPS elements in pStdPPSs
. -
pStdPPSs
is a pointer to an array of stdPPSCount
StdVideoH265PictureParameterSet
structures representing H.265 picture parameter sets.
42.15.4. Frame Encoding
In order to encode a frame, add a VkVideoEncodeH265VclFrameInfoEXT
structure to the pNext
chain of the VkVideoEncodeInfoKHR
structure passed to the vkCmdEncodeVideoKHR command.
The VkVideoEncodeH265VclFrameInfoEXT structure representing a frame encode operation is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265VclFrameInfoEXT {
VkStructureType sType;
const void* pNext;
const VkVideoEncodeH265ReferenceListsInfoEXT* pReferenceFinalLists;
uint32_t naluSliceSegmentEntryCount;
const VkVideoEncodeH265NaluSliceSegmentInfoEXT* pNaluSliceSegmentEntries;
const StdVideoEncodeH265PictureInfo* pCurrentPictureInfo;
} VkVideoEncodeH265VclFrameInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
pReferenceFinalLists
is NULL
or a pointer to a VkVideoEncodeH265ReferenceListsInfoEXT structure specifying the reference lists to be used for the current picture. -
naluSliceSegmentEntryCount
is the number of slice segment NALUs in the frame. -
pNaluSliceSegmentEntries
is a pointer to an array of VkVideoEncodeH265NaluSliceSegmentInfoEXT structures specifying the division of the current picture into slice segments and the properties of these slice segments. -
pCurrentPictureInfo
is a pointer to a StdVideoEncodeH265PictureInfo
structure specifying the syntax and other codec-specific information from the H.265 specification, associated with this picture.
The VkVideoEncodeH265NaluSliceSegmentInfoEXT structure representing a slice segment is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265NaluSliceSegmentInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t ctbCount;
const VkVideoEncodeH265ReferenceListsInfoEXT* pReferenceFinalLists;
const StdVideoEncodeH265SliceSegmentHeader* pSliceSegmentHeaderStd;
} VkVideoEncodeH265NaluSliceSegmentInfoEXT;
-
sType
is the type of this structure. -
pNext
is NULL
or a pointer to a structure extending this structure. -
ctbCount
is the number of CTBs in this slice segment. -
pReferenceFinalLists
is NULL
or a pointer to a VkVideoEncodeH265ReferenceListsInfoEXT structure specifying the reference lists to be used for the current slice segment. If pReferenceFinalLists
is not NULL
, these reference lists override the reference lists provided in VkVideoEncodeH265VclFrameInfoEXT::pReferenceFinalLists
. -
pSliceSegmentHeaderStd
is a pointer to a StdVideoEncodeH265SliceSegmentHeader
structure specifying the slice segment header for the current slice segment.
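When a frame is split into several slice segments, each ctbCount has to be chosen so that the segments together cover the picture. The following is a minimal sketch that computes a ctbCount for each slice segment such that every segment starts at the beginning of a CTB row, which is the layout required when VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_EXT is not supported. It assumes a single tile per frame and an even distribution of rows, with any remainder rows given to the first segments. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/* Returns the number of CTBs for slice segment segmentIndex when a picture of
 * codedWidth x codedHeight luma samples is split into segmentCount row-aligned
 * slice segments using square CTBs of ctbSize samples.
 * Returns 0 if the inputs are invalid or the split is not possible. */
static uint32_t example_row_aligned_ctb_count(uint32_t codedWidth,
                                              uint32_t codedHeight,
                                              uint32_t ctbSize,
                                              uint32_t segmentCount,
                                              uint32_t segmentIndex)
{
    if (ctbSize == 0 || segmentCount == 0 || segmentIndex >= segmentCount) {
        return 0u;
    }

    /* Picture size in CTBs, rounded up to cover partial CTBs at the edges.
     * Inputs are assumed small enough that the additions do not overflow. */
    uint32_t widthInCtbs = (codedWidth + ctbSize - 1u) / ctbSize;
    uint32_t heightInCtbs = (codedHeight + ctbSize - 1u) / ctbSize;

    /* Every segment needs at least one full CTB row. */
    if (segmentCount > heightInCtbs) {
        return 0u;
    }

    /* Distribute rows evenly; the first (heightInCtbs % segmentCount)
     * segments receive one extra row. */
    uint32_t rows = heightInCtbs / segmentCount;
    if (segmentIndex < heightInCtbs % segmentCount) {
        rows += 1u;
    }
    return rows * widthInCtbs;
}
```

For a 1920x1080 picture with 64x64 CTBs the picture is 30 CTBs wide and 17 CTB rows high, so four segments would receive 5, 4, 4, and 4 rows respectively.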
The VkVideoEncodeH265DpbSlotInfoEXT structure, representing a reconstructed picture that is being used as a reference picture, is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265DpbSlotInfoEXT {
VkStructureType sType;
const void* pNext;
int8_t slotIndex;
const StdVideoEncodeH265ReferenceInfo* pStdReferenceInfo;
} VkVideoEncodeH265DpbSlotInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- slotIndex is the DPB Slot index for this picture.
- pStdReferenceInfo is a pointer to a StdVideoEncodeH265ReferenceInfo structure specifying the syntax and other codec-specific information from the H.265 specification, associated with this reference picture.
The VkVideoEncodeH265ReferenceListsInfoEXT structure representing reference lists is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265ReferenceListsInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t referenceList0EntryCount;
const VkVideoEncodeH265DpbSlotInfoEXT* pReferenceList0Entries;
uint8_t referenceList1EntryCount;
const VkVideoEncodeH265DpbSlotInfoEXT* pReferenceList1Entries;
const StdVideoEncodeH265ReferenceModifications* pReferenceModifications;
} VkVideoEncodeH265ReferenceListsInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- referenceList0EntryCount is the number of reference pictures in reference list L0 and is identical to StdVideoEncodeH265SliceSegmentHeader::num_ref_idx_l0_active_minus1 + 1.
- pReferenceList0Entries is a pointer to an array of referenceList0EntryCount VkVideoEncodeH265DpbSlotInfoEXT structures specifying the reference list L0 entries for the current picture.
- referenceList1EntryCount is the number of reference pictures in reference list L1 and is identical to StdVideoEncodeH265SliceSegmentHeader::num_ref_idx_l1_active_minus1 + 1.
- pReferenceList1Entries is a pointer to an array of referenceList1EntryCount VkVideoEncodeH265DpbSlotInfoEXT structures specifying the reference list L1 entries for the current picture.
- pReferenceModifications is a pointer to a StdVideoEncodeH265ReferenceModifications structure specifying reference list modifications.
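The relationship between the entry counts and the slice segment header fields can be checked with a small helper. This is a sketch only: the two structures are hypothetical stand-ins carrying just the relevant fields, and the check deliberately ignores slice types that do not use one or both lists, which the text above does not address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring only the fields used in this check. */
typedef struct SliceHeaderSketch {
    uint8_t num_ref_idx_l0_active_minus1;
    uint8_t num_ref_idx_l1_active_minus1;
} SliceHeaderSketch;

typedef struct RefListCountsSketch {
    uint8_t referenceList0EntryCount;
    uint8_t referenceList1EntryCount;
} RefListCountsSketch;

/* True when entryCount == numRefIdxActiveMinus1 + 1.
   The arithmetic is done in unsigned int so that 255 + 1 does not wrap. */
static bool entry_count_matches(uint8_t entryCount, uint8_t numRefIdxActiveMinus1)
{
    return (unsigned int)entryCount == (unsigned int)numRefIdxActiveMinus1 + 1u;
}

/* True when both L0 and L1 entry counts agree with the header fields. */
static bool reference_lists_match_header(const RefListCountsSketch* lists,
                                         const SliceHeaderSketch* hdr)
{
    return entry_count_matches(lists->referenceList0EntryCount,
                               hdr->num_ref_idx_l0_active_minus1)
        && entry_count_matches(lists->referenceList1EntryCount,
                               hdr->num_ref_idx_l1_active_minus1);
}
```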
The VkVideoEncodeH265EmitPictureParametersInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265EmitPictureParametersInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t vpsId;
uint8_t spsId;
VkBool32 emitVpsEnable;
VkBool32 emitSpsEnable;
uint32_t ppsIdEntryCount;
const uint8_t* ppsIdEntries;
} VkVideoEncodeH265EmitPictureParametersInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- vpsId is the H.265 VPS ID for the H.265 VPS to insert in the bitstream. The VPS ID must match the VPS provided in vpsStd of VkVideoEncodeH265SessionParametersCreateInfoEXT. This is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR.
- spsId is the H.265 SPS ID for the H.265 SPS to insert in the bitstream. The SPS ID must match one of the IDs of the SPS(s) provided in pStdSPSs of VkVideoEncodeH265SessionParametersCreateInfoEXT to identify the SPS parameter set to insert in the bitstream. This is retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR.
- emitVpsEnable enables the emitting of the VPS structure with the ID vpsId.
- emitSpsEnable enables the emitting of the SPS structure with the ID spsId.
- ppsIdEntryCount is the number of entries in ppsIdEntries. If this parameter is 0, no PPS entries are emitted in the bitstream.
- ppsIdEntries is a pointer to an array of H.265 PPS IDs identifying the H.265 PPSs to insert in the bitstream. Each PPS ID must match one of the IDs of the PPS(s) provided in pStdPPSs of VkVideoEncodeH265SessionParametersCreateInfoEXT to identify the PPS parameter set to insert in the bitstream. These are retrieved from the VkVideoSessionParametersKHR object provided in VkVideoBeginCodingInfoKHR.
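The requirement that every requested PPS ID match a PPS known to the session parameters object lends itself to an application-side pre-check. The sketch below assumes the application keeps its own array of the PPS IDs it supplied at session parameters creation; that bookkeeping array and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if `id` occurs among the first `count` entries of `ids`. */
static bool id_in_set(const uint8_t* ids, size_t count, uint8_t id)
{
    for (size_t i = 0; i < count; ++i) {
        if (ids[i] == id) {
            return true;
        }
    }
    return false;
}

/* Returns true if every requested PPS ID is among the known PPS IDs.
   A ppsIdEntryCount of 0 requests no PPS emission and is trivially valid. */
static bool pps_ids_are_known(const uint8_t* ppsIdEntries, uint32_t ppsIdEntryCount,
                              const uint8_t* knownPpsIds, size_t knownCount)
{
    for (uint32_t i = 0; i < ppsIdEntryCount; ++i) {
        if (!id_in_set(knownPpsIds, knownCount, ppsIdEntries[i])) {
            return false;
        }
    }
    return true;
}
```

The same id_in_set helper could be reused for the spsId check against the SPS IDs the application supplied.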
42.15.5. Rate control
The VkVideoEncodeH265RateControlInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265RateControlInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t gopFrameCount;
uint32_t idrPeriod;
uint32_t consecutiveBFrameCount;
VkVideoEncodeH265RateControlStructureEXT rateControlStructure;
uint8_t subLayerCount;
} VkVideoEncodeH265RateControlInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- gopFrameCount is the number of frames contained within the group of pictures (GOP), starting from an intra frame and until the next intra frame. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX, the GOP length is treated as infinite.
- idrPeriod is the interval, in terms of number of frames, between two IDR frames. If it is set to 0, the implementation chooses a suitable value. If it is set to UINT32_MAX, the IDR period is treated as infinite.
- consecutiveBFrameCount is the number of consecutive B-frames between I- and/or P-frames within the GOP.
- rateControlStructure is a VkVideoEncodeH265RateControlStructureEXT value specifying the expected encode stream reference structure, to aid in rate control calculations.
- subLayerCount specifies the number of sub layers enabled in the stream.
In order to provide H.265-specific stream rate control parameters, add a VkVideoEncodeH265RateControlInfoEXT structure to the pNext chain of the VkVideoEncodeRateControlInfoKHR structure in the pNext chain of the VkVideoCodingControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command.
The parameters from this structure act as a guidance for implementations to apply various rate control heuristics.
It is possible to infer the picture type to be used when encoding a frame, on the basis of the values provided for consecutiveBFrameCount, idrPeriod, and gopFrameCount, but this inferred picture type will not be used by implementations to override the picture type provided in vkCmdEncodeVideoKHR.
Additionally, it is not required for the video session to be reset if the inferred picture type does not match the actual picture type.
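The text does not define how a picture type would be inferred, so the following is only one plausible scheme, stated under explicit assumptions: frames are indexed in display order starting at 0, a new GOP starts at every IDR frame, and each run of consecutiveBFrameCount B-frames is followed by a P-frame. The enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum PictureTypeSketch {
    PICTURE_TYPE_UNKNOWN,
    PICTURE_TYPE_IDR,
    PICTURE_TYPE_I,
    PICTURE_TYPE_P,
    PICTURE_TYPE_B
} PictureTypeSketch;

/* Infers a picture type from the rate control hints (assumed scheme, see above). */
static PictureTypeSketch infer_picture_type(uint32_t frameIndex,
                                            uint32_t gopFrameCount,
                                            uint32_t idrPeriod,
                                            uint32_t consecutiveBFrameCount)
{
    /* 0 means the implementation chooses the value, so nothing can be inferred. */
    if (gopFrameCount == 0 || idrPeriod == 0) {
        return PICTURE_TYPE_UNKNOWN;
    }

    /* UINT32_MAX means "infinite": only frame 0 starts such a period. */
    uint32_t sinceIdr = (idrPeriod == UINT32_MAX) ? frameIndex : frameIndex % idrPeriod;
    if (sinceIdr == 0) {
        return PICTURE_TYPE_IDR;
    }

    uint32_t posInGop = (gopFrameCount == UINT32_MAX) ? sinceIdr : sinceIdr % gopFrameCount;
    if (posInGop == 0) {
        return PICTURE_TYPE_I;
    }

    /* 64-bit arithmetic avoids wrap-around when consecutiveBFrameCount is UINT32_MAX. */
    uint64_t subGopLength = (uint64_t)consecutiveBFrameCount + 1u;
    return (posInGop % subGopLength == 0) ? PICTURE_TYPE_P : PICTURE_TYPE_B;
}
```

With gopFrameCount = 6, idrPeriod = 12, and consecutiveBFrameCount = 2, this scheme yields IDR B B P B B I B B P B B and then repeats. As stated above, such an inferred type never overrides the picture type the application provides.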
Possible values of VkVideoEncodeH265RateControlInfoEXT::rateControlStructure, specifying a video stream reference structure as a hint for the rate control implementation, are:
// Provided by VK_EXT_video_encode_h265
typedef enum VkVideoEncodeH265RateControlStructureEXT {
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT = 0,
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_FLAT_EXT = 1,
VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_DYADIC_EXT = 2,
} VkVideoEncodeH265RateControlStructureEXT;
- VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_UNKNOWN_EXT specifies a reference structure unknown at the time of stream rate control configuration.
- VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_FLAT_EXT specifies a flat reference structure.
- VK_VIDEO_ENCODE_H265_RATE_CONTROL_STRUCTURE_DYADIC_EXT specifies a dyadic reference structure.
The VkVideoEncodeH265RateControlLayerInfoEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265RateControlLayerInfoEXT {
VkStructureType sType;
const void* pNext;
uint8_t temporalId;
VkBool32 useInitialRcQp;
VkVideoEncodeH265QpEXT initialRcQp;
VkBool32 useMinQp;
VkVideoEncodeH265QpEXT minQp;
VkBool32 useMaxQp;
VkVideoEncodeH265QpEXT maxQp;
VkBool32 useMaxFrameSize;
VkVideoEncodeH265FrameSizeEXT maxFrameSize;
} VkVideoEncodeH265RateControlLayerInfoEXT;
- sType is the type of this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- temporalId specifies the H.265 temporal ID of the video coding layer that settings provided in this structure and its parent VkVideoEncodeRateControlLayerInfoKHR structure apply to.
- useInitialRcQp indicates whether the values within initialRcQp should be used by the implementation.
- initialRcQp provides the QP values for each picture type, to be used in rate control calculations at the start of video encode operations on a newly-created video session, or immediately after a session reset. These values are ignored when VkVideoEncodeRateControlInfoKHR::rateControlMode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
- useMinQp indicates whether the values within minQp should be used by the implementation. When it is set to VK_FALSE, the implementation ignores the values in minQp and chooses suitable values.
- minQp provides the lower bound on the QP values for each picture type, to be used in rate control calculations.
- useMaxQp indicates whether the values within maxQp should be used by the implementation. When it is set to VK_FALSE, the implementation ignores the values in maxQp and chooses suitable values.
- maxQp provides the upper bound on the QP values for each picture type, to be used in rate control calculations.
- useMaxFrameSize indicates whether the values within maxFrameSize should be used by the implementation.
- maxFrameSize provides the upper bound on the encoded frame size for each picture type. The implementation does not guarantee that the encoded frame sizes will be within the specified limits; however, these limits may be used as a guide in rate control calculations. If enabled and not set properly, the maxQp limit may prevent the implementation from respecting the maxFrameSize limit.
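The interaction of useMinQp, minQp, useMaxQp, and maxQp can be sketched as a per-picture-type clamp. The types below are hypothetical stand-ins for the relevant fields, and the fallback bounds of 0 and 51 are an assumption standing in for the "suitable values" an implementation would choose when a bound is disabled; an actual implementation is free to choose differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring only the fields used here. */
typedef struct QpSketch {
    int32_t qpI;
    int32_t qpP;
    int32_t qpB;
} QpSketch;

typedef struct LayerQpBoundsSketch {
    bool useMinQp;
    QpSketch minQp;
    bool useMaxQp;
    QpSketch maxQp;
} LayerQpBoundsSketch;

static int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamps a candidate QP triple to the bounds enabled for this layer.
   Disabled bounds fall back to an assumed default range of [0, 51]. */
static QpSketch apply_qp_bounds(const LayerQpBoundsSketch* layer, QpSketch candidate)
{
    const QpSketch fallbackMin = { 0, 0, 0 };
    const QpSketch fallbackMax = { 51, 51, 51 };
    const QpSketch lo = layer->useMinQp ? layer->minQp : fallbackMin;
    const QpSketch hi = layer->useMaxQp ? layer->maxQp : fallbackMax;

    QpSketch out;
    out.qpI = clamp_i32(candidate.qpI, lo.qpI, hi.qpI);
    out.qpP = clamp_i32(candidate.qpP, lo.qpP, hi.qpP);
    out.qpB = clamp_i32(candidate.qpB, lo.qpB, hi.qpB);
    return out;
}
```

A low maxQp keeps quality high but limits how far the encoder can shrink a frame, which is why an ill-chosen maxQp can conflict with maxFrameSize.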
H.265-specific per-layer rate control parameters must be specified by adding a VkVideoEncodeH265RateControlLayerInfoEXT structure to the pNext chain of each VkVideoEncodeRateControlLayerInfoKHR structure in a call to the vkCmdControlVideoCodingKHR command, when the command buffer context has an active video encode H.265 session.
The VkVideoEncodeH265QpEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265QpEXT {
int32_t qpI;
int32_t qpP;
int32_t qpB;
} VkVideoEncodeH265QpEXT;
- qpI is the QP to be used for I-frames.
- qpP is the QP to be used for P-frames.
- qpB is the QP to be used for B-frames.
The VkVideoEncodeH265FrameSizeEXT structure is defined as:
// Provided by VK_EXT_video_encode_h265
typedef struct VkVideoEncodeH265FrameSizeEXT {
uint32_t frameISize;
uint32_t framePSize;
uint32_t frameBSize;
} VkVideoEncodeH265FrameSizeEXT;
- frameISize is the size in bytes to be used for I-frames.
- framePSize is the size in bytes to be used for P-frames.
- frameBSize is the size in bytes to be used for B-frames.