CUDA Python 11.6.0 Release notes

Released on January 12, 2022

Highlights

  • Support CUDA Toolkit 11.6

  • Support Profiler APIs

  • Support Graphics APIs (EGL, GL, VDPAU)

  • Support changing default stream

  • Relaxed primitive interoperability

Default stream

To change the default stream to the per-thread default stream (PTDS), set the following environment variable before execution:

export CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=1

When set to 1, the default stream is the per-thread default stream. When set to 0 (the default), it is the legacy default stream. The default may change to 1 in a future release. See Stream Synchronization Behavior for an explanation of the legacy and per-thread default streams.
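
The same setting can also be applied from inside a Python script instead of the shell. The following is a minimal sketch, not an official recipe: it assumes the variable is read no later than when the bindings are first initialized, so it sets the variable before importing anything from the `cuda` package. The helper name `enable_ptds` is hypothetical, and the import is guarded so the script still runs where CUDA Python is not installed.

```python
import os

# Environment variable named in these release notes.
PTDS_VAR = "CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM"


def enable_ptds():
    """Hypothetical helper: request the per-thread default stream."""
    os.environ[PTDS_VAR] = "1"


# Set the variable first. Assumption: it must be in place before the
# CUDA Python bindings are imported or initialized.
enable_ptds()

try:
    from cuda import cudart  # imported only after the variable is set
except ImportError:
    cudart = None  # CUDA Python is not installed in this environment
```

Setting the variable in the shell, as shown above, is still the simplest option, because it takes effect regardless of import order.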

Primitive interoperability

APIs that accept a class wrapping a primitive value now also accept the underlying primitive value directly.

Example 1: Structure member handles interoperability.

>>> waitParams = cuda.CUstreamMemOpWaitValueParams_st()
>>> waitParams.value64 = 1
>>> waitParams.value64
<cuuint64_t 1>
>>> waitParams.value64 = cuda.cuuint64_t(2)
>>> waitParams.value64
<cuuint64_t 2>

Example 2: Function signature handles interoperability.

>>> cudart.cudaStreamQuery(cudart.cudaStreamNonBlocking)
(<cudaError_t.cudaSuccess: 0>,)
>>> cudart.cudaStreamQuery(cudart.cudaStream_t(cudart.cudaStreamNonBlocking))
(<cudaError_t.cudaSuccess: 0>,)
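
To illustrate the idea behind both examples without requiring a GPU, here is a toy pure-Python model of the behavior. It is only a sketch: the classes `U64` and `WaitParams` are hypothetical stand-ins and do not reflect how the CUDA Python bindings are actually implemented. The setter normalizes its input, so a plain integer and a wrapper instance are both stored as a wrapper.

```python
class U64:
    """Hypothetical stand-in for a wrapper class such as cuuint64_t."""

    def __init__(self, value=0):
        # int() accepts a plain int or another U64 (through __int__).
        self._value = int(value)

    def __int__(self):
        return self._value

    def __repr__(self):
        return f"<U64 {self._value}>"


class WaitParams:
    """Hypothetical stand-in for a structure with a wrapped member."""

    def __init__(self):
        self._value64 = U64()

    @property
    def value64(self):
        return self._value64

    @value64.setter
    def value64(self, value):
        # Normalize the input: ints and U64 instances both become U64.
        self._value64 = U64(value)


params = WaitParams()
params.value64 = 1           # assign the primitive value
print(params.value64)        # <U64 1>
params.value64 = U64(2)      # assign the wrapper
print(params.value64)        # <U64 2>
```

In this model, code that reads `value64` always receives the same wrapper type, whichever form the caller assigned. That matches the behavior shown in Example 1.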

Limitations

CUDA Functions Not Supported in this Release

  • Symbol APIs

    • cudaGraphExecMemcpyNodeSetParamsFromSymbol

    • cudaGraphExecMemcpyNodeSetParamsToSymbol

    • cudaGraphAddMemcpyNodeToSymbol

    • cudaGraphAddMemcpyNodeFromSymbol

    • cudaGraphMemcpyNodeSetParamsToSymbol

    • cudaGraphMemcpyNodeSetParamsFromSymbol

    • cudaMemcpyToSymbol

    • cudaMemcpyFromSymbol

    • cudaMemcpyToSymbolAsync

    • cudaMemcpyFromSymbolAsync

    • cudaGetSymbolAddress

    • cudaGetSymbolSize

    • cudaGetFuncBySymbol

  • Launch Options

    • cudaLaunchKernel

    • cudaLaunchCooperativeKernel

    • cudaLaunchCooperativeKernelMultiDevice

  • cudaSetValidDevices

  • cudaVDPAUSetVDPAUDevice

Note

Deprecated APIs are removed from tracking.