A set of functions that aid in oneDNN debugging and profiling.
◆ DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC
#define DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC 8u
◆ version_t
◆ status
Status values returned by the library functions.
Enumerator | Description
---|---
success | The operation was successful.
out_of_memory | The operation failed due to an out-of-memory condition.
invalid_arguments | The operation failed because of incorrect function arguments.
unimplemented | The operation failed because requested functionality is not implemented.
iterator_ends | Primitive iterator passed over last primitive descriptor.
runtime_error | Primitive or engine failed on execution.
not_required | Queried element is not required for given primitive.
◆ cpu_isa
CPU instruction set flags.
Enumerator | Description
---|---
all | Any ISA (excepting those listed as initial support)
sse41 | Intel Streaming SIMD Extensions 4.1 (Intel SSE4.1)
avx | Intel Advanced Vector Extensions (Intel AVX)
avx2 | Intel Advanced Vector Extensions 2 (Intel AVX2)
avx512_mic | Intel Advanced Vector Extensions 512 (Intel AVX-512) subset for Intel Xeon Phi processors x200 Series.
avx512_mic_4ops | Intel AVX-512 subset for Intel Xeon Phi processors 7235, 7285, 7295 Series.
avx512_core | Intel AVX-512 subset for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_vnni | Intel AVX-512 and Intel Deep Learning Boost (Intel DL Boost) support for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_bf16 | Intel AVX-512, Intel DL Boost and bfloat16 support for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_amx | Intel AVX-512, Intel DL Boost and bfloat16 support and Intel AMX with 8-bit integer and bfloat16 support (initial support)
avx2_vnni | Intel AVX2 and Intel Deep Learning Boost (Intel DL Boost) support.
◆ cpu_isa_hints
CPU ISA hints flags.
Enumerator | Description
---|---
no_hints | No hints (use default features)
prefer_ymm | Prefer to exclusively use Ymm registers for computations.
◆ dnnl_cpu_isa_t
CPU instruction set flags.
Enumerator | Description
---|---
dnnl_cpu_isa_all | Any ISA (excepting those listed as initial support)
dnnl_cpu_isa_sse41 | Intel Streaming SIMD Extensions 4.1 (Intel SSE4.1)
dnnl_cpu_isa_avx | Intel Advanced Vector Extensions (Intel AVX)
dnnl_cpu_isa_avx2 | Intel Advanced Vector Extensions 2 (Intel AVX2)
dnnl_cpu_isa_avx512_mic | Intel Advanced Vector Extensions 512 (Intel AVX-512) subset for Intel Xeon Phi processors x200 Series.
dnnl_cpu_isa_avx512_mic_4ops | Intel AVX-512 subset for Intel Xeon Phi processors 7235, 7285, 7295 Series.
dnnl_cpu_isa_avx512_core | Intel AVX-512 subset for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_vnni | Intel AVX-512 and Intel Deep Learning Boost (Intel DL Boost) support for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_bf16 | Intel AVX-512, Intel DL Boost and bfloat16 support for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_amx | Intel AVX-512, Intel DL Boost and bfloat16 support and Intel AMX with 8-bit integer and bfloat16 support (initial support)
dnnl_cpu_isa_avx2_vnni | Intel AVX2 and Intel Deep Learning Boost (Intel DL Boost) support.
◆ dnnl_cpu_isa_hints_t
CPU ISA hints flags.
Enumerator | Description
---|---
dnnl_cpu_isa_no_hints | No hints (use default features)
dnnl_cpu_isa_prefer_ymm | Prefer to exclusively use Ymm registers for computations.
◆ dnnl_set_verbose()
Configures verbose output to stdout.
- Note
- Enabling verbose output affects performance. This setting overrides the DNNL_VERBOSE environment variable.
- Parameters
- level: Verbosity level:
  - 0: no verbose output (default),
  - 1: primitive information at execution,
  - 2: primitive information at creation and execution.
- Returns
- dnnl_invalid_arguments/dnnl::status::invalid_arguments if the level value is invalid, and dnnl_success/dnnl::status::success on success.
◆ dnnl_set_jit_dump()
◆ dnnl_version()
Returns library version information.
- Returns
- Pointer to a constant structure containing
- major: major version number,
- minor: minor version number,
- patch: patch release number,
- hash: git commit hash.
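As a rough illustration of how the four fields listed above could be combined into a printable string, the sketch below uses a local stand-in structure. The names `version_info` and `format_version` are hypothetical and the field types are assumed; only the field names come from this page, and the real structure is the one returned by dnnl_version().

```cpp
#include <string>

// Hypothetical stand-in for the structure described above. The field
// types are assumptions; only the field names come from the documentation.
struct version_info {
    int major;
    int minor;
    int patch;
    const char *hash;
};

// Builds "major.minor.patch (hash)" from a pointer to the structure,
// following the pointer-to-constant return style described above.
std::string format_version(const version_info *v) {
    return std::to_string(v->major) + "." + std::to_string(v->minor) + "."
            + std::to_string(v->patch) + " (" + v->hash + ")";
}
```

With the real library, the pointer returned by dnnl_version() would take the place of the hand-built structure.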
◆ dnnl_set_jit_profiling_flags()
dnnl_status_t DNNL_API dnnl_set_jit_profiling_flags(unsigned flags)
◆ dnnl_set_jit_profiling_jitdumpdir()
dnnl_status_t DNNL_API dnnl_set_jit_profiling_jitdumpdir(const char *dir)
Sets JIT dump output path.
Applies only on Linux, and only when the profiling flags have the DNNL_JIT_PROFILE_LINUX_PERF bit set.
After the first JIT kernel is generated, the jitdump output is placed into a temporary directory created using the mkdtemp template 'dir/.debug/jit/dnnl.XXXXXX'.
- See also
- Profiling oneDNN Performance
- Note
- This setting overrides the JITDUMPDIR environment variable. If JITDUMPDIR is not set and this function is never called, the path defaults to HOME. Passing NULL reverts the value to the default.
- The directory is accessed only when the first JIT kernel is being created. JIT profiling is disabled if any error occurs while accessing or creating this directory.
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success if the output directory was set correctly and an error status otherwise.
- dnnl_unimplemented/dnnl::status::unimplemented on Windows.
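The precedence described in the note above (explicit setting, then JITDUMPDIR, then HOME) and the mkdtemp template can be modelled with a small sketch. Both helper names are hypothetical and this is not the library's implementation; the empty-string fallback when HOME is unset is an assumption made only to keep the sketch total.

```cpp
#include <string>

// Hypothetical helper modelling the documented precedence for the base
// directory: value set through the API, then JITDUMPDIR, then HOME.
// Each argument may be nullptr to mean "not set".
std::string resolve_jitdump_base(const char *api_dir,
        const char *env_jitdumpdir, const char *env_home) {
    if (api_dir) return api_dir;
    if (env_jitdumpdir) return env_jitdumpdir;
    // Assumption: fall back to an empty string if HOME is also unset.
    return env_home ? env_home : "";
}

// Builds the mkdtemp template quoted above. mkdtemp would replace the
// trailing XXXXXX with a unique suffix when the directory is created.
std::string jitdump_template(const std::string &base) {
    return base + "/.debug/jit/dnnl.XXXXXX";
}
```

The environment values are passed in as arguments rather than read with getenv so that the precedence rule can be checked in isolation.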
◆ dnnl_set_max_cpu_isa()
Sets the maximal ISA the library can dispatch to on the CPU.
See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values accepted by the C and C++ API functions respectively.
This function takes effect only once and returns an error on subsequent calls. It should also be invoked before any other oneDNN API call; otherwise it may return an error.
This function overrides the DNNL_MAX_CPU_ISA environment variable. The environment variable can be set to the desired maximal ISA name in upper case and with the dnnl_cpu_isa prefix removed. For example: DNNL_MAX_CPU_ISA=AVX2.
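The naming rule for the environment variable value can be sketched as a string transformation. The helper name `max_cpu_isa_env_value` is hypothetical, and the sketch assumes the removed prefix includes the trailing underscore, as the AVX2 example suggests.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derives the DNNL_MAX_CPU_ISA value from a C enum
// name by dropping the "dnnl_cpu_isa_" prefix and upper-casing the rest.
std::string max_cpu_isa_env_value(const std::string &enum_name) {
    const std::string prefix = "dnnl_cpu_isa_";
    // Strip the prefix only if the name actually starts with it.
    std::string s = enum_name.compare(0, prefix.size(), prefix) == 0
            ? enum_name.substr(prefix.size())
            : enum_name;
    for (char &c : s)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return s;
}
```

For instance, the name dnnl_cpu_isa_avx2 maps to the value AVX2 used in the example above.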
- Note
- The ISAs are only partially ordered:
- SSE41 < AVX < AVX2,
- AVX2 < AVX512_MIC < AVX512_MIC_4OPS,
- AVX2 < AVX512_CORE < AVX512_CORE_VNNI < AVX512_CORE_BF16 < AVX512_CORE_AMX,
- AVX2 < AVX2_VNNI.
- See also
- CPU Dispatcher Control for more details
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success on success, and dnnl_invalid_arguments/dnnl::status::invalid_arguments if the isa parameter is invalid or the ISA cannot be changed at this time.
- dnnl_unimplemented/dnnl::status::unimplemented if the feature was disabled at build time (see Build Options for more details).
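The partial ordering listed in the note for this function can be modelled as a tree in which every ISA has one immediate predecessor. The sketch below encodes only the four chains from that note, using a local stand-in enum. The names `isa`, `predecessor`, `isa_at_most` and `isa_comparable` are hypothetical, and this is not how the library's dispatcher is implemented.

```cpp
// Local stand-in enum; the names follow the note above and the numeric
// values are unrelated to the library's enumerators.
enum class isa {
    sse41, avx, avx2, avx512_mic, avx512_mic_4ops, avx512_core,
    avx512_core_vnni, avx512_core_bf16, avx512_core_amx, avx2_vnni
};

// Immediate predecessor of each ISA in the documented chains.
// sse41 is the minimum and maps to itself.
isa predecessor(isa x) {
    switch (x) {
        case isa::sse41: return isa::sse41;
        case isa::avx: return isa::sse41;
        case isa::avx2: return isa::avx;
        case isa::avx512_mic: return isa::avx2;
        case isa::avx512_mic_4ops: return isa::avx512_mic;
        case isa::avx512_core: return isa::avx2;
        case isa::avx512_core_vnni: return isa::avx512_core;
        case isa::avx512_core_bf16: return isa::avx512_core_vnni;
        case isa::avx512_core_amx: return isa::avx512_core_bf16;
        case isa::avx2_vnni: return isa::avx2;
    }
    return x;
}

// True if a <= b in the partial order, that is, walking b's chain of
// predecessors reaches a.
bool isa_at_most(isa a, isa b) {
    for (isa cur = b;; cur = predecessor(cur)) {
        if (cur == a) return true;
        if (cur == isa::sse41) return false; // reached the root without a match
    }
}

// True if the two ISAs are ordered relative to each other.
bool isa_comparable(isa a, isa b) {
    return isa_at_most(a, b) || isa_at_most(b, a);
}
```

Under this model, avx512_mic and avx512_core are incomparable, as are avx2_vnni and avx512_core_vnni, which is what "only partially ordered" means here.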
◆ dnnl_get_effective_cpu_isa()
◆ dnnl_set_cpu_isa_hints()
◆ dnnl_get_cpu_isa_hints()
◆ set_verbose()
inline status dnnl::set_verbose(int level)
Configures verbose output to stdout.
- Note
- Enabling verbose output affects performance. This setting overrides the DNNL_VERBOSE environment variable.
- Parameters
- level: Verbosity level:
  - 0: no verbose output (default),
  - 1: primitive information at execution,
  - 2: primitive information at creation and execution.
- Returns
- dnnl_invalid_arguments/dnnl::status::invalid_arguments if the level value is invalid, and dnnl_success/dnnl::status::success on success.
◆ version()
Returns library version information.
- Returns
- Pointer to a constant structure containing
- major: major version number,
- minor: minor version number,
- patch: patch release number,
- hash: git commit hash.
◆ set_jit_dump()
inline status dnnl::set_jit_dump(int enable)
◆ set_jit_profiling_flags()
inline status dnnl::set_jit_profiling_flags(unsigned flags)
◆ set_jit_profiling_jitdumpdir()
inline status dnnl::set_jit_profiling_jitdumpdir(const std::string &dir)
Sets JIT dump output path.
Applies only on Linux, and only when the profiling flags have the DNNL_JIT_PROFILE_LINUX_PERF bit set.
After the first JIT kernel is generated, the jitdump output is placed into a temporary directory created using the mkdtemp template 'dir/.debug/jit/dnnl.XXXXXX'.
- See also
- Profiling oneDNN Performance
- Note
- This setting overrides the JITDUMPDIR environment variable. If JITDUMPDIR is not set and this function is never called, the path defaults to HOME. In the C API, passing NULL reverts the value to the default.
- The directory is accessed only when the first JIT kernel is being created. JIT profiling is disabled if any error occurs while accessing or creating this directory.
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success if the output directory was set correctly and an error status otherwise.
- dnnl_unimplemented/dnnl::status::unimplemented on Windows.
◆ set_max_cpu_isa()
Sets the maximal ISA the library can dispatch to on the CPU.
See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values accepted by the C and C++ API functions respectively.
This function takes effect only once and returns an error on subsequent calls. It should also be invoked before any other oneDNN API call; otherwise it may return an error.
This function overrides the DNNL_MAX_CPU_ISA environment variable. The environment variable can be set to the desired maximal ISA name in upper case and with the dnnl_cpu_isa prefix removed. For example: DNNL_MAX_CPU_ISA=AVX2.
- Note
- The ISAs are only partially ordered:
- SSE41 < AVX < AVX2,
- AVX2 < AVX512_MIC < AVX512_MIC_4OPS,
- AVX2 < AVX512_CORE < AVX512_CORE_VNNI < AVX512_CORE_BF16 < AVX512_CORE_AMX,
- AVX2 < AVX2_VNNI.
- See also
- CPU Dispatcher Control for more details
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success on success, and dnnl_invalid_arguments/dnnl::status::invalid_arguments if the isa parameter is invalid or the ISA cannot be changed at this time.
- dnnl_unimplemented/dnnl::status::unimplemented if the feature was disabled at build time (see Build Options for more details).
◆ get_effective_cpu_isa()
inline cpu_isa dnnl::get_effective_cpu_isa()
◆ set_cpu_isa_hints()
◆ get_cpu_isa_hints()