It is often useful to know how much of an application's run time is spent executing oneDNN primitives and which of those primitives take the most time. oneDNN verbose mode traces the execution of oneDNN primitives and collects basic statistics such as execution time and primitive parameters.
The behavior is controlled with the DNNL_VERBOSE environment variable or the dnnl_set_verbose function.
Value | Behavior |
---|---|
0 | no verbose output (default) |
1 | primitive information at execution |
2 | primitive information at creation and execution |
The function setting takes precedence over the environment variable.
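The precedence rule can be modeled with a small, self-contained sketch. The helper `effective_verbose_level` below is hypothetical and is not part of oneDNN; it only mirrors the table and the rule stated above. Treating a missing or non-numeric environment value as level 0 is an assumption of this sketch, not documented library behavior.

```cpp
#include <cstdlib>

// Hypothetical helper, not a oneDNN API: computes the verbosity level that
// would be in effect under the documented precedence rule.
//   api_set    - true if the level was set through the function call
//   api_level  - the level passed to that call (ignored if api_set is false)
//   env_value  - raw DNNL_VERBOSE value, or nullptr if the variable is unset
int effective_verbose_level(bool api_set, int api_level, const char *env_value) {
    // A level set through the function takes precedence over the environment.
    if (api_set) return api_level;
    // Otherwise fall back to the environment variable, if present.
    // Assumption: non-numeric text is treated as 0 (std::atoi behavior).
    if (env_value != nullptr && *env_value != '\0') return std::atoi(env_value);
    // Default from the table: no verbose output.
    return 0;
}
```

In an application, the environment value would typically come from `std::getenv("DNNL_VERBOSE")`, and the programmatic setting would be made by calling dnnl_set_verbose with the desired level.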
The first lines of verbose information contain the build version and git hash, if available, as well as CPU and GPU runtimes, and the supported instruction set architecture.
Each subsequent line of verbose information is formatted as a comma-separated list containing:
- dnnl_verbose marker string
- operation: create[:cache_hit], create[:cache_miss], or exec
- engine kind: cpu or gpu
- primitive name: convolution, reorder, sum, etc.
- propagation kind: forward_training, forward_inference, or backward
This produces the following output (the line break was added to fit the page width):
See the profiling example, which uses DNNL_VERBOSE output to tune oneDNN code to align with best practices.
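Because each verbose line is a comma-separated list, it is straightforward to post-process captured output. The following is a minimal sketch, not an official parser: the names `VerboseHeader`, `split_fields`, `parse_verbose_header`, and `is_exec` are hypothetical, and the code assumes that the marker, operation, engine kind, and primitive name are the first four fields, in the order listed earlier. All remaining fields are ignored.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Leading fields of one verbose line (hypothetical struct for illustration).
struct VerboseHeader {
    bool valid = false;     // true if the line starts with the dnnl_verbose marker
    std::string operation;  // e.g. "create", "create:cache_hit", "exec"
    std::string engine;     // "cpu" or "gpu"
    std::string primitive;  // e.g. "convolution", "reorder", "sum"
};

// Split a line on commas. Interior empty fields are kept.
std::vector<std::string> split_fields(const std::string &line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ',')) fields.push_back(field);
    return fields;
}

// Extract the first four fields. Lines that do not begin with the marker,
// or that have fewer than four fields, are returned with valid == false.
VerboseHeader parse_verbose_header(const std::string &line) {
    VerboseHeader h;
    const std::vector<std::string> f = split_fields(line);
    if (f.size() < 4 || f[0] != "dnnl_verbose") return h;
    h.valid = true;
    h.operation = f[1];
    h.engine = f[2];
    h.primitive = f[3];
    return h;
}

// True for lines that report a primitive execution.
bool is_exec(const VerboseHeader &h) {
    return h.valid && h.operation == "exec";
}
```

A typical use is to read captured output line by line, keep only the lines for which `is_exec` returns true, and count or group them by `primitive` to see which primitives dominate. Lines with any other operation value are left for the caller to filter.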