How to draw error bands¶
We show two ways to compute 1-sigma bands around a fitted curve.
Whether the curve describes a probability density (from a maximum-likelihood fit) or an expectation (from a least-squares fit) does not matter; the procedure is the same. We demonstrate this on an unbinned extended maximum-likelihood fit of a Gaussian.
[1]:
import numpy as np
from numba_stats import norm
from iminuit import Minuit
from iminuit.cost import ExtendedUnbinnedNLL
import matplotlib.pyplot as plt
# generate toy sample
rng = np.random.default_rng(1)
x = rng.normal(size=100)
# bin it
w, xe = np.histogram(x, bins=100, range=(-5, 5))
# compute bin-wise density estimates
werr = w ** 0.5
cx = 0.5 * (xe[1:] + xe[:-1])
dx = np.diff(xe)
d = w / dx
derr = werr / dx
# define model and cost function
def model(x, par):
    return par[0], par[0] * norm.pdf(x, par[1], par[2])
cost = ExtendedUnbinnedNLL(x, model)
# fit the model
m = Minuit(cost, (1, 0, 1))
m.migrad()
m.hesse()
# plot everything
plt.errorbar(cx, d, derr, fmt="o", label="data", zorder=0)
plt.plot(cx, model(cx, m.values)[1], lw=3, label="fit")
plt.legend(frameon=False,
           title=f"$n = {m.values[0]:.2f} +/- {m.errors[0]:.2f}$\n"
                 f"$\mu = {m.values[1]:.2f} +/- {m.errors[1]:.2f}$\n"
                 f"$\sigma = {m.values[2]:.2f} +/- {m.errors[2]:.2f}$");
We want to understand how uncertain the fitted Gaussian curve is, so we draw a 1-sigma error band around it, which approximates the 68 % confidence interval.
With error propagation¶
The uncertainty is quantified in the form of the covariance matrix of the fitted parameters. We can use error propagation to obtain the uncertainty of the curve,

\[ C' = J \, C \, J^T, \]

where \(C\) is the covariance matrix of the input vector, \(C'\) is the covariance matrix of the output vector, and \(J\) is the matrix of first derivatives of the mapping function between input and output. The mapping in this case is the curve, \(\vec y = f(\vec{x}; \vec{p})\), regarded as a function of \(\vec{p}\) and not of \(\vec{x}\), which is fixed. The function maps from \(\vec{p}\) to \(\vec{y}\), and the Jacobi matrix is made from the elements

\[ J_{ik} = \frac{\partial y_i}{\partial p_k}. \]
To compute the derivatives, one can sometimes use SymPy or an auto-differentiation tool like JAX if the function permits it, but in general they need to be computed numerically. The library Jacobi provides a fast and robust calculator for numerical derivatives and a function for error propagation.
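To make the formula above concrete, here is a minimal sketch of the same propagation done by hand with central finite differences. The helper propagate_manual is purely illustrative and not part of any library; the Jacobi package used below is more robust because it chooses its step sizes adaptively.

def propagate_manual(fn, par, cov, eps=1e-6):
    # hypothetical helper: first-order error propagation C' = J C J^T
    par = np.asarray(par, dtype=float)
    y0 = np.asarray(fn(par), dtype=float)
    jac = np.empty((len(y0), len(par)))
    for k in range(len(par)):
        # central finite difference for column k of the Jacobian, J[:, k] = dy/dp_k
        step = eps * max(abs(par[k]), 1.0)
        pp = par.copy()
        pm = par.copy()
        pp[k] += step
        pm[k] -= step
        jac[:, k] = (np.asarray(fn(pp)) - np.asarray(fn(pm))) / (2 * step)
    return y0, jac @ cov @ jac.T

# usage would mirror the call to jacobi.propagate below:
# y_man, ycov_man = propagate_manual(lambda p: model(cx, p)[1], m.values, m.covariance)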
[2]:
from jacobi import propagate
# run error propagation
y, ycov = propagate(lambda p: model(cx, p)[1], m.values, m.covariance)
# plot everything
plt.errorbar(cx, d, derr, fmt="o", label="data", zorder=0)
plt.plot(cx, y, lw=3, label="fit")
# draw 1 sigma error band
yerr_prop = np.diag(ycov) ** 0.5
plt.fill_between(cx, y - yerr_prop, y + yerr_prop, facecolor="C1", alpha=0.5)
plt.legend(frameon=False,
           title=f"$n = {m.values[0]:.2f} +/- {m.errors[0]:.2f}$\n"
                 f"$\mu = {m.values[1]:.2f} +/- {m.errors[1]:.2f}$\n"
                 f"$\sigma = {m.values[2]:.2f} +/- {m.errors[2]:.2f}$");
Error propagation is relatively fast.
[3]:
%%timeit -r 1 -n 1000
propagate(lambda p: model(cx, p)[1], m.values, m.covariance)
With the bootstrap¶
Another generic way to compute uncertainties is the bootstrap. The fitted parameters asymptotically follow a multivariate normal distribution, so we can simulate new experiments by drawing parameter vectors from that distribution and evaluating the curve for each draw.
[4]:
rng = np.random.default_rng(1)
par_b = rng.multivariate_normal(m.values, m.covariance, size=1000)
# standard deviation of bootstrapped curves
y_b = [model(cx, p)[1] for p in par_b]
yerr_boot = np.std(y_b, axis=0)
# plot everything
plt.errorbar(cx, d, derr, fmt="o", label="data", zorder=0)
plt.plot(cx, y, lw=3, label="fit")
# draw 1 sigma error band
plt.fill_between(cx, y - yerr_boot, y + yerr_boot, facecolor="C1", alpha=0.5)
plt.legend(frameon=False,
           title=f"$n = {m.values[0]:.2f} +/- {m.errors[0]:.2f}$\n"
                 f"$\mu = {m.values[1]:.2f} +/- {m.errors[1]:.2f}$\n"
                 f"$\sigma = {m.values[2]:.2f} +/- {m.errors[2]:.2f}$");
The result is visually indistinguishable from before, as it should be. If you worry about deviations between the two methods, read on.
In this example, computing the band from 1000 samples is slower than error propagation.
[5]:
%%timeit -r 1 -n 100
par_b = rng.multivariate_normal(m.values, m.covariance, size=1000)
y_b = [model(cx, p)[1] for p in par_b]
np.std(y_b, axis=0)
However, the calculation time scales linearly with the number of samples. One can simply draw fewer samples if the additional uncertainty is acceptable. If we draw only 50 samples, bootstrapping wins over numerical error propagation.
[6]:
%%timeit -r 1 -n 1000
rng = np.random.default_rng(1)
par_b = rng.multivariate_normal(m.values, m.covariance, size=50)
y_b = [model(cx, p)[1] for p in par_b]
np.std(y_b, axis=0)
Let’s see how the result looks and whether it deviates noticeably.
[7]:
# compute bootstrapped curves with 50 samples
par_b = rng.multivariate_normal(m.values, m.covariance, size=50)
y_b = [model(cx, p)[1] for p in par_b]
yerr_boot_50 = np.std(y_b, axis=0)
# plot everything
plt.errorbar(cx, d, derr, fmt="o", label="data", zorder=0)
plt.plot(cx, y, lw=3, label="fit")
# draw 1 sigma error band
plt.fill_between(cx, y - yerr_boot_50, y + yerr_boot_50, facecolor="C1", alpha=0.5)
plt.legend(frameon=False,
           title=f"$n = {m.values[0]:.2f} +/- {m.errors[0]:.2f}$\n"
                 f"$\mu = {m.values[1]:.2f} +/- {m.errors[1]:.2f}$\n"
                 f"$\sigma = {m.values[2]:.2f} +/- {m.errors[2]:.2f}$");
No, the result is still visually indistinguishable. This suggests that 50 samples can be enough for plotting.
Numerically, the three error bands differ at the 10 % level in the central region (the expected relative error is \(50^{-1/2} \approx 0.14\)). The eye cannot pick up these differences, but they are there. The curves differ more in the tails, which is not visible on a linear scale, but noticeable on a log scale.
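As a quick sanity check of that order-of-magnitude estimate (a toy sketch with assumed inputs, not part of the fit above), one can look at how much a standard deviation estimated from 50 draws fluctuates:

rng_check = np.random.default_rng(2)
# spread of standard-deviation estimates, each computed from 50 normal draws
stds = [np.std(rng_check.normal(size=50)) for _ in range(1000)]
print(np.std(stds) / np.mean(stds))  # roughly 0.1, the same order as 50**-0.5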
[8]:
fig, ax = plt.subplots(1, 2, figsize=(12, 5))
plt.sca(ax[0])
plt.plot(cx, y - yerr_prop, "-C0", label="prop")
plt.plot(cx, y + yerr_prop, "-C0", label="prop")
plt.plot(cx, y - yerr_boot, "--C1", label="boot[1000]")
plt.plot(cx, y + yerr_boot, "--C1", label="boot[1000]")
plt.plot(cx, y - yerr_boot_50, ":C2", label="boot[50]")
plt.plot(cx, y + yerr_boot_50, ":C2", label="boot[50]")
plt.legend()
plt.semilogy();
plt.sca(ax[1])
plt.plot(cx, yerr_boot / yerr_prop, label="boot[1000] / prop")
plt.plot(cx, yerr_boot_50 / yerr_prop, label="boot[50] / prop")
plt.legend()
plt.axhline(1, ls="--", color="0.5", zorder=0)
for delta in (-0.1, 0.1):
    plt.axhline(1 + delta, ls=":", color="0.5", zorder=0)
plt.ylim(0.5, 1.5);
We see that the bootstrapped bands are a bit wider in the tails. This is caused by non-linearities that are neglected in error propagation.
Which is better? Error propagation or bootstrap?¶
There is no clear-cut answer. At the visual level, both methods are usually fine (even with a small number of bootstrap samples). Which calculation is more accurate depends on the details of the problem. Fortunately, the sources of error are orthogonal for the two methods, so each method can be used to check the other.
The bootstrap error is caused by sampling. It can be reduced by drawing more samples; the relative error is proportional to \(N^{-1/2}\).
The propagation error is caused by errors in the numerically computed Jacobian and by truncating the Taylor expansion of the model at first order.
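To make the last point concrete, here is a small toy comparison (assumed numbers, purely illustrative): propagating a normal uncertainty through the non-linear map \(y = \exp(p)\) linearly versus by sampling.

p, sigma_p = 0.0, 0.5
# first-order (linear) propagation: sigma_y = |dy/dp| * sigma_p = exp(p) * sigma_p
sigma_lin = np.exp(p) * sigma_p
# sampling keeps the non-linearity and yields a wider spread
rng_nl = np.random.default_rng(3)
sigma_mc = np.std(np.exp(rng_nl.normal(p, sigma_p, size=100_000)))
print(sigma_lin, sigma_mc)  # the sampled width is larger than the linear estimate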