Generic least-squares function

This tutorial shows how to write a basic generic least-squares cost function that works well with iminuit.

We have seen in the basic tutorial how to make a least-squares function with an explicit signature that iminuit could read to find the parameter names automatically. Part of the structure of a least-squares function is always the same. What changes is the model that predicts the y-values and its parameters. Here, we show how to write a generic weighted least-squares function that extracts the parameter names from the user-provided model.
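Concretely, the quantity that a weighted least-squares fit minimizes is \(Q(\vec{p}) = \sum_i \big(y_i - f(x_i; \vec{p})\big)^2 / \sigma_i^2\), where \(f\) is the model, \(\vec{p}\) are its parameters, and \(\sigma_i\) are the uncertainties of the \(y_i\); only \(f\) and \(\vec{p}\) change from problem to problem.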

Note: Cost functions for common use cases can be imported from ``iminuit.cost``, including a better version of the generic least-squares function that we build here. The built-in cost functions come with extra features and use some insights to make them work better than naive implementations, so prefer the built-in cost functions when you can.
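For reference, a minimal sketch of how the built-in version is used; the data values below are made up purely for illustration, and the argument order (x, y, yerror, model) is that of iminuit.cost.LeastSquares:

import numpy as np
from iminuit import Minuit
from iminuit.cost import LeastSquares

def line(x, a, b):  # straight-line model, same shape as in this tutorial
    return a + b * x

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])  # made-up data, for illustration only
cost = LeastSquares(x, y, np.ones_like(x), line)  # data first, model last

m = Minuit(cost, a=0, b=0)
m.migrad()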

Derive parameters from model

[1]:
import numpy as np
from iminuit import Minuit
from iminuit.util import describe
[2]:
class LeastSquares:
    """
    Generic least-squares cost function with error.
    """

    errordef = Minuit.LEAST_SQUARES  # for Minuit to compute errors correctly

    def __init__(self, model, x, y, err):
        self.model = model  # model predicts y for given x
        self.x = np.asarray(x)
        self.y = np.asarray(y)
        self.err = np.asarray(err)

    def __call__(self, *par):  # we accept a variable number of model parameters
        ym = self.model(self.x, *par)
        return np.sum((self.y - ym) ** 2 / self.err ** 2)

Let’s try it out with iminuit.

[3]:
def line(x, a, b):  # simple straight line model with explicit parameters
    return a + b * x

rng = np.random.default_rng(1)
x_data = np.arange(1, 6, dtype=float)
y_err = np.ones_like(x_data)
y_data = line(x_data, 1, 2) + rng.normal(0, y_err)

lsq = LeastSquares(line, x_data, y_data, y_err)

# this fails, because iminuit cannot detect the parameter names
# (errordef is already set as a class attribute, so we do not set it again)
try:
    m = Minuit(lsq, a=0, b=0)
except Exception:
    import traceback
    traceback.print_exc()
Traceback (most recent call last):
  File "/tmp/ipykernel_7539/4284129589.py", line 13, in <cell line: 12>
    m = Minuit(lsq, a=0, b=0)
  File "/build/python-iminuit/src/python-iminuit/build/lib.linux-x86_64-3.10/iminuit/minuit.py", line 622, in __init__
    self._init_state = _make_init_state(self._pos2var, start, kwds)
  File "/build/python-iminuit/src/python-iminuit/build/lib.linux-x86_64-3.10/iminuit/minuit.py", line 2449, in _make_init_state
    raise RuntimeError(
RuntimeError: a is not one of the parameters []

What happened? iminuit uses introspection to detect the number and names of the parameters. It relies on the describe utility for that, but describe fails here, because the generic method signature LeastSquares.__call__(self, *par) does not reveal them.

The information could be extracted from the model signature, but iminuit knows nothing about the signature of line(x, a, b) here. We can fix this by generating a function signature for the LeastSquares class from the signature of the model.
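We can confirm this by calling describe on the cost object directly. In the iminuit version used here, it comes back empty, which matches the [] in the error message above:

# describe finds no parameter names in the generic *par signature
describe(lsq)  # -> [] (hence "a is not one of the parameters []")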

[4]:
# get the args from line and strip 'x'
describe(line)[1:]
[4]:
['a', 'b']
[5]:
from iminuit.util import make_func_code

# now inject that into the lsq object with the make_func_code tool
lsq.func_code = make_func_code(describe(line)[1:])

# now we get the right answer
describe(lsq)
[5]:
['a', 'b']

We can put this code into the __init__ method of our class to obtain a generic least-squares class that works with iminuit.

[6]:
class BetterLeastSquares(LeastSquares):
    def __init__(self, model, x, y, err):
        super().__init__(model, x, y, err)
        self.func_code = make_func_code(describe(model)[1:])
[7]:
lsq = BetterLeastSquares(line, x_data, y_data, y_err)

m = Minuit(lsq, a=0, b=0)
m.migrad()
[7]:
Migrad
FCN = 3.079                              Nfcn = 30
EDM = 2.9e-23 (Goal: 0.0002)
Valid Minimum                            No Parameters at limit
Below EDM threshold (goal x 10)          Below call limit
Covariance   Hesse ok   Accurate   Pos. def.   Not forced

   Name   Value   Hesse Error   Minos Error-   Minos Error+   Limit-   Limit+   Fixed
0  a      1.5     1.0
1  b      1.90    0.32

Covariance matrix (correlations in parentheses):
      a               b
a     1.1             -0.3 (-0.905)
b     -0.3 (-0.905)   0.1
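The fit results can also be read off programmatically. A short sketch, assuming the attribute names of iminuit 2.x:

print(m.values["a"], m.errors["a"])  # fitted value and Hesse error of a
print(m.values["b"], m.errors["b"])  # fitted value and Hesse error of b
print(m.covariance)                  # the covariance matrix shown above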

Report number of data points

The minimum value of our least-squares function is asymptotically chi2-distributed. This is useful, because it allows us to judge the quality of the fit. iminuit automatically reports the reduced chi2 value \(\chi^2/n_\text{dof}\) if the cost function reports the number of data points.

[8]:
class EvenBetterLeastSquares(BetterLeastSquares):
    @property
    def ndata(self):
        return len(self.x)

lsq = EvenBetterLeastSquares(line, x_data, y_data, y_err)

m = Minuit(lsq, a=0, b=0)
m.migrad()
[8]:
Migrad
FCN = 3.079 (chi2/ndof = 1.0)            Nfcn = 30
EDM = 2.9e-23 (Goal: 0.0002)
Valid Minimum                            No Parameters at limit
Below EDM threshold (goal x 10)          Below call limit
Covariance   Hesse ok   Accurate   Pos. def.   Not forced

   Name   Value   Hesse Error   Minos Error-   Minos Error+   Limit-   Limit+   Fixed
0  a      1.5     1.0
1  b      1.90    0.32

Covariance matrix (correlations in parentheses):
      a               b
a     1.1             -0.3 (-0.905)
b     -0.3 (-0.905)   0.1
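The reduced chi2 in the header is just the function minimum divided by the number of degrees of freedom. A quick cross-check, assuming the iminuit 2.x attributes fval (function minimum) and nfit (number of fitted parameters):

n_dof = lsq.ndata - m.nfit  # 5 data points - 2 fitted parameters = 3
print(m.fval / n_dof)       # roughly 1.0, matching the reported chi2/ndof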

As the developer of a cost function, you have to make sure that its minimum value really is chi2-distributed before you add the ndata property; Minuit has no way of checking that.

For binned likelihoods, one can make the minimum value asymptotically chi2-distributed by applying transformations; see Baker and Cousins, Nucl. Instrum. Meth. 221 (1984) 437-442.
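To illustrate the idea (this sketch is not part of the original tutorial), the Baker-Cousins transformation for a binned Poisson likelihood replaces \(-2\ln L\) by \(2\sum_i \left[\mu_i - n_i + n_i \ln(n_i/\mu_i)\right]\), which is asymptotically chi2-distributed. A minimal implementation:

import numpy as np

def baker_cousins_chi2(n, mu):
    # n: observed counts per bin, mu: expected counts per bin
    n = np.asarray(n, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # n * log(n/mu) is taken as 0 for n == 0, the correct limit for empty bins
    ratio = np.where(n > 0, n / mu, 1.0)
    return 2.0 * np.sum(mu - n + n * np.log(ratio))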