Name | Direction | Type | Default | Description |
---|---|---|---|---|
Function | InOut | Function | Mandatory | Parameters defining the fitting function and its initial values |
InputWorkspace | Input | Workspace | Mandatory | Name of the input Workspace |
IgnoreInvalidData | Input | boolean | False | Flag to ignore infinities, NaNs and data with zero errors. |
DomainType | Input | string | Simple | The type of function domain to use: Simple, Sequential, or Parallel. Allowed values: [‘Simple’, ‘Sequential’, ‘Parallel’] |
EvaluationType | Input | string | CentrePoint | The way the function is evaluated on histogram data sets. If value is “CentrePoint” then function is evaluated at centre of each bin. If it is “Histogram” then function is integrated within the bin and the integrals returned. Allowed values: [‘CentrePoint’, ‘Histogram’] |
PeakRadius | Input | number | 0 | A value of the peak radius the peak functions should use. A peak radius defines an interval on the x axis around the centre of the peak where its values are calculated. Values outside the interval are not calculated and are assumed to be zero. Numerically, the radius is a whole number of peak widths (FWHM) that fit into the interval on each side of the centre. The default value of 0 means the whole x axis. |
Output | Input | string | Mandatory | A base name for the output workspaces. |
This algorithm explores the surface of χ² around its minimum and estimates the standard deviations of the parameters. The value of the Output property is a base name for two output table workspaces: '<Output>_errors' and '<Output>_pdf'. The former contains the parameter error estimates and the latter contains 1D slices of χ² along each parameter.
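For example, the following minimal sketch (the data and the FlatBackground function are purely illustrative, not taken from the algorithm documentation) runs the algorithm and lists the columns of the two output tables:

from mantid.simpleapi import *
import numpy as np

# illustrative data: a flat background of 5 with Gaussian noise
x = np.linspace(0, 10, 100)
y = 5.0 + 0.1*np.random.randn(x.size)
ws = CreateWorkspace(x, y)
# the function passed in should already be at (or close to) the chi squared minimum
ProfileChiSquared1D("name=FlatBackground,A0=5", ws, Output="profile")
print(mtd["profile_errors"].getColumnNames())
print(mtd["profile_pdf"].getColumnNames())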
The procedure for calculating errors of parameters is described in Chapter 15 of Numerical recipes in C [1] and Chapter 9 of Statistical Data Analysis [2]. Here, we summarise the main results.
Consider the input dataset D_0, with fit parameters a_0. Assuming Gaussian noise in the data, it is possible to create n artificial datasets D_j for j = 1, 2, ..., n. If we were to run the fit on each dataset, a set of parameters a_j would be obtained. The distribution of these parameters about our original parameter set, δa = a_j − a_0, is described by the multivariate normal distribution,

$$P(\delta \mathbf{a}) \propto \exp\left(-\tfrac{1}{2}\Delta\chi^2\right), \qquad (1)$$

where Δχ² = χ²(a_j) − χ²(a_0) is the difference between the chi squared statistics obtained for the two parameter sets,

$$\Delta\chi^2 = \sum_{i} \left(\frac{y_i - f(x_i; \mathbf{a}_j)}{\sigma_i}\right)^2 - \sum_{i} \left(\frac{y_i - f(x_i; \mathbf{a}_0)}{\sigma_i}\right)^2. \qquad (2)$$

If we consider the variation of a single parameter a_k, while the other parameters are held at the values that minimise χ², the quantity Δχ² follows a chi squared distribution f_{χ²}(x; ν) with one degree of freedom, ν = 1. From this distribution, the probability of finding an increase smaller than Δχ² is given by the integral

$$p = \int_0^{\Delta\chi^2} f_{\chi^2}(x; 1)\, dx, \qquad (3)$$

where p is the desired confidence level. For instance, for a confidence level of p = 68.3% (1-sigma), the critical value of the chi squared fluctuation is Δχ² < 1. From this desired confidence level, we can then calculate the parameter variations (σ_left, σ_right) that increase chi squared by 1 and obtain a 68.3% confidence interval for the single parameter a_k:

$$a_k^{min} - \sigma_{left} \leq a_k \leq a_k^{min} + \sigma_{right}. \qquad (4)$$
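The correspondence between confidence levels and critical Δχ² values can be checked numerically. The sketch below is illustrative only and assumes SciPy is available in the Python environment; it uses scipy.stats with one degree of freedom to recover the thresholds behind the 1-, 2- and 3-sigma intervals discussed here:

from scipy.stats import chi2, norm

# confidence levels corresponding to 1, 2 and 3 sigma of a normal distribution
for n_sigma in (1, 2, 3):
    p = norm.cdf(n_sigma) - norm.cdf(-n_sigma)  # e.g. ~0.683 for 1 sigma
    delta_chi2_crit = chi2.ppf(p, df=1)         # critical increase in chi squared
    print("{}-sigma: p = {:.4f}, critical delta chi squared = {:.2f}".format(n_sigma, p, delta_chi2_crit))
# prints critical values of approximately 1, 4 and 9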
This algorithm obtains the 1-sigma confidence interval by varying a single parameter at a time and recording the extrema of the values that increase χ² by 1. The results are reported in the table '<Output>_errors', which gives the left and right 1-sigma errors. The procedure is repeated for 2-sigma and 3-sigma (where the corresponding increases in χ² are 4 and 9), with the results displayed in the (2-sigma) and (3-sigma) columns. An additional value is also reported, termed the quadratic error. This is the 1-sigma error obtained from a Taylor expansion of the chi squared function about its minimum:

$$\chi^2(\mathbf{a}) \approx \chi^2(\mathbf{a}_{min}) + (\mathbf{a} - \mathbf{a}_{min})^T \, \mathbf{C}^{-1} (\mathbf{a} - \mathbf{a}_{min}), \qquad (5)$$

where C is the covariance matrix obtained from least squares fitting of the data; see the Fitting documentation for further details. For a linear (or approximately linear) model the quadratic error should be equal to the left and right errors, i.e. the confidence interval will be symmetric. This can be seen in the following example:
# import mantid algorithms and numpy
from mantid.simpleapi import *
import numpy as np
# create the data
x = np.linspace(1, 10, 250)
np.random.seed(0)
y = 1 + 2.0*x + 0.4*np.random.randn(x.size)
ws = CreateWorkspace(x, y)
# run the fit
func = "name=LinearBackground, A0=1, A1=2";
func_ = FunctionFactory.Instance().createInitialized(func)
fit_output = Fit(InputWorkspace=ws, Function=func_, Output="LinearFit")
a0_fit = fit_output.OutputParameters.column(1)[0]
a1_fit = fit_output.OutputParameters.column(1)[1]
# explore the chi squared profile for the fit parameters
fitted_func = "name=LinearBackground, A0={}, A1={}".format(a0_fit, a1_fit)
ProfileChiSquared1D(fitted_func, ws, Output="LinearProfile")
# print left and right errors of parameters
# you should note that they are approx equal to the quadratic error for this linear model
error_table = mtd["LinearProfile_errors"]
lerror_a0 = error_table.column(3)[0]
rerror_a0 = error_table.column(4)[0]
qerror_a0 = error_table.column(9)[0]
print("1-sigma error bounds of A0 are {} and {}, with quadratic estimate {}".format(lerror_a0, rerror_a0, qerror_a0))
lerror_a1 = error_table.column(3)[1]
rerror_a1 = error_table.column(4)[1]
qerror_a1 = error_table.column(9)[1]
print("1-sigma error bounds of A1 are {} and {}, with quadratic estimate {}".format(lerror_a1, rerror_a1, qerror_a1))
For a non-linear model, it's possible that the left and right errors will not be equal, leading to an asymmetric confidence interval. This is shown in the example below:
# import mantid algorithms and numpy
from mantid.simpleapi import *
import numpy as np
# create decaying exponential data
x = np.linspace(1, 10, 250)
np.random.seed(0)
y = 3.0*np.exp(-x/2) + 0.1*np.random.randn(x.size)
ws = CreateWorkspace(x, y)
# run the fit
func = "name=ExpDecay,Height=3.0, Lifetime=0.5";
func_ = FunctionFactory.Instance().createInitialized(func)
fit_output = Fit(InputWorkspace=ws, Function=func_, Output="ExpFit")
height_fit = fit_output.OutputParameters.column(1)[0]
lifetime_fit = fit_output.OutputParameters.column(1)[1]
# explore the chi squared profile for the fit parameters
fitted_func = "name=ExpDecay, Height={}, Lifetime={}".format(height_fit, lifetime_fit)
ProfileChiSquared1D(fitted_func, ws, Output="ExpProfile")
# print left and right errors of parameters
# you should note that they differ from the quadratic errors
error_table = mtd["ExpProfile_errors"]
lerror_height = error_table.column(3)[0]
rerror_height = error_table.column(4)[0]
qerror_height = error_table.column(9)[0]
print("1-sigma error bounds of Height are {} and {}, with quadratic estimate {}".format(lerror_height, rerror_height, qerror_height))
lerror_lifetime = error_table.column(3)[1]
rerror_lifetime = error_table.column(4)[1]
qerror_lifetime = error_table.column(9)[1]
print("1-sigma error bounds of Lifetime are {} and {}, with quadratic estimate {}".format(lerror_lifetime, rerror_lifetime, qerror_lifetime))
The error table ('<Output>_errors') contains one row per parameter, with the following columns:
Column | Description |
---|---|
Parameter | Parameter name |
Value | Parameter value passed with the Function property |
Value at Min | The minimum point of the 1D slice of χ². If the Function is at the minimum then Value at Min should be equal to Value. |
Left Error (1-sigma) | The negative deviation from the minimum point equivalent to 1σ. Estimated from analysis of the χ² surface. |
Right Error (1-sigma) | The positive deviation from the minimum point equivalent to 1σ. Estimated from analysis of the χ² surface. |
Left Error (2-sigma) | The negative deviation from the minimum point equivalent to 2σ. Estimated from analysis of the χ² surface. |
Right Error (2-sigma) | The positive deviation from the minimum point equivalent to 2σ. Estimated from analysis of the χ² surface. |
Left Error (3-sigma) | The negative deviation from the minimum point equivalent to 3σ. Estimated from analysis of the χ² surface. |
Right Error (3-sigma) | The positive deviation from the minimum point equivalent to 3σ. Estimated from analysis of the χ² surface. |
Quadratic Error | 1σ standard deviation in the quadratic approximation of the surface. |
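Rather than indexing columns by position, as in the examples above, rows of the error table can also be read as dictionaries keyed by the column names listed in this table. A short sketch, assuming the 'LinearProfile_errors' table from the first example exists in the analysis data service:

from mantid.simpleapi import mtd

error_table = mtd["LinearProfile_errors"]
for i in range(error_table.rowCount()):
    row = error_table.row(i)  # dictionary keyed by column name
    print(row["Parameter"], row["Value"], row["Quadratic Error"])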
This algorithm also reports a probability density function (PDF) for each parameter, stored in the '<Output>_pdf' table. This PDF is found from Eq. (1), which relates the increase in chi squared to the probability of the parameter variation. The pdf table also contains the slices of χ² along each parameter. It has 3 columns per parameter: the first holds the parameter values, the second the corresponding χ², and the third the probability density function normalised to 1 at its maximum. Plotting the second column against the first for each parameter shows how χ² changes with the parameter value.
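As an illustration, assuming the exponential decay example above has been run so that 'ExpProfile_pdf' exists, the following sketch plots the χ² slice and the normalised PDF for the first parameter. The column indices follow the three-columns-per-parameter layout described above:

import matplotlib.pyplot as plt
from mantid.simpleapi import mtd

pdf_table = mtd["ExpProfile_pdf"]
# columns 0-2 belong to the first parameter: values, chi squared, normalised PDF
param_values = pdf_table.column(0)
chi2_slice = pdf_table.column(1)
pdf = pdf_table.column(2)

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot(param_values, chi2_slice)
ax1.set_xlabel(pdf_table.getColumnNames()[0])
ax1.set_ylabel("chi squared")
ax2.plot(param_values, pdf)
ax2.set_xlabel(pdf_table.getColumnNames()[0])
ax2.set_ylabel("normalised PDF")
plt.show()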
[1] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. 1992. Numerical recipes in C (2nd ed.): the art of scientific computing. Cambridge University Press, USA.
[2] G. Cowan. 1998. Statistical Data Analysis. Clarendon Press, Oxford.
Categories: AlgorithmIndex | Optimization
C++ header: ProfileChiSquared1D.h (last modified: 2021-03-31)
C++ source: ProfileChiSquared1D.cpp (last modified: 2021-03-31)