
ProfileChiSquared1D v1

Figure: ProfileChiSquared1D dialog.

Summary

Profiles chi squared about its minimum to obtain parameter errors for the input function.

Properties

Name Direction Type Default Description
Function InOut Function Mandatory Parameters defining the fitting function and its initial values
InputWorkspace Input Workspace Mandatory Name of the input Workspace
IgnoreInvalidData Input boolean False Flag to ignore infinities, NaNs and data with zero errors.
DomainType Input string Simple The type of function domain to use: Simple, Sequential, or Parallel. Allowed values: [‘Simple’, ‘Sequential’, ‘Parallel’]
EvaluationType Input string CentrePoint The way the function is evaluated on histogram data sets. If value is “CentrePoint” then function is evaluated at centre of each bin. If it is “Histogram” then function is integrated within the bin and the integrals returned. Allowed values: [‘CentrePoint’, ‘Histogram’]
StepSizeMethod Input string Default The way the step size is calculated for numerical derivatives. See the section about step sizes in the Fit algorithm documentation to understand the difference between “Default” and “Sqrt epsilon”. Allowed values: [‘Default’, ‘Sqrt epsilon’]
PeakRadius Input number 0 A value of the peak radius the peak functions should use. A peak radius defines an interval on the x axis around the centre of the peak where its values are calculated. Values outside the interval are not calculated and are assumed to be zero. Numerically, the radius is a whole number of peak widths (FWHM) that fit into the interval on each side of the centre. The default value of 0 means the whole x axis.
Output Input string   A base name for output workspaces.
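
A minimal invocation combining these properties might look like the sketch below. The data, function values and output name are illustrative placeholders only.

from mantid.simpleapi import CreateWorkspace, ProfileChiSquared1D
import numpy as np

# illustrative straight-line data with some noise (placeholder values)
x = np.linspace(0, 10, 100)
ws = CreateWorkspace(DataX=x, DataY=2.0*x + 1.0 + 0.1*np.random.randn(x.size))

# profile chi squared about the given parameter values;
# this produces the table workspaces 'Profile_errors' and 'Profile_pdf'
ProfileChiSquared1D(Function="name=LinearBackground, A0=1, A1=2",
                    InputWorkspace=ws,
                    IgnoreInvalidData=False,
                    Output="Profile")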

Description

This algorithm explores the surface of \(\chi^{2}\) around its minimum and estimates the standard deviations of the parameters. The value of the Output property is a base name for two output table workspaces: ‘<Output>_errors’ and ‘<Output>_pdf’. The former contains the parameter error estimates and the latter holds 1D slices of \(\chi^{2}\) along each parameter.

The procedure for calculating errors of parameters is described in Chapter 15 of Numerical Recipes in C [1] and Chapter 9 of Statistical Data Analysis [2]. Here we summarise the main results.

Consider the input dataset \(D_0\) with fitted parameters \(\mathbf a_0\). Assuming Gaussian noise in the data, it is possible to create \(n\) artificial datasets \(D_j\), \(j=1,2,\ldots,n\). If we were to run the fit on each dataset, a set of parameters \(\mathbf a_j\) would be obtained for each. The distribution of these parameters about the original set, \(\delta \mathbf a = \mathbf a_j - \mathbf a_0\), is described by the multivariate normal distribution,

\[P(\delta \mathbf a) \propto \exp(-\Delta \chi^2 / 2)\]

where \(\Delta \chi^2=\chi^2(\mathbf a_j) - \chi^2(\mathbf a_0)\) is the difference between the chi squared statistics obtained for the two parameter sets,

\[\Delta \chi^2 = \sum_{i \in D_0} \left[ \left( \frac{y_i - f(x_i;\mathbf a_j)}{\sigma_i}\right)^2 - \left( \frac{y_i - f(x_i;\mathbf a_0)}{\sigma_i}\right)^2 \right]\]

If we vary a single parameter \(a_k\), while keeping the other parameters at the values that minimise \(\chi^2\), the quantity \(\Delta \chi^2\) will follow a chi squared distribution, \(f_{\chi^2}(x; \nu)\), with one degree of freedom, \(\nu=1\). From this distribution, the probability of finding an increase smaller than \(\Delta \chi^2\) is given by the integral,

\[p = P(X < \Delta \chi^2 ) = \int_0^{\Delta \chi^2} f_{\chi^2}(x; 1)\, dx \tag{1}\]

where \(p\) is the desired confidence level. For instance, for a confidence level of \(p=68.3\%\) (1-sigma), the critical value of the chi squared fluctuation is \(\Delta \chi^2 = 1\). From this desired confidence level we can then find the parameter variations, \(\sigma_{left}\) and \(\sigma_{right}\), that increase chi squared by 1, and obtain a \(68.3\%\) confidence interval for the single parameter \(a_k\):

\[P(\hat{a}_k - \sigma_{left} < a_k < \hat{a}_k + \sigma_{right}) = 68.3\%\]

where \(\hat{a}_k\) is the fitted value of the parameter.
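
This construction can be checked numerically. The following self-contained numpy sketch (illustrative only, not part of the algorithm) fits a one-parameter straight line, generates artificial datasets \(D_j\) about the fitted model, and confirms that roughly \(68.3\%\) of the resulting \(\Delta \chi^2\) values fall below 1.

import numpy as np

np.random.seed(1)
x = np.linspace(0, 1, 100)
sigma = 0.1
y0 = 2.0*x + sigma*np.random.randn(x.size)  # the original dataset D_0

def fit_slope(y):
    # least squares estimate of a for the one-parameter model y = a*x
    return np.dot(x, y)/np.dot(x, x)

def chi_squared(y, a):
    return np.sum(((y - a*x)/sigma)**2)

a0 = fit_slope(y0)
chi2_min = chi_squared(y0, a0)

# artificial datasets D_j, each refitted to give parameters a_j
delta_chi2 = []
for _ in range(5000):
    yj = a0*x + sigma*np.random.randn(x.size)
    aj = fit_slope(yj)
    delta_chi2.append(chi_squared(y0, aj) - chi2_min)

# for one free parameter, about 68.3% of fluctuations should lie below 1
print("fraction with delta chi^2 < 1:", np.mean(np.array(delta_chi2) < 1.0))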

This algorithm obtains the 1-sigma confidence interval by varying a single parameter at a time and recording the extreme values that increase \(\chi^2\) by \(1\). The results are written to the table ‘<Output>_errors’, which reports the left and right 1-sigma errors. The procedure is repeated for 2-sigma and 3-sigma, with the results displayed in the (2-sigma) and (3-sigma) columns. An additional value is also reported, termed the quadratic error. This is the 1-sigma error obtained from a Taylor expansion of the chi squared function about its minimum:

\[\chi^2(\mathbf a_{min} + \delta \mathbf a) = \chi^2(\mathbf a_{min}) + \delta \mathbf a^T \mathbf C^{-1} \delta \mathbf a\]

where \(\mathbf{C}\) is the covariance matrix obtained from least squares fitting of the data; see the Fitting documentation for further details. For a linear model (or an approximately linear one) the quadratic error should be equal to the left and right errors, i.e. the errors will be symmetric. This can be seen in the following example:

from mantid.simpleapi import *
from mantid.api import FunctionFactory
import numpy as np

# create straight-line data with Gaussian noise
x = np.linspace(1, 10, 250)
np.random.seed(0)
y = 1 + 2.0*x + 0.4*np.random.randn(x.size)
ws = CreateWorkspace(DataX=x, DataY=y)

# run the fit
func = "name=LinearBackground, A0=1, A1=2"
func_ = FunctionFactory.Instance().createInitialized(func)

fit_output = Fit(InputWorkspace=ws, Function=func_, Output="LinearFit")
a0_fit = fit_output.OutputParameters.column(1)[0]
a1_fit = fit_output.OutputParameters.column(1)[1]

# explore the chi squared profile about the fitted parameters
fitted_func = "name=LinearBackground, A0={}, A1={}".format(a0_fit, a1_fit)
ProfileChiSquared1D(Function=fitted_func, InputWorkspace=ws, Output="LinearProfile")

# print left and right errors of the parameters;
# for this linear model they are approximately equal to the quadratic error
error_table = mtd["LinearProfile_errors"]
lerror_a0 = error_table.column(3)[0]
rerror_a0 = error_table.column(4)[0]
qerror_a0 = error_table.column(9)[0]
print("1-sigma error bounds of A0 are {} and {}, with quadratic estimate {}".format(lerror_a0, rerror_a0, qerror_a0))

lerror_a1 = error_table.column(3)[1]
rerror_a1 = error_table.column(4)[1]
qerror_a1 = error_table.column(9)[1]
print("1-sigma error bounds of A1 are {} and {}, with quadratic estimate {}".format(lerror_a1, rerror_a1, qerror_a1))

For a non-linear model the left and right errors need not be equal, giving an asymmetric error. This is shown in the example below:

# import mantid algorithms and numpy
from mantid.simpleapi import *
from mantid.api import FunctionFactory
import numpy as np

# create decaying exponential data with Gaussian noise
x = np.linspace(1, 10, 250)
np.random.seed(0)
y = 3.0*np.exp(-x/2) + 0.1*np.random.randn(x.size)
ws = CreateWorkspace(DataX=x, DataY=y)

# run the fit
func = "name=ExpDecay, Height=3.0, Lifetime=0.5"
func_ = FunctionFactory.Instance().createInitialized(func)

fit_output = Fit(InputWorkspace=ws, Function=func_, Output="ExpFit")
height_fit = fit_output.OutputParameters.column(1)[0]
lifetime_fit = fit_output.OutputParameters.column(1)[1]

# explore the chi squared profile about the fitted parameters
fitted_func = "name=ExpDecay, Height={}, Lifetime={}".format(height_fit, lifetime_fit)
ProfileChiSquared1D(Function=fitted_func, InputWorkspace=ws, Output="ExpProfile")

# print left and right errors of the parameters;
# for this non-linear model they differ from the quadratic errors
error_table = mtd["ExpProfile_errors"]
lerror_height = error_table.column(3)[0]
rerror_height = error_table.column(4)[0]
qerror_height = error_table.column(9)[0]
print("1-sigma error bounds of Height are {} and {}, with quadratic estimate {}".format(lerror_height, rerror_height, qerror_height))

lerror_lifetime = error_table.column(3)[1]
rerror_lifetime = error_table.column(4)[1]
qerror_lifetime = error_table.column(9)[1]
print("1-sigma error bounds of Lifetime are {} and {}, with quadratic estimate {}".format(lerror_lifetime, rerror_lifetime, qerror_lifetime))

In both cases, the error table has the following columns (a short access sketch follows the table):

Column Description
Parameter Parameter name
Value Parameter value passed with the Function property
Value at Min The minimum point of the 1D slice of the \(\chi^{2}\). If the Function is at the minimum then Value at Min should be equal to Value.
Left Error (1-sigma) The negative deviation from the minimum point equivalent to \(1\sigma\). Estimated from analysis of the surface.
Right Error (1-sigma) The positive deviation from the minimum point equivalent to \(1\sigma\). Estimated from analysis of the surface.
Left Error (2-sigma) The negative deviation from the minimum point equivalent to \(2\sigma\). Estimated from analysis of the surface.
Right Error (2-sigma) The positive deviation from the minimum point equivalent to \(2\sigma\). Estimated from analysis of the surface.
Left Error (3-sigma) The negative deviation from the minimum point equivalent to \(3\sigma\). Estimated from analysis of the surface.
Right Error (3-sigma) The positive deviation from the minimum point equivalent to \(3\sigma\). Estimated from analysis of the surface.
Quadratic Error The \(1\sigma\) standard deviation in the quadratic approximation of the surface.
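
When reading this table from Python, rows can also be accessed by column name rather than index. The sketch below assumes the 'LinearProfile' output from the first example exists and that the programmatic column names match those listed above; inspect getColumnNames() to confirm on your installation.

from mantid import mtd

error_table = mtd["LinearProfile_errors"]
print(error_table.getColumnNames())   # confirm the exact column names
first_row = error_table.row(0)        # a dict keyed by column name
print(first_row["Parameter"], first_row["Value"])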

This algorithm also reports a probability density function (PDF) for each parameter, stored in the ‘<Output>_pdf’ table. The PDF is obtained from Eq. (1), which relates the increase in chi squared to the probability of the parameter variation. The table also contains the slices of \(\chi^{2}\) along each parameter, with three columns per parameter: the first holds the parameter values, the second the corresponding \(\chi^{2}\), and the third the probability density function normalised to a maximum value of 1. Plotting the second column against the first for a given parameter shows how \(\chi^{2}\) changes with the parameter value, as in the sketch below.
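
For example, the following sketch plots the \(\chi^{2}\) slice for the first parameter of the exponential example above. It assumes the 'ExpProfile' output exists and, per the layout described above, that columns 0, 1 and 2 belong to the first parameter, Height.

import matplotlib.pyplot as plt
from mantid import mtd

pdf_table = mtd["ExpProfile_pdf"]
param_values = pdf_table.column(0)  # parameter values for Height
chi_squared = pdf_table.column(1)   # chi squared along the slice

plt.plot(param_values, chi_squared)
plt.xlabel("Height")
plt.ylabel("chi squared")
plt.show()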

References

[1] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. 1992. Numerical recipes in C (2nd ed.): the art of scientific computing. Cambridge University Press, USA.

[2] G. Cowan. 1998. Statistical Data Analysis. Clarendon Press, Oxford.

Categories: AlgorithmIndex | Optimization