05 February, 2018

About Least Squares Adjustment Methods In Metrology


A large field of use of least squares refers to the analysis of data series in order to highlight tendencies or laws of behavior that allow modeling the phenomena in a more compact form for their present or future use.

Many experiments in metrology involve the use of instruments or measuring systems that measure values of a response variable corresponding to given values of a stimulus variable. The calibration of such instruments requires using a set of measurements of stimulus and response variables, along with their associated uncertainties, to determine estimates of the parameters that best describe the relationship between them, along with their associated uncertainty matrix. Algorithms for determining the calibration parameters for problems where measurements of the stimulus variable can be assumed to be accurate relative to those of the response variable are well known. However, there is much less awareness in the metrology community of solution algorithms for more generalised problems. Many standards that deal with calibration, while identifying the types of regression problems to be solved, offer only limited advice regarding the selection and use of solution algorithms. There is also a need for guidance regarding uncertainty evaluation. The aim of this post is to provide more assistance to the metrologist, allowing the calibration problem to be easily classified according to the uncertainty structure associated with the measurement data, providing solution algorithms for the main types of calibration problems, and describing the evaluation of uncertainties associated with the calibration parameters.

Frequently, the starting data that are used for the adjustment in the literature are usually considered as exact values, without uncertainty, so that the greater or lesser accuracy of the result is due, exclusively, to the mismatch between said data and the model used to the adjustment.
However, when the data are considered with true metrological sense, they should be considered as values with uncertainties, not always negligible and, in many cases, with common contributions that determine correlations between those data.

Three examples are included to illustrate the applications of the method in real cases.
This post presents some characteristics of least squares method (LSMs) when used in metrological applications. A large field where LSMs are used with the analysis of data sets in order to discover trends or behavioral laws; That allows the phenomena to be modeled in a more compact mathematical form for present or future use.

Often, in the literature, input data used for least squares fitting are regarded as exact values without uncertainty. Therefore, the uncertainty of output parameters from the least squares fit is due exclusively to the lack of fit between input data and values predicted by the model (once the least squares fit is performed).

However, input data in metrological application always have uncertainties, sometimes negligible, sometimes not. Even in many cases we should consider correlation between input data.

1.- Introduction

In many occasions, it is not possible or it is not interesting to directly measure the characteristic parameters of a measurand. Instead, other parameters are measured directly and using a certain physic-mathematical model that relates the measured parameters with those that are desired to know a system of equations is generated from which it is possible to determine later.

Let x1, x2 ... xm be the results provided directly by the measurement system (input variables) and y1, y2 ... yn the parameters you want to know (output variables). The relationship between both sets of variables is assumed known and expressed as follows:
The sign "≅" instead of the equal sign (=) shows the variability of the measurement results.
It works with superabundant information (m> n) so the system (1) is overdetermined and does not have, in general, a solution. However, it is possible to find a solution y1, y2 ... yn such that the differences between the first and second terms of equations (1) are minimal. For this, the system of "approximate" equations (1) can be transformed into an "exact" system by introducing the so-called residues e1, e2 ... em:
The vector y1, y2 ... yn is chosen in such a way that the resulting waste vector e1, e2 ... em is "the one that most closely approaches the null vector". When this condition is imposed minimizing the quadratic sum of the residuals, the method is known as the least squares adjustment method (LSM). See, for instance, [1] (section 1.1.4), [2] (section 5.3), [3] (section 4.3).
In addition, and since the deviations introduced by the measurement system are usually very small, the system of non-linear equations described above is, in general, linearizable around the nominal value of the measuring point.
By adopting the nominal as a new origin for the variables, the system (1) results:
Evaluating the partial derivatives for the nominal value of the measurement point.
Analogously, the system (2) is linearized by
A general approach for many metrological applications can be established by the following equations and variables:
which can still be written more compactly by x ≅ f (y, q)

x ≅ f (y, q). General least squares fitting model
This display of the system of equations (5) where it is necessary to determine n parameters, y1, y2 ... yn, (output quantities) from m measurement data, x1, x2 ... xm, (input quantities ), knowing, in addition, a set of complementary parameters, q1, q2 ... qp. By minimizing the quadratic sum of the residuals ei, the equations necessary to determine the parameters y1, y2 ... and n are obtained. See, for example, [5] (section 3) and [6].
Similar to the one that allowed moving from (1) and (2) to (3) and (4), the linearization of the system (5) determines
Where:
Jy and Jq are known values matrices that have been evaluated in the nominal environment at the working point.
In the most general case, values and uncertainties are included in the three classes of variables considered, although sometimes the uncertainties can be very small and treated as negligible.
The model (5) is flexible and allows to collect different possibilities. For example, functions fi may be several or be reduced to only one; the input data may correspond to values of different magnitudes or to repetitions of the same magnitude;
Depending on the work hypotheses adopted, three LSModels are analyzed in the following sections. Which are called:
- Ordinary Least Squares Method, (OLSM)
- Generalized Least Squares Method (GLSM)
- Total Least Squares Method (TLSM)

2.- Ordinary Least Squares Method (OLSM)

An Ordinary LSM problem checks the following hypotheses:
The functions fi (y1, y2 ... yn) do not contain any parameter that comes from a measurement result with uncertainty. In other words, the functions fi (y1, y2 ... yn) only contain the unknowns y1, y2 ... yn to be determined and known parameters or coefficients without uncertainty.
In practice, the uncertainties of the parameters present in fi (y1, y2 ... yn) do not have to be strictly null. It is enough that his final contribution to the uncertainties of the output variables y1, y2 ... yn is negligible.
The input variables (known measures), that is, the x1, x2 ... xm comply with the following:
• All of them have the same typical uncertainty, estimated by
• u (x1) = u (x2) = ... = u (xm) = ux
• All of them are statistically incorrect. That is, all the correlation coefficients r (xi, xj) are null (for i ≠ j). Or put another way, all covariances are null, u (xi, xj) = 0.
That is, the covariance matrix Cx = cov (x) of the vector of random variables x = [x1, x2 ... xm] T responds to
• The initial estimate of ux will be confirmed or modified by the adjustment process as explained below.
Consequently, the systems of equations (6), (7) are reduced in this case to
where the notation has been simplified by A = Jy. The residual squared sum Q is
The minimum condition on Q demands, ∇Q = 0 that is,
So that
system whose solution is
From the previous expression we obtain the covariance matrix of the output variables:
On the other hand, a measurement of the quality of the adjustment is:
It can be shown that s² is an estimator of the variance of the fit. If the variance of the adjustment is much greater than the initially estimated for the input variables, it is that the latter have been undervalued and, consequently, the value s² is adopted for ux².
In order to illustrate this reasoning, the results of a comparison between laboratories measuring the same pattern can be analyzed. A great variability of the standard deviation measures (s²) is not consistent with much lower values of the uncertainty estimated by the participating laboratories and the next criterion can be adopted
A more detailed statistical justification, adopting normality for the variables involved, is based on the fact that a good estimate of ux determines that Q/ux² is distributed according to a χ² with m-n degrees of freedom. It is possible to obtain a critical value χc² below which the probability of obtaining values of χ² is very high (for example 95%). To this value corresponds another, sc², according to (15). When s ≤ sc, it is accepted as an event within 95% of expected events in the hypothesis that ux has not been underestimated. In most practical cases, the results obtained are very similar to those of the criterion (16), so we will adopt the criterion (16) whose application is immediate.

3.- Generalized Least Squares Method (GLSM)

A problem of GLSM. (see [1], section 4.3, [13]) verifies the following hypotheses:
The functions fi (y1, y2 ... yn) do not contain any parameter that comes from a measurement result with uncertainty. In other words, the functions fi (y1, y2 ... yn) only contain the unknowns y1, y2 ... yn to be determined and known parameters or coefficients without uncertainty.
In practice, the uncertainties of the parameters present in fi (y1, y2 ... yn) do not have to be strictly null. It is enough that his final contribution to the uncertainties of the output variables y1, y2 ... yn is negligible.
The variables of the independent term, that is, x1, x2 ... xm, do not usually have the same uncertainty and it is possible that there are also non-zero correlations between them: for some pair xi, xj r (xi, xj) ≠ 0 which is equivalent to au (xi, xj) ≠ 0.
The covariance matrix
of the vector of random variables is assumed to be known x = [x1, x2 ... xm] T that is presented multiplied by a positive factor, s², unknown, which will allow correction in the adjustment of a possible undervaluation of C~x.
The basic idea to solve a generalized problem (GLS) is to reduce it to another of the ordinary type (OLS) by changing variables from a weighting matrix W. The new variable, z, is given by
So it should be
The matrix W can be determined by several procedures. One of them is through the Cholesky decomposition (W = L-1 with Cx = LLT and lower triangular L) on the covariance matrix of the measured values, Cx = cov (x), which can be implemented directly in MATLAB or Octave by the command:
W = inv (chol (Cx, 'lower'))
Multiplying (9) by the weighting matrix is obtained
The new system of equations, with à = W (A • y), can be solved by OLSM because the required conditions for the variances and covariances of the input variables in said model (8) are already met; That results:
Applying the same criteria as in the previous case (16)
If s~ <1 we should take s =1                           (20.1)
If s~ >1 we would take s=s~                            (20.2)
The covariance of the output variables, Cy = cov (y), responds to the same general expression (14) so that
the estimate of the initial matrix, Cx = cov (x), is affected by the same factor, that is, s²Cx.

4.- Total Least Squares Method (TLSM)

A problem of Total LSM [12] verifies the following hypotheses:
  1. In this case, the functions fi (y1, y2 ... yn) contain a series of parameters q1, q2 ... qp, which are known with non-negligible uncertainties, so the uncertainties are present both in the term x1, x2 ... xm as in the functions fi (y1, y2 ... yn). It is about analyzing the general model 
  2.  The covariance matrix Cx = cov (x) is known, perhaps in the format Cx=s²~Cx as commented in case GLS
  3. Also, another data is the covariance matrix Cq=cov(q)=s²~Cq of the vector of parameters q = [q1, q2 ... qp] known that is presented multiplied by the same previous positive factor, , unknown, which will also correct in the adjustment a possible undervaluation of ~Cq.
 Assuming that we work on a linearized system, expression (6) can be written
Now, the term on the left (independent term) is the one that collects all the uncertainty of the problem. In the term of the right (matrix A) there is no uncertainty. Therefore, in addition to having linearized the problem, it has become a GLSM. Therefore, the transformation (17) from the matrix is applicable
Formally obtaining expressions similar to (18) and (19) and, finally (21)
Applying the same criteria as in the previous cases (16) and (20), it results


5.- Example of adjustment by Ordinary Least Squares (OLS)

The examples presented refer to a groove pattern used in dimensional metrology as a calibration standard for the vertical scale of rugosimeters and confocal and tunneling microscopes. The height or depth of the slot is of the order of a few micrometers but its value is usually expressed in nanometers with typical uncertainties of the order of ten nanometers.
Statement 1:
It is desired to analyze the variation as a function of time (temporal drift) of a groove standard.
The following table shows the history of external calibrations (in National Institutes of Metrology, INM) of such standard:
Moments in which the respective calibrations were made are measured in years with respect to the current date.
The calibrations were carried out in four different INMs, having repeated three calibrations in the same INM.
It is intended to make an adjustment of the type: x(t) = a + b·t
Where a would represent the estimate of the slot height at the current date (expressed in nm) and b would represent the annual drift (expressed in nm/year)
Neglecting the calibration uncertainty, you want to make an Ordinary Least Squares Method (OLS) providing the following results:
  1. The height of the groove at the current date along with its typical uncertainty
  2. The annual drift b together with its typical uncertainty
  3. The correlation coefficient r; (a, b)
The previous statement satisfies the conditions of the OLS analysis of section 2, although some symbols have been modified (y1 by a; y2 by b) and there is a parameter, t, whose values are known without uncertainty. According to all this, the system (1) is written:
-The first of the expressions (9), x ≅ A • y, is specified in:
-The solution (13) determines the results: a = 2395.976 nm and b = -2.939 nm/year.
Although the statement collects the typical uncertainties of the values certified by the NMI, the statement indicates that, in this first statement, those are omitted, so the final uncertainty comes exclusively from the mismatch that follows from (13) with value: s = 8.15 nm. Note, however, that the value obtained for s is of the same order as the uncertainties u(x) contained in statement no.1
The covariance matrix of the output variables responds to (14) with ux² = s². The result is:
and with two significant figures u (a) = 6.4 nm; u (b) = 1.2 nm / year; r (a, b) = +0.85.
Figure 1 reproduces the program Ej1 that performs these calculations in MATLAB/Octave.
Fig 1. Code Ej1 in MATLAB/Octave for OLS adjustment. LMM, ETSIIM.

6.- Example of adjustment for Generalized Least Squares (GLS)

This second example extends the data in statement 1 with more information.
Statement 2:
It is a matter of re-making an adjustment of the type x (t) = a + b • t this time by Generalized Least Squared Method (GLSM) taking into account the following additional information:
  1. The calibration uncertainties U (xi)
  2. The presence of correlation between calibration corrections:
  • If the calibration corrections have been provided by the INM itself, it will be assumed that the correlation coefficient is +0.5
  • If the calibration corrections have been provided by different INMs, it will be assumed that the correlation coefficient is zero
With these data, the conditions of the GLS analysis of section 3 are met, so the process starts by determining the covariance matrix, Cx, of the independent term (vector of measurements x). In the program of figure 2, we work with the matrix ~Cx of section 3, which is designated Cx, so that it can be enlarged at the end if ~s >1 (20). To do this, we first obtain the matrix R of correlation coefficients, which satisfies:
  • Unit value in each element of its main diagonal
  • Values equal to 0.5 at positions (1,3) (1,4) (3,4) (3,1) (4,1) and (4,3). Note that the pattern has been recalibrated once at INM#3 (calibrations 1, 3 and 4). In the rest of the INMs the calibration has not been repeated.
Introducing the measurement uncertainties of x is obtained, in nm²,
The weighting matrix is determined by the Choleski transformation, and it results
With the matrix à = W • A you get to the system (18) that determines the solution
In the program Ej2 (where A2 is the matrix Ã) the concrete values are:
that is to say: a = 2395.0 nm and b = -3.5 nm/year.
Note that there has been a small variation in a (1 nm) but a significant variation in b (0.6 nm/year) with respect to the solution obtained in the previous example (OLS adjustment).
The mean square residual (19) is s = 1.1833 which implies that the matrix ~Cx is slightly undervalued, and according to criterion (20), s=~s , and
And with two significant values:
Figure 2 reproduces the program Ej2 that performs these calculations in MATLAB/Octave
 Fig.2 performs GLS adjustements in MATLAB/Octave. LMM, ETSIIM.

7.- Example of adjustment for Totals Leasts Squares Method (TLS)

This last example expands the data in statements 1 and 2 with more information.
Statement 3:
The same adjustment type is analyzed as in the previous two examples, x(t)=a + b • t, now by Totals Least Squares (TLS) when considering the following additional information about the time measures:
or have an uncertainty of u(ti) = 1/√12 month
or there is no correlation between them: u(ti, tj) = 0 ∀ i ≠ j
The reason why u(ti) = 1/√12 month is taken is because in the calibration certificates the duration of the pattern calibration is one month. Assuming a uniform distribution with an amplitude of one month, the typical uncertainty results with the indicated value. Expressed in years, it is
u(ti) = 1/(12√12) year = 0.024 year.
Under these conditions, it is necessary to take into account parameters with uncertainties in the functions (5), so a treatment with Total Least Squares Method (TLSM) expressing ~Cb according to (23). The Ej3 program uses the matrices ~Cx, ~Cq and ~Jq where ~Cx is the same as in the previous example. The other two matrices are:
Corresponding matrix ~Jq to the partial derivatives matrix with respect to the parameters qi (in this case ti) of the functions fi = a + b • ti (section 1).
The value of b (which is an unknown) would not be known until after the adjustment. However, as in the previous exercise, an estimate for b of -3.5 nm/year has been obtained. This value is used in the calculation of ~Cb=cov(b). From this matrix, work is now done as in the previous exercise, obtaining
The final results, with two significant figures, are:
that, at the level of rounding adopted, they coincide with those obtained in the previous example given that the coefficient is very small and the contribution
is negligible compared to Cx in (23). Consequently, Cb ≅ Cx.
Figure 3 reproduces the program Ej3 that performs the calculations of this last example in MATLAB/Octave.
Fig 3. Code Ej3 in MATLAB/Octave for TLS adjustment by LMM, ETSIIM.

References & Bibliography.-
[1] Björck, A .: Numerical Methods for Least Squares Problems. SIAM, 1996. ISBN 0 898771 360 9
[2] Golub, G.H. ; Van Loan, C.F. : Matrix Computations. John Hopkins University Press. 1996. 3rd Edition. ISBN 0 8018 5414-8
[3] Strang, G .; Borre, K.: Linear Algebra, Geodesy and GPS. Ed. Wellesley Cambridge Press. ISBN 0-9614088-6-3 (1997)
[4] Forbes, A.B. : Parameter Estimation Based on Least Squares Problems. Section 5 of the book Data Modeling for Metrology and Testing in Metrology Science. Ed. Birkhaüser. ISBN 978-0-8176-4592-2 (2009)
[5] Cox, M.G. ; Forbes, A.B. ; Harris, P.M. : The classification and solution of regression problems for calibration. NPL Report CMSC 24/03.
[6] Nielsen, L.: Evaluation of measurements by the method of least squares. Proceedings of the Algorithms for Approximation IV Congress. Ed J Leversley et al (Huddersfield: University of Huddersfield) pp 170-86. Available on the BIPM.
[7] Van Huffel, S .; Lemmerling, P.: Total Least Squares an Error-in-Variable Modeling. Ed. Springer-Science + Business Media, B.V. ISBN 978-90-481-5957-4 (2002)
[8] Morkovsky, I .; Van Huffel, S.: Overview of TLS Methods. Signal Processing, vol. 87, pp. 2283-2302, 2007.
[9] de Vicente, J.: Least Squares Adjustments Methods.: approximation to the general problem. Chapter 8 of the notes of the subject Models for Calibrations and Measures of the Metrology Master of the CEM-UPM.
[10] de Vicente, J.: Least Squares Adjustments Methods Guide In Metrology. Chapter 14 of the notes of the subject Models for Calibrations and Measures of the Metrology Master of the CEM-UPM.
[11] JCGM 102: 2011. Evaluation of measurement data – 2nd Supplement to the. "Guide to the expression of uncertainty in measurement" Extension to any number of output quantities. Available on the BIPM.
[12] Wikipedia. Total Least Squares.
[13] Wikipedia. Generalized Least Squares.