Practical guide for the assessment, quality control, and uncertainty analysis of an oenological analysis method
RESOLUTION OIV-OENO 418-2013
PRACTICAL GUIDE FOR THE ASSESSMENT, QUALITY CONTROL, AND UNCERTAINTY ANALYSIS OF AN OENOLOGICAL ANALYSIS METHOD
THE GENERAL ASSEMBLY
IN VIEW OF Article 2 paragraph 2 iv of the Agreement of 3 April 2001 Establishing the International Organisation of Vine and Wine;
IN VIEW OF the works of the Sub-commission of methods of analysis of the OIV;
IN VIEW OF resolution OENO 7/1998 concerning the Validation principle appearing in the Compendium of International Methods of Wine and Must Analysis;
IN VIEW OF resolution OENO 7/2000 concerning the Estimation of the detection and quantification limits of a method of analysis appearing in the Compendium of International Methods of Wine and Must Analysis;
IN VIEW OF resolution OENO 10/2005 concerning the Practical guide for the validation, quality control, and uncertainty assessment of an alternative oenological analysis method appearing in the Compendium of International Methods of Analysis of Wines and Musts;
STRESSES the importance of assisting wine laboratories, carrying out serial analyses, with their assessment of the usual, internal, or alternative standardised analysis methods they implement, and in particular with the statistical tools used to evaluate certain characteristics and to take into account changes in standards; and
CONSIDERING that only the methods published in the Compendium of International Methods of Analysis of Wines and Musts of the OIV or in the Compendium of International Methods of Analysis of Spirituous Beverages of Vitivinicultural Origin of the OIV act as official reference guides and are to be used to settle any disputes that may arise;
DECIDES, without prejudice to the documents listed in Annex E of the Compendium of International Methods of Analysis of Wines and Musts, to adopt and publish independently the attached guidelines “Practical guide for the evaluation, quality control and study of uncertainties of an oenological method of analysis”.
Practical guide for the assessment, quality control, and uncertainty analysis of an oenological analysis method
Warning: this guide is not a reference document and is provided for information only. Unlike the methods contained in the Compendium of International Methods of Analysis of Wines and Musts of the OIV or in the Compendium of International Methods of Analysis of Spirituous Beverages of Vitivinicultural Origin of the OIV, the methods contained herein cannot be considered reference methods. Only the methods published in those two compendia are official and can be used to settle any dispute that may arise.
1. Purpose
The purpose of this guide is to assist oenological laboratories carrying out serial analysis in their processes involving the evaluation, internal quality control, and estimation of the uncertainty of an oenological analysis method that they use.
This guide offers a practical approach to the evaluation, quality control, and study of the uncertainties of an oenological method of analysis. It should be regarded simply as an aid for the laboratory. Consulting other normative or statutory documents, or documents published by internationally recognized bodies, in particular accreditation bodies, is strongly recommended.
The guide is comprehensive: it covers the statistical calculations for the validation of a method, internal quality control, interlaboratory comparisons, and the calculation of uncertainties.
The tests required by the experimental designs in this guide must be carried out under intermediate precision or reproducibility conditions.
The laboratory has to verify the consistency of the results:
- The homogeneity of the within-series variances, using the Cochran test, and the consistency of the within-series standard deviation with the standard deviation, if any, set by the normative method, when within-series results are available.
- The Gaussian distribution of the results and the absence of a trend between series, when results from different series are available.
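The within-series variance check can be sketched as follows. This is a minimal illustration only, assuming every series holds the same number of replicates; the function name `cochran_c` and the data are hypothetical, and the critical value is not computed here: it has to be read from a Cochran table (for example the one given in ISO 5725-2) for the relevant number of series and replicates.

```python
from statistics import variance


def cochran_c(series):
    """Return Cochran's C statistic: the largest within-series variance
    divided by the sum of all within-series variances."""
    variances = [variance(s) for s in series]  # sample variances (n - 1)
    return max(variances) / sum(variances)


# Hypothetical replicate results for three series (illustrative data only)
series = [
    [10.1, 10.3, 10.2],
    [10.0, 10.4, 10.2],
    [10.2, 10.1, 10.3],
]
c = cochran_c(series)
# Compare c with the tabulated critical value for 3 series of 3 replicates;
# a value of c above the critical value flags a suspect variance.
print(f"Cochran C = {c:.3f}")
```

The statistic alone does not decide anything; the laboratory still has to choose the significance level and look up the matching critical value.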
2. Preamble
International standard ISO 17025 states that accredited laboratories must, when implementing an analytical method, make sure of the quality of the results obtained. To do so, it indicates several steps. The first step consists in defining the customers' requirements or regulatory requirements concerning the parameter in question, in order to determine, thereafter, whether the method used meets those requirements. The second step includes an initial evaluation of the method to evaluate its characteristics. In order not to introduce confusion with the notion of interlaboratory validation defined in this Compendium of OIV methods of Type I, II and III, the term evaluation will be used instead of that of validation. Once the method has been evaluated and applied, the laboratories must develop inspection and traceability methods enabling them to ensure the characteristics and performance of the method are maintained. Finally, they must assess the uncertainty of the results obtained using the method.
This guide was primarily prepared for wine laboratories carrying out analyses on fairly large series of must or wine samples. Clearly defining the scope of application in this way facilitated a relevant choice of suitable tools. The guide is strictly compliant with the above-mentioned standards, which are listed in the references at the end of the document. The various chapters include examples of applications taken from wine laboratories using these tools.
The reader’s attention should be drawn to the fact that the mathematical tools presented in the guide must always be subject to critical analysis. When implementing any of them, the laboratory must check whether the method allows for the application of a given tool. Similarly, the laboratory should not be satisfied with the results of the calculations performed using the various tools proposed but must check their relevance and significance. Whenever necessary, the authors have indicated the points requiring further reflection by the laboratory.
3. Normative References
The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
- ISO GUIDE 99: International vocabulary of metrology — Basic and general concepts and associated terms (VIM)
- ISO 5725-2: Accuracy (trueness and precision) of measurement methods and results — Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method
- ISO 8258: Shewhart control charts
- ISO 11352: Estimation of measurement uncertainty based on validation and quality control data
- NF T90-210: Protocol for the initial method performance assessment in a laboratory
4. Scope of application
The standards relating to the various procedures in this guide are distinct. However, it seems appropriate to propose a single implementation guide because they form an indivisible whole when used by a laboratory working on a routine basis. They form the quality assurance environment that a laboratory must implement in order to meet the requirements of ISO 17025. It ensures the performance characteristics of a method of analysis, confirms that they comply with the requirements, and verifies their constancy over time.
During the development of a method of analysis, the laboratory must carry out a certain number of preliminary steps:
4.1. Definition of the requirement
The laboratory must determine the needs it must meet (customer demand, economic criteria, regulatory data, etc.) and assess the performance characteristics it expects of a method to meet these needs.
4.2. Choice of method
Depending on the needs identified, the laboratory may choose to apply different types of methods:
- Standard or consensually accepted method whose characteristics have been determined by an interlaboratory approach.
- Standard or consensually accepted method with minor technical adaptations that do not affect its principle.
- Published and consensually accepted method, but which has not been subject to interlaboratory validation (e.g. type IV OIV methods).
- Method developed in-house.
4.3. Evaluation of the technical requirements
Once the method has been chosen, the laboratory must evaluate the technical needs (equipment, reagents, skills, etc.) that will need to be implemented.
4.4. Development and evaluation of the method
During the development of a method, the laboratory must conduct an initial assessment that precedes the decision to apply it. This evaluation depends on the type of method chosen. After it has been completed, the laboratory accepts and implements the method.
4.5. On-going quality control of the results
The laboratory then continuously monitors the quality of the results with a regular assessment of the data concerning the precision (random error) and trueness (systematic error) and an estimate of uncertainty.
This guide describes these various steps.
5. General vocabulary
The definitions given below are for the use of this document and are based on normative data references in the bibliography, in particular the update of the VIM (International Vocabulary of Metrology).
5.1. Measurement Bias
Bias or estimation of a systematic error.
5.2. Blank test
Test carried out in the absence of a matrix (reagent blank) or on a matrix which does not contain the analyte (matrix blank).
It is essential for the laboratory to specify which type of blank it uses.
5.3. Gauging (of a measuring instrument)
Material positioning of each reference mark (or certain principal reference marks only) of a measuring instrument according to the corresponding value of the measurand. This is a case of applied calibration.
5.4. Intermediate precision conditions
Measurement condition in a set of conditions that includes the same measurement procedure, the same place and repeated measurements on the same object or similar objects for an extended period of time, but may include other conditions that are made to vary.
5.5. Repeatability conditions
Measurement condition in a set of conditions that includes the same measurement procedure, the same operators, the same measurement system, the same operating conditions, and the same place, as well as repeated measurements on the same object or similar objects during a short period of time.
5.6. Reproducibility conditions
Measurement condition in a set of conditions that includes different locations, operators, and measurement systems, as well as repeated measurements on the same or similar objects.
5.7. Scope of the analysis method
Combination of the various matrix types and the range of analyte concentrations covered, to which the analysis method applies.
5.8. Measurement range
All the values of quantities of the same nature that a given measuring instrument or system can measure with a specific instrumental uncertainty, under specified conditions.
5.9. MAD (maximum allowable deviation)
The MAD is an acceptance criterion around an accepted reference value defined on the basis of regulatory, normative, or informative requirements or client requests, or chosen by the laboratory itself.
5.10. Standard deviation
For a series of n measurements of the same measurand, the quantity s characterizing the dispersion of the results and given by the formula:
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
$x_i$ being the result of the ith measurement and $\bar{x}$ the arithmetic mean of the n results in question.
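As a small illustration of this definition, the sketch below computes the standard deviation of a series of results, assuming the usual experimental standard deviation with n - 1 in the denominator. The function name and the data are hypothetical.

```python
from math import sqrt


def standard_deviation(results):
    """Experimental standard deviation of n results (n - 1 denominator)."""
    n = len(results)
    if n < 2:
        raise ValueError("at least two results are needed")
    mean = sum(results) / n
    return sqrt(sum((x - mean) ** 2 for x in results) / (n - 1))


# Hypothetical series of 8 results of the same measurand
results = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s = standard_deviation(results)
print(f"s = {s:.3f}")
```

The standard library function `statistics.stdev` gives the same value and can be used instead of the explicit formula.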
5.11. Intermediate precision standard deviation
Standard deviation of replicates obtained under conditions of intermediate precision.
5.12. Repeatability standard deviation
Standard deviation of replicates obtained under repeatability conditions.
5.13. Reproducibility standard deviation
Standard deviation of replicates obtained under reproducibility conditions.
5.14. Random error
Component of the measurement error which, in replicate measurements, varies unpredictably.
5.15. Measurement error
Difference between the measured value of a quantity and a reference value.
5.16. Systematic error
Component of the measurement error which, in replicate measurements, remains constant or varies in a predictable manner.
NOTE: it is possible to correct for a known systematic error.
5.17. Calibration
Operation that, under specified conditions, in a first step, establishes a relation between the values with measurement uncertainties provided by measurement standards and corresponding indications with associated measurement uncertainties and, in a second step, uses this information to establish a relation for obtaining a measurement result from an indication.
5.18. Accuracy of measurement
The closeness of agreement between a measured value and the true value of the measurand.
The term "accuracy", when applied to a set of test results, implies a combination of random components (precision) and a common systematic error or a bias component (trueness).
5.19. Measurement precision
The closeness of agreement between the indications or measured values obtained by repeated measurements of the same or similar objects under specified conditions.
NOTE 1 Precision depends only on the distribution of random errors and does not relate to the true or to the specified value.
NOTE 2 Precision is usually expressed numerically by characteristics such as the standard deviation or the coefficient of variation under the specified conditions.
NOTE 3 The specified conditions can be, for example, repeatability conditions, intermediate precision conditions, or reproducibility conditions.
5.20. Quantity (measurable)
Property of a phenomenon, body, or substance that can be expressed quantitatively as a number and a reference.
5.21. Uncertainty of measurement
Non-negative parameter that characterizes the dispersion of the values attributed to a measurand, based on the information used.
5.22. Trueness of the measurement
The closeness of agreement between the average of an infinite number of repeated measured values and a reference value.
NOTE 1 The trueness of the measurement varies inversely with the systematic error, but is not related to the random error.
NOTE 2 The measurement of trueness is generally expressed in terms of bias.
5.23. Detection limit (DL)
Measured value, obtained by a given measurement process, for which the probability of falsely stating the absence of a constituent in a material is β, given the probability α of falsely stating its presence.
5.24. Method quantification limit (QL)
The lowest amount of an analyte that can be quantitatively determined with an acceptable uncertainty under the experimental conditions described in the method.
In the absence of normative or regulatory requirements, the acceptable uncertainty on the quantification limit is 60% of the quantification limit, by convention.
5.25. Test material
Material or substance to which the analysis method under consideration is to be applied.
5.26. Reference Material (RM)
Material sufficiently homogeneous and stable with respect to specified properties, which has been prepared to be suitable for its intended use in an investigation of measurement error.
5.27. Certified Reference Material (CRM)
Reference material accompanied by documentation issued by an authoritative body and providing one or more specified property values with associated uncertainties and traceability, obtained by valid procedures.
5.28. Matrix
All the constituents of the sample other than the analyte.
By extension, a matrix is defined by the analyst as a set of samples characterised by homogeneous behaviour with respect to the analysis method used.
5.29. Measurement or measuring or test
The process of experimentally obtaining one or more values that can be reasonably attributed to a quantity.
5.30. Measurand
The quantity to be measured.
5.31. Measurement method
Generic description of the logical organisation of the operations implemented in measuring.
5.32. Calibration model
Mathematical function which links an informative value to a measurand such as the concentration in analyte, within a given interval.
Examples of calibration functions: linear model, quadratic model, new rational model.
NOTE An alternative analysis method can consist of a simplified version of the reference method.
5.33. Average
For a series of n measurements of the same measurand, the mean value, given by the formula:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
$x_i$ being the result of the ith measurement.
5.34. Measurement result
Set of values attributed to a measurand, supplemented by any other available relevant information.
NOTE: the measurement result is usually expressed by a single measured value and measurement uncertainty.
5.35. Accepted reference value
A value that serves as an agreed-upon reference for comparison, and which is derived from:
a) a theoretical or established value, based on scientific principles;
b) an assigned or certified value, based on experimental work of a national or international organisation;
c) a consensus or certified value, based on collaborative experimental work under the auspices of a scientific or engineering group;
d) when a), b), and c) are not available, the expectation of the (measurable) quantity, i.e. the mean of a specified population of measurements.
Within the particular framework of this document, and where possible, the accepted reference value (or conventionally true value) of the sample is provided by:
- the certificate value of a certified reference material,
- the consensus value resulting from an interlaboratory comparison,
- the arithmetic mean of the values of measurements repeated as per the reference method, corrected for bias using estimates of bias obtained from parallel measurements of some other reference material of a different type,
- the value targeted by addition of the analyte to a matrix representative of the scope of application.
5.36. Variance
Square of the standard deviation.
6. General principles: Measurement error
Any measurement carried out using the method under study yields a result which is inevitably associated with a measurement error, defined as being the difference between the result obtained and the true value of the measurand. In reality, the true value of the measurand is impossible to determine and a conventionally-accepted value is used instead.
Measurement error includes two components, systematic error and random error:
True value = analysis result + systematic error + random error
In practice, the random error is taken as having a null expectation under the relevant set of measurement conditions, and the systematic error is then the negative of the bias (under those conditions).
The evaluation and quality control tools are used to evaluate systematic error and random error, and to monitor their changes over time. They are also used to estimate the uncertainty associated with a test result.
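As an illustration of the two components, the following sketch estimates the bias and the dispersion from replicate results obtained on a material with an accepted reference value. It is only a sketch under stated assumptions: the function name and the data are hypothetical, the bias is taken in the usual sense (mean of the results minus the reference value), and with the relation given above the systematic error has the opposite sign.

```python
from statistics import mean, stdev


def error_components(results, reference_value):
    """Return (bias, standard deviation) for replicate results.

    bias: mean of the results minus the accepted reference value.
    standard deviation: estimate of the random error dispersion.
    """
    bias = mean(results) - reference_value
    spread = stdev(results)
    return bias, spread


# Hypothetical replicates on a material with an accepted value of 5.00
results = [5.12, 5.08, 5.15, 5.10, 5.05]
bias, spread = error_components(results, 5.00)
systematic_error = -bias  # sign convention of the relation above
print(f"bias = {bias:.3f}, standard deviation = {spread:.3f}")
```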
7. Initial evaluation of an oenological analysis method
7.1. Methodology
Depending on how much is known about the characteristics of a given method, the evaluation approaches will be more or less complete.
If the laboratory adopts a method whose characteristics are consensually recognized, especially in the case of standard methods, it can carry out a study limited to measuring the performance of the method in the laboratory. If the laboratory adapts, modifies, or develops a method, it must implement a more comprehensive evaluation process, to ensure that it knows all the relevant characteristics of the method.
Assessing the method in all cases involves the three major steps:
- identification of needs;
- study of performance characteristics and features;
- acceptance of the method.
The laboratory uses a variety of assessment tools, as needed. It is up to the laboratory to correctly choose the most suitable tools for the method to be validated.
| STEP | SUB-STEP 1 | SUB-STEP 2 |
|---|---|---|
| Identification of needs | Scope | |
| | Measuring range | |
| | Level of performance required | |
| Study of performance characteristics and features | Range of validity | |
| | Matrices analyzed | |
| | Study of the calibration function | |
| | Study of the detection and quantification limits | |
| | Robustness | |
| | Study of extraction yields | |
| | Specificity study | |
| | Precision of the method | Repeatability study |
| | | Study of intermediate precision |
| | Method accuracy | Study of the accuracy profile |
| | | Comparison with other analytical systems |
| Conclusions and adoption of the method | Summary of method performance | |
| | Statement on the validity of the method | |
7.2. Step one: identification of needs
The needs are identified according to the requirements defined by the intended use of the method. These requirements may be of several types:
- technical requirements;
- requirements of standards;
- regulatory requirements;
- customer requirements;
- other
Wherever possible, these requirements should be translated into quantitative criteria (precision, accuracy, QL, etc.). The purpose of validation is to check the compatibility of the actual performance characteristics of the method with the expected performance characteristics as defined.
7.2.1. Definition of analysable matrices
The matrix comprises all constituents in the test material other than the analyte.
If these constituents are liable to influence the result of a measurement, the laboratory should define the matrices for which the method is applicable.
For example, in oenology, the determination of certain parameters can be influenced by the various possible matrices (wines, musts, sweet wines, etc.).
In case of doubt about a matrix effect, more in-depth studies can be carried out as part of the specificity study.
7.2.2. Definition of the measuring range
The laboratory defines the measurement range covered by the intended use of the method.
7.2.3. Definition of the required performance levels
The compilation of the internal and external requirements is used to define the performance levels required of the method. These performance levels can be expressed as a quantity setting out acceptable variations, referred to as the MAD (Maximum Allowable Deviation).
MAD is a general term that takes various specific forms depending on the parameters being tested:
- The MADaccuracy characterizes the maximum admissible deviation of the analytical method compared with an accepted reference value.
- The MADreproducibility characterizes the allowable deviation for the dispersion of results obtained under reproducibility conditions.
The MADreproducibility can be defined in several ways, for example from:
- ROIV, the reproducibility limit R given by the Compendium of International Methods of Analysis of Wines and Musts of the OIV;
- RPTS, the reproducibility limit R given by an organisation of proficiency-testing schemes;
- R derived from the Horwitz model, an empirical model that establishes a relationship between the concentration of the analyte and the interlaboratory reproducibility, where c is the concentration expressed as a dimensionless mass ratio.
- The MADRw for within-laboratory (intermediate) reproducibility that characterizes the allowable deviation for the dispersion of results obtained under conditions of intra-laboratory precision.
- The MADcalibration characterizes the allowable deviation for the calibration function. Testing conformity with this MAD compares individual measured values with the values predicted from the regression curve. Such a MAD would have to allow for (at least) the within-laboratory (intermediate) reproducibility as well as for the uncertainty in the calibration.
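For the Horwitz option, the sketch below computes a reproducibility limit from a concentration. It relies on the commonly quoted form of the Horwitz relation, RSD_R (%) = 2^(1 - 0.5 log10 c), and on the usual factor 2.8 between a reproducibility standard deviation and the reproducibility limit R. These are assumptions brought in for illustration, as are the function names; the way the laboratory derives its MAD from R is not specified here.

```python
from math import log10


def horwitz_rsd_percent(c):
    """Predicted interlaboratory relative standard deviation (%) for a
    concentration c expressed as a dimensionless mass ratio."""
    if not 0 < c <= 1:
        raise ValueError("c must be a mass ratio in (0, 1]")
    return 2 ** (1 - 0.5 * log10(c))


def horwitz_reproducibility_limit(c):
    """Reproducibility limit R = 2.8 * s_R, in the same unit as c."""
    s_r = horwitz_rsd_percent(c) / 100 * c  # reproducibility standard deviation
    return 2.8 * s_r


# Example: analyte at a mass ratio of 1e-6 (1 mg/kg)
c = 1e-6
print(f"RSD_R = {horwitz_rsd_percent(c):.1f} %")
print(f"R = {horwitz_reproducibility_limit(c):.2e}")
```

Converting a result in g/l to a mass ratio requires the density of the sample, which the laboratory has to supply.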
7.3. Step two: evaluating method performance characteristics in the laboratory
7.3.1. Study of the calibration function
Note: The general term calibration is used here. The more restrictive term gauging (see vocabulary) may be used in certain reference texts.
7.3.1.1. Purpose
The fit of the calibration data points on the calibration line or curve is never perfect. The mathematical calibration function thus generates a source of error which will be included in the method’s uncertainty budget.
The following figure shows the error for a calibration line between the signal value of a measuring instrument and the value of the measurand.
The purpose of this study is to examine the error produced by the mathematical calibration function by comparing it with a maximum allowable deviation. This deviation, which is specifically defined for the calibration function, is called MADcalibration.
How to define a MADcalibration
The assessment coordinator sets a MAD for each concentration level i to evaluate a calibration range.
NOTES:
- This study is, of course, only meaningful when the method introduces a calibration function.
- This study applies only to methods having a signal specific to the measurand. It is not relevant to methods with a non-specific signal (e.g. FTIR).
- The study concerns the analytical magnitudes calculated from the calibration or gauging function established during calibration or gauging operations of the instrument or method. There are various mathematical models used for this purpose:
- Linear model $y = a_1 x + a_0$
- Quadratic (or polynomial) model $y = a_2 x^2 + a_1 x + a_0$
- Other models
The laboratory must verify or possibly choose the appropriate model.
7.3.1.2. Experiment schedule
This study requires that the laboratory uses stable reference materials whose accepted values have been determined with certainty. These may therefore be internal reference materials spiked or formulated with traceable equipment, wines or musts whose value is given by the mean of at least 3 repetitions of the reference method, external reference materials, or certified external reference materials.
The choice and constitution of reference materials must make it possible to ensure that their matrices are compatible with the method under study.
In all cases, it is essential that the reference materials used in the experiment schedule are perfectly independent of the standards previously used to calibrate the instrument.
It is recommended to set up a number p of reference materials. This number will be at least equal to 3, but it is not necessary to go beyond 10. Accepted values for reference materials must be distributed evenly over the range of values studied.
All p reference materials should be measured the same n number of times, under intermediate precision conditions (for example over several days), n being at least equal to 5.
It is essential to carry out each individual measurement on independent reference materials in order to take into account the errors in the formulation of the standards.
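The numerical rules of this experiment schedule can be checked with a short script. The sketch below is illustrative only: the function names are hypothetical, it tests the minimum number of reference materials (3), the minimum number of measurements per material (5), and the absence of a blank level, which this section also excludes. It lists the spacing between levels so that the analyst can judge whether they are distributed evenly; it does not decide that point itself.

```python
def check_calibration_design(accepted_values, n_measurements):
    """Return a list of problems found in a calibration-study design."""
    problems = []
    if len(accepted_values) < 3:
        problems.append("fewer than 3 reference materials")
    if n_measurements < 5:
        problems.append("fewer than 5 measurements per reference material")
    if any(value <= 0 for value in accepted_values):
        # a blank (0) is not an analytical value and should not be used
        problems.append("blank or non-positive level included")
    return problems


def level_spacings(accepted_values):
    """Differences between consecutive levels, for a visual evenness check."""
    ordered = sorted(accepted_values)
    return [round(b - a, 6) for a, b in zip(ordered, ordered[1:])]


# Hypothetical design: 4 reference materials, 5 measurements each
levels = [0.5, 1.5, 2.5, 3.5]
print(check_calibration_design(levels, 5))
print(level_spacings(levels))
```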
NOTE:
"0" is not an analytical value. A "blank" standard should not be used in the experiment schedule; the lowest standard must have a measurand value near to the quantification limit.
7.3.1.3. Compilation of the results
The results and calculations are compiled in the following table, for the linear model $y = a + b x$.

| Reference materials | | 1 | … | j | … | p | a | b |
|---|---|---|---|---|---|---|---|---|
| Accepted value of reference material | | $x_1$ | … | $x_j$ | … | $x_p$ | | |
| Measured value | Standard 1 | $y_{1,1}$ | … | $y_{1,j}$ | … | $y_{1,p}$ | $a_1$ | $b_1$ |
| | … | … | … | … | … | … | … | … |
| | Standard i | $y_{i,1}$ | … | $y_{i,j}$ | … | $y_{i,p}$ | $a_i$ | $b_i$ |
| | … | … | … | … | … | … | … | … |
| | Standard n | $y_{n,1}$ | … | $y_{n,j}$ | … | $y_{n,p}$ | $a_n$ | $b_n$ |
| Calculated value | Standard 1 | $\hat{y}_{1,1}$ | … | $\hat{y}_{1,j}$ | … | $\hat{y}_{1,p}$ | | |
| | … | … | … | … | … | … | | |
| | Standard i | $\hat{y}_{i,1}$ | … | $\hat{y}_{i,j}$ | … | $\hat{y}_{i,p}$ | | |
| | … | … | … | … | … | … | | |
| | Standard n | $\hat{y}_{n,1}$ | … | $\hat{y}_{n,j}$ | … | $\hat{y}_{n,p}$ | | |
| Calculated difference | Standard 1 | $d_{1,1}$ | … | $d_{1,j}$ | … | $d_{1,p}$ | | |
| | … | … | … | … | … | … | | |
| | Standard i | $d_{i,1}$ | … | $d_{i,j}$ | … | $d_{i,p}$ | | |
| | … | … | … | … | … | … | | |
| | Standard n | $d_{n,1}$ | … | $d_{n,j}$ | … | $d_{n,p}$ | | |

$y_{i,j}$ is the measured value of the jth reference material with the standard No. i.
$x_j$ is the accepted value of the jth reference material.
$b_i$ is the slope of the regression line No. i.
$a_i$ is the intercept of the regression line No. i.
$\hat{y}_{i,j}$ is the calculated value of the jth reference material with the regression line No. i.
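The parameters a and b for the regression line are obtained using the following calculations:
- $\bar{y}_j = \frac{1}{n} \sum_{i=1}^{n} y_{i,j}$, the average of the n measurements of the jth reference material
- $\bar{x} = \frac{1}{p} \sum_{j=1}^{p} x_j$, the average of the accepted values of the p reference materials
- $\bar{y} = \frac{1}{p} \sum_{j=1}^{p} \bar{y}_j$, the average of all the measurements
- $b = \frac{\sum_{j=1}^{p} (x_j - \bar{x})(\bar{y}_j - \bar{y})}{\sum_{j=1}^{p} (x_j - \bar{x})^2}$, the estimated slope
- $a = \bar{y} - b \bar{x}$, the estimated intercept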
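The regression parameters can be computed with an ordinary least-squares fit. The following is a minimal sketch for the linear model y = a + b.x, assuming one y value per accepted value x (for instance the mean of the measurements on each reference material); the function name and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx            # estimated slope
    a = mean_y - b * mean_x  # estimated intercept
    return a, b


# Hypothetical accepted values and mean measured values
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
a, b = fit_line(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Applying the same function to the results of a single calibration gives the slope and intercept of that calibration's own regression line.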
7.3.1.4. Graphical analysis of results
Theoretical and actual values
The first type of graph represents the calculated values as a function of the accepted values of reference materials. The line Y = X, on which calculated and accepted values coincide, is also plotted.
Figure 1 – Representation of calculated values in relation to accepted values of reference materials, and the "calculated value = accepted value" straight line
Study of calculated differences in relation to a MADcalibration
The purpose is to check that all the individual differences calculated on each analysed calibrator (or gauge) are acceptable using a MADcalibration set by the operator.
If the differences are lower than the MADcalibration set for each reference material then the calibration function is regarded as acceptable in the range under study (Figure 2).
In the opposite case, the calibration function cannot be used in the range under study.
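The acceptance rule can be sketched as a simple comparison of every individual difference with the MADcalibration of its concentration level. The function name, the data, and the MAD values below are hypothetical; the strict "lower than" comparison follows the wording of the rule above.

```python
def find_mad_failures(differences, mad_by_level):
    """Return (calibration index, level index, difference) for every
    difference whose absolute value is not below the MAD of its level.

    differences[i][j]: difference for calibration i and reference material j.
    mad_by_level[j]: MADcalibration set for reference material j.
    """
    failures = []
    for i, row in enumerate(differences):
        for j, (d, mad) in enumerate(zip(row, mad_by_level)):
            if not abs(d) < mad:
                failures.append((i, j, d))
    return failures


# Hypothetical differences for 2 calibrations x 3 reference materials
differences = [
    [0.02, -0.04, 0.06],
    [-0.01, 0.05, -0.12],
]
mad_by_level = [0.05, 0.08, 0.10]  # hypothetical MADcalibration per level
failures = find_mad_failures(differences, mad_by_level)
print("accepted" if not failures else f"rejected: {failures}")
```

Returning the offending cells rather than a plain yes/no makes it easier to see whether the failures cluster at one end of the range, which is useful when deciding on further actions.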
NOTE
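The study of the individual calculated differences between the accepted value of reference materials and the values measured from the calibration equation during each calibration can also be conducted with relative values expressed as a %:
$d_{i,j}\,(\%) = 100 \times \frac{\hat{y}_{i,j} - x_j}{x_j}$
where $\hat{y}_{i,j}$ is the value calculated for the jth reference material during calibration No. i and $x_j$ is its accepted value.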
7.3.1.5. Further actions when the calibration function is not accepted in the range under study
If the calibration function is not accepted in the calibration range, several solutions are proposed:
- The assessment coordinator repeats the suitability test after reducing the study range.
- The assessment coordinator demonstrates that the range can be segmented into several calibration ranges.
- The assessment coordinator decides that another function is better suited to the method.
7.3.1.6. Example: study of the linear calibration function for the determination of L-malic acid
The laboratory wishes to evaluate the linear calibration function for the determination of L-malic acid in reference materials from 0.15 to 4.20 g/l.
The accepted values for the 4 reference materials are: 0.15 – 0.80 – 2.80 – 4.20 g/l.
The values measured on each reference material during each calibration are indicated in the following table:
Reference materials | 1 | 2 | 3 | 4 |
Accepted value of reference material (g/l) | 0.15 | 0.80 | 2.80 | 4.20 |
Calculated values (g/l) | | | | |
Oct. 7 | 0.09 | 0.73 | 2.74 | 4.21 |
Oct. 22 | 0.14 | 0.81 | 2.72 | 4.29 |
Oct. 25 | 0.12 | 0.77 | 2.71 | 4.28 |
Oct. 27 | 0.10 | 0.72 | 2.77 | 4.27 |
Oct. 30 | 0.07 | 0.77 | 2.85 | 4.15 |
Differences: calculated value minus accepted value (g/l) | | | | |
Oct. 7 | -0.06 | -0.07 | -0.06 | 0.01 |
Oct. 22 | -0.01 | 0.01 | -0.08 | 0.09 |
Oct. 25 | -0.03 | -0.03 | -0.09 | 0.08 |
Oct. 27 | -0.05 | -0.08 | -0.03 | 0.07 |
Oct. 30 | -0.08 | -0.03 | 0.05 | -0.05 |
The individual calculated differences between accepted value of reference materials and the values calculated from the calibration equation during each calibration are indicated in the following figure for each material:
The laboratory wishes to evaluate the calibration function starting with a maximum allowable deviation of 0.10 in relation to the reference value for each material used during calibration.
The linear calibration function is considered acceptable in the range studied with the MADcalibration approach since all the calculated absolute differences are lower than the maximum acceptable deviation of 0.10 set by the laboratory.
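The acceptance check in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names are hypothetical and not part of the guide, and the data are the accepted and calculated values of the L-malic acid example.

```python
# Accepted values (g/l) of the four reference materials in the example.
ACCEPTED = [0.15, 0.80, 2.80, 4.20]

# Values calculated from the calibration equation on each calibration date.
CALCULATED = {
    "Oct. 7": [0.09, 0.73, 2.74, 4.21],
    "Oct. 22": [0.14, 0.81, 2.72, 4.29],
    "Oct. 25": [0.12, 0.77, 2.71, 4.28],
    "Oct. 27": [0.10, 0.72, 2.77, 4.27],
    "Oct. 30": [0.07, 0.77, 2.85, 4.15],
}

MAD_CALIBRATION = 0.10  # maximum allowable deviation set by the laboratory (g/l)


def differences(calculated, accepted):
    """Calculated value minus accepted value, rounded to two decimals."""
    return [round(c - a, 2) for c, a in zip(calculated, accepted)]


def calibration_accepted(calculated_by_date, accepted, mad):
    """True if every absolute difference is lower than the MAD."""
    return all(
        abs(d) < mad
        for row in calculated_by_date.values()
        for d in differences(row, accepted)
    )


# Largest absolute difference over all calibrations and materials.
largest = max(
    abs(d) for row in CALCULATED.values() for d in differences(row, ACCEPTED)
)
print(largest, calibration_accepted(CALCULATED, ACCEPTED, MAD_CALIBRATION))
```

Rounding the differences to two decimals mirrors the precision of the reported values and avoids floating-point noise in the comparison with the MAD.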
7.3.2. Robustness
7.3.2.1. Definition
Robustness is the capacity of a method to give close results in the presence of slight changes in the experimental conditions likely to occur during the use of the procedure.
7.3.2.2. Recommendations
7.3.3. Study of the detection limit and the quantification limit of an analysis method
7.3.3.1. Preliminary background
In practice, the analytical value "zero" is never reached because of the uncertainty which exists in all measurement methods. For this reason, an analysis cannot return a result of "zero".
Analysis methods with lower values tending towards zero must be subjected to detection limit (DL) and quantification limit (QL) tests.
- If a raw result is above the QL, it is noted as is on the analysis report.
- If a raw result is between the DL and the QL, the laboratory notes, "< QL, traces detected, non-quantifiable" or equivalent.
- If a raw result is lower than the DL, the laboratory notes "not detected" or equivalent
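The three reporting rules above can be sketched as a small Python function. The function name is hypothetical, and the handling of a result exactly equal to the DL or the QL is an assumption made here, since the rules above do not specify it.

```python
def report_result(raw, dl, ql):
    """Return the text noted on the analysis report for a raw result.

    Assumption: a result exactly equal to DL or QL is treated as lying
    between the two limits; the guide does not state this boundary case.
    """
    if raw > ql:
        return str(raw)  # above the QL: noted as is
    if raw >= dl:
        return "< QL, traces detected, non-quantifiable"
    return "not detected"


# Illustrative limits only (mg/l).
for value in (7.0, 3.0, 1.0):
    print(value, "->", report_result(value, dl=1.96, ql=5.65))
```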
7.3.3.2. Purpose
The purpose is to establish the detection limit and quantification limit of a given method by taking into account all the sources of variability related to the method, the apparatus used, and the wine analysed.
NOTE: This step is obviously neither applicable nor necessary for those methods with lower limits not tending towards zero, such as alcohol strength by volume, total acidity, or pH.
The following approaches enable an estimation of the detection limit and quantification limit:
- blank study;
- background noise study for graphical recordings;
- estimation of a pre-supposed value confirmed a posteriori.
These methods are suitable for various situations, but in every case they are mathematical approaches giving results of informative value only.
7.3.3.3. Blank study
Principle
This method can be applied when the blank analysis gives results with a non-zero standard deviation.
The operator will judge the advisability of using reagent blanks, or matrix blanks.
If the blank, for reasons related to uncontrolled signal pre-processing, is sometimes not measurable or does not offer a recordable variation (standard deviation of 0), the operation can be carried out on a very low concentration of the analyte, close to the blank.
Experiment schedule and calculations
- Carry out the analysis of n test materials similar to blanks, n being equal to or higher than 10.
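- Calculate the average and the standard deviation of the results $x_i$ obtained: $\bar{x}_{blank} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $S_{blank} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x}_{blank})^2}{n-1}}$
- From these results the detection limit is conventionally defined by the formula: $DL = \bar{x}_{blank} + 3\,S_{blank}$
- From these results the quantification limit is conventionally defined by the formula: $QL = \bar{x}_{blank} + 10\,S_{blank}$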
Example
The table below shows some results obtained when determining the detection limit for the usual determination of free sulphur dioxide.
Test material # | X (in mg/l) |
1 | 0 |
2 | 1 |
3 | 0 |
4 | 1.5 |
5 | 0 |
6 | 1 |
7 | 0.5 |
8 | 0 |
9 | 0 |
10 | 0.5 |
11 | 0 |
12 | 0 |
The calculated values are as follows:
- n = 12
- average of the blank results = 0.375 mg/l
- Sblank = 0.528 mg/l
- DL = 1.96 mg/l
- QL = 5.65 mg/l
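As a cross-check of this example, the following Python sketch recomputes the blank statistics and the two limits, taking DL as the blank average plus 3 standard deviations and QL as the blank average plus 10 standard deviations (the convention that reproduces the figures listed above). The function name is hypothetical.

```python
from statistics import mean, stdev

# Free sulphur dioxide results (mg/l) on the 12 blank-like test materials.
BLANK_RESULTS = [0, 1, 0, 1.5, 0, 1, 0.5, 0, 0, 0.5, 0, 0]


def detection_and_quantification_limits(results):
    """Return (average, standard deviation, DL, QL) for a blank study."""
    average = mean(results)
    s_blank = stdev(results)  # sample standard deviation (n - 1 denominator)
    return average, s_blank, average + 3 * s_blank, average + 10 * s_blank


average, s_blank, dl, ql = detection_and_quantification_limits(BLANK_RESULTS)
print(round(average, 3), round(s_blank, 3), round(dl, 2), round(ql, 2))
```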
7.3.3.4. Study of the background noise of a graphical recording
This study is described in detail in OIV Resolution OENO 7/2000, published in the Compendium of International Methods of Analysis of Wines and Musts of the OIV.
7.3.3.5. Confirmation of a pre-supposed quantification limit of the method
Principle
The objective is to verify that a presupposed quantification limit QL is valid in a matrix.
For any method, the greater the trend towards low values, the higher the coefficient of variation of intermediate precision. A relative variation of 60% under conditions of intermediate precision is considered to indicate the performance limit of a method, and therefore its QL.
Experiment schedule
Use a test material corresponding to the matrix analyzed whose measurand value corresponds to the QL to be validated:
- Material resulting from an interlaboratory comparison;
- External reference material;
- Material obtained by spiking;
- Material obtained by dilution;
Analyse the selected material in n series (n ≥ 5) under intermediate precision conditions, within the period of stability of the sample.
In each series the measurement is performed r times under repeatability conditions; r must be ≥ 2.
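Table 5 – Table of the values measured on a reference material of an accepted value with a pre-supposed quantification limit
Series | Repetition 1 | … | Repetition j | … | Repetition r | Average | Within-run variance |
1 | $x_{1,1}$ | … | $x_{1,j}$ | … | $x_{1,r}$ | $\bar{x}_1$ | $S_1^2$ |
… | … | … | … | … | … | … | … |
i | $x_{i,1}$ | … | $x_{i,j}$ | … | $x_{i,r}$ | $\bar{x}_i$ | $S_i^2$ |
… | … | … | … | … | … | … | … |
n | $x_{n,1}$ | … | $x_{n,j}$ | … | $x_{n,r}$ | $\bar{x}_n$ | $S_n^2$ |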
Calculations and results
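The following are determined, where $\bar{x}_i$ and $S_i^2$ are the average and the within-run variance of series i:
The average of the within-run variances (variance of repeatability): $S_r^2 = \frac{1}{n}\sum_{i=1}^{n} S_i^2$
Measurement average: $\bar{\bar{x}} = \frac{1}{n}\sum_{i=1}^{n} \bar{x}_i$
Variance of the means: $S_{\bar{x}}^2 = \frac{\sum_{i=1}^{n}(\bar{x}_i - \bar{\bar{x}})^2}{n-1}$
The standard deviation of intermediate precision: $S_{IP} = \sqrt{S_{\bar{x}}^2 + \frac{r-1}{r}\,S_r^2}$
Interpretation of results
The purpose is to be sure of the accuracy of the presupposed quantification limit compared with a maximum allowable deviation (MAD) of 60% of the QL by checking the two following inequalities:
$\bar{\bar{x}} - 2\,S_{IP} > QL - MAD$ and $\bar{\bar{x}} + 2\,S_{IP} < QL + MAD$, with $MAD = 60\% \times QL$.
If either of these two inequalities is not satisfied, the presupposed quantification limit is not confirmed.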
Example
The laboratory wishes to check the quantification limit of the determination of L-malic acid from the values measured on a reference material of value 0.20 g/l compared with a maximum allowable deviation of 60% of the reference value at 0.20 g/l.
The measured values are indicated in the following table:
Series | Repetition: Test 1 | Repetition: Test 2 | Average | Variance of the series |
10 Jan. | 0.23 | 0.22 | 0.225 | 0.000050 |
11 Jan. | 0.25 | 0.24 | 0.245 | 0.000050 |
12 Jan. | 0.23 | 0.23 | 0.230 | 0.000000 |
13 Jan. | 0.25 | 0.26 | 0.255 | 0.000050 |
14 Jan. | 0.24 | 0.25 | 0.245 | 0.000050 |
A study of the results yields the following information:
- the general average on the QL material: 0.240 g/l
- the variance of repeatability: 0.00004
- the variance of the means: 0.00015
- the standard deviation of intermediate precision: sIP = 0.013 g/l
The pre-supposed quantification limit is QL = 0.20 g/l.
The maximum allowable deviation is MAD = 60% × 0.20, or MAD = 0.12 g/l.
The acceptance interval for the QL value is: QL ± 60% × QL, or [ 0.08; 0.32 ].
The interval given by the general average ± 2 × sIP is [ 0.214; 0.266 ], which lies within the acceptance interval. The two inequalities are confirmed, so the pre-supposed quantification limit of the method at 0.20 g/l is confirmed.
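The calculations of this example can be reproduced with the short Python sketch below. It assumes that the standard deviation of intermediate precision combines the variance of the series means with (r - 1)/r times the repeatability variance, which is the combination that reproduces the figures reported above; the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# L-malic acid results (g/l): one tuple of r = 2 repetitions per series.
SERIES = [(0.23, 0.22), (0.25, 0.24), (0.23, 0.23), (0.25, 0.26), (0.24, 0.25)]

QL = 0.20          # pre-supposed quantification limit (g/l)
MAD = 0.60 * QL    # maximum allowable deviation: 60% of the QL

r = len(SERIES[0])                                  # repetitions per series
series_means = [mean(s) for s in SERIES]
s_r2 = mean([variance(s) for s in SERIES])          # variance of repeatability
overall_mean = mean(series_means)                   # general average
var_means = variance(series_means)                  # variance of the means
s_ip = sqrt(var_means + (r - 1) / r * s_r2)         # intermediate precision SD

# The QL is confirmed when the interval mean +/- 2*s_ip lies inside QL +/- MAD.
confirmed = (overall_mean - 2 * s_ip > QL - MAD) and (overall_mean + 2 * s_ip < QL + MAD)
print(round(overall_mean, 3), round(s_ip, 3), confirmed)
```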
7.3.4. Study of extraction yields
7.3.4.1. Purpose
The analytical procedures of some methods contain a preliminary step of extraction/purification/concentration of the analyte. This step generates random and systematic errors which are directly entered into the method’s uncertainty budget.
The yield is defined as the ratio between the quantity recovered after a purification or extraction process and the quantity initially present. The error linked to the extraction yield is in many cases relatively significant, and a particular study of it is essential.
The study of the yield is obviously not useful if there is no extraction treatment prior to the analysis. It may also be useless if the standards undergo the same extraction process as the samples. However, in this case, the laboratory must demonstrate that the extraction is not influenced by a matrix effect.
7.3.4.2. Experiment schedule
Add different levels of several materials representative of the matrix variability within the scope of application of the method.
Test n ≥ 5 different materials and use p ≥ 2 levels of addition. It is not necessary to apply every level of addition to every material.
Carry out measurements of the different test materials in different series, and at different times.
In the absence of recommendations, the levels of addition may be from 20% to 80% of maximum concentration of the scope of application.
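Materials | Initial value of the measurand | Addition | Value after addition | Yield |
1 | $x_1$ | $a_1$ | $x'_1$ | $R_1$ |
… | … | … | … | … |
i | $x_i$ | $a_i$ | $x'_i$ | $R_i$ |
… | … | … | … | … |
n | $x_n$ | $a_n$ | $x'_n$ | $R_n$ |
The following are determined, the yield of each material being the recovered fraction of the addition, $R_i = 100 \times \frac{x'_i - x_i}{a_i}$:
The average yield: $\bar{R} = \frac{1}{n}\sum_{i=1}^{n} R_i$
The standard deviation of yields: $S_R = \sqrt{\frac{\sum_{i=1}^{n}(R_i - \bar{R})^2}{n-1}}$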
7.3.4.3. Interpretation of yields
If the average yield deviates from 100%, the laboratory defines in its documentation whether a yield correction is applied to the final result. If it is not, when calculating the uncertainties the laboratory must take into account the bias introduced by the absence of yield correction.
7.3.5. Study of matrix interferences
7.3.5.1. Purpose
The purpose is to study the influence of other compounds on the measurement result.
If the laboratory suspects the interaction of compounds other than the analyte, an experiment schedule may be set up to test the influence of various compounds. This study therefore mainly applies to methods in which the signal is not specific to the analyte, or in which there is a risk of disruption of the signal by the matrix elements.
The experiment schedule proposed here enables a search to be made for the influence of the compounds defined a priori: given its knowledge of the analytical process and its know-how the laboratory should be able to define a certain number of compounds liable to be present in the wine, and likely to influence the analytical result.
7.3.5.2. Basic protocol and calculations
Analyse n wines in duplicate, before and after the addition of the compound suspected of having an influence on the analytical result; n must be at least equal to 5.
Calculate the average values Mxi of 2 measurements xi and x’i carried out before addition, and the average values Myi of the 2 measurements yi and y'i carried out after addition, then the difference di between the values Mxi and Myi.
The results of the experiment can be reported as indicated in the following table:
Table 7 – Test organisation
Samples | x: Before addition, Rep1 | x: Before addition, Rep2 | y: After addition, Rep1 | y: After addition, Rep2 | Average x | Average y | Difference d |
1 | x1 | x’1 | y1 | y’1 | Mx1 | My1 | d1 = Mx1 - My1 |
… | … | … | … | … | … | … | … |
i | xi | x’i | yi | y’i | Mxi | Myi | di = Mxi - Myi |
… | … | … | … | … | … | … | … |
n | xn | x’n | yn | y’n | Mxn | Myn | dn = Mxn - Myn |
Calculate:
- the average of the results before addition: $M_x = \frac{1}{n}\sum_{i=1}^{n} M_{x_i}$
- the average of the results after addition: $M_y = \frac{1}{n}\sum_{i=1}^{n} M_{y_i}$
- the average of the differences: $M_d = \frac{1}{n}\sum_{i=1}^{n} d_i = M_x - M_y$
- the standard deviation of the differences: $S_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - M_d)^2}{n-1}}$
- the reduced deviation: $\frac{|M_d|}{S_d/\sqrt{n}}$
7.3.5.3. Interpretation
- If the reduced deviation is lower than or equal to 2, the influence has not been proved to be present.
- If the reduced deviation is higher than 2, the added compound can be considered as having been shown (at the 5% level of statistical significance) to influence the analysis result.
7.3.5.4. Example
The purpose is to study the interaction of compounds liable to be in the samples on the determination of glucose and fructose in wines using Fourier Transform Infrared Spectroscopy (FTIR).
Wine | Before addition rep1 | Before addition rep2 | + 250 mg.L-1 potassium sorbate rep1 | + 250 mg.L-1 potassium sorbate rep2 | + 1 g.L-1 salicylic acid rep1 | + 1 g.L-1 salicylic acid rep2 | Sorbate diff. | Salicylic diff. |
1 | 6.2 | 6.2 | 6.5 | 6.3 | 5.3 | 5.5 | -0.2 | 0.8 |
2 | 1.2 | 1.2 | 1.3 | 1.2 | 0.5 | 0.6 | -0.05 | 0.65 |
3 | 0.5 | 0.6 | 0.5 | 0.5 | 0.2 | 0.3 | 0.05 | 0.3 |
4 | 4.3 | 4.2 | 4.1 | 4.3 | 3.8 | 3.9 | 0.05 | 0.4 |
5 | 12.5 | 12.6 | 12.5 | 12.7 | 11.5 | 11.4 | -0.05 | 1.1 |
6 | 5.3 | 5.3 | 5.4 | 5.3 | 4.2 | 4.3 | -0.05 | 1.05 |
7 | 2.5 | 2.5 | 2.6 | 2.5 | 1.5 | 1.4 | -0.05 | 1.05 |
8 | 1.2 | 1.3 | 1.2 | 1.1 | 0.5 | 0.4 | 0.1 | 0.8 |
9 | 0.8 | 0.8 | 0.9 | 0.8 | 0.2 | 0.3 | -0.05 | 0.55 |
10 | 0.6 | 0.6 | 0.5 | 0.6 | 0.1 | 0 | 0.05 | 0.55 |
Compound | Md | Sd | Reduced deviation | Comparison with 2 |
Potassium sorbate | -0.02 | 0.086 | 0.74 | < 2 |
Salicylic acid | 0.725 | 0.282 | 8.13 | > 2 |
In conclusion, the influence of potassium sorbate on the FTIR determination of glucose and fructose has not been proved to exist. Salicylic acid, however, does show an influence. To maintain the validity of the calibration under study, precautions must therefore be taken so that the samples do not contain salicylic acid.
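The summary figures of this example can be recomputed with the Python sketch below. It takes the reduced deviation as the absolute average difference divided by the standard error of the differences, which reproduces the reported values of 0.74 and 8.13; the function name is hypothetical and the inputs are the differences listed for the ten wines.

```python
from math import sqrt
from statistics import mean, stdev

# Differences (before addition minus after addition) for the 10 wines.
SORBATE_DIFFS = [-0.2, -0.05, 0.05, 0.05, -0.05, -0.05, -0.05, 0.1, -0.05, 0.05]
SALICYLIC_DIFFS = [0.8, 0.65, 0.3, 0.4, 1.1, 1.05, 1.05, 0.8, 0.55, 0.55]


def reduced_deviation(diffs):
    """Return (Md, Sd, reduced deviation) for a list of paired differences."""
    md = mean(diffs)
    sd = stdev(diffs)                       # n - 1 denominator
    return md, sd, abs(md) / (sd / sqrt(len(diffs)))


for name, diffs in (("sorbate", SORBATE_DIFFS), ("salicylic", SALICYLIC_DIFFS)):
    md, sd, z = reduced_deviation(diffs)
    verdict = "influence shown" if z > 2 else "influence not proved"
    print(name, round(md, 3), round(sd, 3), round(z, 2), verdict)
```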
7.3.6. Precision of the method
7.3.6.1. General principle
Precision is a general concept that characterizes the random errors of a method. Precision is applied under various experimental conditions:
- Repeatability (r): minimum changes to experimental conditions
- Intermediate precision (IP): changes to intralaboratory experimental conditions.
- Intra-laboratory reproducibility (RW): the most extensive conditions possible of changes in intra-laboratory experimental conditions.
- Reproducibility (R): maximum conditions of change in experimental conditions. Reproducibility therefore refers to interlaboratory conditions.
The precision study is one of the mandatory items for estimating measurement uncertainty.
7.3.6.2. Scope of application
In principle, the precision study can be applied without difficulty to all quantitative methods.
In many cases, precision is not constant throughout the range of validity of the method. It is therefore advisable to define several sections or "range levels", in which we may reasonably consider that the precision is comparable to a constant. The precision calculation then needs to be reiterated for each range level.
Below are the detailed experiment schedules and calculations in the simplified case of the use of a single reference material, and thus a single level of quality.
7.3.6.3. Experiment schedule
A test material is analyzed over a period of time as long as possible with several replicates n. The test material must maintain constant properties during the period in question.
At each replicate, the measurement is performed with p repetitions (p ≥ 2).
The total number of n replicates must at least be equal to 5.
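Series | Repetition 1 | … | Repetition i | … | Repetition p | Within-run average | Within-run variance |
1 | $x_{1,1}$ | … | $x_{1,i}$ | … | $x_{1,p}$ | $\bar{x}_1$ | $S_1^2$ |
… | … | … | … | … | … | … | … |
j | $x_{j,1}$ | … | $x_{j,i}$ | … | $x_{j,p}$ | $\bar{x}_j$ | $S_j^2$ |
… | … | … | … | … | … | … | … |
n | $x_{n,1}$ | … | $x_{n,i}$ | … | $x_{n,p}$ | $\bar{x}_n$ | $S_n^2$ |
The following are defined, where $\bar{x}_j$ is the within-run average of series j:
The overall average of all the measurements: $\bar{\bar{x}} = \frac{1}{n}\sum_{j=1}^{n} \bar{x}_j$
The variance of the averages of the replicates: $S_{\bar{x}}^2 = \frac{\sum_{j=1}^{n}(\bar{x}_j - \bar{\bar{x}})^2}{n-1}$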
7.3.6.4. Calculation of repeatability
The variance of repeatability corresponds to the average of the within-run variances $S_j^2$ of the n series, since the number of repetitions p is identical in each series: $S_r^2 = \frac{1}{n}\sum_{j=1}^{n} S_j^2$
Repeatability standard deviation: $S_r = \sqrt{S_r^2}$
Repeatability is the deviation expanded to 95% (coverage factor k = 2) expected between two results for which the repeatability standard deviation is $S_r$, which results in the following: $r = 2\sqrt{2}\,S_r$
In the case where the number of repetitions p = 2, the calculation is simplified: $S_r = \sqrt{\frac{\sum_{j=1}^{n} w_j^2}{2n}}$, where $w_j$ is the deviation between the two repetitions of the jth replicate.
7.3.6.5. Variation of the experiment schedule to calculate the repeatability with several test materials
Repeatability can be calculated by analyzing in duplicate n different materials, n ≥ 5.
The values of the measurand of the reference material must remain in a range in which the repeatability can be considered to remain constant.
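The repeatability standard deviation is then $S_r = \sqrt{\frac{\sum_{i=1}^{n} w_i^2}{2n}}$, where $w_i$ is the deviation between the two replicates of the ith test material.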
7.3.6.6. Calculation of precision
Based on the overall experiment schedule, the standard deviation of precision Sv is given by the general expression: $S_v = \sqrt{S_{\bar{x}}^2 + \frac{p-1}{p}\,S_r^2}$, where $S_{\bar{x}}^2$ is the variance of the averages of the replicates and $S_r^2$ is the variance of repeatability.
7.3.6.7. Expression of precision
Precision can be expressed directly by its standard deviation Sv or by its value V.
The standard deviation Sv indicates that 95% of the results given by the method for the same material, obtained under given conditions of precision, will be distributed around their average in a range of +/- 2.Sv.
V characterizes the deviation that can be expected, with a 95% probability, between two results produced using a method whose precision is Sv.
V is calculated as follows: $V = 2\sqrt{2}\,S_v$
The precision value V means that in 95% of the cases, the deviation between two values obtained by the method, under the conditions specified, will be less than or equal to V.
NOTE: The use and interpretation of these results is possible if it is assumed that the deviations follow a normal distribution with 95% confidence.
7.3.6.8. Types of precision and their expressions
Repeatability
Repeatability refers to the minimum change in experimental conditions. The between-series variance is therefore zero.
The repeatability standard deviation is expressed by the symbol sr, and the repeatability is $r = 2\sqrt{2}\,s_r$.
Intermediate precision
Intermediate precision corresponds to the changes in experimental conditions within the same laboratory, which lie between the repeatability conditions and the intra-laboratory reproducibility conditions. When reference is made to intermediate precision, it is therefore necessary to specify the experimental conditions.
The standard deviation of intermediate precision is expressed by the symbol sIP, and the intermediate precision is $R_{IP} = 2\sqrt{2}\,s_{IP}$.
Intra-laboratory reproducibility
Intra-laboratory reproducibility corresponds to the most extensive changes in experimental conditions within the same laboratory.
The standard deviation of intra-laboratory reproducibility is expressed by the symbol sRW, and the intra-laboratory reproducibility is $R_W = 2\sqrt{2}\,s_{RW}$.
Reproducibility
Reproducibility corresponds to the maximum changes in experimental conditions. By default, therefore, reproducibility refers to interlaboratory conditions.
The reproducibility standard deviation is expressed by the symbol sR, and the reproducibility is $R = 2\sqrt{2}\,s_R$.
Example
Study of the intra-laboratory reproducibility of the assay of sorbic acid in wines by steam distillation and measurement of the absorption at 256 nm.
A wine to which potassium sorbate was added was stored for a period of 3 months. The determination of sorbic acid was carried out at regular intervals during this period, with three replicates for each measurement.
Series | Repetition 1 | Repetition 2 | Repetition 3 | Within-run average | Within-run variance |
1 | 140 | 139 | 144 | 141.0 | 7.0 |
2 | 138 | 137 | 140 | 138.3 | 2.3 |
3 | 136 | 141 | 140 | 139.0 | 7.0 |
4 | 145 | 148 | 143 | 145.3 | 6.3 |
5 | 133 | 134 | 130 | 132.3 | 4.3 |
6 | 132 | 138 | 135 | 135.0 | 9.0 |
sr = 2.4 mg/L
r = 6.8 mg/L
sRW = 5.0 mg/L
RW = 14.1 mg/L
The repeatability indicates that with 95% probability, the deviations between two results obtained under repeatability conditions will be less than or equal to 6.8 mg/L.
The intra-laboratory reproducibility indicates that with 95% probability, the deviations between two results obtained under the most extensive conditions of variability in the laboratory will be less than or equal to 14.1 mg/L.
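The figures of this example can be recomputed with the Python sketch below. It assumes that the precision standard deviation combines the variance of the series averages with (p - 1)/p times the repeatability variance, and that the 95% value is 2√2 times the standard deviation; these assumptions reproduce the reported 14.1 mg/L. Note that computing r from the unrounded standard deviation gives about 6.9 mg/L, while the reported 6.8 mg/L corresponds to using the rounded value 2.4. Variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Sorbic acid results (mg/L): one tuple of p = 3 repetitions per series.
SERIES = [
    (140, 139, 144), (138, 137, 140), (136, 141, 140),
    (145, 148, 143), (133, 134, 130), (132, 138, 135),
]

p = len(SERIES[0])
series_means = [mean(s) for s in SERIES]

# Repeatability: square root of the average within-run variance.
s_r = sqrt(mean([variance(s) for s in SERIES]))

# Intra-laboratory reproducibility: between-series and within-series parts.
s_rw = sqrt(variance(series_means) + (p - 1) / p * s_r ** 2)

K = 2 * sqrt(2)                    # expansion to 95% for the gap between two results
r_unrounded = K * s_r              # about 6.9 mg/L
r_reported = K * round(s_r, 1)     # 6.8 mg/L, as in the example
rw = K * s_rw                      # 14.1 mg/L

print(round(s_r, 1), round(r_reported, 1), round(s_rw, 1), round(rw, 1))
```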
7.3.7. Study of method accuracy
7.3.7.1. Purpose
The study of the accuracy is based on a holistic approach that assesses the combined effects of the precision and trueness of the method based on values that serve as a reference.
The accuracy of the method is checked on reference value materials and maximum acceptable deviation (MADaccuracy) arising from standard, regulatory or informative requirements or requirements set by the customer or the laboratory.
The reference value of the material is obtained for example by:
- the certificate value of a certified reference material (RefMRC),
- the consensus value resulting from an interlaboratory comparison (RefILC),
- the arithmetic mean of the values of measurements repeated as per the reference method (RefMeth),
- the value targeted by addition of the analyte to a matrix representative of the scope of application (RefAdd).
7.3.7.2. Experiment schedule
Prepare or choose q≥3 reference materials covering the scope of the method in terms of concentration for a given matrix.
Analyse each reference material in n≥5 series under intermediate precision conditions and within the stability time of the material for the matrix in question. If the material is unstable, carry out several preparations.
For each reference material and in each series, carry out p≥2 repetitions under repeatability conditions.
For each material, the following table of results can be drawn up (identical to the table for the study of intermediate precision)
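Series | Repetition 1 | … | Repetition i | … | Repetition p | Within-run average | Within-run variance |
1 | $x_{1,1}$ | … | $x_{1,i}$ | … | $x_{1,p}$ | $\bar{x}_1$ | $S_1^2$ |
… | … | … | … | … | … | … | … |
j | $x_{j,1}$ | … | $x_{j,i}$ | … | $x_{j,p}$ | $\bar{x}_j$ | $S_j^2$ |
… | … | … | … | … | … | … | … |
n | $x_{n,1}$ | … | $x_{n,i}$ | … | $x_{n,p}$ | $\bar{x}_n$ | $S_n^2$ |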
7.3.7.3. Estimate of accuracy parameters
Each reference material corresponds to a given concentration level for which we can define a given MADaccuracy.
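Parameter | Material 1 | … | Material i | … | Material q |
Reference value | Ref1 | … | Refi | … | Refq |
MAD | MAD1 | … | MADi | … | MADq |
MAD% | $100 \times MAD_1 / Ref_1$ | … | $100 \times MAD_i / Ref_i$ | … | $100 \times MAD_q / Ref_q$ |
Number of series | n1 | … | ni | … | nq |
Number of repetitions per series | p1 | … | pi | … | pq |
Overall average for each material | $\bar{x}_1$ | … | $\bar{x}_i$ | … | $\bar{x}_q$ |
Relative deviation | $b\%_1$ | … | $b\%_i = 100 \times (\bar{x}_i - Ref_i) / Ref_i$ | … | $b\%_q$ |
Standard deviation of intermediate precision | $S_{IP,1}$ | … | $S_{IP,i}$ | … | $S_{IP,q}$ |
CV of intermediate precision in % | $CV\%_1$ | … | $CV\%_i = 100 \times S_{IP,i} / \bar{x}_i$ | … | $CV\%_q$ |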
7.3.7.4. Interpretation of accuracy parameters
The study of accuracy is based on the comparison of a tolerance interval (average +/- 2 intermediate precision standard deviations) with an acceptance interval (REF +/- MAD) resulting from the objective fixed by the laboratory.
The accuracy is verified at each concentration level represented by the q materials studied, by comparing the interval produced by the intermediate precision around the mean value measured with the range of MADaccuracy around the reference material value. The accuracy is accepted if the first interval fits within the second.
This is reflected in the verification of the following inequalities, written for material i with overall average $\bar{x}_i$ and standard deviation of intermediate precision $S_{IP,i}$:
(1) $\bar{x}_i + 2\,S_{IP,i} < Ref_i + MAD_i$
(2) $\bar{x}_i - 2\,S_{IP,i} > Ref_i - MAD_i$
The accuracy is verified if (1) and (2) are verified.
These quantities can equally be expressed in relative terms, with the relative deviation $b\%_i$ and the coefficient of variation $CV\%_i$:
(1) $b\%_i + 2\,CV\%_i < MAD\%_i$
(2) $b\%_i - 2\,CV\%_i > -MAD\%_i$
The accuracy is accepted if (1) and (2) are confirmed.
If the accuracy of the method is verified on the q reference materials, then the accuracy of the method is verified in the validation range.
7.3.7.5. Example
The laboratory wishes to verify the accuracy of an automated colorimetric method on a sequential analyzer for the determination of iron between 1 and 6 mg/l.
The laboratory tests were carried out for 5 days on three materials whose target values were obtained during interlaboratory tests.
Each analysis is repeated 2 times under repeatability conditions.
Material 1 | Repetition 1 | Repetition 2 | Average | Variance of the series |
Day 1 | 0.95 | 0.96 | 0.955 | 0.00005 |
Day 2 | 1.01 | 1.02 | 1.015 | 0.00005 |
Day 3 | 0.90 | 0.93 | 0.915 | 0.00045 |
Day 4 | 0.88 | 0.87 | 0.875 | 0.00005 |
Day 5 | 0.97 | 0.98 | 0.975 | 0.00005 |
Material 2 | Repetition 1 | Repetition 2 | Average | Variance of the series |
Day 1 | 2.05 | 2.02 | 2.035 | 0.00045 |
Day 2 | 2.23 | 2.19 | 2.210 | 0.00080 |
Day 3 | 2.06 | 2.10 | 2.080 | 0.00080 |
Day 4 | 2.31 | 2.35 | 2.330 | 0.00080 |
Day 5 | 2.19 | 2.25 | 2.220 | 0.00180 |
Material 3 | Repetition 1 | Repetition 2 | Average | Variance of the series |
Day 1 | 6.18 | 6.13 | 6.155 | 0.00125 |
Day 2 | 5.75 | 5.82 | 5.785 | 0.00245 |
Day 3 | 5.95 | 5.90 | 5.925 | 0.00125 |
Day 4 | 5.85 | 5.90 | 5.875 | 0.00125 |
Day 5 | 6.05 | 6.04 | 6.045 | 0.00005 |
The results are summarized in the following table:
Parameter | Material 1 | Material 2 | Material 3 |
Reference value | 0.96 | 2.06 | 5.98 |
MADaccuracy | 0.192 | 0.412 | 0.7176 |
MADaccuracy% | 20% | 20% | 12% |
Number of series | 5 | 5 | 5 |
Number of repetitions per series | 2 | 2 | 2 |
Overall average for each material | 0.947 | 2.175 | 5.957 |
Relative deviation | -1.35% | 5.58% | -0.38% |
Standard deviation of intermediate precision | 0.055 | 0.12 | 0.147 |
CV of intermediate precision in % | 5.81% | 5.52% | 2.47% |
Verification of the accuracy in absolute values | | | |
Ref + MAD | 1.152 | 2.472 | 6.6976 |
x+2S | 1.057 | 2.415 | 6.251 |
x-2S | 0.837 | 1.935 | 5.663 |
Ref - MAD | 0.768 | 1.648 | 5.2624 |
Conclusion | Accuracy verified | Accuracy verified | Accuracy verified |
Verification of the accuracy in relative values | | | |
MAD% | 20% | 20% | 12% |
b% + 2.CV% | 10.26% | 16.62% | 4.55% |
b% - 2.CV% | -12.97% | -5.45% | -5.32% |
-MAD% | -20% | -20% | -12% |
Conclusion | Accuracy verified | Accuracy verified | Accuracy verified |
The graphical representation of the accuracy profile is as follows (relative values).
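The absolute-value verification of this example can be recomputed with the Python sketch below. It assumes the same combination of between-series and within-series variances as in the intermediate precision study, and checks that the interval average ± 2 standard deviations lies inside Ref ± MAD. The function and dictionary names are hypothetical; the data are the iron results listed above.

```python
from math import sqrt
from statistics import mean, variance

# Reference value, MAD in % of the reference, and (rep 1, rep 2) per day.
MATERIALS = {
    "Material 1": (0.96, 20, [(0.95, 0.96), (1.01, 1.02), (0.90, 0.93), (0.88, 0.87), (0.97, 0.98)]),
    "Material 2": (2.06, 20, [(2.05, 2.02), (2.23, 2.19), (2.06, 2.10), (2.31, 2.35), (2.19, 2.25)]),
    "Material 3": (5.98, 12, [(6.18, 6.13), (5.75, 5.82), (5.95, 5.90), (5.85, 5.90), (6.05, 6.04)]),
}


def accuracy_check(ref, mad_pct, series):
    """Return (overall average, intermediate precision SD, verified?)."""
    p = len(series[0])                              # repetitions per series
    series_means = [mean(s) for s in series]
    overall = mean(series_means)
    s_r2 = mean([variance(s) for s in series])      # repeatability variance
    s_ip = sqrt(variance(series_means) + (p - 1) / p * s_r2)
    mad = ref * mad_pct / 100
    verified = (overall + 2 * s_ip < ref + mad) and (overall - 2 * s_ip > ref - mad)
    return overall, s_ip, verified


for name, (ref, mad_pct, series) in MATERIALS.items():
    overall, s_ip, verified = accuracy_check(ref, mad_pct, series)
    print(name, round(overall, 3), round(s_ip, 3), verified)
```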
7.3.8. Comparison of two methods
7.3.8.1. Purpose
The purpose is to compare two different methods for the same parameter, and to characterize the deviation between the methods. The comparison is made by analyzing materials with each method under repeatability conditions.
This method will apply if the laboratory uses the OIV reference method, or possibly a traceable, validated method with known performance characteristics and which meets the needs of customers of the laboratory.
Before studying the difference in trueness (bias) between the two methods, it is first necessary to check the repeatability of the method being validated and to compare it with that of the reference method. The methodology for comparing repeatability is described in the chapter on repeatability.
7.3.8.2. Experiment schedule
The trueness of the alternative method compared with that of the reference method is studied within a scope of application in which the precision of the two methods is constant and in which the bias of the alternative method as compared to the reference method is constant.
In practice, the range of analyzable values should be divided up into several (2 to 5) sections or "range levels", in which it can be reasonably estimated that the precision of each method is approximately constant, and the experiment schedule then carried out in each range.
In each range level, the trueness will be determined on the basis of a set of n test materials, each with analyte concentration levels covering the range level in question. A minimum of 10 test materials is required to obtain significant results.
Each test material will be analyzed in duplicate by both methods under repeatability conditions.
Calculate the average values Mxi of the 2 measurements xi and x’i carried out using the alternative method, and the average values Myi of the 2 measurements yi and y’i carried out using the reference method, then the difference di between the values Mxi and Myi.
The results of the experiment can be reported as indicated in the following table:
Test material | x: Alternative method, Rep1 | x: Alternative method, Rep2 | y: Reference method, Rep1 | y: Reference method, Rep2 | Average Mx | Average My | Deviation wx | Deviation wy | Difference d |
1 | x1 | x’1 | y1 | y’1 | Mx1 | My1 | wx1 = x’1 - x1 | wy1 = y’1 - y1 | d1 = Mx1 - My1 |
… | … | … | … | … | … | … | … | … | … |
i | xi | x’i | yi | y’i | Mxi | Myi | wxi = x’i - xi | wyi = y’i - yi | di = Mxi - Myi |
… | … | … | … | … | … | … | … | … | … |
n | xn | x’n | yn | y’n | Mxn | Myn | wxn = x’n - xn | wyn = y’n - yn | dn = Mxn - Myn |
7.3.8.3. Comparison of repeatability
Calculations
Let
Sralt be the repeatability standard deviation of the method evaluated;
Srref be the repeatability standard deviation of the reference method.
The comparison is direct. If the repeatability value of the alternative method is less than or equal to that of the reference method, the result is favourable. If it is higher, the laboratory must ensure that this result is consistent with the specifications that it has accepted for the method in question. In the latter case, it may also apply a Fisher-Snedecor test to see if the difference is statistically significant.
Fisher-Snedecor test
Calculate the ratio: $F_{obs} = \frac{S_{r,alt}^2}{S_{r,ref}^2}$
Use the critical value of Snedecor with a risk α equal to 0.05, corresponding to the Fisher variable at confidence level 1 - α with ν1 = N(x) - n and ν2 = N(y) - m degrees of freedom: F(N(x) - n, N(y) - m, 1 - α).
In the case of a repeatability calculated with a single repetition on n test materials for the alternative method and the reference method, the Fisher variable has ν1 = n and ν2 = n degrees of freedom, i.e. F(n, n, 1 - α).
In the case of a repeatability calculated with n materials and p measurements of each material using the alternative (numerator) or reference (denominator) methods, the ratio Fobs is tested against the 95th percentile of Snedecor’s F distribution F(n, n(p-1), 1 - α). In the table above, p is 2 for both methods, and the percentile is given by F(n, n, 0.95).
Interpretation of the test:
- 1/ Fobs > F(n, n(p-1), 1 - α): the repeatability value of the alternative method has been proved (at the 5% significance level) to be higher than that of the reference method.
- 2/ Fobs < F(n, n(p-1), 1 - α): the repeatability value of the alternative method has not been proved (at the 5% significance level) to be higher than that of the reference method.
Example
The value of the repeatability standard deviation found for the assay of free sulphur dioxide is:
Sralt = 0.54 mg/l
The laboratory performed the assay on the same test materials using the OIV reference method. The value of the repeatability standard deviation found in this case is:
Srref = 0.39 mg/l
Fobs = 1.93
ν1 = 12
ν2 = 12
F(12, 12, 95%) = 2.69 > 1.93
The Fobs value obtained is less than F(12, 12, 95%); therefore the repeatability value of the alternative method has not been proved to be higher (at the 5% significance level) than that of the reference method.
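The comparison in this example can be sketched in Python as follows. The ratio is taken as the variance of the alternative method over the variance of the reference method, and the critical value 2.69 is simply the tabulated figure quoted in the example rather than something computed here. With the rounded standard deviations 0.54 and 0.39 the ratio comes out at about 1.92; the 1.93 reported above presumably reflects unrounded inputs. Names are hypothetical.

```python
def fisher_ratio(s_r_alt, s_r_ref):
    """Observed F ratio: alternative-method variance over reference variance."""
    return s_r_alt ** 2 / s_r_ref ** 2


S_R_ALT = 0.54            # repeatability SD of the alternative method (mg/l)
S_R_REF = 0.39            # repeatability SD of the reference method (mg/l)
F_CRITICAL = 2.69         # F(12, 12, 95%), tabulated value quoted in the example

f_obs = fisher_ratio(S_R_ALT, S_R_REF)

# The alternative method is shown to be less repeatable only if f_obs
# exceeds the critical value.
proved_higher = f_obs > F_CRITICAL
print(round(f_obs, 2), proved_higher)
```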
7.3.8.4. Characterization of the deviations between two methods
Calculations
Calculate:
- the average of the results of the alternative method: $M_x = \frac{1}{n}\sum_{i=1}^{n} M_{x_i}$
- the average of the results of the reference method: $M_y = \frac{1}{n}\sum_{i=1}^{n} M_{y_i}$
- the average of the differences: $M_d = \frac{1}{n}\sum_{i=1}^{n} d_i = M_x - M_y$
- the standard deviation of the differences: $S_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - M_d)^2}{n-1}}$
- the Zscore: $Z_{score} = \frac{M_d}{S_d/\sqrt{n}}$
If the absolute value of Zscore is less than or equal to 2.0, we can conclude that the bias of one method in relation to the other is satisfactory, in the range level in question, with an error risk α = 5%.
If the absolute value of Zscore is greater than 2.0, we can conclude that the bias of one method in relation to the other is not satisfactory, in the range level in question, with an error risk α = 5%.
NOTE The interpretation of Zscore is possible if it is assumed that the deviations follow normal distributions, the means and standard deviations being constant within each method.
Example
Study of an FTIR application for the determination of glucose and fructose in relation to the enzymatic method. The first range level covers the range from 0 to 5 g.L-1 and the second range level covers a range from 5 to 20 g.L-1.
First range level (0 to 5 g.L-1)
Wine | FTIR 1 | FTIR 2 | Enz 1 | Enz 2 | diff |
1 | 0.4 | 0.3 | 0.3 | 0.2 | -0.1 |
2 | 0.2 | 0.3 | 0.1 | 0.1 | -0.15 |
3 | 0.6 | 0.9 | 0.2 | 0.2 | -0.55 |
4 | 0.7 | 1 | 0.8 | 0.7 | -0.1 |
5 | 1.2 | 1.6 | 1.1 | 1.3 | -0.2 |
6 | 1.3 | 1.4 | 1.3 | 1.3 | -0.05 |
7 | 2.1 | 2 | 1.9 | 2.1 | -0.05 |
8 | 1.4 | 1.3 | 1.1 | 1.2 | -0.2 |
9 | 2.8 | 2.5 | 2 | 2.6 | -0.35 |
10 | 3.5 | 4.2 | 3.7 | 3.8 | -0.1 |
11 | 4.4 | 4.1 | 4.1 | 4.4 | 0 |
12 | 4.8 | 5.4 | 5.5 | 5 | 0.15 |
Md = -0.14
Sd = 0.18
Zscore = -2.77 (absolute value > 2)
Second range level (5 to 20 g.L-1), results in g.L-1:

| Wine | FTIR 1 | FTIR 2 | Enz 1 | Enz 2 | diff |
|---|---|---|---|---|---|
| 1 | 5.1 | 5.4 | 5.1 | 5.1 | 0.1 |
| 2 | 5.3 | 5.7 | 5.3 | 6.0 | -0.2 |
| 3 | 7.7 | 7.6 | 7.2 | 7.0 | 0.6 |
| 4 | 8.6 | 8.6 | 8.3 | 8.5 | 0.2 |
| 5 | 9.8 | 9.9 | 9.1 | 9.3 | 0.6 |
| 6 | 9.9 | 9.8 | 9.8 | 10.2 | -0.1 |
| 7 | 11.5 | 11.9 | 13.3 | 13.0 | -1.4 |
| 8 | 11.9 | 12.1 | 11.2 | 11.4 | 0.7 |
| 9 | 12.4 | 12.5 | 11.4 | 12.1 | 0.7 |
| 10 | 16 | 15.8 | 15.1 | 15.7 | 0.5 |
| 11 | 17.7 | 18.1 | 17.9 | 18.3 | -0.2 |
| 12 | 20.5 | 20.1 | 20.0 | 19.1 | 0.7 |

Md = 0.19
Sd = 0.61
Zscore = 1.04 < 2
For the first range level, the Z score is greater than 2. The FTIR calibration for the determination of fructose and glucose studied here is not considered true in relation to the enzymatic method.
For the second range level, the Z score is less than 2. The FTIR calibration for the determination of fructose and glucose studied here can be considered true in relation to the enzymatic method.
The bias of FTIR in relation to the enzymatic method is satisfactory only for the second range level, with an error risk α = 5%.
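The figures for the first range level can be reproduced with a few lines of code. This is a minimal sketch, not a prescribed implementation: it assumes that the Zscore is the average of the differences divided by its standard error, Sd/√n, with Sd computed as a sample standard deviation (n-1 denominator). That assumption is inferred from the tabulated values, which it reproduces. The function name is hypothetical.

```python
import math
import statistics


def paired_z_score(differences):
    """Return (Md, Sd, Zscore) for a list of per-material differences."""
    n = len(differences)
    md = statistics.mean(differences)
    sd = statistics.stdev(differences)   # sample standard deviation (n - 1)
    z = md / (sd / math.sqrt(n))         # assumed definition of the Zscore
    return md, sd, z


# "diff" column of the first range level (0 to 5 g/L), as tabulated above
diff_level_1 = [-0.1, -0.15, -0.55, -0.1, -0.2, -0.05,
                -0.05, -0.2, -0.35, -0.1, 0.0, 0.15]

md, sd, z = paired_z_score(diff_level_1)
bias_satisfactory = abs(z) <= 2.0
print(f"Md = {md:.2f}, Sd = {sd:.2f}, Zscore = {z:.2f}, satisfactory: {bias_satisfactory}")
```

Because the decision rule uses the absolute value of the Zscore, the sign convention chosen for the differences does not affect the conclusion.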
7.4. Step three: laboratory findings
The method is declared valid (that is, it has not been proven invalid) if the measured performance characteristics meet the requirements defined in step one. The laboratory issues a formal conclusion on the validity of the method, stating:
- The analyzable matrices
- The validated concentration ranges
- The specific limits (interactions, etc.)
- Any information and specific restrictions on the validity of the method.
8. Internal Quality Control of analysis methods (IQC)
8.1. Reference documents
- OIV Resolution Oeno 19/2002: Recommendations for harmonized internal quality control in analytical laboratories.
- CITAC / EURACHEM: Quality Guidelines in Analytical Chemistry, 2002 Edition
- NF V03-115/1996: Analysis of agricultural and food products - Guide for application of metrological data
- ISO 7870/2007: Control Charts
- ISO 11095/1996: Linear calibration using reference materials
8.2. General principles
It should be remembered that the results of an analysis are subject to two types of errors: systematic errors, reflected as bias, and random errors. In the case of serial analyses, another type of error can be defined, which may be due to both bias and random error: this is the series effect, illustrated for example by the deviation of the measurement system during a series of analyses.
The IQC will focus on monitoring and controlling these three errors.
8.3. Reference materials
The IQC is essentially based on the exploitation of the results of measurements of reference materials. The choice and establishment of these results are therefore essential steps that must be controlled in order to ensure an efficient base for the system.
A reference material is defined by two parameters:
- Its matrix
- The assignment of its reference value
Several scenarios are possible; the cases encountered in oenology are grouped in the double-entry table as follows:
| Reference value \ Matrix | Synthetic solution: It is relatively easy to use synthetic solutions as reference materials. They are not compatible with methods in which the signal is not specific, and are sensitive to matrix effects. | Natural matrix (wine, etc.): A priori, natural matrices are the most interesting reference materials because they avoid any possible matrix effect for methods that are not perfectly specific. | Spiked wine: A spiked wine is one with an artificial addition of an analyte. |
|---|---|---|---|
| Value obtained by formulation | The solution should be produced in accordance with metrological rules. It should be remembered that the value obtained by formulation is subject to uncertainty. In this case, the precision of the method as well as its trueness can be monitored at a given point in relation to a calibrated reference. | Not applicable | This method is applicable when the base wine is totally devoid of the analyte. These types of materials are well suited for non-native oenological additives in the wine. If spiking is applied with a native constituent of the wine, the matrix can no longer be considered to be natural. Spiking should be carried out in accordance with metrological rules. The value obtained is subject to uncertainty. In this case, the precision of the method as well as its trueness can be monitored at a given point. It can be applied to methods sensitive to matrix effects for non-native compounds of the wine, but not in the case of native compounds of the wine. |
| Value external to the laboratory | The organization providing the solution must provide guarantees of quality and if possible be certified. The reference values will be accompanied by an uncertainty value with a stated level of confidence. In this case, the precision of the method as well as its trueness can be monitored at a given point in relation to the external value. This has a traceability value at this point if the supplier organization is accredited for the preparation of the reference material in question. It cannot be applied to methods sensitive to matrix effects. | The external value has been determined on the wine by an interlaboratory analysis. Some organizations offer packaged samples of wines whose values have been determined in this way. However, in some cases, the wines in question may have been spiked and/or chemically stabilized. In this case, the matrix may be affected. In this case, the precision of the method as well as its trueness can be monitored at a given point in relation to the external value. This has a traceability value at this point if the interlaboratory analysis is accredited. It can be applied to methods sensitive to matrix effects. | In practice, the packaged samples offered by organizations are spiked and/or chemically stabilized wines. These materials cannot claim to form a natural matrix. The reference values are usually generated by an interlaboratory analysis. In this case, the precision as well as the trueness of the method can be monitored at a given point in relation to the external standard. This has a traceability value at this point if the supplier organization is accredited for the preparation of the reference material in question. It cannot be applied to methods sensitive to matrix effects. |
| Value obtained by a reference method | In cases where the synthetic solution has not been obtained with calibrated equipment, the reference value can be determined by analysing the synthetic solution using the reference method. The measurement is performed at least 3 times. The value retained is the average of the three results, as long as they remain within an interval smaller than the repeatability of the method. If necessary, the operator can check the consistency of the result obtained with the formulation value of the solution. In this case, the precision of the method can be monitored and its trueness verified at a given point in relation to the reference method. It cannot be applied to methods sensitive to matrix effects. | The measurement is performed 3 times with the reference method; the value retained is the average of the three results, as long as they remain within an interval smaller than the repeatability of the method. In this case, the precision of the method can be monitored and its trueness verified at a given point in relation to the reference method. It can be applied to methods sensitive to matrix effects. | The measurement is performed 3 times with the reference method; the value retained is the average of the three results, as long as they remain within an interval smaller than the repeatability of the method. In this case, the precision of the method can be monitored and its trueness verified at a given point in relation to the reference method. It can be applied to methods sensitive to matrix effects for non-native compounds of the wine, but not in the case of native compounds of the wine. |
| Value obtained using the method to be checked. Using the instrument value as a reference value does not mean its trueness is checked. In this case, an alternative approach must be implemented. | The reference value is measured by the method to be checked. The material is measured with 10 repetitions, and a check is carried out to ensure that the deviations between these values are lower than the repeatability value; the more extreme values may if necessary be deleted, without deleting more than two values. To ensure the consistency of the values obtained during the 10 repetitions, this series will be checked using control materials defined during a previous session, placed at the beginning and end of the series. This case can only be used to monitor the precision of the method; its trueness must be monitored using another approach. | The reference value is measured by the method to be checked. The material is measured with 10 repetitions, and a check is carried out to ensure that the deviations between these values are lower than the repeatability value; the more extreme values may if necessary be deleted, without deleting more than two values. To ensure the consistency of the values obtained during the 10 repetitions, this series will be checked using control materials defined during a previous session, placed at the beginning and end of the series. The value obtained can be compared with the value obtained using the reference method (during the 3 repetitions for example). The deviation between the two values must be less than the calculated trueness of the alternative method compared with the reference method. This case is of particular interest when a method produces a repeatable random error specific to each sample, in particular due to the non-specificity of the measured signal. This error is often minimal and less than the uncertainty, but may cause a systematic error if the method is adjusted to a single value. It can be used to monitor the precision of the method; its trueness must be monitored using another approach. The most noteworthy case is that of FTIR. | The reference value is measured by the method to be checked. The material is measured with 10 repetitions, and a check is carried out to ensure that the deviations between these values are lower than the repeatability value; the more extreme values may if necessary be deleted, without deleting more than two values. To ensure the consistency of the values obtained during the 10 repetitions, this series will be checked using control materials defined during a previous session, placed at the beginning and end of the series. This case can only be used to monitor the precision of the method; its trueness must be monitored using another approach. It can be applied to methods sensitive to matrix effects for non-native compounds of the wine, but not in the case of native compounds of the wine. |
8.4. Control of the analytical series
8.4.1. Definition
An analytical series or run is a set of measurements made under conditions of repeatability.
A laboratory that mainly works with analytical series must ensure the instantaneous adjustment of the measuring instrument and its stability during the analytical run.
Two complementary approaches are possible:
- The use of reference materials (often called by extension "control materials")
- The use of an internal standard, in particular for separation methods.
8.4.2. Control of trueness based on reference materials
Systematic error will be checked by introducing reference materials, the reference values of which have been assigned using means external to the method to be checked.
The measured value of the reference material is associated with a maximum tolerated deviation (MTD), also referred to as tolerance, within which the measured value is accepted as being valid. The laboratory defines the MTD values for each parameter and for each analytical system. These values are specific to the laboratory.
The control materials must be selected so that their reference values correspond to the levels normally encountered for a given parameter. In the case where the measurement range is large, and where the measurement uncertainty is not constant over the range, several control materials should be used to cover the different levels.
8.4.3. Within-run precision
When the analytical series are relatively long, there is a risk of drift of the analytical system. In this case, it is necessary to control the within-run precision by using the same reference material positioned at regular intervals in the series. The same control materials as those used for trueness can be applied.
The deviation of the measured values of the same reference material in the series should be less than the repeatability value r calculated for a confidence level of 95%.
NOTE: For a confidence level of 99%, 3.65 × Sr can be used as the value, Sr being the repeatability standard deviation.
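The within-run check can be sketched as follows. This is an illustration under stated assumptions: the "deviation" is taken as the spread (maximum minus minimum) of the repeated measurements of the same control material, the repeatability value is taken as r = 2.8 × Sr at 95% (the usual convention), and the control values and Sr below are hypothetical.

```python
def within_run_precision_ok(control_values, s_r, factor=2.8):
    """True if the spread of repeated control measurements stays below factor * s_r.

    factor = 2.8 corresponds to the 95% repeatability value r (assumed convention);
    factor = 3.65 corresponds to the 99% level mentioned in the note.
    """
    spread = max(control_values) - min(control_values)
    return spread < factor * s_r


# Hypothetical control material measured four times within one series, Sr = 0.15
stable_run = within_run_precision_ok([10.1, 10.3, 10.2, 10.4], s_r=0.15)
drifting_run = within_run_precision_ok([10.1, 10.3, 10.5, 10.7], s_r=0.15)
drifting_run_99 = within_run_precision_ok([10.1, 10.3, 10.5, 10.7], s_r=0.15, factor=3.65)
print(stable_run, drifting_run, drifting_run_99)
```

In the second series the spread of 0.6 exceeds both 2.8 × Sr = 0.42 and 3.65 × Sr ≈ 0.55, so the drift is flagged at either confidence level.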
8.4.4. Internal standard
Some separation methods allow the introduction of an internal standard in the product to be analyzed. The value of the signal for the internal standard is used in the quantification of the measured parameters.
In this case, introduce an internal standard with calibrated equipment, for which the uncertainty of measurement is known.
It should be noted that drift affects in equal proportions the signals of the analyte and those of the internal standard; since the value of the analyte is calculated with the signal value of the internal standard, the effect of drift is cancelled.
The series will be validated if the internal standards are within defined tolerances.
8.5. Control of the analysis system
8.5.1. Definition
This is an additional control to the control of the analytical series. It differs in that the values compiled are those obtained over long time scales, and/or compared with values output by other analysis systems.
We will develop two applications:
- Shewhart control charts to monitor the stability of the analysis system
- Internal and external comparisons of the analysis system
8.5.2. Shewhart control chart
Shewhart charts are statistical tools that monitor the drift of measuring systems through the regular analysis of stable reference materials, in practice under reproducibility conditions.
8.5.2.1. Data acquisition
A stable reference material is measured over a sufficiently long period, at defined regular intervals. These measurements are recorded and reported in the control charts. They are carried out under reproducibility conditions, and are therefore usable for the calculation of reproducibility, and even for the estimation of measurement uncertainty.
The values of the analytical parameters of the reference materials selected should be within those of valid measurement ranges.
The reference materials are analyzed during an analytical series, routine if possible, with a variable position in the series from one time to another. In practice, it is perfectly possible to use the measurements of the control materials in the series to input the control charts.
8.5.2.2. Presentation of results and definition of limits
The individual results are compared with the accepted value of the reference material, and with the standard deviation of reproducibility of the parameter in question, at the level in question.
Two types of limits are defined in the Shewhart charts, the limits associated with the individual results, and the limits associated with the average.
The limits for individual results are usually based on the intra-laboratory reproducibility standard deviation SR for the range level in question, and are of two types:
- The alert limit: $\pm 2 \cdot S_R$.
- The action limit: $\pm 3 \cdot S_R$.
The greater the number of measurements, the narrower the limit defined for the cumulative average will be.
This limit is an action limit: $\pm 3 \cdot S_R / \sqrt{n}$, n being the number of measurements on the chart.
NOTE For the sake of clarity, the alert limit of the cumulative average is only rarely indicated on the control chart, its value being $\pm 2 \cdot S_R / \sqrt{n}$.
8.5.2.3. Operation of the Shewhart chart
The operating criteria most frequently used are indicated below. It is up to the laboratories to define the criteria they apply more precisely.
Corrective action concerning the method (or apparatus) will be undertaken:
- if an individual result is outside the action limits for individual results.
- if two consecutive individual results are outside the alert limits for individual results.
- if, in addition, the a posteriori analysis of the control charts reveals a deviation of the method in one of the following three cases:
  - nine consecutive individual results are located on the same side of the line of the reference values.
  - six successive individual data points go up or down.
  - two out of three successive points are located between the alert limit and the action limit.
- if the arithmetic mean of n recorded results is beyond the action limits of the cumulative average (thus highlighting a systematic deviation of results).
NOTE: The control chart must resume at n = 1 as soon as a corrective action has been performed on the method.
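The operating criteria listed above lend themselves to automation. The sketch below is one possible reading, not a normative implementation: it assumes the usual Shewhart conventions of ± 2 SR for the alert limit, ± 3 SR for the action limit and ± 3 SR/√n for the cumulative average, and it interprets "six successive points go up or down" as six points forming five strictly increasing or decreasing steps. The function name, reason strings and data are hypothetical.

```python
import math


def shewhart_flags(values, reference, s_r):
    """Return (index, reason) pairs for each operating criterion that is triggered."""
    flags = []
    z = [(v - reference) / s_r for v in values]  # deviations in units of s_r
    for i in range(len(values)):
        # One result beyond the action limit (assumed +/- 3 s_r).
        if abs(z[i]) > 3:
            flags.append((i, "action limit exceeded"))
        # Two consecutive results beyond the alert limit (assumed +/- 2 s_r).
        if i >= 1 and abs(z[i]) > 2 and abs(z[i - 1]) > 2:
            flags.append((i, "two consecutive results beyond alert limit"))
        # Nine consecutive results on the same side of the reference value.
        last9 = z[max(0, i - 8): i + 1]
        if len(last9) == 9 and (all(x > 0 for x in last9) or all(x < 0 for x in last9)):
            flags.append((i, "nine results on same side"))
        # Six successive points steadily rising or falling.
        last6 = values[max(0, i - 5): i + 1]
        steps = [b - a for a, b in zip(last6, last6[1:])]
        if len(last6) == 6 and (all(s > 0 for s in steps) or all(s < 0 for s in steps)):
            flags.append((i, "six points trending"))
        # Two of three successive points between the alert and action limits.
        last3 = z[max(0, i - 2): i + 1]
        if len(last3) == 3 and sum(2 < abs(x) <= 3 for x in last3) >= 2:
            flags.append((i, "two of three between alert and action limits"))
        # Cumulative mean beyond its action limit (assumed +/- 3 s_r / sqrt(n)).
        n = i + 1
        cumulative_mean = sum(values[:n]) / n
        if abs(cumulative_mean - reference) > 3 * s_r / math.sqrt(n):
            flags.append((i, "cumulative mean beyond action limit"))
    return flags


# Hypothetical control material: reference value 10.0, s_r = 0.1
stable = shewhart_flags([10.05, 9.95, 10.10, 9.90], reference=10.0, s_r=0.1)
drifting = shewhart_flags([10.05, 9.95, 10.10, 10.35], reference=10.0, s_r=0.1)
print(stable)
print(drifting)
```

After a corrective action, the list of values passed to the function would be restarted, mirroring the requirement that the chart resume at n = 1.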
8.5.3. Internal comparison of different analysis systems of the same parameter
In a laboratory that has multiple methods of analysis for a given parameter, it is interesting to make measurements of the same test materials in order to compare the results. The agreement of the results is satisfactory between the two methods if the difference between them is less than 2 times the standard deviation of difference calculated by validation, with a confidence level of 95%.
NOTE This interpretation assumes that the deviations follow a normal distribution; the factor of 2 then corresponds to a confidence level of approximately 95%.
8.5.4. External comparison of the analysis system
8.5.4.1. Interlaboratory comparison
Purpose
Interlaboratory tests are of two types:
- Collaborative studies that relate to a single method. These studies are conducted during the initial validation of a new method, mainly to define the standard deviation of interlaboratory reproducibility SR (method). The average m may also be determined.
- Interlaboratory comparisons, or proficiency testing (PT). These tests are performed to validate a method adopted by the laboratory, and for its routine quality control. The values provided are the interlaboratory average m, and the standard deviation of interlaboratory and inter-method reproducibility.
As part of its participation in a PT scheme, or in a collaborative study, the laboratory can use the results to study the trueness of a method in order to provide initial validation, and routine quality control.
If the interlaboratory tests are performed as part of a certified organization, this comparison work will enable traceability of the method.
Basic protocol and calculations
For the comparison to be sufficient, a minimum of five test materials should be used during the period.
For each test material, two results are provided:
- The average of all the laboratories that have produced significant results m
- The standard deviation of interlaboratory reproducibility SR-inter
The test materials are analyzed with p replicates by the laboratory, the replicates being carried out under conditions of repeatability; p must be equal to at least 2.
In addition, the laboratory can verify that the intra-laboratory variability (intralaboratory reproducibility) is lower than the interlaboratory variability (interlaboratory reproducibility) given by the interlaboratory analysis.
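For each test material, the laboratory calculates the Zscore, given by the following formula, where mlab is the average of the laboratory's p replicates for that material:
$Z_{score} = \dfrac{m_{lab} - m}{S_{R\text{-}inter}}$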
The results can be reported as indicated in the following table:
| Test material | Rep1 | … | Rep j | … | Rep p | Lab average | PT scheme average | Standard deviation | Zscore |
|---|---|---|---|---|---|---|---|---|---|
| 1 | x11 | … | x1j | … | x1p | mlab(1) | m1 | SR-inter(1) | Zscore(1) |
| … | … | … | … | … | … | … | … | … | … |
| i | xi1 | … | xij | … | xip | mlab(i) | mi | SR-inter(i) | Zscore(i) |
| … | … | … | … | … | … | … | … | … | … |
| n | xn1 | … | xnj | … | xnp | mlab(n) | mn | SR-inter(n) | Zscore(n) |
A graphical representation of the Zscores is very useful for the interpretation of the results.
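As an illustration of the calculation behind the table, the sketch below computes the laboratory average and the Zscore for one test material. It assumes the usual proficiency-testing definition, Zscore = (laboratory average - PT scheme average) / SR-inter; the function name and all numerical values are hypothetical.

```python
import statistics


def pt_z_score(replicates, pt_mean, s_r_inter):
    """Zscore of the laboratory average against the PT scheme average."""
    lab_mean = statistics.mean(replicates)
    return (lab_mean - pt_mean) / s_r_inter


# Hypothetical test material: p = 2 replicates under repeatability conditions
replicates = [3.42, 3.46]
z_pt = pt_z_score(replicates, pt_mean=3.40, s_r_inter=0.05)
print(f"Zscore = {z_pt:.2f}")
```

Applying the same function to each of the n test materials yields the series of Zscores that can then be plotted and examined for trends.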
Interpretation
The interpretation of results must be done at two levels:
Individual interpretation of the results
If all the Zscores are less than 2, the results of the method studied can be considered identical to those obtained by the laboratories that have produced significant results.
The laboratory can define its own validation criteria, depending on the required performance characteristics. By default, the following criteria are proposed:
Corrective action is initiated:
- If an individual result is outside the limits of 3 Zscores.
- If two consecutive individual results are located beyond two Zscores.
Interpretation of trends
In addition to the verification of individual results, the trends must be analyzed, with particular regard to the position of the laboratory values compared with the average. If they are consistently on the same side of the average for several successive interlaboratory analyses, this indicates that the laboratory is subject to bias, which justifies the implementation by the laboratory of corrective actions, even if the Zscores remain below the critical value.
The laboratory can define its own validation criteria, depending on the required performance characteristics. By default, the following criteria are proposed:
Corrective action is initiated if:
- nine consecutive Zscores are positive or negative.
- six successive individual data points go up or down.
- two out of three successive points lie between 2 Zscores and 3 Zscores.
8.5.4.2. Comparison with external reference materials
The measurement at regular intervals of external reference materials can also be used to monitor the development of a systematic error (bias).
The principle involves measuring the external reference materials, and accepting or rejecting the value based on the tolerance limits. These limits are defined by the combination of the uncertainties of the controlled method and the reference value of the reference material.
Standard uncertainty of the reference material
The reference values of these materials are accompanied by confidence intervals. The laboratory must determine the nature of this data, and derive the value of the standard uncertainty of the reference value Sref. A distinction must be made between the various possible cases that may occur:
- The case where the uncertainty a is given as a 95% confidence interval (expanded uncertainty). The distribution is then taken to be normal; a constitutes an "expanded uncertainty" equal to 2 times the standard uncertainty Sref of the reference value: $S_{ref} = a / 2$.
- The case of a certificate, or other specification, giving limits +/- a without specifying the level of confidence. The distribution is then taken to be rectangular, the value X having the same chance of lying anywhere in the range ref +/- a: $S_{ref} = a / \sqrt{3}$.
- The special case of laboratory glassware giving limits +/- a. The distribution is then taken to be triangular: $S_{ref} = a / \sqrt{6}$.
Defining the validity limits of a measurement of the reference material
To the standard uncertainty Sref of the value of the external reference material is added the uncertainty of the laboratory method to be controlled, Smethod. These two sources of variability must be taken into account in determining the limits.
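Smethod is calculated from the expanded uncertainty of the laboratory method as follows:
$S_{method} = \dfrac{\text{expanded uncertainty of the method}}{2}$
The validity limit of the result (with a confidence level of 95%) = $\pm 2 \cdot \sqrt{S_{ref}^2 + S_{method}^2}$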
Example: A pH 7 buffer solution is used to control a pH meter. The confidence interval given by the pH solution is +/- 0.01, and it is stated that this confidence interval corresponds to the expanded uncertainty with a confidence level of 95%. In addition, the expanded uncertainty of the pH meter is 0.024.
The limits will be: $\pm 2 \cdot \sqrt{(0.01/2)^2 + (0.024/2)^2} = \pm 2 \cdot \sqrt{0.005^2 + 0.012^2}$
i.e. +/- 0.026 compared with the reference value, with a confidence level of 95%.
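The pH example can be reproduced with the short sketch below. It assumes the usual conversions from stated limits to standard uncertainty (division by 2 for a 95% expanded uncertainty, by √3 for a rectangular distribution, by √6 for a triangular one) and a validity limit equal to twice the quadratic sum of Sref and Smethod, which is consistent with the +/- 0.026 result quoted above. Function names and dictionary keys are hypothetical.

```python
import math

# Divisors converting limits +/- a into a standard uncertainty (assumed conventions)
DIVISORS = {
    "expanded_95": 2.0,           # normal distribution, a is a 95% expanded uncertainty
    "rectangular": math.sqrt(3),  # limits given without a confidence level
    "triangular": math.sqrt(6),   # laboratory glassware
}


def standard_uncertainty(a, kind):
    """Standard uncertainty derived from limits +/- a of the given kind."""
    return a / DIVISORS[kind]


def validity_limit(s_ref, s_method, k=2.0):
    """Half-width of the acceptance interval around the reference value."""
    return k * math.sqrt(s_ref ** 2 + s_method ** 2)


s_ref = standard_uncertainty(0.01, "expanded_95")  # pH 7 buffer: 0.005
s_method = 0.024 / 2                               # pH meter: 0.012
limit = validity_limit(s_ref, s_method)
print(f"Validity limit: +/- {limit:.3f}")
```

A measured value of the buffer is then accepted if it lies within 7.00 +/- 0.026.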
9. Study of the measurement uncertainty
9.1. Scope
This study is based on ISO 11352 (2012): Estimation of measurement uncertainty based on validation and quality control data.
Uncertainty provides two types of information:
- Firstly, for customers of the laboratory, by indicating the potential deviations to be taken into account when interpreting a test result. It should be noted, however, that this information cannot be used as a means of external evaluation of the laboratory.
- Secondly, it is a dynamic internal tool for evaluating the quality of the analytical results of the laboratory. To the extent that its evaluation is in order and carried out using a fixed and well-defined methodology, it shows whether the deviations involved in a method evolve positively or negatively (in the case of the estimate based on intralaboratory data alone).
This guide is limited to providing a practical methodology for oenology laboratories dealing with serial analyses. These laboratories have large volumes of data giving them a statistically significant dimension.
In most cases, therefore, the estimated uncertainties can be obtained from the data acquired through the work of validation and quality control. These data can be complemented by experiment schedules, in particular for determining the systematic errors.
9.2. Methodology
The work involved in estimating the uncertainty is divided into three basic steps; the calculation itself is only the third step, and is not meaningful without the prior application of the first two.
- Definition of the measurand, and description of the quantitative analysis method
- Critical analysis of the measurement process
- Estimation of uncertainty.
9.2.1. Definition of the measurand, and description of the quantitative analysis method
First of all:
- Clearly define the purpose of the measurement
- Define the quantity to be measured
- If the measurand is obtained by calculation from measured quantities, if possible express the mathematical relationship linking them.
- Indicate all the operating conditions.
In principle, these items are included in the procedures of the laboratory's quality system.
Expressing the mathematical relationship between the measurand and the variables can in some cases (physical methods, etc.) be very complex, and it is therefore not necessarily relevant or possible to fully detail them.
9.2.2. Critical analysis of the measurement process, and identification of the sources of uncertainty
The sources of uncertainty should be identified from the analysis of the measurement process. The importance of each factor can be estimated, in order to eliminate those that have only a negligible influence. This is done by estimating:
- the severity of the drift caused by poor control of the factor in question
- the frequency of occurrence of potential problems
- their detectability.
This critical analysis can be done as outlined below.
Labour:
- Operator effect
Material:
- Sample effect (stability, homogeneity, matrix effects) and consumables (reagents, products, solutions, reference materials), etc.
Equipment:
- Equipment effect (response, sensitivity, integration modes, etc.), and laboratory equipment (balance, glassware, etc.).
Method:
- Effect of applying the procedure (operating conditions, sequence of operations, etc.).
Environment:
- Ambient conditions (temperature, pressure, lighting, vibration, radiation, humidity, etc.).
9.2.3. Calculation of the estimated standard uncertainty using the precision and bias approach
9.2.3.1. Principle
A test result deviates from the true value as a result of two sources of error: systematic errors and random errors.
Analysis result = true value + systematic error + random error
The uncertainty will characterize the variability of the results of the analysis.
Variability (test result) = standard uncertainty U(x)
Variability (true value) = 0
Variability (systematic error) = standard uncertainty of bias Ubias
Variability (random error) = standard uncertainty of precision Uprecision
In terms of standard uncertainties (standard deviations), this gives: $U(x) = \sqrt{U_{bias}^2 + U_{precision}^2}$
When identifying the sources of uncertainty, the laboratory will make a distinction between the "random" and "systematic" categories, in order to identify those to be included in the Uprecision budget and those to be included in the Ubias budget.
NOTE: It should be noted here that the EURACHEM/CITAC Guide "Quantifying Uncertainty in Analytical Measurements" has a reminder that "In general, the ISO Guide requires that the corrections be applied to all identified and significant systematic effects." In a method "under control", systematic errors should therefore be a minor part of the uncertainty.
9.2.3.2. Determining the standard uncertainty of intra-laboratory precision, Uprecision
Uprecision is a quantification of all the (significant) random errors of the method, according to three cases:
- the laboratory has internal quality control results (IQC).
- the laboratory has results based on a study of a material under intermediate precision conditions as part of the validation of a method.
- the laboratory has performed tests on materials under repeatability conditions.
The intra-laboratory reproducibility standard deviation SRw is a good estimate of the Uprecision on the condition that laboratory reproducibility conditions are experimentally applicable to the method. It therefore follows that:
- The experimental conditions can be used to cover all the significant sources of random error;
- In cases where the matrix has an effect on the method, the matrix of the materials used in the experiment schedule of intermediate precision must be representative of the matrices analyzed.
In this case, Uprecision = SRw.
Note: If the laboratory reproducibility conditions are not applicable (no stable material, etc.), then the laboratory will implement an intermediate precision study (SIP), to cover the widest possible conditions. The sources of random error, identified as significant and not included in the intermediate precision conditions, will, if possible, be the subject of an appropriate study to estimate their standard uncertainties, which will then be added to Uprecision.
9.2.3.3. Determining the standard uncertainty of bias, Ubias
Bias stems from the reference materials used for the calibration or adjustment of the method. The target values of these reference materials involve uncertainty, which, by propagation, affects the bias of the method calibrated in this way.
In addition, the interval in which the laboratory accepts the measurement results of the reference material(s) (here referred to as the maximum tolerated deviation, or MTD) is a second source of uncertainty, which is added to that of the target value of the reference material.
The standard uncertainty due to systematic errors depends on:
- the bias of the method and of the laboratory, i.e. the difference from a certified or nominal reference value.
- the standard uncertainty of the certified or nominal reference value.
The uncertainty associated with bias of the method Ubias is estimated using several approaches:
- the laboratory can measure one or more certified appropriate reference materials.
- the laboratory has taken part in interlaboratory comparisons.
- the laboratory can perform appropriate yield or addition analyses.
- the laboratory uses a reference method in parallel.
- the laboratory carries out measurements on a quantity Y, based on a mathematical model (GUM approach).
This means the error on the bias comprises:
- the MTD: the standard uncertainty of the MTD is $U_{MTD} = MTD / \sqrt{3}$
- the standard uncertainty of the target value of the reference material Uref. This uncertainty is obtained in various ways, depending on the type of material.
| Reference | Standard uncertainty |
|---|---|
| Certified reference material | Uref is the standard uncertainty of the reference value of the certified reference material, supplied in the certificate of the material. |
| Material resulting from an interlaboratory comparison | Uref is usually provided by the organizer of the interlaboratory comparison. Otherwise, Uref can be calculated as follows: Uref = (standard deviation of interlaboratory reproducibility) / $\sqrt{p}$, p being the number of laboratories taken into account in calculating the standard deviation. |
| Reference material with values assigned by a reference method | Uref = (standard uncertainty of the reference method) / $\sqrt{p}$, p being the number of tests performed on the material, under intermediate precision conditions. |
| Reference material with values assigned by formulation (weighing, pipetting) | Uref = standard deviation or standard uncertainty characterizing the value of the addition due to the preparation, materials and equipment used. The standard uncertainty can be obtained from the information marked on the equipment used. |
If the method is calibrated using a single reference material, the standard uncertainty of the bias is established as follows:
If the method is calibrated using p reference materials, the standard uncertainty of the bias is established as follows:
Note: The laboratory may choose a reference material with the lowest possible Uref. In particular, formulating reference materials with metrologically traceable weighing and/or volumetric equipment gives Uref values that are generally low and negligible compared with UMTD. Materials whose target values are obtained from interlaboratory tests often have higher Uref values.
9.2.3.4. Exception: matrix error in the case of methods in which the signal is not specific to the measurand
In the case of methods in which the signal is not specific to the measurand (e.g. infrared techniques) the matrix effect produces a random error from one material to another, but reproducible for the same material. This error is related to the interaction of the compounds present in the product to be analysed on the measurement of the analyte. This source of error, which can be an important component of the uncertainty budget, is not covered by the experimental conditions of laboratory reproducibility.
In this type of situation, it is therefore necessary to implement an experiment schedule to estimate this source of error Umat, which is then added to the uncertainty budget.
Example: Estimating the matrix effect on FTIR
The signal of the FTIR, the infrared spectrum, is not a signal specific to each of the compounds that are measured using this technique. The statistical calibration model can process noisy, non-specific spectral information, and produce a sufficiently accurate estimate of the value of the measurand. This model incorporates the influences of other compounds in the wine, which vary from one wine to another and introduce an error in the result. Upstream of the routine analytical work, special work is required of the calibration developers to minimize this matrix effect and make the calibration robust, i.e. capable of incorporating these variations without having them affect the result. The matrix effect is always present, however, and is a source of error that is the cause of a significant amount of the uncertainty in an FTIR method.
Strictly speaking, the matrix effect error can be estimated by comparing, on the one hand, the averages of a large number of replicates of FTIR measurements, obtained using several reference materials (at least 10), under conditions of reproducibility, and, on the other hand, the true values of the reference materials with a natural wine matrix. The standard deviation of the differences then gives the variability of calibration (provided that the calibration has been previously adjusted (bias = 0)).
This theoretical approach is not feasible in practice, because the true values are never known, but it is experimentally possible to get sufficiently close:
- Beforehand, the FTIR calibration must be adjusted statistically (bias = 0) in relation to a reference method, based on at least 30 samples. This eliminates the effects of bias in the measurements that follow.
- The reference materials must be natural wines. At least 10 different reference materials should be used, whose values lie within a range where the level of uncertainty can be regarded as constant.
- An acceptable reference value is obtained from the average of several measurements by the reference method, performed under conditions of reproducibility. This makes it possible to reduce the uncertainty of the reference value: if, for the reference method used, all the significant sources of uncertainty are included in the reproducibility conditions, increasing the number p of measurements made under reproducibility conditions allows the uncertainty associated with the average to be divided by $\sqrt{p}$. The average obtained from a sufficient number of measurements will have a low or negligible uncertainty compared with the uncertainty of the alternative method, and can therefore be used as the reference value; p must be at least equal to 5.
- The reference materials are analyzed by the FTIR method, with several replicates, obtained under conditions of reproducibility. Multiplying the number of measurements q under reproducibility conditions using the FTIR method reduces the variability associated with the precision of the method (random error). The average value of these measurements has a standard deviation of variability divided by $\sqrt{q}$. The random error can become negligible when compared with the variability associated with the calibration (matrix effect) that we wish to estimate here; q must be at least equal to 5.
The following example applies this procedure to the determination of acetic acid by FTIR calibration. The reference values are given by 5 measurements, under reproducibility conditions, on each of 7 stable test materials. Seven test materials are fewer than the minimum of 10 recommended above; the data below are given only as an example.
| Material | Ref 1 | Ref 2 | Ref 3 | Ref 4 | Ref 5 | Mean Ref | FTIR 1 | FTIR 2 | FTIR 3 | FTIR 4 | FTIR 5 | Mean FTIR | Diff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.30 | 0.32 | 0.31 | 0.30 | 0.31 | 0.308 | 0.30 | 0.31 | 0.31 | 0.30 | 0.30 | 0.304 | -0.004 |
| 2 | 0.31 | 0.32 | 0.32 | 0.32 | 0.31 | 0.316 | 0.31 | 0.32 | 0.30 | 0.31 | 0.31 | 0.310 | -0.006 |
| 3 | 0.38 | 0.39 | 0.39 | 0.38 | 0.38 | 0.384 | 0.37 | 0.37 | 0.37 | 0.37 | 0.36 | 0.368 | -0.016 |
| 4 | 0.25 | 0.25 | 0.25 | 0.24 | 0.25 | 0.248 | 0.26 | 0.26 | 0.26 | 0.25 | 0.26 | 0.258 | 0.010 |
| 5 | 0.39 | 0.39 | 0.40 | 0.40 | 0.39 | 0.394 | 0.43 | 0.42 | 0.43 | 0.42 | 0.42 | 0.424 | 0.030 |
| 6 | 0.27 | 0.26 | 0.26 | 0.26 | 0.26 | 0.262 | 0.25 | 0.26 | 0.25 | 0.25 | 0.26 | 0.254 | -0.008 |
| 7 | 0.37 | 0.37 | 0.37 | 0.37 | 0.36 | 0.368 | 0.37 | 0.36 | 0.36 | 0.35 | 0.36 | 0.360 | -0.008 |

The "Ref" columns are the five results of the reference method and the "FTIR" columns the five results of the FTIR method.
Calculation of differences: diff = Mean FTIR – Mean Ref.
This case confirms that the mean difference Md = 0.000 (good fit of the FTIR compared with the reference method)
The standard deviation of the differences Sd = 0.015. It is this standard deviation which can be used to estimate the variability generated by the calibration, and we can therefore say that: Umat = 0.015
NOTE: It should be noted that the value of Umat can be overestimated by this approach. If the laboratory considers that the value is significantly excessive under the operating conditions defined here, it can increase the number of measurements on the reference method and/or the FTIR method.
The reproducibility conditions include all the other significant sources of error; SRw was also calculated: SRw = 0.017
The Uprecision of the determination of acetic acid by the FTIR application is:
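The figures of this example can be checked with a short script. The sketch below recomputes the means, the differences, Md and Sd from the replicate values in the table. As an assumption that the text does not confirm, it then combines Umat with SRw in quadrature, the usual rule for independent standard uncertainties. All variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Five reference-method results per material (materials 1 to 7).
ref = [
    [0.30, 0.32, 0.31, 0.30, 0.31],
    [0.31, 0.32, 0.32, 0.32, 0.31],
    [0.38, 0.39, 0.39, 0.38, 0.38],
    [0.25, 0.25, 0.25, 0.24, 0.25],
    [0.39, 0.39, 0.40, 0.40, 0.39],
    [0.27, 0.26, 0.26, 0.26, 0.26],
    [0.37, 0.37, 0.37, 0.37, 0.36],
]
# Five FTIR results per material, in the same order.
ftir = [
    [0.30, 0.31, 0.31, 0.30, 0.30],
    [0.31, 0.32, 0.30, 0.31, 0.31],
    [0.37, 0.37, 0.37, 0.37, 0.36],
    [0.26, 0.26, 0.26, 0.25, 0.26],
    [0.43, 0.42, 0.43, 0.42, 0.42],
    [0.25, 0.26, 0.25, 0.25, 0.26],
    [0.37, 0.36, 0.36, 0.35, 0.36],
]

# diff = Mean FTIR - Mean Ref, one value per material.
diffs = [mean(f) - mean(r) for f, r in zip(ftir, ref)]

md = mean(diffs)       # mean difference, expected to be close to zero
u_mat = stdev(diffs)   # sample standard deviation of the differences, taken as Umat

# Assumption: Umat and SRw are independent and combine in quadrature.
s_rw = 0.017
u_combined = sqrt(u_mat**2 + s_rw**2)

print(f"Md = {md:.3f}  Sd = Umat = {u_mat:.3f}  combined = {u_combined:.3f}")
```

Under that assumption, the matrix term and the reproducibility term contribute almost equally to the result, which is why the text treats the matrix effect as a major part of the budget for FTIR.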
9.2.3.5. Expressing the expanded uncertainty
The expanded uncertainty is generally expressed with a coverage factor k = 2
i.e. in absolute terms
i.e. in relative terms:
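A hedged sketch of the two forms follows. It assumes that the combined standard uncertainty u is the quadrature sum of Uprecision and Ubias, and that the relative form is referred to the measured value x. The symbols u and x are notation introduced here for illustration, not taken from the source.

```latex
u = \sqrt{U_{precision}^{2} + U_{bias}^{2}}, \qquad
U = \pm k\,u = \pm 2\,u, \qquad
U(\%) = \pm 2\,\frac{u}{x} \times 100
```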
9.2.3.6. Example
Estimation of the uncertainty of a method of measurement of SO2 by the automated sequential colorimetric method.
- Sources of error
Programmable-controller errors: random errors
Variation of reagents: random errors
Calibration of the method: systematic error
- Uprecision
An intralaboratory reproducibility study is conducted based on the monitoring of data from the method used on a wine stored in a refrigerator at 4°C (the level of SO2 is considered to be stable over the test period):
| | Repetition 1 | Repetition 2 |
|---|---|---|
| Day 1 | 114 | 114 |
| Day 2 | 116 | 115 |
| Day 3 | 114 | 113 |
| Day 4 | 118 | 117 |
| Day 5 | 117 | 116 |
| Day 6 | 122 | 120 |
| Day 7 | 116 | 117 |
- Ubias
The series are validated using a reference solution of SO2 whose value was obtained using the Frantz Paul reference method (OIV Reference: OIV-MA-AS323-04A: R2009). The target value of this solution is therefore affected by the uncertainty of the Frantz Paul method.
The laboratory applies an MTD of 12%,
The expanded uncertainty of the Frantz Paul method is 15%; with a coverage factor k = 2, its standard uncertainty is 15 / 2 = 7.5%.
- Relative uncertainty U(%)
In this example, note that much of the uncertainty comes from Ubias and, in particular, from Uref of the control material, whose target value is affected by the significant uncertainty of the reference method. To reduce this uncertainty, the laboratory can determine the target value of the material from the average of p measurements by the reference method. In this case, the standard uncertainty Uref is divided by $\sqrt{p}$.
In this case, if the laboratory determines the target value using the average of 3 measurements obtained by the reference method, U(%)ref is equal to 4.33%. The uncertainty of the automated colorimetric method becomes:
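The recoverable parts of this example can be reproduced as follows. This is a sketch under stated assumptions: the standard deviation of all 14 results is used as a simple estimate of intralaboratory reproducibility (the guide's own formula is not shown in this text), and the helper name `u_ref_of_mean` is hypothetical. The conversion of the MTD into a standard uncertainty is not recoverable from the text, so Ubias and the final U(%) are deliberately left out.

```python
from math import sqrt
from statistics import mean, stdev

# Duplicate SO2 results for the seven days of the reproducibility study.
days = [(114, 114), (116, 115), (114, 113), (118, 117),
        (117, 116), (122, 120), (116, 117)]
results = [x for day in days for x in day]

# Assumption: overall standard deviation of the 14 results as a simple
# estimate of intralaboratory reproducibility.
s_rw = stdev(results)
cv_rw = 100 * s_rw / mean(results)

# Standard uncertainty of the reference method from its expanded
# uncertainty of 15 % with a coverage factor k = 2.
u_ref_single = 15 / 2


def u_ref_of_mean(u_single, p):
    """Standard uncertainty of a target value set as the mean of p measurements."""
    return u_single / sqrt(p)


print(f"SRw = {s_rw:.2f}  CV = {cv_rw:.2f} %")
print(f"Uref, p = 1: {u_ref_single:.2f} %")
print(f"Uref, p = 3: {u_ref_of_mean(u_ref_single, 3):.2f} %")
```

Averaging three reference measurements reduces Uref from 7.5% to 4.33%, the value quoted in the text.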
9.2.4. Estimating uncertainty in a global approach based on proficiency-testing schemes (PTS)
9.2.4.1. Principle
The data produced by the laboratory's participation in PTS can be used to estimate the overall uncertainty U(x). This estimate covers all the sources of error and therefore constitutes a rigorous approach to the estimation of uncertainty. The quality of the estimate, however, depends on the quality of the PTS. This approach is therefore not suitable if too few laboratories participate in the PTS, or if significant method effects limit the reliability of the results.
Based on the assumption that the target value of the PTS can be treated as a true value:
- the average of the deviations of the laboratory is an estimate of the method bias in the laboratory
- the dispersion of the deviations is an estimate of the precision of the method
This method of estimating uncertainty is reliable if:
- the number of participations in the PTS is at least equal to 10;
- the results of the PTS are very reliable, i.e. the target value defined by the PTS can be regarded as a high-quality value with a very small uncertainty (less than 2%). When the uncertainty of the target value provided by the organizer of the interlaboratory comparison is higher, this method of estimating uncertainty loses much of its relevance.
- materials that have been subject to PTS are representative of the matrices of the method.
In order not to count the uncertainty due to changes in conditions twice, the uncertainty is to be estimated with the estimated precision under repeatability conditions and with the estimated trueness over time, taking into account the changes in conditions outlined in 9.2.2.
9.2.4.2. Experiment schedule and calculations
The data can be used as raw values in ranges of concentration where the uncertainty of the method is constant.
If the uncertainty is not constant it can be expressed in relative terms.
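| PTS | Repetition 1 | ... | Repetition i | ... | Repetition p | Within-run variance | Average lab PTS | PTS target value | Standard uncertainty on the target value | Lab deviation | Lab deviation (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| j | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| n | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |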
Determination in absolute values (stable accuracy throughout the range):
Repeatability standard deviation
Standard deviation due to deviations from a target value
Standard deviation of the error on a target value
The expanded uncertainty of the method is expressed as follows:
In this expression, we have:
The random error of the method (precision) taking into account the within-run variability.
Systematic error (bias) taking into account the changes in conditions outlined in 9.2.2.
Determination in relative values:
Coefficient of variation of repeatability
Coefficient of variation due to the relative deviations in relation to a target value
Coefficient of variation of the error on the target value
The relative expanded uncertainty of the method is expressed as follows:
9.2.4.3. Example
The same laboratory as in the previous example has 10 results from its participation in several proficiency-testing schemes conducted over a period of 18 months for total SO2 using the automated sequential colorimetric method.
| PTS | Repetition 1 | Repetition 2 | Within-run variance | Average lab PTS | PTS target value | Standard uncertainty on the target value | Lab deviation | Lab deviation (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | 136 | 135 | 0.5 | 135.5 | 137 | 1 | -1.5 | -1.1% |
| 2 | 100 | 102 | 2 | 101 | 98 | 0.8 | 3 | 3.0% |
| 3 | 137 | 137 | 0 | 137 | 126 | 1.2 | 11 | 8.0% |
| 4 | 44 | 45 | 0.5 | 44.5 | 51 | 0.5 | -6.5 | -14.6% |
| 5 | 144 | 147 | 4.5 | 145.5 | 147 | 1.2 | -1.5 | -1.0% |
| 6 | 91 | 89 | 2 | 90 | 85 | 0.8 | 5 | 5.6% |
| 7 | 144 | 144 | 0 | 144 | 136 | 0.9 | 8 | 5.6% |
| 8 | 178 | 176 | 2 | 177 | 162 | 1.3 | 15 | 8.5% |
| 9 | 131 | 128 | 4.5 | 129.5 | 129 | 1.2 | 0.5 | 0.4% |
| 10 | 98 | 100 | 2 | 99 | 92 | 0.6 | 7 | 7.1% |

Lab deviation = Average lab PTS - PTS target value. The percentages in the last column are expressed relative to the laboratory average.
Coefficient of variation of repeatability
Coefficient of variation due to the relative deviations in relation to a target value =
Coefficient of variation of the error on the target value
The relative expanded uncertainty of the method
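The derived columns of the table, and the two quantities described in 9.2.4.1 (mean deviation as an estimate of bias, dispersion of the deviations as an estimate of precision), can be recomputed as sketched below. The pooled repeatability standard deviation is the standard formula for an equal number of repetitions per round; whether it matches the guide's own expression is an assumption. The final expanded uncertainty is not computed, because the guide's combining formula is not recoverable from this text. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

# (repetition 1, repetition 2, PTS target value) for the ten PTS rounds.
rounds = [
    (136, 135, 137), (100, 102, 98), (137, 137, 126), (44, 45, 51),
    (144, 147, 147), (91, 89, 85), (144, 144, 136), (178, 176, 162),
    (131, 128, 129), (98, 100, 92),
]

within_var = [variance([a, b]) for a, b, _ in rounds]   # within-run variance
lab_mean = [mean([a, b]) for a, b, _ in rounds]         # laboratory average
deviation = [m - t for m, (_, _, t) in zip(lab_mean, rounds)]
# The table's percentages are consistent with a deviation relative to the
# laboratory average rather than to the target value.
deviation_pct = [100 * d / m for d, m in zip(deviation, lab_mean)]

# Pooled repeatability standard deviation (two repetitions in every round).
s_repeat = sqrt(mean(within_var))
# Mean deviation (bias estimate) and dispersion of the deviations.
mean_dev = mean(deviation)
sd_dev = stdev(deviation)

for i, (d, pct) in enumerate(zip(deviation, deviation_pct), start=1):
    print(f"PTS {i:2d}: deviation = {d:6.1f}  ({pct:6.1f} %)")
print(f"s_repeat = {s_repeat:.2f}  mean deviation = {mean_dev:.1f}  SD = {sd_dev:.2f}")
```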
FURTHER INFORMATION
Annex A: Confidence interval relating to the standard deviation and mean (ISO 5725)
A material is measured in n series representing intermediate precision conditions, within the period of stability of the sample.
Each measurement is repeated r times under repeatability conditions.
Let:
- srepeat: the repeatability standard deviation of the method.
- sIP: the intermediate precision standard deviation of the method.
- $\bar{x}$: the arithmetic mean.
- the confidence interval for the repeatability standard deviation relating to a confidence level of 95% is:
- the confidence interval for the intermediate precision standard deviation relating to a confidence level of 95% is:
- the confidence interval for the mean relating to a confidence level of 95% is:
In these expressions, the $\chi^2$ terms are the fractiles of the chi-square distribution with df degrees of freedom, for a confidence level of 95%.
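For orientation, the textbook two-sided 95% interval for a standard deviation s estimated with df degrees of freedom is sketched below. It is the standard chi-square result, given here as an assumption about the intended form. The degrees of freedom that the guide assigns to srepeat and sIP are not recoverable from this text, and σ denotes the true standard deviation.

```latex
\sqrt{\frac{df \cdot s^{2}}{\chi^{2}_{0.975;\,df}}}
\;\le\; \sigma \;\le\;
\sqrt{\frac{df \cdot s^{2}}{\chi^{2}_{0.025;\,df}}}
```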
Annex B: β-risk calculation for the study of accuracy
1. Definitions
Alpha risk (α)
Probability of declaring the tested sample non-compliant with the measurement obtained when the sample is in fact compliant.
Percentage of false non-compliant results.
Beta risk (β)
Probability of declaring the tested sample compliant with the measurement obtained when the sample is in fact non-compliant.
Percentage of false compliant results.
Power: (1 - β)
Probability of declaring the tested sample non-compliant with the measurement obtained when the sample is indeed non-compliant.
2. β-risk calculation
The study of accuracy on the basis of a quantity Z measured in relation to a reference value (Ref) and a maximum acceptable deviation (MAD) is based on the null hypothesis H0: |Ref - Z| = MAD versus the alternative hypothesis H1: |Ref - Z| > MAD.
The β-risk is traditionally calculated from a non-centrality parameter λ, from the α-risk and from the number of measurements n, where:
- n is the number of measurements carried out on a material whose value is Ref.
- α is the supplier risk.
- β is the customer risk.
- MAD is the maximum acceptable deviation relating to the reference value Ref.
- $\bar{x}$ is the arithmetic mean of the n results.
- sIP is the intermediate precision standard deviation of n results.
- λ is the non-centrality parameter which expresses the deviation between the reference value and the mean in relation to the maximum acceptable deviation reduced to the intermediate precision standard deviation of the measurements.
The customer risk is therefore calculated as follows: β = Prob[ T ≤ t0 ]
- T is a random variable that follows Student's t-distribution for n-1 degrees of freedom.
-
- $t_{1-\alpha}$ is the quantile of Student's t-distribution for n-1 degrees of freedom and a one-sided α-risk.
Principle of β-risk calculation to test the null hypothesis H0: |Ref - Z| = MAD versus the alternative hypothesis H1: |Ref - Z| > MAD
3. Application
The laboratory wishes to test a reference value of 25 mg/l in relation to a maximum acceptable deviation of 60% of the reference value in 95% of cases.
The laboratory has carried out 10 measurements on a sample with the reference value of 25 mg/l.
The results indicate a mean of 23.92 mg/l and a standard deviation of 1.30 mg/l, that is, a bias of 4.3% and a coefficient of variation of 5.4%.
If the relative deviation in relation to the reference value is in fact greater than 60% in 5% of cases then the laboratory has a 0.0017% probability of accepting this limit of quantification with their measurements.
Annex C: Bias study (ISO 11352)
Several approaches can be used to check the accuracy of a method:
- Interpreting the results obtained in intermediate precision conditions based on the accuracy study.
- Interpreting the results through interlaboratory comparisons.
Based on the experiment schedule implemented during the accuracy study, the accuracy of a method is confirmed when either:
- The bias or the relative bias is lower than an acceptability limit selected beforehand, as per a standard or regulatory requirement, or a requirement set by the customer or the laboratory itself.
- Or the standardized deviation is lower than or equal to 2.
uRef refers to the standard uncertainty related to the "Ref" value according to the following cases:
| Reference | Standard uncertainty |
|---|---|
| RefCRM | uRef is the standard uncertainty for the reference value. |
| RefPTS | Either uRef was provided by the organiser of the PTS, or uRef = (standard deviation of interlaboratory reproducibility) / $\sqrt{p}$, p being the number of laboratories. |
| RefMethod | uRef = (standard deviation of the p results obtained with a reference method) / $\sqrt{p}$ |
| RefAddition | uRef = standard deviation or standard uncertainty characterising the value of the addition due to the preparation, materials and equipment used. |
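The first acceptance criterion can be sketched with the figures from the application in Annex B (Ref = 25 mg/l, mean of 10 results = 23.92 mg/l). The function name and the 10% acceptability limit are hypothetical, chosen only for illustration. The relative bias is taken relative to Ref, which reproduces the 4.3% quoted in Annex B.

```python
def bias_and_relative_bias(mean_result, ref):
    """Return the bias and the relative bias (%) of a mean result against Ref."""
    bias = mean_result - ref
    return bias, 100 * bias / ref


bias, rel_bias = bias_and_relative_bias(23.92, 25.0)

# Hypothetical acceptability limit: in practice it comes from a standard,
# a regulation, the customer or the laboratory itself.
limit_pct = 10.0
accepted = abs(rel_bias) <= limit_pct

print(f"bias = {bias:.2f} mg/l, relative bias = {rel_bias:.1f} %, accepted = {accepted}")
```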