
Deming Regression

Aaron R. Caldwell

Last Updated: 2024-03-21

Background

Errors-in-variables (EIV) models are useful tools to account for measurement error in the independent variable. For studies of agreement, this is particularly useful where there are paired measurements (X & Y) of the same underlying value (e.g., two assays of the same analyte).

Deming regression is one of the simplest forms of EIV models and was promoted by W. Edwards Deming¹. The first to detail the method were Adcock (1878), followed by Kummell (1879) and Koopmans (1936). The name comes from the popularity of Deming’s book (Deming 1943), and within the field of clinical chemistry the procedure was simply referred to as “Deming regression” (e.g., Linnet (1990)).

Code Demonstration

Simple Deming Regression

We can start by creating some fake data to work with.

library(SimplyAgree)

dat = data.frame(
  x = c(7, 8.3, 10.5, 9, 5.1, 8.2, 10.2, 10.3, 7.1, 5.9),
  y = c(7.9, 8.2, 9.6, 9, 6.5, 7.3, 10.2, 10.6, 6.3, 5.2)
)

Also, we will assume, based on historical data, that the measurement error ratio is equal to 4.

The data can be run through the dem_reg function and the results printed.

dem1 = dem_reg(x = "x",
               y = "y",
               data = dat,
               error.ratio = 4,
               weighted = FALSE)
dem1
#> Deming Regression with 95% C.I.
#>               coef      bias     se df lower.ci upper.ci        t p.value
#> Intercept -0.08974 -0.044938 1.7220  8  -4.0607    3.881 -0.05212  0.9597
#> Slope      1.00119  0.003529 0.1872  8   0.5696    1.433  0.00638  0.9951

The resulting regression line can then be plotted.

plot(dem1)

The assumptions of the Deming regression model, primarily normality and homogeneity of variance, can then be checked with the check method for Deming regression results. Both plots appear to be fine with regard to the assumptions.

check(dem1)

Weighted Deming Regression

For this example, I will rely upon the “ferritin” data from the deming R package.

library(deming)
data('ferritin')

head(ferritin)
#>   id period old.lot new.lot
#> 1  1      1       1       1
#> 2  2      1       3       3
#> 3  3      1      10       9
#> 4  4      1      13      11
#> 5  5      1      13      12
#> 6  6      1      15      13

Let me demonstrate the problem with using simple Deming regression when weights would be helpful. When we look at the two plots below, we can see there is a severe problem with using the “un-weighted” model.

dem2 = dem_reg(
  x = "new.lot",
  y = "old.lot",
  data = ferritin,
  weighted = FALSE
)
dem2
#> Deming Regression with 95% C.I.
#>             coef      bias      se  df lower.ci upper.ci      t p.value
#> Intercept 5.2157 -0.235818 2.18603 160   0.8985    9.533  2.386 0.01821
#> Slope     0.9637  0.002597 0.02505 160   0.9143    1.013 -1.448 0.14949

check(dem2)

Now, let us see what happens when weighted is set to TRUE.

dem2 = dem_reg(
  x = "new.lot",
  y = "old.lot",
  data = ferritin,
  weighted = TRUE
)
dem2
#> Weighted Deming Regression with 95% C.I.
#>               coef       bias       se  df lower.ci upper.ci       t   p.value
#> Intercept -0.02616  0.0065148 0.033219 160 -0.09176  0.03945 -0.7874 4.322e-01
#> Slope      1.03052 -0.0001929 0.006262 160  1.01815  1.04288  4.8729 2.626e-06

plot(dem2)


check(dem2)

Calculative Approach

Deming regression assumes paired measures $(x_i, y_i)$ are each measured with error.

$$x_i = X_i + \epsilon_i$$

$$y_i = Y_i + \delta_i$$

We can then measure the relationship between the two variables with the following model.

$$\hat{Y}_i = \beta_0 + \beta_1 \hat{X}_i$$

Traditionally, there are two null hypotheses.

First, that the intercept is equal to zero.

$$H_0: \beta_0 = 0 \text{ vs. } H_1: \beta_0 \neq 0$$

Second, that the slope is equal to one.

$$H_0: \beta_1 = 1 \text{ vs. } H_1: \beta_1 \neq 1$$
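As a rough check on how these hypotheses map onto the printed output, the sketch below recomputes the slope row of the dem1 table shown earlier. It assumes the test statistic is (estimate minus null value) divided by the standard error, referred to a t distribution with the printed degrees of freedom. That assumption reproduces the printed values up to rounding, but it is not taken from the package source.

```r
# Values copied from the printed dem1 output (slope row)
coef_slope <- 1.00119
se_slope   <- 0.1872
df         <- 8

# Test of H0: slope = 1, so the null value subtracted is 1 rather than 0
t_slope <- (coef_slope - 1) / se_slope
p_slope <- 2 * pt(-abs(t_slope), df = df)

# 95% confidence interval based on the t distribution
ci_slope <- coef_slope + c(-1, 1) * qt(0.975, df = df) * se_slope

round(c(t = t_slope, p = p_slope, lower = ci_slope[1], upper = ci_slope[2]), 4)
```

The intercept row follows the same pattern with a null value of 0.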

Measurement Error

A Deming regression model also assumes that the ratio of the measurement error variances ($\sigma^2$) is constant.

$$\lambda = \frac{\sigma^2_\epsilon}{\sigma^2_\delta}$$

In SimplyAgree, the error ratio can be set with the error.ratio argument. It defaults to 1, but can be changed by the user. If replicate measures are taken, then the user can use the id argument to indicate which measures belong to which subject/participant. The measurement error, and the error ratio, will then be estimated from the data itself.
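To give a feel for what estimating the error ratio from replicates could look like, here is a minimal sketch using made-up replicate data (the rep_dat values are hypothetical). It takes the within-subject variance of each method, averages it over subjects, and forms the ratio with x in the numerator, matching the definition of λ above. This is only one simple estimator and is not necessarily what dem_reg does internally.

```r
# Hypothetical replicate data: 4 subjects, 2 replicates per method (made-up values)
rep_dat <- data.frame(
  id = rep(1:4, each = 2),
  x  = c(10.1, 10.5, 12.0, 11.6, 8.9, 9.3, 14.2, 13.8),
  y  = c(10.0, 10.2, 11.9, 11.7, 9.0, 9.2, 14.1, 13.9)
)

# Within-subject variance for each method, averaged over subjects
# (a simple pooled estimate because every subject has the same number of replicates)
var_x <- mean(tapply(rep_dat$x, rep_dat$id, var))
var_y <- mean(tapply(rep_dat$y, rep_dat$id, var))

# Error ratio: x error variance over y error variance
lambda_hat <- var_x / var_y
lambda_hat
```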

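If the data were not measured in replicate, then the error ratio ($\lambda$) can be estimated from the coefficient of variation of each measure (if those data are available) and the means of x and y ($\bar{x}$, $\bar{y}$). Because the standard deviation of the measurement error is approximately the CV multiplied by the mean, and $\lambda$ is defined as the error variance of x divided by the error variance of y, the estimate is

$$\lambda = \frac{(CV_x \cdot \bar{x})^2}{(CV_y \cdot \bar{y})^2}$$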
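As a small worked example of the CV-based approach, the sketch below takes λ as the error variance of x divided by the error variance of y, as defined for λ above. The CV values are hypothetical and chosen only for illustration. The means are those of the fake x and y data created at the start of the code demonstration.

```r
# Hypothetical coefficients of variation for each method (not from real assay data)
cv_x <- 0.04
cv_y <- 0.02

# Means of the fake x and y data created earlier
mean_x <- 8.16
mean_y <- 8.08

# Approximate error SD of each method is CV * mean; lambda is the ratio of variances
lambda_cv <- (cv_x * mean_x)^2 / (cv_y * mean_y)^2
lambda_cv
```

A value like this could then be passed to the error.ratio argument of dem_reg.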

Weights

In some cases the variance of X and Y may increase in proportion to the true value of the measure. In these cases, it may be prudent to use “weighted” Deming regression models. The weights used in SimplyAgree are the same as those suggested by Linnet (1993).

$$\hat{w}_i = \frac{1}{\left[\frac{x_i + \lambda y_i}{1 + \lambda}\right]^2}$$

Weights can also be provided through the weights argument. If weighted Deming regression is not selected (weighted = FALSE), the weight for each observation is equal to 1.

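The weighted means of X and Y are then estimated as follows.

$$\bar{x}_w = \frac{\sum_{i=1}^{N} \hat{w}_i x_i}{\sum_{i=1}^{N} \hat{w}_i}$$

$$\bar{y}_w = \frac{\sum_{i=1}^{N} \hat{w}_i y_i}{\sum_{i=1}^{N} \hat{w}_i}$$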
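The weight and weighted-mean formulas above can be sketched in a few lines of base R. The helper name linnet_weights is hypothetical (it is not a SimplyAgree function), and the sketch applies the weight formula once to the observed values. Any additional steps dem_reg may take are not shown here.

```r
# Hypothetical helper: weights as written in the formula above
linnet_weights <- function(x, y, lambda = 1) {
  1 / ((x + lambda * y) / (1 + lambda))^2
}

# The fake data from the simple Deming regression example
x <- c(7, 8.3, 10.5, 9, 5.1, 8.2, 10.2, 10.3, 7.1, 5.9)
y <- c(7.9, 8.2, 9.6, 9, 6.5, 7.3, 10.2, 10.6, 6.3, 5.2)

w <- linnet_weights(x, y, lambda = 4)

# Weighted means; equivalent to weighted.mean(x, w) and weighted.mean(y, w)
x_bar_w <- sum(w * x) / sum(w)
y_bar_w <- sum(w * y) / sum(w)

c(x_bar_w = x_bar_w, y_bar_w = y_bar_w)
```

Because the weights shrink as the measured values grow, smaller observations pull the weighted means below the ordinary means.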

Estimating the Slope and Intercept

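First, there are three components ($v_x$, $v_y$, $cov_{xy}$).

$$v_x = \sum_{i=1}^{N} \hat{w}_i (x_i - \bar{x}_w)^2$$

$$v_y = \sum_{i=1}^{N} \hat{w}_i (y_i - \bar{y}_w)^2$$

$$cov_{xy} = \sum_{i=1}^{N} \hat{w}_i (x_i - \bar{x}_w)(y_i - \bar{y}_w)$$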

The slope ($b_1$) can then be estimated with the following equation.

$$b_1 = \frac{(\lambda v_y - v_x) + \sqrt{(v_x - \lambda v_y)^2 + 4 \lambda \, cov_{xy}^2}}{2 \lambda \, cov_{xy}}$$

The intercept ($b_0$) can then be estimated with the following equation.

$$b_0 = \bar{y}_w - b_1 \bar{x}_w$$

The standard errors of $b_1$ and $b_0$ are both estimated using a jackknife method (detailed by Linnet (1990)).
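To make the estimation steps concrete, here is a minimal base R sketch of the slope and intercept equations, followed by a plain leave-one-out jackknife for the standard errors. The function name deming_fit is hypothetical, and the jackknife shown (pseudo-values, then their standard deviation divided by the square root of N) is my assumption about the general form of the method rather than a copy of the package code, so its output is not guaranteed to match dem_reg exactly.

```r
# Hypothetical helper implementing the slope and intercept equations above
deming_fit <- function(x, y, lambda = 1, w = rep(1, length(x))) {
  x_bar <- sum(w * x) / sum(w)
  y_bar <- sum(w * y) / sum(w)
  v_x <- sum(w * (x - x_bar)^2)
  v_y <- sum(w * (y - y_bar)^2)
  cov_xy <- sum(w * (x - x_bar) * (y - y_bar))
  b1 <- ((lambda * v_y - v_x) +
           sqrt((v_x - lambda * v_y)^2 + 4 * lambda * cov_xy^2)) /
    (2 * lambda * cov_xy)
  b0 <- y_bar - b1 * x_bar
  c(intercept = b0, slope = b1)
}

# The fake data from the simple Deming regression example, with error ratio 4
x <- c(7, 8.3, 10.5, 9, 5.1, 8.2, 10.2, 10.3, 7.1, 5.9)
y <- c(7.9, 8.2, 9.6, 9, 6.5, 7.3, 10.2, 10.6, 6.3, 5.2)
fit <- deming_fit(x, y, lambda = 4)
round(fit, 5)

# Leave-one-out estimates: a 2 x N matrix (rows: intercept, slope)
n <- length(x)
loo <- sapply(seq_len(n), function(i) deming_fit(x[-i], y[-i], lambda = 4))

# Jackknife pseudo-values and standard errors (assumed form, see the text above)
pseudo <- n * fit - (n - 1) * loo
jack_se <- apply(pseudo, 1, sd) / sqrt(n)
jack_se
```

With the unweighted fake data and an error ratio of 4, the coefficients from deming_fit agree with those printed for dem1 (intercept -0.08974, slope 1.00119) up to rounding.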

References

Adcock, R J. 1878. “A Problem in Least Squares.” The Analyst 5 (2): 53. https://doi.org/10.2307/2635758.
Deming, W E. 1943. Statistical Adjustment of Data. Wiley.
Koopmans, Tjalling Charles. 1936. Linear Regression Analysis of Economic Time Series. Vol. 20. DeErven F. Bohn, Haarlem, Netherlands.
Kummell, C H. 1879. “Reduction of Observation Equations Which Contain More Than One Observed Quantity.” The Analyst 6 (4): 97. https://doi.org/10.2307/2635646.
Linnet, Kristian. 1990. “Estimation of the Linear Relationship Between the Measurements of Two Methods with Proportional Errors.” Statistics in Medicine 9 (12): 1463–73. https://doi.org/10.1002/sim.4780091210.
———. 1993. “Evaluation of Regression Procedures for Methods Comparison Studies.” Clinical Chemistry 39 (3): 424–32.

  1. Deming was a titan of the fields of statistics and engineering, and I would highly recommend reading some of his academic work and books.