Processing math: 100%

Semiparametric model

library(serosv)

Penalized splines

Proposed model

Penalized splines

A general model relating the prevalence to age can be written as a GLM

g(P(Yi=1|ai))=g(π(ai))=η(ai)

The linear predictor can be estimated semi-parametrically using penalized spline with truncated power basis functions of degree p and fixed knots κ1,...,κk as followed

η(ai)=β0+β1ai+...+βpapi+Σkk=1uk(aiκk)p+

In matrix notation, the mean structure model for η(ai) becomes

η=Xβ+Zu

Where η=[η(ai)...η(aN)]T, β=[β0β1....βp]T, and u=[u1u2...uk]T are the regression with corresponding design matrices

X=[1a1a21...ap11a2a22...ap21aNa2N...apN],Z=[(a1κ1)p+(a1κ2)p+(a1κk)p+(a2κ1)p+(a2κ2)p+(a2κk)p+(aNκ1)p+(aNκ2)p+(aNκk)p+]

FOI can then be derived as

ˆλ(ai)=[^β1,2^β2ai,...,pˆβap1i+Σkk=1pˆuk(aiκk)p1+]δ(ˆη(ai))


Penalized likelihood framework

Refer to Chapter 8.2.1

Proposed approach

A first approach to fit the model is by maximizing the following penalized likelihood

ϕ1[yT(Xβ+Zu)1Tc(Xβ+Zu)]12λ2[βu]TD[βu]

Where:

Fitting data

To fit the data using the penalized likelihood framework, specify framework = "pl"

Basis function can be defined via the s parameter, some values for s includes:

For more options, refer to the mgcv documentation (Wood 2017)

data <- parvob19_be_2001_2003
pl <- penalized_spline_model(data$age, status = data$seropositive, s = "tp", framework = "pl") 
pl$info
#> 
#> Family: binomial 
#> Link function: logit 
#> 
#> Formula:
#> spos ~ s(age, bs = s, sp = sp)
#> 
#> Estimated degrees of freedom:
#> 6.16  total = 7.16 
#> 
#> UBRE score: 0.1206458
plot(pl)


Generalized Linear Mixture Model framework

Refer to Chapter 8.2.2

Proposed approach

Looking back at (1), a constraint for u would be Σku2k<C for some positive value C

This is equivalent to choosing (β,u) to maximise (1) with D=diag(0,1) where 0 denotes zero vector length p+1 and 1 denotes the unit vector of length K

For a fixed value for λ this is equivalent to fitting the following generalized linear mixed model Ngo and Wand (2004)

f(y|u)=exp{ϕ1[yT(Xβ+Zu)c(Xβ+Zu)]+1Tc(y)},uN(0,G)

Thus Z is penalized by assuming the corresponding coefficients u are random effect with uN(0,σ2uI).

Fitting data

To fit the data using the penalized likelihood framework, specify framework = "glmm"

data <- parvob19_be_2001_2003
glmm <- penalized_spline_model(data$age, status = data$seropositive, s = "tp", framework = "glmm") 
#> 
#>  Maximum number of PQL iterations:  20
#> iteration 1
#> iteration 2
#> iteration 3
#> iteration 4
glmm$info$gam
#> 
#> Family: binomial 
#> Link function: logit 
#> 
#> Formula:
#> spos ~ s(age, bs = s, sp = sp)
#> 
#> Estimated degrees of freedom:
#> 6.45  total = 7.45
plot(glmm)

Eilers, Paul H. C., and Brian D. Marx. 1996. “Flexible Smoothing with b-Splines and Penalties.” Statistical Science 11 (2). https://doi.org/10.1214/ss/1038425655.
Green, P. J., and Bernard. W. Silverman. 1993. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Chapman; Hall/CRC. https://doi.org/10.1201/b15710.
Ngo, Long, and Matthew P. Wand. 2004. “Smoothing with Mixed Model Software.” Journal of Statistical Software 9 (1). https://doi.org/10.18637/jss.v009.i01.
Ruppert, David, M. P. Wand, and R. J. Carroll. 2003. Semiparametric Regression. Cambridge University Press. https://doi.org/10.1017/cbo9780511755453.
Wahba, Grace. 1978. “Improper Priors, Spline Smoothing and the Problem of Guarding Against Model Errors in Regression.” Journal of the Royal Statistical Society Series B: Statistical Methodology 40 (3): 364–72. https://doi.org/10.1111/j.2517-6161.1978.tb01050.x.
Wand, M. P. 2003. “Smoothing and Mixed Models.” Computational Statistics 18 (2): 223–49. https://doi.org/10.1007/s001800300142.
Wood, Simon N. 2017. Generalized Additive Models: An Introduction with r. Chapman; Hall/CRC. https://doi.org/10.1201/9781315370279.