Please cite the following companion paper if you’re using the ctbi package: Ritter, F.: Technical note: A procedure to clean, decompose, and aggregate time series, Hydrol. Earth Syst. Sci., 27, 349–361, https://doi.org/10.5194/hess-27-349-2023, 2023.
The goal of ctbi is to clean, decompose, impute and
aggregate univariate time series. Ctbi stands for
Cyclic/Trend decomposition using Bin Interpolation : the time
series is divided into a sequence of non-overlapping bins. The long-term
trend is a linear interpolation of the mean values between successive
bins and the cyclic component is the mean stack of detrended data within
all bins. Outliers present in the residuals are flagged using an
enhanced Boxplot rule (called Logbox) that is adapted
to non-Gaussian data and keeps the type I error at
You can install the latest version of ctbi on CRAN.
library(ctbi)
<- data.frame(year = 1700:1988,sunspot = as.numeric(sunspot.year))
example1 sample(1:289,30),'sunspot'] <- NA # contaminate data with missing values
example1[c(5,30,50),'sunspot'] <- c(-50,300,400) # contaminate data with outliers
example1[<- example1[-(70:100),]
example1 <- 11 # aggregation performed every 11 years (the year is numeric here)
bin.period <- 1989 # to capture the last year, 1988, in a complete bin
bin.side <- 'mean'
bin.FUN <- 0.2 # maximum of 20% of missing data per bin
bin.max.f.NA <- c(0,Inf) # negative values are impossible
ylim
<- ctbi(example1,bin.period=bin.period,
list.main bin.side=bin.side,bin.FUN=bin.FUN,
ylim=ylim,bin.max.f.NA=bin.max.f.NA)
<- list.main$data0 # cleaned raw dataset
data0.example1 <- list.main$data1 # aggregated dataset.
data1.example1 <- list.main$mean.cycle # this data set shows a moderate seasonality
mean.cycle.example1 <- list.main$summary.bin # confirmed with SCI = 0.50
summary.bin.example1 <- list.main$summary.outlier summary.outlier.example1