| Type: | Package |
| Title: | Directly Adjusted Estimates |
| Version: | 0.6.1 |
| Description: | Compute estimates and confidence intervals of weighted averages quickly and easily. Weighted averages are computed using data.table for speed. Confidence intervals are approximated using the delta method, based either on known formulae or on algorithmic or numerical differentiation. |
| License: | MIT + file LICENSE |
| URL: | https://github.com/FinnishCancerRegistry/directadjusting/ |
| BugReports: | https://github.com/FinnishCancerRegistry/directadjusting/issues |
| Depends: | R (≥ 2.10) |
| Imports: | data.table, stats |
| Encoding: | UTF-8 |
| Language: | en-GB |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | no |
| Packaged: | 2026-02-02 09:01:43 UTC; joonas.miettinen |
| Author: | Joonas Miettinen |
| Maintainer: | Joonas Miettinen <joonas.miettinen@cancer.fi> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-04 19:30:02 UTC |
directadjusting: Directly Adjusted Estimates
Description
Compute estimates and confidence intervals of weighted averages quickly and easily. Weighted averages are computed using data.table for speed. Confidence intervals are approximated using the delta method, based either on known formulae or on algorithmic or numerical differentiation.
Recommended installation
devtools::install_github( "FinnishCancerRegistry/directadjusting@release" )
Example
# suppose we have Poisson rates that we want to adjust by age group.
# they are stratified by sex.
set.seed(1337)
offsets <- rnorm(8, mean = 1000, sd = 100)
baseline <- 100
sex_hrs <- rep(1:2, each = 4)
age_group_hrs <- rep(c(0.75, 0.90, 1.10, 1.25), times = 2)
counts <- rpois(8, baseline * sex_hrs * age_group_hrs)
# raw estimates
my_stats <- data.table::data.table(
  sex = rep(1:2, each = 4),
  ag = rep(1:4, times = 2),
  e = counts / offsets
)
my_stats[, "v" := my_stats[["e"]] / offsets]
print(my_stats)
#      sex    ag          e            v
#    <int> <int>      <num>        <num>
# 1:     1     1 0.08928141 8.759527e-05
# 2:     1     2 0.10054601 1.175523e-04
# 3:     1     3 0.11987410 1.238776e-04
# 4:     1     4 0.09722692 8.365551e-05
# 5:     2     1 0.18043221 1.937844e-04
# 6:     2     2 0.14781448 1.227479e-04
# 7:     2     3 0.21747515 1.987203e-04
# 8:     2     4 0.21519746 1.781152e-04
# adjusted by age group
my_adj_stats <- directadjusting::directly_adjusted_estimates(
  stats_dt = my_stats,
  stat_col_nms = "e",
  var_col_nms = "v",
  conf_lvls = 0.95,
  conf_methods = "log",
  stratum_col_nms = "sex",
  adjust_col_nms = "ag",
  weights = c(200, 300, 400, 100)
)
print(my_adj_stats)
# Key: <sex>
#      sex         e            v       e_lo      e_hi
#    <int>     <num>        <num>      <num>     <num>
# 1:     1 0.1056924 3.474049e-05 0.09474912 0.1178996
# 2:     2 0.1889406 5.237509e-05 0.17527556 0.2036710
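The variance column v in the example is computed as e / offsets. A minimal sketch of why this works, assuming each count is Poisson distributed and the offsets are fixed: the variance of a Poisson count is estimated by the count itself, so the variance of count / offset is estimated by count / offset^2, which is the same as e / offset. The numbers below are made up for illustration and are not taken from the example.

```r
# hypothetical counts and offsets (e.g. person-time); illustration only
counts <- c(90, 110, 125, 80)
offsets <- c(1000, 1100, 1050, 950)

# rate estimates
e <- counts / offsets

# Var(count / offset) = Var(count) / offset^2, with Var(count) estimated
# by the count itself under the Poisson assumption
v_from_counts <- counts / offsets ^ 2

# the form used in the example: the rate divided by the offset
v_from_rates <- e / offsets

stopifnot(all.equal(v_from_counts, v_from_rates))
```

Both forms give the same variance estimates, so either can be used to fill var_col_nms for Poisson rates.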
News
News for version 0.6.1
directadjusting
DESCRIPTION and documentation fixes.
News for version 0.6.0
directadjusting
First CRAN release.
News for version 0.5.0
directadjusting::direct_adjusted_estimates
directadjusting::delta_method_confidence_intervals is now much more
flexible. Argument conf_method accepts a string, a call, or a list
of calls that produces the desired confidence intervals.
directadjusting::direct_adjusted_estimates
directadjusting::direct_adjusted_estimates option conf_methods = "boot"
has been removed. Only delta method confidence intervals are now possible.
The delta method is now more flexible: conf_methods accepts e.g.
list("log", list(g = quote(qnorm(theta)), g_inv = quote(pnorm(g)))).
directadjusting::direct_adjusted_estimates
directadjusting::direct_adjusted_estimates can now, for the sake of
convenience, be called with no adjust_col_nms defined. This results
in no adjusting.
News for version 0.4.0
directadjusting::direct_adjusted_estimates
directadjusting::direct_adjusted_estimates now correctly uses
the same conf_lvls and conf_methods for all statistics when their
length is one.
News for version 0.3.0
directadjusting
Removed deprecated directadjusting::direct_adjusted_estimates. Use
directadjusting::directly_adjusted_estimates instead.
Author(s)
Maintainer: Joonas Miettinen <joonas.miettinen@cancer.fi>
See Also
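Useful links:
- https://github.com/FinnishCancerRegistry/directadjusting/
- Report bugs at https://github.com/FinnishCancerRegistry/directadjusting/issues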
Confidence Intervals
Description
Functions to compute confidence intervals.
Usage
delta_method_confidence_intervals(
statistics,
variances,
conf_lvl = 0.95,
conf_method = "identity"
)
Arguments
| statistics | Statistics for which to calculate confidence intervals. |
| variances | Variance estimates of statistics. |
| conf_lvl | Confidence level of confidence intervals, e.g. 0.95. |
| conf_method | Delta method transformation to be applied. Can be a string, a call, or a list with elements g and g_inv; see section Functions. |
Value
directadjusting::delta_method_confidence_intervals
Returns a data.table with columns
c("statistic", "variance", "ci_lo", "ci_hi").
Functions
directadjusting::delta_method_confidence_intervals
directadjusting::delta_method_confidence_intervals can be used to
compute confidence intervals using the delta method. The following steps
are performed:
- Compute confidence intervals based on conf_method, statistics, variances, and conf_lvl.
  - If conf_method is a string, a pre-defined set of mathematical expressions is used to compute the confidence intervals.
  - If conf_method is a call, it is evaluated with the variables theta, theta_variance, theta_standard_error, and z. This is done once for the lower and once for the upper bound of the confidence interval, so for the lower bound and conf_lvl = 0.95 we use z = stats::qnorm(p = (1 - conf_lvl) / 2).
  - If conf_method is a list, it must contain elements g and g_inv, e.g. list(g = quote(log(theta)), g_inv = quote(exp(g))).
    - g is passed to stats::deriv. If that fails, a numerical derivative is computed.
    - With the derivative known, the variance after the transformation is variance * g_gradient ^ 2.
    - With the transformed variance known, the confidence interval on the transformed scale is calculated simply via g(theta) + g_standard_error * z.
    - These transformation-scale confidence intervals are then converted back to the original scale using g_inv.
- Collect a data.table with the confidence intervals and also the columns statistic = statistics and variance = variances.
- Add an attribute named ci_meta to the data.table. This attribute is a list which contains elements conf_lvl and conf_method.
- Return a data.table with columns c("statistic", "variance", "ci_lo", "ci_hi").
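The list form of conf_method described above can be sketched by hand in base R. This is only an illustration under stated assumptions, not the package's internal code: delta_ci_sketch is a hypothetical helper, it handles a single statistic, and the step size of the central-difference fallback is an arbitrary choice.

```r
# Hypothetical helper, not part of directadjusting: delta method confidence
# interval for one statistic, with g and g_inv given as quoted calls.
delta_ci_sketch <- function(theta, variance, g, g_inv, conf_lvl = 0.95) {
  # z-quantiles for the lower and the upper bound
  z <- stats::qnorm(c((1 - conf_lvl) / 2, 1 - (1 - conf_lvl) / 2))
  g_value <- eval(g, list(theta = theta))
  # try symbolic differentiation first; if stats::deriv does not know the
  # function, fall back to a central-difference numerical derivative
  g_gradient <- tryCatch(
    {
      d <- eval(stats::deriv(g, "theta"), list(theta = theta))
      as.numeric(attr(d, "gradient"))
    },
    error = function(e) {
      h <- 1e-6
      upper <- eval(g, list(theta = theta + h))
      lower <- eval(g, list(theta = theta - h))
      (upper - lower) / (2 * h)
    }
  )
  # variance on the transformed scale is variance * g_gradient ^ 2
  g_standard_error <- sqrt(variance * g_gradient ^ 2)
  g_ci <- g_value + g_standard_error * z
  # convert both bounds back to the original scale
  vapply(g_ci, function(x) eval(g_inv, list(g = x)), numeric(1))
}

# log transformation: stats::deriv can differentiate log(theta)
ci_log <- delta_ci_sketch(
  theta = 0.9, variance = 0.1,
  g = quote(log(theta)), g_inv = quote(exp(g))
)

# probit transformation: qnorm is not in the derivatives table of
# stats::deriv, so the numerical fallback is used
ci_probit <- delta_ci_sketch(
  theta = 0.9, variance = 0.1,
  g = quote(qnorm(theta)), g_inv = quote(pnorm(g))
)
```

For the log transformation the result agrees with the closed form theta * exp(z * theta_standard_error / theta), which is the expression supplied as a call in the examples of this help page.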
Examples
# directadjusting::delta_method_confidence_intervals
dt_1 <- directadjusting::delta_method_confidence_intervals(
statistics = 0.9,
variances = 0.1,
conf_lvl = 0.95,
conf_method = "log"
)
# you can also supply your own math for computing the confidence intervals
dt_2 <- directadjusting::delta_method_confidence_intervals(
statistics = 0.9,
variances = 0.1,
conf_lvl = 0.95,
conf_method = quote(theta * exp(z * theta_standard_error / theta))
)
dt_3 <- directadjusting::delta_method_confidence_intervals(
statistics = 0.9,
variances = 0.1,
conf_lvl = 0.95,
conf_method = list(
g = quote(log(theta)),
g_inv = quote(exp(g))
)
)
dt_4 <- directadjusting::delta_method_confidence_intervals(
statistics = 0.9,
variances = 0.1,
conf_lvl = 0.95,
conf_method = list(
g = quote(stats::qnorm(theta)),
g_inv = quote(stats::pnorm(g))
)
)
stopifnot(
all.equal(dt_1, dt_2, check.attributes = FALSE),
all.equal(dt_1, dt_3, check.attributes = FALSE)
)
Directly Adjusted Estimates
Description
Compute directly adjusted estimates from a table of statistics.
Usage
directly_adjusted_estimates(
stats_dt,
stat_col_nms,
var_col_nms,
stratum_col_nms = NULL,
adjust_col_nms = NULL,
conf_lvls = 0.95,
conf_methods = "identity",
weights = NULL
)
Arguments
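| stats_dt | a data.table containing the statistics and their variances |
| stat_col_nms | names of columns in stats_dt containing the statistics |
| var_col_nms | names of columns in stats_dt containing the variances of the statistics |
| stratum_col_nms | names of columns in stats_dt by which the statistics are stratified |
| adjust_col_nms | Names of columns in stats_dt over which the weighted averages are computed. If NULL, no adjusting is performed. |
| conf_lvls | confidence levels for confidence intervals; either one level for all statistics or one level for each statistic (see stat_col_nms) |
| conf_methods | Method(s) to compute confidence intervals. Either one method for all statistics or one method for each statistic. Can also be "none", in which case no confidence intervals are computed; other values are passed to delta_method_confidence_intervals (see Details). |
| weights | The weights need not sum to one as this is ensured internally. The examples below supply weights either as a numeric vector or as a data.table containing the adjust_col_nms column(s) and a column named weight. |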
Details
directadjusting::directly_adjusted_estimates computes weighted
averages and their confidence intervals. Performs the following steps:
- Makes a new data.table with data from stats_dt without copying any column data, to avoid modifying stats_dt itself.
- Handles argument weights in order to produce a data.table of weights if it wasn't one already.
- Inserts the weights into stats_dt.
  - Weights are merged into stats_dt in-place by making a left join on weights_dt using stats_dt and adding column weight resulting from this join into stats_dt.
  - Re-scales weights to sum to one within each stratum defined by stratum_col_nms.
- Computes weighted averages of stat_col_nms and var_col_nms (the latter with squared weights because they are variances) over adjust_col_nms. This results in a data.table without column(s) adjust_col_nms.
- For each i in seq_along(stat_col_nms):
  - If conf_methods[[i]] is "none", doesn't compute confidence intervals.
  - Otherwise calls delta_method_confidence_intervals.
- Sets attribute directly_adjusted_estimates_meta. It is a list containing:
  - call: The call to directadjusting::directly_adjusted_estimates.
  - stat_col_nms: The argument as given by the user.
  - var_col_nms: The argument as given by the user.
  - stratum_col_nms: The argument as given by the user.
  - adjust_col_nms: The argument as given by the user.
  - conf_lvls: The argument, but always of length length(stat_col_nms).
  - conf_methods: The argument, but always of length length(stat_col_nms).
- Returns a data.table. Returned columns are those given via stratum_col_nms, stat_col_nms, and var_col_nms.
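The weighting steps above can be sketched by hand. The following is a minimal illustration in base R rather than data.table, under the assumption that it mirrors the steps as described here; it is not the package's internal code, and the input numbers and the names stats_df, weights_df, merged, and adjusted are made up for the example.

```r
# hypothetical input: one row per sex and age group
stats_df <- data.frame(
  sex = rep(1:2, each = 4),
  ag = rep(1:4, times = 2),
  e = c(0.10, 0.12, 0.14, 0.16, 0.20, 0.22, 0.24, 0.26),
  v = rep(1e-4, 8)
)
weights_df <- data.frame(ag = 1:4, weight = c(200, 300, 400, 100))

# left join of the weights onto the statistics by the adjusting column
merged <- merge(stats_df, weights_df, by = "ag", all.x = TRUE)

# re-scale weights to sum to one within each stratum (here: sex)
merged$weight <- merged$weight / ave(merged$weight, merged$sex, FUN = sum)

# weighted averages over the adjusting column; variances use squared weights
adjusted <- data.frame(
  sex = sort(unique(merged$sex)),
  e = as.numeric(tapply(merged$weight * merged$e, merged$sex, sum)),
  v = as.numeric(tapply(merged$weight ^ 2 * merged$v, merged$sex, sum))
)
```

The result has one row per stratum and no adjusting column. Confidence intervals for each adjusted statistic would then be derived from e and v with the delta method.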
Value
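Returns a data.table. Returned columns are those given via
stratum_col_nms, stat_col_nms, and var_col_nms. When confidence intervals
are computed, their bounds are also included, e.g. e_lo and e_hi for
statistic e in the example output in section Example.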
Examples
# directadjusting::directly_adjusted_estimates
library("data.table")
set.seed(1337)
offsets <- rnorm(8, mean = 1000, sd = 100)
baseline <- 100
hrs_by_sex <- rep(1:2, each = 4)
hrs_by_ag <- rep(c(0.75, 0.90, 1.10, 1.25), times = 2)
counts <- rpois(8, baseline * hrs_by_sex * hrs_by_ag)
# raw estimates
my_stats <- data.table::data.table(
sex = rep(1:2, each = 4),
ag = rep(1:4, times = 2),
e = counts / offsets,
v = counts / (offsets ^ 2)
)
# adjusted by age group
my_adj_stats <- directadjusting::directly_adjusted_estimates(
stats_dt = my_stats,
stat_col_nms = "e",
var_col_nms = "v",
conf_lvls = 0.95,
conf_methods = "log",
stratum_col_nms = "sex",
adjust_col_nms = "ag",
weights = c(200, 300, 400, 100)
)
# adjusted by smaller age groups, stratified by larger age groups
my_stats[, "ag2" := c(1,1, 2,2, 1,1, 2,2)]
my_adj_stats <- directadjusting::directly_adjusted_estimates(
stats_dt = my_stats,
stat_col_nms = "e",
var_col_nms = "v",
conf_lvls = 0.95,
conf_methods = "log",
stratum_col_nms = c("sex", "ag2"),
adjust_col_nms = "ag",
weights = c(200, 300, 400, 100)
)
# with no adjusting columns defined you get the same table as input
# but with confidence intervals. this is for the sake of convenience in
# programming cases where sometimes you want to adjust and sometimes not.
stats_dt_2 <- data.table::data.table(
sex = 0:1,
e = 0.0,
v = 0.1
)
dt_2 <- directadjusting::directly_adjusted_estimates(
stats_dt = stats_dt_2,
stat_col_nms = "e",
var_col_nms = "v",
conf_lvls = 0.95,
conf_methods = "identity",
stratum_col_nms = "sex"
)
stopifnot(
dt_2[["e"]] == stats_dt_2[["e"]],
dt_2[["v"]] == stats_dt_2[["v"]],
dt_2[["sex"]] == stats_dt_2[["sex"]]
)
# sometimes when adjusting rates or counts, there can be strata where the
# statistic is zero. these should be included in your statistics dataset
# if you want the weighted average to be influenced by the zero.
# otherwise you will get the wrong result. naively tabulating a dataset
# with e.g. dt[, .N, keyby = "stratum"] does not produce a result row for
# a stratum that does not appear in the dataset even if we know that the
# stratum exists, for instance when only the age groups 1-17 are present
# in the dataset.
stats_dt_3 <- data.table::data.table(
age_group = 1:18,
count = 17:0,
var = 17:0
)
# this goes as intended
dt_3 <- directadjusting::directly_adjusted_estimates(
stats_dt = stats_dt_3,
stat_col_nms = "count",
var_col_nms = "var",
stratum_col_nms = NULL,
adjust_col_nms = "age_group",
weights = data.table::data.table(
age_group = 1:18,
weight = 18:1
)
)
# this does not
dt_4 <- directadjusting::directly_adjusted_estimates(
stats_dt = stats_dt_3[1:17, ],
stat_col_nms = "count",
var_col_nms = "var",
stratum_col_nms = NULL,
adjust_col_nms = "age_group",
weights = data.table::data.table(
age_group = 1:18,
weight = 18:1
)
)
# the weighted average that included the zero is smaller
stopifnot(
dt_3[["count"]] < dt_4[["count"]]
)
# NAs are allowed and silently produce NAs in turn.
stats_dt_5 <- data.table::data.table(
age_group = 1:18,
count = c(NA, 16:0),
var = c(NA, 16:0)
)
dt_5 <- directadjusting::directly_adjusted_estimates(
stats_dt = stats_dt_5,
stat_col_nms = "count",
var_col_nms = "var",
adjust_col_nms = "age_group",
weights = data.table::data.table(
age_group = 1:18,
weight = 18:1
)
)
stopifnot(
is.na(dt_5)
)
stats_dt_6 <- data.table::data.table(
age_group = 1:4,
survival = c(0.20, 0.40, 0.60, 0.80),
var = 0.05 ^ 2
)
# you can use conf_methods to pass any supported method to
# `delta_method_confidence_intervals`.
dt_6 <- directadjusting::directly_adjusted_estimates(
stats_dt = stats_dt_6,
stat_col_nms = "survival",
var_col_nms = "var",
adjust_col_nms = "age_group",
weights = data.table::data.table(
age_group = 1:4,
weight = 1:4
),
conf_methods = list(
list(
g = quote(stats::qnorm(theta)),
g_inv = quote(stats::pnorm(g))
)
)
)
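The effect of a missing zero-count stratum (dt_3 versus dt_4 in the examples) can be checked by hand. This is a minimal sketch in base R, assuming that the weights are re-scaled over the rows actually present in the statistics table, as described in Details; the variable names are made up for the illustration.

```r
# same numbers as in the stats_dt_3 example
count <- 17:0
weight <- 18:1

# all 18 age groups present, including the one with a zero count
avg_full <- sum(weight / sum(weight) * count)

# age group 18 (zero count) missing: the remaining weights are re-scaled
# to sum to one, so the zero no longer pulls the average down
keep <- 1:17
avg_missing <- sum(weight[keep] / sum(weight[keep]) * count[keep])

stopifnot(avg_full < avg_missing)
```

Dropping the zero-count row leaves the weighted sum unchanged but shrinks the total weight, which is why the average without that row is larger.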