sgboost
Implements sparse-group boosting, to be used in conjunction with
the R package mboost. A formula object defining group base
learners and individual base learners is used in the fitting process.
Regularization is based on the degrees of freedom of the individual
base learners, \(df(\lambda)\), and of the group base learners,
\(df(\lambda^{(g)})\), such that \(df(\lambda) = \alpha\) and
\(df(\lambda^{(g)}) = 1 - \alpha\), where \(\alpha \in [0, 1]\) is the mixing parameter.
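To make the link between the penalty and the degrees of freedom concrete, here is a minimal base-R sketch. It assumes a ridge-penalized linear base learner and the simple definition \(df(\lambda) = \mathrm{trace}(S_\lambda)\), where \(S_\lambda\) is the hat matrix. mboost may use a different definition of the degrees of freedom, so the numbers are illustrative only. The helpers `ridge_df` and `lambda_for_df` are hypothetical and not part of sgboost or mboost.

```r
# Degrees of freedom of a ridge-penalized linear base learner, using the
# simple definition df(lambda) = trace(S_lambda) (an assumption of this sketch).
ridge_df <- function(X, lambda) {
  XtX <- crossprod(X)
  # trace(X (X'X + lambda I)^{-1} X') = trace((X'X + lambda I)^{-1} X'X)
  sum(diag(solve(XtX + lambda * diag(ncol(X)), XtX)))
}

# Find the penalty lambda at which the base learner has the target df.
lambda_for_df <- function(X, target_df) {
  uniroot(function(l) ridge_df(X, l) - target_df,
          interval = c(0, 1e8), tol = 1e-10)$root
}

set.seed(1)
X <- scale(matrix(rnorm(300), ncol = 3))  # one group of three variables
alpha <- 0.3

lambda_ind <- lambda_for_df(X[, 1, drop = FALSE], alpha)  # individual base learner
lambda_grp <- lambda_for_df(X, 1 - alpha)                 # group base learner

ridge_df(X[, 1, drop = FALSE], lambda_ind)  # approximately alpha
ridge_df(X, lambda_grp)                     # approximately 1 - alpha
```

Under these assumptions, a smaller \(\alpha\) means a larger penalty on the individual base learners and a smaller one on the group base learners, so group base learners are favored.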
You can install the development version of sgboost from GitHub with:
# install.packages("devtools")
devtools::install_github("FabianObster/sgboost")
This basic example shows how to fit a sparse-group boosting model. First, load the required packages:
library(sgboost)
library(dplyr)
library(mboost)
For a data.frame df and a group structure group_df, this example fits a sparse-group boosting model and plots the coefficient path:
library(sgboost)
set.seed(1)
df <- data.frame(
  x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100),
  x4 = rnorm(100), x5 = runif(100)
)
df <- df %>%
  mutate_all(function(x) {
    as.numeric(scale(x))
  })
df$y <- df$x1 + df$x4 + df$x5
group_df <- data.frame(
  group_name = c(1, 1, 1, 2, 2),
  var_name = c("x1", "x2", "x3", "x4", "x5")
)

sgb_formula <- as.formula(create_formula(alpha = 0.3, group_df = group_df))
#> Warning in create_formula(alpha = 0.3, group_df = group_df): there is a group containing only one variable.
#> It will be treated as individual variable and as group
sgb_model <- mboost(formula = sgb_formula, data = df)
plot_path(sgb_model)
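To illustrate what a formula with individual and group base learners can look like, the following base-R sketch builds one by hand. The helper `sketch_formula` is hypothetical and not part of sgboost. It assumes each base learner is written as a `bols()` term from mboost, with `df = alpha` for individual variables and `df = 1 - alpha` for groups. The formula that `create_formula()` actually returns may differ in its details.

```r
# Hypothetical helper (not part of sgboost): builds a formula with one
# bols() term per variable and one bols() term per group.
sketch_formula <- function(alpha, group_df, outcome = "y") {
  var_name <- as.character(group_df$var_name)
  # Individual base learners: one variable each, df = alpha
  individual <- paste0("bols(", var_name, ", df = ", alpha, ")")
  # Group base learners: all variables of a group together, df = 1 - alpha
  vars_by_group <- split(var_name, group_df$group_name)
  group <- vapply(
    vars_by_group,
    function(v) paste0("bols(", paste(v, collapse = ", "), ", df = ", 1 - alpha, ")"),
    character(1)
  )
  as.formula(paste(outcome, "~", paste(c(individual, group), collapse = " + ")))
}

group_df <- data.frame(
  group_name = c(1, 1, 1, 2, 2),
  var_name = c("x1", "x2", "x3", "x4", "x5")
)
f <- sketch_formula(alpha = 0.3, group_df = group_df)

# Number of base learners: 5 individual terms plus 2 group terms
n_terms <- length(attr(terms(f), "term.labels"))
```

Every variable appears twice in such a formula, once on its own and once inside its group, which is what lets the boosting algorithm choose between within-group and group-level sparsity.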