tidylearn is designed so that analysis results flow directly into
reports. Every model produces tidy tibbles, ggplot2 visualisations, and
— with the tl_table_*() functions — polished
gt tables, all with a consistent interface. This vignette
walks through the reporting tools available.
tidylearn’s plot() method dispatches to the right
visualisation for each model type. All plots are ggplot2 objects —
themeable, composable, and convertible to plotly.
model_reg <- tl_model(mtcars, mpg ~ wt + hp, method = "linear")
# Actual vs predicted — one call
plot(model_reg, type = "actual_predicted")

split <- tl_split(iris, prop = 0.7, stratify = "Species", seed = 42)
model_clf <- tl_model(split$train, Species ~ ., method = "forest")
plot(model_clf, type = "confusion")

The tl_table() family mirrors the plot interface but produces formatted
gt tables instead. Like plot(), tl_table() dispatches based on model
type and a type parameter:
tl_table(model) # auto-selects the best table type
tl_table(model, type = "coefficients") # specific type

Model Evaluation Metrics

| Metric | Value |
|---|---|
| RMSE | 2.4689 |
| MAE | 1.9015 |
| R² | 0.8268 |

tidylearn | linear (regression) | mpg ~ wt + hp | n = 32
For linear and logistic models, the table includes standard errors, test statistics, and p-values, with significant terms highlighted:
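For the linear model fitted earlier, a table like the one below can be requested with the `type` argument shown above (a sketch; the `"coefficients"` type string is taken from the earlier example):

```r
# Coefficient table for the linear model fitted above
tl_table(model_reg, type = "coefficients")
```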
Linear Model Coefficients

| Term | Estimate | Std. Error | t value | p | |
|---|---|---|---|---|---|
| (Intercept) | 37.2273 | 1.5988 | 23.2847 | 2.57 × 10⁻²⁰ | * |
| wt | −3.8778 | 0.6327 | −6.1287 | 1.12 × 10⁻⁶ | * |
| hp | −0.0318 | 0.0090 | −3.5187 | 1.45 × 10⁻³ | * |

tidylearn | linear (regression) | mpg ~ wt + hp | n = 32
For regularised models, coefficients are sorted by magnitude and zero coefficients are greyed out:
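The call below sketches how such a table might be produced. The `method = "lasso"` name is inferred from the table footer and is not defined elsewhere in this section:

```r
# Fit a lasso on all predictors, then tabulate its coefficients
model_lasso <- tl_model(mtcars, mpg ~ ., method = "lasso")
tl_table(model_lasso, type = "coefficients")
```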
Lasso Coefficients
lambda = 1.536 (1se)

| Term | Coefficient | \|Coefficient\| |
|---|---|---|
| (Intercept) | 33.4721 | 33.4721 |
| wt | −2.2863 | 2.2863 |
| cyl | −0.8339 | 0.8339 |
| hp | −0.0059 | 0.0059 |
| disp | 0.0000 | 0.0000 |
| drat | 0.0000 | 0.0000 |
| qsec | 0.0000 | 0.0000 |
| vs | 0.0000 | 0.0000 |
| am | 0.0000 | 0.0000 |
| gear | 0.0000 | 0.0000 |
| carb | 0.0000 | 0.0000 |

tidylearn | lasso (regression) | mpg ~ . | n = 32
A formatted confusion matrix with correct predictions highlighted on the diagonal:
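Since tl_table() mirrors the plot interface, the confusion-matrix table for the forest model fitted above would be requested with the same type string used for plot() (an assumption based on that mirroring, not a call shown in this section):

```r
# Confusion matrix table, mirroring plot(model_clf, type = "confusion")
tl_table(model_clf, type = "confusion")
```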
Confusion Matrix

| Actual \ Predicted | setosa | versicolor | virginica |
|---|---|---|---|
| setosa | 15 | 0 | 0 |
| versicolor | 0 | 14 | 1 |
| virginica | 0 | 2 | 13 |

tidylearn | forest (classification) | Species ~ . | n = 105
A ranked importance table with a colour gradient:
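A sketch of the corresponding call; the `"importance"` type string is an assumption following the naming pattern above, not a value shown elsewhere in this section:

```r
# Feature importance for the forest model; "importance" type assumed
tl_table(model_clf, type = "importance")
```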
Feature Importance
Top 4 features

| Feature | Importance |
|---|---|
| Petal.Length | 100.00 |
| Petal.Width | 93.33 |
| Sepal.Length | 27.43 |
| Sepal.Width | 10.52 |

tidylearn | forest (classification) | Species ~ . | n = 105
Cumulative variance is coloured green to highlight how many components are needed:
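Assuming the `pca` object used in the plotly example later in this vignette, the auto-selecting form of tl_table() documented above would render this table:

```r
# `pca` is the fitted PCA object used with tidy_pca_biplot() below
tl_table(pca) # auto-selects the appropriate table type
```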
PCA Variance Explained

| Component | Std. Dev. | Variance | Proportion | Cumulative |
|---|---|---|---|---|
| PC1 | 1.5749 | 2.4802 | 62.0% | 62.0% |
| PC2 | 0.9949 | 0.9898 | 24.7% | 86.8% |
| PC3 | 0.5971 | 0.3566 | 8.9% | 95.7% |
| PC4 | 0.4164 | 0.1734 | 4.3% | 100.0% |

tidylearn | pca | n = 50
A diverging red–blue colour scale highlights strong positive and negative loadings:
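A sketch of the call for the same `pca` object; the `"loadings"` type string is an assumption, not shown elsewhere in this section:

```r
# PCA loadings table; "loadings" type assumed
tl_table(pca, type = "loadings")
```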
PCA Loadings

| Variable | PC1 | PC2 | PC3 | PC4 |
|---|---|---|---|---|
| Murder | −0.536 | −0.418 | 0.341 | 0.649 |
| Assault | −0.583 | −0.188 | 0.268 | −0.743 |
| UrbanPop | −0.278 | 0.873 | 0.378 | 0.134 |
| Rape | −0.543 | 0.167 | −0.818 | 0.089 |

tidylearn | pca | n = 50
Cluster sizes and mean feature values:
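A sketch, assuming a k-means fit created with tidylearn's clustering interface; the constructor call is not shown in this section, so `model_km` is a placeholder name:

```r
# `model_km` stands in for a tidylearn k-means fit (3 clusters on iris)
tl_table(model_km) # auto-selects the cluster summary
```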
Cluster Summary
kmeans | 3 clusters

| Cluster | Size | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
|---|---|---|---|---|---|
| 1 | 50 | 5.01 | 3.43 | 1.46 | 0.25 |
| 2 | 38 | 6.85 | 3.07 | 5.74 | 2.07 |
| 3 | 62 | 5.90 | 2.75 | 4.39 | 1.43 |

tidylearn | kmeans | n = 150
Compare multiple models side-by-side:
m1 <- tl_model(split$train, Species ~ ., method = "logistic")
m2 <- tl_model(split$train, Species ~ ., method = "forest")
m3 <- tl_model(split$train, Species ~ ., method = "tree")
tl_table_comparison(
m1, m2, m3,
new_data = split$test,
names = c("Logistic", "Random Forest", "Decision Tree")
)

Model Comparison
3 models compared

| Metric | Logistic | Random Forest | Decision Tree |
|---|---|---|---|
| Accuracy | 0.0000 | 0.9556 | 0.8889 |

tidylearn | n = 45
Because all plot functions return ggplot2 objects, converting to interactive plotly charts is a one-liner:
library(plotly)
ggplotly(plot(model_reg, type = "actual_predicted"))
ggplotly(tidy_pca_biplot(pca, label_obs = TRUE))
ggplotly(tl_plot_regularization_path(model_lasso))

A typical reporting workflow combines plots and tables for the same model. Because the interface is consistent, the same pattern works regardless of the algorithm:
# Fit
model <- tl_model(split$train, Species ~ ., method = "forest")
# Evaluate
tl_table_metrics(model, new_data = split$test)

Model Evaluation Metrics

| Metric | Value |
|---|---|
| Accuracy | 0.9333 |

tidylearn | forest (classification) | Species ~ . | n = 105
The corresponding feature-importance table:

Feature Importance
Top 4 features

| Feature | Importance |
|---|---|
| Petal.Length | 100.00 |
| Petal.Width | 94.05 |
| Sepal.Length | 33.96 |
| Sepal.Width | 12.28 |

tidylearn | forest (classification) | Species ~ . | n = 105
Swap method = "forest" for method = "tree"
or method = "svm" and the reporting code above works
without modification.