**stabs** implements resampling procedures to assess the
stability of selected variables with additional finite sample error
control for high-dimensional variable selection procedures such as Lasso
or boosting. Both standard stability selection (Meinshausen &
Bühlmann, 2010, doi:10.1111/j.1467-9868.2010.00740.x)
and complementary pairs stability selection with improved error bounds
(Shah & Samworth, 2013, doi:10.1111/j.1467-9868.2011.01034.x)
are implemented. The package can be combined with arbitrary
user-specified variable selection approaches.
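
As a rough illustration of the resampling idea, the following base R sketch computes selection frequencies over complementary pairs of subsamples. It is not the implementation used in **stabs**: the selector `select_topq()`, the helper `selection_freq()` and the simulated data are assumptions made for this example, and no error bound is computed.

```r
## Illustrative base selector (an assumption, not part of stabs): indices of
## the q predictors with the largest absolute correlation with y.
select_topq <- function(x, y, q) {
    order(abs(drop(cor(x, y))), decreasing = TRUE)[seq_len(q)]
}

## Relative selection frequencies over B complementary pairs of subsamples,
## each of size floor(n / 2).
selection_freq <- function(x, y, q, B = 50) {
    n <- nrow(x)
    m <- floor(n / 2)
    counts <- numeric(ncol(x))
    for (b in seq_len(B)) {
        ## two disjoint halves of a random permutation form one pair
        perm <- sample(n)
        pair <- list(perm[seq_len(m)], perm[m + seq_len(m)])
        for (idx in pair) {
            sel <- select_topq(x[idx, , drop = FALSE], y[idx], q)
            counts[sel] <- counts[sel] + 1
        }
    }
    setNames(counts / (2 * B), colnames(x))
}

## simulated data: only the first two of ten predictors carry signal
set.seed(1234)
x <- matrix(rnorm(100 * 10), ncol = 10,
            dimnames = list(NULL, paste0("V", 1:10)))
y <- 3 * x[, 1] + 3 * x[, 2] + rnorm(100)
freq <- selection_freq(x, y, q = 2)
## variables whose selection frequency reaches a cutoff of 0.75
names(freq)[freq >= 0.75]
```

Variables that are selected in most subsamples are considered stable; `stabsel()` additionally links the cutoff to a bound on the per-family error rate (PFER).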

For an expanded and executable version of this file please see

`vignette("Using_stabs", package = "stabs")`

- Current version (from CRAN):

`install.packages("stabs")`

- Latest development version from GitHub:

```
library("devtools")
install_github("hofnerb/stabs")
```

To be able to use the `install_github()` command, one
needs to install **devtools** first:

`install.packages("devtools")`

A simple example of how to use **stabs** with package
**lars**:

```
library("stabs")
library("lars")
## make data set available
data("bodyfat", package = "TH.data")
## set seed
set.seed(1234)
## lasso
(stab.lasso <- stabsel(x = bodyfat[, -2], y = bodyfat[,2],
                       fitfun = lars.lasso, cutoff = 0.75,
                       PFER = 1))
## stepwise selection
(stab.stepwise <- stabsel(x = bodyfat[, -2], y = bodyfat[,2],
                          fitfun = lars.stepwise, cutoff = 0.75,
                          PFER = 1))
## plot results
par(mfrow = c(2, 1))
plot(stab.lasso, main = "Lasso")
plot(stab.stepwise, main = "Stepwise Selection")
```

We can see that stepwise selection seems to be quite unstable even in this low-dimensional example!

To use **stabs** with user-specified functions, one can
specify a custom `fitfun`. Such a function needs to take the arguments
`x` (the predictors), `y` (the outcome) and `q` (the number of selected
variables as defined for stability selection). Additional arguments to
the variable selection method can be handled by `...`. In the function
`stabsel()` these can then be specified as a named list which is given to
`args.fitfun`.

The `fitfun` function then needs to return a named list
with two elements `selected` and `path`:

- `selected` is a vector that indicates which variable was
  selected.
- `path` is a matrix that indicates which variable
  was selected in which step. Each row represents one variable, the
  columns represent the steps.

The latter is optional and only needed to
draw the complete selection paths.
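
A minimal sketch of such a function might look as follows. The function `cor.topq()`, its extra argument `method` and the simulated data are hypothetical and only serve as an illustration; they are not part of **stabs**. The function selects the `q` predictors with the largest absolute correlation with the outcome and returns both `selected` and `path` in the format described above.

```r
## Hypothetical fitfun (illustration only, not part of stabs): select the
## q predictors with the largest absolute correlation with the outcome.
## 'method' is an extra argument that can be set via args.fitfun.
cor.topq <- function(x, y, q, method = "pearson", ...) {
    if (is.data.frame(x))
        x <- model.matrix(~ . - 1, x)
    nsel <- min(q, ncol(x))
    ## rank predictors by absolute correlation with y (rank 1 = strongest)
    score <- abs(drop(cor(x, y, method = method)))
    rk <- rank(-score, ties.method = "first")
    ## logical vector indicating the selected variables
    selected <- rk <= nsel
    names(selected) <- colnames(x)
    ## path: rows = variables, columns = steps; entry [i, k] is TRUE if
    ## variable i is among the k top-ranked variables
    path <- outer(rk, seq_len(nsel), "<=")
    rownames(path) <- colnames(x)
    list(selected = selected, path = path)
}

## simulated data: only the first two of ten predictors carry signal
set.seed(1234)
x <- matrix(rnorm(100 * 10), ncol = 10,
            dimnames = list(NULL, paste0("V", 1:10)))
y <- 3 * x[, 1] + 3 * x[, 2] + rnorm(100)
res <- cor.topq(x, y, q = 2)
names(which(res$selected))

## use it in stabsel(), passing 'method' through args.fitfun
if (requireNamespace("stabs", quietly = TRUE)) {
    stab <- stabs::stabsel(x = x, y = y, fitfun = cor.topq,
                           args.fitfun = list(method = "spearman"),
                           cutoff = 0.75, PFER = 1)
    print(stab)
}
```

The call to `stabsel()` is guarded by `requireNamespace()` so that the sketch also runs when **stabs** is not installed.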

The following example shows how `lars.lasso` is
implemented:

```
lars.lasso <- function(x, y, q, ...) {
    if (!requireNamespace("lars"))
        stop("Package ", sQuote("lars"), " needed but not available")

    if (is.data.frame(x)) {
        message("Note: ", sQuote("x"),
                " is coerced to a model matrix without intercept")
        x <- model.matrix(~ . - 1, x)
    }

    ## fit model
    fit <- lars::lars(x, y, max.steps = q, ...)

    ## which coefficients are non-zero?
    selected <- unlist(fit$actions)
    ## check if variables are removed again from the active set
    ## and remove these from selected
    if (any(selected < 0)) {
        idx <- which(selected < 0)
        idx <- c(idx, which(selected %in% abs(selected[idx])))
        selected <- selected[-idx]
    }

    ret <- logical(ncol(x))
    ret[selected] <- TRUE
    names(ret) <- colnames(x)
    ## compute selection paths
    cf <- fit$beta
    sequence <- t(cf != 0)
    ## return both
    return(list(selected = ret, path = sequence))
}
```

To see more examples simply print, e.g., `lars.stepwise`,
`glmnet.lasso`, or `glmnet.lasso_maxCoef`. Please
contact me if you need help to integrate your method of choice.

Instead of specifying a fitting function, one can also use
`stabsel` directly on computed boosting models from **mboost**.

```
library("stabs")
library("mboost")
### low-dimensional example
## make data set available (if not already loaded)
data("bodyfat", package = "TH.data")
mod <- glmboost(DEXfat ~ ., data = bodyfat)

## compute cutoff ahead of running stabsel to see if it is a sensible
## parameter choice.
## p = ncol(bodyfat) - 1 (= Outcome) + 1 ( = Intercept)
stabsel_parameters(q = 3, PFER = 1, p = ncol(bodyfat) - 1 + 1,
sampling.type = "MB")
## the same:
stabsel(mod, q = 3, PFER = 1, sampling.type = "MB", eval = FALSE)
## now run stability selection
(sbody <- stabsel(mod, q = 3, PFER = 1, sampling.type = "MB"))
opar <- par(mai = par("mai") * c(1, 1, 1, 2.7))
plot(sbody, type = "paths")
par(opar)
plot(sbody, type = "maxsel", ymargin = 6)
```
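
The relation between `q`, `PFER`, `p` and the cutoff can be checked by hand for `sampling.type = "MB"`. The following sketch assumes the error bound of Meinshausen & Bühlmann (2010), PFER <= q^2 / ((2 * cutoff - 1) * p), and solves it for the missing quantity. The helper names `mb_cutoff()` and `mb_pfer()` are made up for this illustration, and the value `p = 10` assumes that `bodyfat` has ten columns; `stabsel_parameters()` may apply additional rounding, and the bounds for complementary pairs sampling differ.

```r
## Hypothetical helpers (not part of stabs), based on the bound
##   PFER <= q^2 / ((2 * cutoff - 1) * p)
## of Meinshausen & Buehlmann (2010).

## smallest cutoff that guarantees the requested PFER (capped at 1)
mb_cutoff <- function(q, PFER, p) {
    min(1, (q^2 / (PFER * p) + 1) / 2)
}

## upper bound for the PFER implied by a given cutoff in (0.5, 1]
mb_pfer <- function(q, cutoff, p) {
    stopifnot(cutoff > 0.5, cutoff <= 1)
    q^2 / ((2 * cutoff - 1) * p)
}

## setting of the example above: q = 3, PFER = 1, assuming p = 10
cutoff <- mb_cutoff(q = 3, PFER = 1, p = 10)
cutoff
mb_pfer(q = 3, cutoff = cutoff, p = 10)
```

If the computed cutoff has to be capped at 1, the requested PFER cannot be guaranteed with this `q`, which is a hint to choose a smaller `q` or a larger PFER.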

To cite the package in publications please use

`citation("stabs")`

which will currently give you

```
To cite package 'stabs' in publications use:

  Benjamin Hofner and Torsten Hothorn (2021). stabs: Stability
  Selection with Error Control, R package version 0.6-4,
  https://CRAN.R-project.org/package=stabs.

  Benjamin Hofner, Luigi Boccuto and Markus Goeker (2015). Controlling
  false discoveries in high-dimensional situations: Boosting with
  stability selection. BMC Bioinformatics, 16:144.
  doi:10.1186/s12859-015-0575-3

To cite the stability selection for 'gamboostLSS' models use:

  Thomas, J., Mayr, A., Bischl, B., Schmid, M., Smith, A., and Hofner,
  B. (2017). Gradient boosting for distributional regression -
  faster tuning and improved variable selection via noncyclical updates.
  Statistics and Computing. Online First. DOI 10.1007/s11222-017-9754-6

Use ‘toBibtex(citation("stabs"))’ to extract BibTeX references.
```

To obtain BibTeX references use

`toBibtex(citation("stabs"))`