Implements elastic net regularization for general purpose optimization problems. The penalty function is given by: $$p( x_j) = \alpha\lambda| x_j| + (1-\alpha)\lambda x_j^2$$ Note that the elastic net combines ridge and lasso regularization. If \(\alpha = 0\), the elastic net reduces to ridge regularization. If \(\alpha = 1\), it reduces to lasso regularization. For values in between, the elastic net is a compromise between the shrinkage of the lasso and that of the ridge penalty.
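As a rough illustration, the penalty for individual parameter values can be written as a small R function. This sketch assumes the form `alpha * lambda * |x| + (1 - alpha) * lambda * x^2`, which matches the comparison function used in the Examples section. `elasticNetPenalty` is a hypothetical helper and not part of lessSEM.

```r
# Hypothetical helper (not part of lessSEM): elastic net penalty for
# parameter values x, assuming the form
# alpha * lambda * |x| + (1 - alpha) * lambda * x^2
elasticNetPenalty <- function(x, lambda, alpha) {
  stopifnot(alpha >= 0, alpha <= 1, lambda >= 0)
  alpha * lambda * abs(x) + (1 - alpha) * lambda * x^2
}

x <- c(-2, 0, 0.5)
ridgeOnly <- elasticNetPenalty(x, lambda = 0.1, alpha = 0)   # lambda * x^2
lassoOnly <- elasticNetPenalty(x, lambda = 0.1, alpha = 1)   # lambda * |x|
mixed     <- elasticNetPenalty(x, lambda = 0.1, alpha = 0.5) # equal mix of both
```

With `alpha = 0.5`, each parameter receives half of the lasso penalty plus half of the ridge penalty.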

## Usage

```
gpElasticNet(
par,
regularized,
fn,
gr = NULL,
lambdas,
alphas,
...,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
```

## Arguments

- par
labeled vector with starting values

- regularized
vector with names of parameters which are to be regularized.

- fn
R function which takes the parameters AND their labels as input and returns the fit value (a single value)

- gr
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients

- lambdas
numeric vector: values for the tuning parameter lambda

- alphas
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso.

- ...
additional arguments passed to fn and gr

- method
which optimizer should be used? Currently implemented are ista and glmnet.

- control
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details.
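To illustrate the `par`, `regularized`, `fn`, and `...` arguments, the following sketch sets up a labeled parameter vector and a fitting function that accesses its parameters by label. The names `a`, `b`, `target`, and `quadraticFit` are made up for this example and do not come from lessSEM.

```r
# Hypothetical objective: a quadratic with its minimum at the target values.
quadraticFit <- function(par, target) {
  # par arrives as a labeled vector, so elements can be accessed by name
  fit <- (par["a"] - target["a"])^2 + (par["b"] - target["b"])^2
  unname(fit) # return a single, unnamed fit value
}

par <- c(a = 0.5, b = 0.5)   # labeled starting values
regularized <- "b"           # only b is to be penalized
target <- c(a = 1, b = 0)    # additional argument, to be passed through ...

fitValue <- quadraticFit(par, target = target)
```

Every entry of `regularized` should correspond to a label in `par`; parameters not listed there (here `a`) would remain unpenalized.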

## Details

The interface is similar to that of optim. Users have to supply a vector
with starting values (important: this vector *must* have labels) and a fitting
function. This fitting function *must* take a labeled vector with parameter
values as its first argument. The remaining arguments are passed with the ... argument,
as in optim.

The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
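A gradient function for the scaled residual sum of squares used in the Examples section could look as follows. This is a minimal sketch: `gradientFunction` is a hypothetical name, returning the gradients with the parameter labels attached is an assumption about what the optimizer expects, and the finite-difference check uses base R only.

```r
set.seed(1)
N <- 50
X <- matrix(rnorm(N * 3), nrow = N)
y <- X %*% c(1, 0, -1) + rnorm(N, 0, 0.2)

# Objective: (0.5 / N) * residual sum of squares
fittingFunction <- function(par, y, X, N) {
  pred <- X %*% matrix(par, ncol = 1)
  (0.5 / N) * sum((y - pred)^2)
}

# Analytic gradient: -(1 / N) * t(X) %*% (y - X %*% par)
gradientFunction <- function(par, y, X, N) {
  resid <- y - X %*% matrix(par, ncol = 1)
  gradients <- c(-(1 / N) * t(X) %*% resid)
  names(gradients) <- names(par) # keep the parameter labels (assumption)
  gradients
}

par <- c(b1 = 0.2, b2 = -0.1, b3 = 0.3)

# Central finite differences as a sanity check of the analytic gradient
h <- 1e-6
numericGradient <- sapply(seq_along(par), function(j) {
  step <- replace(numeric(length(par)), j, h)
  (fittingFunction(par + step, y, X, N) -
     fittingFunction(par - step, y, X, N)) / (2 * h)
})
maxDifference <- max(abs(numericGradient - gradientFunction(par, y, X, N)))
```

Such a function would be supplied through the `gr` argument and receives the same additional arguments (`y`, `X`, `N`) as `fn`.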

For more details on elastic net regularization, see:

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x

For more details on GLMNET, see:

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01

Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.

Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421

For more details on ISTA, see:

Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542

Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.

Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
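To give an intuition for what an ISTA-type optimizer does, the following base-R sketch applies the basic iterative shrinkage-thresholding step to a lasso-penalized linear regression. It is a textbook illustration only and not the implementation used in lessSEM; `softThreshold` and `istaLasso` are hypothetical names, and the fixed step size 1 / L is a simplifying assumption.

```r
# Proximal operator of the lasso penalty (soft-thresholding)
softThreshold <- function(z, threshold) {
  sign(z) * pmax(abs(z) - threshold, 0)
}

# Basic ISTA for (0.5 / N) * sum((y - X %*% b)^2) + lambda * sum(abs(b))
istaLasso <- function(X, y, lambda, nIter = 500) {
  N <- nrow(X)
  # step size 1 / L, with L the largest eigenvalue of the Hessian t(X) %*% X / N
  L <- max(eigen(crossprod(X) / N, symmetric = TRUE, only.values = TRUE)$values)
  b <- rep(0, ncol(X))
  for (i in seq_len(nIter)) {
    gradient <- -crossprod(X, y - X %*% b) / N          # gradient of the smooth part
    b <- softThreshold(b - c(gradient) / L, lambda / L) # gradient step, then shrinkage
  }
  b
}

set.seed(123)
N <- 100
X <- matrix(rnorm(N * 5), nrow = N)
y <- X %*% c(1, 1, 0, 0, 0) + rnorm(N, 0, 0.2)
bLasso <- istaLasso(X, y, lambda = 0.1)
```

Each iteration takes a gradient step on the unpenalized fit function and then shrinks the result towards zero, which is what allows parameters to be set to exactly zero.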

## Examples

```
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
  # par is the parameter vector
  # y is the observed dependent variable
  # X is the design matrix
  # N is the sample size
  pred <- X %*% matrix(par, ncol = 1) # be explicit here:
  # we need par to be a column vector
  sse <- sum((y - pred)^2)
  # we scale with .5/N to get the same results as glmnet
  return((.5/N)*sse)
}
# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
elasticNetPen <- gpElasticNet(
  par = b,
  regularized = regularized,
  fn = fittingFunction,
  lambdas = seq(0,1,.1),
  alphas = c(0, .5, 1),
  X = X,
  y = y,
  N = N
)
# optional: plot requires plotly package
# plot(elasticNetPen)
# for comparison: add the penalty to the fitting function directly
# (sqrt(par^2 + 1e-8) is a smooth approximation of abs(par))
# and optimize with optim
fittingFunction <- function(par, y, X, N, lambda, alpha){
  pred <- X %*% matrix(par, ncol = 1)
  sse <- sum((y - pred)^2)
  return((.5/N)*sse + (1-alpha)*lambda * sum(par^2) + alpha*lambda *sum(sqrt(par^2 + 1e-8)))
}
round(
  optim(par = b,
        fn = fittingFunction,
        y = y,
        X = X,
        N = N,
        lambda = elasticNetPen@fits$lambda[15],
        alpha = elasticNetPen@fits$alpha[15],
        method = "BFGS")$par,
  4)
elasticNetPen@parameters[15,]
```