startFromSparse • regCtsem

Alpha-status function. Starts the regularized model from a setting where all parameters are already at their target values.

Usage

startFromSparse(
  ctsemObject,
  dataset,
  regIndicators,
  targetVector = NULL,
  lambdasAutoLength = 50,
  lambdasAutoCurve = 10,
  penalty = "lasso",
  adaptiveLassoWeights = NULL,
  adaptiveLassoPower = -1,
  cvSample = NULL,
  autoCV = "No",
  k = 5,
  subjectSpecificParameters = NULL,
  standardizeDrift = "No",
  scaleLambdaWithN = TRUE,
  returnFitIndices = TRUE,
  BICWithNAndT = FALSE,
  optimization = "exact",
  optimizer = "GIST",
  control = list(),
  verbose = 0,
  trainingWheels = TRUE,
  nMultistart = 3,
  fitFull = TRUE,
  optimizeRegCtsem = TRUE
)

Arguments

ctsemObject: Fitted object of class ctsemFit
dataset: Data set in wide format compatible with ctsemOMX
regIndicators: Labels of the regularized parameters (e.g. drift_eta1_eta2).
targetVector: Named vector with the values towards which the parameters are regularized. By default, parameters are regularized towards zero.
lambdasAutoLength: Number of lambda values tested.
lambdasAutoCurve: It is often a good idea to use unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If lambdasAutoCurve is close to 1, lambda values will be equally spaced; if lambdasAutoCurve is large, lambda values will be more concentrated close to 0. See ?getCurvedLambda for more information.
penalty: Currently supported are lasso, ridge and adaptiveLasso
adaptiveLassoWeights: weights for the adaptive lasso. Defaults to 1/(|theta|^adaptiveLassoPower), where theta is the maximum likelihood estimate of the regularized parameters.
adaptiveLassoPower: power for the adaptive lasso weights. The weights will be set to 1/(|theta|^adaptiveLassoPower).
cvSample: cross-validation sample. Has to be in wide format and compatible with ctsemOMX
autoCV: Should automatic cross-validation be used? Possible values are "No", "kFold" or "Blocked". "kFold" splits the dataset into k groups by selecting independent units from the rows. "Blocked" is a within-unit split, where blocks of observations are deleted for each person. For a more detailed explanation, see Bulteel, K., Mestdagh, M., Tuerlinckx, F., & Ceulemans, E. (2018). VAR(1) based models do not always outpredict AR(1) models in typical psychological applications. Psychological Methods, 23(4), 740–756. https://doi.org/10.1037/met0000178
k: number of cross-validation folds if autoCV = "kFold" or autoCV = "Blocked"
subjectSpecificParameters: EXPERIMENTAL! A vector of parameter labels for parameters which should be estimated person-specific. If these parameter labels are also passed to regIndicators, all person-specific parameters will be regularized towards a group-parameter. This is a 2-step-procedure: In step 1 all parameters are constrained to equality between individuals to estimate the group parameters. In step 2 the parameters are estimated person-specific, but regularized towards the group parameter from step 1.
standardizeDrift: Should Drift parameters be standardized automatically? Set to 'No' for no standardization, 'T0VAR' for standardization using the T0VAR or 'asymptoticDiffusion' for standardization using the asymptotic diffusion
scaleLambdaWithN: Boolean: Should the penalty value be scaled with the sample size? TRUE is recommended because the likelihood also depends on the sample size.
returnFitIndices: Boolean: should fit indices be returned?
BICWithNAndT: Boolean: TRUE = Use N and T in the formula for the BIC (-2log L + log(N+T)*k, where k is the number of parameters in the model). FALSE = Use only N in the formula for the BIC (-2log L + log(N)*k). Defaults to FALSE.
optimization: Which optimization procedure should be used. Possible values are "exact" or "approx". "exact" is recommended for sparsity-inducing penalty functions (lasso and adaptive lasso).
optimizer: for exact optimization: Either GIST or GLMNET. When using optimization = "approx", Rsolnp or any of the optimizers in optimx can be used. See ?optimx
control: List with control arguments for the optimizer. See ?controlGIST, ?controlGLMNET and ?controlApprox for the respective parameters
verbose: 0 (default), 1 for convergence plot, 2 for parameter convergence plot and line search progress.
trainingWheels: If set to FALSE all bells and whistles used to keep regCtsem on track are turned off (no multiple starting values, no initial optimization with solnp or optimx). The focus is speed instead of accuracy. This might work in simulated data, but is NOT recommended with real data. The optimizer is quite likely to get stuck in local minima.
nMultistart: number of additional tries when optimizing the models
fitFull: boolean: Only used for adaptiveLASSO weights. Should the full model (model without regularization) be fitted (TRUE) or only approximated (FALSE). Approximation might work sometimes, but not always.
optimizeRegCtsem: If set to FALSE, the function will only return lambda_max, a vector with the sparse parameter values, and a vector with the parameters of the full, unregularized model.
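
The two BIC variants described for BICWithNAndT can be illustrated with a minimal base R sketch. The helper bicSketch and its argument names are hypothetical and not part of regCtsem. The sketch assumes that T is the number of time points and that the penalty term is multiplied by the number of parameters k in both variants.

```r
# Hypothetical helper (not part of regCtsem): computes the BIC from a
# -2 log-likelihood value using either N or N + T in the penalty term.
bicSketch <- function(m2LL, k, N, nTimePoints, BICWithNAndT = FALSE) {
  # Sample-size term: N + T if BICWithNAndT is TRUE, otherwise N only
  n <- if (BICWithNAndT) N + nTimePoints else N
  m2LL + log(n) * k
}

# Same fit, two penalty definitions (values are made up for illustration)
bicN  <- bicSketch(m2LL = 1000, k = 6, N = 100, nTimePoints = 10)
bicNT <- bicSketch(m2LL = 1000, k = 6, N = 100, nTimePoints = 10,
                   BICWithNAndT = TRUE)
```

Because log(N + T) is larger than log(N), the variant with BICWithNAndT = TRUE penalizes each additional parameter slightly more.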

Details

NOTE: Function located in file utils.R
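
The row-wise split described for autoCV = "kFold" can be pictured with the following base R sketch. It is only an illustration of the idea and not the code used by regCtsem; the helper kFoldRowsSketch is hypothetical, and it assumes that each row of the wide-format dataset holds one independent unit. The within-person "Blocked" split is not covered here.

```r
# Hypothetical helper (not regCtsem's implementation): assigns every row
# of a wide-format dataset to exactly one of k folds.
kFoldRowsSketch <- function(nRows, k = 5) {
  stopifnot(k >= 2, k <= nRows)
  # Balanced fold labels 1..k, shuffled so that the assignment is random
  sample(rep(seq_len(k), length.out = nRows))
}

set.seed(1)
folds <- kFoldRowsSketch(nRows = 23, k = 5)

# Rows held out as test set and rows used for fitting in the first fold
testRows  <- which(folds == 1)
trainRows <- which(folds != 1)
```

With 23 rows and k = 5, three folds contain five rows and two folds contain four, so every unit is held out exactly once across the five folds.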

Author

Jannik Orzek