SLRpackage_AcceleratedCV_matlab
swMATH ID: 
43993

Software Authors: 
Obuchi, Tomoyuki; Sakata, Ayaka

Description: 
Cross validation in sparse linear regression with piecewise continuous nonconvex penalties and its acceleration. We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the smoothly clipped absolute deviation (SCAD) penalty. The contributions of this study are threefold. First, we present a theoretical analysis of the typical reconstruction performance, using the replica method, under the assumption that each component of the design matrix is an independent and identically distributed (i.i.d.) Gaussian variable. This clarifies the superiority of the SCAD estimator over \(\ell_1\) in a wide parameter range, although the nonconvex nature of the penalty tends to lead to solution multiplicity in certain regions. This multiplicity is shown to be connected to replica symmetry breaking in spin-glass theory, and the associated phase diagrams are given. We also show that the global minimum of the mean square error between the estimator and the true signal is located in the replica symmetric phase. Second, we develop an approximate formula that efficiently computes the cross-validation error without actually conducting the cross-validation, and which is also applicable to non-i.i.d. design matrices. It is shown that this formula is applicable only in the unique solution region and tends to be unstable in the multiple solution region. We implement instability detection procedures, which allow the approximate formula to stand alone and consequently enable us to draw phase diagrams for any specific dataset. Third, we propose an annealing procedure, called nonconvexity annealing, to obtain the solution path efficiently. Numerical simulations are conducted on simulated datasets to examine these results and to verify the consistency of the theoretical results and the efficiency of the approximate formula and nonconvexity annealing.
The characteristic behaviour of the annealed solution in the multiple solution region is addressed. Another numerical experiment on a real-world dataset of Type Ia supernovae is conducted; its results are consistent with those of earlier studies using the \(\ell_0\) formulation. A MATLAB package of numerical codes implementing the estimation of the solution path using annealing with respect to \(\lambda\), in conjunction with the approximate CV formula and the instability detection routine, is distributed in Obuchi (2019, https://github.com/TObuchi/SLRpackage_AcceleratedCV_matlab).
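For orientation, the SCAD penalty named in the description can be sketched as follows. This is a minimal illustration that assumes the standard Fan and Li parameterization with switching parameter a > 1 (a = 3.7 is a common default in the literature); the function name `scad_penalty` is hypothetical, and the code is not taken from the MATLAB package, whose own conventions may differ.

```python
def scad_penalty(x, lam, a=3.7):
    """SCAD penalty for a scalar x (assumed standard Fan-Li form).

    lam > 0 sets the l1-like slope near zero; a > 1 sets where the
    penalty becomes flat (at |x| = a * lam).
    """
    if lam <= 0 or a <= 1:
        raise ValueError("require lam > 0 and a > 1")
    ax = abs(x)
    if ax <= lam:
        # Region 1: identical to the l1 penalty.
        return lam * ax
    if ax <= a * lam:
        # Region 2: concave quadratic joining the l1 part to the plateau.
        return (2 * a * lam * ax - ax * ax - lam * lam) / (2 * (a - 1))
    # Region 3: constant, so large coefficients are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for x in (0.5, 2.0, 10.0):
        print(x, scad_penalty(x, lam=1.0))
```

The three branches meet continuously at |x| = lam and |x| = a * lam. The flat third region is what reduces the bias of large coefficients relative to \(\ell_1\), and the concave middle region is the source of the nonconvexity, and hence of the solution multiplicity, discussed in the description.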
Homepage: 
https://arxiv.org/abs/1902.10375

Source Code: 
https://github.com/TObuchi/SLRpackage_AcceleratedCV_matlab

Dependencies: 
Matlab 
Keywords: 
sparse linear regression;
compressed sensing;
replica method;
cross-validation;
nonconvex penalty

Related Software: 
glmnet;
Matlab

Cited in: 
1 Document
