Group sparsity

Group lasso

$$\hat{\bm \beta}_\lambda = \arg\min_{\bm \beta} \|\bm Y - \bm X \bm \beta\|_2^2 + \lambda \sum_{g=1}^G \|\bm \beta_{\mathcal{I}_g}\|_2,$$

where $\mathcal{I}_g$ is the index set belonging to the $g$th group of variables, $g = 1, \ldots, G$.
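A minimal proximal-gradient sketch for this optimisation problem (the solver, variable names, and toy data below are illustrative, not from the original paper):

```python
import numpy as np

def group_soft_threshold(v, t):
    """Block soft-thresholding: the proximal operator of t * ||v||_2."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)          # the whole group is set to zero
    return (1.0 - t / norm) * v

def group_lasso(X, Y, groups, lam, n_iter=500):
    """Minimise ||Y - X b||_2^2 + lam * sum_g ||b_{I_g}||_2 by proximal gradient.
    `groups` is a list of index arrays I_g; a didactic sketch, not a tuned solver."""
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)    # 1/L for the smooth part
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = beta - step * 2.0 * X.T @ (X @ beta - Y)  # gradient step
        for idx in groups:
            beta[idx] = group_soft_threshold(z[idx], step * lam)
    return beta

# toy problem: two groups of three variables, the second group inactive
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
beta_true = np.array([1.0, -2.0, 1.5, 0.0, 0.0, 0.0])
Y = X @ beta_true + 0.1 * rng.standard_normal(100)
beta_hat = group_lasso(X, Y, [np.arange(3), np.arange(3, 6)], lam=10.0)
print(beta_hat)  # the whole second group is typically shrunk exactly to zero
```

Because the block soft-thresholding step can return an exact zero vector, entire groups drop out of the model at once, which is precisely the factor-level sparsity the penalty is designed for.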

  • This penalty can be viewed as an intermediate between the $\ell_1$- and $\ell_2$-type penalties.

The $\ell_1$ penalty treats each coordinate direction differently from other directions, which encourages sparsity in individual coefficients. The $\ell_2$ penalty treats all directions equally and does not encourage sparsity. The group lasso encourages sparsity at the factor (group) level.
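The contrast in sparsity patterns can be seen from the proximal (thresholding) operators of the three penalties on a toy vector; the threshold `t` and the vector are illustrative only:

```python
import numpy as np

t = 1.0                                  # illustrative threshold
v = np.array([0.4, -0.3, 3.0, -2.0])     # group 1: first two; group 2: last two

# l1 prox: per-coordinate soft-thresholding -> zeros individual coefficients
l1 = np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# squared-l2 (ridge) prox: uniform shrinkage -> no coefficient becomes exactly zero
l2 = v / (1.0 + t)

# group prox: block soft-thresholding -> zeros (or keeps) whole groups at once
def block_soft(u, t):
    nrm = np.linalg.norm(u)
    return np.zeros_like(u) if nrm <= t else (1.0 - t / nrm) * u

grp = np.concatenate([block_soft(v[:2], t), block_soft(v[2:], t)])
print(l1, l2, grp, sep="\n")
```

The $\ell_1$ operator zeroes small coordinates one at a time, the ridge operator only shrinks, and the group operator zeroes the small group as a unit while keeping the large group intact.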

  • The estimates have the attractive property of being **invariant under groupwise orthogonal reparameterizations (transformations)**, like ridge regression.
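The invariance follows because an orthogonal transformation within a group preserves the group's $\ell_2$ norm: replacing $\bm X_{\mathcal{I}_g}$ by $\bm X_{\mathcal{I}_g} \bm Q$ with $\bm Q$ orthogonal turns $\bm \beta_{\mathcal{I}_g}$ into $\bm Q^\top \bm \beta_{\mathcal{I}_g}$, whose norm is unchanged. A quick numerical sketch of this fact:

```python
import numpy as np

rng = np.random.default_rng(1)
beta_g = rng.standard_normal(4)          # coefficients of one group

# random orthogonal matrix Q via QR decomposition
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# the penalty term ||beta_g||_2 equals ||Q^T beta_g||_2 after reparameterization
print(np.linalg.norm(beta_g), np.linalg.norm(Q.T @ beta_g))
```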

Group LARS

$$\|\bm M\|_{2,1} = \sum_{i=1}^d \|\bm M_i\|_2,$$

where $\bm M_i$ denotes the $i$th row of $\bm M$.
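As a sanity check, the $\ell_{2,1}$ norm is simply the sum of rowwise $\ell_2$ norms (a small illustrative snippet):

```python
import numpy as np

M = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

# l_{2,1}: l2 norm of each row, then summed (an l1 norm across rows)
norm_21 = np.linalg.norm(M, axis=1).sum()
print(norm_21)  # 5.0 + 0.0 + 1.0 = 6.0
```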

Group non-negative garrotte

| | group lasso | group LARS | group non-negative garrotte |
| --- | --- | --- | --- |
| performance | excellent | excellent | comparable |
| computational efficiency | intensive in large-scale problems | quick | fastest |
| applicability | | | sub-optimal when $p \rightarrow n$; not applicable when $p > n$ |

Related constraints

Elastic net: Under the elastic net, highly correlated features receive similar weights. This grouping effect is a consequence of the strict convexity contributed by the $\ell_2$ norm.
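A toy illustration of the grouping effect, assuming scikit-learn is available (the data and hyperparameters are arbitrary):

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal(n)
x1 = z + 0.01 * rng.standard_normal(n)   # x1 and x2 are near-duplicates
x2 = z + 0.01 * rng.standard_normal(n)
x3 = rng.standard_normal(n)
X = np.column_stack([x1, x2, x3])
y = 3.0 * z + x3 + 0.1 * rng.standard_normal(n)

enet = ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000).fit(X, y)
print(enet.coef_)  # the two correlated features receive nearly equal weights
```

A pure lasso, lacking the strictly convex $\ell_2$ term, may instead split the weight arbitrarily between two nearly identical columns.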


References

  1. Yuan, Ming, and Yi Lin. “Model selection and estimation in regression with grouped variables.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68.1 (2006): 49-67.
  2. Zou, Hui, and Trevor Hastie. “Regularization and variable selection via the elastic net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67.2 (2005): 301-320.