# 04.Shrinkage Methods

## Shrinkage methods

- Why? Some variables may be redundant, so we shrink their coefficients to get a simpler model.

### Ridge Regression
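
For reference, a sketch of the usual ridge criterion, which adds a squared-size penalty to the RSS. The notation ($N$ observations, $p$ predictors $x_{ij}$, intercept $\beta_0$, penalty strength $\lambda \ge 0$) is assumed here, since these notes do not fix it:

$$
\hat\beta^{\text{ridge}} = \underset{\beta}{\operatorname{argmin}} \left\{ \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\}
$$

Larger $\lambda$ means more shrinkage; $\lambda = 0$ recovers ordinary least squares.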

### Lasso

- A sufficiently small constraint $t$ forces some of the coefficients to be exactly 0: this is **variable selection**, and it produces a **sparse model**.
- Convex optimization problem.
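
To make the role of $t$ concrete, the lasso can be written in constraint form as below. The notation ($N$, $p$, $x_{ij}$, $\beta_0$) is an assumption, as the notes do not define it:

$$
\hat\beta^{\text{lasso}} = \underset{\beta}{\operatorname{argmin}} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} \lvert \beta_j \rvert \le t
$$

If $t$ is at least as large as $\sum_j \lvert \hat\beta_j^{\text{ls}} \rvert$ for the least-squares fit, the constraint is inactive and nothing is shrunk.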

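**Why does lasso lead to coefficients that are exactly 0?**

The reason is easy to spot once you plot the constraint region together with the RSS contours (Fig. 3.11). The lasso region is a diamond with corners on the axes, so the contours often first touch it at a corner, where one coefficient is exactly 0; the ridge region is a disk with no corners, so this almost never happens.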
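
The same effect can be seen numerically in a simplified setting. The sketch below assumes an orthonormal design, where each coefficient can be handled on its own: lasso then soft-thresholds the least-squares estimate, while ridge only rescales it. The values in `ols` and `lam` are made up for illustration, and the exact scaling of `lam` depends on how the penalty is parametrized.

```python
# Toy illustration, assuming an orthonormal design so that each
# coefficient is shrunk independently. `ols` and `lam` are hypothetical.

def soft_threshold(b, lam):
    """Lasso for one coefficient: move toward 0 by lam, stop at 0."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def ridge_shrink(b, lam):
    """Ridge for one coefficient: proportional shrinkage, never exactly 0."""
    return b / (1.0 + lam)


ols = [3.0, -1.5, 0.4, -0.2]  # hypothetical least-squares estimates
lam = 0.5                     # hypothetical penalty strength

lasso = [soft_threshold(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print("lasso:", lasso)
print("ridge:", ridge)
```

With these inputs the two smallest estimates fall inside the threshold and become exactly 0 under lasso, while ridge keeps all four coefficients nonzero.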

### Compare Lasso and Ridge

When the true model is sparse, lasso tends to do better. Otherwise, lasso can fit worse than ridge.

There is no general rule of thumb for choosing between them.

### Generalization

Ridge and lasso can be generalized by replacing the penalty with a general power of the coefficient sizes, $\sum_j \lvert \beta_j \rvert^q$ with $q \ge 0$.

- $q=0$: subset selection
- $q=1$: lasso
- $q=2$: ridge

Smaller $q$ pulls the constraint contour toward the axes, which leads to tighter selection.
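
A minimal sketch of the generalized penalty and of its unit contour in the first quadrant. The function names and the example vector are hypothetical; $q = 0$ is special-cased as a count of nonzero coefficients, since Python evaluates `0.0 ** 0` as 1.

```python
def lq_penalty(beta, q):
    """Sum of |beta_j|^q; for q == 0, count the nonzero coefficients."""
    if q == 0:
        return sum(1 for b in beta if b != 0)
    return sum(abs(b) ** q for b in beta)


def contour_height(x, q):
    """beta_j on the contour beta_i^q + beta_j^q = 1, for 0 <= x <= 1 and q > 0."""
    return (1.0 - x ** q) ** (1.0 / q)


beta = [3.0, 0.0, -4.0]  # hypothetical coefficient vector

for q in (0, 1, 2):
    print("q =", q, "penalty =", lq_penalty(beta, q))

# Height of the unit contour at beta_i = 0.5 for several q.
heights = [contour_height(0.5, q) for q in (0.5, 1, 2, 4)]
print(heights)
```

The contour height at a fixed $\beta_i$ grows with $q$, so small $q$ gives a region that hugs the axes and large $q$ gives one that bulges toward the corner of the unit square.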

{% highlight text %}
(* Contours x^q + y^q = 1 in the first quadrant for q = 0.5, 1, ..., 4 *)
Plot[
  Evaluate@Table[(1 - x^q)^(1/q), {q, 0.5, 4, 0.5}],
  {x, 0, 1},
  AspectRatio -> 1,
  Frame -> True,
  PlotLegends -> Placed[Table["q=" <> ToString@q, {q, 0.5, 4, 0.5}], {Left, Bottom}],
  PlotLabel -> "Shrinkage as function of L-q norm distance",
  FrameLabel -> {Subscript["\[Beta]", "i"], Subscript["\[Beta]", "j"]}
]
{% endhighlight %}
