Regularization: L1, L2, and Beyond
machine-learning, regularization, overfitting, optimization
Question
Implement and compare different regularization techniques to prevent overfitting in machine learning models.
Part 1: Implement L1 (Lasso) Regularization
import numpy as np


class LassoRegression:
    """
    Linear regression with L1 (Lasso) regularization.
    Penalty: α * Σ|w|
    """

    def __init__(self, alpha=1.0, learning_rate=0.01, num_iterations=1000):
        """
        Args:
            alpha: Regularization strength (α)
            learning_rate: Step size for gradient descent
            num_iterations: Number of training iterations
        """
        self.alpha = alpha
        self.learning_rate = learning_rate
        self.num_iterations = num_iterations
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        """
        Train the model with L1 regularization.

        TODO: Implement L1 regularized gradient descent
        - Loss = MSE + α * Σ|w|
        - Subgradient of the penalty: α * sign(w)
        """
        pass

    def predict(self, X):
        """Make predictions"""
        pass
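One possible way to fill in fit is plain gradient descent with the subgradient α * sign(w) added to the MSE gradient. The sketch below is illustrative, not the required solution; the function name and the zero initialization are assumptions.

import numpy as np

def lasso_fit_sketch(X, y, alpha=1.0, learning_rate=0.01, num_iterations=1000):
    # Minimize MSE + alpha * sum(|w|) by gradient descent, using sign(w)
    # as the subgradient of |w| (sign(0) = 0).
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(num_iterations):
        error = X @ weights + bias - y
        grad_w = (2.0 / n_samples) * (X.T @ error) + alpha * np.sign(weights)
        grad_b = (2.0 / n_samples) * np.sum(error)
        weights -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weights, bias

predict would then simply return X @ weights + bias.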
Part 2: Implement L2 (Ridge) Regularization
class RidgeRegression:
    """
    Linear regression with L2 (Ridge) regularization.
    Penalty: α * Σw²
    """

    def __init__(self, alpha=1.0, learning_rate=0.01, num_iterations=1000):
        self.alpha = alpha
        self.learning_rate = learning_rate
        self.num_iterations = num_iterations
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        """
        Train the model with L2 regularization.

        TODO: Implement L2 regularized gradient descent
        - Loss = MSE + α * Σw²
        - Gradient of the penalty: 2 * α * w
        """
        pass

    def predict(self, X):
        """Make predictions"""
        pass
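Relative to the Lasso sketch after Part 1, only the penalty gradient changes. A minimal sketch of just that piece (the function name is illustrative):

import numpy as np

def ridge_penalty_grad(weights, alpha):
    # Gradient of alpha * sum(w**2) with respect to w
    return 2.0 * alpha * weights

Adding this term to the MSE gradient inside the same training loop gives Ridge; unlike L1, it shrinks weights toward zero proportionally and does not produce exact zeros.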
Part 3: Implement Elastic Net (L1 + L2)
class ElasticNet:
    """
    Linear regression with Elastic Net regularization.
    Penalty: α₁ * Σ|w| + α₂ * Σw²
    Combines L1 and L2.
    """

    def __init__(self, alpha_l1=0.5, alpha_l2=0.5, learning_rate=0.01, num_iterations=1000):
        """
        Args:
            alpha_l1: L1 regularization strength (α₁)
            alpha_l2: L2 regularization strength (α₂)
        """
        self.alpha_l1 = alpha_l1
        self.alpha_l2 = alpha_l2
        self.learning_rate = learning_rate
        self.num_iterations = num_iterations
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        """
        Train with both L1 and L2 regularization.

        TODO: Implement Elastic Net gradient descent
        - Loss = MSE + α₁ * Σ|w| + α₂ * Σw²
        - Penalty (sub)gradient: α₁ * sign(w) + 2 * α₂ * w
        """
        pass

    def predict(self, X):
        """Make predictions"""
        pass
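For Elastic Net, the penalty (sub)gradient is simply the sum of the L1 and L2 pieces. A minimal sketch (function name illustrative):

import numpy as np

def elastic_net_penalty_grad(weights, alpha_l1, alpha_l2):
    # (Sub)gradient of alpha_l1 * sum(|w|) + alpha_l2 * sum(w**2)
    return alpha_l1 * np.sign(weights) + 2.0 * alpha_l2 * weights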
Part 4: Compare All Methods
def compare_regularization(X_train, y_train, X_test, y_test):
    """
    Compare the different regularization methods.

    TODO: Train all 3 models and compare:
    - Training error
    - Test error
    - Number of zero weights (sparsity)
    - Weight magnitudes
    """
    pass
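One possible shape for the comparison is a small metrics helper applied to each fitted model; the sketch below assumes the dictionary keys and the zero tolerance, which are not part of the problem statement.

import numpy as np

def summarize_model(weights, y_train, y_pred_train, y_test, y_pred_test, zero_tol=1e-6):
    # Collect the metrics listed above for one fitted model
    return {
        "train_mse": float(np.mean((y_pred_train - y_train) ** 2)),
        "test_mse": float(np.mean((y_pred_test - y_test) ** 2)),
        "num_zero_weights": int(np.sum(np.abs(weights) < zero_tol)),
        "mean_abs_weight": float(np.mean(np.abs(weights))),
    }

compare_regularization can then fit LassoRegression, RidgeRegression, and ElasticNet on the same split and tabulate these dictionaries; you should expect Lasso and Elastic Net to show more exact zeros, and Ridge to show uniformly smaller but nonzero weights.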
Hints
Hint 1
L1 vs L2 gradient of the penalty term:

L1 (Lasso):
    gradient = α * sign(w), where sign(w) = {
        +1 if w > 0
        -1 if w < 0
         0 if w = 0   (a valid subgradient choice)
    }

L2 (Ridge):
    gradient = 2 * α * w

Key difference: L1 uses only the sign of each weight (a constant-size push toward zero), while L2 uses the weight's actual value (a push proportional to its size).
Hint 2
Handling L1's non-differentiability:

|w| is not differentiable at w = 0, so either use the subgradient (sign(w), with sign(0) = 0) or apply soft thresholding (the proximal operator for the L1 penalty):

# Soft thresholding: shrink w toward zero by lambda_, setting it to zero
# when |w| <= lambda_
def soft_threshold(w, lambda_):
    if w > lambda_:
        return w - lambda_
    elif w < -lambda_:
        return w + lambda_
    else:
        return 0.0
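A vectorized version of the same operator can be handier when the weights are a NumPy array. The sketch below is an assumption-level helper (the name and the choice of learning_rate * alpha as the threshold come from the proximal-gradient view, not from the problem statement):

import numpy as np

def soft_threshold_vec(w, threshold):
    # Shrink every weight toward zero by `threshold`; entries with
    # |w| <= threshold become exactly zero.
    return np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)

Used as a proximal step, you take a plain MSE gradient update first and then call soft_threshold_vec(weights, learning_rate * alpha); this is what produces exact zeros in Lasso.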
Hint 3
Weight Update Formula:
Standard gradient descent:
w = w - lr * gradient
With L1:
w = w - lr * (∂MSE/∂w + α * sign(w))
With L2:
w = w - lr * (∂MSE/∂w + 2 * α * w)
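A tiny worked example of one update step (illustrative numbers; ∂MSE/∂w is set to 0 here to isolate the penalty's effect):

# Single weight, one gradient step, data-fit gradient assumed to be 0
w, lr, alpha = 0.2, 0.1, 1.0

w_after_l1 = w - lr * (0.0 + alpha * 1.0)      # sign(0.2) = +1  -> 0.2 - 0.10 = 0.10
w_after_l2 = w - lr * (0.0 + 2.0 * alpha * w)  # 2*1.0*0.2 = 0.4 -> 0.2 - 0.04 = 0.16

# L1 subtracts a fixed amount regardless of the weight's size (so small weights
# reach exactly zero), while L2 shrinks in proportion to the weight's size.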