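Synopsis
Sparse statistical models have only a small number of nonzero parameters or weights. They are a classic embodiment of the principle of reducing complexity to simplicity, and are therefore widely used across many fields. This book summarizes statistical learning with sparsity. Centered on the lasso, it proceeds step by step to cover other methods and explores in depth how many sparsity problems are solved and applied. It includes numerous examples and clear figures, along with bibliographic notes and end-of-chapter exercises, making it a reference for in-depth study of statistics. The book is suited to students and researchers in computer science, statistics, and machine learning.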
About the Authors
Table of Contents
Preface
1 Introduction
2 The Lasso for Linear Models
2.1 Introduction
2.2 The Lasso Estimator
2.3 Cross-Validation and Inference
2.4 Computation of the Lasso Solution
2.4.1 Single Predictor: Soft Thresholding
2.4.2 Multiple Predictors: Cyclic Coordinate Descent
2.4.3 Soft-Thresholding and Orthogonal Bases
2.5 Degrees of Freedom
2.6 Uniqueness of the Lasso Solutions
2.7 A Glimpse at the Theory
2.8 The Nonnegative Garrote
2.9 ℓq Penalties and Bayes Estimates
2.10 Some Perspective
Exercises
3 Generalized Linear Models
3.1 Introduction
3.2 Logistic Regression
3.2.1 Example: Document Classification
3.2.2 Algorithms
3.3 Multiclass Logistic Regression
3.3.1 Example: Handwritten Digits
3.3.2 Algorithms
3.3.3 Grouped-Lasso Multinomial
3.4 Log-Linear Models and the Poisson GLM
3.4.1 Example: Distribution Smoothing
3.5 Cox Proportional Hazards Models
3.5.1 Cross-Validation
3.5.2 Pre-Validation
3.6 Support Vector Machines
3.6.1 Logistic Regression with Separable Data
3.7 Computational Details and glmnet
Bibliographic Notes
Exercises
4 Generalizations of the Lasso Penalty
4.1 Introduction
4.2 The Elastic Net
4.3 The Group Lasso
4.3.1 Computation for the Group Lasso
4.3.2 Sparse Group Lasso
4.3.3 The Overlap Group Lasso
4.4 Sparse Additive Models and the Group Lasso
4.4.1 Additive Models and Backfitting
4.4.2 Sparse Additive Models and Backfitting
4.4.3 Approaches Using Optimization and the Group Lasso
4.4.4 Multiple Penalization for Sparse Additive Models
4.5 The Fused Lasso
4.5.1 Fitting the Fused Lasso
4.5.1.1 Reparametrization
4.5.1.2 A Path Algorithm
4.5.1.3 A Dual Path Algorithm
4.5.1.4 Dynamic Programming for the Fused Lasso
4.5.2 Trend Filtering
4.5.3 Nearly Isotonic Regression
4.6 Nonconvex Penalties
Bibliographic Notes
Exercises
5 Optimization Methods
5.1 Introduction
5.2 Convex Optimality Conditions
5.2.1 Optimality for Differentiable Problems
5.2.2 Nondifferentiable Functions and Subgradients
5.3 Gradient Descent
5.3.1 Unconstrained Gradient Descent
5.3.2 Projected Gradient Methods
5.3.3 Proximal Gradient Methods
5.3.4 Accelerated Gradient Methods
5.4 Coordinate Descent
5.4.1 Separability and Coordinate Descent
5.4.2 Linear Regression and the Lasso
5.4.3 Logistic Regression and Generalized Linear Models
5.5 A Simulation Study
5.6 Least Angle Regression
5.7 Alternating Direction Method of Multipliers
5.8 Minorization-Maximization Algorithms
5.9 Biconvexity and Alternating Minimization
5.10 Screening Rules
Bibliographic Notes
Appendix
Exercises
6 Statistical Inference
6.1 The Bayesian Lasso
6.2 The Bootstrap
6.3 Post-Selection Inference for the Lasso
6.3.1 The Covariance Test
6.3.2 A General Scheme for Post-Selection Inference
6.3.2.1 Fixed-λ Inference for the Lasso
6.3.2.2 The Spacing Test for LAR
6.3.3 What Hypothesis Is Being Tested?
6.3.4 Back to Forward Stepwise Regression
6.4 Inference via a Debiased Lasso
6.5 Other Proposals for Post-Selection Inference
Bibliographic Notes
Exercises
7 Matrix Decompositions, Approximations, and Completion
7.1 Introduction
7.2 The Singular Value Decomposition
7.3 Missing Data and Matrix Completion
7.3.1 The Netflix Movie Challenge
7.3.2 Matrix Completion Using Nuclear Norm
7.3.3 Theoretical Results for Matrix Completion
7.3.4 Maximum Margin Factorization and Related Methods
7.4 Reduced-Rank Regression
7.5 A General Matrix Regression Framework
7.6 Penalized Matrix Decomposition
7.7 Additive Matrix Decomposition
Bibliographic Notes
Exercises
8 Sparse Multivariate Methods
8.1 Introduction
8.2 Sparse Principal Components Analysis
8.2.1 Some Background
8.2.2 Sparse Principal Components
8.2.2.1 Sparsity from Maximum Variance
8.2.2.2 Methods Based on Reconstruction
8.2.3 Higher-Rank Solutions
8.2.3.1 Illustrative Application of Sparse PCA
8.2.4 Sparse PCA via Fantope Projection
8.2.5 Sparse Autoencoders and Deep Learning
8.2.6 Some Theory for Sparse PCA
8.3 Sparse Canonical Correlation Analysis
8.3.1 Example: Netflix Movie Rating Data
8.4 Sparse Linear Discriminant Analysis
8.4.1 Normal Theory and Bayes' Rule
8.4.2 Nearest Shrunken Centroids
8.4.3 Fisher's Linear Discriminant Analysis
8.4.3.1 Example: Simulated Data with Five Classes
8.4.4 Optimal Scoring
8.4.4.1 Example: Face Silhouettes
8.5 Sparse Clustering
8.5.1 Some Background on Clustering
8.5.1.1 Example: Simulated Data with Six Classes
8.5.2 Sparse Hierarchical Clustering
8.5.3 Sparse K-Means Clustering
8.5.4 Convex Clustering
Bibliographic Notes
Exercises
9 Graphs and Model Selection
9.1 Introduction
9.2 Basics of Graphical Models
9.2.1 Factorization and Markov Properties
9.2.1.1 Factorization Property
9.2.1.2 Markov Property
9.2.1.3 Equivalence of Factorization and Markov Properties
9.2.2 Some Examples
9.2.2.1 Discrete Graphical Models
9.2.2.2 Gaussian Graphical Models
9.3 Graph Selection via Penalized Likelihood
9.3.1 Global Likelihoods for Gaussian Models
9.3.2 Graphical Lasso Algorithm
9.3.3 Exploiting Block-Diagonal Structure
9.3.4 Theoretical Guarantees for the Graphical Lasso
9.3.5 Global Likelihood for Discrete Models
9.4 Graph Selection via Conditional Inference
9.4.1 Neighborhood-Based Likelihood for Gaussians
9.4.2 Neighborhood-Based Likelihood for Discrete Models
9.4.3 Pseudo-Likelihood for Mixed Models
9.5 Graphical Models with Hidden Variables
Bibliographic Notes
Exercises
10 Signal Approximation and Compressed Sensing
10.1 Introduction
10.2 Signals and Sparse Representations
10.2.1 Orthogonal Bases
10.2.2 Approximation in Orthogonal Bases
10.2.3 Reconstruction in Overcomplete Bases
10.3 Random Projection and Approximation
10.3.1 Johnson–Lindenstrauss Approximation
10.3.2 Compressed Sensing
10.4 Equivalence between ℓ0 and ℓ1 Recovery
10.4.1 Restricted Nullspace Property
10.4.2 Sufficient Conditions for Restricted Nullspace
10.4.3 Proofs
10.4.3.1 Proof of Theorem 10.1
10.4.3.2 Proof of Proposition 10.1
Bibliographic Notes
Exercises
11 Theoretical Results for the Lasso
11.1 Introduction
11.1.1 Types of Loss Functions
11.1.2 Types of Sparsity Models
11.2 Bounds on Lasso ℓ2-Error
11.2.1 Strong Convexity in the Classical Setting
11.2.2 Restricted Eigenvalues for Regression
11.2.3 A Basic Consistency Result
11.3 Bounds on Prediction Error
11.4 Support Recovery in Linear Regression
11.4.1 Variable-Selection Consistency for the Lasso
11.4.1.1 Some Numerical Studies
11.5 Beyond the Basic Lasso
Bibliographic Notes
Exercises
Bibliography
Author Index
Index