TY - JOUR
T1 - A unified consensus-based parallel algorithm for high-dimensional regression with combined regularizations
AU - Wu, Xiaofei
AU - Liang, Rongmei
AU - Zhang, Zhimin
AU - Cui, Zhenyu
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2025/3
Y1 - 2025/3
AB - Parallel algorithms are widely recognized for their effectiveness in handling large-scale datasets stored in a distributed manner, making them a popular choice for fitting statistical learning models. However, there is currently limited research on parallel algorithms specifically designed for high-dimensional regression with combined regularization terms. Such terms, including the elastic net, sparse group lasso, sparse fused lasso, and their nonconvex variants, have attracted significant attention in many fields because they incorporate prior information and promote sparsity within specific groups or among fused variables. The scarcity of parallel algorithms for combined regularizations can be attributed to the inherent nonsmoothness and complexity of these terms, as well as the absence of closed-form solutions for some of their associated proximal operators. This paper proposes a unified constrained optimization formulation, based on the consensus problem, for these convex and nonconvex regression problems, and derives the corresponding parallel alternating direction method of multipliers (ADMM) algorithms. Furthermore, the proposed algorithm is proven not only to converge globally but also to attain a linear convergence rate. Notably, its computational complexity is the same across different regularization terms and loss functions, which underscores the universality of the algorithm. Extensive simulation experiments, along with a financial example, demonstrate the reliability, stability, and scalability of the algorithm. An R package implementing the proposed algorithm is available at https://github.com/xfwu1016/CPADMM.
KW - ADMM
KW - Combined regularization
KW - High-dimensional regression
KW - Massive data
KW - Parallel algorithm
UR - http://www.scopus.com/inward/record.url?scp=85208079283&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85208079283&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2024.108081
DO - 10.1016/j.csda.2024.108081
M3 - Article
AN - SCOPUS:85208079283
SN - 0167-9473
VL - 203
JO - Computational Statistics & Data Analysis
JF - Computational Statistics & Data Analysis
M1 - 108081
ER -