TY - JOUR
T1 - Stabilized variational formulation of an Oldroyd-B fluid flow equations on a Graphic Processing Unit (GPU) architecture
AU - Ayyad, Mahmoud
AU - Guaily, Amr
AU - Hassanein, Maha A.
N1 - Publisher Copyright: © 2020
PY - 2021/1
Y1 - 2021/1
N2 - The governing equations for the flow of an Oldroyd-B fluid are discretized using the finite element method. To handle the convective nature of the momentum equation, the Galerkin/Least-Squares Finite Element Method (GLS/FEM) is used, while the Discrete Elastic–Viscous Stress-Splitting (DEVSS) method is used to overcome the instability caused by the absence of diffusion in the constitutive equations. The discretized equations are implemented on a hybrid system that combines a Graphics Processing Unit (GPU), programmed with the Compute Unified Device Architecture (CUDA), and a multi-core CPU. The implementation is applied successfully to simulate blood flow in an abdominal aortic aneurysm. To accelerate application performance on the GPU, several optimization approaches are adopted. The most significant is the coloring technique used to assemble the global matrix. Numerical experiments show that the hybrid CPU-GPU implementation achieves a 26-fold speedup over the multi-core CPU implementation.
AB - The governing equations for the flow of an Oldroyd-B fluid are discretized using the finite element method. To handle the convective nature of the momentum equation, the Galerkin/Least-Squares Finite Element Method (GLS/FEM) is used, while the Discrete Elastic–Viscous Stress-Splitting (DEVSS) method is used to overcome the instability caused by the absence of diffusion in the constitutive equations. The discretized equations are implemented on a hybrid system that combines a Graphics Processing Unit (GPU), programmed with the Compute Unified Device Architecture (CUDA), and a multi-core CPU. The implementation is applied successfully to simulate blood flow in an abdominal aortic aneurysm. To accelerate application performance on the GPU, several optimization approaches are adopted. The most significant is the coloring technique used to assemble the global matrix. Numerical experiments show that the hybrid CPU-GPU implementation achieves a 26-fold speedup over the multi-core CPU implementation.
KW - CUDA
KW - Galerkin/Least-Squares
KW - GPU
KW - Hybrid CPU–GPU
KW - Viscoelastic fluids
UR - http://www.scopus.com/inward/record.url?scp=85090121536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090121536&partnerID=8YFLogxK
U2 - 10.1016/j.cpc.2020.107495
DO - 10.1016/j.cpc.2020.107495
M3 - Article
AN - SCOPUS:85090121536
SN - 0010-4655
VL - 258
JO - Computer Physics Communications
JF - Computer Physics Communications
M1 - 107495
ER -