Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

09/25/2015
by Ang Li, et al.

We discuss an approach for solving sparse or dense banded linear systems A x = b on a Graphics Processing Unit (GPU) card. The matrix A ∈ R^{N×N} is possibly nonsymmetric and moderately large; i.e., 10000 ≤ N ≤ 500000. The split and parallelize (SaP) approach seeks to partition the matrix A into diagonal sub-blocks A_i, i=1,...,P, which are independently factored in parallel. The solution may either account for or ignore the matrices that couple the diagonal sub-blocks A_i. This approach, along with the Krylov subspace-based iterative method that it preconditions, is implemented in a solver called SaP::GPU, which is compared in terms of efficiency with three commonly used sparse direct solvers: PARDISO, SuperLU, and MUMPS. SaP::GPU, which runs entirely on the GPU except for several stages involved in preliminary row-column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel's MKL, SaP::GPU also fares well when used to solve dense banded systems that are close to being diagonally dominant. SaP::GPU is publicly available and distributed as open source under a permissive BSD3 license.
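
To make the splitting idea concrete, the following is a minimal CPU sketch in plain Python, not SaP::GPU code. It illustrates only the variant that ignores the coupling matrices: the diagonal sub-blocks A_i are solved independently and the result is used as a preconditioner. Several things here are assumptions made for illustration: a simple preconditioned Richardson iteration stands in for the Krylov subspace method, the blocks are stored and solved as small dense matrices, and all function names (`solve_dense`, `block_ranges`, `apply_decoupled_preconditioner`, `preconditioned_richardson`) are hypothetical.

```python
def solve_dense(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def block_ranges(n, p):
    """Split range(n) into p contiguous, nearly equal index ranges."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for i in range(p):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def apply_decoupled_preconditioner(A, r, ranges):
    """Solve A_i z_i = r_i per diagonal block; coupling blocks are ignored."""
    z = [0.0] * len(r)
    for lo, hi in ranges:  # blocks are independent, hence parallelizable
        block = [row[lo:hi] for row in A[lo:hi]]
        z[lo:hi] = solve_dense(block, r[lo:hi])
    return z


def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def preconditioned_richardson(A, b, p, tol=1e-10, max_iter=200):
    """Assumed stand-in for the Krylov solver: x <- x + M^{-1} (b - A x)."""
    ranges = block_ranges(len(b), p)
    x = [0.0] * len(b)
    for it in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        if max(abs(v) for v in r) < tol:
            return x, it
        z = apply_decoupled_preconditioner(A, r, ranges)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_iter


# Usage: a small diagonally dominant tridiagonal system with P = 2 blocks.
N = 8
A = [[4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
      for j in range(N)] for i in range(N)]
b = matvec(A, [1.0] * N)  # right-hand side whose exact solution is all ones
x, iterations = preconditioned_richardson(A, b, p=2)
```

The sketch converges quickly here because the example matrix is strongly diagonally dominant, so dropping the coupling entries between the two blocks changes the matrix only slightly. For weaker diagonal dominance, the abstract's alternative of accounting for the coupling matrices would presumably matter more; that variant is not shown.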
