
Compression and load balancing for efficient sparse matrix‐vector product on multicore processors and graphics processing units

Bibliographic Details
Published in: Concurrency and Computation: Practice and Experience, 2022-06, Vol. 34 (14)
Main Authors: Aliaga, José I., Anzt, Hartwig, Grützmacher, Thomas, Quintana-Ortí, Enrique S., Tomás, Andrés E.
Format: Article
Language: English
Description
Summary: We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate (COO) sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the numerical information. Our approach is multi-platform, in the sense that the realizations for (general-purpose) multicore processors and for graphics accelerators (GPUs) are built upon common principles but differ in implementation details, which are adapted to avoid thread divergence in the GPU case or to maximize compression element-wise (i.e., for each matrix entry) on multicore architectures. Our evaluation on the last two generations of NVIDIA GPUs, as well as on Intel and AMD processors, demonstrates the benefits of the new kernels when compared with the optimized implementations of the sparse matrix-vector product in NVIDIA's cuSPARSE and Intel's MKL, respectively.
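The core load-balancing idea described in the abstract can be illustrated with a small sketch: in the COO format, the nonzeros can be split into equal-sized chunks (one per worker) irrespective of row boundaries, so every worker receives the same number of entries. The names, the struct layout, and the partitioning details below are illustrative assumptions for a serial sketch, not the paper's actual kernels; in a real parallel version the scattered updates to `y` would need atomic additions or per-worker partial buffers, since a row may be split across chunks.

```c
#include <assert.h>

/* Hypothetical COO matrix descriptor (illustrative, not the paper's layout). */
typedef struct {
    int nnz, nrows, ncols;
    const int *row, *col;   /* coordinate index arrays */
    const double *val;      /* numerical values */
} coo_t;

/* Sketch of a load-balanced COO SpMV: each of the nworkers chunks covers
 * ceil(nnz/nworkers) nonzeros, so the work per worker is even regardless
 * of how the nonzeros are distributed across rows. Executed serially here;
 * the update to y[] would require an atomic add in a parallel run. */
static void spmv_balanced(const coo_t *A, const double *x, double *y,
                          int nworkers) {
    for (int i = 0; i < A->nrows; i++) y[i] = 0.0;
    int chunk = (A->nnz + nworkers - 1) / nworkers;   /* ceil(nnz/nworkers) */
    for (int w = 0; w < nworkers; w++) {
        int begin = w * chunk;
        int end = (begin + chunk < A->nnz) ? begin + chunk : A->nnz;
        for (int k = begin; k < end; k++)
            y[A->row[k]] += A->val[k] * x[A->col[k]];
    }
}
```

Note the contrast with row-based partitioning (as in CSR SpMV), where a single dense row can leave one worker with far more work than the others; partitioning by nonzero count sidesteps that imbalance at the cost of handling rows that straddle chunk boundaries.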
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.6515