Scalable Hybrid Designs for Linear Algebra on Reconfigurable Computing Systems

Bibliographic Details
Published in:	IEEE transactions on computers 2008-12, Vol.57 (12), p.1661-1675
Main Authors:	Zhuo, Ling; Prasanna, V.K.
Format:	Article
Language:	English
Subjects:	Algorithms; Algorithms implemented in hardware; Bandwidth; Computation; Computational modeling; Computations on matrices; Construction; Design engineering; Field programmable gate arrays; Gate arrays; Hardware; Heterogeneous (hybrid) systems; Mathematical models; Methodology; Processors; Program processors; Random access memory; Studies

Description
Summary:	Recently, high-end reconfigurable computing systems have been built that employ Field Programmable Gate Arrays (FPGAs) as hardware accelerators for general-purpose processors. These systems not only provide new opportunities for high-performance computing, but also pose new challenges to application developers. In this paper, we build a design model for hybrid designs that utilize both the processors and the FPGAs. The model characterizes a reconfigurable computing system using various parameters. Based on the model, we propose a design methodology for hardware/software co-design. The methodology partitions workload between the processors and the FPGAs, maintains load balance in the system, and realizes scalability over multiple nodes. Designs are proposed for several computationally intensive applications: matrix multiplication, matrix factorization and the Floyd-Warshall algorithm for the all-pairs shortest-paths problem. To illustrate our ideas, the proposed hybrid designs are implemented on a Cray XD1. Experimental results show that our designs utilize both the processors and the FPGAs efficiently, and overlap most of the data transfer overheads and network communication costs with the computations. Our designs achieve up to 90% of the total performance of the nodes, and 90% of the performance predicted by the design model. In addition, our designs scale over a large number of nodes.
ISSN:	0018-9340, 1557-9956
DOI:	10.1109/TC.2008.84
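
For readers unfamiliar with the Floyd-Warshall algorithm named in the summary, the following is a minimal sequential sketch of the textbook algorithm in Python. It is not the hybrid processor/FPGA design described in the paper; the function name `floyd_warshall` and the sample `graph` are illustrative assumptions, and the input is assumed to be a dense weight matrix with no negative cycles.

```python
import math


def floyd_warshall(weights):
    """Return all-pairs shortest-path distances for a dense weight matrix.

    weights[i][j] is the edge weight from i to j, math.inf if there is
    no edge, and 0 on the diagonal. Assumes no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for k in range(n):                  # allow vertex k as an intermediate
        for i in range(n):
            dik = dist[i][k]
            for j in range(n):
                candidate = dik + dist[k][j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
    return dist


INF = math.inf
# Hypothetical 4-vertex directed graph used only for illustration.
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
shortest = floyd_warshall(graph)
```

The triple loop performs on the order of n^3 updates on an n-by-n matrix, which is why the algorithm appears alongside matrix multiplication and factorization as a computationally intensive kernel.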