Accelerating radio astronomy cross-correlation with graphics processing units
Published in: | The international journal of high performance computing applications 2013-05, Vol.27 (2), p.178-192 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | We present a highly parallel implementation of the cross-correlation of time-series data using graphics processing units (GPUs), which is scalable to hundreds of independent inputs and suitable for the processing of signals from ‘large-N’ arrays of many radio antennas. The computational part of the algorithm, the X-engine, is implemented efficiently on NVIDIA’s Fermi architecture, sustaining up to 79% of the peak single-precision floating-point throughput. We compare performance obtained for hardware- and software-managed caches, observing significantly better performance for the latter. The high performance reported involves use of a multi-level data tiling strategy in memory and use of a pipelined algorithm with simultaneous computation and transfer of data from host to device memory. The speed of code development, flexibility, and low cost of the GPU implementations compared with application-specific integrated circuit (ASIC) and field programmable gate array (FPGA) implementations have the potential to greatly shorten the cycle of correlator development and deployment, for cases where some power-consumption penalty can be tolerated. |
---|---|
ISSN: | 1094-3420 1741-2846 |
DOI: | 10.1177/1094342012444794 |
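
The summary describes the X-engine only at a high level. As a rough illustration of the operation it names, the sketch below accumulates the products x_i(t) · conj(x_j(t)) for every pair of inputs, first directly and then by visiting the output in square tiles. It is a pure-Python sketch under stated assumptions, not the article's GPU implementation: the function names `x_engine` and `x_engine_tiled`, the single frequency channel, the single level of tiling, and the tile size of 2 are all hypothetical choices made for illustration.

```python
# Illustrative X-engine sketch (assumption: one frequency channel, CPU only).
# This is NOT the article's code; names and tile size are hypothetical.

def x_engine(samples):
    """Accumulate the correlation matrix over time.

    samples: list of time steps; each step is a list of complex voltages,
             one per input (antenna/polarization).
    Returns an n x n list; entry [i][j] with j <= i holds
    sum over t of x_i(t) * conj(x_j(t)). Entries with j > i stay zero,
    since they are the complex conjugates of the lower triangle.
    """
    n = len(samples[0])
    vis = [[0j] * n for _ in range(n)]
    for x in samples:
        for i in range(n):
            for j in range(i + 1):
                vis[i][j] += x[i] * x[j].conjugate()
    return vis


def x_engine_tiled(samples, tile=2):
    """Same result as x_engine, computed one tile x tile output block at a time.

    Each block reuses a short slice of row inputs and column inputs for all
    of its products, which is the data-reuse idea behind tiling. A GPU version
    would hold those slices in fast memory; here they are plain list slices.
    """
    n = len(samples[0])
    vis = [[0j] * n for _ in range(n)]
    for ti in range(0, n, tile):            # tile row origin
        for tj in range(0, ti + 1, tile):   # tile column origin (lower triangle)
            for x in samples:
                rows = x[ti:ti + tile]      # fetched once per tile and time step
                cols = x[tj:tj + tile]
                for a, xi in enumerate(rows):
                    for b, xj in enumerate(cols):
                        i, j = ti + a, tj + b
                        if j <= i:          # skip the upper half of diagonal tiles
                            vis[i][j] += xi * xj.conjugate()
    return vis


if __name__ == "__main__":
    demo = [[1 + 1j, 2 - 1j, 0.5j], [-1j, 1 + 0j, 2 + 2j]]
    print(x_engine(demo)[1][0])
```

Within a tile, `tile` row values and `tile` column values feed up to `tile * tile` products, so the number of input reads per product falls as the tile grows. That ratio is what a tiling strategy trades against the limited size of fast memory.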