A Parallel Preconditioned Conjugate Gradient Solver for the Poisson Problem on a Multi-GPU Platform

We present a parallel conjugate gradient solver for the Poisson problem optimized for multi-GPU platforms. Our approach includes a novel heuristic Poisson preconditioner well suited for massively-parallel SIMD processing. Furthermore, we address the problem of limited transfer rates over typical dat...

Bibliographic Details
Main Authors: Ament, Marco; Knittel, Gunter; Weiskopf, Daniel; Strasser, Wolfgang
Format: Conference Proceeding
Language: English
cited_by cdi_FETCH-LOGICAL-c217t-7d3bfde36c5ec83dc1f3a483ca44080446cd6d07a48da511d7da4e6317cc98103
cites
container_end_page 592
container_issue
container_start_page 583
container_title
container_volume
creator Ament, Marco
Knittel, Gunter
Weiskopf, Daniel
Strasser, Wolfgang
description We present a parallel conjugate gradient solver for the Poisson problem optimized for multi-GPU platforms. Our approach includes a novel heuristic Poisson preconditioner well suited for massively-parallel SIMD processing. Furthermore, we address the problem of limited transfer rates over typical data channels such as the PCI-express bus relative to the bandwidth requirements of powerful GPUs. Specifically, naive communication schemes can severely reduce the achievable speedup in such communication-intense algorithms. For this reason, we employ overlapping memory transfers to establish a high level of concurrency and to improve scalability. We have implemented our model on a high-performance workstation with multiple hardware accelerators. We discuss the mathematical principles, give implementation details, and present the performance and the scalability of the system.
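The description above names a preconditioned conjugate gradient iteration for the Poisson problem. As a rough illustration only, the following single-threaded sketch solves a 1D Poisson system with zero Dirichlet boundaries. It makes several assumptions that are not taken from the paper: a plain Jacobi (diagonal) preconditioner stands in for the authors' heuristic Poisson preconditioner, which this record does not describe; the 1D stencil (-1, 2, -1) is chosen for brevity; and the function names `apply_poisson_1d`, `dot`, and `pcg` are hypothetical. It shows neither the multi-GPU decomposition nor the overlapped memory transfers.

```python
def apply_poisson_1d(x):
    """Apply the 1D Poisson stencil (-1, 2, -1) with zero Dirichlet boundaries."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def dot(a, b):
    """Plain dot product of two equally long lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def pcg(b, tol=1e-10, max_iter=1000):
    """Preconditioned CG for A x = b, with A the 1D Poisson matrix.

    Assumption: Jacobi preconditioning (divide by the diagonal, which is 2),
    used here as a stand-in and not as the paper's preconditioner.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    z = [ri / 2.0 for ri in r]       # preconditioned residual
    p = list(z)                      # first search direction
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 < tol:   # stop once the residual norm is small enough
            return x, iteration
        Ap = apply_poisson_1d(p)
        alpha = rz / dot(p, Ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [ri / 2.0 for ri in r]
        rz_new = dot(r, z)
        beta = rz_new / rz           # makes the new direction A-conjugate to p
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    solution, iterations = pcg([1.0] * 16)
    residual = [yi - 1.0 for yi in apply_poisson_1d(solution)]
    print(iterations, max(abs(ri) for ri in residual))
```

In a GPU setting, the stencil application, the vector updates, and the dot products are the steps that would typically be parallelized, with the dot products needing a reduction across devices. That mapping is a general observation about conjugate gradient, not a summary of the authors' implementation.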
doi_str_mv 10.1109/PDP.2010.51
format conference_proceeding
fulltext fulltext_linktorsrc
identifier ISSN: 1066-6192
ispartof 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010, p.583-592
issn 1066-6192
2377-5750
language eng
recordid cdi_ieee_primary_5452414
source IEEE Xplore All Conference Series
subjects Bandwidth
Character generation
Concurrent computing
Conjugate Gradient
Graphics
Hardware
Iterative methods
Jacobian matrices
Linear systems
Multi-GPU
Parallel Preconditioning
Poisson Problem
Scalability
Sparse matrices
title A Parallel Preconditioned Conjugate Gradient Solver for the Poisson Problem on a Multi-GPU Platform
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-09-22T05%3A38%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20Parallel%20Preconditioned%20Conjugate%20Gradient%20Solver%20for%20the%20Poisson%20Problem%20on%20a%20Multi-GPU%20Platform&rft.btitle=2010%2018th%20Euromicro%20Conference%20on%20Parallel,%20Distributed%20and%20Network-based%20Processing&rft.au=Ament,%20Marco&rft.date=2010-02&rft.spage=583&rft.epage=592&rft.pages=583-592&rft.issn=1066-6192&rft.eissn=2377-5750&rft.isbn=9781424456727&rft.isbn_list=142445672X&rft_id=info:doi/10.1109/PDP.2010.51&rft.eisbn=9781424456734&rft.eisbn_list=1424456738&rft_dat=%3Cieee_CHZPO%3E5452414%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c217t-7d3bfde36c5ec83dc1f3a483ca44080446cd6d07a48da511d7da4e6317cc98103%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5452414&rfr_iscdi=true