
GPU-accelerated machine learning inference as a service for computing in neutrino experiments

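Machine learning algorithms are becoming increasingly prevalent and performant in the reconstruction of events in accelerator-based neutrino experiments. These sophisticated algorithms can be computationally expensive. At the same time, the data volumes of such experiments are rapidly increasing. The demand to process billions of neutrino events with many machine learning algorithm inferences creates a computing challenge. We explore a computing model in which heterogeneous computing with GPU coprocessors is made available as a web service. The coprocessors can be efficiently and elastically deployed to provide the right amount of computing for a given processing task. With our approach, Services for Optimized Network Inference on Coprocessors (SONIC), we integrate GPU acceleration specifically for the ProtoDUNE-SP reconstruction chain without disrupting the native computing workflow. With our integrated framework, we accelerate the most time-consuming task, track and particle shower hit identification, by a factor of 17. This results in a factor of 2.7 reduction in the total processing time when compared with CPU-only production. For this particular task, only 1 GPU is required for every 68 CPU threads, providing a cost-effective solution.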
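
The description for this record quotes two speedups: a factor of 17 for the accelerated task and a factor of 2.7 for total processing time. As a rough illustration of how the two figures relate, the sketch below inverts Amdahl's law to estimate what share of the CPU-only runtime the accelerated task would occupy. The Amdahl model, the assumption that all other processing time is unchanged, and the function names are illustrative choices made here; they are not taken from the paper.

```python
def accelerated_fraction(task_speedup: float, overall_speedup: float) -> float:
    """Invert Amdahl's law.

    Returns the fraction of the baseline (CPU-only) runtime that must be
    spent in the accelerated task for the given pair of speedups to be
    consistent, assuming everything else runs at its original speed.
    """
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / task_speedup)


def overall_speedup(fraction: float, task_speedup: float) -> float:
    """Amdahl's law: total speedup when `fraction` of the runtime is
    accelerated by `task_speedup` and the remainder is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / task_speedup)


# Figures quoted in the record's description.
TASK_SPEEDUP = 17.0
OVERALL_SPEEDUP = 2.7

fraction = accelerated_fraction(TASK_SPEEDUP, OVERALL_SPEEDUP)
print(f"Implied share of CPU-only runtime in the accelerated task: {fraction:.3f}")
```

Under these assumptions the accelerated task accounts for roughly two thirds of the CPU-only processing time, which agrees with the description calling it the most time-consuming task. Any overhead not captured by this simple model (for example, network transfer to the GPU service) would shift the estimate.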

Bibliographic Details
Published in:	arXiv.org 2021-03
Main Authors:	Wang, Michael; Yang, Tingjun; Maria Acosta Flechas; Harris, Philip; Hawks, Benjamin; Holzman, Burt; Knoepfel, Kyle; Krupa, Jeffrey; Pedro, Kevin; Tran, Nhan
Format:	Article
Language:	English
Subjects:	Algorithms; Central processing units; Coprocessors; CPUs; Experiments; Graphics processing units; Inference; Machine learning; Neutrinos; Reconstruction; Web services; Workflow

cited_by
cites
container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Wang, Michael; Yang, Tingjun; Maria Acosta Flechas; Harris, Philip; Hawks, Benjamin; Holzman, Burt; Knoepfel, Kyle; Krupa, Jeffrey; Pedro, Kevin; Tran, Nhan
description	Machine learning algorithms are becoming increasingly prevalent and performant in the reconstruction of events in accelerator-based neutrino experiments. These sophisticated algorithms can be computationally expensive. At the same time, the data volumes of such experiments are rapidly increasing. The demand to process billions of neutrino events with many machine learning algorithm inferences creates a computing challenge. We explore a computing model in which heterogeneous computing with GPU coprocessors is made available as a web service. The coprocessors can be efficiently and elastically deployed to provide the right amount of computing for a given processing task. With our approach, Services for Optimized Network Inference on Coprocessors (SONIC), we integrate GPU acceleration specifically for the ProtoDUNE-SP reconstruction chain without disrupting the native computing workflow. With our integrated framework, we accelerate the most time-consuming task, track and particle shower hit identification, by a factor of 17. This results in a factor of 2.7 reduction in the total processing time when compared with CPU-only production. For this particular task, only 1 GPU is required for every 68 CPU threads, providing a cost-effective solution.
doi_str_mv	10.48550/arxiv.2009.04509
format	article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2441676589</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2441676589</sourcerecordid><originalsourceid>FETCH-LOGICAL-a529-2325f28858d7bce750d37583f5a9198ad6aab3bee3cc7de458fae54347096c273</originalsourceid><addsrcrecordid>eNotjVFLwzAURoMgOOZ-gG8BnzvTJLdJHmXoFAb6MB9l3Ka3mtGmM2nHfr6D7enjwOF8jD2UYqktgHjCdArHpRTCLYUG4W7YTCpVFlZLeccWOe-FELIyEkDN2Pf686tA76mjhCM1vEf_GyLxjjDFEH94iC0lip44Zo48UzqGM7RD4n7oD9N4kXikaUwhDpxOB0qhpzjme3bbYpdpcd05276-bFdvxeZj_b563hQI0hVSSWiltWAbU3syIBplwKoW0JXOYlMh1qomUt6bhjTYFgm00ka4ykuj5uzxkj2k4W-iPO72w5Ti-XEntS4rU4F16h8ZXFXX</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2441676589</pqid></control><display><type>article</type><title>GPU-accelerated machine learning inference as a service for computing in neutrino experiments</title><source>Publicly Available Content Database</source><creator>Wang, Michael ; Yang, Tingjun ; Maria Acosta Flechas ; Harris, Philip ; Hawks, Benjamin ; Holzman, Burt ; Knoepfel, Kyle ; Krupa, Jeffrey ; Pedro, Kevin ; Tran, Nhan</creator><creatorcontrib>Wang, Michael ; Yang, Tingjun ; Maria Acosta Flechas ; Harris, Philip ; Hawks, Benjamin ; Holzman, Burt ; Knoepfel, Kyle ; Krupa, Jeffrey ; Pedro, Kevin ; Tran, Nhan</creatorcontrib><description>Machine learning algorithms are becoming increasingly prevalent and performant in the reconstruction of events in accelerator-based neutrino experiments. These sophisticated algorithms can be computationally expensive. At the same time, the data volumes of such experiments are rapidly increasing. The demand to process billions of neutrino events with many machine learning algorithm inferences creates a computing challenge. We explore a computing model in which heterogeneous computing with GPU coprocessors is made available as a web service. 
The coprocessors can be efficiently and elastically deployed to provide the right amount of computing for a given processing task. With our approach, Services for Optimized Network Inference on Coprocessors (SONIC), we integrate GPU acceleration specifically for the ProtoDUNE-SP reconstruction chain without disrupting the native computing workflow. With our integrated framework, we accelerate the most time-consuming task, track and particle shower hit identification, by a factor of 17. This results in a factor of 2.7 reduction in the total processing time when compared with CPU-only production. For this particular task, only 1 GPU is required for every 68 CPU threads, providing a cost-effective solution.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2009.04509</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Central processing units ; Coprocessors ; CPUs ; Experiments ; Graphics processing units ; Inference ; Machine learning ; Neutrinos ; Reconstruction ; Web services ; Workflow</subject><ispartof>arXiv.org, 2021-03</ispartof><rights>2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2441676589?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>783,787,25767,27939,37026,44604</link.rule.ids></links><search><creatorcontrib>Wang, Michael</creatorcontrib><creatorcontrib>Yang, Tingjun</creatorcontrib><creatorcontrib>Maria Acosta Flechas</creatorcontrib><creatorcontrib>Harris, Philip</creatorcontrib><creatorcontrib>Hawks, Benjamin</creatorcontrib><creatorcontrib>Holzman, Burt</creatorcontrib><creatorcontrib>Knoepfel, Kyle</creatorcontrib><creatorcontrib>Krupa, Jeffrey</creatorcontrib><creatorcontrib>Pedro, Kevin</creatorcontrib><creatorcontrib>Tran, Nhan</creatorcontrib><title>GPU-accelerated machine learning inference as a service for computing in neutrino experiments</title><title>arXiv.org</title><description>Machine learning algorithms are becoming increasingly prevalent and performant in the reconstruction of events in accelerator-based neutrino experiments. These sophisticated algorithms can be computationally expensive. At the same time, the data volumes of such experiments are rapidly increasing. The demand to process billions of neutrino events with many machine learning algorithm inferences creates a computing challenge. We explore a computing model in which heterogeneous computing with GPU coprocessors is made available as a web service. The coprocessors can be efficiently and elastically deployed to provide the right amount of computing for a given processing task. 
With our approach, Services for Optimized Network Inference on Coprocessors (SONIC), we integrate GPU acceleration specifically for the ProtoDUNE-SP reconstruction chain without disrupting the native computing workflow. With our integrated framework, we accelerate the most time-consuming task, track and particle shower hit identification, by a factor of 17. This results in a factor of 2.7 reduction in the total processing time when compared with CPU-only production. For this particular task, only 1 GPU is required for every 68 CPU threads, providing a cost-effective solution.</description><subject>Algorithms</subject><subject>Central processing units</subject><subject>Coprocessors</subject><subject>CPUs</subject><subject>Experiments</subject><subject>Graphics processing units</subject><subject>Inference</subject><subject>Machine learning</subject><subject>Neutrinos</subject><subject>Reconstruction</subject><subject>Web services</subject><subject>Workflow</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNotjVFLwzAURoMgOOZ-gG8BnzvTJLdJHmXoFAb6MB9l3Ka3mtGmM2nHfr6D7enjwOF8jD2UYqktgHjCdArHpRTCLYUG4W7YTCpVFlZLeccWOe-FELIyEkDN2Pf686tA76mjhCM1vEf_GyLxjjDFEH94iC0lip44Zo48UzqGM7RD4n7oD9N4kXikaUwhDpxOB0qhpzjme3bbYpdpcd05276-bFdvxeZj_b563hQI0hVSSWiltWAbU3syIBplwKoW0JXOYlMh1qomUt6bhjTYFgm00ka4ykuj5uzxkj2k4W-iPO72w5Ti-XEntS4rU4F16h8ZXFXX</recordid><startdate>20210322</startdate><enddate>20210322</enddate><creator>Wang, Michael</creator><creator>Yang, Tingjun</creator><creator>Maria Acosta Flechas</creator><creator>Harris, Philip</creator><creator>Hawks, Benjamin</creator><creator>Holzman, Burt</creator><creator>Knoepfel, Kyle</creator><creator>Krupa, Jeffrey</creator><creator>Pedro, Kevin</creator><creator>Tran, Nhan</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20210322</creationdate><title>GPU-accelerated machine learning inference as a service for computing in neutrino experiments</title><author>Wang, Michael ; Yang, Tingjun ; Maria Acosta Flechas ; Harris, Philip ; Hawks, Benjamin ; Holzman, Burt ; Knoepfel, Kyle ; Krupa, Jeffrey ; Pedro, Kevin ; Tran, Nhan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a529-2325f28858d7bce750d37583f5a9198ad6aab3bee3cc7de458fae54347096c273</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Central processing units</topic><topic>Coprocessors</topic><topic>CPUs</topic><topic>Experiments</topic><topic>Graphics processing units</topic><topic>Inference</topic><topic>Machine learning</topic><topic>Neutrinos</topic><topic>Reconstruction</topic><topic>Web services</topic><topic>Workflow</topic><toplevel>online_resources</toplevel><creatorcontrib>Wang, Michael</creatorcontrib><creatorcontrib>Yang, Tingjun</creatorcontrib><creatorcontrib>Maria Acosta Flechas</creatorcontrib><creatorcontrib>Harris, Philip</creatorcontrib><creatorcontrib>Hawks, Benjamin</creatorcontrib><creatorcontrib>Holzman, Burt</creatorcontrib><creatorcontrib>Knoepfel, Kyle</creatorcontrib><creatorcontrib>Krupa, Jeffrey</creatorcontrib><creatorcontrib>Pedro, Kevin</creatorcontrib><creatorcontrib>Tran, Nhan</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering 
Database (Proquest)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Michael</au><au>Yang, Tingjun</au><au>Maria Acosta Flechas</au><au>Harris, Philip</au><au>Hawks, Benjamin</au><au>Holzman, Burt</au><au>Knoepfel, Kyle</au><au>Krupa, Jeffrey</au><au>Pedro, Kevin</au><au>Tran, Nhan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>GPU-accelerated machine learning inference as a service for computing in neutrino experiments</atitle><jtitle>arXiv.org</jtitle><date>2021-03-22</date><risdate>2021</risdate><eissn>2331-8422</eissn><abstract>Machine learning algorithms are becoming increasingly prevalent and performant in the reconstruction of events in accelerator-based neutrino experiments. These sophisticated algorithms can be computationally expensive. At the same time, the data volumes of such experiments are rapidly increasing. The demand to process billions of neutrino events with many machine learning algorithm inferences creates a computing challenge. 
We explore a computing model in which heterogeneous computing with GPU coprocessors is made available as a web service. The coprocessors can be efficiently and elastically deployed to provide the right amount of computing for a given processing task. With our approach, Services for Optimized Network Inference on Coprocessors (SONIC), we integrate GPU acceleration specifically for the ProtoDUNE-SP reconstruction chain without disrupting the native computing workflow. With our integrated framework, we accelerate the most time-consuming task, track and particle shower hit identification, by a factor of 17. This results in a factor of 2.7 reduction in the total processing time when compared with CPU-only production. For this particular task, only 1 GPU is required for every 68 CPU threads, providing a cost-effective solution.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2009.04509</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2021-03
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2441676589
source	Publicly Available Content Database
subjects	Algorithms; Central processing units; Coprocessors; CPUs; Experiments; Graphics processing units; Inference; Machine learning; Neutrinos; Reconstruction; Web services; Workflow
title	GPU-accelerated machine learning inference as a service for computing in neutrino experiments
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-11-05T07%3A57%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=GPU-accelerated%20machine%20learning%20inference%20as%20a%20service%20for%20computing%20in%20neutrino%20experiments&rft.jtitle=arXiv.org&rft.au=Wang,%20Michael&rft.date=2021-03-22&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2009.04509&rft_dat=%3Cproquest%3E2441676589%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a529-2325f28858d7bce750d37583f5a9198ad6aab3bee3cc7de458fae54347096c273%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2441676589&rft_id=info:pmid/&rfr_iscdi=true