T-ReX: Optimizing Pattern Search on Time Series

Pattern search is an important class of queries for time series data. Time series patterns often match variable-length segments with a large search space, thereby posing a significant performance challenge. The existing pattern search systems, for example, SQL query engines supporting MATCH_RECOGNIZ...
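To make the search-space claim concrete, the following is a minimal illustrative sketch in Python. It is not T-ReX's algorithm or its MATCH_RECOGNIZE extension. It only shows that a series of n points has n(n+1)/2 contiguous candidate segments, and that a condition stated on the whole segment (here, a hypothetical maximum length) lets a search skip most of them. The function names `candidate_segments` and `find_rises`, and the parameters `min_rise` and `max_len`, are assumptions introduced for this example.

```python
from typing import List, Sequence, Tuple


def candidate_segments(n: int) -> int:
    """Count the contiguous segments [i, j] with i <= j in a series of n points."""
    return n * (n + 1) // 2


def find_rises(
    series: Sequence[float], min_rise: float, max_len: int
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs whose values rise by at least min_rise.

    Only segments spanning at most max_len points are examined, so the
    segment-level length bound prunes the quadratic search space.
    """
    matches: List[Tuple[int, int]] = []
    n = len(series)
    for start in range(n):
        # Pruning step: never look further than max_len points from start.
        last = min(n, start + max_len)
        for end in range(start + 1, last):
            if series[end] - series[start] >= min_rise:
                matches.append((start, end))
    return matches


if __name__ == "__main__":
    data = [1, 2, 5, 3, 8]
    print(candidate_segments(len(data)))
    print(find_rises(data, min_rise=3, max_len=3))
```

With the length bound, the inner loop visits at most `max_len - 1` end points per start instead of all remaining points, which is the kind of saving that a segment-aware query formulation makes possible to express.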

Bibliographic Details
Published in:	Proceedings of the ACM on management of data 2023-06, Vol.1 (2), p.1-26
Main Authors:	Huang, Silu; Zhu, Erkang; Chaudhuri, Surajit; Spiegelberg, Leonhard
Format:	Article
Language:	eng

recordid	cdi_crossref_primary_10_1145_3589275
title	T-ReX: Optimizing Pattern Search on Time Series
format	Article
creator	Huang, Silu ; Zhu, Erkang ; Chaudhuri, Surajit ; Spiegelberg, Leonhard
ispartof	Proceedings of the ACM on management of data, 2023-06, Vol.1 (2), p.1-26
description	Pattern search is an important class of queries for time series data. Time series patterns often match variable-length segments with a large search space, thereby posing a significant performance challenge. Existing pattern search systems, for example SQL query engines supporting MATCH_RECOGNIZE, are ineffective in pruning the large search space of variable-length segments. In many cases, the issue is due to the use of a restrictive query language modeled on time series points and a computational model that limits search space pruning. We built T-ReX to address this problem using two main building blocks: first, a MATCH_RECOGNIZE language extension that exposes the notion of a segment variable and adds new operators, lending itself to better optimization; second, an executor capable of pruning the search space of matches and minimizing total query time using an optimizer. We conducted experiments using 5 real-world datasets and 11 query templates, including those from existing work. T-ReX outperformed an optimized NFA-based pattern search executor by 6x in median query time and an optimized tree-based executor by 19x.
language	eng
source	ACM Digital Library Complete
identifier	ISSN: 2836-6573
fulltext	fulltext
issn	2836-6573 2836-6573
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-05-21T16%3A12%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=T-Rex:%20Optimizing%20Pattern%20Search%20on%20Time%20Series&rft.jtitle=Proceedings%20of%20the%20ACM%20on%20management%20of%20data&rft.au=Huang,%20Silu&rft.date=2023-06-20&rft.volume=1&rft.issue=2&rft.spage=1&rft.epage=26&rft.pages=1-26&rft.issn=2836-6573&rft.eissn=2836-6573&rft_id=info:doi/10.1145/3589275&rft_dat=%3Ccrossref%3E10_1145_3589275%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a515-9394f68eabb81bcf0a59cdb7e4aa89474a1de0ffef14be9a98dfdbfdbb7c10c53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/
container_title	Proceedings of the ACM on management of data
container_volume	1
container_issue	2
container_start_page	1
container_end_page	26
fullrecord	<record><control><sourceid>crossref</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3589275</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_1145_3589275</sourcerecordid><originalsourceid>FETCH-LOGICAL-a515-9394f68eabb81bcf0a59cdb7e4aa89474a1de0ffef14be9a98dfdbfdbb7c10c53</originalsourceid><addsrcrecordid>eNpNj01LAzEURYMoWGrxL2TnKjaZfLuTolYoVHT2w0vmRSPOtCSzUH-9FbsQLtx7NhcOIZeCXwuh9FJq5xurT8iscdIwo608_bfPyaLWd855440U3szIsmXP-HlDt_spD_k7j6_0CaYJy0hfEEp8o7uRtnnAA5aM9YKcJfiouDj2nLT3d-1qzTbbh8fV7YaBFpp56VUyDiEEJ0JMHLSPfbCoAJxXVoHokaeESaiAHrzrUx8OCTYKHrWck6u_21h2tRZM3b7kAcpXJ3j3a9odTeUPsNdFKg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><isCDI>true</isCDI><recordtype>article</recordtype></control><display><type>article</type><title>T-Rex: Optimizing Pattern Search on Time Series</title><source>ACM Digital Library Complete</source><creator>Huang, Silu ; Zhu, Erkang ; Chaudhuri, Surajit ; Spiegelberg, Leonhard</creator><creatorcontrib>Huang, Silu ; Zhu, Erkang ; Chaudhuri, Surajit ; Spiegelberg, Leonhard</creatorcontrib><description>Pattern search is an important class of queries for time series data. Time series patterns often match variable-length segments with a large search space, thereby posing a significant performance challenge. The existing pattern search systems, for example, SQL query engines supporting MATCH_RECOGNIZE, are ineffective in pruning the large search space of variable-length segments. In many cases, the issue is due to the use of a restrictive query language modeled on time series points and a computational model that limits search space pruning. 
We built T-ReX to address this problem using two main building blocks: first, a MATCH_RECOGNIZE language extension that exposes the notion of segment variable and adds new operators, lending itself to better optimization; second, an executor capable of pruning the search space of matches and minimizing total query time using an optimizer. We conducted experiments using 5 real-world datasets and 11 query templates, including those from existing works. T-ReX outperformed an optimized NFA-based pattern search executor by 6x in median query time and an optimized tree-based executor by 19X.</description><identifier>ISSN: 2836-6573</identifier><identifier>EISSN: 2836-6573</identifier><identifier>DOI: 10.1145/3589275</identifier><language>eng</language><ispartof>Proceedings of the ACM on management of data, 2023-06, Vol.1 (2), p.1-26</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a515-9394f68eabb81bcf0a59cdb7e4aa89474a1de0ffef14be9a98dfdbfdbb7c10c53</cites><orcidid>0000-0002-2119-3230 ; 0009-0000-3326-1790 ; 0000-0002-5291-0167 ; 0000-0001-8252-5270</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,787,791,27985,27986</link.rule.ids></links><search><creatorcontrib>Huang, Silu</creatorcontrib><creatorcontrib>Zhu, Erkang</creatorcontrib><creatorcontrib>Chaudhuri, Surajit</creatorcontrib><creatorcontrib>Spiegelberg, Leonhard</creatorcontrib><title>T-Rex: Optimizing Pattern Search on Time Series</title><title>Proceedings of the ACM on management of data</title><description>Pattern search is an important class of queries for time series data. Time series patterns often match variable-length segments with a large search space, thereby posing a significant performance challenge. 
The existing pattern search systems, for example, SQL query engines supporting MATCH_RECOGNIZE, are ineffective in pruning the large search space of variable-length segments. In many cases, the issue is due to the use of a restrictive query language modeled on time series points and a computational model that limits search space pruning. We built T-ReX to address this problem using two main building blocks: first, a MATCH_RECOGNIZE language extension that exposes the notion of segment variable and adds new operators, lending itself to better optimization; second, an executor capable of pruning the search space of matches and minimizing total query time using an optimizer. We conducted experiments using 5 real-world datasets and 11 query templates, including those from existing works. T-ReX outperformed an optimized NFA-based pattern search executor by 6x in median query time and an optimized tree-based executor by 19X.</description><issn>2836-6573</issn><issn>2836-6573</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNpNj01LAzEURYMoWGrxL2TnKjaZfLuTolYoVHT2w0vmRSPOtCSzUH-9FbsQLtx7NhcOIZeCXwuh9FJq5xurT8iscdIwo608_bfPyaLWd855440U3szIsmXP-HlDt_spD_k7j6_0CaYJy0hfEEp8o7uRtnnAA5aM9YKcJfiouDj2nLT3d-1qzTbbh8fV7YaBFpp56VUyDiEEJ0JMHLSPfbCoAJxXVoHokaeESaiAHrzrUx8OCTYKHrWck6u_21h2tRZM3b7kAcpXJ3j3a9odTeUPsNdFKg</recordid><startdate>20230620</startdate><enddate>20230620</enddate><creator>Huang, Silu</creator><creator>Zhu, Erkang</creator><creator>Chaudhuri, Surajit</creator><creator>Spiegelberg, Leonhard</creator><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-2119-3230</orcidid><orcidid>https://orcid.org/0009-0000-3326-1790</orcidid><orcidid>https://orcid.org/0000-0002-5291-0167</orcidid><orcidid>https://orcid.org/0000-0001-8252-5270</orcidid></search><sort><creationdate>20230620</creationdate><title>T-Rex: Optimizing Pattern Search on Time 
Series</title><author>Huang, Silu ; Zhu, Erkang ; Chaudhuri, Surajit ; Spiegelberg, Leonhard</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a515-9394f68eabb81bcf0a59cdb7e4aa89474a1de0ffef14be9a98dfdbfdbb7c10c53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Huang, Silu</creatorcontrib><creatorcontrib>Zhu, Erkang</creatorcontrib><creatorcontrib>Chaudhuri, Surajit</creatorcontrib><creatorcontrib>Spiegelberg, Leonhard</creatorcontrib><collection>CrossRef</collection><jtitle>Proceedings of the ACM on management of data</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Huang, Silu</au><au>Zhu, Erkang</au><au>Chaudhuri, Surajit</au><au>Spiegelberg, Leonhard</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>T-Rex: Optimizing Pattern Search on Time Series</atitle><jtitle>Proceedings of the ACM on management of data</jtitle><date>2023-06-20</date><risdate>2023</risdate><volume>1</volume><issue>2</issue><spage>1</spage><epage>26</epage><pages>1-26</pages><issn>2836-6573</issn><eissn>2836-6573</eissn><abstract>Pattern search is an important class of queries for time series data. Time series patterns often match variable-length segments with a large search space, thereby posing a significant performance challenge. The existing pattern search systems, for example, SQL query engines supporting MATCH_RECOGNIZE, are ineffective in pruning the large search space of variable-length segments. In many cases, the issue is due to the use of a restrictive query language modeled on time series points and a computational model that limits search space pruning. 
We built T-ReX to address this problem using two main building blocks: first, a MATCH_RECOGNIZE language extension that exposes the notion of segment variable and adds new operators, lending itself to better optimization; second, an executor capable of pruning the search space of matches and minimizing total query time using an optimizer. We conducted experiments using 5 real-world datasets and 11 query templates, including those from existing works. T-ReX outperformed an optimized NFA-based pattern search executor by 6x in median query time and an optimized tree-based executor by 19X.</abstract><doi>10.1145/3589275</doi><orcidid>https://orcid.org/0000-0002-2119-3230</orcidid><orcidid>https://orcid.org/0009-0000-3326-1790</orcidid><orcidid>https://orcid.org/0000-0002-5291-0167</orcidid><orcidid>https://orcid.org/0000-0001-8252-5270</orcidid></addata></record>