Eigenvoice modeling with sparse training data

Bibliographic Details
Published in: IEEE Transactions on Speech and Audio Processing, 2005-05, Vol. 13 (3), p. 345-354
Main Authors: Kenny, P., Boulianne, G., Dumouchel, P.
Format: Article
Language: English
container_end_page 354
container_issue 3
container_start_page 345
container_title IEEE transactions on speech and audio processing
container_volume 13
creator Kenny, P.
Boulianne, G.
Dumouchel, P.
description We derive an exact solution to the problem of maximum likelihood estimation of the supervector covariance matrix used in extended MAP (or EMAP) speaker adaptation and show how it can be regarded as a new method of eigenvoice estimation. Unlike other approaches to the problem of estimating eigenvoices in situations where speaker-dependent training is not feasible, our method enables us to estimate as many eigenvoices from a given training set as there are training speakers. In the limit as the amount of training data for each speaker tends to infinity, it is equivalent to cluster adaptive training.
doi_str_mv 10.1109/TSA.2004.840940
format article
publisher New York, NY: IEEE
coden IESPEJ
eissn 1558-2353
2329-9304
rights 2005 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2005
toplevel peer_reviewed
online_resources
linktohtml https://ieeexplore.ieee.org/document/1420369
tpages 10
identifier ISSN: 1063-6676
ispartof IEEE transactions on speech and audio processing, 2005-05, Vol.13 (3), p.345-354
issn 1063-6676
2329-9290
1558-2353
2329-9304
language eng
recordid cdi_crossref_primary_10_1109_TSA_2004_840940
source IEEE Electronic Library (IEL) Journals
subjects Applied sciences
Cluster adaptive training
Clusters
Covariance matrix
Eigenvalues and eigenfunctions
eigenvoices
Equivalence
Exact sciences and technology
Exact solutions
extended MAP (EMAP)
H infinity control
Hidden Markov models
Infinity
Information, signal and communications theory
Loudspeakers
Mathematical analysis
Mathematical models
Maximum likelihood estimation
Principal component analysis
Signal processing
speaker adaptation
Speech
Speech processing
Speech recognition
Telecommunications and information theory
Testing
Training
Training data
title Eigenvoice modeling with sparse training data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-09-23T02%3A32%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Eigenvoice%20modeling%20with%20sparse%20training%20data&rft.jtitle=IEEE%20transactions%20on%20speech%20and%20audio%20processing&rft.au=Kenny,%20P.&rft.date=2005-05-01&rft.volume=13&rft.issue=3&rft.spage=345&rft.epage=354&rft.pages=345-354&rft.issn=1063-6676&rft.eissn=1558-2353&rft.coden=IESPEJ&rft_id=info:doi/10.1109/TSA.2004.840940&rft_dat=%3Cproquest_cross%3E919934512%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c381t-7bf679c05ec39add694f012181b6678850c59285ee63829492b9cb762cc1630e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=883392330&rft_id=info:pmid/&rft_ieee_id=1420369&rfr_iscdi=true