Eigenvoice modeling with sparse training data
We derive an exact solution to the problem of maximum likelihood estimation of the supervector covariance matrix used in extended MAP (or EMAP) speaker adaptation and show how it can be regarded as a new method of eigenvoice estimation. Unlike other approaches to the problem of estimating eigenvoice...
Published in: | IEEE transactions on speech and audio processing 2005-05, Vol.13 (3), p.345-354 |
---|---|
Main Authors: | Kenny, P. ; Boulianne, G. ; Dumouchel, P. |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c381t-7bf679c05ec39add694f012181b6678850c59285ee63829492b9cb762cc1630e3 |
---|---|
cites | cdi_FETCH-LOGICAL-c381t-7bf679c05ec39add694f012181b6678850c59285ee63829492b9cb762cc1630e3 |
container_end_page | 354 |
container_issue | 3 |
container_start_page | 345 |
container_title | IEEE transactions on speech and audio processing |
container_volume | 13 |
creator | Kenny, P. ; Boulianne, G. ; Dumouchel, P. |
description | We derive an exact solution to the problem of maximum likelihood estimation of the supervector covariance matrix used in extended MAP (or EMAP) speaker adaptation and show how it can be regarded as a new method of eigenvoice estimation. Unlike other approaches to the problem of estimating eigenvoices in situations where speaker-dependent training is not feasible, our method enables us to estimate as many eigenvoices from a given training set as there are training speakers. In the limit as the amount of training data for each speaker tends to infinity, it is equivalent to cluster adaptive training. |
doi_str_mv | 10.1109/TSA.2004.840940 |
format | article |
publisher | New York, NY: IEEE |
rights | 2005 INIST-CNRS ; Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2005 |
coden | IESPEJ |
short_title | T-SAP |
total_pages | 10 |
identifier | ISSN: 1063-6676 |
ispartof | IEEE transactions on speech and audio processing, 2005-05, Vol.13 (3), p.345-354 |
issn | ISSN: 1063-6676, 2329-9290 ; EISSN: 1558-2353, 2329-9304 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TSA_2004_840940 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Applied sciences ; Cluster adaptive training ; Clusters ; Covariance matrix ; Eigenvalues and eigenfunctions ; eigenvoices ; Equivalence ; Exact sciences and technology ; Exact solutions ; extended MAP (EMAP) ; H infinity control ; Hidden Markov models ; Infinity ; Information, signal and communications theory ; Loudspeakers ; Mathematical analysis ; Mathematical models ; Maximum likelihood estimation ; Principal component analysis ; Signal processing ; speaker adaptation ; Speech ; Speech processing ; Speech recognition ; Telecommunications and information theory ; Testing ; Training ; Training data |
title | Eigenvoice modeling with sparse training data |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-09-23T02%3A32%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Eigenvoice%20modeling%20with%20sparse%20training%20data&rft.jtitle=IEEE%20transactions%20on%20speech%20and%20audio%20processing&rft.au=Kenny,%20P.&rft.date=2005-05-01&rft.volume=13&rft.issue=3&rft.spage=345&rft.epage=354&rft.pages=345-354&rft.issn=1063-6676&rft.eissn=1558-2353&rft.coden=IESPEJ&rft_id=info:doi/10.1109/TSA.2004.840940&rft_dat=%3Cproquest_cross%3E919934512%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c381t-7bf679c05ec39add694f012181b6678850c59285ee63829492b9cb762cc1630e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=883392330&rft_id=info:pmid/&rft_ieee_id=1420369&rfr_iscdi=true |