Balancing false alarms and hits in Spoken Term Detection

Bibliographic Details
Main Authors: Parada, Carolina; Sethy, Abhinav; Ramabhadran, Bhuvana
Format: Conference Proceeding
Language: English
Description
Summary: This paper presents methods to improve retrieval of Out-Of-Vocabulary (OOV) terms in a Spoken Term Detection (STD) system. We demonstrate that automated tagging of OOV regions helps to reduce false alarms while incorporating phonetic confusability increases the hits. Additional features that boost the probability of a hit in accordance with the number of neighboring hits for the same query and query-length normalization also improve the overall performance of the spoken-term detection system. We show that these methods can be combined effectively to provide a relative improvement of 21% in Average Term Weighted Value (ATWV) on a 100-hour corpus with 1290 OOV-only queries and 2% relative on the NIST 2006 STD task, where only 16 of the 1107 queries were OOV terms. Lastly, we present results to show that the proposed methods are general enough to work well in query-by-example based spoken-term detection, and in mismatched situations when the representation of the index being searched through and the queries are not generated by the same system.
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2010.5494966
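
The summary reports its gains as relative improvements in Average Term Weighted Value (ATWV). As a reading aid, the sketch below shows how that metric is commonly computed under the NIST 2006 STD evaluation definition: one minus the per-term average of the miss probability plus a weighted false-alarm probability, with weight 999.9 and one non-target trial per second of speech. The names `TermCounts`, `atwv` and `relative_improvement` are hypothetical, the counts are assumed to come from a system's hard decisions at its chosen threshold, and nothing here is taken from the paper's own implementation.

```python
from dataclasses import dataclass

# Weight from the NIST 2006 STD evaluation plan:
# beta = (C/V) * (1/Pr_term - 1) with C/V = 0.1 and Pr_term = 1e-4.
BETA = 999.9


@dataclass
class TermCounts:
    """Per-query detection counts (hypothetical container for this sketch)."""
    n_true: int      # reference occurrences of the term in the corpus
    n_correct: int   # detections that match a reference occurrence (hits)
    n_spurious: int  # detections with no matching reference (false alarms)


def atwv(terms, speech_seconds, beta=BETA):
    """Average 1 - (P_miss + beta * P_FA) over terms that occur in the reference."""
    scored = [t for t in terms if t.n_true > 0]  # terms with no occurrences are skipped
    if not scored:
        raise ValueError("no term has reference occurrences")
    cost = 0.0
    for t in scored:
        p_miss = 1.0 - t.n_correct / t.n_true
        # Non-target trials: one per second of speech, minus the true occurrences.
        p_fa = t.n_spurious / (speech_seconds - t.n_true)
        cost += p_miss + beta * p_fa
    return 1.0 - cost / len(scored)


def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, as a fraction of the baseline."""
    return (improved - baseline) / baseline
```

Because the false-alarm term is multiplied by roughly 1000, a single spurious detection can cost about as much as a sizeable fraction of a missed term, which is why the summary treats reducing false alarms and increasing hits as a balance. A call such as `relative_improvement(0.50, 0.605)` uses made-up scores purely to show how a figure like "21% relative" is formed.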