A conditional random fields model for overlapping ambiguity resolution in Chinese word segmentation
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Overlapping ambiguity is one kind of ambiguity in Chinese word segmentation. Until now, research on overlapping ambiguity has focused on 3-character overlapping ambiguity strings. In this paper, the distribution and forms of overlapping ambiguity strings are examined empirically. To handle overlapping ambiguity strings of different forms at the same time, a conditional random fields model is used. Several features for overlapping ambiguity resolution are explored, including component independency probability, component co-occurrence probability, the in-word probability of a component, and string structure. Experimental results show that precision reaches 93.81% in the open test. |
---|---|
DOI: | 10.1109/GRC.2009.5255092 |
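
For readers unfamiliar with the term used in the summary, the sketch below illustrates what an overlapping ambiguity string is: a stretch of text in which two dictionary words share characters, so that more than one segmentation is possible. It is not the paper's conditional random fields model and does not compute any of the features listed above. The function name `overlapping_pairs`, the `max_len` parameter, and the four-word toy lexicon are assumptions made purely for illustration.

```python
def overlapping_pairs(text, lexicon, max_len=4):
    """Return (left_word, right_word, start) triples for crossing lexicon words.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Collect every span of the text that is a multi-character lexicon word.
    spans = [
        (i, j)
        for i in range(len(text))
        for j in range(i + 2, min(len(text), i + max_len) + 1)
        if text[i:j] in lexicon
    ]
    # Two words overlap when the second starts strictly inside the first
    # and ends strictly after it.
    return [
        (text[i:j], text[k:l], i)
        for (i, j) in spans
        for (k, l) in spans
        if i < k < j < l
    ]


# Toy lexicon (an assumption): each adjacent pair of words shares one character.
lexicon = {"结合", "合成", "成分", "分子"}
pairs = overlapping_pairs("结合成分子", lexicon)
print(pairs)
```

With this toy lexicon the 5-character string 结合成分子 contains a chain of three overlapping word pairs, which shows why overlapping ambiguity strings are not limited to the 3-character form.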