Loading…
Discriminative Feature Spamming Technique for Roman Urdu Sentiment Analysis
Term weighting is one of the most commonly used approaches, which works by assigning weights to terms, that aims to improve the performance of information retrieval or text categorization tasks. In this paper, we present a novel term weighting technique, called discriminative feature spamming techni...
Saved in:
Published in: | IEEE access 2019, Vol.7, p.47991-48002 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Term weighting is one of the most commonly used approaches, which works by assigning weights to terms, that aims to improve the performance of information retrieval or text categorization tasks. In this paper, we present a novel term weighting technique, called discriminative feature spamming technique (DFST), which identifies distinctive terms, based on a term utility criteria (TUC), and then spams them to increase their discriminative power. The experimental results show that the DFST outperformed a set of time-tested term weighting schemes, from the information retrieval field. All the experiments were performed on the largest ever Roman Urdu (RU) dataset of 11000 reviews, which was collected and annotated for this work. In addition, a custom tokenizer was built, which further improved classification accuracy. A cross-scheme comparison was performed, which showed that the results obtained by using the newly proposed DFST, were statistically significant and better than previous approaches. |
---|---|
ISSN: | 2169-3536 2169-3536 |
DOI: | 10.1109/ACCESS.2019.2908420 |