Two phase estimation method for multi-classifying real life tweets
International Journal of Web Information Systems
ISSN: 1744-0084
Article publication date: 11 November 2014
Abstract
Purpose
This paper proposes a multi-label classification method that estimates appropriate aspects for unknown tweets using a two-phase estimation method. Many Twitter users share daily events and opinions, and some useful comments concern real-life aspects such as eating, traffic and weather. A post such as “The train is not coming” is categorized in the Traffic aspect, whereas a tweet such as “The train is delayed by heavy rain” is categorized in both the Traffic and Weather aspects.
Design/methodology/approach
The proposed method consists of two phases. In the first, many topics are extracted from a large collection of tweets using Latent Dirichlet Allocation (LDA). In the second, associations between the many topics and the fewer aspects are built using a small set of labeled tweets. The aspect scores for tweets are calculated using these associations, based on the extracted terms. Appropriate aspects are assigned to unknown tweets by averaging the aspect scores.
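The second phase described above can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the term-to-topic probabilities in TERM_TOPICS stand in for the output of an LDA run, the association weights are simple normalized sums, and the threshold used to turn averaged scores into labels is hypothetical. This is not the authors' implementation.

```python
from collections import defaultdict

# Assumed phase-1 output: P(topic | term) for each extracted term.
# In practice these values would come from an LDA model; they are invented here.
TERM_TOPICS = {
    "train":   {0: 0.9, 1: 0.1},
    "delayed": {0: 0.8, 1: 0.2},
    "heavy":   {0: 0.3, 1: 0.7},
    "rain":    {0: 0.1, 1: 0.9},
}

def build_associations(labeled_tweets, term_topics):
    """Build topic-to-aspect weights from a small set of labeled tweets.

    Each labeled tweet is a (terms, aspects) pair. Topic mass from every
    term is credited to each of the tweet's aspects, then normalized per
    topic (an assumed weighting scheme).
    """
    raw = defaultdict(lambda: defaultdict(float))
    for terms, aspects in labeled_tweets:
        for term in terms:
            for topic, prob in term_topics.get(term, {}).items():
                for aspect in aspects:
                    raw[topic][aspect] += prob
    assoc = {}
    for topic, weights in raw.items():
        total = sum(weights.values())
        assoc[topic] = {aspect: w / total for aspect, w in weights.items()}
    return assoc

def aspect_scores(terms, term_topics, assoc):
    """Average the per-term aspect scores over the terms known to the model."""
    known = [t for t in terms if t in term_topics]
    if not known:
        return {}
    totals = defaultdict(float)
    for term in known:
        for topic, prob in term_topics[term].items():
            for aspect, weight in assoc.get(topic, {}).items():
                totals[aspect] += prob * weight
    return {aspect: s / len(known) for aspect, s in totals.items()}

def label_tweet(terms, term_topics, assoc, threshold=0.4):
    """Assign every aspect whose averaged score reaches the (hypothetical) threshold."""
    scores = aspect_scores(terms, term_topics, assoc)
    return sorted(a for a, s in scores.items() if s >= threshold)

labeled = [
    (["train", "delayed"], ["Traffic"]),
    (["heavy", "rain"], ["Weather"]),
]
ASSOC = build_associations(labeled, TERM_TOPICS)
```

With these toy values, label_tweet(["train"], TERM_TOPICS, ASSOC) yields only the Traffic aspect, while a tweet containing all four terms receives both Traffic and Weather, mirroring the two example tweets in the Purpose section.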
Findings
Experimental evaluations using a large number of actual tweets demonstrate the high efficiency of the proposed multi-label classification method. It is confirmed that high F-measure aspects are strongly associated with topics that have high relevance. Low F-measure aspects are associated with topics that are connected to many other aspects.
Originality/value
The proposed method features two-phase semi-supervised learning. Many topics are extracted using an unsupervised learning model called LDA. Associations among many topics and fewer aspects are built using labeled tweets.
Acknowledgements
This work was supported by a Grant-in-Aid for Scientific Research No. 25280110.
Citation
Yamamoto, S. and Satoh, T. (2014), "Two phase estimation method for multi-classifying real life tweets", International Journal of Web Information Systems, Vol. 10 No. 4, pp. 378-393. https://doi.org/10.1108/IJWIS-04-2014-0013
Publisher
Emerald Group Publishing Limited
Copyright © 2014, Emerald Group Publishing Limited