Exploring the impact of short-text complexity and structure on its quality in social media

Jamal Al Qundus (AG Corporate Semantic Web, Institute of Computer Science, Free University of Berlin, Berlin, Germany)
Adrian Paschke (AG Corporate Semantic Web, Institute of Computer Science, Free University of Berlin, Berlin, Germany)
Shivam Gupta (Montpellier Research in Management, Montpellier Business School, Montpellier, France)
Ahmad M. Alzouby (Department of Architecture, Jordan University of Science and Technology, Irbid, Jordan)
Malik Yousef (Zefat Academic College, Zefat, Israel)

Journal of Enterprise Information Management

ISSN: 1741-0398

Article publication date: 26 June 2020

Issue publication date: 7 December 2020

Abstract

Purpose

The purpose of this paper is to explore the extent to which the quality of social media short text can be investigated without extensions, and which predictors, if any, of such short text lead readers to trust its content.

Design/methodology/approach

The paper applies a trust model to classify data collections, based on their metadata, into four classes: Very Trusted, Trusted, Untrusted and Very Untrusted. The data are collected from the online communities Genius and Stack Overflow. To evaluate short texts in terms of their trust levels, the authors conducted two investigations: (1) a natural language processing (NLP) approach to extract relevant features (i.e. part-of-speech tags and various readability indexes), for which the authors report relatively good performance; and (2) a machine learning technique, more precisely a random forest (RF) classifier using a bag-of-words (BoW) model.
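The two kinds of text-only features named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the tokenizer is a simplification, and the Automated Readability Index is used only as one example of a readability index, since the abstract does not say which indexes were computed. The sketch covers feature extraction only; the RF classifier that would consume these vectors is omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens.

    Assumption: a simple regex tokenizer stands in for a real NLP pipeline.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(texts):
    """Return (vocabulary, vectors) with one term-count vector per text.

    The vocabulary is the sorted set of all tokens seen in `texts`;
    each vector holds the count of every vocabulary term in one text.
    """
    tokenized = [tokenize(t) for t in texts]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


def automated_readability_index(text):
    """Compute ARI = 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43.

    Characters are letters and digits only; sentences are split on '.', '!'
    and '?'. Returns 0.0 for text with no words or no sentences.
    """
    words = tokenize(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    # Illustrative short texts, not taken from the Genius or Stack Overflow data.
    samples = ["Use a list. Lists are fast.", "use a dict"]
    vocab, vecs = bag_of_words(samples)
    print(vocab)
    print(vecs)
    print([round(automated_readability_index(s), 2) for s in samples])
```

Each short text is thus represented only by what it contains: a count vector over the shared vocabulary plus a readability score, with no external text or metadata added.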

Findings

The investigation of the RF classifier using BoW shows promising intermediate results (62% accuracy on average across both online communities) in identifying the short-text quality that leads to trust.

Practical implications

As social media becomes an increasingly attractive source of information, mostly provided in the form of short texts, businesses (e.g. in search engines for smart data) can filter content without having to apply complex approaches and continue to work with information that is considered more trustworthy.

Originality/value

Short-text classifications with regard to a criterion (e.g. quality, readability) usually extend the text with an external source or with its metadata. Such an enhancement either changes the original text, when additional text from an external source is used, or requires text metadata that is not always available. The originality of this study lies in addressing the challenge of investigating the quality of short text (i.e. social media text) without extending or modifying it using external sources, since such modification alters the text and distorts the results of the investigation.

Citation

Al Qundus, J., Paschke, A., Gupta, S., Alzouby, A.M. and Yousef, M. (2020), "Exploring the impact of short-text complexity and structure on its quality in social media", Journal of Enterprise Information Management, Vol. 33 No. 6, pp. 1443-1466. https://doi.org/10.1108/JEIM-06-2019-0156

Publisher: Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
