Classifying information sender of web documents

Yoshikiyo Kato (National Institute of Information and Communications Technology, Kyoto, Japan)

Sadao Kurohashi (Kyoto University, Kyoto, Japan and National Institute of Information and Communications Technology, Kyoto, Japan)

Kentaro Inui (Nara Institute of Science and Technology, Nara, Japan and National Institute of Information and Communications Technology, Kyoto, Japan)

Internet Research

ISSN: 1066-2243

Article publication date: 4 April 2008

Abstract

Purpose

To develop a method for classifying the information sender of web documents, a task that constitutes an important part of information credibility analysis.

Design/methodology/approach

A machine learning approach was employed. About 2,000 human-annotated web documents were prepared for training and evaluation. The classification model was based on a support vector machine, and the features used for classification included the title and URL of each document, as well as information from the top page.
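To make the feature description above more concrete, the following is a minimal sketch of how a document's title, URL, and top-page information might be turned into a sparse feature set suitable for a linear classifier such as a support vector machine. The function names, the feature prefixes (`title=`, `host=`, `path=`, `top=`), the use of the top page's title as "information from the top page", and the simple alphanumeric tokenizer are all assumptions made for illustration; they are not the authors' feature set, and Japanese text would need a proper morphological analyzer.

```python
import re
from urllib.parse import urlparse


def tokenize(text):
    """Lowercase the text and split it on runs of non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def extract_features(title, url, top_page_title):
    """Build a sparse binary feature dict (hypothetical scheme, not the paper's)."""
    parsed = urlparse(url)
    features = {}
    # Each source gets its own prefix so that, for example, "faq" in the
    # title and "faq" in the URL path are treated as distinct features.
    for tok in tokenize(title):
        features["title=" + tok] = 1
    for tok in tokenize(parsed.netloc):
        features["host=" + tok] = 1
    for tok in tokenize(parsed.path):
        features["path=" + tok] = 1
    for tok in tokenize(top_page_title):
        features["top=" + tok] = 1
    return features


# Illustrative input only; the URL and titles are made up.
example = extract_features(
    "Product FAQ",
    "http://www.example.co.jp/support/faq.html",
    "Example Corp. official site",
)
```

A feature dict of this kind could then be vectorized and passed to any SVM implementation, together with the human-annotated sender category as the label.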

Findings

With a relatively small set of features, the proposed method achieved over 50 per cent accuracy.

Research limitations/implications

Some of the information sender categories were found to be more difficult to classify than others. This is due to the subjective nature of those categories, and further refinement of the category definitions is needed.

Practical implications

When combined with opinion/sentiment analysis techniques, information sender classification allows more profound analysis based on interactions between opinions and senders. Such analysis forms a basis of information credibility analysis.

Originality/value

This study formulated the problem of information sender classification. It proposed a method which achieves moderate performance. It also identified some of the issues related to information sender classification.

Keywords

Citation

Kato, Y., Kurohashi, S. and Inui, K. (2008), "Classifying information sender of web documents", Internet Research, Vol. 18 No. 2, pp. 191-203. https://doi.org/10.1108/10662240810862248

Publisher

Emerald Group Publishing Limited
