An investigation of K‐means clustering to high and multi‐dimensional biological data

Barileé B. Baridam (Department of Computer Science, University of Pretoria, Pretoria, South Africa)

M. Montaz Ali (School of Computational and Applied Mathematics, Witwatersrand University, Johannesburg, South Africa)

Kybernetes

ISSN: 0368-492X

Article publication date: 19 April 2013

Abstract

Purpose

The K‐means clustering algorithm has been intensely researched owing to its simplicity of implementation and usefulness in the clustering task. However, its performance has also been criticised, in particular for requiring the value of K before the actual clustering task. Previous research indicates that providing the number of clusters a priori does not in any way assist in the production of good quality clusters. The authors' investigations in this paper also confirm this finding. The purpose of this paper is to investigate further the usefulness of K‐means clustering for high and multi‐dimensional data by applying it to biological sequence data.

Design/methodology/approach

The authors suggest a scheme that maps the high‐dimensional data into low dimensions, and then show that the K‐means algorithm with this pre‐processor produces good quality, compact and well‐separated clusters of the biological data mapped in low dimensions. For the purpose of clustering, a character‐to‐numeric conversion was conducted to transform the nucleic/amino acid symbols to numeric values.
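The pipeline described above (symbol-to-number conversion, a mapping into low dimensions, then K‐means) can be illustrated with a minimal sketch. The abstract does not give the authors' numeric codes or their mapping scheme, so everything here is an assumption: the `NUCLEOTIDE_CODES` table is hypothetical, block averaging in `reduce_dims` is only a stand-in for the paper's pre‐processor, and `kmeans` is a plain Lloyd-style iteration rather than the authors' implementation.

```python
import random

# Hypothetical character-to-numeric codes; the abstract does not state
# the values the authors actually used.
NUCLEOTIDE_CODES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def encode(seq):
    """Convert a nucleotide string into a list of numeric codes."""
    return [NUCLEOTIDE_CODES[ch] for ch in seq.upper()]


def reduce_dims(vec, n_blocks):
    """Stand-in low-dimensional mapping (an assumption, not the paper's
    scheme): average the codes over n_blocks contiguous blocks.
    Requires len(vec) >= n_blocks."""
    size = len(vec) / n_blocks
    reduced = []
    for b in range(n_blocks):
        block = vec[int(b * size):int((b + 1) * size)]
        reduced.append(sum(block) / len(block))
    return reduced


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means; K must be supplied in advance."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Toy sequences, invented for illustration only.
sequences = ["AAAAAAAA", "AAACAAAA", "AACAAAAA",
             "TTTTTTTT", "TTTGTTTT", "TTGTTTTT"]
points = [reduce_dims(encode(s), n_blocks=2) for s in sequences]
labels, centroids = kmeans(points, k=2)
print(labels)
```

On this toy input the A-rich and T-rich sequences fall into separate clusters. Which cluster receives which label depends on the random initialisation.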

Findings

A preprocessing technique has been suggested.

Originality/value

Conceptually this is a new paper with new results.

Keywords

Citation

Baridam, B.B. and Montaz Ali, M. (2013), "An investigation of K‐means clustering to high and multi‐dimensional biological data", Kybernetes, Vol. 42 No. 4, pp. 614-627. https://doi.org/10.1108/K-02-2013-0028

Publisher

Emerald Group Publishing Limited
