Abstract—Most proposed protein function prediction methods rely on sequence or structural similarity between proteins and do not take into account the semantic similarity that can be extracted from protein knowledge databases such as the Gene Ontology. Many studies have shown that protein complexes and functional modules can be identified effectively by clustering a protein interaction network (PIN). A significant number of proteins in such PINs remain uncharacterized, and predicting their function remains a major challenge in systems biology. In this paper we present a “semantic driven” clustering approach to protein function prediction that uses both semantic similarity metrics and the whole network topology of a PIN. We apply k-medoids clustering, using several semantic similarity metrics as a weight factor in the clustering distance matrix. Protein functions are then assigned based on cluster information. The results show an improvement over a standard non-semantic similarity metric.
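As a rough illustration of the pipeline outlined in the abstract, the following minimal sketch clusters a toy network with k-medoids on a semantically weighted distance matrix and then labels unannotated proteins by majority vote within their cluster. Everything specific here is an assumption made for illustration and is not taken from the paper: the weighting rule `topo_dist * (1 - sem_sim)`, the majority-vote assignment, the simple alternating medoid update, the function names, and the toy data.

```python
import numpy as np
from collections import Counter


def semantic_weighted_distance(topo_dist, sem_sim):
    """Assumed weighting: shrink topological distance for semantically similar pairs."""
    d = topo_dist * (1.0 - sem_sim)
    np.fill_diagonal(d, 0.0)  # a protein is at distance 0 from itself
    return d


def k_medoids(dist, k, max_iter=100, seed=0):
    """Simple alternating k-medoids on a precomputed distance matrix."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(max_iter):
        # Assign every point to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # New medoid: member with the smallest total distance to the others.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


def assign_functions(labels, annotations):
    """Assumed rule: give each unannotated protein its cluster's most common function."""
    predictions = {}
    for i, known in enumerate(annotations):
        if known:
            continue
        votes = Counter(
            f
            for j, fs in enumerate(annotations)
            if fs and labels[j] == labels[i]
            for f in fs
        )
        predictions[i] = votes.most_common(1)[0][0] if votes else None
    return predictions


# Toy data (hypothetical): six proteins forming two well-separated groups.
group = np.array([0, 0, 0, 1, 1, 1])
same = group[:, None] == group[None, :]
topo_dist = np.where(same, 1.0, 3.0)  # e.g. shortest-path lengths in the PIN
sem_sim = np.where(same, 0.8, 0.1)    # e.g. a GO-based similarity in [0, 1]

dist = semantic_weighted_distance(topo_dist, sem_sim)
medoids, labels = k_medoids(dist, k=2)

annotations = [{"kinase"}, {"kinase"}, None, {"transport"}, None, {"transport"}]
predictions = assign_functions(labels, annotations)
```

In this sketch the semantic similarity only rescales distances that come from the network topology, so the choice of similarity metric changes the clustering without changing the clustering algorithm itself.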
Index Terms—Protein clustering, gene ontology, semantic similarity.
The authors are with Ss. Cyril and Methodius University, Faculty of Computer Science and Engineering, 1000 Skopje, Macedonia (e-mail: email@example.com, firstname.lastname@example.org, email@example.com).
Cite: Ilinka Ivanoska, Kire Trivodaliev, and Slobodan Kalajdziski, "Protein Function Prediction Using Semantic Driven K-Medoids Clustering Algorithm," International Journal of Machine Learning and Computing, vol. 4, no. 1, pp. 52-56, 2014.