IJMLC 2018 Vol.8(4): 311-318 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2018.8.4.704

Weighted Frequent Itemset of SNPs in Genome Wide Studies

Sofianita Mutalib, Azlinah Mohamed, Shuzlina Abdul-Rahman, and Norlaila Mustafa
Abstract—A genome-wide association study (GWAS) investigates the correlations between genetic variants and traits. GWAS normally focus on the associations between single-nucleotide polymorphisms (SNPs) and traits such as major human diseases. Generally, a GWAS applies standard statistical tests to each SNP to capture the main genetic effects. However, each association is made between a single SNP and the trait. This study makes use of the whole set of available SNPs in a GWAS and applies a data mining approach to associate more than one SNP with traits. In general, this will complement GWAS and help in understanding complex diseases. This paper presents a proposed frequent itemset mining approach with weights to discover important sets of SNPs that are associated with diabetes. The purpose of using weights is to mine SNPs that might be less frequent but are important in the study of diabetes. The approach consists of three stages: first, reduction of the feature space and testing of the reduced features through classifiers; second, selection of informative SNPs through allelic testing, followed by weight assignment for the selected SNPs; and third, itemset mining and gene analysis. The proposed approach proved effective in helping to discover genes that are associated with the risk of diabetes. These patterns could be used as a set of significant information extracted by mining genetic variants in any particular SNP.

Index Terms—Diabetes, feature selection, frequent itemset mining, single nucleotide polymorphism, weight.
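
The weighted itemset mining stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define how weights enter the support measure, so the sketch assumes one common formulation, in which the weighted support of an itemset is the mean weight of its items multiplied by the fraction of samples that contain the itemset. The function names, the item labels such as `snpA=AG`, the weights, and the threshold are all hypothetical. Because this measure is not guaranteed to be anti-monotone, the sketch enumerates candidates by brute force up to a small size instead of using Apriori-style pruning.

```python
from itertools import combinations


def weighted_support(itemset, transactions, weights):
    """Assumed measure: mean item weight times the fraction of
    transactions (samples) that contain every item in the itemset."""
    if not transactions:
        return 0.0
    count = sum(1 for t in transactions if itemset <= t)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * count / len(transactions)


def mine_weighted_itemsets(transactions, weights, min_wsup, max_size=3):
    """Return {itemset: weighted support} for every itemset of up to
    max_size items whose weighted support is at least min_wsup.
    Brute-force enumeration; suitable only for small item sets."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            wsup = weighted_support(itemset, transactions, weights)
            if wsup >= min_wsup:
                result[itemset] = wsup
    return result


if __name__ == "__main__":
    # Hypothetical toy data: each sample is a set of SNP=genotype items.
    samples = [
        {"snpA=AG", "snpB=TT"},
        {"snpA=AG", "snpB=TT", "snpC=CC"},
        {"snpA=AG"},
        {"snpC=CC"},
    ]
    # Hypothetical weights, e.g. derived from an association test.
    item_weights = {"snpA=AG": 0.5, "snpB=TT": 1.0, "snpC=CC": 0.8}
    found = mine_weighted_itemsets(samples, item_weights, min_wsup=0.37)
    for itemset, wsup in sorted(found.items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), round(wsup, 3))
```

In this toy data, `snpB=TT` occurs in fewer samples than `snpA=AG` but carries a higher weight, so it receives the higher weighted support. This is the effect the abstract attributes to weighting: less frequent but important SNPs are kept.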

Sofianita Mutalib, Azlinah Mohamed, and Shuzlina Abdul-Rahman are with the Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia (e-mail: sofi@tmsk.uitm.edu.my, azlinah@tmsk.uitm.edu.my, and shuzlina@tmsk.uitm.edu.my).
Norlaila Mustafa is with the Medical Department, Faculty of Medicine, Hospital Canselor Tuanku Muhriz, Jalan Yaacob Latif, Bandar Tun Razak, Universiti Kebangsaan Malaysia, 56000 Cheras, Kuala Lumpur, Malaysia (e-mail: norlaila@ppukm.ukm.edu.my).


Cite: Sofianita Mutalib, Azlinah Mohamed, Shuzlina Abdul-Rahman, and Norlaila Mustafa, "Weighted Frequent Itemset of SNPs in Genome Wide Studies," International Journal of Machine Learning and Computing, vol. 8, no. 4, pp. 311-318, 2018.
