
IJMLC 2019 Vol.9(1): 103-107 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2019.9.1.772

FICOBU: Filipino WordNet Construction Using Decision Tree and Language Modeling

Ria Ambrocio Sagum, Aldrin D. Ramos, and Monique T. Llanes
Abstract—This paper presents an approach to constructing a Filipino WordNet: a semi-supervised learning method that uses a decision tree and language modeling and takes advantage of information found on the web. The resulting resource is intended to help future NLP researchers working on the Filipino language. The approach uses words from a dictionary as preliminary data and as seeds for a search engine that starts crawling the WWW. To decide whether a word belongs to the Filipino language, the word first passes through the Code-Switching Points Module (CSPD). CSPD scores the word using the frequency counts of word bigrams and unigrams from language models trained on an existing, available corpus. After scoring, a Filipino stemmer extracts the stem of the word and checks whether the stem is part of the language. Once a word has been scored and stemmed, the archive evaluates whether the word is Filipino. To test the accuracy of the system, we collected articles from around the web and divided them into two groups: Plain Filipino and Bilingual. The results show that the F-measure for the Plain Filipino category ranges from 65.65% to 96.85%, with an average of 85.64%, while for the Bilingual category it ranges from 60% to 100%, with an average of 88.17%.

Index Terms—Corpus building, information retrieval, data and web mining, lexicography.
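
The scoring step in the abstract (word bigram and unigram frequency counts from trained language models) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpora, the add-one smoothing, the two-language comparison, and the names `train_counts`, `bigram_log_prob`, and `label_word` are all assumptions made for illustration.

```python
from collections import Counter
import math


def train_counts(sentences):
    """Collect unigram and bigram frequency counts from tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        # "<s>" marks the sentence start so the first word also has a context.
        padded = ["<s>"] + [t.lower() for t in tokens]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_log_prob(prev, word, model):
    """Add-one smoothed log P(word | prev); the smoothing is an assumption."""
    unigrams, bigrams = model
    vocab_size = len(unigrams) + 1  # +1 reserves mass for unseen words
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def label_word(prev, word, models):
    """Return the language whose model gives the word the highest score."""
    prev, word = prev.lower(), word.lower()
    return max(models, key=lambda lang: bigram_log_prob(prev, word, models[lang]))


# Toy corpora, invented for illustration only (hypothetical data).
filipino = [["ang", "bata", "ay", "kumain", "ng", "mangga"],
            ["ang", "aso", "ay", "tumakbo"]]
english = [["the", "child", "ate", "the", "mango"],
           ["the", "dog", "ran"]]

models = {"filipino": train_counts(filipino), "english": train_counts(english)}
```

With these toy counts, `label_word("ang", "bata", models)` selects the Filipino model and `label_word("the", "dog", models)` selects the English one. A real system would train on a large corpus and, as the abstract describes, combine this score with the stemmer output before the final decision.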

R. A. Sagum is with the Department of Computer Science, College of Computer and Information Sciences (CCIS), Polytechnic University of the Philippines, Philippines (e-mail: rasagum@pup.edu.ph).
A. D. Ramos and M. T. Llanes are with the Polytechnic University of the Philippines, Philippines.

Cite: Ria Ambrocio Sagum, Aldrin D. Ramos, and Monique T. Llanes, "FICOBU: Filipino WordNet Construction Using Decision Tree and Language Modeling," International Journal of Machine Learning and Computing, vol. 9, no. 1, pp. 103-107, 2019.
