Labeling Sequential Data Based on Word Representations and Conditional Random Fields

Home > Archive > 2015 > Volume 5 Number 6 (Dec. 2015) >

IJMLC 2015 Vol.5(6): 439-444 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2015.5.6.548

Xiuying Wang, Bo Xu, Changliang Li, and Wendong Ge

Abstract—Most natural language processing tasks, including part-of-speech tagging, chunking, and named entity recognition, can be seen as tasks that assign labels to words. Many existing methods, including hidden Markov models, maximum entropy Markov models, and conditional random fields, have been applied to label sequential data, but they rely on the amount of training data and cannot handle out-of-lexicon words. In this paper, we propose a new method based on word representations and conditional random fields to solve these problems. We preprocess the input features by computing word similarity from word representations, which capture the semantic similarity of words on the basis of vast amounts of unlabeled training data, and then use the preprocessed features as input features to train a conditional random field model. The experimental results show that our approach improves labeling accuracy over the existing methods.

Index Terms—Conditional random fields, label sequential data, word representations, word similarity.
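
The abstract does not spell out the preprocessing step, so the following is only a minimal sketch of one plausible reading: an out-of-lexicon word is replaced by the most similar in-lexicon word, measured by cosine similarity between word vectors, before the features are passed to a conditional random field. The function names (`cosine`, `nearest_in_lexicon`, `preprocess`), the toy vectors, and the toy lexicon are all hypothetical and are not taken from the paper; CRF training itself is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_in_lexicon(word, vectors, lexicon):
    """Map an out-of-lexicon word to its most similar in-lexicon word.

    Words already in the lexicon, and words without a vector, are returned unchanged.
    """
    if word in lexicon or word not in vectors:
        return word
    candidates = [w for w in lexicon if w in vectors]
    if not candidates:
        return word
    return max(candidates, key=lambda w: cosine(vectors[word], vectors[w]))


def preprocess(sentence, vectors, lexicon):
    """Replace each out-of-lexicon token before building features for a sequence labeler."""
    return [nearest_in_lexicon(w, vectors, lexicon) for w in sentence]


# Hypothetical toy data: in practice the vectors would be learned from unlabeled text
# and the lexicon would be the vocabulary of the labeled training set.
TOY_VECTORS = {
    "london": [0.9, 0.1, 0.0],
    "paris": [0.8, 0.2, 0.1],
    "runs": [0.0, 0.9, 0.3],
    "walks": [0.1, 0.8, 0.4],
}
TRAIN_LEXICON = {"london", "runs", "in", "she"}

print(preprocess(["she", "walks", "in", "paris"], TOY_VECTORS, TRAIN_LEXICON))
# prints ['she', 'runs', 'in', 'london']
```

In this sketch the unseen words "walks" and "paris" are mapped to "runs" and "london", so a labeler trained only on the lexicon words would still see familiar input features.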

The authors are with the Institute of Automation, Chinese Academy of Sciences, China (e-mail: xiuying.wang@ia.ac.cn).


Cite: Xiuying Wang, Bo Xu, Changliang Li, and Wendong Ge, "Labeling Sequential Data Based on Word Representations and Conditional Random Fields," International Journal of Machine Learning and Computing, vol. 5, no. 6, pp. 439-444, 2015.


General Information

E-ISSN: 2972-368X
Abbreviated Title: Int. J. Mach. Learn.
Frequency: Quarterly
DOI: 10.18178/IJML
Editor-in-Chief: Dr. Lin Huang
Executive Editor: Ms. Cherry L. Chen
Abstracting/Indexing: Inspec (IET), Google Scholar, Crossref, ProQuest, Electronic Journals Library, CNKI.
E-mail: ijml@ejournal.net
