IJMLC 2022 Vol.12(2): 57-62 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2022.12.2.1079

A Neural Language Model with Attention Mechanism Based on Convolutional Neural Network

Seongik Park, Ki Yong Lee, and Yanggon Kim

Abstract—The convergence of artificial intelligence (AI) technology and natural language processing (NLP) has rapidly increased the demand for analysis of natural language, which involves many ambiguities not present in formal languages. For this reason, the language model (LM), a statistical approach, has played a key role in this area. Recently, deep learning, which applies complex deep neural networks to machine learning tasks, has been applied to language modeling and has achieved more remarkable results than traditional language models. One of the important techniques behind the success of neural network-based LMs is the attention mechanism. The attention mechanism makes a neural network pay attention to specific words in the input sentence when generating the output words. However, although the attention mechanism has improved the performance of many neural network models, it requires a large number of parameters to achieve state-of-the-art performance. This is because the attention mechanism encodes the context of a word by simply accumulating the outputs from the network for all the input words, which may cause information loss. To compensate for this limitation, we propose an extension of the attention mechanism that adopts a convolutional neural network to replace the accumulation. With far fewer parameters, our model achieved performance comparable to recent state-of-the-art models on popular benchmark datasets, yielding perplexity scores of 58.4 on the Penn Treebank dataset and 50.1 on the WikiText-2 dataset.
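
The contrast the abstract draws, between accumulating network outputs and convolving over them, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the abstract does not specify the scoring function, kernel shape, activation, or pooling, so the dot-product scores, the `kernels` tensor, the ReLU, and the max-pooling below are hypothetical choices and not the authors' architecture.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attention_weights(hidden, query):
    """Dot-product attention weights (an assumed scoring function).

    hidden: (T, d) network outputs, one row per input word.
    query:  (d,) vector for the word currently being generated.
    """
    return softmax(hidden @ query)


def context_by_sum(hidden, weights):
    """Conventional attention: accumulate all outputs into one (d,) vector."""
    return weights @ hidden


def context_by_conv(hidden, weights, kernels):
    """Hypothetical CNN-based alternative to the accumulation step.

    kernels: (k, d, d_out) filters that slide over k consecutive positions.
    Returns a (d_out,) context vector after ReLU and max-pooling over time.
    """
    weighted = hidden * weights[:, None]            # (T, d), no summation yet
    k = kernels.shape[0]
    n_windows = weighted.shape[0] - k + 1
    windows = np.stack([weighted[t:t + k] for t in range(n_windows)])
    feature_maps = np.einsum("tkd,kdo->to", windows, kernels)
    return np.maximum(feature_maps, 0.0).max(axis=0)
```

The weighted sum collapses all positions into a single vector in one step, whereas the convolutional variant first extracts features from local windows of weighted outputs, which is one plausible way to retain word-order information that a plain sum discards.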

Index Terms—Attention mechanism, deep learning, neural language model, neural network.

Seongik Park and Yanggon Kim are with the Department of Computer & Information Sciences at Towson University, USA (e-mail: spark32@students.towson.edu, ykim@towson.edu).
Ki Yong Lee is with the Division of Computer Science at Sookmyung Women's University, South Korea (e-mail: kiyonglee@sookmyung.ac.kr).


Cite: Seongik Park, Ki Yong Lee, and Yanggon Kim, "A Neural Language Model with Attention Mechanism Based on Convolutional Neural Network," International Journal of Machine Learning and Computing, vol. 12, no. 2, pp. 57-62, 2022.

Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).


