IJMLC 2022 Vol.12(2): 57-62 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2022.12.2.1079

A Neural Language Model with Attention Mechanism Based on Convolutional Neural Network

Seongik Park, Ki Yong Lee, and Yanggon Kim

Abstract—The convergence of artificial intelligence (AI) and natural language processing (NLP) has rapidly increased the demand for analyzing natural language, which contains many ambiguities not present in formal languages. For this reason, the language model (LM), a statistical approach, has played a key role in this area. Recently, the emerging field of deep learning, which applies deep neural networks to machine learning tasks, has been applied to language modeling and has achieved more remarkable results than traditional language models. One of the important techniques behind the success of neural network-based LMs is the attention mechanism, which makes a neural network pay attention to specific words in the input sentence when generating each output word. However, although the attention mechanism has improved the performance of many neural network models, it requires a very large number of parameters to reach state-of-the-art performance. This is because the attention mechanism encodes the context of a word by simply accumulating the outputs of the network for all input words, which may cause information loss. To compensate for this limitation, we propose an extension of the attention mechanism that replaces this accumulation with a convolutional neural network. With far fewer parameters, our model achieves performance comparable to recent state-of-the-art models on popular benchmark datasets, yielding perplexity scores of 58.4 on the Penn Treebank dataset and 50.1 on the WikiText-2 dataset, respectively.
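The contrast the abstract draws can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the hidden states, query, kernel width, and convolution filter are all made up for demonstration. Standard attention collapses the weighted hidden states into a context vector by summation; the abstract's idea is to instead pass the weighted states through a convolution, which preserves local ordering information that a plain sum discards.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d, k = 5, 8, 3                  # sequence length, hidden size, kernel width
H = rng.normal(size=(T, d))        # hidden states h_1..h_T from an RNN
q = rng.normal(size=(d,))          # query vector (e.g., current decoder state)

# Standard attention: softmax-weighted *sum* of the hidden states.
scores = H @ q
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()               # attention weights, sum to 1
context_sum = alpha @ H            # shape (d,): simple accumulation

# CNN-based alternative sketched in the abstract: keep the attention
# weighting, but replace the summation with a 1-D convolution slid over
# the weighted hidden states.
weighted = alpha[:, None] * H                    # (T, d) weighted states
W = rng.normal(size=(k, d)) / np.sqrt(k * d)     # illustrative conv filter
conv = np.array([(weighted[t:t + k] * W).sum()   # valid convolution
                 for t in range(T - k + 1)])
context_cnn = np.maximum(conv, 0.0)              # ReLU over conv outputs

print(context_sum.shape, context_cnn.shape)      # (8,) (3,)
```

The summed context is a single vector regardless of sequence length, while the convolutional variant yields one activation per window position; how such features are projected back to the model dimension is a design choice not specified in the abstract.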

Index Terms—Attention mechanism, deep learning, neural language model, neural network.

Seongik Park and Yanggon Kim are with the Department of Computer & Information Sciences at Towson University, USA (e-mail: spark32@students.towson.edu, ykim@towson.edu).
Ki Yong Lee is with the Division of Computer Science at Sookmyung Women's University, South Korea (e-mail: kiyonglee@sookmyung.ac.kr).


Cite: Seongik Park, Ki Yong Lee, and Yanggon Kim, "A Neural Language Model with Attention Mechanism Based on Convolutional Neural Network," International Journal of Machine Learning and Computing vol. 12, no. 2, pp. 57-62, 2022.

Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

General Information

  • E-ISSN: 2972-368X
  • Abbreviated Title: Int. J. Mach. Learn.
  • Frequency: Quarterly
  • DOI: 10.18178/IJML
  • Editor-in-Chief: Dr. Lin Huang
  • Executive Editor: Ms. Cherry L. Chen
  • Abstracting/Indexing: Inspec (IET), Google Scholar, Crossref, ProQuest, Electronic Journals Library, CNKI
  • E-mail: ijml@ejournal.net
  • APC: 500 USD
