IJMLC 2014 Vol.4(3): 271-274 ISSN: 2010-3700
DOI: 10.7763/IJMLC.2014.V4.423

Natural Language Processing Technologies for Multi-Level Intelligent Spam Mail-Filter

Haiyan Kang and Xiaojiao Yuan

Abstract—To overcome the shortcomings of existing mail filtering systems, we designed a content-based, multi-level intelligent message filtering system. Using natural language processing technology, it represents the e-mail content, including attachments. First, it pre-processes the e-mail content, including word segmentation and feature extraction. Second, by combining a knowledge base with feature expansion, it forms a feature vector. This vector is compared with the corresponding category vectors in the database, and the similarity between the two vectors determines the credibility of the message. Based on this theory, a mail filtering system was implemented on the Java EE 6 + SQL Server 2005 platform, with the aim of eliminating as much spam as possible. Its major features are as follows: 1) Black/white list filtering: it intercepts e-mail from blacklisted senders and releases e-mail from whitelisted senders. 2) Reverse DNS testing: it can effectively eliminate anonymous e-mail attacks. 3) Content-based message filtering: an accurate analysis of mail content can filter out suspicious messages. 4) Fingerprint recognition: it mimics the biological concept of fingerprint identification to identify spam. 5) User-personalized filtering: users can independently design their own filter program. 6) Intent detection: it detects URL links in the e-mail content. Experiments show that the mail filtering system is effective at filtering spam.

Index Terms—Spam mail-filter, Chinese word segmentation, mail classification, privacy protection, information security.
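
The vector-comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so cosine similarity over term-frequency vectors is an assumption here, as are the function names, the threshold value, and the toy category vectors. The sketch is written in Python for brevity and is not the authors' Java EE implementation; it also assumes that word segmentation has already produced a list of tokens.

```python
import math
from collections import Counter


def to_vector(tokens):
    """Turn a list of already-segmented tokens into a term-frequency vector."""
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def classify(tokens, category_vectors, threshold=0.5):
    """Return (category, score); 'unknown' if no category reaches the threshold."""
    mail_vec = to_vector(tokens)
    best, best_score = "unknown", 0.0
    for name, vec in category_vectors.items():
        score = cosine_similarity(mail_vec, vec)
        if score > best_score:
            best, best_score = name, score
    if best_score < threshold:
        return "unknown", best_score
    return best, best_score


# Hypothetical category vectors standing in for those stored in the database.
CATEGORY_VECTORS = {
    "spam": Counter({"free": 3, "prize": 2, "click": 2}),
    "normal": Counter({"meeting": 2, "report": 2, "schedule": 1}),
}

label, score = classify(["free", "prize", "click", "now"], CATEGORY_VECTORS)
print(label, round(score, 3))
```

In a multi-level design such as the one outlined above, a check of this kind would presumably run only on messages that the cheaper levels (black/white lists, reverse DNS) have not already decided.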

Haiyan Kang is with the School of Information and Management, Beijing Information Science and Technology University, Beijing 100192, China (e-mail: kanghaiyan@126.com).
Xiaojiao Yuan is with the Computer School, Beijing Information Science and Technology University, Beijing 100192, China.


Cite: Haiyan Kang and Xiaojiao Yuan, "Natural Language Processing Technologies for Multi-Level Intelligent Spam Mail-Filter," International Journal of Machine Learning and Computing, vol. 4, no. 3, pp. 271-274, 2014.

General Information

  • ISSN: 2010-3700 (Online)
  • Abbreviated Title: Int. J. Mach. Learn. Comput.
  • Frequency: Bimonthly
  • DOI: 10.18178/IJMLC
  • Editor-in-Chief: Dr. Lin Huang
  • Executive Editor: Ms. Cherry L. Chen
  • Abstracting/Indexing: Inspec (IET), Google Scholar, Crossref, ProQuest, Electronic Journals Library, CNKI.
  • E-mail: ijmlc@ejournal.net
