IJMLC 2014 Vol.4(3): 271-274 ISSN: 2010-3700
Natural Language Processing Technologies for Multi-Level Intelligent Spam Mail-Filter
Haiyan Kang and Xiaojiao Yuan
Abstract—To address the shortcomings of existing mail filtering systems, we designed a content-based, multi-level intelligent message filtering system. Using natural language processing technology, the system represents e-mail content, including attachments, as a feature vector. First, it pre-processes the e-mail content, which includes word segmentation and feature extraction. Second, it combines a knowledge base with feature expansion to form the vector. The similarity between this vector and the corresponding category vector stored in the database then determines the credibility of the message. Based on this approach, a mail filtering system was implemented on the Java EE 6 + SQL Server 2005 platform, with the aim of eliminating as much spam as possible. Its major features are as follows: 1) Black/white list filtering: messages from blacklisted senders are intercepted and messages from whitelisted senders are released. 2) Reverse DNS testing: this effectively eliminates anonymous e-mail attacks. 3) Content-based message filtering: an accurate analysis of mail content filters out suspicious messages. 4) Fingerprint recognition: the system borrows the biological concept of fingerprint identification to recognize spam. 5) User-personalized filtering: users can design their own filtering rules. 6) Intent detection: the system detects URL links in the e-mail content. Experiments show that the mail filtering system is effective at filtering spam.
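The vector comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so cosine similarity over term-frequency vectors is an assumption here, and the class name `CredibilitySketch`, the sample tokens, and the 0.5 threshold are all hypothetical. The input is assumed to be already segmented into words.

```java
import java.util.HashMap;
import java.util.Map;

public class CredibilitySketch {

    // Build a sparse term-frequency vector from already segmented tokens.
    static Map<String, Double> termFrequencies(String[] tokens) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : tokens) {
            tf.merge(token, 1.0, Double::sum);
        }
        return tf;
    }

    // Cosine similarity between two sparse vectors (assumed measure,
    // not confirmed by the source). Returns 0.0 if either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
            normA += e.getValue() * e.getValue();
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical category vector, standing in for one stored in the database.
        Map<String, Double> spamCategory =
                termFrequencies(new String[] {"free", "prize", "click", "free"});
        // Hypothetical segmented message content.
        Map<String, Double> message =
                termFrequencies(new String[] {"click", "free", "offer"});

        double similarity = cosine(message, spamCategory);
        double threshold = 0.5; // hypothetical cut-off, chosen only for illustration
        System.out.println("similarity = " + similarity);
        System.out.println("suspicious = " + (similarity >= threshold));
    }
}
```

A sparse map is used rather than a dense array because most terms do not occur in any single message; a real system would also weight terms and apply the feature expansion the abstract mentions.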
Index Terms—Spam mail-filter, Chinese word segmentation, mail classification, privacy protection, information security.
Haiyan Kang is with the School of Information and Management, Beijing Information Science and Technology University, Beijing 100192, China (e-mail: firstname.lastname@example.org).
Xiaojiao Yuan is with the Computer School, Beijing Information Science and Technology University, Beijing 100192, China.
Cite: Haiyan Kang and Xiaojiao Yuan, "Natural Language Processing Technologies for Multi-Level Intelligent Spam Mail-Filter," International Journal of Machine Learning and Computing, vol. 4, no. 3, pp. 271-274, 2014.