Abstract—The growing volume of opinions posted on social media sites has sharply increased the dimensionality of the data used for sentiment analysis. Selecting informative features from textual data can improve the performance of supervised learning methods. In this article, we propose a novel and efficient method for integrating different filter-based feature selection methods for sentiment classification. The ensemble method uses hesitant fuzzy sets to represent the opinions of the different filter-based feature selection methods in order to optimize the relevancy score between features and class labels. Based on this relevancy score, the top-k ranked features are selected for sentiment classification. The proposed feature selection method was evaluated with Naïve Bayes and Support Vector Machine classifiers on three of the most widely used sentiment analysis datasets, using Unigram and Parts-of-Speech based text representation schemes. Performance was evaluated using five-fold cross-validation, and the results show that the proposed method can achieve higher accuracy with only 10-25% of the total extracted features. Statistical tests confirm that aggregation using hesitant fuzzy sets is more effective than the baseline feature selection methods on Parts-of-Speech features in terms of the performance metrics.
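The ranking idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state which filters are combined or which aggregation operator is applied, so everything below is an assumption made for illustration: the three filter score tables are hypothetical, each filter's scores are min-max normalized to serve as membership values, and the hesitant fuzzy element of a feature (the set of its normalized scores) is reduced to a single relevancy score by taking the mean of its values. The function names `min_max_normalize`, `hesitant_relevancy`, and `top_k` are illustrative, not taken from the paper.

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale one filter's raw scores to [0, 1] so they can act as membership values."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {f: (s - lo) / span if span else 0.0 for f, s in scores.items()}


def hesitant_relevancy(filter_scores):
    """Collect each feature's normalized scores (one per filter) and average them.

    Assumption: the mean of the membership values is used as the score of the
    hesitant fuzzy element; the paper's actual operator may differ.
    """
    normalized = [min_max_normalize(s) for s in filter_scores]
    features = normalized[0].keys()
    return {f: mean(n[f] for n in normalized) for f in features}


def top_k(relevancy, k):
    """Return the k features with the highest aggregated relevancy score."""
    return sorted(relevancy, key=relevancy.get, reverse=True)[:k]


# Hypothetical raw scores from three unnamed filter methods for four unigrams.
filter_scores = [
    {"good": 9.0, "bad": 7.0, "movie": 1.0, "the": 0.0},
    {"good": 0.8, "bad": 0.9, "movie": 0.2, "the": 0.1},
    {"good": 30.0, "bad": 20.0, "movie": 10.0, "the": 5.0},
]

relevancy = hesitant_relevancy(filter_scores)
selected = top_k(relevancy, 2)
print(selected)  # ['good', 'bad']
```

In this toy input the second filter ranks "bad" above "good", while the other two disagree; averaging the normalized scores resolves the disagreement in a single ranking, which is the role the abstract assigns to the hesitant fuzzy aggregation.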
Index Terms—Filter-based feature selection, hesitant fuzzy sets, Naïve Bayes classifier, sentiment classification, support vector machine.
Gunjan Ansari, Tanvir Ahmad, and Mohammad Najmud Doja are with the Department of Computer Engineering, Faculty of Engineering & Technology, Jamia Millia Islamia, Jamia Nagar, New Delhi-110025, India (e-mail: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org).
Cite: Gunjan Ansari, Tanvir Ahmad, and Mohammad Najmud Doja, "Ensemble of Feature Ranking Methods Using Hesitant Fuzzy Sets for Sentiment Classification," International Journal of Machine Learning and Computing, vol. 9, no. 5, pp. 599-608, 2019.
Copyright © 2019 by the authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).