University of Kurdistan Research System | Feature selection based on hybridization of Information gain and graph clustering for text classification

Title	Feature selection based on hybridization of Information gain and graph clustering for text classification
Research type	Conference paper (presented)
Keywords	Feature selection, Information gain, Text categorization, Feature clustering
Abstract	Text datasets usually have a very large number of features, so classifying them is costly and feature selection is of vital importance. This paper proposes a novel feature selection method based on information gain and the FAST algorithm. The method first selects the features with the highest information gain and then applies the FAST algorithm to the selected features. Experiments compare the proposed algorithm with several feature selection techniques on three text datasets. The results confirm that the proposed method produces a smaller feature subset in less time. Evaluation of a K-nearest neighbor classifier on validation data shows that the proposed algorithm also gives higher classification accuracy.
Researchers	Shadi Rahimi (first author), Alireza Abdollahpouri (second author), Fatemeh Zamani (third author), Parham Moradi Dowlatabadi (fourth author)
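
The two-stage idea described in the abstract (rank by information gain, then cluster the survivors and keep one representative per cluster) can be illustrated with a small sketch. This is not the authors' implementation, and the second stage is a deliberately simplified stand-in for FAST: it links two features when their mutual symmetric uncertainty is at least as large as the weaker of their relevances to the class, and it skips the spanning-tree construction that FAST is usually described with. The function names, the `top_k` parameter, and the toy term-presence data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, target):
    """IG = H(target) - H(target | feature) for discrete columns."""
    n = len(target)
    conditional = 0.0
    for v in set(feature):
        subset = [t for f, t in zip(feature, target) if f == v]
        conditional += len(subset) / n * entropy(subset)
    return entropy(target) - conditional


def symmetric_uncertainty(a, b):
    """SU(a, b) = 2 * IG(a; b) / (H(a) + H(b)), defined as 0 if both are constant."""
    total = entropy(a) + entropy(b)
    return 0.0 if total == 0 else 2.0 * information_gain(a, b) / total


def select_features(columns, labels, top_k):
    """columns maps a feature name to its list of discrete values (hypothetical layout)."""
    # Stage 1: keep the top_k features ranked by information gain (ties broken by name).
    ig = {name: information_gain(col, labels) for name, col in columns.items()}
    kept = sorted(ig, key=lambda name: (-ig[name], name))[:top_k]

    # Stage 2 (simplified, assumed stand-in for FAST): cluster redundant features.
    relevance = {n: symmetric_uncertainty(columns[n], labels) for n in kept}
    parent = {n: n for n in kept}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(kept):
        for b in kept[i + 1:]:
            su = symmetric_uncertainty(columns[a], columns[b])
            # Link features that say more about each other than the weaker
            # one says about the class.
            if su > 0 and su >= min(relevance[a], relevance[b]):
                parent[find(a)] = find(b)

    # Keep the most class-relevant feature of each cluster.
    best = {}
    for n in kept:
        root = find(n)
        if root not in best or relevance[n] > relevance[best[root]]:
            best[root] = n
    return sorted(best.values())


# Toy data: 8 documents, binary term presence, binary class label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "goal":  [1, 1, 1, 0, 0, 0, 0, 0],
    "match": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of "goal"
    "vote":  [0, 0, 0, 0, 1, 1, 0, 0],
    "the":   [1, 1, 1, 1, 1, 1, 1, 1],  # constant, carries no information
}
selected = select_features(columns, labels, top_k=3)
print(selected)  # ['goal', 'vote']
```

On this toy input the first stage drops the uninformative term "the", and the second stage merges the duplicated terms "goal" and "match" into one cluster, so two features remain. Real text data would need term weights to be discretized first, and the quadratic pairwise loop is the reason a cheap first-stage filter helps.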