University of Kurdistan Research System | Elastic Deep Autoencoder for Text Embedding Clustering by an Improved Graph Regularization

Title	Elastic Deep Autoencoder for Text Embedding Clustering by an Improved Graph Regularization
Research type	Article published in a scientific journal
Keywords	Deep autoencoder; Text clustering; Graph regularization; Text embedding
Abstract	Text clustering groups information extracted from text into clusters and has many applications, including recommender systems and sentiment analysis. Deep learning-based methods have become increasingly popular because of their high accuracy in identifying nonlinear structures. They usually consist of two major parts: dimensionality reduction and clustering. Autoencoders are simple unsupervised neural networks that learn low-dimensional representations of data and have shown good performance on nonlinear features. However, although their Frobenius-norm loss handles Gaussian noise well, they are sensitive to outliers and Laplacian noise. This paper proposes a deep autoencoder with an adapted elastic loss for text embedding clustering (EDA-TEC). The elastic loss combines the Frobenius norm and the L2,1-norm to account for both types of noise. Additionally, to preserve the geometric structure of the high-dimensional data, a modified graph regularization term based on a weighted cosine similarity measure is used. EDA-TEC also improves clustering results by applying sparsity regularization to the manifold representation of the data. This joint end-to-end deep learning model achieves better representations and higher clustering accuracy on common datasets than existing methods.
Researchers	Fatemeh Daneshfar (first author), Pedram Yamini (fourth author), Sayvan Soleymanbaigi (second author), Ali Nafisi (third author)
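
The two loss ingredients named in the abstract, an elastic reconstruction loss and a cosine-similarity graph regularizer, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the mixing weight `alpha`, the row-per-sample layout, the use of a second norm that sums row-wise Euclidean norms, and the plain (unweighted) cosine affinity are all assumptions made for illustration; the paper's weighted similarity measure and its sparsity term are not reproduced here.

```python
import numpy as np


def elastic_loss(X, X_hat, alpha=0.5):
    """Hypothetical elastic loss: alpha * ||R||_F^2 + (1 - alpha) * ||R||_{2,1}.

    R = X - X_hat is the reconstruction residual, one sample per row.
    The 2,1 term sums the Euclidean norms of the rows, so a single
    outlying sample contributes linearly instead of quadratically.
    """
    R = X - X_hat
    frob_sq = np.sum(R ** 2)
    l21 = np.sum(np.linalg.norm(R, axis=1))
    return alpha * frob_sq + (1.0 - alpha) * l21


def cosine_affinity(X, eps=1e-12):
    """Plain cosine-similarity affinity matrix (assumed stand-in for the
    paper's weighted measure). Negative similarities are clipped to zero
    and self-loops are removed."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, eps)
    W = np.maximum(Xn @ Xn.T, 0.0)
    np.fill_diagonal(W, 0.0)
    return W


def graph_regularizer(Z, W):
    """Graph smoothness term tr(Z^T L Z) with Laplacian L = D - W.

    For symmetric W this equals 0.5 * sum_ij W_ij * ||z_i - z_j||^2,
    so it is small when similar inputs have nearby embeddings.
    """
    L = np.diag(W.sum(axis=1)) - W
    return np.trace(Z.T @ L @ Z)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))       # toy "text embeddings"
    Z = rng.normal(size=(6, 2))       # toy latent codes
    X_hat = X + 0.1 * rng.normal(size=X.shape)
    W = cosine_affinity(X)
    print("elastic loss:", elastic_loss(X, X_hat))
    print("graph term:  ", graph_regularizer(Z, W))
```

In a full model these terms would be added, with tunable weights, to the training objective of the autoencoder; the sketch only evaluates them on fixed arrays.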

Research details