Research Info

Title Elastic Deep Autoencoder for Text Embedding Clustering by an Improved Graph Regularization
Type JournalPaper
Keywords Deep autoencoder; Text clustering; Graph regularization; Text embedding
Abstract Text clustering is the task of grouping information extracted from text into clusters, and it has many applications in recommender systems, sentiment analysis, and more. Deep learning-based methods have become increasingly popular due to their high accuracy in identifying nonlinear structures. They usually consist of two major parts: dimensionality reduction and clustering. Autoencoders are simple unsupervised neural networks used to learn better low-dimensional representations of data and have shown good performance in dealing with nonlinear features. However, while they use the Frobenius norm to deal well with Gaussian noise, they are sensitive to outliers and Laplacian noise. In this paper, a deep autoencoder with an adapted elastic loss for text embedding clustering (EDA-TEC) is proposed. The elastic loss is a combination of the Frobenius norm and -norm to account for both types of noise. Additionally, to preserve the geometric structure of the high-dimensional data, a modified graph regularization term based on the weighted cosine similarity measure is used. EDA-TEC also improves clustering results by applying sparsity regularization to the manifold representation of the data. In this joint end-to-end deep learning model, better representations and text clustering results are achieved, with high accuracy on common datasets compared to existing methods.
Researchers Fatemeh Daneshfar (First Researcher), Sayvan Soleymanbaigi (Second Researcher), Ali Nafisi (Third Researcher), Pedram Yamini (Fourth Researcher)
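
The two loss ingredients named in the abstract, an elastic reconstruction loss and a graph regularization term built from cosine similarity, might be sketched as follows. This is only an illustrative sketch, not the authors' implementation. The record does not legibly name the second norm, so the code assumes an l2,1 norm (the sum of row-wise Euclidean norms), a common outlier-robust choice. It also uses plain, unweighted cosine similarity in place of the paper's weighted measure. The function names, the mixing weight alpha, and the clipping of negative similarities are all hypothetical.

```python
import numpy as np


def elastic_loss(X, X_hat, alpha=0.5):
    """Hypothetical elastic reconstruction loss.

    Mixes the squared Frobenius norm (suited to Gaussian noise) with an
    assumed l2,1 norm (less sensitive to outlier rows) of the residual.
    """
    R = X - X_hat
    frobenius_sq = np.sum(R ** 2)            # ||R||_F^2
    l21 = np.sum(np.linalg.norm(R, axis=1))  # sum of row-wise l2 norms
    return alpha * frobenius_sq + (1.0 - alpha) * l21


def cosine_graph_regularizer(X, Z):
    """Graph regularizer tr(Z^T L Z) with a cosine-similarity graph.

    X holds the high-dimensional inputs (one row per document) and Z the
    low-dimensional codes. Negative similarities are clipped to zero so
    the graph Laplacian L = D - W stays positive semidefinite.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)        # guard against zero rows
    W = np.clip(Xn @ Xn.T, 0.0, None)        # assumption: unweighted cosine
    np.fill_diagonal(W, 0.0)                 # no self-loops
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(Z.T @ L @ Z))
```

The regularizer equals half the sum over all pairs of W[i, j] times the squared distance between codes i and j, so it penalizes codes that drift apart for inputs with high cosine similarity. In a joint model the two terms would be added, with trade-off weights, to the clustering and sparsity objectives.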