WCSE 2022
ISBN: 978-981-18-3959-7 DOI: 10.18178/wcse.2022.06.055

Feature Completion Using Correlation-Preserved Autoencoder

Li Sun, Tao Liu, Jiyun Li

Abstract—Missing values occur in many real-world datasets and pose significant obstacles to data mining. In recent years, the autoencoder has become a popular deep learning model for data imputation because of its simple structure and efficient training. In this paper, we develop a model that combines the advantages of multiple imputation using denoising autoencoders (MIDA) and the tracking-remove autoencoder (TRAE) by integrating the idea of tracking-remove into MIDA. We introduce KNN to pre-impute the dataset after the denoising step has been applied to the input, feed the pre-imputed data into both MIDA and MIDA with tracking-remove, and ensemble the two outputs by linear combination, in keeping with the idea of "multiple imputation". The resulting model, called the correlation-preserved autoencoder (CPAE), is applied to the completion of brain tissue feature data in ADNIMERGE (ADNI database). Experiments show that CPAE performs better than MIDA, TRAE, and other autoencoders.

Index Terms—data imputation; Alzheimer's disease; autoencoder; KNN; data fusion
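
The two non-neural steps named in the abstract, KNN pre-imputation and the linear combination of two model outputs, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `knn_preimpute` and `linear_ensemble`, the neighbour count `k`, the distance measure, and the mixing weight `alpha` are all assumptions made for illustration, and the two autoencoder branches (MIDA and MIDA with tracking-remove) are represented only by the output matrices passed to the ensemble step.

```python
import math


def knn_preimpute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows observing that feature.

    Hypothetical helper: distance is the root-mean-square difference over
    features observed in both rows (an assumption, not taken from the paper).
    """
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            )[:k]
            # If no row observes feature j, the entry is left as None.
            if donors:
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled


def linear_ensemble(out_a, out_b, alpha=0.5):
    """Combine two imputed matrices element-wise: alpha * a + (1 - alpha) * b."""
    return [[alpha * a + (1 - alpha) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(out_a, out_b)]


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0],
            [1.0, None, 3.0],
            [10.0, 20.0, 30.0],
            [1.2, 2.2, 3.2]]
    pre = knn_preimpute(data, k=2)
    print(pre[1])
    # Stand-ins for the outputs of the two autoencoder branches.
    print(linear_ensemble([[1.0, 2.0]], [[3.0, 4.0]], alpha=0.25))
```

In this sketch the missing entry in the second row is filled from its two closest rows, and `alpha` sets how much weight the first branch receives in the combined output. How the combination weights are actually chosen is not stated in the abstract.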

Li Sun, Tao Liu, Jiyun Li
School of Computer Science and Technology, Donghua University, Shanghai, China


Cite: Li Sun, Tao Liu, Jiyun Li, "Feature Completion Using Correlation-Preserved Autoencoder," Proceedings of 2022 the 12th International Workshop on Computer Science and Engineering (WCSE 2022), pp. 389-394, June 24-27, 2022.