In collaboration with Payame Noor University and the Iranian Society of Instrumentation and Control Engineers

Document Type: Research Article

Authors

1 Department of Computer Engineering, Neyshabur Branch, Islamic Azad University, Neyshabur, Iran

2 Department of Electrical Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran

Abstract

In recent decades, the volume and variety of data have grown rapidly. As a result, data storage, compression, and analysis have become critical subjects in data mining and machine learning. Compression must be accurate and must not discard important data in the process. This work therefore proposes an effective data compression method for recommender systems based on the attention mechanism. The proposed method compresses data on two levels: features and records. It is time-aware and operates on time windows, taking users' activity into account to prevent the loss of important data. The resulting technique can be used efficiently with deep networks, where the amount of data is a significant challenge. Experimental results demonstrate that the technique reduces both the amount of data and the processing time while achieving acceptable accuracy.
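The abstract does not give the algorithm itself, so the following is only a minimal sketch of what two-level, time-windowed compression could look like, not the authors' method. It assumes interaction records arrive as numeric timestamps with a feature matrix. The function names `compress_records` and `select_features`, the mean-vector query, the scaled dot-product scoring, and the variance-based feature score are all illustrative assumptions. Users' activity, which the proposed method takes into account, is not modelled here.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def compress_records(timestamps, features, window):
    """Record level (assumed scheme): pool each time window into one record.

    Records are bucketed by floor(timestamp / window). Inside a bucket, each
    record is scored against the bucket mean (scaled dot product), and the
    softmax of those scores weights the pooled record.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    features = np.asarray(features, dtype=float)
    window_ids = np.floor(timestamps / window).astype(int)
    pooled = []
    for wid in np.unique(window_ids):
        block = features[window_ids == wid]           # records in this window
        query = block.mean(axis=0)                    # hypothetical query vector
        scores = block @ query / np.sqrt(block.shape[1])
        weights = softmax(scores)                     # attention weights, sum to 1
        pooled.append(weights @ block)                # weighted average record
    return np.vstack(pooled)

def select_features(features, k):
    """Feature level (assumed scheme): keep the k highest-variance columns."""
    features = np.asarray(features, dtype=float)
    scores = softmax(features.var(axis=0))            # normalised column scores
    keep = np.sort(np.argsort(scores)[-k:])           # indices of retained columns
    return features[:, keep], keep

# Toy data: six records with four features, falling into three windows of length 10.
timestamps = [0, 1, 2, 10, 11, 25]
features = np.array([
    [1.0, 0.0, 2.0, 0.5],
    [0.9, 0.1, 2.1, 0.5],
    [1.1, 0.0, 1.9, 0.5],
    [0.2, 3.0, 0.1, 0.5],
    [0.3, 2.8, 0.0, 0.5],
    [5.0, 5.0, 5.0, 0.5],
])

compressed = compress_records(timestamps, features, window=10)
reduced, kept = select_features(compressed, k=2)
print(compressed.shape, reduced.shape)
```

In this toy run, six records shrink to three (one per window) and four features shrink to two. A window that holds a single record passes through unchanged, because its only attention weight is 1.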

Keywords
