MACHINE LEARNING-BASED DATABASE ANOMALY DETECTION: A COMPREHENSIVE REVIEW
DOI: https://doi.org/10.62373/phs12920
Keywords: Machine Learning, Anomaly Detection, AI, Artificial Intelligence, Ensemble Methods, Support Vector Machine
Abstract
Database management systems are critical components of modern enterprise infrastructures and are increasingly targeted by sophisticated cyberattacks. Traditional rule-based intrusion detection systems are limited in identifying evolving and zero-day threats because their rules are static. Machine learning (ML) techniques provide adaptive, data-driven solutions for modeling normal database behavior and detecting anomalous activities. This paper presents a concise review of machine learning-based approaches for database anomaly detection, categorizing existing methods into clustering, classification, ensemble learning, and deep learning frameworks. A comparative analysis highlights the strengths and limitations of each category in terms of detection accuracy, recall, scalability, and interpretability. Ensemble methods demonstrate improved robustness and performance on imbalanced datasets, while deep learning models effectively capture complex and temporal behavioral patterns. Key challenges, including class imbalance, scalability constraints, and limited explainability, are discussed, along with emerging research directions such as federated learning and graph-based modeling.
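As a rough illustration of what "modeling normal database behavior" can mean in practice, the sketch below fits a single-centroid profile to a handful of sessions and flags sessions that lie far from it. This is a deliberately simplified baseline (comparable to clustering with one cluster), not a method taken from the reviewed paper. The feature set (queries per minute, rows returned per query, failed logins), the sample values, and the threshold margin are all hypothetical choices made for the example.

```python
import math
from statistics import mean, pstdev

# Hypothetical per-session features (assumed for illustration only):
# (queries per minute, rows returned per query, failed logins)
NORMAL_SESSIONS = [
    (12.0, 40.0, 0.0),
    (15.0, 35.0, 0.0),
    (10.0, 50.0, 1.0),
    (14.0, 45.0, 0.0),
    (11.0, 38.0, 0.0),
    (13.0, 42.0, 1.0),
]


def fit_profile(sessions):
    """Learn the per-feature mean and standard deviation of normal sessions."""
    columns = list(zip(*sessions))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 when a feature has no spread, to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return means, stds


def anomaly_score(session, profile):
    """Euclidean distance from the normal centroid, in standardized units."""
    means, stds = profile
    return math.sqrt(
        sum(((x - m) / s) ** 2 for x, m, s in zip(session, means, stds))
    )


def fit_threshold(sessions, profile, margin=1.5):
    """Illustrative threshold: the largest training score times a margin."""
    return margin * max(anomaly_score(s, profile) for s in sessions)


def is_anomalous(session, profile, threshold):
    """Flag a session whose score exceeds the learned threshold."""
    return anomaly_score(session, profile) > threshold


profile = fit_profile(NORMAL_SESSIONS)
threshold = fit_threshold(NORMAL_SESSIONS, profile)

typical = (12.5, 41.0, 0.0)       # resembles the training sessions
bulk_export = (90.0, 5000.0, 6.0)  # burst of large reads after failed logins

print(is_anomalous(typical, profile, threshold))
print(is_anomalous(bulk_export, profile, threshold))
```

A baseline like this is easy to interpret but has no notion of multiple behavior modes, temporal order, or labeled attacks, which is where the clustering, ensemble, and deep learning categories named in the abstract come in.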
License
Copyright (c) 2026 Harnish Shah, Dhirenkumar Purohit (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
All articles published in PUXplore: Multidisciplinary Journal of Engineering are licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.
Under this license, anyone may read, download, copy, distribute, and share the work for non-commercial purposes, provided that appropriate credit is given to the author(s) and the journal and that a link to the license is included.
No adaptations, derivatives, or modifications of the work are permitted without prior written permission from the copyright holder.
Authors retain copyright and grant the journal the right of first publication under this license.