MACHINE LEARNING-BASED DATABASE ANOMALY DETECTION: A COMPREHENSIVE REVIEW

Harnish Shah; Dhirenkumar Purohit

doi:10.62373/phs12920

Authors

Harnish Shah, Student, Parul University
Dhirenkumar Purohit, Parul University

DOI:

https://doi.org/10.62373/phs12920

Keywords:

Machine Learning, Anomaly Detection, AI, Artificial Intelligence, Ensemble Methods, Support Vector Machine

Abstract

Database management systems are critical components of modern enterprise infrastructures and are increasingly targeted by sophisticated cyberattacks. Traditional rule-based intrusion detection systems are static, which limits their ability to identify evolving and zero-day threats. Machine learning (ML) techniques provide adaptive, data-driven solutions for modeling normal database behavior and detecting anomalous activities. This paper presents a concise review of machine learning-based approaches for database anomaly detection, categorizing existing methods into clustering, classification, ensemble learning, and deep learning frameworks. A comparative analysis highlights the strengths and limitations of each category in terms of detection accuracy, recall, scalability, and interpretability. Ensemble methods demonstrate improved robustness and performance on imbalanced datasets, while deep learning models effectively capture complex and temporal behavioral patterns. Key challenges, including class imbalance, scalability constraints, and limited explainability, are discussed, along with emerging research directions such as federated learning and graph-based modeling.
