Breast Cancer Prediction using Machine Learning Techniques

Authors

  • Sandip Chakraborty Parul University Author
  • Aritra Das Author
  • Subhadeep Das Author
  • Arya Manikya Sinha Author

Keywords:

Breast Cancer, Machine Learning, Logistic Regression, Random Forest, KNN, Naive Bayes, SVM

Abstract

Breast cancer is one of the leading causes of death among women worldwide, and early detection is crucial for improving survival rates. In this study, we evaluate and compare six supervised machine learning algorithms for classifying breast tumors as benign or malignant using the Wisconsin Breast Cancer Dataset. Preprocessing involved handling missing values, scaling features, and reducing skewness through mathematical transformations; feature selection was applied to improve classification performance. The six algorithms (Logistic Regression, Naïve Bayes, Decision Tree, Random Forest, K-Nearest Neighbors, and Support Vector Machine) were implemented in Python (Anaconda, Jupyter Notebook) and evaluated using 10-fold cross-validation. Logistic Regression and SVM achieved the highest accuracy (98.24%), surpassing results reported in previous studies. These results highlight the potential of such techniques for developing robust Computer-Aided Diagnosis systems.
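
The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses the library's bundled copy of the Wisconsin diagnostic data in place of the Kaggle CSV, and picks median imputation, a log1p transform for skewness, an ANOVA F-test keeping 15 features, and default hyperparameters. None of these choices is stated in the abstract, and the helper name `evaluate` is hypothetical. The accuracies it prints should not be expected to match the reported 98.24%.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wisconsin diagnostic data stands in
# for the Kaggle CSV named in the Data Availability Statement.
X, y = load_breast_cancer(return_X_y=True)

# The six classifiers named in the abstract; hyperparameters are assumed defaults.
MODELS = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Support Vector Machine": SVC(),
}


def evaluate(model, X, y, k_features=15, n_splits=10, seed=0):
    """Return per-fold accuracies for one classifier (hypothetical helper)."""
    pipeline = make_pipeline(
        SimpleImputer(strategy="median"),        # handle missing values
        FunctionTransformer(np.log1p),           # reduce right skew (features are non-negative)
        StandardScaler(),                        # feature scaling
        SelectKBest(f_classif, k=k_features),    # univariate feature selection
        model,
    )
    # Preprocessing is refit inside every fold, so no test-fold
    # statistics leak into training.
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(pipeline, X, y, cv=folds, scoring="accuracy")


results = {name: evaluate(model, X, y).mean() for name, model in MODELS.items()}
for name, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name:<24} mean 10-fold accuracy = {accuracy:.4f}")
```

Wrapping the preprocessing steps and the classifier in a single pipeline is a design choice of this sketch: it keeps imputation, scaling, and feature selection inside the cross-validation loop, which avoids optimistic accuracy estimates.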

Published

19-01-2026

Data Availability Statement

https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data 

How to Cite

Breast Cancer Prediction using Machine Learning Techniques. (2026). PUXplore Multidisciplinary Journal of Engineering, 1(2). https://puxplore.paruluniversity.ac.in/index.php/PXMJE/article/view/36