Performance Comparison of Machine Learning Techniques for Breast Cancer Detection

Dada Emmanuel Gbenga

Abstract


Breast cancer is a leading cause of death among women in developed nations and has been identified as one of the deadliest types of cancer prevalent among women globally. There has been a dramatic increase in breast cancer cases among women in recent years. Machine learning algorithms are effective tools that have found application in medical imaging for the early detection and diagnosis of cancer. This paper investigates the performance of eight (8) machine learning algorithms that have been applied to the timely detection of breast cancer. Diagnosing breast cancer involves distinguishing between benign and malignant breast lumps. Our experimental results indicate that Support Vector Machine (SVM) has the best performance in terms of classification accuracy (97.07%) and the lowest error rate, compared to Radial Basis Function (96.49%), Simple Linear Logistic Regression Model (96.78%), Naïve Bayes (96.48%), k-Nearest Neighbour (96.34%), AdaBoost (96.19%), Fuzzy Unordered Rule Induction Algorithm (96.78%) and Decision Tree - J48 (96.48%). All experiments were conducted using the WEKA data mining and machine learning simulation environment.
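The comparison in the abstract can be tabulated with a short Python sketch. It only restates the accuracies reported above and derives an error rate for each classifier, assuming error rate is simply 100% minus accuracy (the abstract does not define it). The dictionary keys and the `error_rate` helper are illustrative names, not part of the paper or of WEKA.

```python
# Classification accuracies (%) exactly as reported in the abstract.
reported_accuracy = {
    "SVM": 97.07,
    "RBF": 96.49,
    "Simple Logistic": 96.78,
    "Naive Bayes": 96.48,
    "kNN": 96.34,
    "AdaBoost": 96.19,
    "FURIA": 96.78,
    "J48": 96.48,
}


def error_rate(accuracy_pct):
    """Assumption: error rate is the complement of accuracy, in percent."""
    return round(100.0 - accuracy_pct, 2)


# Rank classifiers from highest to lowest reported accuracy.
ranked = sorted(reported_accuracy.items(), key=lambda kv: kv[1], reverse=True)
best_name, best_acc = ranked[0]

for name, acc in ranked:
    print(f"{name:16s} accuracy={acc:.2f}%  error={error_rate(acc):.2f}%")
```

Under this assumption, the spread between the best and worst reported accuracy is under one percentage point, so the ranking should be read alongside the evaluation protocol described in the full text.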

Keywords: Breast cancer; RBF; SVM; NB; AdaBoost; kNN; J48.


Nova Journal of Engineering and Applied Sciences ISSN: 2292-7921

Nova Pub inc.

Nova Explore Publications is a member of CrossRef.
DOI Prefix: 10.20286