A Comparative Study of Machine Learning Algorithms for Multi-Disease Healthcare Prediction: A Web-Based Intelligent System

Authors
  • Madhuri Gokhale

    Author
Keywords:
Machine Learning, Healthcare Prediction System, Random Forest, Support Vector Machine, Streamlit, Diabetes, Heart Disease, Lung Cancer, Parkinson’s Disease
Abstract

Machine learning has significantly transformed healthcare analytics, enabling new approaches to early disease detection and clinical decision support. This study presents a web-based platform designed to predict five clinically significant diseases within a single system: Diabetes, Heart Disease, Lung Cancer, Parkinson’s Disease, and Thyroid Disorders. Disease-specific datasets were sourced from established public repositories and passed through a systematic preprocessing pipeline comprising noise removal, normalisation, feature selection, and stratified train-test partitioning. Five supervised machine learning models were evaluated using confusion matrices and multi-model accuracy comparisons. The platform is designed for scalability, with provisions for integration with electronic health records and wearable health monitoring devices, which supports its use as a clinical decision support tool. The system is intended to function as a decision support aid and does not substitute for professional clinical diagnosis.
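As an illustration of two steps the abstract names, stratified train-test partitioning and evaluation through confusion matrices and accuracy comparisons, the following is a minimal, library-free Python sketch. It is not the study's implementation: the function names (`stratified_split`, `confusion_matrix`, `accuracy`, `compare_models`), the 20% test fraction, the binary 0/1 labelling, and the toy predictions are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def confusion_matrix(y_true, y_pred):
    """Count binary outcomes, treating label 1 as the positive (disease) class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1:
            counts["tp" if pred == 1 else "fn"] += 1
        else:
            counts["fp" if pred == 1 else "tn"] += 1
    return counts


def accuracy(cm):
    """Fraction of correct predictions from a confusion-matrix dict."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


def compare_models(y_true, predictions_by_model):
    """Return {model name: accuracy} for a side-by-side comparison."""
    return {
        name: accuracy(confusion_matrix(y_true, preds))
        for name, preds in predictions_by_model.items()
    }


# Toy data only: hypothetical predictions from two placeholder models.
y_true = [1, 1, 0, 0, 1]
scores = compare_models(
    y_true,
    {"model_a": [1, 0, 0, 1, 1], "model_b": [1, 1, 0, 0, 0]},
)
```

Splitting per class before shuffling keeps the minority (positive) class represented in the held-out set, which matters for medical datasets where positive cases are often scarce. In practice a library such as scikit-learn, which the reference list cites, provides equivalent utilities.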

Published
2026-05-26 — Updated on 2026-05-26
Section
Articles

How to Cite

A Comparative Study of Machine Learning Algorithms for Multi-Disease Healthcare Prediction: A Web-Based Intelligent System. (2026). Journal of Integrated Engineering Innovation & Applications, 2(1 (March 2026)). https://joieia.com/index.php/home/article/view/17
