Prediction of Patient’s Stroke Vulnerability Status Using Logistic Regression Machine Learning Model

Okwori, Okpe Anthony and Agana, Moses Adah and Ofem, Ofem Ajah and Ofem, Obono I. (2024) Prediction of Patient’s Stroke Vulnerability Status Using Logistic Regression Machine Learning Model. Asian Basic and Applied Research Journal, 6 (1). pp. 70-82.

Abstract

In recent times, machine learning has been widely used in healthcare because it supports accurate prediction of diseases and medical conditions, helping physicians diagnose diseases at an early stage. Machine learning models are also used to handle large, complex, high-dimensional and continually evolving medical data to improve the accuracy and efficiency of disease prediction and diagnosis. This paper applies a machine learning model to the prediction of stroke vulnerability among individuals. In particular, a Logistic Regression (LR) based stroke prediction model is described and developed in the Python programming language to predict the likelihood of stroke occurrence. Stroke usually occurs when blood flow to the brain is blocked, which causes brain cells to die from lack of oxygen and nutrients. It is a medical emergency that may result in lasting brain damage, permanent disability and death across all ages. To reduce stroke occurrence, there is an urgent need for stroke prediction and lifestyle changes. The logistic regression-based stroke prediction model was developed using the healthcare stroke dataset obtained from the Kaggle machine learning dataset repository. The dataset was preprocessed to improve prediction performance using techniques such as feature selection, feature encoding, missing-value correction, class balancing, outlier detection and correction, and feature scaling, together with hyperparameter tuning. The preprocessed dataset was used for training, validation and testing of the logistic regression stroke prediction model, which was evaluated using Python scikit-learn evaluation metrics such as accuracy, precision, recall, F1-score, specificity and area under the receiver operating characteristic curve (AUC-ROC).
On evaluation, the model achieved a classification accuracy of 81% and an AUC-ROC of 90%. This indicates that the logistic regression model is effective for stroke classification on the healthcare dataset, and that the proposed model improves on some existing logistic regression-based stroke prediction models.
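
As an illustration of the threshold-based evaluation metrics named in the abstract, the following is a minimal sketch in plain Python of how accuracy, precision, recall, specificity and F1-score follow from the counts in a binary confusion matrix. It is not the authors' code: the function names `confusion_counts` and `evaluate`, the label convention (1 = stroke, 0 = no stroke) and the example counts are all assumptions made for illustration, and the numbers are not results from the paper. AUC-ROC is left out because it needs predicted probabilities rather than hard labels.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # stroke cases predicted as stroke
    fp: int  # non-stroke cases predicted as stroke
    tn: int  # non-stroke cases predicted as non-stroke
    fn: int  # stroke cases predicted as non-stroke


def confusion_counts(y_true, y_pred):
    """Count confusion-matrix cells, assuming 1 = stroke and 0 = no stroke."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent.
    return numerator / denominator if denominator else 0.0


def evaluate(c):
    """Derive the threshold-based metrics from confusion-matrix counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)       # sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)  # recall of the negative class
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = evaluate(ConfusionCounts(tp=30, fp=10, tn=50, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Specificity is computed here as recall on the negative class. This matters on imbalanced data such as stroke records, where accuracy alone can look high even if few true stroke cases are detected.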

Item Type: Article
Subjects: Archive Digital > Multidisciplinary
Depositing User: Unnamed user with email support@archivedigit.com
Date Deposited: 21 Jun 2024 08:15
Last Modified: 21 Jun 2024 08:15
URI: http://eprints.ditdo.in/id/eprint/2232
