NORMA eResearch @NCI Library

Diagnosis of Cardiovascular Diseases using Hybrid Feature Selection and Classification Algorithms

Mondal, Sandip (2017) Diagnosis of Cardiovascular Diseases using Hybrid Feature Selection and Classification Algorithms. Masters thesis, Dublin, National College of Ireland.

Abstract

Current diagnostic systems for identifying cardiovascular diseases (CVDs), such as Echocardiography (ECG), require highly skilled physicians to evaluate complex combinations of clinical and pathological data. Inaccurate decision making is the main challenge in this process and cannot be permitted in the healthcare industry. Data mining methodologies can be applied to large medical datasets to extract insights that aid healthcare professionals in the diagnosis of cardiovascular diseases. In CVD data mining, classification categorizes a patient as having a CVD or being free from it, based on similarity to previous examples of other patients. The classification accuracy rate is highly influenced by the feature selection technique, which eliminates features or attributes that carry little or no information from the dataset. Thus, feature selection and classification are together treated as a global "combinatorial optimization" problem. The aim of this research is to investigate the optimal hybrid model of feature selection and classification algorithms for the diagnosis of cardiovascular diseases, based on three performance metrics: accuracy, sensitivity and specificity. The research followed the Cross Industry Standard Process for Data Mining (CRISP-DM). The effect of hybrid feature selection and classification algorithms is examined on the heart disease dataset acquired from the University of California, Irvine - Machine Learning Repository (UCI-ML). The feature selection algorithm used is Particle Swarm Optimization (PSO). The classification algorithms used are Support Vector Machines (SVM), Artificial Neural Network (ANN), Naïve Bayes, K-Nearest Neighbour (KNN), Random Forest and C5.0 Decision Tree. Each hybrid of feature selection and classification algorithm is evaluated on accuracy, sensitivity and specificity, with the objective of achieving superior predictive performance. Results demonstrated that the hybrid combination of PSO with SVM (PSO_SVM) achieves superior predictive performance over the other models. The research will thus empower physicians to diagnose cardiovascular diseases and initiate timely treatment without the intervention of a trained cardiologist.
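
The three performance metrics named in the abstract can be illustrated with a minimal Python sketch. This is not code from the thesis: the function name diagnostic_metrics, the label convention (1 = patient has a CVD, 0 = free from it) and the example labels are all assumptions made for illustration, and the standard confusion-matrix definitions are used.

```python
def diagnostic_metrics(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for binary labels.

    Assumed convention (not taken from the thesis): 1 = has CVD, 0 = no CVD.
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1          # diseased patient correctly flagged
        elif truth == 0 and pred == 0:
            tn += 1          # healthy patient correctly cleared
        elif truth == 0 and pred == 1:
            fp += 1          # healthy patient wrongly flagged
        else:
            fn += 1          # diseased patient missed

    total = tp + tn + fp + fn
    return {
        # share of all patients classified correctly
        "accuracy": (tp + tn) / total if total else float("nan"),
        # share of diseased patients that were detected: TP / (TP + FN)
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # share of healthy patients that were cleared: TN / (TN + FP)
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels for ten patients, purely for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters in a diagnostic setting because a missed diagnosis and a false alarm have different costs, and accuracy alone does not distinguish between them.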

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
R Medicine > Healthcare Industry
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Caoimhe Ní Mhaicín
Date Deposited: 28 Aug 2018 12:41
Last Modified: 28 Aug 2018 12:41
URI: https://norma.ncirl.ie/id/eprint/3092
