Personal Blog

Framingham Heart Study: EDA and Classification Model

Tech Stack: Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
Project Focus: Exploratory data analysis and predicting heart disease risk using a decision tree classifier
GitHub Repository: Project Link

This project combines exploratory data analysis (EDA) of the Framingham Heart Study dataset with a decision tree classifier that predicts heart disease risk. Key highlights:

Data Cleaning and Preprocessing: Handled missing values and outliers, then scaled the features to prepare the dataset for analysis and modeling.
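To make the cleaning step concrete, here is a minimal sketch of one way it could look. The post does not say which techniques were used, so the median imputation, the 1.5 × IQR clipping rule, and the `clean_and_scale` helper are all assumptions. The column names (`age`, `totChol`, `sysBP`, `TenYearCHD`) and the tiny demo frame are illustrative, not taken from the project.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def clean_and_scale(df: pd.DataFrame, numeric_cols: list[str]) -> pd.DataFrame:
    """Impute missing values, clip outliers, and standardize numeric columns.

    Hypothetical helper: the imputation and outlier rules are assumptions.
    """
    out = df.copy()
    # Fill missing numeric values with each column's median.
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())
    # Clip values that fall more than 1.5 * IQR outside the quartiles.
    q1 = out[numeric_cols].quantile(0.25)
    q3 = out[numeric_cols].quantile(0.75)
    iqr = q3 - q1
    out[numeric_cols] = out[numeric_cols].clip(
        lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr, axis=1
    )
    # Standardize to zero mean and unit variance.
    out[numeric_cols] = StandardScaler().fit_transform(out[numeric_cols])
    return out


# Made-up rows for illustration only; not real study data.
demo = pd.DataFrame({
    "age": [39.0, 46.0, 48.0, 61.0, np.nan, 43.0],
    "totChol": [195.0, 250.0, 245.0, 225.0, 285.0, 600.0],
    "sysBP": [106.0, 121.0, 127.5, 150.0, 130.0, np.nan],
    "TenYearCHD": [0, 0, 0, 1, 0, 0],
})
numeric_cols = ["age", "totChol", "sysBP"]
cleaned = clean_and_scale(demo, numeric_cols)
```

In a real pipeline the scaler (and the imputation statistics) would normally be fitted on the training split only and then applied to the test split, so that no test information leaks into preprocessing.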
EDA Insights:
- Examined the impact of key factors like age, cholesterol levels, blood pressure, and smoking habits on heart disease risk.
- Visualized feature distributions and correlations using histograms, heatmaps, and pair plots.
- Identified trends in gender-specific risk factors for cardiovascular diseases.
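The numbers behind these EDA views can be sketched as follows. This is only an illustration on synthetic data: the column names (`male`, `age`, `currentSmoker`, `totChol`, `sysBP`, `TenYearCHD`) are assumed, and the outcome is generated artificially, so the values say nothing about the real study.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset; every value here is made up.
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "male": rng.integers(0, 2, n),
    "age": rng.integers(32, 71, n),
    "currentSmoker": rng.integers(0, 2, n),
    "totChol": rng.normal(237, 44, n),
    "sysBP": rng.normal(132, 22, n),
})
# Artificial outcome loosely tied to age and blood pressure.
risk = 0.04 * (demo["age"] - 50) + 0.02 * (demo["sysBP"] - 132)
demo["TenYearCHD"] = (risk + rng.normal(0, 1, n) > 1.0).astype(int)

# Correlation of each feature with the outcome: the column a heatmap
# of demo.corr() would show for the target.
corr_with_target = (
    demo.corr()["TenYearCHD"].drop("TenYearCHD").sort_values(ascending=False)
)

# Outcome rate split by sex, a starting point for gender-specific trends.
rate_by_sex = demo.groupby("male")["TenYearCHD"].mean()

print(corr_with_target)
print(rate_by_sex)
```

With Seaborn installed, `sns.heatmap(demo.corr(), annot=True)` would draw the full correlation matrix, and `sns.pairplot(demo, hue="TenYearCHD")` would give the pair plots.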
Model Development: Built a decision tree classifier that reached 90% accuracy on the test set after hyperparameter tuning and feature selection.
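A hedged sketch of what hyperparameter tuning for a decision tree can look like with Scikit-learn is shown below. The post does not describe the actual search, so the parameter grid, the 5-fold cross-validation, the `roc_auc` scoring, and the synthetic data from `make_classification` are all assumptions. The resulting scores are unrelated to the 90% figure above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the real features and labels.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.85, 0.15], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical grid; the values are illustrative, not the project's.
param_grid = {
    "max_depth": [3, 5, 8, None],
    "min_samples_leaf": [1, 5, 20],
    "criterion": ["gini", "entropy"],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="roc_auc",
)
search.fit(X_train, y_train)

# Score the refitted best tree on the held-out split.
test_accuracy = accuracy_score(y_test, search.predict(X_test))

# Impurity-based importances can guide feature selection.
importances = search.best_estimator_.feature_importances_

print(search.best_params_)
print(f"test accuracy: {test_accuracy:.3f}")
```

Scoring the search on ROC-AUC instead of accuracy is a design choice for the case where the positive class is rare, since a tree that always predicts the majority class can still score high accuracy.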
Evaluation: Assessed model performance using metrics such as precision, recall, F1-score, and ROC-AUC to