A Data Analysis Exploration into Environmental & Socioeconomic Factors on Poor Health Outcomes
A course capstone project exploring how environmental and socioeconomic variables relate to public health outcomes, combining machine learning with data visualization in Python.
Course · Python · Machine Learning

PROJECT CASE STUDY
OVERVIEW
This project analyzed how environmental and socioeconomic factors correlate with health outcomes. I used Python-based data pipelines, machine learning models, and visualization tools to build a clear, presentation-driven narrative from real datasets. As a capstone, it required first building a solid foundation in Python and then moving on to the specific modules needed for visualization and statistical analysis.
WHAT I DID
- Imported and standardized county-level public health datasets using Pandas, handling missing values and skewed distributions.
- Applied classical machine learning techniques, including data pre-processing and decision tree models built on extracted features and relevant variables.
- Visualized the data with heatmaps, distribution plots, and state-by-state comparisons.
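The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, every column name (`median_income`, `pct_uninsured`, `air_quality_index`, `pct_poor_health`) is hypothetical, and median imputation, a `log1p` transform, and a depth-limited `DecisionTreeRegressor` are assumed choices standing in for whatever the capstone really used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a county-level table; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
counties = pd.DataFrame({
    "median_income": rng.lognormal(mean=10.8, sigma=0.4, size=n),  # right-skewed
    "pct_uninsured": rng.uniform(3, 25, size=n),
    "air_quality_index": rng.normal(45, 10, size=n),
})
counties["pct_poor_health"] = (
    30
    - 2.0 * np.log(counties["median_income"])
    + 0.4 * counties["pct_uninsured"]
    + rng.normal(0, 1, size=n)
)
# Blank out a few values to mimic missing survey data.
missing_rows = counties.sample(frac=0.05, random_state=0).index
counties.loc[missing_rows, "pct_uninsured"] = np.nan


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Impute missing numeric values and reduce skew in the income column."""
    out = df.copy()
    # Median imputation: one simple, assumed strategy for missing values.
    out = out.fillna(out.median(numeric_only=True))
    # A log transform compresses the long right tail of income.
    out["log_median_income"] = np.log1p(out["median_income"])
    return out.drop(columns="median_income")


clean = preprocess(counties)
features = clean.drop(columns="pct_poor_health")
target = clean["pct_poor_health"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.25, random_state=0
)

# A shallow tree keeps the fitted model easy to inspect and explain.
model = DecisionTreeRegressor(max_depth=4, random_state=0)
model.fit(X_train, y_train)

# Feature importances give a first look at which variables the tree relies on.
importances = pd.Series(
    model.feature_importances_, index=features.columns
).sort_values(ascending=False)
print(importances)
```

Limiting the tree depth is a deliberate trade-off in this sketch: it gives up some accuracy in exchange for splits that are simple enough to explain in a presentation.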
RESULTS / IMPACT
- Identified key variables associated with adverse health outcomes and examined them from a statistical perspective.
- Delivered a polished, insight-driven presentation built with multiple Python libraries and visualizations, which could potentially support policymaker analysis.
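One common way to shortlist variables associated with an outcome is to rank them by correlation. The sketch below is an assumption about how that might look, not the method used in the project: the data is synthetic, the column names are hypothetical, and Spearman rank correlation is chosen here only because it tolerates skewed data.

```python
import numpy as np
import pandas as pd

# Synthetic county-level data; column names are hypothetical.
rng = np.random.default_rng(1)
n = 300
pct_uninsured = rng.uniform(3, 25, size=n)
air_quality_index = rng.normal(45, 10, size=n)
df = pd.DataFrame({
    "pct_uninsured": pct_uninsured,
    "air_quality_index": air_quality_index,
    # The outcome is built to depend on pct_uninsured only.
    "pct_poor_health": 10 + 0.5 * pct_uninsured + rng.normal(0, 1, size=n),
})

# Pairwise Spearman correlations; this matrix is what a heatmap would display.
corr = df.corr(method="spearman")

# Rank predictors by the strength of their association with the outcome.
ranking = (
    corr["pct_poor_health"]
    .drop("pct_poor_health")
    .abs()
    .sort_values(ascending=False)
)
print(ranking)
```

A correlation ranking like this only flags association. It says nothing about causation, so it works best as a starting point for closer analysis.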
PRESENTATION
GALLERY
