A Data Analysis Exploration into Environmental & Socioeconomic Factors on Poor Health Outcomes

A course capstone project exploring how environmental and socioeconomic variables relate to public health outcomes, combining machine learning with data visualization in Python.

Course · Python · Machine Learning

OVERVIEW

This project analyzed how environmental and socioeconomic factors correlate with health outcomes. I used Python-based data pipelines, machine learning models, and visualization tools to build a clear, presentation-driven narrative from real datasets. As a capstone, it required first building a solid foundation in Python and then moving on to the specific libraries needed for visualization and statistical analysis.

WHAT I DID

  • Imported and standardized county-level public health datasets using Pandas, handling missing values and skewed distributions.
  • Applied classical machine learning techniques, including data preprocessing and decision tree models built on extracted features and relevant variables.
  • Visualized the data with heatmaps, distribution plots, and state-by-state comparisons.
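
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the synthetic data, the column names (`median_income`, `pm25`, `pct_uninsured`, `poor_health_pct`), the median imputation, the log transform, and the tree depth are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a county-level table; every column name is hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "state": rng.choice(["CA", "TX", "NY", "OH"], size=n),
    "median_income": rng.lognormal(mean=11.0, sigma=0.4, size=n),  # right-skewed
    "pm25": rng.normal(9.0, 2.0, size=n),
    "pct_uninsured": rng.uniform(3.0, 25.0, size=n),
})
df["poor_health_pct"] = (
    30.0
    - 1.5 * np.log(df["median_income"])
    + 0.4 * df["pm25"]
    + 0.3 * df["pct_uninsured"]
    + rng.normal(0.0, 1.0, size=n)
)
# Blank out a few values to mimic missing data.
df.loc[rng.choice(n, size=15, replace=False), "pm25"] = np.nan

features = ["median_income", "pm25", "pct_uninsured"]

# 1. Missing values: median imputation (one simple option among many).
#    For brevity this uses the full table; a stricter workflow would
#    compute the medians on the training rows only.
clean = df.copy()
clean[features] = clean[features].fillna(clean[features].median())

# 2. Skew: a log transform pulls in the long right tail of income.
clean["median_income"] = np.log1p(clean["median_income"])

# 3. Model: a shallow decision tree on the prepared features.
X_train, X_test, y_train, y_test = train_test_split(
    clean[features], clean["poor_health_pct"], test_size=0.25, random_state=0
)
tree = DecisionTreeRegressor(max_depth=4, random_state=0)
tree.fit(X_train, y_train)
r2 = tree.score(X_test, y_test)  # R^2 on held-out rows

# 4. Inputs for the visuals: a correlation matrix (what a heatmap would show)
#    and a per-state mean (what a state comparison chart would show).
corr = clean[features + ["poor_health_pct"]].corr()
by_state = clean.groupby("state")["poor_health_pct"].mean().sort_values()

print(f"held-out R^2: {r2:.2f}")
print(corr.round(2))
print(by_state.round(2))
```

The `corr` table could be handed to a plotting library's heatmap function, and `by_state` to a bar chart; plotting is left out here to keep the sketch short.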

RESULTS / IMPACT

  • Identified key variables associated with adverse health outcomes and examined them from a data and statistical perspective.
  • Delivered a polished, insight-driven presentation built with several Python libraries and visualizations, which could potentially support policymaker analysis.
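
One common way to surface the variables most associated with an outcome is to rank a fitted tree's feature importances. The sketch below uses synthetic data and hypothetical column names (`pct_uninsured`, `pm25`, `noise_feature`); it does not reproduce the project's data or findings.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Synthetic features; all names are hypothetical placeholders.
rng = np.random.default_rng(1)
n = 300
X = pd.DataFrame({
    "pct_uninsured": rng.uniform(3.0, 25.0, size=n),
    "pm25": rng.normal(9.0, 2.0, size=n),
    "noise_feature": rng.normal(0.0, 1.0, size=n),
})
# The outcome depends strongly on pct_uninsured, weakly on pm25,
# and not at all on noise_feature.
y = 0.8 * X["pct_uninsured"] + 0.2 * X["pm25"] + rng.normal(0.0, 0.5, size=n)

tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort them to get a ranking.
importances = (
    pd.Series(tree.feature_importances_, index=X.columns)
    .sort_values(ascending=False)
)
print(importances.round(3))
```

A ranking like this reflects association within the fitted model, not causation, so it is best read alongside the correlation and distribution plots.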

PRESENTATION