Project updates forthcoming ...

Assessing Inequalities in the Built Environment

Multiple studies have shown associations between the built environment and health outcomes. However, performing neighborhood audits can be costly and time-consuming. We apply deep learning to satellite images to identify features in the built environment that could be associated with poverty, health and crime. Our first study demonstrating that artificial intelligence could be used to process satellite images to study population health outcomes was published in JAMA Network Open.

Unsafe Foods & FDA Recall

This project uses product reviews posted online to identify unsafe food products. Foods that are mislabeled, contaminated, or spoiled are recalled through a time-consuming process that can leave consumers at risk of allergic reactions, injury, and illness for months. Our goal is to use these reviews to predict whether a product will be recalled. Project details and code are on GitHub.

Who shares personal health information on the Internet?

A number of systems use data from the Internet (such as news, social media, and crowd-sourced reports) and other digital sources (e.g., cell phones and wearable devices) to monitor disease spread, assess population attitudes towards vaccines, and improve understanding of the interaction between population behavioral changes and health. In addition to the challenge of extracting public health signals from the noise inherent in these data sources, there are significant biases due to differences in the representation of individuals from different locations, age groups, and racial/ethnic backgrounds. This project assesses demographic and spatial disparities in digital reporting of illness. Funding for this project is provided by the Robert Wood Johnson Foundation. Project code is on GitHub.

Estimating prevalence of chronic disease risk factors from social media postings

We explore the use of social media data as a supplement to estimates of risk factor prevalence from traditional survey data. We construct county-level estimates of leisure-time physical activity and obesity prevalence from geotagged postings about food and exercise. These estimates are made separately for males and females. Initial publication is forthcoming.

Foodborne Illness Surveillance

The aim of this project is to develop a framework for monitoring foodborne illness reports using data from sources such as Twitter and Yelp. We work with public health departments to develop tools that monitor local reports of foodborne illness, supporting targeted restaurant inspections and surveillance of foodborne disease outbreaks. This project is run in collaboration with the Computational Epidemiology Group at Boston Children's Hospital and is funded by an R01 grant from the National Library of Medicine, National Institutes of Health. Project website.

Integrating Digital Data with Other Data Sources for Infectious Disease Surveillance and Forecasting

This project was funded by an NIH BD2K K01 award. We combined digital event-based data sources with other disparate data sources, such as climate data and case data from traditional disease surveillance systems, to estimate disease risk and forecast the temporal dynamics of infectious diseases. We applied machine learning, statistical modeling and geospatial mapping techniques to study the spread of influenza-like illness, chikungunya and Zika.

Influenza Forecasting

My PhD dissertation focused on developing a method for forecasting influenza epidemics using network epidemic models. During my postdoc, I also worked on influenza surveillance using novel data sources. See below for a list of relevant publications.

Computational approaches to influenza surveillance: beyond timeliness

A Dirichlet process model for classifying and forecasting epidemic curves

A systematic review of studies on forecasting the dynamics of influenza outbreaks

Forecasting Peaks of Seasonal Influenza Epidemics

Using Clinicians' Search Query Data to Monitor Influenza Epidemics

Monitoring Influenza Epidemics in China with Search Query from Baidu

A Simulation Optimization Approach to Epidemic Forecasting