Explorations in Computational Sociology: Causal-Textual Attribution of Epidemiological Policy Failures
Session Number
1
Advisor(s)
Sue Fricano, IMSA
Location
IN2 Alpha Design Studio
Discipline
Business
Start Date
15-4-2026 10:15 AM
End Date
15-4-2026 11:00 AM
Abstract
Current forecasting models for epidemiological crises often treat policy interventions as binary variables: a treatment is either applied or it is not. The effectiveness of these mandates, however, depends heavily on legal phrasing, since inadvertent semantic loopholes can lead to non-compliance. Evaluating these textual effects is challenging, because mapping policy text directly to raw disease curves introduces selection bias from underlying socioeconomic factors. This project presents a novel "select -> predict" natural language processing framework that traces the specific language of epidemiological laws to their observable effects on time-series data. By comparing the observed variance between “treated” and “untreated” curves, we derive a dataset from Oxford’s Coronavirus response database on which to train and test. We chunk each policy and feed selected chunks into a law-specialized predictor, which we attempt to train to predict a “compliance score” or “risk score”. This derived "Semantic Risk Score" is then supplied as an extra regressor to a foundation time-series model that forecasts the previously constructed dataset of disease-prevalence curves. The project aims to produce a tool that lets legislators input policies, highlights compliance gaps, and visualizes the resulting epidemiological curve. Future work will focus on integrating additional sociological models to expand the applications of this framework.
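As a rough illustration of the "select -> predict" flow described above, the following toy sketch chunks a policy, selects the chunks with the most hedging language, derives a score, and passes that score as an extra regressor to a forecaster. Everything in it is an assumption made for illustration, not the project's implementation: the `LOOPHOLE_TERMS` list stands in for the trained law-specialized predictor, the multiplicative `forecast` function stands in for the foundation time-series model, and the function names and the `base_growth` and `risk_effect` parameters are hypothetical.

```python
import re

# Hypothetical hedge terms; a stand-in for a trained, law-specialized predictor.
LOOPHOLE_TERMS = ("where practicable", "may", "recommended", "unless", "reasonable")


def chunk_policy(text: str) -> list[str]:
    """Split policy text into sentence-level chunks on '.' or ';'."""
    parts = re.split(r"(?<=[.;])\s+", text.strip())
    return [p for p in parts if p]


def loophole_hits(chunk: str) -> int:
    """Count how many distinct hedge terms appear in a chunk."""
    lowered = chunk.lower()
    return sum(
        1
        for term in LOOPHOLE_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    )


def select_chunks(chunks: list[str], k: int = 2) -> list[str]:
    """Select step: keep the k chunks with the most hedge-term hits."""
    return sorted(chunks, key=loophole_hits, reverse=True)[:k]


def semantic_risk_score(selected: list[str]) -> float:
    """Predict step (toy): share of selected chunks containing any hedge term."""
    if not selected:
        return 0.0
    return sum(1 for c in selected if loophole_hits(c) > 0) / len(selected)


def forecast(history: list[float], risk: float, horizon: int = 3,
             base_growth: float = 0.9, risk_effect: float = 0.3) -> list[float]:
    """Toy forecaster: the risk score shifts a constant per-step growth rate.

    This is an assumed placeholder for a foundation time-series model that
    accepts the score as an extra regressor.
    """
    rate = base_growth + risk_effect * risk
    level = history[-1]
    out = []
    for _ in range(horizon):
        level *= rate
        out.append(level)
    return out


if __name__ == "__main__":
    policy = (
        "Masks must be worn in all indoor public spaces. "
        "Businesses may allow exceptions where practicable. "
        "Gatherings are limited to ten people."
    )
    chunks = chunk_policy(policy)
    risk = semantic_risk_score(select_chunks(chunks))
    print("risk score:", risk)
    print("forecast:", forecast([100.0], risk))
```

In this sketch a higher score raises the projected curve relative to a policy with no hedging language, which mirrors the intended role of the score as a regressor; the actual functional form would be learned by the forecasting model rather than fixed by hand.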