Session ID: 2021-US-30MP-920
The COVID-19 vaccines are crucial to ending the global pandemic, which has caused surges of infections and deaths worldwide. However, the unprecedented rate at which they were developed and administered has raised doubts in the community regarding their safety. Data from the United States Vaccine Adverse Event Reporting System (VAERS) has the potential to help determine whether the safety concerns about the vaccines are well founded. This paper uses a combination of structured and unstructured variables from VAERS to model adverse reactions to COVID-19 vaccines.
The severity of an adverse reaction is first derived from the variables describing the vaccine recipient's outcome following a reaction. Next, unstructured data in the form of text describing symptoms, medical history, medication, and allergies are converted into a document term matrix and then combined with the structured variables to build a model that predicts the severity of the adverse reaction.
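As an illustration only, the conversion of free text into a binary document term matrix and its combination with structured variables can be sketched in plain Python. This is not the JMP Pro or SAS Enterprise Miner workflow used in the paper; the records, the field names (`age`, `sex`, `symptom_text`), and the helper functions `tokenize` and `binary_dtm` are hypothetical, and no stop-word removal or stemming is applied.

```python
import re


def tokenize(text):
    """Lowercase a free-text field and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def binary_dtm(documents):
    """Return (vocabulary, rows); rows[i][j] is 1 if term j occurs in document i."""
    token_sets = [set(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*token_sets))
    rows = [
        [1 if term in tokens else 0 for term in vocabulary]
        for tokens in token_sets
    ]
    return vocabulary, rows


# Hypothetical records, invented for illustration (not real VAERS data).
reports = [
    {"age": 71, "sex": "F", "symptom_text": "Fever and headache"},
    {"age": 34, "sex": "M", "symptom_text": "Headache, fatigue"},
]

vocabulary, dtm = binary_dtm([r["symptom_text"] for r in reports])

# Combine structured variables (age, an indicator for sex) with the term columns.
feature_names = ["age", "sex_female"] + vocabulary
features = [
    [r["age"], 1 if r["sex"] == "F" else 0] + row
    for r, row in zip(reports, dtm)
]
```

Each row of `features` then holds one report, with the structured columns first and one 0/1 column per term, which is the kind of combined input a severity model could be trained on.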
The predictive model is built using JMP Pro 16 and SAS Enterprise Miner 14.1, using logistic regression and decision trees with both a binary document term matrix and term frequency-inverse document frequency (TF-IDF) weighting, with model evaluation based on the lowest misclassification rate. The best-fitting model is a logistic regression model for ordinal target variables. The key determinants contributing to an adverse reaction in the optimal model are age, number of symptoms, period between vaccination and onset, sex, state, type of vaccine, how the vaccine was administered, symptoms, history of dementia, and history of chronic obstructive pulmonary disease.
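To show how term frequency inverse document frequency weighting differs from binary indicators, here is a minimal sketch of one common textbook variant: raw term count multiplied by the natural log of (number of documents / number of documents containing the term). The exact weighting scheme applied by JMP Pro or SAS Enterprise Miner may differ, and the function name `tfidf_matrix` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def tfidf_matrix(tokenized_docs):
    """Return (vocabulary, rows) where rows[i][j] = count of term j in doc i * idf(j)."""
    vocabulary = sorted({term for doc in tokenized_docs for term in doc})
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    # Assumed idf variant: ln(N / df), with no smoothing.
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocabulary}
    rows = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        rows.append([counts[term] * idf[term] for term in vocabulary])
    return vocabulary, rows


# Hypothetical tokenized symptom descriptions, invented for illustration.
docs = [
    ["fever", "headache"],
    ["headache", "fatigue", "fatigue"],
]
tfidf_vocabulary, tfidf_rows = tfidf_matrix(docs)
```

Under this variant, a term that appears in every document (here `headache`) receives a weight of zero, while a term repeated within a single report (here `fatigue`) is weighted up, whereas a binary matrix would mark both simply as present.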