Analysis of “Big” Real-World Health Care Data: Promises and Perils
Rockefeller Hall 300
Professor Bhramar Mukherjee to deliver the Department of Mathematics and Statistics’ annual Henry Seely White Lectures
Using administrative patient-care data such as Electronic Health Records and medical/pharmaceutical claims for population-based scientific research has become increasingly common. With vast sample sizes leading to very small standard errors, researchers need to pay more attention to potential biases in the estimates of association parameters of interest, specifically to biases that do not diminish with increasing sample size. Of these multiple sources of bias, in this talk we focus primarily on understanding selection bias. We present an analytical framework for understanding selection bias and arriving at bias-reduced inference using external data from a target population. We illustrate our methods via case studies in cancer and COVID-19. We try to highlight that sampling and study design are at the heart of the analysis of big data. This is joint work with many students and colleagues at the University of Michigan School of Public Health.
This is the second of two lectures in the Henry Seely White Lecture Series. The first lecture, The Data Struggle of the Unseen, is on April 22, 2024.
About Bhramar Mukherjee
Dr. Mukherjee’s research interests span statistical methods for analyzing electronic health records, gene-environment interaction studies, Bayesian methods, shrinkage estimation, and the analysis of multiple pollutants. With over 375 publications in statistics, biostatistics, medicine, and public health, Dr. Mukherjee is globally recognized for her prolific research contributions and has received numerous awards for her outstanding scholarship, service, and teaching. She is a fellow of the American Statistical Association and the American Association for the Advancement of Science, and was elected to the US National Academy of Medicine.
During the COVID-19 pandemic, Dr. Mukherjee and her team actively contributed to modeling the SARS-CoV-2 virus trajectory in India, garnering significant attention from major media outlets worldwide. She has been a strong advocate for diversifying the data science workforce and for global data equity.