The Evolving Role and Impact of Statisticians in Pharma

Organizers: Revathi Ananthakrishnan, Shubhadeep Chakraborty, Veronica Bunn

Title: Machine Learning approaches for small Ns: a challenge working at the intersection of clinical trial data scientific approaches and artificial intelligence

Thursday, Aug 20th, 2026

Speaker: Pradipta Ray, Associate Director

Bristol Myers Squibb

Pradipta Ray, Ph.D., is a computer scientist by training. He obtained his Ph.D. from the School of Computer Science at Carnegie Mellon, working on stochastic processes of regulatory DNA evolution. From 2010 to 2016, Pradipta was part of NIH’s Epigenome Roadmap Project, modelling epigenetics and its transcriptional consequences in human stem cell lines at UT Dallas. From 2016 to 2022, he led the creation of the computational genomics unit at the Center for Advanced Pain Studies at UT Dallas, working on transcriptome-wide association studies to identify novel non-opioid drug targets for chronic pain. Pradipta joined Bristol Myers Squibb as Associate Director of Data Science in 2022, primarily supporting AI/ML workstreams for cell therapy drug development. Pradipta’s work has resulted in over 35 peer-reviewed publications in Cell, Nature, Brain and other top scientific journals, and has been featured in Nature News, WIRED, New Scientist and Scientific American.

Abstract: Artificial intelligence is increasingly central to drug discovery and clinical development, yet many applications are constrained by limited preclinical evidence and small, heterogeneous clinical trial cohorts. In these settings, naïve application of data-hungry machine learning methods risks overfitting, instability, and misleading inference. A hypothesis-driven statistical foundation, built on targeted hypothesis testing and on non-parametric and resampling-based methods, can provide robust inference when distributional assumptions are untenable and sample sizes are small. Bayesian frameworks further offer a natural mechanism to formally encode prior biological knowledge, borrow strength across related studies, and quantify uncertainty through posterior distributions rather than unstable point estimates. Clustering approaches can help uncover latent patient subgroups or biomarker subsets, providing confidence in analyses from independent readouts, while probabilistic imputation methods can address missingness so that sample sizes are not reduced more than necessary for multivariate analysis. Regularized and Bayesian classification and regression models, including stratified formulations, enable stable prediction and effect estimation under data scarcity. Importantly, these approaches emphasize interpretability, uncertainty quantification, rigor, and reproducibility, aligning with regulatory and scientific expectations.

Title: Role of statistician in designing interim analysis in clinical trials

Thursday, Aug 20th, 2026

Speaker: Shoubhik Mondal, Director

AstraZeneca

Shoubhik Mondal is a Director in AstraZeneca’s Oncology Biometrics. He has over 10 years of experience as a statistician, primarily focused on designing oncology trials across different development stages. He earned his PhD in 2014 from the New Jersey Institute of Technology. His research interests include survival analysis, adaptive design, and estimands.

Abstract: Early identification of efficacy, or timely discontinuation due to futility (lack of efficacy) or safety concerns, can markedly improve the efficiency and ethical conduct of clinical trials. Careful planning of interim analyses is therefore a critical component of trial design. In this talk, we will discuss the statistician’s role in planning interim analyses, review the main types of interim analyses, outline widely used decision-making methodologies (including efficacy and futility boundaries and safety monitoring strategies), and illustrate their application with real trial examples.