Programming and Data Science in Pharma

Title: “My journey to become a statistical programming leader”

Friday, Aug 15th, 2025

Speaker: Nori Oharu, Head of the Real World Evidence (RWE) Center of Excellence
Bristol Myers Squibb

Norihiko “Nori” Oharu, located in Connecticut, is the Head of the Programming Center of Excellence (CoE) at Takeda Pharmaceutical Company Limited. In this capacity, Nori oversees teams responsible for managing Standards Programming and Automation, Ecosystem Management, Submission Standards and Governance, Reporting Metadata Governance, Safety Analytics, Specialized Data Programming (PK/PD), Data Anonymization, and Real-World Data Programming. After obtaining his master's degree in quantitative science, Nori started his career as a programmer at a CRO, handling programming tasks for pharmaceutical companies. He subsequently joined Pfizer, where he worked in the Oncology therapeutic area, overseeing breast cancer trials for regulatory submissions. Nori has a strong interest in innovative data analyses aimed at efficiently delivering medicine to patients and maintaining submission excellence. With his background in statistical and quantitative sciences, he is proficient in statistical programming and data analysis. Nori has led significant projects and initiatives within the organization and is highly regarded for his collaborative and effective leadership.

Abstract

Starting a career in statistical programming within the pharmaceutical industry involves gaining technical proficiency, industry knowledge, and leadership abilities. Initially, as a junior programmer using SAS to analyze clinical trial data, you begin to develop expertise in industry standards. Over time, as you advance to leading teams, you oversee programming deliverables for data reporting to support regulatory submissions, and ultimately reach a leadership position. In addition to technical skills, effective communication, and analytical thinking, strength in project management is critical in navigating the complex landscape of pharmaceutical research. Leading global projects and collaborating with cross-functional teams allowed me to gain insights into regulatory interactions and the drug development process from IND to drug approval. In this talk, I will focus on my journey to becoming a leader in data science.

Title: “Best practices and guidance for machine learning and large language model operations”

Friday, Aug 15th, 2025

Speaker: Dr. Jacob Gagnon, Biostatistician
Biogen

Dr. Jacob “Jake” Gagnon is an associate director of biostatistics at Biogen and leads a team of medical researchers in the areas of neurology and immunology. He leads statistical methodology development efforts for the latest omics technologies (e.g., spatial transcriptomics, scRNA-seq, and single-cell proteomics), performs preclinical research, is a core member of the text mining center of excellence, and leads an ML/DL focus group. His team’s research interests include deep learning, machine learning, translational biology, omics analysis, and text mining. He obtained a PhD in statistics from UMass Amherst and did postdoctoral work in biostatistics at WPI. After his postdoctoral work, he did biostatistics research for AbbVie, Roche, and then Biogen. He has authored or co-authored around 20 publications, including three in Nature journals. Additionally, he has won multiple awards, including the PHUSE/FDA innovation challenge, NEDSI’s best application of theory award, and Wiley’s highly viewed article award.

Abstract

Deep learning has been quite popular in recent years, with many applications in the pharmaceutical space. Some applications include large language models applied to R&D, computer vision for medical images, digital twins, and text-to-image generation. In this talk, we will focus on deep learning best practices, specifically in the areas of MLOps and LLMOps. MLOps, or machine learning operations, is a set of recommendations for the end-to-end life cycle of machine learning projects that aids ML pipeline reproducibility, reliability, and automation. MLOps includes model development and deployment as well as the monitoring and retraining of models. We will conclude the talk with a discussion of LLMOps, which includes particular guidance for the life cycle of LLM projects.