Data Platforms & Storage Infrastructure
Optimize Data Platforms for Scale, Speed, Performance, and Cost Efficiency
May 20–21, 2026
Life sciences data volumes are exploding, driving the need for scalable, secure, and sustainable infrastructure. From cloud-based platforms and high-performance computing to AI-driven storage optimization, organizations are rethinking how they manage, integrate, and govern critical data. This track explores cutting-edge approaches to storage, processing, and interoperability, offering real-world strategies for balancing speed, performance, cost, and compliance. Learn how industry leaders are advancing large-scale data management and shaping the future of data infrastructure in life sciences.

Tuesday, May 19

Recommended Pre-Conference Workshops and Symposia*

On Tuesday, May 19, 2026, Cambridge Healthtech Institute is pleased to offer six pre-conference Workshops scheduled across two time slots (9:00 am–12:00 pm and 1:15–4:15 pm) and three Symposia from 8:30 am–3:45 pm. All are designed to be instructional and interactive and to provide in-depth information on a specific topic. They allow for one-on-one interaction and are a great way to cover more technical aspects that would otherwise not be addressed during the main conference tracks, which take place Wednesday–Thursday.

*Separate registration required.

PLENARY KEYNOTE PROGRAM

Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

Presentation to be Announced

Welcome Reception in the Exhibit Hall with Poster Viewing

The Bio-IT Kickoff Reception is a reunion: reconnect with friends, explore cutting-edge research, and celebrate innovation! Enjoy poster presentations and networking, and vote for the Best of Show and Poster awards.

Close of Day

Wednesday, May 20

Bio-IT World’s 5K Rise and Shine Fun Run! (Sponsorship Opportunities Available)

RUN COORDINATORS:
Bridget Kotelly, Senior Conference Director, Cambridge Healthtech Institute
Eileen Murphy, Conference Producer, Cambridge Healthtech Institute

Lace up and join Bio-IT’s Coordinators for the Fun Run on Wednesday, May 20! Sprint, jog, walk, or talk-your-way-through—ALL abilities are welcome. This informal event is all about getting moving together. Full details to come…just don’t forget your sneakers!

Registration and Morning Coffee

PLENARY KEYNOTE PROGRAM

Organizer's Remarks

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

PLENARY KEYNOTE PRESENTATION:
The Collaboration Breakthrough: How Federated Learning Is Rewriting the Rules of Drug Discovery

Mohammed AlQuraishi, PhD, Assistant Professor, Systems Biology, Columbia University
Jonathan B. Gilbert, PhD, Senior Director, Ecosystem Growth and Contributor Partnerships, Eli Lilly and Company
Woody Sherman, PhD, Founder and Chief Innovation Officer, Psivant Therapeutics
Christina Taylor, PhD, Senior Science Fellow and Computational Molecular Design Lead, Bayer

The pharmaceutical industry sits on a collective treasure trove of proprietary structural biology data, yet competitive concerns have historically prevented the data sharing necessary to train the most powerful AI models for drug discovery. Federated learning is changing this paradigm, enabling biopharma companies to collaborate on AI model training while keeping sensitive data secure and confidential. This plenary session explores the groundbreaking AI Structural Biology (AISB) Network, where industry leaders are pooling proprietary protein-ligand structure data to collaboratively train OpenFold3, an AI model designed to predict molecular interactions with precision approaching X-ray crystallography. Through the federated computing platform, thousands of experimentally determined protein–small molecule structures remain securely at their original locations while contributing to a shared learning framework that no single organization could achieve alone. This session reveals how federated learning solves the industry's most persistent challenge: unlocking collective intelligence while protecting intellectual property. Attendees will hear directly from consortium leaders about:

  • The technical architecture enabling privacy-preserving collaborative AI training across competing organizations 
  • Real-world implementation of federated learning platforms and computational governance frameworks 
  • Strategic rationale for industry collaboration: why sharing model training beats going it alone 
  • Impact and outcomes from early OpenFold3 results in predicting binding affinities and accelerating small molecule discovery 
  • The future of collaborative AI in biopharma, from structural biology to clinical development

Coffee Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Start your morning with coffee, connections, and cutting-edge research! Enjoy poster presentations, network in the Exhibit Hall, vote for awards, and get a chance at a fabulous raffle prize!

Organizer's Welcome Remarks

BUILDING RELIABLE, SCALABLE RESEARCH DATA FOUNDATIONS

Building a Validated, Multi-Lingual Analytics Ecosystem on a Shared Data Platform: Key Lessons from Practice

Anand Ganesan, Product Lead, GD-IT, Regeneron Pharmaceuticals, Inc.
Sriram Krishnamurthy, Director, GD-IT, Regeneron Pharmaceuticals, Inc.

Modern computing and analytics platforms must deliver scalability to support diverse user communities, from regulatory submissions and patient safety in GxP environments to non-GxP use cases like exploratory analysis, all built on an adaptive, scalable infrastructure. This case study explores the journey of creating a validated, multi-lingual analytics ecosystem on a shared storage platform, leveraging insights from non-regulatory use cases. It highlights key architectural decisions, governance and validation strategies, and operational lessons, providing practical guidance for designing scalable, compliant analytics platforms in regulated environments.

Beyond the Script: Architecting a Truly Reproducible R Workflow for Clinical Analysis

Srihas Velpuri, Infrastructure Engineer, Biostatistics & Data Science R&D, F. Hoffmann La Roche AG

In clinical biostatistics, reproducibility isn't optional—it's a regulatory mandate. This presentation details the architecture of a truly reproducible R workflow designed for GxP-compliant clinical analysis. We will break down our solution into four key pillars—Code (Git), Dependencies (Posit Package Manager), Environment (Docker), and Execution (CI/CD)—demonstrating how they combine to create a validated, auditable, and automated platform that ensures the integrity of every result and streamlines regulatory submissions.

Navigating the Data Landscape: The Cancer Research Data Commons Data Ecosystem

Durga Addepalli, PhD, Health Scientist, Center for Biomedical Informatics & IT, NIH NCI

NCI's Cancer Research Data Commons is a data ecosystem built to administer and manage the data generated by the various NCI-funded programs for FAIR data sharing across the research community. NCI has been a pioneer in establishing a democratized ecosystem that co-locates data and compute, building secure and scalable data commons and cloud analytical platforms for diverse users across the cancer community.

Transition to Lunch

Refreshment Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Bio-IT's hall is bigger than ever; one break won’t cut it! Enjoy dessert and coffee after lunch, explore booths and posters, vote for awards, and participate in our raffle for a chance to win a prize!

FEDERATED AND FAIR: SCALING COLLABORATION AND DATA REUSE

Federated Data Systems to Turbocharge Collaboration

Ahmad Haider, PhD, Vice President, AI/ML & Data, Natera

In this session, we will discuss a federated data architecture and operating model to shift ownership of data to domain teams and treat datasets as discoverable, secure products—enabling data producers to publish well-documented, versioned data products, data consumers to self-serve high-quality assets for analytics and ML, and platform teams to provide the plumbing, tooling, and federated governance that makes it all scalable. By combining product thinking, domain ownership, an organization-wide data catalog, and platform-provided capabilities (cataloging, lineage, observability, automated quality checks, and data contracts), we have made pipelines resilient to sudden schema changes, elevated accountability for data quality, and removed the central-team bottleneck that traditionally limited access. The result: faster time-to-insight, more reliable data for data science and analytics, and a governed, interoperable ecosystem where standards and policy are enforced without stifling domain innovation, demonstrating how a pragmatic federated implementation can deliver both autonomy and enterprise-grade control.

MATRIX: Scalable Compound Property Generation for Accelerating Bioinformatics Discovery

Rody Arantes, Director, Digital Technology, Montai Therapeutics
Thomas George Thomas, Senior Data Engineer, Platform Engineering, Montai Therapeutics

The MATRIX (Montai AtTRIbute eXpander) system is a cloud-native platform for large-scale molecular property generation with full data traceability. It produces 50+ attributes for 500M compounds in under two hours, addressing throughput and reproducibility challenges in biotech R&D while enabling faster, more reliable discovery workflows. This scalable, reproducible workflow modernizes cheminformatics and provides a blueprint for rapid, traceable, research-grade bioinformatics discovery.

From Data Silos to Data Facilities: Building Platforms That Make Research Data Findable, Ready, and Reusable

Fernanda Foertter, MSc, Executive Director, The University of Alabama High Performance Computing and Data Center

As demand for analytics and AI accelerates, many organizations face a hidden bottleneck: they don’t know what data they have, where it lives, or how quickly it can be made usable. While FAIR principles define what good data should look like, they don’t solve the operational challenge of discovering, curating, and serving real-world datasets across fragmented storage environments. This session explores the emerging concept of a data facility, an evolution beyond file systems, catalogs, or data lakes. Examine how organizations can architect platforms that actively surface valuable datasets, connect distributed storage, apply readiness workflows, and deliver data that is truly “compute-ready” for downstream research. Attendees will learn practical approaches for automating data discovery, reducing reliance on human intermediaries, and building infrastructure that turns legacy and siloed data into reusable research assets.

Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

Unwind with colleagues at our lively reception! Explore posters, vote for the best, network with exhibitors, enjoy a drink, and try to win a raffle prize. Celebrate Best of Show winners!

Close of Day

Thursday, May 21

Registration Open

Continental Breakfast with Breakout Discussions

Connect & Collaborate: Breakfast Networking Roundtables (Sponsorship Opportunities Available)

Kick off the morning with small-group roundtable discussions designed to spark collaboration, share challenges, and exchange insights across the Bio-IT community. Attendees gather around themed tables—spanning data ecosystems, AI adoption, foundational models, intelligent labs, translational infrastructure, and emerging technologies—to compare experiences and explore practical strategies. Each roundtable seats 8–10 participants for focused, peer-driven conversation that accelerates problem-solving, strengthens connections, and surfaces cross-functional perspectives before the plenary keynote. Topics will be announced throughout the year on the Bio-IT World website as part of our 2026 theme rollout, with opportunities for attendees and partners to propose table themes. If you have a topic to suggest or would like to participate as a moderator, contact Cindy Crowninshield at ccrowninshield@healthtech.com.

PLENARY KEYNOTE PROGRAM

Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

Bio-IT World 2026 Innovative Practices Awards Ceremony (Winners Announced)

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

The Innovative Practices Awards recognize and celebrate technology innovation in the life sciences. Bio-IT World is currently accepting entries for the 2026 Innovative Practices Awards, a competition designed to recognize partnerships and projects pushing our industry forward. Winners will be announced in April 2026, recognized during the Thursday, May 21 Plenary Keynote Program, and scheduled to give a podium presentation about their project during the conference. The deadline for entry is March 2, 2026. For more details about the Awards and to submit an application, visit www.bioitworldexpo.com/innovativepractices.

Bio-IT World 2026 Emerging Innovator Award—NEW (Winner Announced)

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

The Emerging Innovator Award recognizes one exceptional early-career researcher advancing the future of life sciences through breakthrough work in biomedical data, computational methods, or technology-enabled discovery. The 2026 awardee will deliver a 10-minute plenary keynote at Bio-IT World, highlighting the impact of their research and the forward-looking direction of their work. Nominations are due March 2, 2026, at www.bio-itworldexpo.com.

PLENARY KEYNOTE PRESENTATION:
Hopscotching through Drug Discovery: 15 Years of CADD and the Rise of AI

José Duca, PhD, Global Head Computer-Aided Drug Discovery, Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Inc.

Coffee Break in the Exhibit Hall with Poster Competition Winners Announced (Sponsorship Opportunity Available)

Bio-IT is all about connections! Explore booths and award-winning posters, and network with clients, colleagues, and exhibitors. Grab coffee, build relationships, and stay for a chance to win a raffle prize!

Organizer's Remarks

INTELLIGENT PLATFORMS FOR AI-DRIVEN DISCOVERY

Function-First Generative Drug Design Platform Unlocking Functional Biologics

Adam Kraut, Director, Research Informatics and Data Architecture, Metaphore Bio

Metaphore Bio is building a function-first generative drug-design platform where data infrastructure is a first-class product. Cloud-orchestrated autonomous labs stream high-throughput functional and multiomic data into a governed cloud platform that powers AI/ML and protein design at scale. I will share how our storage and compute architecture balances speed, performance, cost, and compliance while accelerating functional biologics.

Digital Platform for Data-Driven, AI-Enabled Biotherapeutics Discovery

Yuhao Lin, Consultant, Biologics IT, Eli Lilly & Company

Lilly is developing an integrated digital platform to transform large molecule discovery in the age of AI. From assay data capture and NGS workflows to MLOps and DMTA, the solutions in this platform are unified in both data and UI by design. This talk will highlight the roadmap, progress, impact, challenges, and learnings from this transformational platform.

Structure Designer: AbbVie's Internal Compound Design Platform

Elyse Geoffroy, Technology Engineer, Information Research, AbbVie Inc.

Structure Designer is an internally built, customizable application for AbbVie's Discovery medicinal chemists, centralizing compound design workflows and data access, and replacing fragmented, vendor-dependent solutions with a flexible, web-based platform. Its modern architecture integrates rapid calculations, seamless session management, and secure user controls, significantly improving speed, usability, and accessibility for both internal staff and external collaborators, all at a lower cost than commercial alternatives.

Session Break and Transition to Lunch

Refreshment Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Feeling tired? Recharge during the final Networking Exhibit Hall break! Visit booths, explore posters, connect with peers, and turn in your Game Cards for a chance to win a raffle prize.

TRENDS FROM THE TRENCHES: BRIDGING TRADITIONAL INSIGHTS WITH INNOVATIVE ADVANCEMENTS

Chairperson's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

For 20 years, Trends from the Trenches has been Bio-IT World’s unscripted pulse check, offering candid, insider perspectives on what works, what fails, and what’s pure hype in scientific computing. As the field evolved, the session broadened to reflect the real operational challenges and breakthroughs shaping R&D. For 2026, the format evolves again: a focused, credibility-driven keynote paired with a community-powered unconference built from attendee input. The result is a forum for late-breaking insights, grounded realities, forward-looking perspectives, and practical solutions you won’t find in vendor decks, marketing summaries, or any LLM. It leaves participants energized by the collective intelligence in the room and inspired by the emerging possibilities shaping the future of life-science computing.

From 20 Years of Trends to the Next Era of Digital R&D

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

This presentation frames the industry’s next chapter by tracing how Trends from the Trenches has shaped digital R&D for two decades and by spotlighting the forces redefining scientific computing today: AI–HPC convergence, modality-driven compute, multimodal data, and rising expectations for speed, interoperability, and trust. Remarks set the foundation for a forward-looking exploration of where digital biology and computational innovation are heading next.

FEATURED TALK: The Hard Truth about Digital R&D: Patterns, Pitfalls, and the Next Wave of Innovation

Eleanor A. Howe, PhD, Founder & CEO, Diamond Age Data Science

This presentation delivers a candid, comprehensive assessment of the forces reshaping scientific computing and digital R&D. Eleanor examines the technologies, platforms, modalities, and market dynamics that are truly driving change—highlighting what’s working, what’s stalling, and what’s losing relevance. She synthesizes emerging patterns across AI, data platforms, workflow orchestration, multimodal analytics, and new therapeutic and diagnostic directions, while calling out persistent bottlenecks and architectural missteps slowing progress. The result is a grounded, evidence-based view of where the field is heading and which strategies will matter most in the next cycle of innovation.

Community Unconference: Live Problems, Live Solutions

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

This session features a structured, participatory unconference built around topics sourced from Bio-IT event attendees during the conference week. A working group synthesizes all submissions Wednesday evening into a small set of high-value discussion themes. The facilitator guides the room through rapid-fire exchanges, micro-debates, and collaborative problem-solving focused on operational realities in AI, computing, data engineering, and scientific software. The goal is to surface patterns, stress-test ideas, and extract practical solutions emerging across the community—creating an annual, crowd-generated state-of-the-field snapshot that only this session can produce.

Close of Conference

