Open Access and Collaborations

The Open Access and Collaborations track presents case studies on collaborative technologies and methodologies used to aggregate and harmonize data from heterogeneous sources to accelerate translational and clinical research. Speakers will present novel approaches and examine the key drivers, technology innovations, collaboration platforms, open-source frameworks, legal considerations, and other factors that shape how data is managed and drive transformative change through translation. Additional themes include emerging security; analytic and semantic capabilities; FAIR data practices and applications; data commons; implications of Europe's Plan S on publishing in the United States; and large collaborative datasets.

Tuesday, October 6

PLENARY KEYNOTE PROGRAM

10:00 am

Welcome Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute
Scott Parker, Director of Product Marketing, Marketing, Sinequa
10:15 am

NIH’s Strategic Vision for Data Science

Susan K. Gregurick, PhD, Associate Director, Data Science (ADDS) and Director, Office of Data Science Strategy (ODSS), National Institutes of Health
Rebecca Baker, PhD, Director, HEAL (Helping to End Addiction Long-term) Initiative, Office of the Director, National Institutes of Health
11:05 am

LIVE Q&A: Session Wrap-Up Panel Discussion

Panel Moderator:
Ari E Berman, PhD, CEO, BioTeam Inc
11:25 am Lunch Break - View Our Virtual Exhibit Hall
11:55 am Recommended Pre-Conference Workshops*
W1: Data Management for Biologics: Registration and Beyond
W2: A Crash Course in AI: 0-60 in Three
W3: Data Science Driving Better Informed Decisions

*Separate registration required. See workshop page for details.

1:55 pm Refresh Break - View Our Virtual Exhibit Hall
2:15 pm Recommended Pre-Conference Workshops*
W4: Digital Biomarkers and Wearables in Pharma R&D and Clinical Trials
W5: AI-Celerating R&D: Foundational Approaches to How Emerging Technologies Can Create Value
W6: Dealing with Instrument Data at Scale: Challenges and Solutions

*Separate registration required. See workshop page for details.

4:15 pm Close of Day

Wednesday, October 7

BUILDING AND SUSTAINING SUCCESSFUL SCIENTIFIC COLLABORATION TEAMS AND DATA MODELS

9:00 am

Building Interdisciplinary Research Teams: Opportunities and Challenges

L. Michelle Bennett, PhD, Director, Center for Research Strategy, NIH NCI

Biomedical research over the last decade has become increasingly complex. The field recognizes the need to bring different disciplinary experts together to solve challenging scientific questions. No longer is a single disciplinary perspective enough for truly breakthrough research advances. It is the science and the possibility of making a major advance that brings people together to form a research team. Once the team has been assembled, it becomes critical for the leader(s) of the team to recognize that there is much more than the science to tend to.

9:20 am

A Common Data Model Proposal to Facilitate and Encourage Data Sharing and Reuse

Kathy Reinold, Principal Data Modeler, Broad Institute of MIT and Harvard

Biomedical researchers have access to many data sources, but finding data with specific characteristics remains a challenge. Datasets have different metadata, format, and structure. At the Broad Institute, we envision a simpler and more comprehensive search capability to allow researchers to find and reuse data across many datasets. We propose a cross-domain data model built specifically to facilitate search and reuse. We share our methods, lessons learned, and status.

9:40 am

The National Microbiome Data Collaborative: A FAIR Data Resource for Microbiome Research

Kjiersten Fagnan, PhD, CIO, Data Science & Informatics, Lawrence Berkeley National Laboratory

Our multi-lab collaborative partnership will pilot an integrated, community-centric framework within 27 months to fully leverage existing microbiome data science resources and high-performance computing systems available within the DOE complex for data access, integration, and advanced analyses. In this talk I will cover some of the challenges in microbiome data sciences and how we aim to overcome these by creating a large, open-access repository of FAIR data.

10:00 am Coffee Break - View Our Virtual Exhibit Hall

BUILDING DATA PLATFORMS TO ACHIEVE PATIENT CENTRICITY

10:20 am

A Comprehensive Platform for Innovation with Data

Ajay Shah, PhD, MBA, Executive Director & Head of IT for Translational Medicine, Bristol-Myers Squibb

Sage is a comprehensive platform that enables FAIR data practices for data ranging from discovery through clinical research to real-world sources. This talk will give an overview of Sage and the solutions developed in the Sage ecosystem for biomarker analytics, including essential components of the platform such as uniform high-quality data ingestion, data lake enhancement with semantic integration and conformance of data, and a reproducible research framework.

10:40 am

Maximizing Real-World Assets through a Comprehensive Patient Data Platform

Albert Wang, MS, Director, IT for Translational Medicine & Informatics & Predictive Sciences, Bristol-Myers Squibb

The Sage ecosystem is a cross-functional, cohesive platform for finding, accessing, integrating, and analyzing patient-centric data. This talk will focus on real-world data (RWD). It will highlight how Sage catalogs, models, integrates, conforms, and presents patient-level metadata across all RWD assets to facilitate downstream cross-dataset analysis within an integrated managed analytics environment. This talk will touch on the business drivers for this initiative, our current progress, and some lessons learned.

11:15 am LIVE Q&A:

Session Wrap-Up Panel Discussion

Panel Moderator:
Ajay Shah, PhD, MBA, Executive Director & Head of IT for Translational Medicine, Bristol-Myers Squibb
Panelists:
Albert Wang, MS, Director, IT for Translational Medicine & Informatics & Predictive Sciences, Bristol-Myers Squibb
Alexander Sherman, Director, Center for Innovation and Bioinformatics, Massachusetts General Hospital
Michael Liebman, PhD, Managing Director, IPQ Analytics, LLC
Michael Montgomery, MD, Co-Founder and CEO, Stable Solutions LLC
Jonathan Morris, MD, Vice President, Provider Solutions; Chief Medical Informatics Officer, Real World Insights, IQVIA
Kjiersten Fagnan, PhD, CIO, Data Science & Informatics, Lawrence Berkeley National Laboratory
L. Michelle Bennett, PhD, Director, Center for Research Strategy, NIH NCI
Kathy Reinold, Principal Data Modeler, Broad Institute of MIT and Harvard
11:35 am Session Break
11:50 am Lunch Break - View Our Virtual Exhibit Hall

PLENARY KEYNOTE PROGRAM

11:55 am Interactive Breakout Discussions

Consider joining a breakout discussion group. These are informal, moderated discussions with brainstorming and interactive problem solving, allowing participants from diverse backgrounds to exchange ideas and experiences and develop future collaborations around a focused topic.

Michael Riener, President, RCH Solutions

Join us for a lively discussion among prominent pharma leaders, and learn:

Why, when & how to implement a public Cloud for your computing needs

Challenges and opportunities when setting and managing stakeholder expectations

Critical keys to success to realize the best outcomes

To learn more about RCH Solutions, visit our Virtual Booth

Joe Donahue, Managing Director, Life Sciences, Accenture

Hosted by Joe Donahue, Managing Director, Life Sciences, Accenture

Participants include: 

Andreas Matern, Head of Digital Translational Medicine, Sanofi

John Quackenbush, Professor of Computational Biology and Bioinformatics, Harvard T.H. Chan School of Public Health

Seungtaek Lee, VP, Strategic Partnerships and AI RWE Head of CoE, ConcertAI

Preston Keller, PhD, MBA, President & CCO, PercayAI

Philip Payne, PhD, Becker Professor and Chief Data Scientist, Washington University in St. Louis

Jeff Evernham, VP of Customer Solutions, Consulting, Sinequa

Most large-scale analyses of clinical trial data leverage only part of the picture, ignoring unstructured data and limiting findability across the information collected from multiple disparate data sources. This roundtable will discuss leveraging a cognitive platform to combine all data from multiple sources into one unified view using a single entry point to the data.

Sasha Paegle, Life Science Business Development, Dell Technologies

Evaluating, optimizing, and benchmarking next-generation sequencing (NGS) methods is essential for clinical, commercial, and academic NGS pipelines. Optimizations for speed and accuracy often require making trade-offs relative to other constraints. Join this roundtable to discuss benchmarking strategies, trade-offs, and the value of benchmarking genomics tools and applications.

Michael Schwartz, Head, Product Marketing, Marketing, Benchling

The life science industry has forged ahead with a new generation of therapeutics. A new R&D paradigm is required to develop scientific platforms, manage data complexity, and orchestrate progress across specialized teams. Digital solutions and data ecosystems are at the heart of this, but require both structure and adaptability to thrive in the modern life science R&D environment.

12:30 pm KEYNOTE PRESENTATION & PANEL DISCUSSION:

Game On: How AI, Citizen Science, and Human Computation Are Facilitating the Next Leap Forward

Allison Proffitt, Editorial Director, Bio-IT World

While the precision medicine movement augurs well for better outcomes through targeted prevention and intervention, those ambitions entail a bold new set of data challenges. Various panomic and traditional data streams must be integrated if we are to develop a comprehensive basis for individualized care. However, deriving actionable information requires complex predictive models that depend on the acquisition and integration of patient data on a massive scale. This picture is further complicated by new data streams emerging from quantified self-tracking and health social networks, both of which are driven by experimentation-feedback loops. These issues may seem insurmountable, but recent advancements in human/AI partnerships and crowdsourcing science add a new set of capabilities to our analytic toolkit. This session describes recent work in online collective systems that combine human and machine-based information processing to solve biomedical data problems that have been otherwise intractable, and an information processing ecosystem emerging from this work that could transform the landscape of precision medicine for all stakeholders. Pietro Michelucci will open with a framing talk, followed by short presentations from each panelist. The session will end with a Q&A discussion with speakers and attendees, moderated by Allison Proffitt.

Panelists:
Seth Cooper, PhD, Assistant Professor, Khoury College of Computer Sciences, Northeastern University
Lee Lancashire, PhD, CIO, Cohen Veterans Bioscience
Pietro Michelucci, PhD, Director, Human Computation Institute
Jérôme Waldispühl, PhD, Associate Professor, School of Computer Science, McGill University
1:55 pm Refresh Break - View Our Virtual Exhibit Hall

DATA COMMONS IN PRACTICE

2:10 pm

Building Patient Platforms with Gen3 for Research and Real-World Data

Robert Grossman, PhD, Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science; Director, The Jim and Karen Frank Center for Translational Data Science, University of Chicago
2:20 pm

Facilitating Cooperative Data Science in the Cloud with the Gen3 Platform for Creating Data Commons

Christopher G. Meyer, PhD, Scientific Support Analyst, Center for Translational Data Science, University of Chicago

There is growing appreciation in the scientific community for the value of making data from exploratory pilot projects and published studies available for reuse. With the aim of accelerating new discoveries through the sharing and collaborative re-analysis of data, the University of Chicago has created open-source software called “Gen3” for building data commons, which are cloud-based platforms for harmonizing, sharing, and analyzing large datasets contributed by multiple groups or organizations. This talk will present the background and an overview of the Gen3 software architecture and how it is being used.

2:35 pm

Gabriella Miller Kids First Data Resource Center: Collaborative Platforms for Accelerating Cross-Disease Pediatric Research across Development and Cancer

Allison Heath, PhD, Director, Data Technology & Innovation, Center for Data Driven Discovery in Biomedicine, Children's Hospital of Philadelphia

Since launching in October 2018, the Gabriella Miller Kids First (Kids First) Data Resource Center (DRC) has made an increasing number of pediatric genomic studies available to the research community. Our multi-institutional team has taken a “best-of-breed” approach to develop a platform comprising reusable technology stack components that enable search and query capabilities coupled with secure workspaces for data analysis. Additionally, the DRC services strive towards a foundation for interoperability with other large-scale data sources, both nationally and globally.

2:50 pm Refresh Break - View Our Virtual Exhibit Hall
3:10 pm

Our Experience as the First External Organization to Build a Gen3 Data Commons

William Van Etten, PhD, Senior Scientific Consultant, Consulting, BioTeam Inc.

BMS asked BioTeam to build a Data Commons to share FAIR scientific data across their organization. Through work at NCI, BioTeam became acquainted with the open-source Gen3 Data Commons framework developed by the University of Chicago, and we chose to leverage Gen3 for this BMS Cloud platform. In this talk, we will describe our experience as the first organization outside of the University of Chicago to build a Gen3 Data Commons.

3:25 pm

Silo Breaking with Gen3: Improving the Culture of Data Sharing by Leveraging a Data Commons Approach within a Global Pharmaceutical Company

Daniel Huston, Lead IT Business Partner, Translational Bioinformatics, Bristol-Myers Squibb Co.

Sharing genomics datasets for collaborative analyses poses critical challenges for pharmaceutical organizations with diverse R&D needs. In 2019, IT for Translational Medicine created “SiloBreaker”, a cross-functional program with the mission to break down the barriers that exist between genomics data and scientists. The team implemented the Gen3 Data Commons framework as its key platform solution. This talk will discuss the significant improvement in the culture of genomics data sharing at BMS achieved through the SiloBreaker program.

4:00 pm LIVE Q&A:

Session Wrap-Up Panel Discussion

Panel Moderator:
William Van Etten, PhD, Senior Scientific Consultant, Consulting, BioTeam Inc.
Panelists:
Robert Grossman, PhD, Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science; Director, The Jim and Karen Frank Center for Translational Data Science, University of Chicago
Christopher G. Meyer, PhD, Scientific Support Analyst, Center for Translational Data Science, University of Chicago
Allison Heath, PhD, Director, Data Technology & Innovation, Center for Data Driven Discovery in Biomedicine, Children's Hospital of Philadelphia
Daniel Huston, Lead IT Business Partner, Translational Bioinformatics, Bristol-Myers Squibb Co.
Adam Kraut, Dir Infrastructure & Cloud Architecture, Infrastructure & Cloud Architecture, BioTeam Inc
John Jacquay, Scientific Systems Engineer, BioIT, BioTeam Inc
4:20 pm Bio-IT Connects - View Our Virtual Exhibit Hall
5:00 pm Close of Day

Thursday, October 8

APPLICATIONS OF TOOLS IN CLINICAL, POPULATION, AND COLLABORATIVE HEALTH SETTINGS

9:00 am

Open Science with OHDSI: From Question to Evidence in 4 Days

Kees Van Bochove, Founder, The Hyve

The OHDSI (Observational Health Data Sciences and Informatics) COVID-19 studyathon in March 2020 produced a number of high-profile studies, such as the study of the safety profile of hydroxychloroquine in real-world data from nearly 1 million prevalent users of the drug across the globe, which went on to inform public health guidance from, among others, the European Medicines Agency (EMA). The volume and speed of the studies are notable, since such studies typically take many months to prepare and execute. In this talk, we will illustrate how the OHDSI community leverages the OMOP common data model, standardized analytics, and a powerful global open science network to make this a reality without compromising statistical rigor or scientific standards. We will also cover the role of the IMI EHDEN project, which is building a European network of health data sources to strengthen the OHDSI community and further observational research in real-world health data.

9:20 am

Development of Risk Prediction Models for Cardiovascular Diseases and Prostate Cancer Using Deep Learning: Case Studies from Ongoing Collaboration between the Department of Veterans Affairs (VA) and the Department of Energy (DOE)

Ravi K. Madduri, Scientist, Computation Institute, University of Chicago
9:40 am Coffee Break - View Our Virtual Exhibit Hall
10:00 am

Extending the Cancer Data Ecosystem

Ian M Fore, PhD, Sr Biomedical Informatics Program Mgr, Cancer Informatics, NIH NCI

This talk will explore the National Cancer Institute’s engagement in the Global Alliance for Genomics and Health (GA4GH) as a way of extending the ability of cancer researchers and physicians beyond the Cancer Research Data Commons to access a broader range of data and tools, potentially increasing the power of analyses of rare cancers.

10:20 am

TREAT-AD: Diversifying the Alzheimer's Disease Target Pipeline

Lara M. Mangravite, PhD, President, Sage Bionetworks

Here we describe a radically open approach to diversify the AD drug portfolio. Using multi-omic and genetic models of disease built from human brain data, a suite of emerging therapeutic hypotheses is generated that complements the small set already in drug development. To catalyze rapid evaluation of these targets, target enabling packages – containing computational and experimental resources including prototype drug compounds – are developed and openly distributed for use across the research community.

11:10 am LIVE Q&A:

Session Wrap-Up Panel Discussion

Panel Moderator:
Chris Anderson, Editor in Chief, Clinical OMICs
Panelists:
Ian M Fore, PhD, Sr Biomedical Informatics Program Mgr, Cancer Informatics, NIH NCI
Ravi K. Madduri, Scientist, Computation Institute, University of Chicago
Lara M. Mangravite, PhD, President, Sage Bionetworks
Kees Van Bochove, Founder, The Hyve
11:30 am Lunch Break - View Our Virtual Exhibit Hall
11:35 am Interactive Breakout Discussions

Consider joining a breakout discussion group. These are informal, moderated discussions with brainstorming and interactive problem solving, allowing participants from diverse backgrounds to exchange ideas and experiences and develop future collaborations around a focused topic.

Timothy Gardner, CEO, Riffyn, Inc.

How do you use data / digitization today to drive scientific discovery / product development?

What are your greatest scientific pain points / gaps that are not being met by digitization?

What kinds of outcomes do you believe digital tools could help you achieve?

Scott Jeschonek, Principal Program Manager, Microsoft Azure

Welcome to this discussion group on the growth of demand for HPC in scientific research. We are looking forward to a lively forum. We'll start by looking at three related topics:

- What events trigger demand in your organization? How has the current pandemic impacted resources?

- What could make scale and collaboration more accessible to more researchers?

- Share a recent experience of shifting workloads to manage HPC capacity.

Greg DiFraia, General Manager, Americas, Executive Team, Scality
Shailesh Manjrekar, Head of AI and Strategic Alliances, Executive Team, WekaIO

In this session we’ll discuss how to provide researchers with performance and scale in genomics & research analytics, to drive results at a price point that’s economically viable on public & private cloud.

11:35 am

Breakout: NGS Pipeline Optimizations

Tristan J Lubinski, PhD, Sr Scientist, Next Generation Sequencing Informatics, AstraZeneca Pharmaceuticals; Co-organizer, Boston Computational Biology and Bioinformatics (BCBB)
Howard Marks, Technologist Extraordinary and Plenipotentiary, VAST Data

The storage solutions we’ve been using force bioinformaticians to make trade-offs between the capacity and low cost of disk and the performance of flash. This results in complex tiering configurations that only deliver performance for a small slice of the data. In this session, we will review how advancements in technology enable VAST Data to revolutionize the cost of all-flash storage and allow bioinformaticians to run faster analyses across larger datasets for deeper insights.

PLENARY KEYNOTE PROGRAM

12:00 pm

Welcome Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute
Juergen A. Klenk, PhD, Principal, Deloitte Consulting LLP
12:15 pm

Toward Preventive Genomics: Lessons from MedSeq and BabySeq

Robert C. Green, Professor & Director, G2P Research, Genetics & Medicine, Brigham & Women's Hospital
12:40 pm

AI in Pharma: Where We Are Today and How We Will Succeed in the Future

Natalija Z. Jovanovic, PhD, Chief Digital Officer, Sanofi
1:05 pm

LIVE Q&A: Session Wrap-Up Panel Discussion

Panel Moderator:
Vivien R. Bonazzi, PhD, Managing Director & Chief Biomedical Data Scientist, Deloitte Consulting LLP
Juergen A. Klenk, PhD, Principal, Deloitte Consulting LLP
1:25 pm Happy Hour - View Our Virtual Exhibit Hall
2:00 pm Close of Conference




