Track 8: Bioinformatics

Turn Big Data into Smart Data with Computational Resources and Tools

May 4 - 5, 2022 ALL TIMES EDT

The Bioinformatics track assembles thought leaders who will present case studies using computational resources and tools that take data from multiple -omics sources and align it with clinical action. Turning big data into smart data can lead to real-time assistance in disease prevention, prognosis, diagnostics, and therapeutics. With the ever-increasing volume of information generated for curing or treating diseases and cancers, bioinformatics technologies, tools, and techniques play a critical role in turning data into actionable knowledge to meet unstated and unmet medical needs.

Tuesday, May 3

7:00 am Registration Open (Plaza Level Lobby)
8:00 am Recommended Pre-Conference Workshops and Symposium*

On Tuesday, May 3, 2022, Cambridge Healthtech Institute is pleased to offer nine pre-conference workshops scheduled across three time slots (8:00-10:00 am, 10:30 am-12:30 pm, and 1:45-3:45 pm) and a Symposium from 8:25 am-3:45 pm. All are designed to be instructional and interactive and to provide in-depth information on a specific topic. They allow for one-on-one interaction and provide a great way to explain more technical aspects that would otherwise not be covered during the main conference tracks that take place Wednesday-Thursday.

*Separate registration required. See Workshop page and Symposium page for details.

3:45 pm Session Break and Transition to Plenary Keynote

PLENARY KEYNOTE LOCATION: 210 (Overflow 208)

PLENARY KEYNOTE PROGRAM

4:00 pm

Welcome by Conference Organizer

Allison Proffitt, Editorial Director, Bio-IT World
4:05 pm Innovative Practices Award
Mike Tarselli, PhD, Chief Scientific Officer, TetraScience
4:30 pm

Ask What IT Can Do for Bio...and What Bio Can Do for IT

George M. Church, PhD, Robert Winthrop Professor, Genetics, Harvard Medical School

IT for Bio: In May 2021, one haploid human genome (3.055 billion bp) was sequenced completely, but zero diploid. We have 7.7 billion diploid humans yet to be sequenced and correlated with their environments and traits in the Personal Genome Project. Plus, at least one genome from each of over 8.7 million eukaryotic species in the Earth BioGenome Project. Plus, monitoring pathogenic and commensal bacteria, allergens, and viruses in the BioWeatherMap. Plus, ancient DNA. We are counting RNA molecules per cell in most (or all) cell types in humans, mice, and many other species throughout development and connectome (with imaging resolution up to 20 nm).

Bio for IT: Reading and writing DNA has improved exponentially in cost (at least 60-million-fold) and is increasingly used for storing non-biological data. The record for editing DNA in vivo is now 24,000 edits per cell and for storing data in vivo is about 1 terabyte per mouse. Enormous chemical and biological 'libraries' can perform 'Natural Computing' for tasks far beyond current von Neumann silicon and quantum computers. The combination of these, machine learning plus megalibraries (ML-ML), is already having commercial impact (e.g., Nabla, Manifold, Dyno, Patch).

5:45 pm Welcome Reception in the Exhibit Hall with Poster Viewing (Auditorium/Hall C)
7:00 pm Close of Day

Wednesday, May 4

7:00 am Registration Open and Morning Coffee (Plaza Level Lobby)

PLENARY KEYNOTE ROOM LOCATION: 210

PLENARY KEYNOTE PROGRAM

8:00 am

Welcome by Conference Organizer

Allison Proffitt, Editorial Director, Bio-IT World
Zachary Powers, Chief Information Security Officer, Benchling
8:15 am

Accessing and Securing the Data that Drives Breakthroughs

Allison Proffitt, Editorial Director, Bio-IT World
Rachana Ananthakrishnan, Executive Director, Globus, University of Chicago
Ari E. Berman, PhD, CEO, BioTeam, Inc.
Jonathan C. Silverstein, Chief Research Informatics Officer & Professor, Biomedical Informatics, University of Pittsburgh
Rebecca F. Rosen, PhD, Director, Office of Data Science and Sharing, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health

Life sciences research is generating massive amounts of data that should be accessible to collaborators and colleagues to enable breakthrough discoveries. However, ensuring sensitive data are shared securely in a manner that protects patient privacy and complies with myriad regulations is a daunting task, which often slows the pace of research. Our panel of leading practitioners will share insights on the challenges and best practices of managing protected research data.

9:30 am Coffee Break in the Exhibit Hall with Poster Viewing (Auditorium/Hall C)

ROOM LOCATION: 208

BIOMARKER DISCOVERY FOR PRECISION MEDICINE

10:15 am Organizer's Remarks
10:20 am

Chairperson's Remarks

Joseph D. Szustakowski, PhD, Vice President, Translational Bioinformatics, Bristol Myers Squibb Co.
10:25 am

Accelerating Biomarker Discovery for Precision Medicine with Multi-Modal Data and Advanced Analytics

Joseph D. Szustakowski, PhD, Vice President, Translational Bioinformatics, Bristol Myers Squibb Co.

Recent advances in disease biology, combination therapies, and treatment modalities have positively impacted human health. Paradoxically, these advances created a complex and fragmented clinical landscape, amplifying the need for precision medicine approaches to enable drug development. In this talk, I will present several case studies of contemporary biomarker discovery efforts that leverage diverse data types (-omics, pathology, imaging, real world data) and modern computational methods to advance drug development programs.

NOVEL VISUALIZATION METHODS TO UNDERSTAND DISEASE MECHANISMS AND IDENTIFY THERAPEUTIC INTERVENTIONS

10:55 am

Innovative Visualization of Omics Data to Advance Alzheimer’s Disease Research

Sudeshna Das, PhD, Director Biomedical Informatics Core, Neurology, Massachusetts General Hospital

To facilitate molecular studies of Alzheimer's disease (AD), we have developed Alzheimer DataLENS, a web-based portal that enables researchers to share, browse, and visualize comprehensive results from bioinformatics analysis of public omics datasets. Currently, the portal has data from over 50 genetic, proteomics, and transcriptomics studies. Recently, several studies have used single-nucleus transcriptomics to characterize the brains of patients with Alzheimer's disease. We leveraged these datasets to develop innovative methods of visualizing gene expression across various cell types, brain regions, and disease states.

11:25 am

Cellxgene VIP Unleashes Full Power of Interactive Visualization and Integrative Analysis of scRNA-Seq, Spatial Transcriptomics, and Multiome Data

Baohong Zhang, PhD, Head, Research Data Sciences, Biogen

To meet the growing demands from scientists to effectively extract deep insights from single-cell RNA sequencing, spatial transcriptomics, and emerging multiome datasets, we developed cellxgene VIP (Visualization In Plugin), a front-end interactive visualization plugin for the cellxgene framework, which greatly expands the capabilities of the base tool in the following aspects. First, it generates a comprehensive set of eighteen commonly used quality control and analytical plots in high resolution with highly customizable settings in real time. Second, it provides more advanced analytical functions to gain insights on cellular compositions and deep biology, such as marker gene identification, differential gene expression analysis, and gene set enrichment. Third, it empowers advanced users to perform analysis in a Jupyter notebook-like Command Line Interface (CLI) environment by programming in Python and/or R directly, without being confined to the available interactive functional modules. Finally, it pioneers methods to visualize multi-modal data, such as spatial transcriptomics embedding aligned with a histological image on one slice or multiple slices in a grid format, and the latest 10x Genomics Multiome dataset, where both DNA accessibility and gene expression are measured in the same cells, under the same framework in an integrative way to fully leverage the aforementioned functionalities. Taken together, the open-source tool makes large-scale single-cell data visualization and analysis more accessible to biologists in a user-friendly manner and fosters computational reproducibility by simplifying data and code reuse through the CLI. Going forward, it has the potential to become a crowdsourcing ecosystem for the scientific community to contribute even more modules to this Swiss Army knife of single-cell data exploration tools.

Natali Gulbahce, PhD, Associate Director, Bioinformatics, CareDx, Inc.

Abstract to be announced.

12:25 pm

Production Bioinformatics - Emphasis on 'Production'

Chris Dwan, Senior Vice President Production Bioinformatics, Sema4

Production bioinformatics at Sema4 can be thought of as data ops - a peer to the lab ops organization. We operate 24/7 to deliver correct and timely results on NGS and other data for thousands of samples per week. Chris Dwan, Senior Vice President, will introduce the Prod BI organization and systems architecture, with a focus on what it takes to run bioinformatics in production rather than for R&D or pure research.

12:55 pm Session Break and Transition to Luncheon Presentation
Ted Slater, Senior Director Product Management PAAS, Elsevier
Krishna Bulusu, Director in Early Computational Oncology Translational Medicine, AstraZeneca

To understand epigenetic alterations in cancer and investigate links between them and drug targets, AstraZeneca and Elsevier developed a large knowledge graph, built from full-text scientific literature, that is used for semantic search, visualization, and link prediction between epigenetic modifications and drug targets. Join Krishna Bulusu, Director, Early Computational Oncology, AZ, and Ted Slater, Elsevier.

1:50 pm Refreshment Break in the Exhibit Hall with Poster Viewing (Auditorium/Hall C)

LEARNING TO DEVELOP MEANINGFUL INSIGHTS FROM LARGE BIOLOGICAL DATASETS

2:35 pm

Chairperson's Remarks

Sandor Szalma, PhD, Global Head, Computational Biology, Takeda
2:40 pm

Changing Health Care For People Of African-Ancestry Through An InterNational Genomics & Equity Initiative

Lyndon J. Mitnaul, PhD, Senior Director, Research Initiatives, Regeneron Genetics Center
3:10 pm

Advancing Cures for Multiple Myeloma: The MMRF Bioinformatics Journey

Eva Lepisto, Vice President of Informatics, Multiple Myeloma Research Foundation (MMRF)
3:40 pm

Diversifying Genomic Data Access in Research: Advancing Scientific Discoveries, Empowering Drug Discovery & Development, and Overcoming Health Inequities to Benefit All Patients

Dorothee Diogo, PhD, Director, Statistical Genetics, Computational Biology, Takeda Development Center Americas, Inc.
Patti Connolly, BSc, MSc, CLS, CCRA, ACRP-CP, Chief Operating Officer, Verici Dx
Mike Kubal, Sr. Bioinformatics Staff Scientist, Illumina

Worldwide, 10% of kidney transplant patients experience clinical rejection while up to one-third experience a subclinical, or silent, rejection. Prognosis of risk of injury & resulting rejection beginning pre-transplant is crucial to guiding clinical management.

Join Verici Dx & Illumina to learn how large-scale datasets & predictive artificial intelligence on a secure, scalable analysis platform help us understand immune response & other biological pathway signals, ultimately enabling more personalized transplant care.

4:40 pm Best of Show Awards Reception in the Exhibit Hall with Poster Viewing (Auditorium/Hall C)
6:00 pm Close of Day

Thursday, May 5

7:30 am Registration Open and Morning Coffee (Plaza Level Lobby)

PLENARY KEYNOTE ROOM LOCATION: 210

PLENARY KEYNOTE PROGRAM

8:00 am

Welcome by Conference Organizer

Allison Proffitt, Editorial Director, Bio-IT World
Nate Raine, Director Data Custodians, Lifebit
8:15 am

Leveraging Large-Scale Human Data to Advance and Accelerate Drug Discovery

Shankar Subramaniam, PhD, Distinguished Professor of Bioengineering; Professor of Chemistry, Biochemistry and Nanotechnology; Adjunct Professor of Cellular & Molecular Medicine, University of California at San Diego

Advances in genomics technologies have led to the generation of massive amounts of human data. This has catalyzed new insights into cellular processes in the normal and disease state and facilitated the search for safe and effective medicines. The UK Biobank, All of Us, and TOPMed initiatives are exemplars of this approach. We highlight examples from our lab where meaningful insights have been obtained, advancing our understanding of disease biology and its pharmacological application.

9:30 am Coffee Break in the Exhibit Hall with Poster Viewing (Auditorium/Hall C)

ROOM LOCATION: 208

LARGE-SCALE GENOMICS FOR DRUG DISCOVERY – APPLYING RESULTS OF GENOME ANALYSIS

10:15 am Organizer's Remarks
10:20 am

Chairperson's Remarks

Gunaretnam Rajagopal, PhD, Scientific Vice President & Fellow, Computational Sciences, Janssen R&D LLC
10:25 am

Advancing Drug Discovery and Development With Population-Based Genomics and Massive-Scale Analytics

Gunaretnam Rajagopal, PhD, Scientific Vice President & Fellow, Computational Sciences, Janssen R&D LLC
Mary H. Black, PhD, Head, Population Analytics & Computational Sciences, Janssen R&D LLC

Drugs supported by genetic evidence are more likely to be successfully approved. We developed computational workflows leveraging biobank-scale genomic and phenotypic data to predict therapeutic efficacy and adverse events in the clinic. We describe a pipeline for applying our approach to a set of drug targets, including a supervised learning framework for estimating target-specific efficacy and liability indices, and discuss use cases for various therapeutic areas.

10:45 am

Harnessing Large-Scale Data for Neurodegeneration: Faster, Better, Smarter Drug Discovery Opportunities in Alzheimer’s Disease

Simon Lovestone, PhD, Vice President, Disease Area Stronghold Lead Neurodegeneration, Janssen, Inc.
Karen Y. He, PhD, Scientist, Population Analytics, Computational Sciences, Janssen R&D

Alzheimer’s Disease (AD) and related neurodegenerative diseases remain amongst the world’s largest unmet medical needs. The large-scale genomic, proteomic and phenotype data increasingly available in resources such as UK Biobank and other cohort studies afford an opportunity to better understand pathogenesis and identify novel targets for therapeutics and biomarker discovery. We present a study leveraging this vast constellation of data to support drug development in neurodegenerative disorders.

11:05 am

Biobank-Scale Genomics with Experimental Validation Informs Therapeutic Approaches for Immune-Mediated Inflammatory Diseases

Julio Molineros, PhD, Principal Scientist, The Janssen Pharmaceutical Companies of Johnson & Johnson
Julian C. Knight, PhD, Group Head & Principal Investigator, Genetics & Genomics, University of Oxford

Immune-mediated inflammatory diseases (IMID) are collectively common, clinically diverse, and associated with a high burden of morbidity and mortality. Biobank-scale genomic, proteomic, and phenotype data combined with experimental validation provide a powerful resource for novel target identification and prioritization, biomarker assessment, and mechanistic studies. We present an analytical workflow and preliminary findings leveraging these extensive data to support drug discovery and development in IMID.

11:25 am

Scalable Computational Approaches to Gene-Environment Interaction Yield Novel Insights for Drug Development

William James Gauderman, Professor, Population and Public Health Sciences, Biostatistics, University of Southern California
Shuwei Li, PhD, Senior Principal, Population Analytics, Computational Science, Johnson & Johnson

Gene-environment (GxE) interactions underlie much of the etiology of complex disease and can implicate new targets for drug development. The UK Biobank’s rich phenotypic and genomic sequencing data, together with state-of-the-art, scalable GxE approaches, enable comprehensive genome- and exposome-wide GxE analysis. We present a computational workflow for large-scale GxE analyses and discuss findings that can be used to inform genomics-based evaluations of efficacy and target-associated liabilities for drug development.

11:45 am

Leveraging the J&J Innovation Ecosystem for Biomarker and Target Validation

Trevor Howe, PhD, Director & Fellow, Translational Genomics, Janssen UK

While whole genome sequencing in UK Biobank and other cohorts delivers gene-phenotype associations at unprecedented scale, further analysis is required to validate these associations with evidence for biomarker or drug target identification. In this talk, we describe collaborations forged in the J&J innovation ecosystem with partners who have developed experimental and/or informatics approaches capable of elucidating mechanisms and delivering actionable target or biomarker insights and opportunities.

12:05 pm PANEL DISCUSSION:

Moderated Q&A

Panel Moderator:
Gunaretnam Rajagopal, PhD, Scientific Vice President & Fellow, Computational Sciences, Janssen R&D LLC
Panelists:
Mary H. Black, PhD, Head, Population Analytics & Computational Sciences, Janssen R&D LLC
William James Gauderman, Professor, Population and Public Health Sciences, Biostatistics, University of Southern California
Trevor Howe, PhD, Director & Fellow, Translational Genomics, Janssen UK
Karen Y. He, PhD, Scientist, Population Analytics, Computational Sciences, Janssen R&D
Julian C. Knight, PhD, Group Head & Principal Investigator, Genetics & Genomics, University of Oxford
Shuwei Li, PhD, Senior Principal, Population Analytics, Computational Science, Johnson & Johnson
Simon Lovestone, PhD, Vice President, Disease Area Stronghold Lead Neurodegeneration, Janssen, Inc.
Julio Molineros, PhD, Principal Scientist, The Janssen Pharmaceutical Companies of Johnson & Johnson
12:55 pm Session Break and Transition to Luncheon Presentation
1:05 pm Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own
1:50 pm Refreshment Break in the Exhibit Hall with Poster Viewing (Auditorium/Hall C)

LARGE SCALE GENOMICS FOR DRUG DISCOVERY - DEPLOYING HPC/INFORMATICS/GENOME ANALYSIS CAPABILITIES

2:35 pm

Chairperson's Remarks

Gunaretnam Rajagopal, PhD, Scientific Vice President & Fellow, Computational Sciences, Janssen R&D LLC
2:40 pm

Computational Challenges and Opportunities for Translating Massive Biomedical Data

Brice Sarver, PhD, Head of Computational Genomics and Principal Scientist, Population Analytics, Computational Sciences, Janssen R&D LLC
Frank Wurthwein, PhD, Director, San Diego Supercomputer Center

Drug discovery requires pairing a variety of data modalities with the appropriate computational platforms to facilitate and expedite analysis. Here, we introduce the large-scale, multidimensional datasets provided by the UK Biobank, describe computational challenges and considerations when working with these data, and provide real-world examples of fit-for-purpose analytical workflows designed to take advantage of the resources provided by a major national supercomputer center.

3:00 pm

The UK Biobank – A Global Resource for Advancing Biomedical Research and Drug Discovery

Mark Effingham, PhD, Deputy CEO, UK Biobank

Translational research requires pairing a variety of data modalities with the appropriate cyber-infrastructure and computational platforms to facilitate and expedite analysis. We introduce the UK Biobank - a large-scale biomedical database containing in-depth genetic and health information from half a million UK participants. The database, which is regularly augmented, is globally accessible to approved researchers undertaking research into life-threatening diseases. We outline how the data can be accessed and analyzed via a fit-for-purpose cloud-based analysis platform and highlight impactful use cases.

3:20 pm

Structural Variation Data on 500K WGS from the UK Biobank

Sebastian Wasilewski, PhD, Senior Director, Genome Informatics, AstraZeneca

Accurately determining structural variants from WGS data is a non-trivial problem, especially for large population-based cohorts such as the one generated from 500K individuals within the UK Biobank via a public-industry consortium. We highlight an example of combining a state-of-the-art algorithm for structural variation calling developed by Illumina with massive, geographically distributed cloud-based computational resources provided by AWS. We anticipate that the results of this analysis will be publicly available to the global scientific community in the near future.

3:40 pm PANEL DISCUSSION:

Moderated Q&A

Panel Moderator:
Gunaretnam Rajagopal, PhD, Scientific Vice President & Fellow, Computational Sciences, Janssen R&D LLC
Panelists:
Mark Effingham, PhD, Deputy CEO, UK Biobank
Slavé Petrovski, PhD, Vice President & Head, Genome Analytics & Informatics, AstraZeneca
Brice Sarver, PhD, Head of Computational Genomics and Principal Scientist, Population Analytics, Computational Sciences, Janssen R&D LLC
Shankar Subramaniam, PhD, Distinguished Professor of Bioengineering; Professor of Chemistry, Biochemistry and Nanotechnology; Adjunct Professor of Cellular & Molecular Medicine, University of California at San Diego
Sebastian Wasilewski, PhD, Senior Director, Genome Informatics, AstraZeneca
Frank Wurthwein, PhD, Director, San Diego Supercomputer Center
4:10 pm Close of Conference




