Track 6 - April 5 – 7, 2016
Next-Gen Sequencing Informatics
Advances in Large-Scale Computing
Tremendous advancements have been made to broaden NGS applications from research to the clinic, especially as genomics becomes more integrated with precision medicine initiatives. In spite of this, enormous challenges for NGS still exist, including real-time sequencing, data storage, processing, scaling, quality control management, security and compliance in the cloud, and interpretation. Track 6 presents case studies on these challenges.
Tuesday, April 5
7:00 am Workshop Registration and Morning Coffee
8:00 – 11:30 Recommended Morning Pre-Conference Workshops*
Intelligent Methods Optimization of Algorithms of NGS
12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*
Determining Genome Variation and Clinical Utility
* Separate registration required
2:00 – 6:00 Main Conference Registration
4:00 PLENARY KEYNOTE SESSION
5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing
Wednesday, April 6
7:00 am Registration Open and Morning Coffee
8:00 PLENARY KEYNOTE SESSION
9:00 Benjamin Franklin Awards and Laureate Presentation
9:30 Best Practices Awards Program
9:45 Coffee Break in the Exhibit Hall with Poster Viewing
10:50 Chairperson’s Opening Remarks
Hans Cobben, CEO, Bluebee
11:00 Time to Build Personal Genome
Wenming Xiao, Ph.D., Staff Fellow, Division of Bioinformatics
and Biostatistics, National Center for Toxicological Research, FDA
Precision medicine is based on the interrogation of genetic alterations in one individual, which requires precise and complete characterization of the personal genome. Whole genome sequencing has become cheaper and more affordable, and the challenge of routinely applying it in the precision medicine era largely rests on bioinformatics solutions, particularly for personal genome assembly. This study aims to establish best practices and quality metrics for personal genome assembly, and to provide guidance for the use of personal genomes in clinical applications, by investigating the impact of various next-generation sequencing (NGS) parameters, such as coverage, read length, and methods, on assembly quality.
11:30 An Innovative and Globally Distributed Genome Management System
Thomas Thies, Senior Scientist, Data/Information Architecture
and Terminology, pREDi, Roche
The huge amount of genomic data that needs to be analyzed in a timely manner
by a globally distributed scientific workforce cannot move around the
globe. Instead, the analysis pipes are brought to the data. This talk will
introduce you to a solution that follows this new paradigm. In addition, it will
explain how we are leveraging existing HPC environments, including governance
models, which fuel the innovative capacity of our computational scientists.
12:00 pm An Integrated High Performance Analytics Solution for Genomics and Translational Research
Kathy Tzeng, WW Technical Lead, Healthcare and Life Science
Solutions, IBM Systems, IBM
Janis Landry-Lane, WW Program Director, Healthcare and Life
Science Solutions, IBM Systems, IBM
The rapid advances in sequencing technology are driving the
use of genomics information in various domains. Processing raw data from a
sequencer and translating it into insights in a timely fashion requires a high
performance, scalable analytics solution to integrate genomics information with
other data sources. IBM’s approach of building integrated solutions with our
customers and partners will be highlighted.
12:30 Session Break
12:40 Luncheon Presentation I: Not Just Noise:
Transforming Big Data into Smart Data
Brady Davis, Senior Director, Informatics, Illumina, Inc.
When it comes down to it, big data is only a big deal when
you can attach context and meaning to it. Smart data -- that is the right data
at the right time to the right person -- can help professionals enhance and
inform care decisions. That’s the prize; and while everyone’s got their eyes on
it, not everyone knows how to get their hands on it. This session will focus on
how Illumina is working to provide solutions that look at data at every stage,
from collection and protection to collaboration, storage and analysis.
1:10 Luncheon Presentation II: The Edge of Analytics Insight
Ted Slater, M.A., M.S., Global Head of Healthcare & Life Sciences, Cray Inc.
Matt Gianni, Functional Solution Architect, Cray Inc.
Learn how to power your life science pipelines — from deep learning to clinical genomics — using the latest advances in analytics. Step up performance, enable the rigor your workloads require, and flex with the evolving needs of your business. Learn what software strategies scale best with Cray’s novel, advanced system — including interconnects, advanced memory stacks, graph engines, storage and cluster management.
1:40 Session Break
1:50 Chairperson’s Remarks
Shanrong Zhao, Ph.D., Director, Pfizer Worldwide Research & Development
1:55 QuickRNASeq Lifts Large-scale RNA-seq Data Analyses to the Next Level of Automation and Interactive Visualization
Shanrong Zhao, Ph.D., Director, Pfizer Worldwide Research & Development
RNA sequencing is being increasingly used, in part driven by the decreasing cost of sequencing. Nevertheless, the analysis of the massive amounts of data generated by large-scale RNA-seq remains a challenge. By combining the best open source tools developed for RNA-seq data analyses and the most advanced web 2.0 technologies, we have implemented QuickRNASeq (http://quickrnaseq.sourceforge.net), a pipeline for large-scale RNA-seq data analyses and visualization. The high degree of automation and interactivity in QuickRNASeq leads to a substantial reduction in time and effort. QuickRNASeq advances primary RNA-seq data analyses to the next level of automation and is mature for public release and adoption.
2:25 High-Throughput NGS Sequencing Using Ion Proton in a
Clinical Genetic Testing Lab
Yirong Wang, Associate Director, Production Informatics,
Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount
Sinai
Clustered Ion Protons provide a highly scalable framework for
high-throughput sequencing in any genetic testing lab or core sequencing
facility while keeping the cost manageable. A highly customized LIMS and
an efficient data analysis pipeline also play critical roles in quality control
and report generation and delivery. In an initial pilot study, we were able to
sequence and process 6000 samples for a large panel (500+ genes) screening
in under 8 weeks.
2:55 Shifting a Pure Academic HPC Environment to a Mixed Protected and Free Environment – on the Same Platform
Vanessa Borcherding, Director, Scientific Computing Unit, Weill Cornell Medicine, Department of Physiology and Biophysics, Weill Cornell Medical College
Completely open research computing platforms make sense. They’re less expensive to design, build, and maintain, and giving unfettered access to data helps the collaborative process. However, increased collaborations with commercial entities and increased use of “deidentified” patient data are putting pressure on HPC operations to make security options seamless without sacrificing the performance and price points to which users are accustomed.
3:10 Genomic Analysis on a Loosely Coupled AWS Platform with Highly Distributed NGS Data Analytics at a Massive Scale
Tristan Lubinski, Associate Scientist, NGS Informatics, AstraZeneca
The global NGS team at AstraZeneca implements a robust, flexible and consumable platform to perform genomic analysis at scale. The Bina solution was tested by processing tens of thousands of TCGA exomes with modern algorithms against the latest reference genome (hg38), in turn demonstrating that the driver mutational landscape of the TCGA can be redefined when comparing against public domain data.
3:25 Refreshment Break in the Exhibit Hall with Poster Viewing
4:00 Lessons Learned Analyzing Thousands of Samples for
Clinical Use Cases Using Amazon Web Services
Ravi Madduri, Fellow, Computation Institute, University of
Chicago; Project Manager, Math and Computer Science Division, Argonne National
Lab
Globus Genomics is a cloud-based, large-scale genomics
analysis service that is used by research consortia and healthcare providers to
analyze thousands of raw genomics datasets. In order to deliver the results of the
analyses on tight deadlines, we created cost-aware resource scheduling on
AWS resources and reusable recipes for setting up the appropriate security controls
required for compliance. In this talk, we will present some of the use cases
and success stories from our work.
4:30 Federated EHR Network for Patient Cohort Discovery
Bhanu Bahl, Director of Informatics, Harvard Catalyst
Patient cohort discovery across multiple healthcare institutions is a challenge. Accrual of sufficient numbers of patients for orphan disease clinical trials further compounds the challenge. The Shared Health Research Information Network (‘SHRINE’), Harvard Catalyst’s open source, web-based query tool, helps overcome the barriers arising from variability in the source electronic health record (EHR) systems and returns aggregate numbers of patients across all sites with user-defined characteristics, currently demographics, diagnoses, medications, and selected lab values. By allowing semantic interoperability and consistency of data elements, SHRINE leverages the Informatics for Integrating Biology and the Bedside (‘i2b2’) Hive software, an open source, scalable informatics framework. Using a federated search architecture, real-time queries can be performed across collaborating institutions, each with its own locally managed patient datasets.
5:00 Selected Poster Presentation: Integrating Data, Tools, and Infrastructure for Efficient Collaboration and Management in Large-scale Biomedicine
Sven Nahnsen, Ph.D., Head, Quantitative Biology Center (QBiC), University of Tuebingen
High-throughput biology in the medical context aims at developing predictive models for disease development and therapy outcome. OMICS technologies, and especially next-generation sequencing, are becoming increasingly popular for the acquisition of adequate system-wide data. Such experiments need to involve stringent modelling of experiments and bioinformatics workflows to reach comprehensive metadata annotation and to enable automated processing and analyses. We present the latest developments towards the integrative analysis of large and complex high-throughput data; these include the integration of data and project management with state-of-the-art bioinformatics pipelines, as well as a production-scale hardware and software stack. Our integrated technology builds on Liferay as a portlet container, on workflow engines and finally on openBIS as the data management application. The infrastructure is embedded in a multi-center environment and allows for distributed data acquisition and management. The modular nature of our software architecture allows for rapid extension of the functionality, such as novel pipelines or visualisation tools for NGS data.
5:30 – 6:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing
Thursday, April 7
7:00 am Registration and Morning Coffee
8:00 PLENARY KEYNOTE SESSION PANEL
10:00 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced
10:30 Chairperson’s Opening Remarks
10:40 Application of Targeted NGS Sequencing in Personalized
Clinical Cancer Therapies
Qichao Zhu, Ph.D., Associate Professor, Genetics &
Genomics Sciences, Icahn School of Medicine at Mount Sinai
Our current clinical cancer genome research project is focused on three key components: sequence analysis for patient genetic profiling, biomarker (genetic variation) collection for cancer precision medicine, and a data processing and integration platform for clinical reporting. The goal of the project is to develop a comprehensive platform that can fully support the precision medicine approach in cancer treatment. The approach is based on the established concepts that tumor biomarkers are associated with patient prognosis and tumor response to therapy, and that a patient's genetic profile can be associated with drug metabolism, drug response and toxicity. Personalized tumor genetic profiles, combined with tumor site and other relevant information, are then used for determining optimum individualized therapy options. This presentation concentrates on the following major components of our project: 1) Accurately detecting tumor genetic and molecular variants, in terms of both coverage and precision, by developing new algorithms to improve our variant calling; 2) Matching patients with treatments that are more likely to be effective and cause fewer side effects by collecting, curating and associating biomarkers (genetic and molecular variations) with diseases, drugs and treatment plans; and 3) Handling cases in a high-throughput manner by developing a web-based pipeline platform for cancer data processing, sequence analysis, data integration and report generation.
11:10 Integration of Whole Genome and RNA Sequencing to Inform Clinical Treatment of Cancer
Michael Zody, Ph.D., Research Director, Computational Biology, New York Genome Center
11:40 Building National-Scale Genomics Projects with Collaborative, Portable, Reproducible Analysis
Deniz Kural, CEO, Seven Bridges
The number of large genomics projects worldwide is rapidly growing. Such projects involve analysis of hundreds of thousands of whole genomes to accelerate discovery in basic and clinical research. National-scale genomics projects make intensive demands on computation and storage, and test the limits of existing infrastructure. They present severe challenges that require novel approaches to overcome.
12:10 pm Session Break
12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own
1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing
1:55 Chairperson’s Remarks
Yuval Itan, Ph.D., MRes, Research Associate, Human Genetics of Infectious Diseases, The Rockefeller University
2:00 Talk Title to be Announced
Gunaretnam (Guna) Rajagopal, Ph.D., Vice President &
Global Head, Computational Sciences, Discovery Sciences, Janssen Research &
Development, A Johnson & Johnson Company
2:30 A Clinical Genetics Diagnostic System Incorporating
Next-Gen Sequencing and Informatics to Advance Pediatric Precision Care
Marcia Nizzari, MS, CIO, Claritas Genomics
Claritas Genomics serves children affected with complex
genetic disorders by providing timely and accurate results, resolving families’
long search for answers. We developed a unique “orthogonal sequencing” approach
that simultaneously sequences exomes on both the Illumina NextSeq and the Life
Technologies Ion Proton instruments. This talk will cover both the lab approach
and the bioinformatics analysis pipelines, key components of Claritas’
enterprise architecture for pediatric precision care.
3:00 Software for Interpretation of Next-Gen Sequencing Data
in a Clinical Setting
Neil Miller, Director, Informatics, Center for Pediatric
Genomic Medicine, Children’s Mercy, Kansas City
The scale and complexity of NextGen Sequencing data present unique informatics challenges, particularly with the issues of variant characterization and clinical interpretation. The Center for Pediatric Genomic Medicine at Children's Mercy, Kansas City has developed novel software applications that are specifically designed to enable non-expert clinicians and researchers to make use of targeted NGS in the diagnosis and management of rare disease. The software programs described are the analytical backbone of the clinical and research applications at CMH, including STAT-seq, a program for the ultra-rapid whole genome sequencing of critically ill patients in the neonatal intensive care unit (NICU). Children's Mercy, Kansas City is a leader in the field of applying genomics to clinical care; STAT-seq was named one of Time Magazine's top 10 medical breakthroughs of 2012. The software developed at CMH has been referenced in multiple publications and will soon become available at no cost for research use. Attendees will get an overview of an end-to-end solution for the interpretation of NextGen Sequencing data that is used extensively in a children's hospital, and an introduction to software that will shortly become publicly available.
3:30 Finding a Needle in a Haystack: New Approaches to Identify Disease-Causing Mutations in Patients’ Next Generation Sequencing Data
Yuval Itan, Ph.D., MRes, Research Associate, Human Genetics of Infectious Diseases, The Rockefeller University
4:00 Conference Adjourns