Track 4 - April 5 – 7, 2016
Cloud Computing
Applying Cloud for Next-Generation Computing
Life scientists agree that the shift to cloud is both necessary and practical. Adoption has been greater than anyone expected and users continue to expand applications. Track 4 explores the rapid growth and progressive maturation of cloud as well as evolving provider and user experiences.
Tuesday, April 5
7:00 am Workshop Registration and Morning Coffee
8:00 – 11:30 Recommended Morning Pre-Conference Workshops*
Security Considerations for Virtual Research
12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*
Growth Strategy: Leveraging Cloud Scalability to Enable Rapid Growth and Change
Foundational Services for Bioinformatics in the AWS Cloud with Intel
* Separate registration required
2:00 – 6:00 Main Conference Registration
4:00 PLENARY KEYNOTE SESSION
5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing
Wednesday, April 6
7:00 am Registration Open and Morning Coffee
8:00 PLENARY KEYNOTE SESSION
9:00 Benjamin Franklin Awards and Laureate Presentation
9:30 Best Practices Awards Program
9:45 Coffee Break in the Exhibit Hall with Poster Viewing
10:50 Chairperson’s Opening Remarks
David LaBrosse, Strategic Partner Manager, NetApp Healthcare
11:00 Building the Bionic Cloud
Michaela Iorga, Ph.D., Senior Security Technical Lead for Cloud Computing; Co-Chair, NIST Cloud Security Working Group; Co-Chair, NIST Cloud Forensic Science Working Group, National Institute of Standards and Technology
When coupling the continuously growing and changing landscape of advanced persistent threats with the explosion of “pervasive computing” or “ambient intelligence” founded on the hyperconnectivity of “everyware”, we reach a technical inflection point that calls for innovative solutions to support the further development of a strong, secure backbone of the Internet of Things (IoT) – a bionic cloud.
11:30 Adopting Public Cloud at Enterprise Scale: Public IaaS at AstraZeneca
Don Barber, Infrastructure Architect, Enterprise Computing, IT Infrastructure & Operations, AstraZeneca
Public cloud adoption at scale requires significant rethinking of enterprise systems, processes and culture. This talk outlines how AstraZeneca IT has made this journey to offer public cloud as an enterprise service by tackling challenges with provisioning, management, security and qualification. Subject matter will range broadly from technical issues to policy challenges and workforce education needs, concluding with a few predictions about the future.
12:00 pm Sponsored Presentation (Opportunity Available)
12:15 Bringing Data and Computing Together to Enable Research Innovation
Joe Corkery, M.D., Senior Product Manager, Google Cloud Platform
We will discuss how the vast data storage, sharing, and computing capabilities of Google Cloud Platform have enabled numerous researchers to bring together previously unwieldy data sets to gain novel insights. We will also explore how access to Google's cloud resources enables researchers to revisit traditional approaches to data analysis and pursue new methodologies that would otherwise be out of reach in a traditional environment.
12:30 Session Break
12:40 Luncheon Presentation I: Making Cloud R&D Electronic Laboratory Environments a Reality
John Conway, Global Director, Strategy and Technology, Research & Development, LabAnswer
The drivers behind moving to Cloud-based, enterprise-class scientific software applications are substantial, and the trend is rapidly gaining momentum. LabAnswer will showcase examples and discuss the practical considerations of deploying electronic laboratory environments (ELE) via the Cloud, including Electronic Laboratory Notebooks, LIMS, etc. Capabilities and functionality topics to be addressed include Data Governance, Entity Registration, Request/Sample/Inventory Management, and Data Aggregation & Analytics.
1:10 Luncheon Presentation II: Large-Scale Cancer Genomics in the AWS Cloud
Brian O'Connor, Ph.D., Managing Director, Cloud Technologies, Informatics and Bio-Computing, The Ontario Institute for Cancer Research
Angel Pizarro, MSE, Technical Business Development Manager, Scientific Computing, Amazon Web Services
ICGC recently made available the genomes of approximately 1,300 cancer donors as part of the AWS Public Data Sets program. In this presentation, learn how the Ontario Institute for Cancer Research leverages the cloud to process large-scale data sets like ICGC and uniformly analyze 900 TB of data in less than four months.
1:40 Session Break
1:50 Chairperson’s Remarks
Tom Johnson, Senior Director & Product Manager, Pharma and Life Sciences Services, Exostar
1:55 Cloud Computing in a GxP Environment
Krista Woodley, Director, Digital Quality and Risk Management, Biogen
We discuss the regulatory expectations and associated challenges of moving to cloud-based solutions (SaaS, IaaS, PaaS). Discussion points include requirements for vendor oversight, validation and maintenance of cloud-based solutions.
2:25 Case Study: How Merck Is Leveraging Information Security to Enable & Accelerate Clinical Trials
Andrew K. Porter, Director, Enterprise Architecture, IT Planning & Innovation, Merck & Co.
Bringing data, applications and people together for clinical trials takes too long, costs too much and leaves security gaps that threaten intellectual property and regulatory compliance. Merck explains its cloud-based solution to close these gaps and mitigate risks by leveraging entitlements management and fine-grained provisioning to automate partner onboarding, connect required applications and data, assign permissions and control access by authenticating identities.
2:55 Why Would You NOT Use Public Clouds for Your Big Compute Workloads?
Jason Stowe, CEO, Cycle Computing
Up to now there’s been resistance to leveraging the cloud for the compute- and data-intensive workloads that have historically run on in-house HPC environments. But genomics, computational chemistry, and other data collection and analytics have outpaced internal capacity. The lure of zero queue times, unlimited amounts of processing, and the ability to directly fit jobs to budget/value instead of to available capacity is proving impossible to resist. This talk will highlight the risks and rewards of doing science in the cloud.
3:25 Refreshment Break in the Exhibit Hall with Poster Viewing
4:00 Building Cloud-Enabled Cancer Genomics Workflows with Luigi and Docker
Jacob Feala, Ph.D., Principal Scientist, Bioinformatics, Caperna, an affiliate of Moderna Therapeutics
As bioinformatics scientists, we tend to write custom tools for managing our workflows, even when viable, open-source alternatives are available from the tech community. Our field has, however, begun to adopt Docker containers to stabilize compute environments. I introduce Luigi, a workflow system built by engineers at Spotify to manage long-running big data processing jobs with complex dependencies. Focusing on a case study of next-generation sequencing analysis in cancer genomics research, I show how Luigi can connect simple, containerized applications into complex bioinformatics pipelines that can be easily integrated with compute, storage, and data warehousing on the cloud.
4:30 The ISB Cancer Genomics Cloud
Sheila Reynolds, Ph.D., Senior Research Scientist, Ilya Shmulevich Laboratory, Institute for Systems Biology
The ISB-CGC is a cloud-based platform that will serve as a large-scale data repository for TCGA data, while also providing the computational infrastructure and interactive exploratory tools necessary to carry out cancer genomics research at unprecedented scales. The ISB-CGC will provide both interactive and programmatic access to the TCGA data, leveraging many aspects of Google Cloud Platform including BigQuery and Compute Engine.
5:00 GATK4 - The Next Generation of Broad Institute's Genomics Tools, on the Cloud
Adam Kiezun, Ph.D., Senior Group Leader, Computational Methods Development, Broad Institute of MIT and Harvard
The breathtaking pace of genomics growth requires tools and pipelines that can support cutting-edge analyses, at petabyte scales, with optimized speed and cost. With this in mind, we have launched GATK4, a complete reimagining of Broad’s Genome Analysis Toolkit. GATK4 now supports both germline and somatic mutation analysis, CNV and SV detection, tumor heterogeneity analysis, and more. Designed with cloud infrastructure in mind, GATK4 is implemented with support for Apache Spark and is hundreds of times faster than previous generations of GATK.
5:30 – 6:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing
Thursday, April 7
7:00 am Registration and Morning Coffee
8:00 PLENARY KEYNOTE SESSION PANEL
10:00 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced
10:30 Chairperson’s Opening Remarks
Ben Cotton, Director, Customer Satisfaction, Cycle Computing
10:40 Splicing-Centric Analysis of RNA-Seq Data Using the SpliceCore Platform
Martin Akerman, Ph.D., CTO & Co-Founder, Envisagenics, Inc.; Cold Spring Harbor Laboratory
Over 15% of disease mutations can cause structural errors in mRNA through Alternative Splicing (AS). SpliceCore is a cloud-based platform for AS discovery, analysis and interpretation using RNA-seq data. SpliceCore helps meet the increasing demand for data analysis and innovation in RNA therapeutics. I demonstrate SpliceCore in a breast cancer case study, including implementation and experimental validation in collaboration with Cold Spring Harbor Laboratory.
11:10 A Cloud Computing Infrastructure for BLAST
Thomas L. Madden, Ph.D., Staff Scientist, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health
The BLAST Cloud platform allows a user to perform a large number of sequence similarity searches. The platform supports queries through a REST-style API, a web page and the command line. There is minimal setup, as a FUSE client downloads the latest database. We discuss the project design and performance as well as the use of this project in an existing workflow.
11:40 Turning Biology into an Information Technology: Moving Experimental Biology into The Cloud
Max Hodak, Founder & CEO, Transcriptic, Inc.
Despite rapid advances in the science, the practice of molecular and cell biology looks much the same today as it did 20 years ago. Transcriptic allows scientists to use a remote, software-defined infrastructure for complex workflows without upfront capital costs. This advance makes the lab itself another part of your data workflow, and shifts the focus from the bench to the analytics and science.
12:10 pm Session Break
12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own
1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing
1:55 Chairperson’s Remarks
John M. Conley, J.D., Ph.D., William Rand Kenan, Jr. Professor of Law, University of North Carolina, Chapel Hill; Counsel, Robinson Bradshaw & Hinson
2:00 FEATURED PRESENTATION: precisionFDA
Taha A. Kass-Hout, M.D., MS, Chief Health Informatics Officer & Director, Office of Health Informatics, FDA
precisionFDA is a cloud-based informatics platform for ensuring the accuracy of Next-Generation Sequencing (NGS) tests by crowdsourcing reference material and data. A key part of President Obama’s Precision Medicine Initiative, it serves as a collaborative research effort that will inform later regulatory pathways and decision making. During this talk, Dr. Taha Kass-Hout, FDA’s Chief Health Informatics Officer, describes the platform and its successes since the December 2015 beta release.
2:30 FEATURED PRESENTATION: How the Plecosystem, Blockchain, and Federated Data Enclaves will Shape Genomics Innovation and Application: Emerging Initiatives from the Global Alliance for Genomics and Health
John E. Mattison, M.D., Chief Medical Information Officer, Assistant Medical Director, Southern California Medical Group, Kaiser Permanente; Co-Chair, eHealth Workgroup, Global Alliance for Genomics and Health (GA4GH)
How can we maximize genomic research for the good of all citizens without violating their privacy? We need powerful new approaches to ensure ethical research without unwarranted risk to citizens who consent to use of personal data. The Global Alliance for Genomics and Health includes worldwide institutions seeking consensus on policy frameworks supported by creative technical solutions to achieve these paired goals of higher value and lower risk. I discuss progress to date.
3:00 FEATURED PRESENTATION: Large-Scale Data Commons for Genomic and Clinical Data and the Changing Landscape for Sharing Research Data
Robert Grossman, Ph.D., Director, Center for Data Intensive Science (CDIS); Core Faculty, Institute for Genomics & Systems Biology and Computation Institute, Professor of Medicine, Section of Genetic Medicine, University of Chicago
Open commons containing large amounts of public biomedical data from the research community have the potential to dramatically speed up medical research. We describe our experiences developing large-scale open-source data commons for genomic and associated clinical data. We also discuss options for integrating and interoperating in-house genomic and clinical data with public data commons and private data partnerships.
3:30 PANEL DISCUSSION: How Will Data Sharing Innovations Fare in the Regulatory Environment?
Moderator: John M. Conley, J.D., Ph.D., William Rand Kenan, Jr. Professor of Law, University of North Carolina, Chapel Hill; Counsel, Robinson Bradshaw & Hinson
Panelists:
Robert Grossman, Ph.D., University of Chicago
Taha A. Kass-Hout, M.D., MS, FDA
John E. Mattison, M.D., Kaiser Permanente
Andrew K. Porter, Merck & Co.
Mollie Shields-Uehling, SAFE-BioPharma Association
The growth in patient healthcare and life sciences innovations can be attributed to technology enhancements like cloud computing, big data analytics and mobile applications, but these technologies may conflict with increasing regulatory compliance demands to protect life and healthcare quality as well as patient data privacy and security. This panel of esteemed technology solution providers and regulators debates real-world challenges and how regulation must also innovate at technology’s pace.
4:00 Conference Adjourns