Visualizing Genomic Data: Spotlight on IOBIO

Kaitlin Searfoss:
Hi everyone. Welcome to this podcast from Cambridge Healthtech Institute for the Data Visualization and Exploration Tools conference, which runs April 5-7, 2016 in Boston as part of the Bio-IT World Conference and Expo. I'm Kaitlin Searfoss, conference producer. We have with us today one of our speakers. This is Dr. Chase Miller, the director of research and science at the Eccles Department of Genetics in the USTAR Center for Genetic Discovery at the University of Utah School of Medicine. He's also one of the developers for the IOBIO system. Dr. Miller, thank you for joining us.

Chase Miller:
No problem.

Kaitlin Searfoss:
What are the biggest challenges scientists face when visualizing and analyzing genomic data, and how does the IOBIO system help address some of that?

Chase Miller:
There are a lot of really good challenges in genomics right now, and that's part of what makes it such an interesting field. One of the ones I've been focusing on lately is extracting more information and more value out of the huge amounts of data that we're generating. We're generating every day, it seems like, petabytes and petabytes of data, and we do a really good job of the first-pass analysis, where we align the reads, call variants and annotate, and sort of answer that first question that the data was generated for. After that it gets a little harder. How do the secondary scientists who aren't associated with the project ask their own questions? Here's where I think visualization can help in two big ways. First, visualization can help us figure out new questions to ask. One of the hardest things is not answering a specific question, but figuring out what new question or what right question to ask. Visualization is great at this because humans are very good at looking at pictures, but we're not so good at looking at raw data and then pulling out, say, patterns or anomalies from that raw data. This is, I think, one way where visualization can be used to provoke new insight and new questions, and lead us down paths that we may not have originally generated the data for.

A second way is, once we've generated this data, how do we get it into the right hands, and get it into those hands in the right way? There are many people doing research on particular systems, or model organisms, or diseases, who would be able to make great use of this data. It would help them answer the questions that they're asking, but at this point they may not have the expertise, and this sort of data analysis may be out of reach for many of these scientists and researchers. Here I think visualization can help on the UI side, in two ways: one, building an application or user interface that's very easy for scientists to use, and two, visualization that makes the result easy for them to understand.

Where IOBIO excels at this is that we've taken the approach of building smaller, more focused applications. Instead of a do-it-all, one-size-fits-all application, we are trying to build a focused application to answer a specific question. When we do this, it gives us the ability to cut back on the number of buttons and options, and clutter on a page, but it also lets us design custom visualizations that can really home in on how this data should be viewed and how this data is better understood. I think in this way, we can empower more researchers, wet lab researchers, that really have been excluded from this big data generation and analysis problem.

Kaitlin Searfoss:
You mentioned those smaller focused applications. Can you tell me what your favorite IOBIO app is and why?

Chase Miller:
It's hard to pick a favorite, but one that I'm really excited about right now is gene.iobio. We just released version two of it, I think two days ago, with a bunch of new features. Gene.iobio is a disease variant investigation web app, and it's kind of slightly different from most of the other stuff out there. The way it works is that you give it variant data and, if you have them, aligned reads and a list of potential disease variants, although you don't need those. Then gene.iobio will visualize all this data and also annotate your variants by running them through various annotation software, and integrate that into this very nice visualization and analysis platform. It's an interesting tool because it kind of focuses on this current gap in the tool chain, when you go from a list of many potential disease-causing variants, with none of them being a particular smoking gun, to homing in on one suspected problem variant. This is something that is causing a lot of problems and a lot of grief for research groups, hospitals, and diagnostic companies, because it's very labor intensive and requires very skilled people to look at this. Each case may take up to half a day. This is a very hard problem, and gene kind of streamlines this and pulls in all the information to make it easy to figure out what the most likely disease-causing variant is.

That's something that we're really excited about, and it's also just a really fun app to play around with.

Kaitlin Searfoss:
What are you most excited to learn about at the data visualization and exploration tools conference?

Chase Miller:
I'm really excited to see what everyone else is up to, especially in visualization, you know? Visualization is still very young in biology, and the truth is geeks really don't get too many chances to meet each other and see the amazing work that everyone's doing. Specifically, I'd like to see if anyone has come up with any novel genomic visualizations, new ways to kind of visualize this data. There's a ton of genomic data, it's complex and multi-faceted, and I think I can count the number of unique genomic visualizations on one hand, so I think we're due for a few more out there.

Kaitlin Searfoss:
Thanks so much for your time today, Dr. Miller.

Chase Miller:
Thank you.

Kaitlin Searfoss:
That was Chase Miller, director of research and science at the Eccles Department of Genetics in the USTAR Center for Genetic Discovery at the University of Utah School of Medicine. He'll be speaking at the Data Visualization and Exploration Tools conference, which runs April 5-7, 2016 in Boston, as part of the Bio-IT World Conference and Expo. If you'd like to hear him in person, go to www.bio-itworldexpo.com for registration information, and enter the key code podcast. I'm Kaitlin Searfoss, thank you for listening.
