PhD Dissertation

Data Visualization Tools for Large Biological Data Sets

ABSTRACT

Researchers have access to an ever-growing volume of data available at multiple levels of biological analysis. Many visual analytic tools have been developed to display a variety of biological data types but many of these tools are challenging to use and only examine one biological level of analysis at a time. The development and testing of hypotheses is difficult when the information is hard to integrate and laborious to interpret. The application of data visualization principles and user experience design best practices could improve systems biology research workflows by providing visual analytic tools with what is known in the information visualization community as a “transparent” user interface.

This thesis consists of four chapters that explore two central questions: 1) What is the best way to represent biological information at different levels of analysis? and 2) How do we enable researchers to explore and interact with their data as naturally and intuitively as possible? The first chapter describes ePlant, a tool for visualizing multiple levels of data that was developed using an agile process that included several rounds of user testing. The second chapter presents Gene Slider, a tool for visualizing the conservation and entropy of orthologous DNA and protein sequences using a data visualization paradigm that takes better advantage of preattentive visual processing than current methods. The third chapter describes Topo-phylogeny, a tool for visualizing phylogenetic relationships using a topographic map visualization paradigm that requires less cognitive processing to interpret than traditional tree diagrams. The final chapter demonstrates the importance of user testing when developing a “rapid serial visual presentation” interface for identifying genes of interest using electronic fluorescent pictographs.

Together these chapters illustrate the complexities and benefits of applying data visualization principles and user experience design best practices to building data visualization tools for the analysis of large biological data sets. Given that hypothesis generation is fundamentally a creative process, any tools or techniques that can help researchers consider their data at a deeper level should be valuable to the scientific community.

Peer reviewed publications

ePlant

Jamie Waese, Jim Fan, Asher Pasha, Hans Yu, Geoffrey Fucile, Ruian Shi, Matthew Cumming, Lawrence A. Kelley, Michael J. Sternberg, Vivek Krishnakumar, Erik Ferlanti, Jason Miller, Chris Town, Wolfgang Stuerzlinger, Nicholas J. Provart. ePlant: Visualizing and Exploring Multiple Levels of Data for Hypothesis Generation in Plant Biology. Plant Cell. 2017 Aug; 29(8): 1806–1821.
https://doi.org/10.1105/tpc.17.00073

20 years of Arabidopsis genomics

Nicholas J Provart, Siobhan M Brady, Geraint Parry, Robert J Schmitz, Christine Queitsch, Dario Bonetta, Jamie Waese, Korbinian Schneeberger, Ann E Loraine. Anno genominis XX: 20 years of Arabidopsis genomics. The Plant Cell, December 29, 2020
https://doi.org/10.1093/plcell/koaa038

eFP-Seq Browser

Alexander Sullivan, Priyank Purohit, Nowlan Freese, Asher Pasha, Eddi Esteban, Jamie Waese, Alison Wu, Michelle Chen, Chih Ying Chin, Richard Song, Sneha Ramesh Watharkar, Agnes Chan, Vivek Krishnakumar, Chris Town, Ann E. Loraine and Nicholas J. Provart. An “eFP-Seq Browser” for Visualizing and Exploring RNA-Seq Data. The Plant Journal. October 31, 2019.
https://doi.org/10.1111/tpj.14468

Topo-phylogeny

Jamie Waese, Nicholas J. Provart, David S. Guttman. Topo-phylogeny: Visualizing evolutionary relationships on a topographic landscape. PLOS ONE. Published: May 1, 2017
https://doi.org/10.1371/journal.pone.0175895

RSVP Visual Search Study

Jamie Waese, Wolfgang Stuerzlinger, Nicholas J. Provart (2016) An Evaluation of Interaction Methods for Controlling RSVP Displays in Visual Search Tasks. IEEE Proceedings of the International Symposium on Big Data Visual Analytics, November 22, 2016, Sydney, Australia.
http://ieeexplore.ieee.org/document/7787041/

Gene Slider

Jamie Waese, Asher Pasha, Tingting Wang, Anna van Weringh, David Guttman, Nicholas J. Provart. Gene Slider: sequence logo interactive data-visualization for education and research. Bioinformatics, Volume 32, Issue 23, 1 December 2016, Pages 3670–3672.
https://doi.org/10.1093/bioinformatics/btw525

Expression Angler

Ryan S. Austin, Shu Hiu, Jamie Waese, Asher Pasha, Nina Wang, Jim Fan, Curtis Foong, Robert Breit, Alan Moses, Nicholas J. Provart (2016). New BAR Tools for Mining Expression Data and Exploring Cis-Elements in Arabidopsis thaliana. The Plant Journal, 2016 Nov;88(3):490-504. Epub 2016 Oct 5.
https://www.ncbi.nlm.nih.gov/pubmed/27401965

The Bio-Analytic Resource for Plant Biology

Jamie Waese and Nicholas J. Provart. The Bio-Analytic Resource for Plant Biology. Book chapter in: Methods in Molecular Biology - Plant Genomics Databases, Aalt-Jan van Dijk ed., Springer, 2017. ISBN 978-1-4939-6658-5
http://www.springer.com/gp/book/9781493966561

The Bio-Analytic Resource: Data visualization and analytic tools for multiple levels of plant biology

Jamie Waese and Nicholas J. Provart. The Bio-Analytic Resource: Data visualization and analytic tools for multiple levels of plant biology. Current Plant Biology Volumes 7–8, November 2016, Pages 2-5.
https://doi.org/10.1016/j.cpb.2016.12.001

50 Years of Arabidopsis Research - Citation Network

Nicholas Provart, Jose Alonso, Sarah Assmann, Dominique Bergmann, Siobhan Brady, Jelena Brkljacic, John Browse, Clint Chapple, Vincent Colot, Sean Cutler, Jeff Dangl, David Ehrhardt, Joanna Friesner, Wolf Frommer, Erich Grotewold, Elliot Meyerowitz, Jennifer Nemhauser, Magnus Nordborg, Craig Pikaard, John Shanklin, Chris Somerville, Shauna Somerville, Mark Stitt, Keiko Torii, Jamie Waese, Doris Wagner, and Peter McCourt. 50 Years of Arabidopsis Research: Highlights and Future Directions. New Phytologist, October 2015. Tansley Review. DOI: 10.1111/nph.13687
http://onlinelibrary.wiley.com/doi/10.1111/nph.13687/abstract

Light Stage

Paul E. Debevec, Andreas Wenger, Chris Tchou, Andrew Gardner, Jamie Waese, Tim Hawkins (2002). A lighting reproduction approach to live-action compositing. SIGGRAPH '02: Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques. Pages 547-556.
https://dl.acm.org/citation.cfm?id=566614

High Dynamic Range Light Probe

Jamie Waese, Paul Debevec (2001) A Real Time High Dynamic Range Light Probe. SIGGRAPH 2001 Technical Sketches
http://vgl.ict.usc.edu/Research/rtlp/RealTimeLightProbe-sketch-2001.pdf

Conference Presentations

IEEE VIS 2019 Application Spotlight: "Does AI mean data visualization is dead? A discussion with IBMers working at the intersection of AI and data visualization about the opportunities and challenges of building next generation business intelligence products." Vancouver Convention Centre, October 23, 2019. Jamie Waese, Anne Stevens, Afrooz Samaei, Stephen O'Connell, Frank van Ham.
http://ieeevis.org/year/2019/info/application-spotlights

Designing & Building Data Visualization Tools for Large Biological Data Sets. SORA-TABA Workshop & DLSPH Biostatistics Research Day. Hospital for Sick Children (SickKids), Peter Gilgan Centre for Research and Learning Auditorium, May 5, 2017
http://sorataba.org/sorataba-workshop-2017/

ePlant: An agile approach to visualizing multiple levels of biological data. Jamie Waese, Asher Pasha, Nicholas Provart, International Conference of Arabidopsis Researchers at Paris, July 9, 2015.

Gene Slider: A new visualization tool for the AIP. Jamie Waese, Asher Pasha, Nicholas Provart. Arabidopsis Information Portal Developer Workshop at the Texas Advanced Computing Center, University of Texas at Austin, November 5, 2014.

ePlant: A user friendly data visualization tool for integrating and exploring multiple levels of biological data. Jamie Waese, Asher Pasha, Nicholas Provart, International Conference of Arabidopsis Researchers at Vancouver, June 29, 2014.

Data Sets, Webservices and Visualization Apps from the Bio-Analytic Resource for use in the AIP and other Cyberinfrastructure Assets. Nicholas Provart, Jamie Waese, Zhenming Yu, Asher Pasha, Rohan V. Patel, Sylva Donaldson, Adrian Platts, Mathieu Blanchette and Stephen I. Wright. International Arabidopsis Informatics Consortium session at the XXII Plant and Animal Genome conference in San Diego, January 13, 2014.

Building Data Visualization Tools For Large Biological Datasets, Toronto Bioinformatics Users Group (TorBUG), March 27, 2013

Medium Articles

What TV Taught Me About Design Presentations

Seven principles of storytelling to elevate your next presentation. Published in the Shopify UX Medium Channel on February 3, 2022. Based on a 30-minute talk I gave to the #Writers-of-Shopify interest group.

Data Visualization and AI

Stephen O’Connell, Afrooz Samaei, Anne Stevens and Jamie Waese. Does Artificial Intelligence Mean Data Visualization is Dead? Nightingale - The Journal of the Data Visualization Society, Medium.com. January 10, 2020.

Refereed Posters

ePlant: A user friendly data visualization tool for integrating and exploring multiple levels of biological data. Jamie Waese, Asher Pasha, Nicholas Provart, International Conference of Arabidopsis Researchers 2014.

Gene Slider: A Sequence Logo Interactive data visualization tool for Education and Research, Jamie Waese, Asher Pasha, Nicholas Provart, BioVis 2013.
