Genomics Virtual Laboratory
Modern genome research is a data-intensive form of discovery, encompassing the generation, analysis and interpretation of vast amounts of data against catalogues of public genomic knowledge in complex multi-stage workflows. Algorithm and tool development continues at a rapid pace to keep up with new sequencing technologies. Genomic data and public genomic catalogues can be visualised through a variety of mature genome browsers (UCSC Genome Browser, GBrowse, IGV and others), while analysis platforms such as Galaxy, Bioflow and GenePattern, among others, allow biologists with little training in programming to develop analysis workflows and launch tasks on HPC clusters.
In practice, however, the tools, platforms and data services needed for best practice genomics are complicated to install and customise, require dedicated servers and massive data stores, and typically involve a high level of ongoing maintenance to keep the software, data and hardware up to date. This requires significant expertise in software development, system administration, hardware and networking, as well as access to hardware resources. These requirements are beyond the means of all but the largest research groups.
Aims and objectives
Genome Research Computing at the University of Queensland and the Victorian Life Sciences Computation Initiative have proposed the establishment of a Genomics Virtual Laboratory (GVL) to provide rolling best practice genomics tools and data to genome researchers nationwide. Physically, the GVL will consist of local instances of centralised scalable genomic informatics platforms, data repositories and support services in research precincts housing high-throughput genomics technologies. It will provide a vehicle for collaboration, training, support and outreach, and for ongoing strategic planning and coordination, including the development of informed and timely applications to national agencies to upgrade and expand the facilities.
The specific objectives of the GVL will be to:
- Provide infrastructure tailored to the unique data-intensive demands of genomics.
- Provide a forum for researchers to collaborate and share data and workflows.
- Coordinate with genomics groups Australia-wide, promote understanding of the unique needs of genomics, and coordinate and participate in the grant applications needed to secure ongoing state, federal and international funding.
- Provide a platform for outreach, learning and dissemination of new tools and techniques.
- Build computational skills in existing and potential genome research groups (which include biologists, clinicians, and others who may have little or no formal training in computer programming or the use of HPC systems).
- Involve the genome research community in defining future computational needs to help sustain and promote genomics in Australia.
Working with the LSCC (Life Sciences Computation Centre), VeRSI will contribute to the GVL by assisting with the implementation and customisation of a genomics workflow platform on the NeCTAR research cloud.
Outcomes
Researchers and bioinformaticians would benefit from a GVL in several equally important and complementary ways:
- ‘Reduced entry’ best practice genomics: at present, best practice typically requires significant expertise in programming, scripting and data management, and investment in understanding state-of-the-art analysis techniques. The GVL is intended to provide a central resource of hardware, software and human expertise, allowing researchers to focus on the biological interpretation of genomic data rather than the details of technical analysis.
- Integration of analysis tools, public datasets and visualisation platforms, streamlining research and reducing time from experiment to publication.
- Scalability through infrastructure: implementation of the GVL on scalable infrastructure such as the NeCTAR Research Cloud simplifies the scaling of analysis in response to rapidly rising numbers of genomics datasets of increasing size.
- Enhanced collaboration between researchers and across the community through shared datasets, workflows and customised toolsets.
- Reproducibility and research provenance: workflow platforms record all aspects of an experiment, allowing for confidence in repeatability and for the publication of workflows along with the resulting data.
VeRSI thanks Clare Sloggett of VLSCI for her contribution to this project summary.
Project details
ID number: UOB/P/010
Project title: VLSCI Life Science Computation Centre
Start date: February 2011
End date: June 2012
Lead institute: VLSCI
Principal investigator: Prof Andrew Lonie – Head LSCC
Partner PIs and/or participating institutions: Dr. Nathan Hall – Bioinformatician; Prof Justin Zobel – High Throughput Genomics Theme Leader
Partner sponsor: The University of Melbourne
VeRSI executive sponsor: Dr Ann Borda – VeRSI Executive Director
VeRSI project management: Jared Winton

Keywords: VLSCI | Galaxy | VeRSI | LSCC | Bioinformatics | NeCTAR | GVL | Genomics | Virtual Laboratory | Research | Data | Life Science | Genome | Informatics | Collaboration | Training | Repositories
