Tag Archives: NGS

The Ramialison Group Analysis Workflow on R@CMon

The Ramialison Group at the Australian Regenerative Medicine Institute (ARMI), located in the biomedical research precinct of Monash University, Clayton, specialises in systems biology, both at the bench and through computational analysis. Their work is driven by the in vivo and in silico dissection of the regulatory mechanisms involved in heart development. Deregulation of these mechanisms causes congenital heart disease, which results in 1 in every 100 babies in Australia being born with heart defects.

Heatmap generated from transcriptomic data from heart samples (Nathalia Tan)

Their research focuses on identifying DNA elements that play a crucial role in the development of the heart and that could be impaired in disease. To identify these sequences, several genome-wide interrogation technologies (genomics and transcriptomics) are employed on different model organisms such as mouse or zebrafish. Downstream analysis of the data generated from these experiments involves high performance computing and requires large amounts of storage, up to hundreds of gigabytes for a single project.

To support the group's investigation into heart development, the R@CMon team deployed a dedicated Decoding Heart Development and Disease (DHDD) server on the Monash node of the NeCTAR Research Cloud infrastructure, which has now been running for over a year. This has provided the group not only with faster processing than running jobs on a local desktop, but also with an appropriate storage infrastructure, including persistent storage for files that are regularly accessed during analysis. Through VicNode, the group has been given vault storage for archiving completed results from their various research projects. With the assistance of R@CMon, the group has been able to easily add users to the server as it continues to grow with new members and local collaborators.

Web interface for the Trawler web service.

In addition to the DHDD server, the R@CMon team also assisted the Ramialison Group in deploying a dedicated cloud server that has been used to develop the Trawler motif discovery tool web service. This tool allows the group to quickly and easily analyse next-generation sequencing data and identify overrepresented motifs, and it has led to a manuscript that is currently in preparation. The Ramialison Group envisages developing similar simple, easy-to-use bioinformatics analysis tools through R@CMon in the future.

Histone H3.3 Analysis on R@CMon

The Epigenetics and Chromatin (EpiC) Lab at Monash University is working on understanding how mutations in certain chromatin factors promote the formation of brain tumours. This project involves the generation and analysis of high-throughput sequencing data of chromatin modifications and remodellers in normal and mutated cells. The sequencing is carried out at the MHTP Medical Genomics Facility and the resulting datasets are then imported into the analysis workflow running on the Monash node (R@CMon) of the NeCTAR Research Cloud. The sequencing reads are first aligned to the repetitive fraction of the genome using a script developed by Day et al. (Genome Biology 2010) to determine enrichment at repeats. Sequencing reads are then aligned to the genome using Bowtie. The resulting files are filtered using customised Perl scripts to remove low-quality reads, poor matches and PCR duplicates. The filtered files are then imported into SeqMonk for further analysis.
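The filtering step can be illustrated with a short sketch. The lab's actual filtering is done with customised Perl scripts that are not shown here, so the following Python is only an assumption about what such a filter might look like: the `AlignedRead` record, the `filter_reads` function and the `min_mapq` and `max_mismatches` thresholds are all hypothetical names and values chosen for illustration. Duplicates are approximated as reads sharing the same chromosome, start position and strand, which is one common heuristic rather than necessarily the lab's rule.

```python
from typing import Iterable, List, NamedTuple


class AlignedRead(NamedTuple):
    """A simplified alignment record (hypothetical, for illustration only)."""
    name: str
    chrom: str
    start: int        # leftmost aligned position
    strand: str       # "+" or "-"
    mapq: int         # mapping quality reported by the aligner
    mismatches: int   # number of mismatches in the alignment


def filter_reads(reads: Iterable[AlignedRead],
                 min_mapq: int = 20,
                 max_mismatches: int = 2) -> List[AlignedRead]:
    """Drop low-quality reads, poor matches and likely PCR duplicates.

    The default thresholds are illustrative assumptions, not the lab's values.
    """
    seen = set()
    kept = []
    for read in reads:
        if read.mapq < min_mapq:
            continue  # low mapping quality
        if read.mismatches > max_mismatches:
            continue  # poor match to the reference
        key = (read.chrom, read.start, read.strand)
        if key in seen:
            continue  # same position and strand: treat as a PCR duplicate
        seen.add(key)
        kept.append(read)
    return kept


example_reads = [
    AlignedRead("r1", "chr1", 100, "+", 30, 0),
    AlignedRead("r2", "chr1", 100, "+", 35, 1),  # duplicate of r1
    AlignedRead("r3", "chr1", 200, "-", 5, 0),   # low mapping quality
    AlignedRead("r4", "chr2", 50, "+", 40, 3),   # too many mismatches
    AlignedRead("r5", "chr1", 100, "-", 30, 0),  # other strand, so kept
]
kept_names = [read.name for read in filter_reads(example_reads)]
```

In this sketch the first read seen at a given position is the one that is kept; a real pipeline might instead keep the highest-quality read among the duplicates.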

Overlap analysis using SeqMonk

SeqMonk allows for rapid visualisation of individual aligned reads across the entire genome. Its inbuilt MACS peak caller is used for first-pass peak calling. A selection of peaks is then validated in the lab by ChIP-qPCR experiments, and peak-calling parameters can be adjusted based on these results. Overlap analysis with regions of interest can also be performed in SeqMonk. Aligned sequence files are converted to BigWig format using customised Perl scripts and uploaded to the NeCTAR Object Storage (Swift), from where they can be loaded seamlessly into the UCSC Genome Browser for visualisation and further investigation. Once the sequence files are in the object storage, they can easily be compared against public ENCODE datasets and UCSC genomic annotations to identify any potentially interesting correlations.
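The idea behind overlap analysis can be sketched in a few lines of Python. This is not how SeqMonk implements it; it is a minimal illustration that assumes peaks and regions of interest are given as half-open (chromosome, start, end) intervals, and the function name `overlapping_peaks` is hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), half-open coordinates


def overlapping_peaks(peaks: List[Interval],
                      regions: List[Interval]) -> List[Interval]:
    """Return the peaks that overlap at least one region of interest."""
    # Group regions by chromosome so each peak is only compared
    # against regions on its own chromosome.
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = []
    for chrom, p_start, p_end in peaks:
        for r_start, r_end in by_chrom.get(chrom, []):
            # Two half-open intervals overlap when each one starts
            # before the other ends.
            if p_start < r_end and r_start < p_end:
                hits.append((chrom, p_start, p_end))
                break  # one overlapping region is enough
    return hits


example_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 10, 20)]
example_regions = [("chr1", 150, 300), ("chr2", 20, 40)]
example_hits = overlapping_peaks(example_peaks, example_regions)
```

The nested loop is fine for a handful of validated peaks; for genome-wide peak sets a sorted or tree-based interval index would scale better.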

Aligned sequence visualisation using the UCSC Genome Browser.

The R@CMon team and the Monash Bioinformatics Platform supported the EpiC Lab by deploying a dedicated analysis instance on the NeCTAR Research Cloud based on the training environment first developed for the BPA-CSIRO Bioinformatics Training Platform. The open access and reusability of the training platform means it can be easily readapted to various analysis workflows. The R@CMon team and the Monash Bioinformatics Platform will continue to engage with the EpiC Lab as they grow and scale their analysis workflow on the NeCTAR Research Cloud.

Bioplatforms Australia – CSIRO NGS Workshop (July 1-3, 2014)

On July 1-3, 2014, the latest Bioplatforms Australia – CSIRO joint Next Generation Sequencing hands-on workshop was held at the University of New South Wales, Sydney. The workshop was delivered using the established Bioinformatics Training Platform running on the NeCTAR Research Cloud and provided bench biologists and PhD students with NGS training on the following topics:

      • Introduction to the command-line interface – Software Carpentry
      • Introduction to Next Generation Sequencing
      • Illumina Next Generation Sequencing Data Quality
      • Sequence Alignment Algorithms
      • ChIP-Seq Analysis
      • RNA-Seq Analysis
      • de novo Genome Assembly

Sequence data quality analysis and visualisation using FastQC and FASTX-Toolkit.

The R@CMon team helped the workshop organisers update the training environment with the latest tools, datasets and other materials, and ensured resource stability throughout the three-day workshop. Future Bioplatforms Australia and CSIRO joint workshops will be announced on the Bioplatforms Australia Training page.

Alignment visualisation using IGV.

The trainees had the following to say about the workshop:

“The practical component made it 1000 times easier to get my head around the course and I feel like I can be confident in actually applying what I’ve learned (instead of just in lecture format).”

“The beginning with introduction to Unix environment and explanation of the de novo assembly was the best part of the course as the commands were described in more detail so I could understand what the different commands were executing. There was more practical work with the de novo assembly which was good.”

“Hands on experience is good, and the first part on command lines is good for the beginners.”

Bioinformatics Training on R@CMon

A multidisciplinary partnership between Monash eResearch Centre and Bioplatforms Australia has provided a broadly accessible solution to delivering hands-on bioinformatics workshops with seamless access to cloud computing using the new NeCTAR Research Cloud infrastructure.

Running hands-on bioinformatics workshops in Australia has previously been hampered by the lack of specialised bioinformatics training facilities and a paucity of skilled trainers to develop and deliver these courses.

To improve the bioinformatics skills of bench scientists now faced with handling gigabyte-sized datasets generated by next-generation sequencing technologies, Bioplatforms Australia and CSIRO have been collaborating to advance bioinformatics expertise among Australian ‘omics researchers. Through an international partnership with the EMBL European Bioinformatics Institute in the UK, a cutting-edge three-day Australian hands-on NGS workshop has been created. This course introduces bench scientists to quality control of NGS data, alignment, ChIPSeq, RNASeq and de novo assembly workflows and software.

Professor Paul Bonnington, Director of the Monash eResearch Centre, and the R@CMon team contributed to this training initiative by developing a cloud computing-based NGS bioinformatics training platform built on the open source bioinformatics software package CloudBioLinux. The platform allows sharing of data, tools and applications and enables trainers anywhere in the world to readily work together to develop and test new workshop material.

The Bioplatforms Australia Next Generation Sequencing workshop platform enables compute-intensive NGS training courses to be easily delivered and accessed widely around Australia and requires very little local IT expertise or need for high end computational hardware. The first hands-on workshops using the NGS workshop platforms were held in July 2012 at Monash University in Melbourne and at University of New South Wales in Sydney. To date 10 workshops have been delivered around Australia in Melbourne, Sydney, Brisbane, Adelaide, Perth and Canberra to 345 trainees.

The team of trainers from Bioplatforms Australia and CSIRO, in collaboration with EMBL-EBI and Monash eResearch Centre, is currently developing a two-day metagenomics workshop using a bespoke metagenomics image built by R@CMon. This course will be run at the University of New South Wales in Sydney on 6-7 February 2014 and at Monash University on 10-11 February 2014.

Monash University is one of the nodes of the NeCTAR Research Cloud, a landmark investment that will extend the advantages of high-performance computing and high-capacity networks to Australian researchers. This exciting initiative provides online access to scalable computational power and data storage, allowing a new realm of data sharing and collaboration.

The Bioplatforms Australia Next Generation Sequencing workshop platform is now freely accessible on the NeCTAR research cloud and provides access to hundreds of bioinformatics software packages.

Further information on the Bioplatforms Australia Next Generation Sequencing workshop platform is available from Catherine Shang, Bioplatforms Australia at cshang@bioplatforms.com. Contact Prof. Paul Bonnington, MeRC Director for further details and assistance on e-research solutions.

 

Software Carpentry Bootcamp for Bioinformaticians (Adelaide/Melbourne) – UPDATE

On September 24-26 and October 1-3, the latest Software Carpentry Bootcamps were held at the University of Adelaide and Monash University.

These Software Carpentry Bootcamps were designed for bioinformaticians to enhance their knowledge and skills in programming and software development practices. The bootcamps were delivered using the NeCTAR Research Cloud, where each trainee was given dedicated access to a specific virtual workstation.

Using an automatic provisioning system, each virtual workstation was preconfigured with the tools, training materials and computing resources required to perform the hands-on exercises.

On the first day of the bootcamp, the Software Carpentry instructors, including the R@CMon team, introduced the trainees to Python. The second day was mostly about software testing and documentation. On the last day, the trainees applied their knowledge from the previous sessions in collaborative group exercises.

Photos taken by Nathan Watson-Haigh (ACPFG).

“Thanks a lot for this information and also your kind efforts in running such a  useful and informative workshop” – Fariborz Sobhanmanesh (Research Engineer, Bioinformatician with CSIRO Animal, Food and Health Sciences Centre)

“I recently attended the SWC bootcamp in Adelaide and found it incredibly useful. Sure there was a lot of information in a short amount of time but the topics covered were practical and very relevant to my daily work. Thanks must go to the presenters and organisers who kept things moving along brilliantly.” – Terry Bertozzi (Research Scientist with the South Australian Museum)

Bioplatforms Australia – CSIRO NGS Workshop (July 9-12 2013)

On July 9-12, 2013, Bioplatforms Australia and CSIRO conducted the latest NGS (Next Generation Sequencing) training workshop at Monash University.

This was the second NGS training workshop organised by Bioplatforms Australia to be held at Monash University, following the very first one last year. Using improved versions of the tools and machine image from that first workshop, the team provisioned virtual machines on the Monash node of the NeCTAR Research Cloud. See this announcement regarding the Monash node.

Bioplatforms Australia has conducted 7 NGS training workshops across Australia (Melbourne, Sydney, Brisbane, Adelaide, Canberra and Perth) in the last 12 months, with a total of 190 attendees, and 2 more workshops are planned for later this year.

Future Bioplatforms Australia workshops are listed on their website.

Bioplatforms Australia – CSIRO NGS Workshop (July 9-10 2012)

We’ve successfully engaged Bioplatforms Australia and CSIRO in utilising the NeCTAR Research Cloud infrastructure to deliver next-generation sequencing workshops at Monash University and the University of New South Wales on July 9-10, 2012.

We’ve created a custom cloud image which contains the relevant training materials, datasets and software stack. The image has been uploaded into the NeCTAR cloud and used to instantiate multiple virtual machines for the hands-on workshop. The same image has also been made available for download so that participants can try it on their own machines.

Automation tools that integrate virtual machine provisioning, software stack installation and dataset preparation have been created so that resources can be easily re-deployed for workshop sites around Australia.

A manuscript entitled “Next Generation Sequencing (NGS): A challenge to meet the increasing demand for training workshops in Australia” has recently been accepted for publication in Briefings in Bioinformatics.