Professor Jenkins’ research focuses on pancreatic cancer, an inflammation-associated cancer and the fourth most common cause of cancer death worldwide, with a five-year survival rate of only 5%. Typically, studies compare gene expression patterns between normal and cancerous pancreas to identify unique signatures, which can indicate sensitivity or resistance to specific chemotherapeutic treatments.
“Using next generation gene sequencing, involving big instruments, big data and big computing – allows near-term disruptive change in the clinical treatment of pancreatic cancer.” Prof. Jenkins, Monash Health.
To date, gene expression studies have largely focused on samples taken by open surgical biopsy, a very invasive procedure that is only possible in 20% of pancreatic cancers. Prof Jenkins’ group, in collaboration with Dr Daniel Croagh from the Department of Upper Gastrointestinal and Hepatobiliary Surgery at Monash Medical Centre, recently trialled a less invasive alternative that is available to nearly all pancreatic cancer patients: endoscopic ultrasound-guided fine-needle aspirate (EUS-FNA), which uses a thin, hollow needle to collect samples of cells from which genetic material can be extracted and analysed. The challenge is then to ensure that gene sequencing from EUS-FNA samples is comparable to that from open surgical biopsy, so that established analysis and treatment approaches can be used.
Twenty-four EUS-FNA-derived genetic samples from normal and cancerous pancreas were sequenced at the MHTP Medical Genomics Facility, producing a total of 40Gb of raw data. The data were securely transferred onto R@CMon by the Monash Bioinformatics Platform for processing, statistical analysis and computational exploration using state-of-the-art bioinformatics methods.
Results from this study so far show that data from EUS-FNA-derived samples were of high quality and allowed the identification of gene expression signatures that distinguish normal from cancerous pancreas. Professor Jenkins’ group is now confident that EUS-FNA-derived material has the potential not only to capture nearly all pancreatic cancer patients (compared to ~20% by surgery), but also to improve patient management and treatment in the clinic.
“The current clinical genomics research space requires specialized high performance computational and storage infrastructure to support the processing and long term storage of those so-called “big data”. Thus R@CMon plays a major role in the discovery and development of new therapies and the improvement of Human health care in general.” Roxane Legaie, Senior Bioinformatician, Monash Bioinformatics Platform
The Monash Digital Object Identifier (DOI) Minter was developed by the ANDS-funded Monash University Major Open Data Collections (MODC) Project as an extensible service and deployed on the Monash node (R@CMon) of the NeCTAR Research Cloud. It provides persistent and unique identifiers for datasets and research publications. A DOI is permanently assigned to a dataset or publication to provide information about it, including where it, or information about it, can be found on the Internet. The DOI will not change even if information about the dataset changes over time.
Store.Synchrotron’s data publishing form using the Monash DOI minter service.
The Monash DOI Minter gives Monash University the ability to mint DOIs for data collections that are hosted and managed by services on R@CMon. Integrating and accessing DOIs has never been easier. For instance, the Monash Library can now use this service to mint DOIs for publicly accessible research collections. It is also being used by the Australian Synchrotron’s Store.Synchrotron service, which manages data produced by the Macromolecular Crystallography (MX) beamline and streamlines DOI minting for datasets through a publication workflow.
A demo published collection on Store.Synchrotron.
An MX beamline user can now collect data on the beamline which is stored, archived and made accessible through the Store.Synchrotron service. When the researcher has publication quality data, a copy of this data is deposited in the Protein Data Bank (PDB), with the appropriate metadata. The new publication workflow allows researchers to publish data hosted by the Store.Synchrotron service, with PDB metadata being automatically attached to the datasets, and a DOI being minted and activated after a researcher-selected embargo period. The DOI reference can then be included in their research papers.
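The embargo step in the workflow above can be pictured with a minimal Python sketch. It is not the Store.Synchrotron or DOI Minter implementation: the `Publication` class, its field names and the DOI value are all hypothetical, chosen only to show a DOI becoming active once a researcher-selected embargo period has elapsed.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Publication:
    """Hypothetical model of a publication request (not the real schema)."""
    doi: str            # placeholder DOI, for illustration only
    submitted: date     # day the researcher submitted the publication form
    embargo_days: int   # researcher-selected embargo period, in days

    @property
    def release_date(self) -> date:
        # The DOI is activated once the embargo period has elapsed.
        return self.submitted + timedelta(days=self.embargo_days)

    def is_active(self, today: date) -> bool:
        # True from the release date onwards.
        return today >= self.release_date

    @property
    def resolver_url(self) -> str:
        # DOIs are conventionally resolved through the doi.org proxy.
        return f"https://doi.org/{self.doi}"


# Example with made-up values: a 90-day embargo starting on 1 June 2015.
pub = Publication(doi="10.0000/example.1234",
                  submitted=date(2015, 6, 1),
                  embargo_days=90)
print(pub.release_date, pub.is_active(date(2015, 7, 1)), pub.resolver_url)
```

Keeping the release date as a derived property, rather than a stored field, means the embargo length stays the single value a researcher would need to choose.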
We think it is a brilliant pattern of play for accelerating the adoption of persistent identifiers for research data held at universities. To this end, we have made the DOI Minter available for others to instantiate.
Ceph Days are a series of regular events in support of the Ceph open source community. They now occur at locations all around the world. In November, R@CMon hosted Australia’s first Ceph Day. The day hosted 70-odd guests, many of whom were from interstate and a few from overseas. The participants were from the research sector, private industry and ICT providers. It was a fantastic culmination of Australia’s growing Ceph community.
If you don’t already know, Ceph is basically an open-source technology for software-defined cluster-based storage. It means our storage backend is essentially infinitely scalable, and our focus can shift to the access mechanisms for data.
R@CMon has pioneered the adoption of Ceph for accessible research data storage and in mid-2013 was the first NeCTAR Research Cloud node to provide un-throttled volume storage. R@CMon has also worked closely with what was Inktank, and is now Red Hat, to develop the support model for such an enterprise (see Ceph Enterprise – a disruptive period in the storage marketplace).
The day began with the Ceph Community Director – Patrick McGarry. His presentation included information about the upcoming expanded Ceph metrics platform, what the Ceph User Committee has been up to, new community infrastructure for a better contributor experience, and revised open source governance.
Undoubtedly the highlight of the day was the joint talk given by R@CMon’s very own director – Steve Quenette and technical lead – Blair Bethwaite. Here we explain Ceph in the context of the 21st century microscope – the tool each researcher creates to do modern day research. We also explain how we technically approached creating our fabric.
At SuperComputing 2015 in Austin, our network/fabric partner Mellanox announced R@CMon (Monash University) as a “HPC Centre of Excellence”. A core goal of the HPC CoE is to drive the technological innovations required for next generation (exascale) supercomputing, whilst also ensuring that such an exascale computer is relevant to modern research. R@CMon is a standout pioneer in converging cloud, HPC and data, all of which are key to the “next generation”.
“We see Monash as a leader in Cloud and HPC on the Cloud with Openstack, Ceph and Lustre on our Ethernet CloudX platform.” Sudarshan Ramachandran, Regional Sales Manager, Australia & New Zealand
From a fabric innovation point of view, it has been a very productive and exciting 24 months for R@CMon. By early 2014 the internal Monash University HPC system “MCC” was burst onto the Research Cloud, allowing a researcher’s own merit to be leveraged with institutional investment. It also represents a shift towards soft HPC, where the size of a HPC system changes regularly with time. Earlier this year we announced our early adoption of RoCE (RDMA over Converged Ethernet) using Mellanox technologies. This meant the same fabric used for cloud networking could also be used for HPC and data storage backplanes. In turn, MCC on R@CMon also enabled RDMA communications, that is, real HPC performance but on an otherwise orchestrated cloud.
Finally, at the Tokyo OpenStack Summit 2015, Mellanox announced R@CMon as debuting the World’s first 100G End-to-End Cloud. This technology eases scaling and heterogeneity of performance aspects. In particular, it sets the basis for processor and storage performance for peak and converged cloud/HPC needs. Watch this space!
The Ramialison Group at the Australian Regenerative Medicine Institute (ARMI), located in the biomedical research precinct of Monash University, Clayton, specialises in systems biology both on the bench and through computational analysis. Their work is driven by the in vivo and in silico dissection of regulatory mechanisms involved in heart development. Deregulation of such mechanisms causes congenital heart disease, which results in 1 out of every 100 babies in Australia being born with heart defects.
Heatmap generated from transcriptomic data from heart samples (Nathalia Tan)
Their research focuses on identifying DNA elements that play a crucial role in the development of the heart and that could be impaired in disease. To identify these sequences, several genome-wide interrogation technologies (genomics and transcriptomics) are employed on different model organisms such as mouse or zebrafish. Downstream analysis of the data generated from these experiments involves high performance computing and requires large storage, which can be up to hundreds of gigabytes in size for a single project.
To optimise their investigation into heart development, the R@CMon team has deployed a dedicated Decoding Heart Development and Disease (DHDD) server on the Monash node of the NeCTAR Research Cloud infrastructure, which has now been running for over a year. This has not only provided the group with faster processing speeds in comparison to running jobs on a local desktop, but also an appropriate file storage infrastructure with persistent storage for files that are regularly accessed during analysis. Through VicNode, the group has been given vault storage for archiving completed results for their various research projects. With the assistance of R@CMon, the group has been able to easily add users to the server as it continues to grow with new members and local collaborators.
Web interface for the Trawler web service.
In addition to the DHDD server, the R@CMon team also assisted the Ramialison Group in deploying a dedicated cloud server that has been used to develop the Trawler motif discovery tool web service. The implementation of this tool allows the group to quickly and easily analyse next-generation sequencing data and identify overrepresented motifs, which has led to a manuscript that is currently in preparation. The Ramialison Group envisage future developments of similar simple and easy to use bioinformatics analysis tools through R@CMon.
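To give a feel for what "overrepresented motifs" means, here is a toy Python sketch that ranks k-mers by how much more frequent they are in a set of sequences of interest than in a background set. This is not Trawler's algorithm: the function names, the k-mer length, the pseudocount and the example sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enriched_kmers(foreground, background, k=6, pseudocount=1.0):
    """Rank foreground k-mers by frequency ratio against the background.

    The pseudocount (an assumed smoothing choice) avoids division by zero
    for k-mers that never occur in the background.
    """
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    scores = {}
    for kmer, n in fg.items():
        fg_freq = (n + pseudocount) / (fg_total + pseudocount)
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount)
        scores[kmer] = fg_freq / bg_freq
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up sequences: the 6-mer GATAAG is planted in every foreground sequence.
foreground = ["AAGATAAGG", "CCGATAAGC", "TTGATAAGT"]
background = ["ACGTACGTAC", "TTTTCCCCGG", "GGCCAATTGG"]
ranked = enriched_kmers(foreground, background, k=6)
print(ranked[0][0])
```

A real motif discovery tool would also handle reverse complements, degenerate positions and statistical significance, none of which this sketch attempts.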
The Epigenetics and Chromatin (EpiC) Lab at Monash University is working on understanding how mutations in certain chromatin factors promote the formation of brain tumours. This project involves the generation and analysis of high-throughput sequencing data of chromatin modifications and remodellers in normal and mutated cells. The sequencing is carried out at the MHTP Medical Genomics Facility and the resulting datasets are then imported into the analysis workflow running on the Monash node (R@CMon) of the NeCTAR Research Cloud. The sequencing reads are first aligned to the repetitive fraction of the genome using a script developed by Day et al. (Genome Biology 2010) to determine enrichment at repeats. Sequencing reads are then aligned to the genome using Bowtie. The resulting files are filtered for quality, poor matches and PCR duplicates using customised Perl scripts. The filtered files are then imported into SeqMonk for further analysis.
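The filtering step described above (quality, poor matches, PCR duplicates) can be sketched in Python as follows. This is not the lab's Perl code: the record layout, the thresholds and the duplicate rule (same chromosome, start position and strand) are assumptions chosen to illustrate the idea.

```python
from typing import Iterable, List, NamedTuple


class AlignedRead(NamedTuple):
    """Simplified, hypothetical view of one aligned read."""
    name: str
    chrom: str
    pos: int          # leftmost aligned position
    strand: str       # "+" or "-"
    mapq: int         # mapping quality reported by the aligner
    mismatches: int   # number of mismatches against the reference


def filter_reads(reads: Iterable[AlignedRead],
                 min_mapq: int = 20,
                 max_mismatches: int = 2) -> List[AlignedRead]:
    """Drop low-quality reads, poor matches and likely PCR duplicates."""
    seen = set()
    kept = []
    for read in reads:
        if read.mapq < min_mapq:
            continue  # low mapping quality
        if read.mismatches > max_mismatches:
            continue  # poor match to the reference
        key = (read.chrom, read.pos, read.strand)
        if key in seen:
            continue  # same position and strand: treated as a PCR duplicate
        seen.add(key)
        kept.append(read)
    return kept


# Made-up reads to exercise each filter.
reads = [
    AlignedRead("r1", "chr1", 100, "+", 40, 0),
    AlignedRead("r2", "chr1", 100, "+", 38, 1),  # duplicate of r1
    AlignedRead("r3", "chr1", 250, "-", 5, 0),   # low mapping quality
    AlignedRead("r4", "chr2", 75, "+", 30, 4),   # too many mismatches
    AlignedRead("r5", "chr2", 90, "-", 30, 1),
]
kept = filter_reads(reads)
print([r.name for r in kept])
```

Keeping the first read seen at each position is the simplest duplicate policy; keeping the highest-quality read instead would be an equally reasonable choice.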
Overlap analysis using SeqMonk
SeqMonk allows for rapid visualisation of individual aligned reads across the entire genome. The inbuilt MACS peak caller is used for first pass peak calling. A selection of peaks is then validated in the lab by ChIP-qPCR experiments and peak-calling parameters can be adjusted based on these results. Overlap analysis with regions of interest can be performed in SeqMonk. Aligned sequence files are converted to BigWig format using customised Perl scripts and uploaded onto the NeCTAR Object Storage (Swift), from which they can be loaded seamlessly into the UCSC Genome Browser for visualisation and further investigation. Once the sequence files are uploaded to the object storage, they can be easily compared against public ENCODE datasets and UCSC genomic annotations to identify any potentially interesting correlations.
Aligned sequence visualisation using the UCSC Genome Browser.
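The link between object storage and the genome browser can be sketched in Python as follows. The UCSC Genome Browser accepts a custom track line of the form `track type=bigWig ... bigDataUrl=<URL>`, and Swift objects are addressed as `/v1/<account>/<container>/<object>`. The endpoint, account, container and file names below are hypothetical, and the sketch assumes the container is readable by the browser.

```python
from urllib.parse import quote


def swift_object_url(endpoint: str, account: str, container: str, obj: str) -> str:
    """Build the URL of an object in a Swift container.

    Uses the standard Swift path layout /v1/<account>/<container>/<object>.
    Each part is percent-encoded so spaces in names do not break the URL.
    """
    return (f"{endpoint.rstrip('/')}/v1/"
            f"{quote(account)}/{quote(container)}/{quote(obj)}")


def ucsc_bigwig_track(name: str, description: str, url: str) -> str:
    """Format a UCSC custom track line pointing at a remote BigWig file."""
    return (f'track type=bigWig name="{name}" '
            f'description="{description}" bigDataUrl={url}')


# Hypothetical endpoint, account, container and file name.
url = swift_object_url("https://swift.example.org", "AUTH_demo",
                       "epic-lab", "sample1.bw")
track_line = ucsc_bigwig_track("sample1", "ChIP-seq coverage", url)
print(track_line)
```

The resulting line is what would be pasted into the browser's custom track form; because the BigWig file stays in object storage, only the regions being viewed need to be fetched.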
The R@CMon team and the Monash Bioinformatics Platform supported the EpiC Lab by deploying a dedicated analysis instance on the NeCTAR Research Cloud based on the training environment first developed for the BPA-CSIRO Bioinformatics Training Platform. The open access and reusability of the training platform means it can be easily readapted to various analysis workflows. The R@CMon team and the Monash Bioinformatics Platform will continue to engage with the EpiC Lab as they grow and scale their analysis workflow on the NeCTAR Research Cloud.
Arvind Rajan is a scholar from the School of Engineering at the Monash University Sunway Malaysia campus. Arvind’s project, “Analytical Uncertainty Evaluation of Multivariate Polynomial”, is supported by Monash University Malaysia (HDR scholarship) and the Malaysia Fundamental Research Grant Scheme. It extends the analytical method of the “Guide to the Expression of Uncertainty in Measurement (GUM)” by developing a systematic framework, the Analytical Standard Uncertainty Evaluation (ASUE), for the analytical evaluation of standard measurement uncertainty in non-linear systems. The framework is the first step towards the simplification and standardisation of the GUM analytical method for non-linear systems.
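A small worked example shows why non-linear systems need more than the usual first-order treatment. The Python sketch below is not the ASUE framework: it takes the simplest non-linear model, y = x², and assumes a Gaussian input. It compares the first-order GUM propagation, u(y) = |2μ|σ, with the exact standard deviation, which follows from the Gaussian moments as sqrt(4μ²σ² + 2σ⁴).

```python
import math


def first_order_uncertainty(mu: float, sigma: float) -> float:
    """First-order (linearised) GUM propagation for y = x**2.

    The sensitivity coefficient dy/dx = 2*mu is evaluated at the estimate mu.
    """
    return abs(2.0 * mu) * sigma


def exact_uncertainty(mu: float, sigma: float) -> float:
    """Exact standard deviation of y = x**2 for x ~ N(mu, sigma**2).

    Var(x**2) = E[x**4] - E[x**2]**2 = 4*mu**2*sigma**2 + 2*sigma**4.
    """
    return math.sqrt(4.0 * mu**2 * sigma**2 + 2.0 * sigma**4)


# Illustrative inputs: the linearisation degrades as mu approaches zero.
for mu, sigma in [(3.0, 0.1), (0.5, 0.5), (0.0, 1.0)]:
    print(mu, sigma,
          first_order_uncertainty(mu, sigma),
          exact_uncertainty(mu, sigma))
```

At μ = 0 the first-order estimate collapses to zero even though the output clearly varies, which is the kind of failure an analytical treatment of higher-order terms is meant to avoid.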
The ASUE Toolbox
The R@CMon team supported the ASUE team at Sunway in deploying the framework on the NeCTAR Research Cloud. The project has been given access to the Monash-licensed Windows Server 2012 image and Windows-optimised instance flavour for configuration of the Internet Information Services (IIS) and ASP.NET stack. The ASUE team developed and deployed the framework on NeCTAR using remote desktop access (yes once again – even from overseas!). Mathematica, specifically webMathematica, is then used on the NeCTAR instance to power the web-based dynamic ASUE Toolbox. The ASUE Toolbox has been published in Measurement, a journal of the International Measurement Confederation (IMEKO), and in IEEE Access, an open access journal:
Y. C. Kuang, A. Rajan, M. P.-L. Ooi, and T. C. Ong, “Standard uncertainty evaluation of multivariate polynomial,” Measurement, vol. 58, pp. 483-494, Dec. 2014.
A. Rajan, M. P. Ooi, Y. C. Kuang, and S. N. Demidenko, “Analytical Standard Uncertainty Evaluation Using Mellin Transform,” IEEE Access, vol. 3, pp. 209-222, 2015.
“The NeCTAR Research Cloud is a great service for researchers to host their own website and share the outcome of their research with engineers, practitioners and other professional community. Honestly, if it is not for the NeCTAR Research Cloud, I doubt our team could have made it this far. The support has been incredible so far. I will continue to publish my work using this service.”
Monash University Scholar
Electrical and Computer Systems Engineering