Our cloud (and HPC) building team is globally regarded as a pioneer in cloud / digital research infrastructure. We apply these skills and this experience to every research engagement. Some examples of our impact are:
ARDC Nectar Core Services – The Research Cloud at Monash team forms part of Nectar’s Core Services, the central group that operates and innovates the underpinning federated cloud. Core Services maintains its own distribution of OpenStack – the most widely deployed open source cloud software in the world – adapted for the nuances of running a Research Cloud.
Nectar adopted OpenStack in its infancy (2012), well before it became the world’s third most contributed-to open source project. Over the past decade, many of Nectar’s learnings have been contributed back into OpenStack. In the Research Cloud model, the federation / poly-cloud is achieved by operating Horizon (the Research Cloud dashboard), Keystone (identity, connected to the Australian Access Federation – AAF) and Glance (the virtual machine image repository) nationally / centrally. The nodes of the federation, of which Monash is one, operate Nova (the compute resources), Cinder (the volume storage resources) and Swift (the object storage resources). Core Services also coordinates the upgrade cycle of the federation, the high quality (“golden”) virtual machine images and the default set of virtual machine flavour configurations.
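As a sketch of how a client consumes this split of services, a `clouds.yaml` entry might authenticate against the centrally operated Keystone while compute, volume and object storage requests are directed to a federation node via the region. The endpoint URL, region name and credential placeholders below are hypothetical, not the production values:

```yaml
# Hypothetical clouds.yaml entry for the federated Research Cloud.
# Keystone (identity) is operated centrally; Nova / Cinder / Swift
# requests are served by the federation node selected via the region.
clouds:
  nectar:
    auth:
      auth_url: https://keystone.example.org:5000/v3/   # central Keystone (hypothetical URL)
      application_credential_id: <credential-id>
      application_credential_secret: <credential-secret>
    auth_type: v3applicationcredential
    identity_api_version: 3
    region_name: monash                                  # hypothetical region for the Monash node
```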
OpenStack Scientific Interest Group (SIG) – The Monash team aided the establishment of the research sector’s user forum within the OpenStack community – the Scientific SIG. We made substantive contributions to the seminal recipe book for building HPC systems on OpenStack, The Crossroads of Cloud and HPC: OpenStack for Scientific Research (2016).
Pioneering the adoption of Ceph – Ceph’s Reliable Autonomic Distributed Object Store (RADOS) provides extraordinary data storage scalability, proven at thousands of hosts / virtual machines within a cloud accessing petabytes to exabytes of data. Applications can choose between object, block or file system interfaces to the same RADOS cluster. This means that, for the majority of our storage use cases, we can treat everything except HPC / parallel filesystem and tape library storage as one storage pool. This enables us to refresh hardware incrementally and, in turn, flatten our yearly spend.
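The three interfaces onto a single RADOS cluster can be sketched with standard Ceph tooling. This is illustrative only: the pool, image and mount names are hypothetical, and a running cluster with appropriate credentials is assumed:

```shell
# Object interface: store and fetch an object directly in a RADOS pool
rados -p research-pool put dataset.tar ./dataset.tar
rados -p research-pool get dataset.tar ./copy.tar

# Block interface: carve a virtual machine volume out of the same cluster
rbd create research-pool/vm-volume --size 10G

# File interface: mount CephFS served from the same cluster's monitors
mount -t ceph mon1.example.org:6789:/ /mnt/cephfs -o name=admin
```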
The Research Cloud at Monash adopted Ceph in its infancy (2012), and its use has since spread to other Nectar nodes and other purposes within the University. We had built the southern hemisphere’s largest Ceph cluster by 2016, and were awarded Red Hat’s storage client of the year award in 2017. Today we actively participate in the Ceph Foundation as an associate (academic) member, mainly driving the adoption of Ceph for research and university workloads.
Pioneering the adoption of RoCE for cloud – In 2012 we were very early adopters of Mellanox’s ConnectX NICs. Mellanox had decided to extend RDMA, originally applied to InfiniBand for HPC (where they had, and still have, substantive market share), to RDMA over Converged Ethernet (RoCE) for clouds. We produced the world’s first reference implementation of ConnectX for societally relevant use cases, launching at OpenStack Summit Tokyo 2015. ConnectX now holds 70% of the market share for NICs above 10GbE.
Pioneering the adoption of CumulusOS – By 2016/2017 we noticed that network operations was the sole remaining ICT pillar that needed “devops-ing”. We partnered with Cumulus, who were commercialising an approach heavily inspired by Google’s own, to transition our network fabric to Linux (CumulusOS), thus enabling the devops paradigm (via NetQ). This work was done in partnership with the University’s networks team (within eSolutions), from where it has since spread to the enterprise.
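Because the switch runs Linux, the devops paradigm means fabric configuration can be managed as code, like any other Linux host. A minimal sketch, assuming Ansible against a Cumulus Linux switch (the group name, template and play are hypothetical; `ifreload -a` is the standard ifupdown2 reload command):

```yaml
# Hypothetical Ansible play: treat a Cumulus Linux switch as a Linux host,
# rendering interface config from a source-controlled template and
# reloading it atomically.
- hosts: fabric_switches
  become: true
  tasks:
    - name: Render interfaces from a version-controlled template
      template:
        src: interfaces.j2
        dest: /etc/network/interfaces
    - name: Apply the change atomically
      command: ifreload -a
```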