Volumes on R@CMon

The Monash node now has a block-storage volume service in pre-production. The volumes are delivered via OpenStack Cinder and Ceph. Cinder attaches networked (real or virtual) block devices to your virtual machine instances from a wide variety of backend storage systems. This gives you persistent storage on demand, in the mold of Amazon EBS. Like object storage, volumes exist independently of your instances; but unlike object storage, volumes provide virtual direct-attached storage suitable for your favourite filesystem, database or application store. Some tools built on top of the RC require volumes to unleash their full abilities, e.g., Galaxy. There is more general information about the volume service on the NeCTAR wiki Research Cloud Storage page.
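
As a rough sketch of what this looks like in practice, using the cinder and nova command-line clients (the volume name, size, instance name and device path below are just illustrative placeholders):

# Create a 50GB volume in the monash-01 availability zone
cinder create --display-name mydata --availability-zone monash-01 50

# Attach it to one of your running instances (substitute your own instance name and volume ID)
nova volume-attach my-instance <volume-id> /dev/vdc

# Inside the instance it behaves like a direct-attached disk
sudo mkfs.ext4 /dev/vdc
sudo mkdir -p /mnt/mydata
sudo mount /dev/vdc /mnt/mydata

Because the volume exists independently of the instance, you can later detach it and re-attach it to another instance with your data intact.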

Originally, NeCTAR never specifically required the RC nodes to provide persistent block-storage, only to be capable of interfacing with it at some stage in the future – clearly that was an intended meeting point (in budget line items) for NeCTAR and RDSI. At Monash we have seeded our volumes capability via NeCTAR and plan to expand it via RDSI. We plan to offer a trial quota of about 50GB to those who ask; we’ll give some out via NeCTAR merit allocation, and more via ReDS. Thanks to the “data-deluge” (please don’t hurt me for repeating that), block-storage will prove to be the scarcest resource across the RC, so we will probably have to implement some form of quota retirement – similar to how HPC facilities put tight quotas on their /short or /scratch filesystems and clean them up regularly.

We expect the Monash volume service to stay in “pre-production” until the end of the year. What we mean by that is that we are giving it best-effort support. It is a highly available and redundant system (Ceph, if you are interested – more specific details below), but we have not operationalised the system administration to give it 24/7 attention, or for that matter done much in the way of tuning. So in summary, we don’t expect or plan to cause any problems/outages, but we also have no fixed service levels or detailed DR plans as yet.

The current setup is running across eight Dell R720xd boxes with a total of 192TB of raw storage, all very closely coupled with the monash-01 compute availability zone (AZ). The “nova” pool keeps two replicas of each object, so in practice we have about 90TB of usable capacity at the moment. It seems to be working out quite well, so we expect to expand it as part of R@CMon stage 2. A bunch of folks are already using it, as you can see from the status:

root@rcstor01:~# ceph -s
 cluster b72.....x
 health HEALTH_OK
 monmap e1: 5 mons at {0=w.x.y.z2:6789/0,1=w.x.y.z3:6789/0,2=w.x.y.z4:6789/0,3=w.x.y.z5:6789/0,4=w.x.y.z6:6789/0}, election epoch 184, quorum 0,1,2,3,4 0,1,2,3,4
 osdmap e696: 96 osds: 96 up, 96 in
 pgmap v2671880: 4800 pgs: 4800 active+clean; 8709 GB data, 18387 GB used, 156 TB / 174 TB avail; 17697B/s rd, 134KB/s wr, 21op/s
 mdsmap e1: 0/0/1 up
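
If you are wondering how the ~90TB usable figure falls out of the raw numbers, the pool replication factor and overall usage can be checked directly (a quick sketch; the pool name “nova” is as above):

# Replication factor of the "nova" pool (two copies of each object)
ceph osd pool get nova size

# Cluster-wide and per-pool usage
ceph df

With roughly 174TB of formatted capacity and two replicas of every object, usable space works out to a bit under 90TB, which matches the figure above.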

The awesome thing about Ceph is the bang for buck and fine-grained scalability. The value proposition is exactly the same as for object storage: you can expand one server at a time at commodity prices per TB (which seem to be about half to two-thirds those of vendor proprietary solutions); you never need to buy a huge amount more storage than you actually need; you keep buying today’s spec of hardware and reap the capacity and performance improvements; and every time you add a storage server you increase the number of clients you can serve (there are no other components to scale).

If you think your project needs volumes, please put in a request or edit an existing request through the NeCTAR Dashboard.
