More FLAIR to Fluid Mechanics via the Monash Research Cloud

Advanced research in engineering can often benefit from extra compute capacity, and this is where a research-oriented computational cloud like R@CMon comes in handy. Here we report on the use of cloud resources to augment the computing capacity available for large-scale fluid mechanics studies.

FLAIR (Fluids Laboratory for Aeronautical and Industrial Research), from the Department of Mechanical and Aerospace Engineering, Faculty of Engineering, has been conducting experimental and computational fluid mechanics research for over twenty years, focusing on fundamental fluid flow problems that impact the automotive, aeronautical, industrial and biomedical fields.

A key research focus in recent years has been understanding the wake dynamics of particles near walls. Particle-particle and wall-particle interactions are investigated using an in-house spectral-element numerical solver. Understanding these interactions matters across many engineering industries. In biological engineering, for instance, blood cells and leukocytes are modelled numerically as canonical bluff bodies (cylinders and spheres). Beyond biological cell transport, the same simulations have wider applications in mineral processing, chemical engineering and even ball sports. Because this research is both computationally and data intensive, securing sufficient computing resources has always been a challenge.

In particular, the project aims to understand the wake dynamics of multiple particles in scenarios such as rolling, collisions and vortex-induced vibrations, along with the mixing that results from these interactions. The group’s two- and three-dimensional fluid flow solver incorporates two-way body dynamics to model these effects. Because the studies sweep over multiple parameters, such as Reynolds number, body rotation and the height of the body above the wall, the total parameter space is extensive and requires significant computational resources. While the two-dimensional simulations are carried out on single processors, their three-dimensional counterparts require parallel processing, making NeCTAR nodes an ideal platform for these computations. Some of the visualisations from the group’s three-dimensional simulations are shown in Figures 1 and 2 below.
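
To give a flavour of what such a parameter study looks like in practice, the sketch below generates one queue submission script per combination of Reynolds number and wall gap. It is a minimal, hypothetical illustration only: the solver binary name, its flags and the scheduler directives are assumptions made for the example, not the group’s actual workflow.

    from itertools import product
    from pathlib import Path

    # Hypothetical sweep: one PBS job script per (Reynolds number, gap height) case.
    # The solver binary and queue directives below are placeholders.
    reynolds_numbers = [100, 200, 300, 400]
    gap_heights = [0.1, 0.25, 0.5, 1.0]   # body height above the wall, in body diameters

    for re, gap in product(reynolds_numbers, gap_heights):
        case = f"Re{re}_h{gap}"
        Path(case).mkdir(exist_ok=True)
        script = "\n".join([
            "#!/bin/bash",
            "#PBS -l nodes=1:ppn=1,walltime=48:00:00",
            f"#PBS -N {case}",
            "cd $PBS_O_WORKDIR",
            f"./wake_solver --reynolds {re} --gap {gap} --out {case}/results.dat",
            "",
        ])
        (Path(case) / "run.pbs").write_text(script)

Each generated script can then be handed to the scheduler (for example with qsub), so an entire sweep is queued with a single loop rather than by hand.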

Since 2008, the FLAIR team has been making good use of the Monash Campus Cluster (MCC), a high-performance/high-throughput heterogeneous system with over two thousand CPU cores. However, MCC is heavily utilised by researchers from across the university, and FLAIR users often found themselves waiting long periods before their fluid flow simulations could run. It became clear that FLAIR researchers needed additional computational resources.

R@CMon was able to secure a 160-core allocation for the FLAIR team, a valuable boost to the computing resources available to the group. Thanks to both NeCTAR and MCC-on-R@CMon, over one million CPU hours, distributed across 4,000 jobs, have now been provided for the project’s CPU-intensive calculations.

This work has resulted in a number of publications in the highest-impact fluid mechanics journals, with several more at the pre-submission stage; for example:
  • Rao, A., Thompson, M.C., & Hourigan, K. (2016) “A universal three-dimensional instability of the wakes of two-dimensional bluff bodies.” Journal of Fluid Mechanics, 792, 50–66.
  • Rao, A., Radi, A., Leontini, J.S., Thompson, M.C., Sheridan, J., & Hourigan, K. (2015) “A review of rotating cylinder wake transitions.” Journal of Fluids and Structures, 53, 2–14.
  • Rao, A., Radi, A., Leontini, J.S., Thompson, M.C., Sheridan, J., & Hourigan, K. (2015) “The influence of a small upstream wire on transition in a rotating cylinder wake.” Journal of Fluid Mechanics (published online), 769 (R2), 1–12.
  • Rao, A., Thompson, M.C., Leweke, T., & Hourigan, K. (2013) “The flow past a circular cylinder translating at different heights above a wall.” Journal of Fluids and Structures, 41, 9–21.
  • Rao, A., Passaggia, P.-Y., Bolnot, H., Thompson, M.C., Leweke, T., & Hourigan, K. (2012) “Transition to chaos in the wake of a rolling sphere.” Journal of Fluid Mechanics, 695, 135–148.

Figures 1 and 2: Visualisations from the group’s three-dimensional simulations.

MCC-on-R@CMon Phase 2 – HPC on the cloud

Almost a year ago, the Monash HPC team embarked on a journey to extend the Monash Campus Cluster (MCC), the university’s internal heterogeneous HPC workhorse, onto R@CMon and the wider NeCTAR Australian Research Cloud. This is an ongoing collaboration between the R@CMon architects and tech crew and the MCC team, which has a long-standing, strong engagement with the Monash research community. More recently, the journey has been enriched by close coordination with the MASSIVE team, which will enhance the sharing of technical artefacts and lessons learned between the two teams.

By September 2014, MCC-on-the-Cloud had grown to over 600 cores spanning three nodes of the Australian Research Cloud. Its size was limited only because the Research Cloud was full and awaiting a wave of new infrastructure. Even so, Monash researchers from Engineering, Science and FIT have collectively used over 850,000 CPU-core hours. Preferring the “MCC service”, they have offered their NeCTAR allocations to be managed by the MCC team rather than building clusters and installing software stacks themselves. From the researchers’ perspective, this has the twofold benefit of providing a user experience consistent with that of the dedicated MCC and freeing them from the burden of managing cloud instances, software deployment, queue management and so on.

Deploying a usable high-performance/high-throughput computing (HPC/HTC) service on the cloud poses many challenges. Users expect the robustness and guaranteed service availability typical of traditional clusters, and all this must be achieved despite the fluidity and heterogeneity of the cloud infrastructure and the nuances in service offerings across the Research Cloud nodes. For example, one user reported that jobs were cancelled by the scheduler because they exceeded the specified wall-time limits, and we subsequently discovered that some MCC “cloud” compute nodes were running on oversubscribed hosts (contrary to NeCTAR architecture guidelines). Nevertheless, we can declare that our efforts have paid off – MCC-on-the-Cloud is now operating and delivering the reliable HPC/HTC service, wrapped in the classic MCC look-and-feel, that Monash researchers have come to depend on. Despite the many challenges, we are convinced that this is a good way to drive the federation forward.
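
As an aside for other cloud operators: one simple way to notice this kind of oversubscription from inside a guest is to watch the CPU “steal” time the kernel reports in /proc/stat. The sketch below is an illustrative check only, not part of the MCC tooling; the five-second sampling window is an arbitrary choice.

    import time

    def steal_and_total():
        # First line of /proc/stat: "cpu user nice system idle iowait irq softirq steal ..."
        with open("/proc/stat") as f:
            fields = [int(x) for x in f.readline().split()[1:]]
        steal = fields[7] if len(fields) > 7 else 0
        return steal, sum(fields)

    s1, t1 = steal_and_total()
    time.sleep(5)
    s2, t2 = steal_and_total()

    steal_pct = 100.0 * (s2 - s1) / max(t2 - t1, 1)
    print(f"CPU steal over the last 5 s: {steal_pct:.1f}%")

A persistently high steal percentage means the hypervisor is giving the guest’s virtual CPUs less time than they are asking for, which is exactly what an oversubscribed host looks like from inside the guest.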

Now, with R@CMon Phase 2 coming online, we have taken a step closer towards realising this aim of “high-performance” computing on the cloud. Equipped with Intel Ivy Bridge Xeon processors, the R@CMon Phase 2 hardware stands out amidst the commodity hardware found on most other NeCTAR nodes. These specialist servers are already proving invaluable for floating-point-intensive MPI applications. In production runs of a three-dimensional spectral-element code, we observed nearly double the performance on these Xeons compared with the AMD Opteron nodes found across most of the rest of the cloud, even with hyper-threading enabled. By pinning the guest vCPUs to a range of hyper-threaded cores on the host, we achieved a further 50% performance improvement, effectively more than a 2.6x improvement over the “commodity” AMD nodes. We look forward to implementing this vCPU pinning feature once it is natively supported in OpenStack Juno, the next version to be deployed on the Research Cloud.
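
For readers curious what vCPU pinning involves, the sketch below shows how it can be done by hand at the libvirt level using the libvirt-python bindings. The instance name and the host cores chosen are placeholders for illustration, not our production configuration; once the feature lands natively in OpenStack, the same effect should be achievable through OpenStack itself rather than by reaching down to libvirt.

    import libvirt

    # Minimal sketch: pin each vCPU of a running guest to one host core.
    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("mcc-compute-01")   # placeholder instance name
    host_cpus = conn.getInfo()[2]               # number of physical CPUs on the host

    # Map vCPU 0 -> host core 8, vCPU 1 -> host core 9, and so on (illustrative values).
    for vcpu, pcpu in enumerate([8, 9, 10, 11]):
        cpumap = tuple(i == pcpu for i in range(host_cpus))
        dom.pinVcpu(vcpu, cpumap)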

Figure: Measured performance improvement with a production 3D spectral-element code. R@CMon Phase 1: AMD Opteron 6276 @ 2.3 GHz; R@CMon Phase 2: Intel Xeon E5-4620v2 @ 2.6 GHz.

Thus, our journey continues… Once RDMA (Remote Direct Memory Access) is enabled on Phase 2, accelerated networking will make it feasible to run large-scale, multi-host MPI workloads. Achieving this will take us even closer to a truly high-performance computing environment on the cloud. Look out for MCC science stories and infrastructure updates soon!