Proteomics | R@CMon

David Stroud, NHMRC Doherty Fellow and member of the Ryan Lab in the Department of Biochemistry and Molecular Biology, Monash University, does proteomics research and uses the MaxQuant quantitative proteomics software as part of his analysis workflows. MaxQuant is designed for processing high-resolution mass spectrometry data and is freely available on the Microsoft Windows platform. The first step in the workflow is to analyse samples using liquid chromatography-mass spectrometry (LC-MS) on a Thermo Orbitrap mass spectrometer. This step produces raw files containing spectra that represent thousands of peptides. The raw files are then loaded into MaxQuant to perform searches in which spectra are compared against a known list of peptides. A quantification step follows, enabling peptide abundance to be compared across samples. Once this process is complete, the resulting tab-delimited files are captured for downstream analysis.
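As a rough illustration of that last downstream step, the sketch below reads a small tab-delimited table and compares peptide abundance between two samples. The column names (`Peptide`, `Intensity control`, `Intensity treated`), the sample data and the `abundance_ratios` helper are all hypothetical stand-ins chosen for this example. They are not taken from David's workflow, and real MaxQuant output files carry many more columns.

```python
import csv
import io

# Hypothetical stand-in for a tab-delimited result file.
# Column names and values are illustrative assumptions, not a fixed schema.
SAMPLE_TSV = (
    "Peptide\tIntensity control\tIntensity treated\n"
    "AAAGK\t1000\t4000\n"
    "LLVDR\t2500\t2500\n"
    "TTPEK\t800\t0\n"
)


def abundance_ratios(handle, numerator, denominator):
    """Return {peptide: numerator / denominator}.

    Rows where either intensity is zero are skipped, since no
    meaningful ratio can be formed for them.
    """
    ratios = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        num = float(row[numerator])
        den = float(row[denominator])
        if num > 0 and den > 0:
            ratios[row["Peptide"]] = num / den
    return ratios


ratios = abundance_ratios(
    io.StringIO(SAMPLE_TSV), "Intensity treated", "Intensity control"
)
for peptide, ratio in ratios.items():
    print(f"{peptide}\t{ratio:.2f}")
```

To run this against a real file, the `io.StringIO` object would be swapped for an open file handle and the column names adjusted to match the actual headers.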

Inspection of results using the MaxQuant software.

MaxQuant searches are both CPU- and IO-intensive tasks. A typical search takes 24 to 48 hours, and in some cases up to a week, depending on the size of the raw files being processed. David has been running his workflow on his own machine with 8 cores, 16 gigabytes of memory (RAM) and a solid state drive (SSD) for storage, where a standard search takes 2 to 3 weeks to complete. Performing large MaxQuant searches on the local machine became a struggle, and David needed a bigger machine with a desktop environment to scale up his analysis workflow. The R@CMon team assisted David in deploying the MaxQuant software on the Monash node of the NeCTAR Research Cloud with an m1.xxlarge instance, spawned using the Monash-licensed Windows Server 2012 image. MaxQuant searches on the NeCTAR instance show a 3-4x speed-up compared to the local machine: what takes several weeks on the local machine now takes just several days on the NeCTAR instance.
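The "weeks to days" claim follows directly from the figures quoted above. A minimal back-of-the-envelope sketch, assuming the 2 to 3 week local runtime and the 3-4x speed-up apply uniformly to a whole search (the function and variable names are illustrative only):

```python
def accelerated_days(local_days, speedup):
    """Estimated runtime in days after applying a uniform speed-up factor."""
    return local_days / speedup


LOCAL_DAYS = (14, 21)  # 2 to 3 weeks on the local machine
SPEEDUPS = (3, 4)      # reported 3-4x speed-up on the NeCTAR instance

# Evaluate every combination to bracket the likely runtime.
estimates = [accelerated_days(d, s) for d in LOCAL_DAYS for s in SPEEDUPS]
best, worst = min(estimates), max(estimates)
print(f"{best:.1f} to {worst:.1f} days")
```

Under these assumptions a search lands somewhere between roughly half a week and one week, which matches the "several days" reported.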

MaxQuant search of Thermo RAW files.

The R@CMon team are currently working with David to explore further scaling options. The high-memory and PCIe SSD-enabled specialist kit on R@CMon Phase 2 can be exploited by MaxQuant for bursts of IO-intensive activity during searches. More on this coming soon!

The Proteome Browser (TPB) is a web portal that integrates human protein data and information. It provides an up-to-date view of the proteome (the entire library of proteins that can be expressed by cells or organisms – like us!) across large gene sets to support human proteome characterisation as part of the Chromosome-centric Human Proteome Project (C-HPP). Pertinent genomic and protein data from multiple international biological databases are assembled by TPB in a searchable format supporting C-HPP’s global proteomics effort.

TPB’s primary report of chromosome-ordered genes, visualised using a traffic-light colour system.

TPB’s framework extracts biological data from numerous sources, maps it onto the genome, and categorises the results based on quality and information content. The result (the level of evidence) is presented as a simple point matrix colour-coded with a traffic-light system: green for highly reliable evidence, yellow for reasonable evidence, red for some available evidence, and black for no available evidence. TPB uses hierarchical data types to group similar information from different experiment types.
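The colour coding described above can be pictured as a small data structure. The following is only a sketch: the four levels come from the description, but the gene names, data type names and the "strongest evidence wins" rule for filling a cell are assumptions made for illustration and do not reflect TPB's actual implementation.

```python
from enum import IntEnum


class Evidence(IntEnum):
    """Traffic-light evidence levels, ordered from weakest to strongest."""
    BLACK = 0   # no available evidence
    RED = 1     # some evidence is available
    YELLOW = 2  # reasonable evidence
    GREEN = 3   # highly reliable evidence


def strongest(levels):
    """Hypothetical rule: a cell shows the strongest evidence found,
    or BLACK when nothing has been observed."""
    return max(levels, default=Evidence.BLACK)


def build_matrix(observations):
    """Collapse raw observations into a {gene: {data_type: Evidence}} matrix."""
    matrix = {}
    for (gene, data_type), levels in observations.items():
        matrix.setdefault(gene, {})[data_type] = strongest(levels)
    return matrix


# Hypothetical genes and data types, for illustration only.
observations = {
    ("GENE_A", "mass spectrometry"): [Evidence.RED, Evidence.GREEN],
    ("GENE_A", "antibody"): [Evidence.YELLOW],
    ("GENE_B", "mass spectrometry"): [],
}

matrix = build_matrix(observations)
for gene, row in matrix.items():
    cells = ", ".join(f"{dt}={lvl.name.lower()}" for dt, lvl in row.items())
    print(f"{gene}: {cells}")
```

Using an ordered enumeration keeps the comparison between levels explicit, so a different aggregation rule (for example, weakest evidence wins) would only require changing `strongest`.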

TPB’s summary report for the chosen chromosome.

TPB is supported by Monash University, the Monash eResearch Centre (MeRC), the Chromosome-centric Human Proteome Project (C-HPP), the Australia/New Zealand Chromosome 7 Consortium and the Australian National Data Service (ANDS). Researchers are now using TPB in various proteomics-related discoveries.

The R@CMon cloud team recently helped migrate The Proteome Browser web service to the Monash node of the NeCTAR Research Cloud. TPB uses persistent storage (Volumes) granted via a VicNode computational storage allocation to house its underlying database. TPB’s new home will ensure it has stable and scalable long-term hosting supported by the NeCTAR and RDSI federal research infrastructure programmes.

Research @ Cloud Monash

MaxQuant Proteomic Searches on R@CMon

The Proteome Browser on R@CMon