Computational analysis of high-throughput proteomics and metabolomics data

Modern biomedicine, shaped by novel and complex experimental methods, generates data sets so large that traditional computational tools can no longer analyze them. Progress in biomedicine now depends substantially on advanced computational methods, which has made bioinformatics a key area of biomedical research and technology. Conversely, biomedicine opens up novel problem areas for computational scientists.

The course gives an overview of modern developments in the computational analysis of mass spectrometry and structural data in proteomics and metabolomics, as well as the integration of these data sets with next-generation sequencing (NGS) information. Proteomics and metabolomics approaches address a variety of biomolecular questions and support drug design.


Major topics covered are:

1. Functional cell biology – mass spectrometry methods

a. metabolomics

b. proteomics

c. SILAC approach in proteomics

2. Functional cell biology – structural methods and protein complexes

a. small molecules

b. X-ray crystallographic structures

c. electron microscopy (EM) maps

d. nuclear magnetic resonance (NMR) data

3. Analysis of mass spectrometry data: isotope patterns, protein-protein interactions and protein modifications

4. SILAC data and MaxQuant in a pipeline for MS/MS identification of peptides: quality control (QC) of MaxQuant results via isotope patterns of fragments of identified proteins

5. MultiScaling and clustering of protein-protein associations (BioGRID database)

6. Approaches to integration of heterogeneous data types

7. Drug design: structure-activity relationship (SAR) of small molecules

8. Local similarity of 3D structures

9. Docking of ligands

10. 3D structure of protein complexes

11. XCMS in a pipeline: retention time (RT) warping, QC of XCMS data, and artifact correction

12. Metabolite annotations via MS/MS spectra and isotope patterns
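Isotope patterns recur in topics 3, 4 and 12. As a rough illustration only, the sketch below predicts the relative intensities of the M, M+1, M+2, ... peaks of a molecule from its elemental composition by repeatedly convolving per-element isotope distributions. It works at nominal-mass resolution, uses rounded natural abundances, and the function names (convolve, isotope_pattern) are made up for this example. It is not how MaxQuant or XCMS compute or score isotope patterns.

```python
# Approximate natural isotope abundances, indexed by the number of extra
# neutrons relative to the lightest isotope (index 0). Rounded values,
# intended for illustration only.
ABUNDANCE = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}


def convolve(a, b, max_peaks):
    """Discrete convolution of two distributions, truncated to max_peaks bins."""
    out = [0.0] * min(len(a) + len(b) - 1, max_peaks)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out


def isotope_pattern(formula, max_peaks=6):
    """Return probabilities of the M, M+1, ... peaks for an elemental formula.

    formula maps element symbols to atom counts, e.g. {"C": 6, "H": 12, "O": 6}.
    """
    pattern = [1.0]
    for element, count in formula.items():
        for _ in range(count):
            pattern = convolve(pattern, ABUNDANCE[element], max_peaks)
    return pattern


# Example: glucose, C6H12O6
glucose = {"C": 6, "H": 12, "O": 6}
pattern = isotope_pattern(glucose)
for k, p in enumerate(pattern):
    print(f"M+{k}: {p:.5f}")
```

Comparing such a predicted pattern with the observed peak intensities is one simple way to sanity-check a proposed peptide or metabolite formula. A real pipeline would also account for exact masses, charge states and instrument resolution.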




1. Simon B, Madl T, Mackereth CD, Nilges M, Sattler M (2010) An efficient protocol for NMR spectroscopy-based structure determination of protein complexes in solution. Angewandte Chemie (International ed in English) 49: 1967–1970.

2. Brooks BR, Brooks CL, Mackerell AD, Nilsson L, Petrella RJ, et al. (2009) CHARMM: the biomolecular simulation program. J Comput Chem 30: 1545–1614.

3. Lawson CL, Baker ML, Best C, Bi C, Dougherty M, et al. (2011) EMDataBank.org: unified data resource for CryoEM. Nucleic Acids Res 39: D456–D464.

4. Lasker K, Sali A, Wolfson HJ (2010) Determining macromolecular assembly structures by molecular docking and fitting into an electron density map. Proteins 78: 3205–3211.

5. Lasker K, Topf M, Sali A, Wolfson HJ (2009) Inferential optimization for simultaneous fitting of multiple components into a CryoEM map of their assembly. J Mol Biol 388: 180–194.

6. Case DA, Cheatham TE 3rd, Darden T, Gohlke H, Luo R, et al. (2005) The Amber biomolecular simulation programs. J Comput Chem 26: 1668–1688.

7. Hess B, Kutzner C, van der Spoel D, Lindahl E (2008) GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory Comput 4: 435–447.

8. Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G (2006) XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem 78: 779–787.

9. Cox J, Mann M (2008) MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 26: 1367–1372.

10. de Godoy LM, Olsen JV, Cox J, Nielsen ML, Hubner NC, Fröhlich F, Walther TC, Mann M (2008) Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast. Nature 455: 1251–1254.

11. Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, Mann M (2011) Andromeda: a peptide search engine integrated into the MaxQuant environment. J Proteome Res 10: 1794–1805.

12. Cox J, Mann M (2009) Computational principles of determining and improving mass precision and accuracy for proteome measurements in an Orbitrap. J Am Soc Mass Spectrom 20: 1477–1485.