Computational Biology

Computational Biology (for first degree students)

Modern biomedicine, shaped by novel, complex experimental methods, is
generating massive data sets that can no longer be analyzed with traditional
computational tools. Progress in biomedicine now depends substantially
on advanced computational methods, which has turned bioinformatics into a key
area of biomedical research and technology. In turn, biomedicine
opens up novel problem areas for computational scientists.

The course gives an overview of modern developments in biomedicine,
"omics" (genomics/transcriptomics/proteomics/metabolomics) data types, and
computational approaches to the analysis and integration of these data.
Computational applications can be technological in nature, e.g. the development
of methods for the analysis of deep sequencing data or of high-density
oligonucleotide microarrays as used for genome-wide association studies, or
they can address concrete biomolecular questions, as in computer modeling
oriented toward drug design, systems biology, and population genetics.

Major topics covered are:

Functional genomics: an overview of high-throughput omics data:
  DNA microarrays: expression, tiling, SNP
  Next-generation sequencing (NGS): DNA-seq, RNA-seq, ChIP-seq
  Mass spectrometry: metabolomics and proteomics
Microarray experiments: design, normalization, statistical issues
Genome assembly from short reads
Analysis of transcriptome data: mapping of reads, detection of transcribed isoforms, de novo transcriptome assembly
Analysis of whole-genome data (RNA/DNA editing, epigenetic patterns, transcription factor binding sites): motif discovery, signal peaks across the genome
Haplotype reconstruction and haplotype frequency estimation
Proteomics and metabolomics: isotope patterns, protein-protein interactions and protein modifications
Downstream analysis of high-throughput biological data: multivariate and factor analysis, ANOVA and regression methods, clustering, discrimination
Integration of physiological/clinical measurements and omics data
Systems biology approaches: modeling genetic, genomic, and biochemical networks
Autoregulation and multistability in biological systems: bifurcations and chaos in biochemical processes
Mutations and adaptive evolution

Bibliography:
Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, Sharpe T, Hall G, Shea TP, Sykes S, Berlin AM, Aird D, Costello M, Daza R, Williams L, Nicol R, Gnirke A, Nusbaum C, Lander ES, Jaffe DB: High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A 2011, 108:1513-1518.
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I: ABySS: A parallel assembler for short read sequence data. Genome Res 2009, 19:1117-1123.
Simpson JT, Durbin R: Efficient de novo assembly of large genomes using compressed data structures. Genome Res 2012, 22:549-556.
Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, Li S, Yang H, Wang J, Wang J: De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 2010, 20:265-272.
Blankenberg, D., N.Coraor, G.Von Kuster, J.Taylor, and A.Nekrutenko. 2011. Integrating diverse databases into an unified analysis framework: a Galaxy approach. Database (Oxford) 2011:bar011.
Bock, C., G.Von Kuster, K.Halachev, J.Taylor, A.Nekrutenko, and T.Lengauer. 2010. Web-based analysis of (Epi-) genome data using EpiGRAPH and Galaxy. Methods Mol. Biol. 628:275-296.
Albert, R. 2005. Scale-free networks in cell biology. J. Cell Sci. 118:4947-4957.
Christensen, C., J.Thakar, and R.Albert. 2007. Systems-level insights into cellular regulation: inferring, analysing, and modelling intracellular networks. IET. Syst. Biol. 1:61-77.
Kugel, J.F., and J.A.Goodrich. 2012. Non-coding RNAs: key regulators of mammalian transcription. Trends Biochem. Sci.
Loots, G.G. 2008. Genomic identification of regulatory elements by evolutionary sequence comparison and functional analysis. Adv. Genet. 61:269-293.
Mortazavi, A., B.A.Williams, K.McCue, L.Schaeffer, and B.Wold. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5:621-628.
Zhang, Y., T.Liu, C.A.Meyer, J.Eeckhoute, D.S.Johnson, B.E.Bernstein, C.Nusbaum, R.M.Myers, M.Brown, W.Li, and X.S.Liu. 2008. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9:R137.
Grabherr M.G., Haas B.J. et al. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29(7):644-652.
Ramaswami G., Zhang R. 2013. Identifying RNA editing sites using RNA sequencing data alone. Nat. Methods 10(2):128-132.
J.D.Murray, Mathematical Biology, 1989
D.Wilkinson, Stochastic Modelling for Systems Biology, Chapman & Hall/CRC, 2006
E Klipp, R Herwig, A Kowald, C Wierling, and H Lehrach. Systems Biology in Practice. Wiley-VCH: 2005
Z. Szallasi, J. Stelling, and V.Periwal (eds.) System Modeling in Cellular Biology: From Concepts to Nuts and Bolts, MIT Press: 2006
B Palsson, Systems Biology – Properties of Reconstructed Networks. Cambridge University Press: 2006
U Alon. An Introduction to Systems Biology: Design Principles of Biological Circuits. CRC Press: 2006
Wilke C.O. 2001. Adaptive evolution on neutral networks. Bulletin of Mathematical Biology 63(4):715-730.