DESCRIPTION OF COURSES

AS 608 ADVANCED BIOINFORMATICS (2L+1P) III
(Pre-requisite: AS 571)

Objectives
This is a course on Bioinformatics that aims at exposing the students to some advanced statistical and computational techniques related to bioinformatics. This course would prepare the students in understanding bioinformatics principles and their applications.

Theory

UNIT I
Genomic databases and analysis of high-throughput data sets. Analysis of DNA sequence, sequence annotation, ESTs, SNPs. BLAST and related sequence comparison methods. EM algorithm and other statistical methods to discover common motifs in biosequences. Multiple alignment and database search using motif models, ClustalW and others. Concepts in phylogeny. Gene prediction based on codons, decision trees, classificatory analysis, neural networks, genetic algorithms, pattern recognition, Hidden Markov models.

UNIT II
Computational analysis of protein sequence, structure and function. Modelling protein families. Expression profiling by microarray/gene chip, proteomics, etc. Multiple alignment of protein sequences. Modelling and prediction of structure of proteins. Designer proteins. Drug designing.

UNIT III
Markov Chains (MC with no absorbing states, higher order Markov dependence, patterns in sequences, Markov Chain Monte Carlo – Hastings-Metropolis algorithm, simulated annealing, MC with absorbing states). Bayesian techniques and use of Gibbs Sampling. Advanced topics in design and analysis of DNA microarray experiments.

UNIT IV
Computationally intensive methods (classical estimation methods, Bootstrap estimation and confidence intervals, hypothesis testing, multiple hypothesis testing). Evolutionary models (models of nucleotide substitution). Phylogenetic tree estimation (distances, tree reconstruction - ultrametric and neighbor-joining cases, surrogate distances, tree reconstruction, parsimony and maximum likelihood, modeling, estimation and hypothesis testing). Neural Networks (universal approximation properties, priors and likelihoods, learning algorithms - back propagation, sequence encoding and output interpretation, prediction of protein secondary structure, prediction of signal peptides and their cleavage sites, application for DNA and RNA nucleotide sequences). Analysis of SNPs and haplotypes.

Practicals
Genomic databases and analysis of high-throughput data sets. BLAST and related sequence comparison methods. Statistical methods to discover common motifs in biosequences. Multiple alignment and database search using motif models, ClustalW, classificatory analysis, neural networks, genetic algorithms, pattern recognition, Hidden Markov models. Computational analysis of protein sequence. Expression profiling by microarray/gene chip, proteomics. Modelling and prediction of structure of proteins. Bayesian techniques and use of Gibbs Sampling. Analysis of DNA microarray experiments. Analysis of one DNA sequence, multiple DNA or protein sequences. Computationally intensive methods, multiple hypothesis testing. Phylogenetic tree estimation. Analysis of SNPs and haplotypes.

Suggested Readings