SUBJECT

Title

Biostatistics

Type of instruction

practical

Level

master

Part of degree program
Credits

4

Recommended in

Semester 3

Typically offered in

Autumn semester

Course description

1. Statistical estimation (I Miklos) Definitions of the terms describing the quality of estimates: unbiasedness, consistency, robustness, efficiency, power. Central tendency and dispersion. Maximum likelihood estimation with some biological examples. Bayesian statistics. Biological examples of prior distributions.

2. Generalized linear models (J Garay) Continuous functions describing the effect of one or more factors. Methods for the construction of such functions. Linear models, generalized linear models, measurement of goodness of fit, residuals. Linear models for continuous data with constant variance. Linear regression in one or more dimensions. Nonlinear regression.

3. ANOVA and its extensions (J Garay) Analysis of variance with one or more factors. Model I ANOVA with single and double classification. Model II ANOVA, single classification. Repeated-measures ANOVA.

4. Nonparametric methods (J Garay) Statistics for variables that follow neither the normal nor the exponential distribution. Hypothesis testing based on nominal and ordinal scale variables. Chi-square methods and applications (goodness of fit, homogeneity testing, testing independence of variables). Rank statistics (Wilcoxon test, Kruskal-Wallis test, Spearman rank correlation, Kendall test of independence). Tests for the comparison of two variables (Kolmogorov-Smirnov test, binomial test).

5. Monte Carlo methods in biology (J Podani, I Miklos) Exact, permutation, and randomization tests. The jackknife and the bootstrap. Computer-intensive methods. Biological applications.

6. Fundamentals of geostatistics (J Podani) Descriptive spatial statistics: correlation, covariance, regression. Spatial autocorrelation and its significance in ecology. Selected statistics, e.g. Moran's I. Semivariograms in the univariate and multivariate cases. Kriging. Moving window methods.

7. Information theory in biology (J Podani) Short history of information theory. Definition of information. Entropy and mutual information, with emphasis on binary cases. Information-theoretic distances between distributions. Measurement of ecological diversity using information theory.

8. Analysis of spatial pattern (J Podani) Selected point processes. Applications of the quadrat method. The priority of space series. Transects and blocks. Computer-simulated sampling. Multivariate extensions: Juhasz-Nagy's family of methods, individual-centered analysis.

9. Graph theory with biological applications (J Podani and F Jordan) Relations and graphs. Definition of graphs, main types. Applications: food webs, neuronal networks, and the internet. Scale-free networks. Evaluation of network structures. Trees: definitions, biological applications in evolutionary biology.

10. Multivariate data analysis in biology (J Podani) The importance of multivariate data exploration in ecology, taxonomy, and biology in general. Data types, data space, distance measures. Hierarchical and nonhierarchical clustering methods. Ordination: metric methods. Discriminant analysis.

11. Introduction to stochastic modeling, with emphasis on cellular automata (A Kun) Fundamentals of stochastic modeling. Applicability and objectives. Some selected simple models in biology, and the problems of interpretation for non-modelers. Spatially explicit models. Cellular automata and their applications. Computer lab for practising simpler models.

12. Abundance distributions (J Izsak) Methodological introduction. Poisson process, mixture distributions, negative binomial distribution, logarithmic distribution, lognormal distribution, gamma distribution. Earlier observations, historical facts, the work of Corbet, Fisher and Preston. Classical abundance distributions. Poisson lognormal distribution. Stochastic and logistic models, the Gompertz model. Broken stick models, exponential functions, Zipf and Mandelbrot distributions. Large number of rare species. Relationship between abundance distributions and ecological diversity.

13. Numerical methods in ecology (A Kun) Applications of numerical methods, with special emphasis on biology. Limits and errors. Methods for generating pseudorandom numbers. Numerical integration. Numerical solution of differential equations. Numerical solution of linear and nonlinear equations. Minima and maxima of functions.

14. Importance sampling (I Miklos) Definition of importance sampling, examples. Estimating the variance of importance sampling. Superefficient sampling. Estimating the partition function with importance sampling.

15. Markov chain Monte Carlo (MCMC) simulation (I Miklos) Convergence of Markov chains. The Metropolis-Hastings algorithm. Gibbs sampling. Parallel tempering and Metropolis-coupled Markov chain Monte Carlo. Biological applications.

Readings
  • János Podani: Introduction to the exploration of multivariate biological data, Backhuys Publishers, 2000

  • János Podani: Multivariate data analysis in ecology and systematics: a methodological guide to the SYN-TAX 5.0 package, SPB Academic Pub., 1994