Course Introduction

This course builds on the material presented in GPHY 247 (students who have taken STAT 263 or an equivalent course will find that background suitable). The course is organized around a set of modules, each taking about two weeks of course time and including both lectures and practical computer-based work. The modules are:

  • Resampling Statistics

Conventional parametric statistics are very powerful when our samples meet the demanding assumptions of randomness and independence and, often, when our variables are normally distributed. Resampling statistics are a useful alternative when the normality assumption fails or when our data are recorded at lower levels of measurement, such as ordinal and nominal scales.
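
As an illustration of the resampling idea, here is a minimal sketch of a two-sided permutation test for a difference in group means, written in plain Python. The function name `permutation_test`, its parameters, and the sample values are hypothetical and chosen only for this example; the course itself may use different software and procedures.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Approximate two-sided p-value for a difference in means.

    Group labels are shuffled repeatedly. The p-value is the share of
    shuffles whose absolute mean difference is at least as large as the
    observed one. No normality assumption is required.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two sites (illustrative values only).
site_a = [12, 15, 11, 19, 14, 13]
site_b = [22, 18, 25, 21, 24, 20]
print(permutation_test(site_a, site_b))
```

Because the test relies only on reshuffling the observed values, the same approach can be applied to ranks or other statistics suited to ordinal data by swapping out the `mean` function.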

  • Data Mining and Machine Learning

Data can be examined in many ways. Given the rapid growth of data, particularly data held in databases, it is useful to see how other approaches to analysis offer an alternative to statistical approaches. In this module, we will look at data mining tools as used in classification, association and clustering.

  • Regression Analysis

In this module we will extend linear regression analysis, introducing advanced topics including multiple regression, leverage, partial and multiple correlation, nonlinear relationships and validation.
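
To give a flavour of one of these topics, the sketch below computes leverage values for the simplest case, a regression with a single predictor, where the leverage of observation i is 1/n + (x_i - mean(x))^2 / sum((x_j - mean(x))^2). The function name and the data are hypothetical, and the multiple regression case treated in the module needs matrix methods that are not shown here.

```python
def simple_regression_leverage(x):
    """Leverage (hat value) of each observation in a one-predictor regression.

    Leverage depends only on the predictor values: points far from the
    mean of x have more potential to pull the fitted line toward them.
    """
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    return [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]


# Hypothetical predictor values; the last one lies far from the rest.
x_values = [1, 2, 3, 4, 20]
for xi, h in zip(x_values, simple_regression_leverage(x_values)):
    print(xi, round(h, 3))
```

With an intercept and one slope, the leverages sum to 2 (the number of fitted parameters), so a common rule of thumb flags observations whose leverage exceeds roughly twice the average value of 2/n.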

  • Miscellaneous Topics

The remainder of course time will be used for a variety of special topics, depending on interest and time. These may include spatial statistics and autocorrelation, linear programming, optimization problems, and other multivariate statistical techniques such as factor analysis.