Monday, March 16, 2009

Joining the dots

On Friday we had an informal meeting to discuss statistics teaching at UoL (the background to this is here). The anticipated tweet stream didn't materialize - apologies to anyone who waited up :-)
Several departments were represented. This is my recollection of the discussion, but if you were there, feel free to contribute (and if you weren't, feel free to contribute :-)

The consensus was that there is a lack of understanding of the process of statistical analysis. Students (and staff) seek quick-fix, single-number "answers", which do not align with the important distinction between exploratory data analysis and confirmatory data analysis. Tukey emphasized this distinction, believing that much statistical methodology placed too great an emphasis on the latter:
"Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise." J.W.Tukey 1962, The future of data analysis. Annals of Mathematical Statistics 33(1): 1–67.
Moreover, students do not retain what they have learned and cannot apply the knowledge. The problems are compounded by underinvestment in staff and curriculum time relative to the expected outputs. This seems to be a general problem across the campus.

Over the weekend I formulated my future teaching plans in this area, which are now up for discussion:

Year 1:
Open Office: data manipulation; statistics functions; graphs (including error bars).
Introduction to R: basic data handling in R; producing graphs; descriptive statistics.
The plan is that students will be given a "cheat sheet" for essential R commands; support materials will concentrate on statistical principles.

Year 2:
Statistics with R: Reprise of "working with R" basics from Year 1; exploratory data analysis & the normal frequency distribution; correlation; simple linear regression; t test and chi squared; standard deviation vs. standard error vs. confidence intervals.
Students will be given a "cheat sheet" for essential R commands and a statistical analysis decision tree; support materials will concentrate on statistical principles.
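
As a rough illustration of the "standard deviation vs. standard error vs. confidence intervals" item, the three quantities can be written side by side. The notation is an assumption for this sketch, not part of the planned materials: a sample of n independent, roughly normally distributed observations x_1, ..., x_n with mean x-bar.

$$
\begin{aligned}
s &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2} && \text{(standard deviation: spread of the data)}\\
\mathrm{SE} &= \frac{s}{\sqrt{n}} && \text{(standard error: precision of the mean)}\\
\text{95\% CI} &= \bar{x} \pm t_{0.975,\,n-1}\cdot \mathrm{SE} && \text{(confidence interval for the mean)}
\end{aligned}
$$

For example, with made-up values n = 25, x-bar = 50 and s = 10, the standard error is 10/5 = 2, and since t for 24 degrees of freedom is about 2.064, the 95% confidence interval is roughly 50 ± 4.13. Collecting more data shrinks the standard error and the interval, but not the standard deviation, which is the distinction students tend to miss.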

Year 3/Postgraduate:
Advanced statistics with R: ANOVA; covariance; multiple regression; Mann-Whitney U test.
This goes beyond what we currently teach and should equip students for most circumstances. However, this can only be achieved if staff and curriculum resources are made available.

Looks like we may be going cross-platform, open source ;-)