
Wednesday, February 27, 2013

Absolute beginners #RHelp

At the weekend I was thinking about posting here about how often, when I am trying to do simple things with R, I can't find the low-level information I need, and how the online R communities are too high-powered to give help at the level that I, and I'm sure others, need. I thought about calling for a new community to provide this sort of low-level help.

Then I thought, f**k it, I'll do it myself.





Click the graphs for a larger version. To try this for yourself, copy and paste the code into R, leaving out the > prompt at the start of each line.
Note: These are not real data, I made them up to show the method.

The anatomy of a fish species depends on its diet. In a practical class, students examined two species of fish, mackerel (Scomber scombrus) and gurnard (Chelidonichthys cuculus). They made various measurements, including the standard length (from tip of snout to end of body) and the length of the intestine. The class pooled their data. Individual fish of each species varied in length, so the students calculated the ratio of the length of the intestine to the standard length for each fish. Using R:

> mackerel <- c(0.278, 0.389, 0.292, 0.268, 0.277, 0.364, 0.362, 0.217, 0.375, 0.338, 0.368)

> gurnard <- c(0.655, 0.702, 0.595, 0.667, 0.705, 0.687, 0.715, 0.656, 0.636, 0.701, NA)

> fish.data <- data.frame(mackerel, gurnard)

# No need to attach(fish.data): mackerel and gurnard already exist as vectors in the workspace.

> summary(fish.data)
    mackerel         gurnard     
 Min.   :0.2170   Min.   :0.5950 
 1st Qu.:0.2775   1st Qu.:0.6552 
 Median :0.3380   Median :0.6770 
 Mean   :0.3207   Mean   :0.6719 
 3rd Qu.:0.3660   3rd Qu.:0.7017 
 Max.   :0.3890   Max.   :0.7150 
                  NA's   :1      

> boxplot(fish.data)

Boxplot

# This is a comment line - I can write notes here to remind me what I've done.
# Let's colour the graph in so that it's easier to see.
# What colours can R use?

> colors()
[1] "white" "aliceblue" "antiquewhite" "antiquewhite1" "antiquewhite2"
[6] "antiquewhite3" "antiquewhite4" "aquamarine" "aquamarine1" "aquamarine2"
# Output truncated here - try it for yourself

# This is a scientific report, so don't use anything too bright.
# See "Why Should Engineers and Scientists Be Worried About Color?":
# http://www.research.ibm.com/people/l/lloydt/color/color.HTM
# But don't spend too long playing with colours :-)
# This might help:
# http://research.stowers-institute.org/efg/R/Color/Chart/
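colors() returns several hundred names, so scrolling through the whole list is slow. Here is a minimal sketch of narrowing it down, assuming you already have a rough idea of the colour family you want ("slate" is just an example search term, and the preview plot is my own suggestion rather than a standard tool):

```r
# List only the built-in colour names that contain "slate"
slate <- grep("slate", colors(), value = TRUE)
print(slate)

# Preview the matches as a row of equal-height coloured bars,
# with the colour names written vertically underneath (las = 2)
barplot(rep(1, length(slate)), col = slate, names.arg = slate,
        las = 2, axes = FALSE)
```

Any name printed by the first command can be passed to the col argument of a plotting function.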

> boxplot(fish.data, col="slategray2")

# This graph needs tidying up for presentation:

> boxplot(fish.data, ylab="Intestine length : standard length ratio", main="Ratio of intestine length to standard length", col="slategray2")

Boxplot

# Well, they look different.
# We could do statistical tests to see if they really are.
# But this post is about data visualization.

> stripchart(fish.data)

Stripchart

# Come on, R, you can do better than that.

> ?stripchart

> stripchart(fish.data, vertical=TRUE)

# Better.

> stripchart(fish.data, vertical=TRUE, method="jitter")

# Even better

> stripchart(fish.data, vertical=TRUE, method="jitter", pch=c(1, 2), main="Ratio of intestine length to standard length", ylab="Intestine length : standard length ratio")

Stripchart
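The boxplot and the strip chart show the same data, so one option is to draw the individual points on top of the boxes. This is only a sketch of that idea, not part of the walk-through above: it relies on the add argument of stripchart(), and the seed value is arbitrary (the jitter is random, so fixing a seed just makes the picture repeatable).

```r
# Same made-up ratios as above, repeated so this block runs on its own
mackerel <- c(0.278, 0.389, 0.292, 0.268, 0.277, 0.364, 0.362, 0.217, 0.375, 0.338, 0.368)
gurnard  <- c(0.655, 0.702, 0.595, 0.667, 0.705, 0.687, 0.715, 0.656, 0.636, 0.701, NA)
fish.data <- data.frame(mackerel, gurnard)

set.seed(1)  # arbitrary seed so the jittered points land in the same place each run

# Draw the boxes first and keep the statistics that boxplot() returns
bp <- boxplot(fish.data, col = "slategray2")

# add = TRUE draws the points onto the existing plot instead of starting a new one
stripchart(fish.data, vertical = TRUE, method = "jitter", pch = 1, add = TRUE)

# Number of non-missing values plotted for each species
bp$n
```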

# You can have more than one graph window open at a time if you want:
# windows()    on Windows
# quartz()     on a Mac
# dev.new()    on any OS (opens a new window using the default graphics device)

# I'd also like to plot these data as a barplot with standard deviation error bars to show the variation in the data.

> summary(fish.data)
    mackerel         gurnard    
 Min.   :0.2170   Min.   :0.5950
 1st Qu.:0.2775   1st Qu.:0.6552
 Median :0.3380   Median :0.6770
 Mean   :0.3207   Mean   :0.6719
 3rd Qu.:0.3660   3rd Qu.:0.7017
 Max.   :0.3890   Max.   :0.7150
                  NA's   :1     

# Make a vector of the mean values to plot:

> means <- c(0.3207, 0.6719)

# Calculate the standard deviations for the error bars:

> sd(mackerel, na.rm=TRUE)
[1] 0.05640761

> sd(gurnard, na.rm=TRUE)
[1] 0.03756313

# Make a vector of the standard deviations to plot as error bars:

> error.bars <- c(0.05640761, 0.03756313)
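Typing the numbers in by hand works, but it is easy to mistype a digit. As a sketch of an alternative, the same values can be calculated straight from the data frame. The names means.calc and error.bars.calc are my own, chosen so they don't overwrite the vectors above:

```r
# Same made-up ratios as above, repeated so this block runs on its own
mackerel <- c(0.278, 0.389, 0.292, 0.268, 0.277, 0.364, 0.362, 0.217, 0.375, 0.338, 0.368)
gurnard  <- c(0.655, 0.702, 0.595, 0.667, 0.705, 0.687, 0.715, 0.656, 0.636, 0.701, NA)
fish.data <- data.frame(mackerel, gurnard)

# Mean of each column, ignoring the missing gurnard value
means.calc <- colMeans(fish.data, na.rm = TRUE)

# Standard deviation of each column; sapply() applies sd() to every column in turn
error.bars.calc <- sapply(fish.data, sd, na.rm = TRUE)

round(means.calc, 4)
round(error.bars.calc, 4)
```

Rounded to four decimal places, these agree with the means from summary() and the standard deviations from sd() above, and either vector can be used in place of the hand-typed one.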

> SD.graph <- barplot(means, ylim=c(0,max(means)+max(error.bars)))

# This plots the graph. The top of the y-axis is set with ylim so that there is room for the tallest bar plus the longest error bar. You can change it by giving ylim different values.

> arrows(SD.graph, means-error.bars, SD.graph, means+error.bars, code=3, angle=90, length=.1)

# This adds the error bars
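The error bars line up with the bars because barplot() returns the x-positions of the bar midpoints, and those positions are what gets passed to arrows(). A small sketch to make that visible, using the same values as above (mids is just my name for the returned positions):

```r
means <- c(0.3207, 0.6719)
error.bars <- c(0.05640761, 0.03756313)

# barplot() draws the bars and hands back one x-position per bar
mids <- barplot(means, ylim = c(0, max(means) + max(error.bars)))
as.vector(mids)

# Each "arrow" runs vertically from mean - SD to mean + SD at a bar's midpoint.
# code = 3 puts a head on both ends, angle = 90 turns the heads into flat caps,
# and length sets the size of the caps in inches.
arrows(x0 = mids, y0 = means - error.bars,
       x1 = mids, y1 = means + error.bars,
       code = 3, angle = 90, length = 0.1)
```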
# Now a decent version for presentation:

> SD.graph <- barplot(means, main="Mean ratio of intestine length to standard length", names.arg=c("Mackerel", "Gurnard"), ylab="Intestine length : standard length ratio", ylim=c(0,max(means)+max(error.bars)))

> arrows(SD.graph, means-error.bars, SD.graph, means+error.bars, code=3, angle=90, length=.1)

Barplot




