Tuesday, November 13, 2007

Putting on the style: How bloggers write

You know how it is. It's 3am when your eyelids clang open and you know instantly you've got absolutely no chance of getting back to sleep. So you get up, make a cup of tea and fire up the computer.

Pretty soon you've checked your email and you're reading your RSS feeds and then it hits you, the most cunning plan that ever won first prize for cunning in an international cunning competition.

You can use FLESH to analyze the writing styles of a sample of your favourite edubloggers. All you need to do is take the first 1000 words each of them published on their blog in October 2007 (text only, no blog hardware or comments), run it through FLESH to calculate the Flesch Reading Ease Score and Flesch-Kincaid Grade Level, do a quick bit of statistical analysis and bung it off to a journal.

Tea finished, back to bed and sleep peacefully. But when you wake up in the morning, the cunning plan doesn't seem so ... cunning.

Author Flesch Reading Ease Flesch-Kincaid Grade Level

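The Flesch Reading Ease formula uses only two variables, the number of syllables and the number of sentences for each 100-word sample:

Reading Ease = 206.835 - (1.015 × average sentence length in words) - (84.6 × average syllables per word)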

It predicts reading ease on a scale from 1 to 100, with 30 being "very difficult" and 70 being "easy." Flesch wrote that a score of 100 indicates reading matter understood by readers who have completed the fourth grade and are, in the language of the U.S. Census, barely "functionally literate." Flesch compared the reading scores of popular magazines with other variables:
Flesch Reading Ease

The Flesch-Kincaid Grade Level is an index that gives the years of (U.S.) education required to comprehend a document. For example, a document with a Flesch-Kincaid Grade Level score of 10 would require that a reader have about 10 years (or a 10th grade level) of education to comprehend the document. It can be calculated using the equation:

Flesch-Kincaid Grade Level = (0.39 × average sentence length in words) + (11.8 × average syllables per word) - 15.59
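For anyone who wants to try this at home, here is a minimal Python sketch of both scores, using the standard published coefficients. It is not how FLESH or MS Word do it: the function names, the vowel-group syllable counter and the punctuation-based sentence splitter are all my own assumptions, and real tools use more careful rules.

```python
import re


def count_syllables(word):
    """Rough syllable estimate (an assumption, not FLESH's method):
    count runs of consecutive vowels, with a small silent-e correction."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent (as in "write"), but not in "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for text."""
    # Naive sentence split on ., ! and ? (abbreviations will be miscounted).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    avg_sentence_length = len(words) / len(sentences)  # words per sentence
    avg_syllables_per_word = syllables / len(words)

    reading_ease = (206.835
                    - 1.015 * avg_sentence_length
                    - 84.6 * avg_syllables_per_word)
    grade_level = (0.39 * avg_sentence_length
                   + 11.8 * avg_syllables_per_word
                   - 15.59)
    return reading_ease, grade_level
```

Call `readability(sample)` on a chunk of plain text and you get the two scores back as a pair. Since both depend entirely on how syllables and sentences get counted, two tools can quite plausibly disagree on the same sample.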

Something funny with Cann1 - no matter how many times I check it, FLESH always gives a weird result for that sample. As for the rest of it - what do the numbers mean? No idea. I suppose that edubloggers' writing styles (on this rather small sample) are somewhere between "easy" and "very difficult". The scores are roughly normally distributed. There's a negative correlation between the Reading Ease scores and the Flesch-Kincaid Grade Level but it's not statistically significant.

So you sigh, decide against the idea of trying to stretch it out for a journal article and think, sod it, I'll just blog it.

Update: Thanks to Tony for sending me this additional link. This site in particular seems to confirm the opinion I've formed over the past 24 hours that readability indices are oversimplified nonsense.


  1. Nice - although as you say, I have no idea what it signifies. Should I be pleased that I am easy to read and anyone with 9 years of education could understand what I write? Is this evidence of dumbing down or of writing clearly? I don't know!
    And why do these ideas always come in the middle of the night?
    It might be nice to compare these with some 'formal' publications by the same people.

  2. Rudolf Flesch would be delighted that a 9-year-old could understand your blog, Martin ... except that I have some doubt about the validity of these numbers. They are based on a 1000-word sample, so maybe I could improve the reliability by increasing the sample size, but I have my doubts about the FLESH application - it does seem to throw up some odd numbers. Originally it was my intention to compare samples of online and peer-reviewed writing from the same authors, but based on these results, I don't know if it's worth spending any more time on it.
    What questions need to be asked?

  3. Apart from the strange Cann1 outlier, I seem to be at the extreme. I had expected, in such august company, to be at the tabloid end, but no, mine was 'difficult', at the "High School or some college level".

    I wasn't convinced that this tool is working correctly. However, I've just taken the first paragraph of my last 4 posts in October and run a readability check in MS Word. I got scores of 35.4 and 12.0, compared with your scores (for all of October) of 39.45 and 13.60.

    However, when I was an Information Officer many years ago, I used and then discarded this tool as I found it didn't seem to correlate with my views on readability.

    For info the data I analysed is given below.

    Brian Kelly, UKOLN

    "I attended a meeting recently at which a civil servant introduced a report which he was summarising as ‘exciting’. I had to stifle a yawn, thinking that what might be exciting for a civil servant would probably be very dull and boring. But I was wrong - the report on “The Power Of Information” is of much interest to those of us (and I include many readers of this blog) with an interest in promoting open access to information."

    "What is the UK’s newest university? I thought that it was probably Edge Hill University. But I recently discovered that the University of Central England is now BCU - Birmingham City University. I’m assuming this is the UK’s newest university."

    "I recently suggested that the English secretly prefer being failures, as we enjoy complaining about our failures and belittling the vulgarities of those who are successful, and that, while this is particularly true in the sporting field, in IT and Web development we find it easier to criticise successful services rather than to exploit their successes."

    "I suspect many of my peers who make their content available under a Creative Commons licence have, like me, chosen an Attribution, Non-commercial ShareAlike licence, which permits the content to be reused for non-commercial purposes provided acknowledgements are given and the same rights are applied to the derived materials."

  4. Yes, I'd agree Brian - the FLESH tool does seem to throw up some odd results at times. However, my main misgivings about readability metrics are deeper.