Thursday, April 24, 2008

There's no such thing as average

User-generated content follows a long-tail distribution.

So what?

The "average user" (or student) is mythical: participation inequality rules online.
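To see why the mean misleads here, consider a minimal sketch with made-up numbers (not data from the paper below). It assumes a Zipf-like pattern in which the user at rank r contributes roughly 1/r as much as the top user; the function names and the figures of 1,000 users and 1,000 items are hypothetical choices for illustration only.

```python
from statistics import mean, median


def zipf_contributions(n_users=1000, top=1000):
    """Synthetic counts (an assumption): the user at rank r produces about top / r items."""
    return [max(1, round(top / r)) for r in range(1, n_users + 1)]


def top_share(counts, fraction=0.1):
    """Share of all items produced by the most active `fraction` of users."""
    ordered = sorted(counts, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


counts = zipf_contributions()
print("mean items per user:  ", mean(counts))
print("median items per user:", median(counts))
print("share from top 10%:   ", top_share(counts))
```

In this toy population the mean sits well above the median, and the most active tenth of users accounts for more than half of everything produced, so the "average" figure describes almost nobody.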

Ochoa, Xavier and Duval, Erik (2008) Quantitative analysis of user-generated content on the Web. pp. 19-26. In: Proceedings of the First International Workshop on Understanding Web Evolution (WebEvolve2008), 22 Apr 2008, Beijing, China.

User-generated content (UGC) is becoming the most popular and valuable information available on the WWW. However, little serious research has been conducted to measure the properties of its production process. This paper presents an in-depth quantitative analysis of nine popular websites that are based on different UGC types. The Information Production Process is used as a framework for the analysis. The findings provide for first time strong scientific evidence for previously anecdotic knowledge: UGC production follows “long-tail” distributions and it is marked with a strong “participation inequality”. Also, the analysis arrived to unexpected findings: not all the UGC types follow the inverse power-law distribution, and large content collections could be dominated by the presence of ultraproductive users. The analysis results also have implications for the administration of UGC-based websites.


  1. Glad to read that you liked our paper!... We also made a similar study for Learning Object Repositories, Open Courseware Initiatives, LMSs and the result is the same... fat-tail. So there is not just inequality of participation among students, but also teachers.

    Xavier Ochoa

  2. It's a great paper, much appreciated.