

Big Data and Social Media: Australia’s ‘Warmest 100’ Countdown 2013

An interesting case study in using collated data to predict social trends hit the interwebs recently.

Showcasing the most popular hits of the past year, the annual Triple J Hottest 100 countdown is one of the most significant popular music charts in the country. Listeners submit their votes, and the list is compiled from those votes. This year, a group of four individuals used social media to compile their own prediction of which tracks would make the top 100. The details of how these individuals went about collecting their data are described in this article, but in a nutshell, this is what the ‘project’ entailed:

1 – Every user is able to share their votes via the Triple J website itself, on a unique page. The root URL of every voting page is the same; in other words, the only thing that distinguishes person A’s votes from person B’s is the ‘random number’ at the end of the URL.

2 – For instance, your voting page might look like this:

and mine might look like this:

3 – Because these root URLs are identical, and because all this information is freely available, votes can actually be collated!

4 – The Warmest 100 was thus collated (for more details about the process, look it up here) from a sample of about 35 thousand votes cast by roughly 3,600 unique voters, a mere 2.7 percent of the voting total.

5 – The result: they accurately predicted the top 3 songs on the list, and several others in the top 20, give or take some errors and discrepancies here and there. Here is a spreadsheet detailing the differences between the predictions and the actual results.
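The collation step described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual method: the ballots are invented, the `tally_ballots` helper is hypothetical, and the collection of shared voting pages is left out entirely. It simply assumes each shared page has already been reduced to a list of song titles.

```python
from collections import Counter


def tally_ballots(ballots):
    """Count how many ballots each song appears on and rank the songs.

    `ballots` is a list of lists of song titles (hypothetical input format).
    """
    counts = Counter()
    for ballot in ballots:
        # A song counts at most once per ballot, even if it was listed twice.
        counts.update(set(ballot))
    # Most votes first; ties are broken alphabetically so the order is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented sample data standing in for collected ballots.
ballots = [
    ["Song A", "Song B", "Song C"],
    ["Song A", "Song C"],
    ["Song A", "Song D"],
]

ranking = tally_ballots(ballots)
for position, (song, votes) in enumerate(ranking, start=1):
    print(position, song, votes)
```

With real data, the top of `ranking` would be the predicted countdown; how closely it tracks the official list depends on how representative the shared ballots are.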

For such a small sample, the results are impressive. So much so that the organisers of the countdown have decided to change the voting system for next year’s countdown.
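To put the sample size in perspective, here is a rough back-of-the-envelope calculation using the rounded figures quoted above. It assumes the 2.7 percent refers to votes rather than voters, which the figures do not make explicit, so the outputs are ballpark estimates only.

```python
# Rounded figures quoted in the post.
sample_votes = 35_000    # about 35 thousand votes
sample_voters = 3_600    # roughly 3600 unique voters
sample_share = 0.027     # 2.7 percent of the voting total (assumed to mean votes)

# Total number of votes implied by the sample share.
implied_total_votes = sample_votes / sample_share

# Average number of votes contributed by each sampled voter.
votes_per_voter = sample_votes / sample_voters

print(f"Implied total votes: about {implied_total_votes:,.0f}")
print(f"Votes per sampled voter: about {votes_per_voter:.1f}")
```

On these figures the full countdown would have drawn somewhere around 1.3 million votes, with each sampled voter contributing just under ten.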

Lev Manovich addresses some issues and concerns with the implications of ‘Big Data’ in the age of social media proliferation and increasing digital presence. One of the concerns he raises is the authenticity of information shared over social networks. In light of this, the Warmest 100 countdown provides one with an interesting example. Unlike other forms of digital information such as photos shared on Flickr and Facebook wall posts, there is no doubting the authenticity of users’ countdown votes. In fact, users are even encouraged to share their votes via social media – Facebook, Twitter and Pinterest plugins are standard fare. In retrospect, allowing this information to be freely available has, in the case of the Warmest 100, given users and voters themselves the ability to take the collating process into their own hands, eliminating (to a certain extent) the countdown’s element of mystery!

The goal of the Warmest 100 is clear-cut: predicting the results of a countdown. When collating big data for humanities-based projects, however, ‘objectives’ and ‘goals’ might not be so clear-cut. With so much information on users freely available over the internet, the Warmest 100 shows just how powerful statistics can be in extrapolating information about social trends. If we apply this to humanities-based work, the emphasis comes back to interpretation. One might have all this data available, but so what? What can one do with it? What can the data tell us?

And I suppose that’s where humanities researchers step in: interpreting and analysing big data. Which brings me back to the roots of my doctoral research, where my first foray into digital humanities stemmed from a quantitative analysis of Shakespeare’s 154 sonnets.


Matt Shea, “The Inside Story of How Four Techs Broke Open Triple J’s Hottest 100”, The Vine, Jan 2013

Lev Manovich, “Trending: The Promises and Challenges of Big Social Data”, Debates in the Digital Humanities, edited by Matthew K Gold, University of Minnesota Press, 2012


RIP Aaron Swartz

The web is awash with tributes to Aaron Swartz. You might remember him as the person persecuted for downloading articles from JSTOR with the intention of making them freely accessible to all.

The charges against Swartz were nothing short of excessive. This link compares the charges he faced with the sentences for crimes such as bank robbery and manslaughter.

Swartz was also suffering from depression, and he took his life on Friday.

Here are some links to related articles on Swartz’s case.

How the felony count for Swartz’s case was raised from 4 to 13. This article also has a link to Swartz’s indictment: click here

Interestingly enough, it seems that JSTOR had already settled their civil case with Swartz in 2011.

In response to Swartz’s death, hacker group Anonymous took down the MIT website (Swartz was affiliated with MIT) for a period of time: click here.

Statement from Swartz’s family: click here.

Lawrence Lessig’s post on the prosecutor as bully summarises it all: click here.

Here is a link to Swartz’s manifesto on open access: click here


