The above article is adapted from e pur si muove. Kudos to the author of this particular weblog for his effort! In fact, I did fairly well on the prediction part; I thought I had screwed it all up.

May 6, 2006
Blog Pundits and the 2006 Singapore General Elections
The elections are over, the results tallied, every last ‘i’ dotted and ‘t’ crossed.
Surely every blogger’s burning question now is: how well did the Singapore blogosphere do on the predictions front?
Since I promised to stay away until the elections were over, I will now contribute my little bit by attempting to answer that question. I ran some searches on Google, Technorati, and within my own blogroll, and uncovered five blogs that publicly announced their predictions for the elections. After a little number-crunching (explained in full detail at the end), here are some concrete numbers.
Data
Everyone loves a graph, so here is a graph summarizing my data set:
Analyses and details follow.
Rankings of Blog predictions
Everyone in Singapore loves rankings, so here are my rankings for blog pundit accuracy:

sgelection06.djourne.net (96.0% accuracy)
sembawang-voter.blogspot.com (92.0% accuracy)
conformityisdead.blogspot.com (90.3% accuracy)
singaporegovt.blogspot.com (87.8% accuracy)
sha0x.blogspot.com (86.0% accuracy)

Congratulations to the crew at Singapore, Ink. for the most accurate (if most incomplete) elections predictions!
Again, I’m sure I missed other blogs. Feel free to let me know if you know of any.
Top 3 least accurately predicted wards
Sembawang (22.2% rms deviation)
Macpherson (20.3% rms deviation)
Potong Pasir (20.1% rms deviation)
Unsurprisingly, this correlates well with highly-contested districts with strong opposition presence. I don’t quite understand Macpherson, though. Did I miss something happening here?
Top 3 most accurately predicted wards
Choa Chu Kang (6.1% rms deviation)
Jalan Besar (8.2% rms deviation)
Aljunied (9.8% rms deviation)
Can I take the contrapositive of the previous statement and say that these must therefore be the most boring districts? Surely not; Aljunied seemed to have been a big campaigning hotspot. Maybe what this means is that the results of campaigning in these districts were most accurately reflected in the perspectives of individual bloggers.
Methodology
Here’s the nitty-gritty of what I did.
I used a variant of least squares, a weighted sum of squared residuals, to compute the “goodness of fit” for each blogger’s data set.
Some judgement was exercised in enumerating predictions when no specific numerical margins were given: “too close to call” was taken as 50%, “win by a narrow margin” as 51%, and a plain “win”/“lose” as 67%/33% respectively.
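For concreteness, the enumeration convention above could be sketched like this (the function and dictionary names are my own, purely illustrative):

```python
# Hypothetical mapping from qualitative predictions to vote-share
# percentages, following the conventions described above.
QUALITATIVE_MARGINS = {
    "too close to call": 50.0,
    "win by narrow margin": 51.0,
    "win": 67.0,
    "lose": 33.0,
}

def predicted_share(prediction):
    """Return a numeric vote share (in %) for a prediction, which may
    already be a number or one of the qualitative phrases above."""
    if isinstance(prediction, (int, float)):
        return float(prediction)
    return QUALITATIVE_MARGINS[prediction.strip().lower()]
```

So a blogger who wrote “Win” for a ward would be scored as if they had predicted a 67% vote share there.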
The prediction percentages were subtracted from the actual voting percentages as reported in the Wikipedia article. Each difference was weighted by multiplying it by the number of registered voters in that ward to get a weighted residual. (This means that getting the percentage right for larger districts counts more strongly.) The weighted residuals were squared and summed, the sum was divided by the number of districts predicted minus two, and the result was square-rooted to get the root mean square deviation per ward (D).
To get the accuracy figure, I worked out the average number of voters per ward (75,809) and then computed the fractional rms deviation by dividing D by 75,809. Subtracting that fraction from one and converting it into a percentage finally gives the quantity I called the accuracy.
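The two steps above can be sketched in code. This is a minimal illustration of the method as I described it, with vote shares expressed as fractions; the data in the usage note are made up, not actual 2006 figures:

```python
import math

def rms_deviation(predicted, actual, voters):
    """Weighted root-mean-square deviation per ward (D).

    predicted, actual: per-ward vote shares as fractions (e.g. 0.55);
    voters: registered voters per ward, used as the weight.
    """
    n = len(predicted)
    # Weighted residual for each ward, then squared and summed.
    weighted_sq = sum(
        ((p - a) * v) ** 2
        for p, a, v in zip(predicted, actual, voters)
    )
    # Divide by (number of wards - 2), then square-root.
    return math.sqrt(weighted_sq / (n - 2))

def accuracy(predicted, actual, voters):
    """Accuracy = (1 - D / average ward size), as a percentage."""
    avg_voters = sum(voters) / len(voters)
    return (1 - rms_deviation(predicted, actual, voters) / avg_voters) * 100
```

For example, three illustrative wards with predicted shares [0.55, 0.60, 0.45], actual shares [0.50, 0.58, 0.50], and electorates of [80000, 70000, 75000] would score a little over 92% accuracy under this scheme.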
Clearly this is but one of several statistics I could have used. If anyone has a better idea, I can run the numbers and see how that changes.