I’m piecing together an article on opinion poll performance at the federal election for Crikey, to which this post is intended as a sort of appendix (UPDATE: And here it is ($)). So far as national polling is concerned, this accounting from ReachTEL speaks for itself, as the primary vote percentages have hardly budged in the week-and-a-half since it was published. The one big outstanding issue is that we don’t yet have a reliable national two-party preferred total, as there are 14 electorates where preferences haven’t been counted on a Labor-versus-Coalition basis. My back-of-envelope reading is that the final result will be about 50.5-49.5 in favour of the Coalition, and it’s on this basis that I’m proceeding here.
The thrust of the Crikey article is that seat polling did not perform terribly well, and that it failed in an interestingly consistent way that you’ll have to read Crikey to learn more about (UPDATE: Did I mention that it was here?). I’m aware of 81 seat polls conducted over the course of the campaign, of which 62 were conducted for media outlets and 19 were conducted privately (excluding a few for which results were not provided with sufficient detail). The following table records the polls’ average biases, which refer simply to the difference between poll results for a given party and the election results and do not imply anything sinister, and errors, which measure the same difference in absolute terms without being concerned with whom the error favoured. For consistency of comparison, this accounting excludes 12 polls that were not Labor-versus-Coalition contests, which you can learn more about at the bottom of the post.
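To make the distinction between bias and error concrete, here is a minimal Python sketch. The function name and the three poll figures are invented purely for illustration and are not drawn from the polls discussed in this post.

```python
from statistics import mean

def bias_and_error(poll_results, election_results):
    """Return (average bias, average absolute error) in percentage points."""
    # Signed difference: positive means the poll was too high for the party.
    diffs = [p - e for p, e in zip(poll_results, election_results)]
    # Bias keeps the sign, so misses in opposite directions cancel out.
    # Error takes absolute values, so they do not.
    return mean(diffs), mean(abs(d) for d in diffs)

# Hypothetical Coalition two-party figures for three seat polls,
# alongside hypothetical election results in the same three seats.
polls = [52.0, 49.0, 51.0]
results = [50.0, 50.0, 50.0]

bias, error = bias_and_error(polls, results)
print(f"bias: {bias:+.2f}, error: {error:.2f}")
```

A pollster can therefore record a small bias while still having a large error, if its misses fall on both sides of the result.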
A look at the two-party bias measures might suggest that ReachTEL outperformed Galaxy (which in this table is taken to include the polls it conducted as Newspoll), but I’m inclined to give Galaxy the honours because ReachTEL polls tended to have two errors that cancelled out: an understatement of the Labor primary vote, and an overstatement of the minor party and independent preference flow to Labor. ReachTEL’s headline figures were from respondent-allocated preferences, whereas Galaxy’s were based on preference flows from the 2013 election (which at a national level understated the preference flow to Labor this time – the jury is still out on which of the two methods produced the smaller errors).
ReachTEL’s success in pitching its electorate polls to private clients resulted in a large amount of detailed private polling emerging in the media, which is not something we have seen much of in Australia in the past. The 19 such polls identified here were, remarkably enough, nearly all conducted for left-of-centre concerns, namely trade unions and GetUp! There is some evidence of selection bias here, by which those commissioning the polls are more likely to publicise the results if they find them to their liking. The two-party bias for the private polls leans in the other direction from the media polls, and while the private polls are recorded as landing slightly nearer the mark, the gap would have closed had I not excluded, on the basis that insufficient detail was published, a poll commissioned by Labor’s candidate in Wentworth that wrongly picked a 10% swing against Malcolm Turnbull.
The chart below offers another view of the waywardness of the seat polls, with the spread of biases in the two-party preferred results illustrated by the blue histogram – so for example, 16% (11 out of 69) came in between 0% and 1% too high for the Coalition. This is overlaid by a blue distribution curve that best approximates the spread of the results in the histogram, and a thick black curve that shows how the curve would have looked if the polls behaved as they should have, taking into account a 4% margin of error associated with polls with a sample of 600. The flatness of the blue curve relative to the black one illustrates the point that seat polls behaved as if they had a margin of error more like 7% than 4%, and its right-of-centre placement illustrates a statistically significant 1.3% bias to the Coalition.
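For readers wondering where a figure like 4% for a sample of 600 comes from, the sketch below uses the standard textbook formula for a 95% margin of error. It assumes simple random sampling and a result near 50%, which is my assumption about how such figures are usually derived rather than anything the pollsters have published. The function name is invented for illustration.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Theoretical 95% margin of error, in percentage points.

    Assumes simple random sampling; z=1.96 is the usual 95% multiplier.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return 100 * z * standard_error

# A seat poll with a sample of 600 comes out at roughly four points.
print(f"n=600: {margin_of_error(600):.1f} points")
```

This formula only accounts for sampling error, so it sets a floor on how far polls should stray rather than a realistic estimate.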
When the same exercise is conducted across the national polling, from which we have a rather more limited sample of 33 polls, there is a strikingly different result. This time the black curve shows how we would expect a distribution to look for polls with a margin of error of 2.5%, associated with a sample size of around 1500, which is about what you get these days from Newspoll/Galaxy and Ipsos in particular. The histogram illustrates that Coalition two-party results were all between 48.5% and 51.5%, which is to say that the errors covered a very narrow range from -2% to 1%. The blue curve, being taller and narrower than the black one, tells us the range is narrower than you would expect given the margin of error, with the polls behaving more like they had a margin of error of 2% than 2.5%, and samples of 2400. This time the bias is slightly in favour of Labor, at 0.9%, but this can partly be accounted for by the fact that many of the polls were conducted early in the campaign period, and there appeared to be slight movement to the Coalition over the full course of the campaign.
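The step from a 2% margin of error to a sample of 2400 can be checked by running the standard margin of error formula in reverse. The sketch below does this under the same textbook assumptions (simple random sampling, 95% confidence, a result near 50%). These are my assumptions, and the function name is invented for illustration.

```python
def implied_sample_size(margin_of_error_points, proportion=0.5, z=1.96):
    """Sample size that would produce the given 95% margin of error.

    Inverts moe = z * sqrt(p * (1 - p) / n), with moe in percentage points.
    """
    moe = margin_of_error_points / 100
    return proportion * (1 - proportion) * (z / moe) ** 2

# Sample sizes implied by a 2.5-point and a 2-point margin of error.
for moe in (2.5, 2.0):
    print(f"{moe}% margin of error -> sample of about {implied_sample_size(moe):.0f}")
```

Because the sample size grows with the inverse square of the margin of error, a seemingly small tightening from 2.5% to 2% corresponds to a sample more than half as large again.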
That the pollsters should outperform the theoretical expectation is quite remarkable, given sampling error is the only kind of error the theory acknowledges, and we would expect further issues to arise from different polling methods and rising non-response rates. This raises a suspicion that we are witnessing the herding effects that so blighted polling in Britain before the general election there last May, at which support for the Conservatives was uniformly measured at 6% below the actual result. If so, the very big difference in Australia is that the pollsters have herded at the right place.
Finally, a listing of the 12 seat polls that didn’t fit the Labor-versus-Coalition metric, and were thus excluded from the table above. Note that “OTH” refers to the main non-major candidate, rather than the combined vote for parties other than the Coalition, Labor and the Greens. The only comments I’ll add are that essentially none of them did well at predicting the Coalition vote, and the two Denison polls were even more skewed to the Liberals than the others in Tasmania, but with errors at the expense of Andrew Wilkie rather than Labor.