the experimental blog

fivethirtyeight.com’s forecast bothers me – now I know why

It’s the era of “big data” (humans have always used all the data at their disposal, but that’s another blog post), and I think I have found a big problem with one of big data’s biggest reporting hubs: 538.

They are cherrypicking the polls they use.

Justin Epperly 538.com analysis

fivethirtyeight.com is Nate Silver’s site, and it does a pretty good job of reporting and analyzing news from a “data science” perspective. Again, journalism has always been about ‘data’, but I digress.

538’s “Election Forecast” is a well-made site that gives users three different ‘views’ of 538’s projections, depending on how they are made: http://projects.fivethirtyeight.com/2016-election-forecast/florida/

Here are the categories:

  • Polls Plus – 538’s proprietary model, which weights (adjusts) poll results according to factors 538 came up with, like how a state has voted historically or how highly they rate the poll. Factors judged more relevant get more ‘weight’, meaning more influence on the final number.
  • Polls Only – Just a poll of polls, no ‘weight’ from 538 data analysts
  • Now-cast – Highly chaotic, based on flash polls and other sources of quick polling
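To make ‘weight’ concrete, here is a minimal sketch of the difference between a plain poll of polls and a weighted one. This is not 538’s actual model: the pollster names, margins, and weights below are all made up for illustration, and the `Poll` class and both functions are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    pollster: str   # hypothetical pollster name, not a real poll
    margin: float   # candidate A minus candidate B, in points (made up)
    weight: float   # analyst-assigned weight (made up)


def simple_average(polls):
    """Plain poll of polls: every poll counts the same."""
    return sum(p.margin for p in polls) / len(polls)


def weighted_average(polls):
    """Weighted poll of polls: higher-weight polls pull the result harder."""
    total_weight = sum(p.weight for p in polls)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return sum(p.margin * p.weight for p in polls) / total_weight


# Illustrative numbers only; an analyst chose these weights.
polls = [
    Poll("Pollster A", 6.0, 2.0),
    Poll("Pollster B", -2.0, 1.0),
    Poll("Pollster C", -1.0, 1.0),
]

print(simple_average(polls))        # unweighted margin
print(weighted_average(polls))      # margin after weighting

# Leaving one poll out is also a choice, and it moves the number too.
without_a = [p for p in polls if p.pollster != "Pollster A"]
print(simple_average(without_a))
```

With these made-up numbers, the same three polls average to +1.0 unweighted and +2.25 weighted, and dropping Pollster A flips the unweighted average to -1.5. Both the weights and the inclusion list are decisions somebody made.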

How does 538 decide which polls to include in their ‘poll of polls’?

That’s the question, of course. Big Data nerds will never admit this, but no matter what, *the data analyst* makes subjective decisions about *how* to analyze the data. It’s far from unbiased. Numbers don’t lie, but the people interpreting the numbers are as apt to ‘lie’ or screw up as any other human.

So why do I think 538 might be cherrypicking polls?

Siena College

In the pic above, a highly weighted Siena College poll showing Trump +6 is listed as “new”.

However, browse over to the Siena College site and you’ll see they have been polling since at least March. Here’s their list of all polls: https://www.siena.edu/news-events/news-archive/category/sri-political

Why start including Siena College now and not before?

I’m 100% sure they have an answer, and there’s a good chance it actually explains their choice here.

However, this is a lesson in understanding polls. They are complex, but they can be understood, and their flaws are rooted in the same thing as all system flaws: human choice.

One response

  1. Al Epperly

    Well done… Totally agree…

    November 1, 2016 at 8:28 pm
