It’s 2019. Everyone knows data is really important. But how can we tell if the data we’re exposed to is any good? And, more importantly, how can we tell when we’re being lied to? At our most recent Lazy Book Club, we time travelled to the 1950s to discuss Darrell Huff’s classic “How To Lie With Statistics”. Here’s a summary of that conversation.
But first, a̶ ̶w̶o̶r̶d̶ ̶f̶r̶o̶m̶ ̶o̶u̶r̶ ̶s̶p̶o̶n̶s̶o̶r̶ a brilliant cartoon that kicks off the book.
(Yeah. It’s funny because it’s true. Statistics are a question of narrative. Don’t believe us? Keep reading.)
Alright, let’s crack on. Four things we discussed in a pub, over beers and some pretty badass 90s rap tunes.
1. Beware sampling bias
“How To Lie With Statistics” is full of examples that, even though they were written in the 1950s, you could imagine being applied today with more contemporary datasets. One of the things Huff constantly reminds us is that, when looking at statistics, it’s always important to look for any biases that might have influenced the sample of the original research.
A basic example he starts off with: a fictitious study claiming that “most people claim they like to answer questionnaires”. Sounds brilliant for the questionnaire industry, right? Except this is only based on the data of people who bothered to answer this fictitious questionnaire in the first place. It doesn’t account for those who saw the questionnaire and threw it in the bin. So in reality, the headline should be “most people who bothered to answer a questionnaire said they like to answer questionnaires”. Not so impressive, right?
One to consider next time you do some social listening (it’s only based on people who bothered to publicly talk about a brand or thing). Or quant analysis (the likes of TGI are great but let’s remember it’s based on people who bothered to answer a questionnaire for 90 minutes in their homes). Or – goodness – that type of report that says 90% of people would buy a product endorsed by an influencer (but it was commissioned by an influencer agency who surveyed people who consume influencer content regularly, so ¯\_(ツ)_/¯).
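If you want to see the questionnaire trap in numbers, here’s a minimal sketch of it as a simulation. Every figure in it is made up for illustration (the 30% who genuinely like questionnaires, the 80% and 10% reply rates, the function name `simulate_survey`); none of it comes from Huff’s book.

```python
import random


def simulate_survey(population_size=10_000, seed=42):
    """Simulate a made-up population where fans of questionnaires reply far more often."""
    rng = random.Random(seed)
    everyone = []      # true opinion of every person in the population
    respondents = []   # opinions of only those who sent the questionnaire back

    for _ in range(population_size):
        # Assumption: 30% of people genuinely like answering questionnaires.
        likes_questionnaires = rng.random() < 0.30
        # Assumption: fans reply 80% of the time, everyone else 10% of the time.
        reply_probability = 0.80 if likes_questionnaires else 0.10

        everyone.append(likes_questionnaires)
        if rng.random() < reply_probability:
            respondents.append(likes_questionnaires)

    true_share = sum(everyone) / len(everyone)
    surveyed_share = sum(respondents) / len(respondents)
    return true_share, surveyed_share


true_share, surveyed_share = simulate_survey()
print(f"Actually like questionnaires: {true_share:.0%}")
print(f"Say so among respondents:     {surveyed_share:.0%}")
```

With these invented numbers, roughly 30% of the population likes questionnaires, but around three quarters of the people who reply say they do. Same population, very different headline, and nobody technically lied.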
2. Whatcha mean “the average”?
Ah yes, “the average”. The be-all and end-all of many research conversations where it turns out the average 18-34-year-old in the UK is into rap music and likes football, but also fashion and volunteering and cats. Averages seem harmless, but they can get quite dangerous because they sound pretty robust. After all, if it’s average it’s literally the neutral, middle-of-the-road result, right? Which feels like a good thing. But it all depends on the type of average we’re talking about.
Huff gives the example of the average yearly income of a certain neighbourhood, which at first he says is £10,000 (remember, this was the 1950s), but later claims is £2,000. So when was he lying? Turns out, in technical terms, neither time. Because it depends on whether he was describing the mean (what we’d typically call the “average”, where you add up all income figures and divide by the number of people), the median (the middle figure when you line the numbers up in order, so if we had £0, £2,000 and £4,000, the median would be £2,000) or the mode (the most common figure, so more people in the group earn £2,000 than any other amount).
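To see how one neighbourhood can honestly produce both numbers, here’s a quick sketch using Python’s built-in `statistics` module. The six incomes are invented so that the figures line up with the £10,000 and £2,000 above; they are not Huff’s data.

```python
import statistics

# Hypothetical yearly incomes for a six-household neighbourhood:
# five modest earners and one very rich neighbour.
incomes = [1_000, 2_000, 2_000, 2_000, 3_000, 50_000]

mean_income = statistics.mean(incomes)      # add everything up, divide by the count
median_income = statistics.median(incomes)  # middle of the sorted list
mode_income = statistics.mode(incomes)      # most common single figure

print(f"Mean:   £{mean_income:,.0f}")
print(f"Median: £{median_income:,.0f}")
print(f"Mode:   £{mode_income:,.0f}")
```

The mean comes out at £10,000, while the median and the mode are both £2,000. One rich neighbour is enough to drag the mean up fivefold, which is why it’s worth asking which “average” you’re being shown.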
Defining averages can totally change what we perceive to be true. One to consider next time someone says the average person is like this, that and the other. Did you know the average person in the world only has one testicle? (We’ll let you try and riddle that one out. It’s easier than you think.)
3. Framing. Fucking. Matters.
Hey, framing is great, right? It’s how we get to kick-ass strategies and brilliant storytelling, whether that’s for your boss, your client or your audience in the real world. Framing is creativity at its best. But framing can also be quite dangerous, if used for… uh, less transparent motives. To make that point, let’s look at this chart for US government spend, from January to December. $20 to $22 billion in 12 months. Cool. Not staggering, but cool.
But what if we wanted to make it really, really cool? Like hockey stick cool? Using the very same data set, without lying? Well, Huff offers a very simple recommendation. Look at how, above, the vertical axis goes up in steps of 2. Now, what would happen if you split the vertical axis into decimal steps instead?
BOOM. You just got yourself a much more impressive graph. Want another example? Look at the graphs below, side by side, which account for the increase in US government payroll throughout 1937. And again, these two graphs are based on the very same dataset, just framed in different ways.
The graph on the right wouldn’t impress anyone’s boss. The graph on the left makes you almost instantly eligible for three promotions in one go (if your job is about paying people more money, otherwise not sure how your hypothetical boss would react to it).
So yeah. Beware charts that look explosively good. There might be something funky about them.
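To put rough numbers on the axis trick, here’s a small sketch that works out how much of a chart’s height the same $20 to $22 billion climb takes up under two different vertical axes. The axis ranges (0 to 24, and 20 to 22) and the helper name `visible_rise` are our own assumptions for illustration, not measurements taken from Huff’s charts.

```python
def visible_rise(start, end, axis_min, axis_max):
    """Share of the chart's height covered by the climb from start to end."""
    return (end - start) / (axis_max - axis_min)


start, end = 20, 22  # spend in $ billions, January vs December

# The actual change in the data, whatever the chart looks like.
actual_growth = (end - start) / start

# Assumed axis that starts at zero: the line barely moves.
full_axis = visible_rise(start, end, axis_min=0, axis_max=24)

# Assumed axis cropped to the data and marked in decimal steps:
# the same line now runs from the bottom of the chart to the top.
cropped_axis = visible_rise(start, end, axis_min=20, axis_max=22)

print(f"Actual growth:          {actual_growth:.0%}")
print(f"Rise on full axis:      {full_axis:.0%} of chart height")
print(f"Rise on cropped axis:   {cropped_axis:.0%} of chart height")
```

The data grows by 10% either way. On the full axis the line climbs about 8% of the chart’s height; on the cropped one it climbs all of it. Nothing in the dataset changed, only the frame around it.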
4. How to spot a (statistical) lie
Huff’s intention with this book isn’t to teach us to lie to others with statistics, but to find ways to spot when we’re being lied to. Towards the end, he leaves us with a five-point checklist of what we can ask ourselves every time we see a report, piece of news or graph trying to tell us something “important”.
It doesn’t mean we use them all, at all times (sometimes there’s no time), but it does help to do some quick probing to make sure we’re not the sucker in the conversation. So next time you see a fancy statistic, ask yourself:
Who says so? (question who the source is)
How do they know? (try and understand the sample size and whether the sample is representative enough. Are there any other biases we should be aware of?)
What’s missing? (ask what types of averages we’re talking about, question the framing of the data and especially the graphs)
Did somebody change the subject? (how are people measuring the stuff they’re comparing?)
Does it make sense? (good ol’ human judgement – remember that? If it feels too good to be true, it probably is)
We hope all of this made sense and is useful for your next research project (or when trying to debunk other people’s research). Now, good luck reading those next few reports on how Gen Z are totally different from everyone else, ever, in history! Or you know, any of those “people who eat chocolate are smarter” studies that we all love to share on Facebook to justify our sugar intake choices. (Mmmm, chocolate…)