Those headlines about dog breeds and behaviour are hyped, obviously
A quick comment on the new Science paper
(This is a quick response to the press coverage of a scientific paper that came out yesterday. The main Science Fictions post, with audio version and all that, will be on Monday.)
We all know how it goes:
See dramatic or counterintuitive scientific claim in media headlines;
Check actual scientific study on which media headlines are based;
Discover that headlines are overhyped, going way beyond what the study actually showed.
It happened again yesterday. There was a new paper in the journal Science entitled “Ancestry-inclusive dog genomics challenges popular breed stereotypes”. The authors, mainly at the Broad Institute (a big-deal genetics lab shared by MIT and Harvard) gathered an impressive dataset of over 18,000 dogs of many different breeds, and got DNA samples from around 2,000 of them. They had the owners fill in questionnaires about their dogs’ behaviour and appearance.
And here’s what they found, according to various media headlines:
Dog behaviour has little to do with breed, study finds (The Guardian)
A genetic study has debunked the stereotypes leashed to dog breeds (Today)
Your dog’s breed doesn’t determine its personality, study suggests (Science News)
Humans can’t quit a basic myth about dog breeds (The Atlantic)
They’re all good dogs, and it has nothing to do with their breed (The New York Times)
Depending on your intuitions about dog breeds, that could be quite an extraordinary claim. So how strong is the evidence?
The first thing to say is that there’s lots of interesting stuff in the paper about dog genetics that I won’t cover here (the short version is that having only n=2,000 does restrict what they can say, but it’s a good start). I’m going to focus on the purely phenotypic part of the study—the stuff about reported behaviours and breeds, which didn’t require any genetics to be measured—because that’s what caused the media stir.
Here’s the relevant part of the paper:
In an analysis of variance (ANOVA) of confirmed purebred dogs representing 78 breeds, the breed effect… averages 0.089 ± 0.039 (±SD) (range 0.034 to 0.253)…
That translates to breed explaining an average of 0.089, or 9%, of the variance in behavioural traits, which were sorted into these factors: “human sociability”, “arousal level”, “toy-directed motor patterns”, “biddability”, “agonistic threshold”, “dog sociability”, and “proximity seeking”. Each factor was measured from multiple questions filled in by the owners.
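For anyone who wants to see what "variance explained by breed" means in practice, here is a minimal sketch. It assumes the breed effect is the usual one-way ANOVA proportion of variance (eta squared, the between-breed sum of squares divided by the total sum of squares). The paper's exact estimator may differ, and the scores below are made up for illustration, not taken from the study.

```python
from statistics import fmean

def eta_squared(groups):
    """Share of total variance accounted for by group (breed) membership."""
    scores = [x for g in groups for x in g]
    grand_mean = fmean(scores)
    # Total variation of every dog's score around the overall mean.
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    # Variation of the breed means around the overall mean, weighted by breed size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

# Hypothetical scores for three imaginary breeds (NOT data from the paper).
toy_breeds = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
toy_eta = eta_squared(toy_breeds)
print(f"Breed explains {toy_eta:.0%} of the variance in the toy data")
```

In the toy data the breed means are 2, 3 and 4, so half of the variation sits between breeds. A figure of 0.089 would mean that about 91% of the variation is between dogs of the same breed, or is measurement error.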
Now, you might think that these non-genetic, behavioural analyses were done in the total n = 18,000ish sample. But that’s not the case. As you can see above, the reported analyses were done in “confirmed purebred” dogs - that is, the ones who had a DNA test which confirmed they were the breed their owners said they were. So the actual sample size is around 2,000 dogs for all of these analyses. This isn’t super-clear from the paper, but it’s confirmed by a look at the Supplementary Data (it’s in Data S9, to be specific).
(Note that I can’t quite reproduce that 9% number from their data sheet: I get 12% if I look at the overall behavioural factors, 11% if I look at the individual questions, and also 11% if I look at everything together. I guess I’m missing something minor.)
2,000 dogs still seems like a decent sample, right? Except you also have to consider how many breeds there were: 78. We know from the paper’s Methods section that they sampled around 27 dogs of each breed for genotyping - which just about fits, given the sample size varies a bit from measure to measure: 2,000/78 ≈ 26.
So what we have for each breed is an n=27 sample. Should we expect to find any reliable behavioural differences between groups when each group only has 27 individuals in it? The relation with breed would have to be pretty big for statistics to clearly differentiate between groups of that size - and I don’t think even a believer in appreciable behavioural differences between breeds would argue the effects are enormous (it’s a different story for physical traits like fur texture or ear floppiness, which really do differ dramatically between breeds - and indeed were found to do so in this study).
That’s before you consider the issue of noise. Remember that these are all subjective owner-ratings (as opposed to “objective” measures made under controlled laboratory conditions). With sample sizes of 27, it doesn’t take many owners giving unrealistic answers (think: “my dog nearly tore someone’s arm off last week but he’s a sweetie really…”) to skew the numbers. This kind of measurement error is a much bigger deal for behaviours, which are far more ambiguous than easy-to-observe physical traits.
Just to lay it out: if Breed X is in reality high in proximity seeking, but just a couple of Breed X owners give unrealistically low scores (because their particular dog is a genuine outlier, or they were mistaken, or they weren’t paying attention), then the average might be pulled down, substantially away from its “true” value. This kind of noisy measurement could be happening across any or all of the 78 dog breeds, and could easily obscure any real differences between them. In a sense, I’m surprised they even found that 9%.
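To make the noise argument concrete, here is a small simulation sketch. The 78 breeds and 27 dogs per breed come from the numbers above. Everything else is an assumption made purely for illustration: the spread of breed averages, the spread of dogs within a breed, and the size of the owner-rating error are all hypothetical values, not estimates from the paper.

```python
import random
from statistics import fmean

def eta_squared(groups):
    """Between-breed sum of squares divided by the total sum of squares."""
    scores = [x for g in groups for x in g]
    grand_mean = fmean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

rng = random.Random(1)                # fixed seed so the run is repeatable
N_BREEDS, DOGS_PER_BREED = 78, 27     # figures discussed in the post
BREED_SD, DOG_SD, NOISE_SD = 0.5, 0.87, 1.0  # ASSUMED, illustrative only

true_scores, reported_scores = [], []
for _ in range(N_BREEDS):
    breed_mean = rng.gauss(0, BREED_SD)
    # Each dog's "real" behaviour: breed average plus individual variation.
    true = [breed_mean + rng.gauss(0, DOG_SD) for _ in range(DOGS_PER_BREED)]
    # What the owner writes down: real behaviour plus rating error.
    reported = [t + rng.gauss(0, NOISE_SD) for t in true]
    true_scores.append(true)
    reported_scores.append(reported)

eta_true = eta_squared(true_scores)
eta_reported = eta_squared(reported_scores)
print(f"Breed share with error-free scores:   {eta_true:.2f}")
print(f"Breed share with noisy owner ratings: {eta_reported:.2f}")
```

Under these assumptions the noisy ratings give a noticeably smaller breed share than the error-free scores, even though the underlying breed differences are identical in both. How big the drop is depends entirely on the assumed noise level.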
There’s also potential selection bias: are people whose dogs are very sickly, or extremely aggressive or otherwise badly behaved, going to be more or less likely to take part in a study like this? I’d say less likely - which means you’ll miss the full range of doggy variation and find a smaller-looking effect.
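The same sort of toy simulation can illustrate the selection point. This sketch models it in the crudest possible way, by assuming that owners of the dogs in the top 15% of some problem-behaviour score simply never sign up. The cutoff and the spreads are hypothetical choices of mine, not anything reported in the paper.

```python
import random
from statistics import fmean

def eta_squared(groups):
    """Between-breed sum of squares divided by the total sum of squares."""
    scores = [x for g in groups for x in g]
    grand_mean = fmean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

rng = random.Random(2)  # fixed seed so the run is repeatable
breeds = []
for _ in range(78):
    breed_mean = rng.gauss(0, 0.5)  # ASSUMED spread of breed averages
    breeds.append([breed_mean + rng.gauss(0, 0.87) for _ in range(27)])

# ASSUMPTION: the most extreme 15% of dogs never make it into the study.
ranked = sorted(x for g in breeds for x in g)
cutoff = ranked[int(0.85 * len(ranked))]
volunteers = [[x for x in g if x < cutoff] for g in breeds]
volunteers = [g for g in volunteers if g]  # drop any breed left with no dogs

eta_full = eta_squared(breeds)
eta_selected = eta_squared(volunteers)
print(f"Breed share, all dogs:            {eta_full:.2f}")
print(f"Breed share, extremes missing:    {eta_selected:.2f}")
```

Losing the extreme dogs trims more from the breeds with high averages than from the others, which pulls the breed means closer together and shrinks the apparent breed effect.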
Somewhat remarkably, I couldn’t find any mention of these limitations in the paper itself. Some of the press coverage did a better job of this - of the ones I read, the Atlantic article is the best, and mentions previous research that finds a larger link between breed and behaviour.
Consider an alternative, ideal setup: you get (say) 500 Staffordshire bull terriers, and 500 poodles (delete and replace for any dog breeds you prefer). You bring them all into the lab and take careful measures of their behaviours—approaching people or toys; reacting to sounds like other dogs barking, etc.—under controlled conditions. This isn’t a realistic setup, of course - or, it’s a lot less realistic than the pragmatic way the researchers gathered data for the Science paper. But my bet would be that the bigger samples of each specific breed, and the higher-precision measurements, would help you see all sorts of differences emerge that wouldn’t under lower-powered, noisier conditions.
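Here is a rough sense of how much the per-breed sample size matters when comparing two breeds head to head. This is a back-of-the-envelope sketch using a normal approximation to a two-sample test. The effect size of 0.3 standard deviations is a hypothetical value picked for illustration, and the paper itself ran an ANOVA across all breeds rather than pairwise tests like this.

```python
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d is the true difference between the two breed means, in units of the
    within-breed standard deviation (a standardised effect size).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected test statistic: difference divided by its standard error.
    shift = d * (n_per_group / 2) ** 0.5
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

ASSUMED_EFFECT = 0.3  # hypothetical modest difference between two breeds
for n in (27, 500):
    power = approx_power(ASSUMED_EFFECT, n)
    print(f"n = {n:>3} per breed: chance of detecting the difference = {power:.0%}")
```

With the assumed effect size, 27 dogs per breed would detect the difference only a minority of the time, whereas 500 per breed would detect it almost always. A larger assumed effect would narrow that gap.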
Perhaps it’ll turn out, in future studies, that dog breed really doesn’t say much about behaviour - that outside of working dogs and some other categories, modern breeds have mainly been selected for their appearance rather than their behaviour. But given the above problems, I don’t think this analysis should make us update our views very much on this question.
I’ll finish with a picture I took the other day of my dog, Biscuit, who is a cockapoo, and who currently has an absurd teddy-bear haircut. Her defining behavioural trait, bless her, is anxiety.
Image credits: Getty (upper image); the author (lower image)
As an applied ethologist, I’m additionally very interested in the KINDS of questions in the survey. I participated at the time and reviewed the questions two days ago. Frankly, it’s logical to me that, even without the NOISE you describe, we would not expect to observe much difference between breeds in the behaviors asked about in the study, because the vast majority (literally almost all) of the questions aren’t targeting behaviors that had selective pressure on them. Dogs are different because of the selective pressure humans have historically put on their behaviors for particular functions, so questions that get at THOSE distinctions are the ones I would expect to show variation in the answers. The questions they did ask about breed/clade-specific motor patterns (especially relative to modifications to the predatory sequence) DID demonstrate variation. But that was brushed over in the conclusions. Since there was no selective pressure on behaviors like paw crossing, circling before pooping, licking a bowl after a meal, or being scared of strangers (no one ever deliberately bred dogs to display these behaviors), why would we expect to observe variation? Had the questions targeted the meaningful differences we know about between genetic groups of dogs, I think we would have seen much stronger trends in the data, even with the noise you mention. Can’t wait for your full piece!
Informative as always Stuart! As an aside your (very sweet looking) dog is almost identical to my pal Tommy’s dog, which I believe is a cockapoo as well.