The New York Times goes overboard on cleaning up the ocean
The latest concerns about an ocean cleanup affecting marine life are based on questionable science
Helping old ladies across the road.
Giving all your money to charity.
Cleaning tons of garbage out of the ocean.
You might think that none of these altruistic acts would be super-controversial, but you’d be wrong. The last one (specifically, the garbage-cleaning project run by The Ocean Cleanup, the Dutch organisation that invents and develops massive new machines and technologies to dredge plastic pollution out of the sea) might seem cool and inspiring to you (and me!), but it’s come in for heavy criticism. And the latest critical firestorm (or maybe, typhoon) is all about neuston.
Won’t somebody please think of the neuston?
You know the Great Pacific Garbage Patch? The enormous area of ocean between California and Hawaii where currents have conspired to collect a titanic amount of plastic debris? Well, it’s become home to an under-studied form of life. These neuston are organisms that float on the surface of the water: in the ocean, that includes snails, jellyfish-like animals, and other weird aquatic creatures that defy easy description. The new, and perhaps surprising, criticism is that cleaning up the rapidly-accumulating plastic could mean damaging this precious ecosystem.
The claims about the Garbage Patch come from a new preprint, posted on bioRxiv by a team led by the University of North Carolina biologist Rebecca Helm. The researchers—first author Fiona Chong—report that:
[w]e found increased concentrations of floating life in the [North Pacific Garbage Patch] and a significant positive relationship between the abundance of floating life and floating plastic. Further, neuston densities in the [North Pacific Garbage Patch] are among the highest ever described.
So it’s not just that there’s some life there - it’s abundant. More plastic, more life. The “shocking” findings were described by Helm in a viral Twitter thread:
Which led to headlines like these:
Host Of Wildlife Found Living It Up In The Great Pacific Garbage Patch (IFL Science)
Hooray! The Great Pacific Garbage Patch Has Become a Thriving Ecosystem, Scientists Say (Futurism)
Most prominent was an article in the New York Times last week, entitled:
The Ocean’s Biggest Garbage Pile Is Full of Floating Life
I don’t pretend to understand this area of oceanic science in a deep (no pun intended) sense. And I don’t pretend to be able to adjudicate the debates about the best way to clean up the ocean (it seems everyone agrees that stopping the garbage from getting into the sea in the first place would be the best idea - including The Ocean Cleanup, who are also developing tech to do just this). But something I do know a little bit about is statistics and measurement. And that’s what this particular debate hinges on.
Water way to have a good time
First: measurement. How did we get any of these data? As Helm’s tweet said, someone had to swim (and, okay, boat) through the Garbage Patch and collect them. A solo diver, Ben Lecomte, followed a mathematical model that was designed to predict, from wind and wave motion, the locations of the biggest collections of garbage. He and his “Vortex Swim” team then scooped up whatever was there with a trawling net, dragging it through the water for 30 minutes at a time.
How were the neuston and plastic measured? The 22 samples were returned to the boat, sieved, and the solid contents dropped into a bucket. Photos were taken of the contents of each bucket. Two members of Helm’s team (one was Helm herself) counted all the objects they could see in each photo - that included neuston and plastic, but also fish, bits of wood and rope, and other unidentified objects.
Aside from how fun this sounds, you can probably already see all the potential for measurement error here: the net-dragging might not have happened at precisely the point the model described; the model itself might’ve been off in the first place; the analysis of the photos is subject to human fallibility; and the two raters might not have agreed on every single observation. They don’t report what the level of agreement was, even though there are statistical ways to quantify this, and they don’t say what they did if there was any disagreement (take an average, maybe?).
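To give a flavour of what "quantifying agreement" could look like for two people counting objects in the same photos, here is a minimal sketch using Lin's concordance correlation coefficient. This is my choice of statistic for illustration, not something the preprint reports, and the counts below are made up.

```python
from statistics import fmean


def concordance_ccc(rater_a, rater_b):
    """Lin's concordance correlation coefficient for two raters.

    Unlike a plain Pearson correlation, this is penalised when one rater
    systematically counts more than the other, not only when the two
    sets of counts fail to rise and fall together.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(rater_a)
    mean_a, mean_b = fmean(rater_a), fmean(rater_b)
    # Population (divide-by-n) variances and covariance, as in Lin's definition.
    var_a = sum((a - mean_a) ** 2 for a in rater_a) / n
    var_b = sum((b - mean_b) ** 2 for b in rater_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rater_a, rater_b)) / n
    denom = var_a + var_b + (mean_a - mean_b) ** 2
    if denom == 0:
        raise ValueError("both raters gave the same constant value")
    return 2 * cov / denom


# Hypothetical counts of objects in six photos (NOT data from the preprint).
counts_rater_1 = [12, 40, 7, 95, 23, 3]
counts_rater_2 = [10, 44, 7, 80, 25, 5]

print(round(concordance_ccc(counts_rater_1, counts_rater_2), 3))
```

A value near 1 would mean the two raters are close to interchangeable; a noticeably lower value would be a reason to worry about the counts before any further analysis.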
There are other methods that don’t rely on photos. A 2020 study by scientists from The Ocean Cleanup took samples from the Garbage Patch back to dry land, where researchers painstakingly handpicked every plastic piece and weighed and measured them in the lab. Among their samples, they used some from the Vortex Swim trawls - the exact ones used by Helm and her team (as we saw, Helm had 22 samples; the Vortex Swim took 44 overall but, frustratingly, photos weren’t available for half of them).
It feels to me like the laboratory method would be more reliable than the method that involves inspecting photos (though hardly infallible). And here’s the thing: when you use the lab-counting method, you find substantially more plastic in the very same samples than when you use the photo-spotting method - we’re talking about the amount found in the lab in the Ocean Cleanup study sometimes being double or triple the amount spotted in the photos in Helm’s research.
So already we have a question mark, before we’ve even done any statistical analysis: there seems to be major disagreement about the amount of plastic in these samples. The 2020 Ocean Cleanup study didn’t measure neuston - and as far as I know nobody has analysed the same samples in both ways so we can compare them.
But The Ocean Cleanup have measured neuston in a different study - and comparing Helm’s data to theirs is what we’ll do next.
Garbage Patch in, Garbage Patch out
The other Ocean Cleanup study—the one that looked at neuston as well as plastic—is by Egger et al. (2021). They also took various trawls of the water in and around the Garbage Patch—54 of them, to be precise—and took the samples back to their lab in Rotterdam for analysis. We can compare their results to those from Helm’s team. Let’s focus on two of the main claims made in Helm’s new preprint:
1. There are more neuston within the garbage patch than outside it;
2. The amount of neuston correlates positively with the amount of plastic.
Claim 1 is pretty straightforward: just a test of the mean difference between the neuston in the Patch and the neuston outside of it (by the way, just as a “sanity check”, both studies found that there was more plastic in the Garbage Patch than outside of it. Phew).
In both studies, there’s a big inside-outside difference in the amount of neuston. But wait! The two differences run in opposite directions. The lab-based data from the Ocean Cleanup find more neuston outside of the Patch; Helm’s data from photo observations find more within it.
Oh dear. What about Claim 2, about the plastic/neuston correlation? I plotted the data from each of the five types of neuston that Helm’s team identified (plus a sixth graph for the total of all the neuston types), as a function of how much plastic there was in each sample:
It’s not super-convincing, is it? There’s one outlying sample that has much higher values than the others, which helps drive a massively-strong positive correlation between total neuston and plastic (r = .93, so that’s an R^2 of 86%). You can also see that the whole thing is mainly driven by two particular neuston species, V. velella (which is a strange little animal sometimes known as a “sea raft” or a “by-the-wind sailor”) and J. janthina (which is a kind of sea snail) - there were relatively few of the others. In the preprint, they show that after removing the outlier, or transforming the data so that the distributions are a bit more “normal”, the correlation remains (it becomes .49 and .77, respectively). But you can’t get away from the fact that N = 22 is going to give you a very unstable correlation: just a few extra points could pull it in quite a different direction.
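To make the "unstable with N = 22" point a bit more concrete, here is a rough sketch of two things: how a single extreme sample can manufacture a huge correlation, and how wide the uncertainty around a correlation is at this sample size. The plastic and neuston numbers are invented for illustration (they are not the preprint's data), and the interval uses the textbook Fisher z approximation, which assumes roughly bivariate-normal data, an assumption count data like these may well violate.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length lists with at least 3 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for r via the Fisher z-transform."""
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Invented example: seven unremarkable samples plus one extreme one.
plastic = [5, 8, 12, 15, 20, 22, 30, 300]
neuston = [40, 10, 35, 5, 50, 20, 30, 900]

r_all = pearson_r(plastic, neuston)
r_trimmed = pearson_r(plastic[:-1], neuston[:-1])
print(f"with outlier: r = {r_all:.2f}, R^2 = {r_all ** 2:.2f}")
print(f"without outlier: r = {r_trimmed:.2f}")

# The correlation the preprint reports after dropping its outlier, at N = 22.
low, high = fisher_ci(0.49, 22)
print(f"approx. 95% CI for r = .49, N = 22: {low:.2f} to {high:.2f}")
```

Under those assumptions, the interval around r = .49 runs from roughly .09 to roughly .76: compatible with almost no relationship or with a very strong one.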
We should also be instinctively wary of big correlations (even the .49 with the outlier removed is pretty chunky), especially with all the potential for measurement error that we discussed above. The world is a very complicated place, and if fuzzy measurements correlate together so strongly, it’s often a signal that you have a result that’s too good to be true.
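One way to put that wariness on a slightly firmer footing is the classical attenuation formula from psychometrics. It isn't used in the preprint, and it assumes the errors in the two measurements are random and unrelated to each other:

```latex
r_{\text{observed}} = r_{\text{true}} \, \sqrt{\rho_{xx'} \, \rho_{yy'}}
```

Here \rho_{xx'} and \rho_{yy'} are the reliabilities of the two measures (1 means no measurement error). With made-up reliabilities of 0.7 for both the plastic and the neuston counts, an observed r of .49 would imply a "true" correlation of .49 / 0.7 = .70, and an observed .93 would imply one above 1, which is impossible. If the errors are instead shared between the two measures, as they could be when both counts come from the same photo, the formula doesn't apply and the observed correlation can be inflated rather than shrunk.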
What if we do the exact same thing but for the Egger data? See below. I’ve log-transformed these, just so the graphs are a bit more interpretable, but it’s the same story in the raw data. And… wow! It’s the opposite result, again: in this case, there’s a negative neuston-plastic correlation (less plastic, more neuston). And here, it’s just the Velella that’s driving the r = -.23 correlation (R^2 = 5%). To me, that’s a more plausible correlation size - though it’s still relying on a pretty small sample overall.
It’s also worth noting the average levels. Helm’s team note in their preprint that Egger’s team found twice or three times the plastic (similar to the other Ocean Cleanup paper we saw above), but 75 times less neuston. That’s a lot! What could explain it? One thing Helm suggests is that it could be seasonal: the studies took their samples at different times of the year, and since we don’t know that much about neuston, it could easily wax and wane through the months, with Egger running their trawls at times with less neuston *[Edit 17 May 2022: Mattias Egger writes to note that there are data from the same season as Helm in his paper, and there’s still a difference - so this is unlikely to be the explanation]. But that’s just a post hoc explanation. It’s hard to disagree with Egger et al., when they say:
We advocate for a more systematic assessment based on repeated and standardized monthly sampling of specific locations across the eastern North Pacific Ocean over multiple years. Furthermore, we stress that cleanup technologies need to closely monitor neuston bycatch…
The stories are hard to reconcile. If you read the new preprint (or its media coverage), you might get the impression that all that plastic is good for the neuston. Read The Ocean Cleanup’s paper from last year, and you’ll want to get rid of the plastic to save them. I think it’s clear, though, that The Ocean Cleanup have higher-quality data: not only are there more samples (though not that many more), but the lab measurement and generally more detailed annotation mean that the results are likely more reliable. It’s also clear, per the quote above, that The Ocean Cleanup do seem to be concerned with the effects of their techniques on neuston - concerned enough to publish research papers on it, even before the “shocking” discoveries of this new preprint.
Turning and turning in the widening gyre
There’s something very weird about this whole story. This isn’t about plucky environmental campaigners criticising ExxonMobil or some other big corporation for dumping pollution into the oceans. The apparent “baddies” in this case are from a nonprofit who have devoted incredible resources and a great deal of creative thinking to cleaning up the environment. And remember, the reason they’re doing this is the threat that ocean plastics pose to all sorts of creatures, not just neuston.
Maybe it’s the counter-intuitive nature of the story that draws people in - the tragedy of people trying to do something nice but instead ruining everything, like Lennie killing the mouse by petting it too hard.
However—leaving aside any other criticism of The Ocean Cleanup, which I’m not covering here—there’s very little evidence of this kind of tragedy in this new preprint. This is an N = 22 study that conflicts strongly with other, higher-quality data. It should be of interest to scientists who work directly on this topic, but it’s very questionable whether it should be broadcast to the entire world via coverage in the New York Times.
As so often in these stories, apportioning blame for the media hype is tricky: yes, Helm tweeted about her “shocking” findings, but she didn’t press-release the preprint, and she told me that she hadn’t intended for it to be picked up by the press. The New York Times did some due diligence by getting some balancing quotes from one of The Ocean Cleanup’s scientists; they also, quite rightly, described Helm’s study as “severely limited in size and scope”.
But that just brings us back to the same question: why cover it, then? Of all the science we could be hearing about, why focus on a tiny, “severely limited” study? Both sides of this debate agree that it’s important for cleanup efforts to keep an eye on unintended consequences. But to raise a specific claim like this to the world’s attention, it surely needs to be based on solid data. After seeing the data above, I hope you’ll agree that there’s barely enough here to cause a ripple - let alone a splash.
Acknowledgements: I’m very grateful to Rebecca Helm, Fiona Chong, Matthew Spencer, and Boyan Slat for answering my questions and for sharing their data.
Image credits: Getty (photos); the author (graphs)
Edit 17 May 2022: Mattias Egger writes with two corrections, both of which I’ve made above: 1) I said that most of the plastic in the Garbage Patch was small particles. This was wrong: the small particles make up the biggest number of plastic objects, but the biggest mass of plastic is large objects. I’ve just removed this. 2) Although Helm and her colleagues suggested that seasonality might explain the results, Egger notes that in his dataset there were indeed some datapoints from the same season as Helm’s data. I’ve added a note to make this point in brackets after an asterisk.
My introduction to the replication crisis in psychology was years ago when I started hearing about IQ-raising brain-training games. After that whole mess, in which a huge industry was born based on those studies, I wondered why anyone bothers to write about a study that has not yet been replicated. I prefer to hear about meta-analyses rather than individual studies.
Would it really be so much trouble for NYT to have someone review a preprint before running a story like this? This isn’t even a question of putting too much faith in the peer-review process.
For that matter, they ought to have enough in-house expertise to know not to read too much into a correlation with that small a sample. This is a wealthy news org that promises paying subscribers (including me) high-quality science journalism. Argh!