Mathematical Model Could Help Correct Bias in Measuring Bacterial Communities
Researchers from North Carolina State University have developed a mathematical model that shows how bias distorts results when measuring bacterial communities through metagenomic sequencing. The proof-of-concept model is a first step toward calibration methods that could make metagenomic measurements more accurate.
Metagenomic sequencing identifies the number and type of bacteria present in a particular community – for example, in a human gut microbiome – through DNA extracted from the sample. “We’re measuring communities of bacteria – which ones are present and how many of each one are there,” says Ben Callahan, assistant professor of population health and pathobiology and corresponding author of a paper describing the work. “However, the measurement technology isn’t perfect, which introduces bias into the results. And that means we don’t get an accurate picture of the community we’re trying to measure.”
According to Callahan, because metagenomic sequencing is a multi-step process, bias can be introduced at every step.
“The most well-known step is DNA extraction, where we break open the bacteria to get to the DNA,” Callahan says. “The cells of some bacteria are harder to break open than others. Let’s say I have a bacterium that makes up half of the community but doesn’t break very well. I could end up with only 10% of this bacterium in my measurement, instead of the 50% that is actually there. That introduces bias. Now every measurement or calculation I do from that point onward is systematically skewed.”
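Callahan's 50%-versus-10% example can be reproduced with a few lines of arithmetic. The sketch below is an illustration only, not the authors' code. It assumes a multiplicative form of bias: each taxon's true proportion is multiplied by a taxon-specific efficiency, and the results are rescaled to sum to 1. The taxon names, the efficiency values and the function name `measure` are hypothetical, chosen so that the numbers match the quote.

```python
def measure(actual, efficiency):
    """Simulate a biased measurement of a community.

    Assumption (not stated in this article): bias is multiplicative, so each
    taxon's true proportion is scaled by its efficiency and then renormalized.
    """
    weighted = {taxon: actual[taxon] * efficiency[taxon] for taxon in actual}
    total = sum(weighted.values())
    return {taxon: value / total for taxon, value in weighted.items()}


# Hypothetical two-taxon community: each taxon is truly half of the sample.
actual = {"hard_to_lyse": 0.5, "other": 0.5}

# Hypothetical efficiencies: the second taxon yields 9 times more DNA per cell.
efficiency = {"hard_to_lyse": 1.0, "other": 9.0}

observed = measure(actual, efficiency)
for taxon, proportion in observed.items():
    print(f"{taxon}: {proportion:.2f}")
```

Under these assumptions the hard-to-break bacterium appears at about 10% of the measured community, even though it makes up half of the real one. Only the ratio between the two efficiencies matters here, because the proportions are rescaled after weighting.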
Callahan, with NC State postdoctoral researcher Michael McLaren and biostatistician Amy Willis from the University of Washington, tested their model of bias against two types of metagenomic sequencing – 16S rRNA gene and shotgun metagenomics – in microbial communities of known composition, and found that the model accurately described bias in those circumstances.
“What this experiment shows is that the model we propose works in at least these limited circumstances,” Callahan says. “The long-term goal is to provide a calibration tool for metagenomic measurements of complex natural communities, just as we have standards that we use to calibrate measurement technologies like scales, oscilloscopes and microscopes. This work is a first step toward that.”
The research appears in eLife. McLaren is first author.
Note to editors: An abstract follows.
“Consistent and correctable bias in metagenomic sequencing experiments”
Authors: Michael McLaren, Ben Callahan, North Carolina State University; Amy Willis, University of Washington
Published: Sept. 10, 2019 in eLife
Marker-gene and metagenomic sequencing have profoundly expanded our ability to measure biological communities. But the measurements they provide differ from the truth, often dramatically, because these experiments are biased towards detecting some taxa over others. This experimental bias makes the taxon or gene abundances measured by different protocols quantitatively incomparable and can lead to spurious biological conclusions. We propose a mathematical model for how bias distorts community measurements based on the properties of real experiments. We validate this model with 16S rRNA gene and shotgun metagenomics data from defined bacterial communities. Our model better fits the experimental data despite being simpler than previous models. We illustrate how our model can be used to evaluate protocols, to understand the effect of bias on downstream statistical analyses, and to measure and correct bias given suitable calibration controls. These results illuminate new avenues towards truly quantitative and reproducible metagenomics measurements.
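The abstract's idea of measuring and correcting bias with calibration controls can be sketched in the same simplified terms. The code below is a toy illustration, not the authors' method or software. It assumes that bias is multiplicative and identical across samples, that a single noise-free control of known composition is available, and that every taxon in the sample also appears in the control. All names and numbers are hypothetical.

```python
def estimate_efficiency(control_actual, control_observed):
    """Estimate relative efficiencies from one control of known composition.

    Each taxon's observed/actual ratio is divided by the ratio of an
    arbitrary reference taxon (the first one listed), because only
    relative efficiencies can be recovered from proportions.
    """
    ratios = {t: control_observed[t] / control_actual[t] for t in control_actual}
    reference = next(iter(ratios.values()))
    return {t: r / reference for t, r in ratios.items()}


def correct(observed, efficiency):
    """Undo multiplicative bias: divide by efficiency, then renormalize."""
    weighted = {t: observed[t] / efficiency[t] for t in observed}
    total = sum(weighted.values())
    return {t: value / total for t, value in weighted.items()}


# Hypothetical control: a known 50/50 mixture that was measured as 10/90.
control_actual = {"hard_to_lyse": 0.5, "other": 0.5}
control_observed = {"hard_to_lyse": 0.1, "other": 0.9}
efficiency = estimate_efficiency(control_actual, control_observed)

# Hypothetical new sample measured with the same protocol.
sample_observed = {"hard_to_lyse": 0.25, "other": 0.75}
corrected = correct(sample_observed, efficiency)
for taxon, proportion in corrected.items():
    print(f"{taxon}: {proportion:.2f}")
```

In this toy case a sample measured at 25% and 75% is corrected to roughly 75% and 25%. Real data are noisy and controls rarely contain every taxon of interest, so this should be read as a picture of the idea rather than a working calibration procedure.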