How statistics weakened mRNA’s predictive power
The Scientist - 05/22/2017
Most biologists would likely have nodded at this conclusion and read on, but to bioengineer Nikolai Slavov of Northeastern University in Boston, the paper’s claim represented a statistical “elephant in the room,” he said. “It was clear to me that this was not consistent with their data from the moment I saw it, and that’s why we decided to reanalyze the data.”
The problem, Slavov said, was that the original study had pooled two kinds of variation: differences in mRNA and protein levels between genes, which can span 1,000-fold or more, and differences in the expression of individual genes between tissues, which are “usually within a 10-fold range.” Analyzing the data en masse in this way had created “a classical Simpson’s paradox,” said Slavov, referring to a statistical phenomenon in which trends apparent in individual sets of data disappear or reverse when the sets are pooled.
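To see how pooling can manufacture a strong correlation, consider a minimal sketch with made-up numbers. Nothing here comes from the study: the three genes, four tissues, baseline levels, and per-tissue offsets are all hypothetical, chosen only so that gene-to-gene differences span about 1,000-fold while tissue-to-tissue differences stay well under 10-fold. The offsets are also deliberately constructed so that, within any one gene, mRNA and protein are uncorrelated across tissues.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical log10 abundances, invented for illustration (not study data).
# Baseline (mRNA, protein) level per gene; baselines span 3 log10 units,
# i.e. about 1,000-fold between genes.
gene_baselines = {"geneA": (1.0, 2.0), "geneB": (2.5, 3.5), "geneC": (4.0, 5.0)}

# Per-tissue deviations from each gene's baseline, in log10 units.
# They span 0.6 units (about 4-fold) and are uncorrelated by construction.
mrna_offsets = [-0.3, -0.1, 0.1, 0.3]
protein_offsets = [0.1, -0.3, 0.3, -0.1]

within_gene_r = {}
pooled_mrna, pooled_protein = [], []
for gene, (mrna_base, protein_base) in gene_baselines.items():
    mrna = [mrna_base + d for d in mrna_offsets]
    protein = [protein_base + d for d in protein_offsets]
    # Correlation across tissues for this one gene
    within_gene_r[gene] = pearson(mrna, protein)
    pooled_mrna += mrna
    pooled_protein += protein

# Correlation when every gene-tissue pair is thrown into one data set
pooled_r = pearson(pooled_mrna, pooled_protein)

print(f"pooled r = {pooled_r:.2f}")
for gene, r in within_gene_r.items():
    print(f"{gene}: within-gene r = {r:.2f}")
```

In this toy data set the pooled correlation comes out near 0.97, driven almost entirely by the large differences between genes, while the correlation across tissues for each individual gene is essentially zero. A high pooled value therefore says little about whether a gene's mRNA level predicts how its protein level changes from one tissue to another.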