The Reason for the Replication Crisis in Neuroscience in the Stats Study

The ability to image the inside of our skulls has advanced neuroscience research in the same way that Ignaz Semmelweis's germ theory advanced hospital practice. In short, it has proven a revolutionary technology that has greatly improved our understanding of the brain and cognition.

A wide-ranging analysis has now raised significant doubt about the statistical validity of one class of neuroimaging studies. These studies, which had been seen as the next major step for neuroimaging, are a series of efforts to link specific signatures in brain scans to complex psychiatric symptoms and states.

Issues with association

The findings, published in Nature, suggest that large numbers of so-called brain-wide association studies (BWAS) may be statistically underpowered. The study focuses on the reproducibility of linking neuroimaging measures with complex behavioral phenotypes, in much the same way that geneticists focus on linking genes to similarly complex phenotypes, according to Dr. Scott Marek, a co-author of the study and an instructor in the department of psychiatry at the Washington University School of Medicine.

Several of the most powerful findings in this area have linked general cognitive systems, like memory, to brain regions that house neurons controlling said processes. Other studies have shown that changes in blood oxygenation mirrored the activation of certain brain areas in response to behavioral tasks.

BWAS research, by contrast, tries to tie a biological signature to traits that are notoriously variable. It would be a medical marvel to discover a brain that doesn't use hippocampal structures in memory recall, but the links that BWAS looks for are far less universal. This means that the size of the association involved is significantly smaller.

Lifting research by the bootstraps

Marek and his colleagues used a statistical technique called bootstrapping to create a slew of virtual datasets, ranging in size from the small samples more commonly used (n=25) to collections of thousands of scans.

Marek's team then modeled how accurate and reproducible the results from each of these hypothetical datasets were. Their findings suggest that current BWAS approaches may require a seismic shift to ensure their data are reliable.
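To make the resampling idea concrete, here is a minimal Python sketch. It is not the team's actual analysis: the synthetic "population", the true correlation of 0.1, the sample sizes and the number of resamples are all assumptions chosen purely for illustration. It draws repeated samples with replacement at two sizes and compares how much the estimated brain-behavior correlation varies from one virtual dataset to the next.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def make_population(size, true_r, rng):
    """Synthetic (brain measure, behavior) pairs with a weak true correlation.

    Hypothetical stand-in for a large scan dataset; true_r is an assumption.
    """
    population = []
    for _ in range(size):
        x = rng.gauss(0, 1)
        y = true_r * x + (1 - true_r ** 2) ** 0.5 * rng.gauss(0, 1)
        population.append((x, y))
    return population


def resampled_correlations(population, n, n_resamples, rng):
    """Draw n_resamples virtual datasets of size n (with replacement)."""
    results = []
    for _ in range(n_resamples):
        sample = rng.choices(population, k=n)  # sampling with replacement
        xs, ys = zip(*sample)
        results.append(pearson_r(xs, ys))
    return results


rng = random.Random(0)
population = make_population(size=5000, true_r=0.1, rng=rng)

small = resampled_correlations(population, n=25, n_resamples=300, rng=rng)
large = resampled_correlations(population, n=2000, n_resamples=300, rng=rng)

spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)

print(f"n=25:   correlations range {min(small):.2f} to {max(small):.2f}")
print(f"n=2000: correlations range {min(large):.2f} to {max(large):.2f}")
print(f"spread (std dev): n=25 {spread_small:.3f}, n=2000 {spread_large:.3f}")
```

In this toy setup, the small virtual datasets give correlation estimates that swing widely around the true value, while the large ones cluster tightly, which is the pattern that makes small studies hard to reproduce.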

Inflated effect sizes

The types of associations studied in BWAS were extremely susceptible to being inflated by chance. This meant that the smaller studies simulated by Marek's team were largely irreproducible. Only once the experiments were modeled with thousands of brain scans did the inflated effect sizes begin to decline.

Small BWAS tests, developed to detect minuscule potential effects with fewer scans, are highly likely to be irreproducible. Moreover, the authors argue that in these small studies, the very findings that are most inflated by chance and least reliable are the ones most likely to be found statistically significant and make it to publication.
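The selection effect described here can be illustrated with a short simulation. The sketch below is an illustration under stated assumptions, not the authors' method: the true correlation (0.1), the per-study sample size (25), the number of simulated studies and the approximate significance cutoff are all hypothetical values chosen for the example. It simulates many small studies of the same weak effect and then looks only at those that cross the significance threshold.

```python
import random

TRUE_R = 0.1         # assumed true brain-behavior correlation (illustrative)
N_PER_STUDY = 25     # assumed small-study sample size
N_STUDIES = 5000     # number of simulated studies
# Approximate |r| needed for two-sided p < 0.05 with n = 25 (df = 23).
R_CRITICAL = 0.396


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_study(n, true_r, rng):
    """Return the observed correlation from one simulated small study."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [true_r * x + (1 - true_r ** 2) ** 0.5 * rng.gauss(0, 1) for x in xs]
    return pearson_r(xs, ys)


rng = random.Random(1)
all_rs = [simulate_study(N_PER_STUDY, TRUE_R, rng) for _ in range(N_STUDIES)]

# Keep only the studies that would be reported as statistically significant.
significant = [r for r in all_rs if abs(r) >= R_CRITICAL]

mean_all = sum(all_rs) / len(all_rs)
mean_sig_abs = sum(abs(r) for r in significant) / len(significant)

print(f"true correlation:                 {TRUE_R:.2f}")
print(f"mean estimate, all studies:       {mean_all:.2f}")
print(f"mean |estimate|, significant only: {mean_sig_abs:.2f}")
print(f"share of studies significant:     {len(significant) / N_STUDIES:.1%}")
```

Averaged over every simulated study, the estimate sits close to the true value. The significant subset, by construction, contains only estimates several times larger than the true effect, which is why a literature built from significant small studies overstates the association.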

Marek is careful not to throw the baby out with the bathwater. Other neuroimaging methods are much more reliable at smaller sample sizes. This difference in reproducibility, or lack thereof, is not the result of the correlational nature of these studies. Some neuroimaging studies (e.g., basic brain mapping of specific functions, task-induced effects, etc.) do not fall under the umbrella of BWAS.

Increasing sample sizes is a clear, if difficult, way to make BWAS more reproducible, according to Marek. Large datasets such as the ABCD, HCP and UK Biobank studies show what that scale looks like. Comparable sample sizes could also be reached by aggregating data across several labs, he says.

Following the GWAS journey

Genome-wide association studies (GWAS), which link traits to specific genetic signatures, have been on their own replication journey during the 21st century. In response to these issues, and helped by the falling cost of genomic sequencing, they have been able to vastly expand their sample numbers into the millions. It is not immediately clear how that approach is compatible with current BWAS practices, where small labs, operating on tight budgets, use a median sample size of 23.

Marek proposes that clearer reporting of effect sizes, regardless of their statistical significance, would help clarify the reasons for irreproducibility.

Marek is keen to stress that BWAS can be valuable, provided that the sample is adequate. There is still a lot to learn about how the brain relates to complex phenotypes.

Marek believes that researchers who want to conduct a BWAS should do so only in the largest available dataset and report all effect sizes.
