The Shallow Brain Hypothesis — BrainPost | Easy-to-read summaries of the latest neuroscience publications

Is a neural network a good model of brain function?

The brain is a complex physical system that enables the processing of sensory information, the formation of memories, and the guidance of behavior and cognition. Artificial neural networks were developed to advance the field of machine learning, inspired by our understanding of brain connectivity and function. These networks are now used across scientific and technical applications, including healthcare, scientific research, aerospace engineering, and artificial intelligence, and they are commonly run on graphics processing units (GPUs, the hardware also used in video games).

In neuroscience research, designing neural networks that capture aspects of how the brain processes information has far-reaching implications for both theoretical and experimental understanding. However, whether contemporary neural network techniques adequately capture the complexity and structure of the brain remains under debate.

The complex architecture of the brain

To understand the current debate over how best to design neural networks, we must first understand the basic architecture of the brain.

When sensory information (visual, auditory, taste, touch) travels from the peripheral nervous system into the central nervous system, these signals arrive at subcortical regions and are relayed to a brain region called the thalamus. The thalamus is located deep within the brain, beneath the cortex, and exhibits rich connectivity with both cortical and subcortical regions. Some thalamic regions receive information from subcortical sources and relay it to the cortex (first-order), while others transmit information between cortical regions (higher-order). These higher-order thalamic-cortical dynamics are the subject of much current research, as these signals have been found to be involved not just in sensory processing, but also in attention, arousal, consciousness, and many other cognitive functions.

Higher-order thalamic nuclei receive information from the cortex and transmit information back to it via complex connectivity patterns. Within the cortex, pyramidal neurons are the principal excitatory cells of a given cortical column and receive information from numerous cortical and subcortical sources. There are many local recurrent connections within each cortical column, as well as long-range connections between distant cortical columns across the cortex. As such, the cortex is involved in both primary sensory processing and higher cognitive abilities, and it is strongly interconnected via pyramidal neurons that transmit information to distant cortical, thalamic, and subcortical regions.

While this summary is overly simplistic, the connectivity between subcortical, thalamic, and cortical regions is an essential feature of neural dynamics. However, much remains to be understood about this complex interconnected system.

Hierarchical deep learning neural network models

Early development of neural network models was based on observed connectivity patterns in the visual cortex. Researchers found evidence of hierarchical information processing, from lower to higher cortical areas. Feedforward neural network models are inspired by this architecture and are generally structured with information flowing from input layers, through hidden layers, to output layers.
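As a rough illustration of the input-to-hidden-to-output flow described above, here is a minimal sketch of a feedforward pass in plain Python. The layer sizes, the weight values, and the function names dense and feedforward are hypothetical choices made for this example, not taken from any particular model.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per unit, passed through tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def feedforward(inputs, layers):
    """Pass activity through each layer in order; nothing flows backward."""
    activity = inputs
    for weights, biases in layers:
        activity = dense(activity, weights, biases)
    return activity

# Hypothetical weights: 3 inputs -> 2 hidden units -> 1 output unit.
LAYERS = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]

output = feedforward([1.0, 0.5, -1.0], LAYERS)
print(output)
```

Each layer only sees the output of the layer before it, which is the strictly serial, hierarchical arrangement that the rest of this article contrasts with the brain's connectivity.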

Deep learning methods introduce “learning” into these networks, most commonly through an algorithm known as backpropagation, which enables the model to fine-tune itself. This method requires adjusting weights throughout the network hierarchy, and there is some debate as to how this could be implemented at the speed and scale of the brain’s architecture. Contemporary neural network models often use recurrence, meaning that information flows both forward and backward through the network. A diversity of architectures is used in current neural network modeling, but there is much debate as to whether a primarily hierarchical network design is capable of capturing the computations occurring in the brain.
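To make the weight adjustment concrete, here is a minimal sketch of backpropagation on the smallest possible hierarchy: one input, one hidden unit, and one output. The starting weights, learning rate, target value, and the name train_step are all assumptions chosen for illustration.

```python
import math

def train_step(w1, w2, x, target, lr=0.1):
    """One gradient-descent update for the network y = w2 * tanh(w1 * x)."""
    # Forward pass: input -> hidden -> output.
    h = math.tanh(w1 * x)
    y = w2 * h
    loss = 0.5 * (y - target) ** 2

    # Backward pass: the output error is sent back down the hierarchy.
    dy = y - target                     # error at the output
    grad_w2 = dy * h                    # gradient for the top weight
    dh = dy * w2                        # error passed back to the hidden unit
    grad_w1 = dh * (1.0 - h * h) * x    # gradient for the bottom weight

    return w1 - lr * grad_w1, w2 - lr * grad_w2, loss

w1, w2 = 0.5, 0.5   # hypothetical starting weights
losses = []
for _ in range(200):
    w1, w2, loss = train_step(w1, w2, x=1.0, target=0.8)
    losses.append(loss)

print(losses[0], losses[-1])
```

Note that the update for the bottom weight depends on the top weight and on an error computed at the output. This dependence on non-local information is one reason the biological plausibility of backpropagation is questioned.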

What’s the Shallow Brain Hypothesis?

This potential discrepancy has led to the development of the Shallow Brain hypothesis. The focus of this hypothesis is that the inclusion of the thalamo-cortical and subcortical connectivity patterns of the brain (as opposed to a primarily hierarchical-based network) is essential to model neural dynamics effectively. The primary tenet of this hypothesis is that “hierarchical cortical processing is integrated with a massively parallel process to which subcortical areas substantially contribute.” In other words, the transmission of information from the deep regions of the brain directly to the outer cortex and vice versa, bypassing the hierarchical transmission of information through each layer, is very important to brain function.

The Shallow Brain hypothesis is built on evidence that each cortical column is a highly complex computational unit, specialized to process information through a distinct recurrent architecture. Across the classical cortical hierarchy, these distributed cortical columns comprise a massive array of parallel recurrent networks. Through extensive thalamic-cortical and cortical-subcortical connections, these parallel recurrent networks are integrated with each other to enable flexible and rapid information processing in the brain.
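The idea of parallel recurrent units integrated through a shared subcortical route can be sketched with a toy model. This is only an illustration under strong simplifying assumptions, not the model proposed by the hypothesis: each "column" is reduced to a single unit with a self-connection, the "hub" is a single unit standing in for thalamic and subcortical integration, and the weights w_rec, w_hub, and w_in are hypothetical.

```python
import math

def step(columns, hub, inputs, w_rec=0.5, w_hub=0.4, w_in=1.0):
    """Update every column at once, then update the shared hub."""
    new_columns = [
        # Local recurrence + broadcast from the hub + direct input.
        math.tanh(w_rec * c + w_hub * hub + w_in * x)
        for c, x in zip(columns, inputs)
    ]
    # The hub pools the activity of all columns in a single step.
    new_hub = math.tanh(sum(new_columns) / len(new_columns))
    return new_columns, new_hub

columns = [0.0, 0.0, 0.0, 0.0]
hub = 0.0
inputs = [1.0, 0.0, 0.0, 0.0]   # only the first column receives input

history = []
for _ in range(3):
    columns, hub = step(columns, hub, inputs)
    history.append((list(columns), hub))

for cols, h in history:
    print([round(c, 3) for c in cols], round(h, 3))
```

In this sketch the undriven columns are silent after the first update and all become active on the second, because the signal reaches them through the hub rather than through a chain of intermediate stages. The number of updates needed does not grow with the number of columns, which is the sense in which such an arrangement is "shallow".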

Proposed benefits of the Shallow Brain hypothesis include a more physiologically plausible mechanism for local learning, faster information flow in a parallel rather than serial architecture, and the capture of complex representations and flexible integration of features in network models. Together, these features are proposed to let a Shallow Brain architecture capture the dynamics of information processing in the brain more accurately and realistically than a primarily hierarchical one.

The Shallow Brain hypothesis raises many interesting questions, with implications for neuroscience and computational modeling research:

Are neural networks with primarily cortico-centric designs and theoretical underpinnings missing essential features of the information processing that occurs in subcortical (i.e., deep) regions of the brain?
Are shallow architectures, as proposed in the Shallow Brain hypothesis, able to outperform other architectures in capturing neural dynamics?
Does the thalamus play an essential role in information processing, and does disruption of thalamic activity lead to deficits in learning and other cognitive faculties?
Finally, does the integration of parallel cortical processing occur at a cortical or a subcortical level?

The development of novel hypotheses about how neural networks should be designed has implications for both neuroscientific research and technological applications.

References

Sherman, S.M. The thalamus is more than just a relay. Curr Opin Neurobiol. 2007.

Kumar, V.J., Beckmann, C.F., Scheffler, K., Grodd, W. Relay and higher-order thalamic nuclei show an intertwined functional association with cortical networks. Communications Biology. 2022.

LeCun, Y., Bengio, Y., Hinton, G. Deep learning. Nature. 2015.

Oldenburg, I.A., Hendricks, W.D., Handy, G., Shamardani, K., Bounds, H.A., Doiron, B., Adesnik, H. The logic of recurrent circuits in the primary visual cortex. Nature Neuroscience. 2024.

Voges, N., Lima, V., Hausmann, J., Brovelli, A., Battaglia, D. Decomposing neural circuit function into information processing primitives. Journal of Neuroscience. 2023.

Sherf, N., Shamir, M. Multiplexing rhythmic information by spike timing dependent plasticity. PLoS Computational Biology. 2020.