Photo credit: ioat / Shutterstock.com
The mystery of learning is as old as philosophy: how does the nearly insensate newborn become, within just a few years, a walking, talking, intelligent being?
In his new book entitled How We Learn: Why Brains Learn Better Than Any Machine . . . for Now, Stanislas Dehaene illuminates the great strides of the last few decades in understanding learning. To explain these complex discoveries, Dehaene, a leading cognitive neuroscientist and professor at the Collège de France, relies on a strategy as old as Plato: if you’re trying to teach something new, use a simpler model people already grasp. The fascinating assumption guiding Dehaene’s book is that the omnipresence of contemporary artificial intelligence means we understand computers better than ourselves—that our familiarity with machine learning can be used as a model to understand more clearly our own abilities.
The idea that we can use machines to understand ourselves is not new; the historian Jessica Riskin’s 2016 The Restless Clock highlighted traditions from the 17th century onwards that explained organic beings as animated by springs, gears, levers, and pulleys. And the neuroscientist Matthew Cobb’s excellent new book, The Idea of the Brain, provides an exhaustive overview of attempts to understand the brain through contemporary technology.
But, for both Riskin and Cobb, the history is in part a cautionary tale about relying too simply on mechanical models; comical misunderstandings and radical oversimplifications are the rule, not the exception. Cobb is especially skeptical of using the computer as a model for the brain; the computational models often involve clever tricks with little biological basis. These concerns are especially important to remember for contemporary neuroscientists: the artificial neural network at the heart of machine learning was initially designed to roughly mimic the interconnected layers of biological neurons performing simple functions in the brain. This can lead to a superficial identification of the two that fails to acknowledge their equally striking differences.
Dehaene aims to skirt these worries by focusing less on a one-to-one mapping between biological and silicon operations and more on using artificial neural nets as functioning models of learning: proofs of principle that distributed processing by many different neurons can result in learning.
He thus defines learning through seven technical definitions cribbed from AI: learning is forming an internal model, optimizing a function, restricting the search space, minimizing errors, and so on.
He explains these technical definitions in terms of the near-ubiquitous convolutional neural network (CNN), the workhorse of deep learning and especially computer vision. The central innovation of CNNs is an architecture modeled on the visual cortex in the primate brain: numerous layers, sometimes hundreds, of small, discrete networks called feature detectors, stacked into a hierarchy of progressively more abstract feature detectors. This architecture acts as an innate bias for interpreting the data. The layers of multiple simple feature detectors operating in parallel bias the network to assume that most equally abstract features are mutually exclusive (a line can’t be both straight and jagged at the same point), and the hierarchy assumes the data is compositional, so complex features are built from simpler ones: lines form edges, edges form shapes, shapes form objects, and so on.
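To make the notion of a feature detector concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from Dehaene's book: the function name `convolve`, the hand-set `VERTICAL_EDGE` kernel, and the tiny `IMAGE` are all hypothetical. In a real CNN the kernel values are learned rather than written by hand, and many such detectors run in parallel across many layers.

```python
def convolve(image, kernel):
    """Slide a small kernel over a 2D image and record its response at each position.

    As in most deep-learning libraries, this is technically cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            # Weighted sum of the image patch under the kernel.
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# Hypothetical hand-set detector: responds where dark (0) meets bright (1), left to right.
VERTICAL_EDGE = [[-1, 1],
                 [-1, 1]]

# Hypothetical 3x4 image: dark on the left half, bright on the right half.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

feature_map = convolve(IMAGE, VERTICAL_EDGE)
```

The resulting feature map is nonzero only in the column where the dark region meets the bright one. Stacking further detectors on top of maps like this is what lets later layers respond to shapes rather than single edges.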
From the perspective of a CNN, learning (for example, in a dog-breed detector) is a matter of minimizing the errors the machine makes when matching inputs (such as an image of a dog) to their correct outputs (the label “Elkhound”). This process begins somewhat randomly, with the network simply guessing different answers to the input. If the guess is wrong, the network goes back and tweaks the values going into and out of each neuron in all the different layers, so that it will make a better guess on the next run. Through millions of examples, the network will generate predictive features that group similar data points together: beagle and hound ears close, but far from shepherd ears. The end result is a recognition system that bears notable similarities to our own minds in both its architecture and performance.
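The guess-then-tweak loop can be sketched in a few lines of Python. This is a deliberately tiny stand-in, not a CNN and not anything taken from the book: a single artificial neuron trained by gradient descent on made-up data. The feature names (ear length, snout length), the numbers in `EXAMPLES`, and the functions `predict` and `train` are all assumptions chosen for illustration.

```python
import math
import random

def predict(weights, bias, features):
    """Return the neuron's guess as a probability between 0 and 1."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-total))

def train(examples, epochs=500, learning_rate=0.5, seed=0):
    """Start from random weights, then nudge them after every guess."""
    rng = random.Random(seed)
    n_features = len(examples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_features)]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # How far off was the guess? Positive means it guessed too high.
            error = predict(weights, bias, features) - label
            # Tweak each weight in the direction that shrinks the error
            # (the gradient of the cross-entropy loss for this neuron).
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical toy data: (ear_length, snout_length) -> 1 for "hound", 0 for "shepherd".
EXAMPLES = [
    ((0.9, 0.4), 1), ((0.8, 0.5), 1), ((0.7, 0.3), 1),
    ((0.2, 0.8), 0), ((0.3, 0.9), 0), ((0.1, 0.7), 0),
]

weights, bias = train(EXAMPLES)
```

After training, `predict(weights, bias, (0.85, 0.35))` lands on the "hound" side of 0.5. A real network repeats the same kind of correction across millions of weights and examples, which is why it needs so much data.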
By introducing learning through CNNs, Dehaene aims to disarm an intuitive skepticism many feel towards treating intelligence as just neural firings and algorithms.
Since Leibniz, philosophers have scoffed at those seeking cognition in the mechanics of the brain. But Dehaene flips this script: he uses machine learning to better understand human cognition.
Neurons in the brain can be understood as computing information and clustering data in ways roughly akin to the other, artificial neural networks around us.
He uses the same technique to connect contemporary AI to what I regard as our best understanding of child development, namely that children come wired to quickly acquire a rough model of the world, and to the Bayesian brain hypothesis that goes with it: neurons at each level in the brain use prior information (much of it innate) to form best guesses about what their sensory inputs mean.
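The idea of a "best guess" built from prior information can be made concrete with Bayes' rule. The sketch below is my own illustration under stated assumptions: the hypotheses, the probabilities, and the function name `posterior` are invented, and nothing here is meant to describe how actual neurons carry out the computation.

```python
def posterior(prior, likelihood):
    """Combine prior beliefs with how well each hypothesis explains the input."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    # Rescale so the updated beliefs sum to 1.
    return {h: p / total for h, p in unnormalized.items()}

# Hypothetical prior: before looking closely, "dog" is the more expected animal.
prior = {"dog": 0.7, "wolf": 0.3}

# Hypothetical likelihoods: how probable the observed input (say, a howl)
# is under each hypothesis.
likelihood = {"dog": 0.2, "wolf": 0.6}

belief = posterior(prior, likelihood)
best_guess = max(belief, key=belief.get)
```

With these made-up numbers the evidence outweighs the prior, and the best guess shifts from "dog" to "wolf". With a stronger prior or weaker evidence it would not.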
Though even the simplest human brains are far more complex than any machine we’ve made, approaching the human mind through AI paradoxically helps demystify how neural firing alone can lead to intelligent behavior: architecture imposes biases, learning adjusts the model to group similar data together, and so on.
While the book goes far in highlighting the commonalities of contemporary neural nets and the human brain, the later chapters turn to what makes human learning superior to machine learning.
Current machines largely imitate only the unconscious capacities of humans: things like image segmentation, object recognition, simulation, word parsing, and grammatical analysis. This is not a fault; learning in large part involves doing things unconsciously, for example turning the cumbersome process of picking out letters into the effortless process of reading. But the problem is that machine learning is only unconscious: a largely blind, undirected search. This is why contemporary machine learning is so inefficient and requires so much data.
Dehaene connects this up with his lifelong theory of consciousness. For decades, he’s been the foremost proponent of the “global workspace of consciousness.” The basic idea is that, while the majority of the processes in the brain are unconscious, some of them can be raised to awareness as needed so that we can focus on them (place them on a workspace where we can tinker with them) in accomplishing some specific task. The global workspace selectively attends to a small subset of processes needed for a task, connects them with disparate sources of data, and integrates them within larger models of the world that can then direct actions.
Many thinkers throw their hands up at this point, asserting machines simply can’t have human-like intelligence. And some doubt we could ever know if a machine is conscious, or whether it matters at all if it is.
Dehaene sharply disagrees; he takes it for granted that neural networks—whether made of cells or silicon—are capable of consciousness; the question he focuses on is why these networks need consciousness: what kinds of functions require the selective attention made possible by a global workspace? More specifically, what capacities will make machines more capable learners?
Dehaene highlights four abilities of humans that make them better learners than the machines we have built: 1) conscious attention, 2) curiosity, 3) teacher-guided education, and 4) sleep.
These four abilities provide more efficient learning, with less blind trial and error and more focused convergence on the most general and predictive features of a domain. Attention allows us to isolate specific features of data needed for performing a task; pairing this with curiosity enables us to focus on what we don’t know and thus what needs more effort from us; doing this while receiving guidance from others ensures we stick with effective strategies and don’t get stuck with seeming shortcuts that fail to generalize; finally, our brains continue to replay these generalizable solutions as we sleep, reinforcing the neurons in the network so they’ll be faster, more efficient, and require less conscious oversight in the future.
Turning these into functional properties of a machine would require building better attentive mechanisms to focus on which outputs matter for some task, comparing outputs with inner simulations, and correcting errors; actively exploring the data without an objective to make general information available for many different tasks; using guidance from others and evaluating others’ evidence for their claims; and using offline hallucinations of key previously experienced examples to reinforce those concepts and techniques that correlated with success during learning. All these abilities are already being aggressively pursued in AI, not because humans do them, but because, as Dehaene shows, they are functionally useful techniques for improving learning.
The strength of Dehaene’s remarkable book is the mutual illumination of both human and machine learning.
The weakness is his final claim, that contemporary science requires an update to how we educate children. Dehaene helpfully criticizes approaches to education that haven’t panned out, such as the belief that children each have their own learning style, and highlights counterintuitive findings, such as the benefits of regular testing. But these comments are sporadic and not systematized, so his call in the last chapter for education reform isn’t really persuasive.
The text also, somewhat surprisingly, delves little into AI-based approaches to teaching, despite increased interest in, and criticism of, such approaches of late. The big-picture approach also makes little contact with the challenging realities of teaching, especially in a country as riven with inequities as the United States. While teachers will eventually benefit from a more rigorous scientific approach to learning, a reform of educational practices will ultimately require a different, policy-focused book, one making space for teachers’ voices.
Dehaene ends the book with the usual disclaimer that “machines still have a long way to go,” predicting that “the brain keeps the upper hand over machines” for some time. As he notes, it is an open question how far contemporary neural networks can go at mechanically reproducing symbolic reasoning and common-sense understanding.
But throughout the book he is at pains to point out that machines are well on their way; as the subtitle hints, humans aren’t likely to remain the best learners on the planet forever. Advances in machines will not only be milestones in AI but, as Dehaene’s book shows, will also provide us new insights into how our own brains work. It is likely that only when the machines finally outpace us will we understand what makes our intelligence so incredible.
Jacob Browning is a Berggruen/Templeton World Charity Fellow and Post-Doctoral Researcher at NYU Center for Data Science working on the philosophy of AI.
Cobb, M. (2020). The Idea of the Brain: A History. New York: Basic Books.
Dehaene, S. (2020). How We Learn: Why Brains Learn Better Than Any Machine . . . for Now. New York: Penguin Books.
Riskin, J. (2016). The Restless Clock. Chicago: University of Chicago Press.
Author Portrait by Spencer Lowell, 2018.