Monday, December 11, 2017

How to study brains and minds

There is currently a fight going on in cog-neuro whose outcome GGers should care about. It is illuminatingly discussed in a recent paper by Krakauer, Ghazanfar, Gomez-Marin, MacIver and Poeppel (KGG-MMP) (here). The fight is about how to investigate the mind/brain connection. There are two positions. One, which I will call the “Wrong View” (WV) just to have a useful mnemonic, takes a thoroughly reductionist approach to the problem. The idea is that a full understanding of brain function will follow from a detailed understanding of “their component parts and molecular machinery” (480). The contrary view, which I dub the “Right View” (RV) (again, just to have a name),[1] holds that reductionism will not get us nearly as far as we need to go and that a full understanding of how brains contribute to thinking/feeling/etc. requires studying neural implementations in tandem with (and more likely subsequent to) “careful theoretical and experimental decomposition of behavior.” More specifically, “the detailed analysis of tasks and of the behavior they elicit is best suited for discovering component processes and their underlying algorithms. In most cases,…the study of the neural implementation of behavior is best investigated after such behavioral work” (480). In other words, WV and RV differ not over the end game (an understanding of how the brain subvenes the mental mechanisms relevant to behavior) but over the best route to that end. WV thinks that if you take care of the neuronal pennies, the cognitive dollars will take care of themselves. RV thinks that doing so will inevitably miss the cognitive forest for the neural trees and might in fact even obscure the function of the neural trees in the cognitive forest. (God, I love to mix metaphors!!) Of course, RV is right and WV is wrong. I would like to review some of the points KGG-MMP makes in arguing this. However, take a look for yourself. The paper is very accessible and worth thinking about more carefully.

Here are some points that I found illuminating (along with some points of picky disagreement (or, how I would have put things differently)).

First, framing the issue as one of “reductionism” confuses matters. The issue is less reduction than it is a neurocentric myopia. The problem KGG-MMP identifies revolves around the narrow methods that standard practice deploys, not the ultimate metaphysics that it endorses. In other words, even if there is, ontologically speaking, nothing more than “neurons” and their interactions,[2] discovering what these interactions are and how they combine to yield the observed mental life will require well-developed theories of this mental life expressed in mentalistic, non-neural terms. The problem with standard practice, then, is not its reductionism but its methodological myopia. And KGG-MMP recognizes this. The paper ends with an appeal for a more “pluralistic” neuroscience, not an anti-reductionist one.

Second, KGG-MMP gives a nice sketch of how WV has become so prevalent. It provides a couple of reasons. The first has been the tremendous success of “technique driven neuroscience” (481). There can be no doubt that there has been an impressive improvement in the technology available to study the brain at the neuronal level. New and better machines, new and better computing systems, new and better maps of where things are happening. Put these all together and it is almost irresistible to grab for the low-hanging fruit that such techniques bring into focus. Nor, indeed, should this urge be resisted. What needs resisting is the conclusion that because these sorts of data can be productively gathered and analyzed, they suffice to answer the fundamental questions.

KGG-MMP traces the problem to a dictum of Monod’s: “what is true of the bacterium is true of the elephant.” KGG-MMP claims that this has been understood within cog-neuro as claiming that “what is true for the circuit is true for the behavior” and thus that “molecular biology and its techniques should serve as the model of understanding in neuroscience” (481).

This really is a pretty poor form of argument. It effectively denies the possibility of emergence. Here’s Martin Rees (here) making the obvious point:

Macroscopic systems that contain huge numbers of particles manifest ‘emergent’ properties that are best understood in terms of new, irreducible concepts appropriate to the level of the system. Valency, gastrulation (when cells begin to differentiate in embryonic development), imprinting, and natural selection are all examples. Even a phenomenon as unmysterious as the flow of water in pipes or rivers is better understood in terms of viscosity and turbulence, rather than atom-by-atom interactions. Specialists in fluid mechanics don’t care that water is made up of H2O molecules; they can understand how waves break and what makes a stream turn choppy only because they envisage liquid as a continuum.

Single molecules of H2O do not flow. If one is interested in fluid mechanics, then understanding will come only by going beyond the level of the single molecule or atom. Similarly, if one is interested in the brain mechanisms underlying cognition or behavior, then it is very likely that we will need to know a lot about how groups of fundamental neural elements interact, not just how one does what it does. So just as a single bird doesn’t flock, nor a single water molecule flow, nor a single gastric cell digest, so neither does a single brain particle (e.g. neuron) think. We will need more.

Before I get to what more, I should add here that I don’t actually think that Monod meant what KGG-MMP take him to have meant. What Monod meant was that the principles of biology that one finds in the bacterium are the same as those that we find in the elephant. There is little reason to suppose, he suggested, that what makes elephants different from bacteria lies in their smallest parts respecting different physical laws. It’s not as if we expect the biochemistry to change. What KGG-MMP and Rees observe is that this does not mean that all is explained by just understanding how the fundamental parts work. This is correct, even if Monod’s claim is also correct.

Let me put this another way: what we want are explanations. And explanations of macro phenomena (e.g. flight, cognition) seldom reduce to properties of the basic parts. We can completely understand how the parts work without having the slightest insight into why the macro system has the features it does. Here is Rees again on reduction in physics:

So reductionism is true in a sense [roughly Monod’s sense, NH]. But it’s seldom true in a useful sense. Only about 1 per cent of scientists are particle physicists or cosmologists. The other 99 per cent work on ‘higher’ levels of the hierarchy. They’re held up by the complexity of their subject, not by any deficiencies in our understanding of subnuclear physics.

So, even given the utility of understanding the brain at the molecular level (and nobody denies that this is useful), we need more than WV allows for. We need a way of mapping two different levels of description onto one another. In other words, we need to solve what Embick and Poeppel have called the “granularity mismatch problem” (see here). And for this we need to find a way of matching up behavioral descriptions with neural ones. And this requires “fine grained” behavioral theories that limn mental mechanisms (“component parts and sub-routines”) as finely as neural accounts describe brain mechanisms. Sadly, as KGG-MMP notes, behavioral investigation “has increasingly been marginalized or at best postponed” (481-2), and this has made moving beyond the WV difficult. Rectifying this requires treating behavior “as a foundational phenomenon in its own right” (482).[3]

Here is one more quibble before going forward. I am not really fond of the term ‘behavioral.’ What we want is a way of matching up cognitive mechanisms with neural ones. We are not really interested in explaining actual behavior but in explaining the causal springs and mechanisms that produce behavior. Focusing on behavior leads to competence/performance confusions that are always best avoided. That said, the term seems embedded in the cog-neuro literature (no doubt a legacy of psychology’s earlier disreputable behaviorist past) and cannot be easily dislodged. What KGG-MMP intends is that we should look for mental models and use these to explore neural models that realize these mental systems. Of course, we assume that mental systems yield behaviors in specific circumstances, but like all good scientific theories, the goal is to expose the mental causes behind the specific behavior and it is these mental causal factors whose brain realization we are interested in understanding.  The examples KGG-MMP gives show that this is the intended point.

Third, KGG-MMP nicely isolates why neuroscience needs mental models. Or as KGG-MMP puts it: “Why is it the case that explanations of experiments at the neural level are dependent on higher level vocabulary and concepts?” Because “this dependency is intrinsic to the very concept of a “mechanism”.” The crucial observation is that “the components of a mechanism do different things than the mechanism organized as a whole” (485). As Marr noted, feathers are part of the bird flight mechanism, but feathers don’t fly. To understand how birds fly requires more than a careful description of their feathers. So too with neurons.

Put another way, as mental life (and so behavior) is an emergent property of neurons, how neurons subvene mental processes will not be readily apparent from studying neural properties alone, whether singly or collectively.

Fourth, KGG-MMP gives several nice concrete examples of fruitful interactions between mental and neural accounts. I do not review them here save to say that sound localization in barn owls makes its usual grand appearance. However, KGG-MMP provides several other examples as well and it is always useful to have a bunch of these available on hand.

Last, KGG-MMP got me thinking about how GGish work intersects with the neuro concerns the paper raises, in particular minimalism and its potential impact for neuroscience. I have suggested elsewhere (e.g. here) that MP finally offers a way of bridging the granularity gap that Embick and Poeppel identified. The problem, as they saw it, was that the primitives GGers were comfortable with (binding, movement, c-command) did not map well to primitives neuro types were comfortable with. If, as KGG-MMP suggests, we take the notion of the “circuit” as the key bridging notion, the problem with GG was that it did not identify anything simple enough to be a plausible correlate to a neural circuit. Another way of saying this is that theories like GB (though very useful) did not “dissect [linguistic, NH] behavior into its component parts or subroutines” (481). It did not carve linguistic capacity at its joints. What minimalism offers is a way of breaking GB parts down into simpler subcomponents. Reducing macro GB properties to products of simple operations like Merge or Agree or Check Feature promises to provide mental parts simple enough to be neurally interpretable. As KGG-MMP makes clear, finding the right behavioral/mental models matters, and breaking complex mental phenomena down into their simpler parts will be part of finding the most useful models for neural realization.

Ok, that’s it. The paper is accessible and readable and useful. Take a look.

[1] As we all know, the meaning of the name is just what it denotes so there is no semantic contribution that ‘wrong’ and ‘right’ make to WV and RV above.
[2] The quotes are to signal the possibility that Gallistel is right that much neuronal/cognitive computation takes place sub neuronally.
[3] Again, IMO, though I agree with the thrust of this position, it is very badly put. It is not behavior that is foundational but mentalistic accounts of behavior, the mechanisms that underlie it, that should be treated as foundational. In all cases, what we are interested in are the basic mechanisms not their products. The latter are interesting exactly to the degree that they illuminate the basic etiology.

Friday, December 1, 2017

Fodor and Piattelli-Palmarini on Natural Selection

The NYT obit on Jerry Fodor accurately recognizes the important contributions he made to philosophy, psychology and linguistics. The one reservation it notes is the strained reception of his late work on evolution and his critique of Darwin. It accurately notes that Jerry saw the Achilles heel of natural selection theories as residing in their parallels with behaviorism (a parallel, it should be noted, that Skinner himself emphasized). Jerry and Massimo concluded that to the degree the parallels with behaviorism were accurate, this was a problem for theories of natural selection (a point also made obliquely by Chomsky in his review of Skinner at the outset of the cognitive revolution). I think it is fair to say that Jerry and Massimo were hammered for this argument by all and sundry. It is one thing to go after Skinner, quite another to aim to decapitate Darwin (though how much Darwin was a radical selectionist (the real target of Jerry's and Massimo's critique) is quite debatable). At any rate, as a personal tribute to the great man I would like to post here an outline of what I took to be the Jerry/Massimo argument. As I note at the end, it strikes me as pretty powerful, though my aim is not to defend it but to elucidate it, for most of the critiques it suffered did not really engage with their claims (an indication, I suspect, that people were less interested in the argument than in defending against the conclusion).

The content of the post that follows was first published in roughly this form in Biolinguistics. I put it up here for obvious reasons. Jerry Fodor was a great philosopher. I knew him personally but not as well as many of my friends did. I was charmed the few times I socially interacted with him. He was so full of life, so iconoclastic, so funny and so generous (most of his insights he graciously attributed to his grandmother!). I looked up how often I talked about Jerry's stuff on FoL and re-reading these made me appreciate how much my own thinking largely followed his (though less funny and less incisive). So, he will be missed.

So, without further ado, here is a reprise of what I take to have been the Jerry/Massimo argument against Natural Selection accounts of evolution. 


Jerry Fodor and Massimo Piattelli-Palmarini (F&P, 2010) have recently argued (in What Darwin Got Wrong) that the theory of Natural Selection (NS) fails to explain how evolution occurs. Their argument is not with the fact of evolution but with the common claim that NS provides a causal mechanism for this fact. Their claim has been greeted with considerable skepticism, if not outright hostility.[1] Despite the rhetorical heat of much of the discussion, I do not believe that critics have generally engaged the argument that F&P have actually presented. It is clear that the validity of F&P’s argument is of interest to biolinguists. Indeed, there has been much discussion concerning the evolution of the Faculty of Language and what this implies for the structure of Universal Grammar. To facilitate evaluation of F&P’s proposal, the following attempts to sketch a reconstruction of their argument that, to my knowledge, has not been considered.

1. 'Select' is not 'select for', the latter being intensional.[2]    
2. The free rider problem shows that NS per se does not have the theoretical resources to distinguish between ‘select’ and ‘select for.’
3. If not, then how can NS causally explain evolutionary change?
4. There are two ways of circumventing the free rider problem.[3]
   a. Attribute mental powers to NS, i.e. NS as Mother Nature, thereby endowing NS with intentionality and so the wherewithal to distinguish ‘select’ from ‘select for.’
   b. Find within NS laws supporting counterfactuals, i.e. Laws of Natural Selection/Evolution, which would also suffice to provide the requisite intentionality.
5. The first option is clearly nuts, so NS accounts must be presupposing 4b.
6. But NS contains no laws of evolution, a fact that seems to be widely recognized!
7. So NS can't do what it purports to do; give a causal theory that explains the facts of evolution.
8. Importantly, NS fails not because causal accounts cannot be given for individual cases of evolution. They can be and routinely are. Rather the accounts are individual causal scenarios, natural histories specific to the case at hand, and there is nothing in common across the mechanisms invoked by these individual accounts besides the fact that they end with winners and losers. This is, in fact, often acknowledged.  The only relevant question then is whether NS might contain laws of NS/Evolution?  F&P argue that NS does not contain within itself such laws and that given the main lines of the theory, it is very unlikely that any could be developed.
9. Interestingly, this gap (or flaw) in NS is now often remarked on in the biology literature. F&P review some work of this sort in the book. The research they review tends to have a common form in that it explores a variety of structural constraints that, were they operative, would circumscribe the possible choices NS faces. However, importantly, the mechanisms proposed are exogenous to NS; they can be added to it but do not follow from it.
10. If these kinds of proposals succeed then they could be combined with NS to provide a causal theory of evolution. However, this would require giving up the claim that NS explains evolution.  Rather, at most, NS + Structural Theories together explain evolutionary change.[4]
11. But, were such accounts to develop, the explanatory weight of the combined 'NS + Structural Theory' account would be carried by the added structural constraints, not NS. In other words, all that is missing from NS is that part that can give it causal heft, and though this could be added to NS, NS itself does not contain the resources to develop such a theory on its own. Critics might then conclude as follows: this means that NS can give causal accounts when supplemented in the ways indicated. However, this is quite tendentious. It is like saying Newton's theory suffices to account for electro-magnetic effects, for after all Newton's laws can be added to Maxwell's to give an account of EM phenomena!
12. F&P make one additional point of interest to linguists.  Their review and conclusions concerning NS are not really surprising for NS replays the history of empiricist psychology, though strictly speaking, the latter was less nutty than NS for empiricists had a way of distinguishing intentional from non-intentional as minds are just the sorts of things that are inherently intentional.  In other words, though attributing mental intentional powers to NS (i.e. Mother Nature) is silly, attributing such powers to humans is not.

This is the argument.  To be honest, it strikes me as pretty powerful if correct and it does indeed look very similar to early debates between rationalist and empiricist approaches to cognition.  However, my present intention has not been to defend the argument, but to lay it out given that much of the criticism against the F&P book seems to have misconstrued what they were saying.

[1] See, for example: Massimo Pigliucci. 2010. A misguided attack on evolution. Nature 464; Ned Block and Philip Kitcher. 2010. Misunderstanding Darwin. Boston Review of Books, 35(2); Futuyma, D. 2010. Two critics without a clue. Science, 328: 692-93.
[2] Intensional contexts are ones in which extensionally identical expressions are not freely interchangeable.  Thus, if John hopes to kiss Mary and Mary is The Queen of the Night, we cannot conclude that John hopes to kiss the Queen of the Night.
[3] F&P develop this argument in Chapter 6. The classic locus of the problem is S.J. Gould and R.C. Lewontin. The spandrels of San Marco and the Panglossian paradigm: a critique of the adaptationist program. Proceedings of the Royal Society of London, Series B biological sciences, vol 205, 1979, 581-98.
[4] Observe, the supposition that selection is simply a function of “external” environmental factors lies behind the standard claim that NS (and NS alone) explains why evolutionary changes are generally adaptive.  Adding structural “internal” constraints to the selective mix, weakens the force of this explanation.  To the degree that the internal structural factors constrain the domain of selection, to that degree the classical explanation for the adaptive fit between organism and environment fails.

Thursday, November 30, 2017

Jerry Fodor died yesterday

Jerry Fodor was one of the great analytic philosophers of his era, and that is saying a lot. Contemporaries include Hilary Putnam, Robert Nozick, David Lewis, and Saul Kripke. In my opinion, his work will be profitably read for a very long time, and with constant amusement and provocation. Fodor was unique. He never avoided the deep issues, never avoided a good joke or trenchant jab. He always saw to the nub of the matter. And he was a philosopher whose work mattered for the practice of science. He made innumerable contributions to cognitive psychology (most of whose points psychologists would profit from reading still) and linguistics (not the least of which is insisting that we never confuse metaphysical issues (what is the case) with epistemological ones (how do we know that what is the case is the case)). He led the good fight against Empiricism (started early and never relented) in all its guises (behaviorism, connectionism, radical versions of natural selection) and his papers are still worth reading again and again and again today. He will be missed. Philosophy will be both duller and less illuminating without his constant contributions.

Friday, November 24, 2017

Repost (from Talking Brains) of a report from the neuro front by William Matchin

Here is a terrific post William Matchin first posted on Talking Brains reviewing some of the highlights of the two big recent cog-neuro meetings. He has kindly allowed me to repost it here for FoLers. FWIW, I find his take on this both heartening and consistent with my own impressions. I do think that there is a kind of appreciation dawning in the cog-neuro of lang community of the utility of the kinds of abstract considerations concerning "competence" that GGers have advocated. At any rate, it is one of those long pieces that you regret are not even longer. 

Abstractness, innateness, and modality-independence of language: reflections on SNL & SfN 2017

Guest post by former student, William Matchin:
It’s been almost 10 years since the Society for the Neurobiology of Language conference (SNL) began, and it is always one of my favorite events of the year, where I catch up with old friends and see and discuss much of the research that interests me in a compact form. This year’s meeting was no exception. The opening night talk about dolphin communication by Diana Reiss was fun and interesting, and the reception at the Baltimore aquarium was spectacular and well organized. I was impressed with the high quality of many of the talks and posters. This year’s conference was particularly interesting to me in terms of the major trending ideas that were circulating at the conference (particularly the keynote lectures by Yoshua Bengio & Edward Chang), so I thought I would write some of my impressions down and hear what others think. I also have some thoughts about Society for Neuroscience (SfN), in particular one keynote lecture: Erich Jarvis, who discussed the evolution of language, with the major claim that human language is continuous with vocal learning in non-human organisms. Paško Rakić, who gave a history of his research in neuroscience, also had an interesting comment on the tradeoff between empirical research and theoretical development and speculation, which I will also discuss briefly.

The notions of abstractness, innateness, and modality-independence of language loomed large at both conferences; much of this post is devoted to these issues. The number of times that I heard a neuroscientist or computer scientist make a logical point that reminded me of Generative Grammar was shocking. In all, I had an awesome conference season, one that gives me great hope and anticipation for the future of our field, including much closer interaction between biologists & linguists. I encourage you to visit the Faculty of Language blog, which often discusses similar issues, mostly in the context of psychology and linguistics.

1. Abstractness & combinatoriality in the brain

Much of the work at the conference this year touched on some very interesting topics, ones that linguists have been addressing for a long time. It seemed that for a while embodied cognition and the motor theory of speech perception were dominant topics, but now it seems as though the tables have turned. There were many presentations showing how the brain processes information and converts raw sensory signals into abstract representations. For instance, Neal Fox presented ECoG data on a speech perception task, illustrating that particular electrodes in the superior temporal gyrus (STG) dynamically encode voice onset time as well as categorical voicing perception. Then there was Edward Chang’s talk. I should think that everyone at SNL this year would agree that his talk was masterful. He clearly illustrated how distinct locations in STG have responses to speech that are abstract and combinatorial. The results regarding prosody were quite novel to me, and nicely illustrate the abstract and combinatorial properties of the STG, so I shall review them briefly here.

Prosodic contours can be dramatically different in frequency space for different speakers and utterances, yet they share an underlying abstract structure (for instance, rising question intonation at the end of a sentence). It appears that certain portions of the STG are selectively interested in particular prosodic contours independently of the particular sentence or speaker; i.e., they encode abstract prosodic information. How can a brain region encode information about prosodic contour independently of speaker identity? The frequency range of speech among speakers can vary quite dramatically, such that the entire range for one speaker (say, a female) can be completely non-overlapping with another speaker (say, a male) in frequency space. This means that the prosodic contour cannot be defined physically, but must be converted into some kind of psychological (abstract) space. Chang reviewed literature suggesting that listeners normalize pitch information by the speaker’s fundamental frequency, thus resulting in an abstract pitch contour that is independent of speaker identity. This is similar to work by Phil Monahan and colleagues (Monahan & Idsardi, 2010) who showed that vowel normalization can be obtained by dividing F1 and F2 by F3.

From Tang, Hamilton & Chang (2017). Different speakers can have dramatically different absolute frequency ranges, posing a problem for how common underlying prosodic contours (e.g., a Question contour) can be identified independently of speaker identity.
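To make the normalization idea concrete, here is a minimal Python sketch. It is only an illustration of the arithmetic, not the analysis from any of the cited papers: the function names, the choice of a single baseline value per speaker, and all the numbers are assumptions made up for the example. It shows how two contours that do not overlap at all in Hz can collapse onto the same shape once each is divided by its speaker's baseline, and how formant ratios (F1/F3, F2/F3) factor out overall scale in the same way.

```python
def normalize_pitch(contour_hz, baseline_f0_hz):
    """Divide each pitch sample by the speaker's baseline F0 (assumed known)."""
    return [f / baseline_f0_hz for f in contour_hz]


def formant_ratios(f1_hz, f2_hz, f3_hz):
    """Express F1 and F2 relative to F3, giving speaker-independent ratios."""
    return (f1_hz / f3_hz, f2_hz / f3_hz)


# Hypothetical rising (question-like) contours for two speakers.
# The absolute ranges (100-130 Hz vs. 200-260 Hz) do not overlap.
low_voice = [100, 100, 110, 130]
high_voice = [200, 200, 220, 260]

low_norm = normalize_pitch(low_voice, baseline_f0_hz=100)
high_norm = normalize_pitch(high_voice, baseline_f0_hz=200)

# After normalization both contours have the same abstract shape.
same_shape = all(abs(a - b) < 1e-12 for a, b in zip(low_norm, high_norm))
print(same_shape)  # True

# Hypothetical formant values for the "same" vowel from two vocal tracts,
# the second scaled up by 20%; the ratios come out (approximately) equal.
print(formant_ratios(500, 1500, 2500))
print(formant_ratios(600, 1800, 3000))
```

The point of the sketch is only that the quantity being compared across speakers is relational rather than physical, which is what makes the resulting representation "abstract" in the sense discussed above.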

Chang showed that the STG also encodes abstract responses to speaker identity (the same response regardless of the particular sentence or prosodic contour) and phonetic features (the same response to a particular sentence regardless of speaker identity or pitch contour). Thus, it is not the case that there are some features that are abstract and others are not; it seems that all of the relevant features are abstract.

From Tang, Hamilton & Chang (2017). Column 1 shows the responses for a prosody-encoding electrode. The electrode distinguishes among different prosodic contours, but not different sentences (i.e., different phonetic representations) or speakers.

Why do I care about this so much? Because linguists (among other cognitive scientists) have been talking for decades about abstract representations, and I think that there has often been skepticism about how the brain could encode abstractness. But the new work in ECoG by Chang and others illustrates that much of the organization of the speech cortex centers around abstraction; in other words, it seems that abstraction is the thing the brain cares most about, doing so rapidly and robustly in sensory cortex.

Two last points. First, Edward also showed that all of the properties identified in the left STG are also found in the right STG, consistent with the claim that speech perception is bilateral rather than unilateral (Hickok & Poeppel, 2000). Thus, it does not seem that speech perception is the key to language laterality in humans (but maybe syntax is; see section 3). Second, the two of us also had a nice chat about what his results mean for innateness and development of these functional properties of the STG. He was of the opinion that the STG innately encodes these mechanisms, and that different languages make different use of this pre-existing phonetic toolbox. This brings me to the next topic, which centers on the issue of what is innate about language.

2. Deep learning and poverty of the stimulus

Yoshua Bengio gave one of the keynote lectures at this year’s SNL. For the uninitiated (such as myself), Yoshua Bengio is one of the leading figures in the field of deep learning. He stayed the course during the dark ages of connectionist neural network modeling, thinking that there would eventually be a breakthrough (he was right). Deep learning is the next phase of connectionist neural network modeling, centered on the use of massive amounts of training data and hidden network layers. Such computer models can correctly generate descriptions of pictures, translate between languages; in sum, things for which people are willing to pay money. Given this background, I expected to hear him say something like this in his keynote address: deep learning is awesome, we can do all the things that we hoped to be able to do in the past, Chomsky is wrong about humans requiring innate knowledge of language.

Instead, Bengio made a poverty of the stimulus argument (POS) in favor of Universal Grammar (UG). Not in those words. But the logic was identical.

For those unfamiliar with POS, the logic is that human knowledge, for instance of language, is underdetermined by the input. Question: You never hear ungrammatical sentences (such as *who did you see Mary and _), so how do you know that they are ungrammatical? Answer: Your mind innately contains the relevant knowledge to make these discriminations (such as a principle like Subjacency), making learning them unnecessary. POS arguments are central to generative grammar, as they provide much of the motivation for a theory of UG, UG being whatever is encoded in your genome that enables you to acquire a language, and what is lacking in things that do not learn language (such as kittens and rocks). I will not belabor the point here, and there are many accessible articles on the Faculty of Language blog that discuss these issues in great detail.

What is interesting to me is that Bengio made a strong POS argument perhaps without realizing that he was following Chomsky’s logic almost to the letter. Bengio’s main point was that while deep learning has had a lot of successes, such computer models make strange mistakes that children would never make. For instance, the model would name a picture of an animal correctly on one trial, but with an extremely subtle change to the stimulus on the next trial (a change imperceptible to humans), the model might give a wildly wrong answer. This is directly analogous to Chomsky’s point that children never make certain errors, such as formulating grammatical rules that use linear rather than structural representations (see Berwick et al., 2011 for discussion). Bengio extended this argument, adding that children have access to dramatically less data than deep learning computer models do, which shows that the issue is not the amount or quality of data (very similar to arguments made repeatedly by Chomsky, for instance, this interview from 1977). For these reasons, Bengio suggested the following solution: build in some innate knowledge that guides the model to the correct generalizations. In other words, he made a strong POS argument for the existence of UG. I nearly fell out of my seat.
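For readers who have not seen the linear-versus-structural contrast spelled out, the following toy Python sketch may help. It is my own illustration, not anything presented in the talk: the example sentence is the standard textbook case of yes/no question formation, and the division of the sentence into subject, auxiliary, and predicate is supplied by hand as an assumption rather than computed by a parser. The "linear" rule fronts the first auxiliary in the string; the "structural" rule fronts the auxiliary of the main clause.

```python
# Toy auxiliary inventory (an assumption for this example only).
AUXILIARIES = {"is", "can", "will"}


def linear_rule(tokens):
    """Front the first auxiliary encountered, ignoring structure."""
    i = next(k for k, tok in enumerate(tokens) if tok in AUXILIARIES)
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]


def structural_rule(subject, aux, predicate):
    """Front the main-clause auxiliary, leaving the subject phrase intact."""
    return [aux] + subject + predicate


# Hand-annotated structure: the subject contains a relative clause
# with its own auxiliary ("who is tall").
subject = ["the", "man", "who", "is", "tall"]
aux = "is"
predicate = ["happy"]
declarative = subject + [aux] + predicate

print(" ".join(linear_rule(declarative)))
# is the man who tall is happy      (the error children do not make)
print(" ".join(structural_rule(subject, aux, predicate)))
# is the man who is tall happy      (the attested form)
```

Both rules give the same result on simple sentences like "the man is happy", which is why the input alone does not obviously favor one over the other; they only come apart once the subject contains its own auxiliary.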

People often misinterpret what UG means. The claim really boils down to the fact that humans have some innate capacity for language that other things do not have. It seems that everyone, even leading figures in connectionist deep learning, can agree on this point. It only gets interesting when figuring out the details, which often include specific POS arguments. And in order to determine the details about what kinds of innate knowledge should be encoded in genomes and brains, and how, it would certainly be helpful to invite some linguists to the party (see part 5).

3. What is the phenotype of language? The importance of modality-independence to discussions of biology and evolution.

The central question that Erich Jarvis addressed during his Presidential Address at this year’s SfN on its opening night was whether human language is an elaborate form of the vocal learning seen in other animals or rather a horse of a different color altogether. Jarvis is an expert on the biology of birdsong, and he argued that human language is continuous with vocal learning in non-human organisms both genetically and neurobiologically. He presented a wide array of evidence to support his claim, mostly along the lines of showing how the genes and parts of the brain that do vocal learning in other animals have closely related correlates in humans. However, there are three main challenges to a continuity hypothesis that were either entirely omitted or extravagantly minimized: syntax, semantics, and sign language. It is remiss to discuss the biology and evolution of a trait without clearly specifying the key phenotypic properties of that trait, which for human language include the ability to generate an unbounded array of hierarchical expressions that have both a meaning and a sensory-motor expression, which can be auditory-vocal or visual-manual (and perhaps even tactile, Carol Chomsky, 1986). If somebody had only the modest aim of discussing the evolution of vocal learning, I would understand omitting these topics. But Jarvis clearly had the aim of discussing language more broadly, and his second slide included a figure by Hauser, Chomsky & Fitch (2002), which served as the bull’s-eye for his arguments. Consider the following a short response to his talk, elaborating on why it is important to discuss the key phenotypic traits of syntax, semantics, and modality-independence.

It is a cliché that sentences are not simply sequences of words, but rather hierarchical structures. Hierarchical structure was a central component of Hauser, Chomsky & Fitch’s (2002) proposal that syntax may be the only component of human language that is specific to it, as part of the general Minimalist approach to try and reduce UG to a conceptual minimum (note that Bengio, Jarvis and Chomsky all agree on this point – none of them want to have a rich, linguistically-specific UG, and all of them argue against it). Jarvis is not an expert on birdsong syntax, so it is perhaps unfair to expect him to discuss syntax in detail. Still, he merely mentioned that some have claimed to identify recursion in birdsong (Gentner et al., 2006), apparently feeling that to be sufficient to dispatch syntax. He did not mention the work debating this issue (Berwick et al., 2012), which illustrates that birdsong has syntax that is roughly equivalent to phonology, but not human sentence-level syntax. This work suggests that birdsong may be quite relevant to human language as a precursor system to human phonology (fascinating if true), but it does not appear capable of accounting for sentence-level syntax. In addition, the main interesting thing about syntax is that it combines words to produce new meanings, which birdsong does not do.

With respect to semantics, Jarvis showed that dogs can learn to respond to our commands, such as sitting when we say “sit”. He suggested that because dogs can “comprehend” human speech, they have a precursor to human semantics. But natural language semantics is far more than this. We combine words that denote concepts into sentences which denote events (Parsons, 1990). We do not have very good models of animal semantics, but a stimulus-response pairing is probably a poor one. It may very well be true that non-human primates have a semantic system similar to ours – desirable from a Minimalist point of view – but it needs to be explored beyond pointing out that animals learn responses to stimuli. Many organisms learn stimulus-response pairings, probably including insects – do we want to claim that they have a semantic system similar to ours?

The most important issue for me was sign language. I do not think Jarvis mentioned sign language once during the entire talk (I believe he briefly mentioned gestures in non-human animals). As somebody who works on the neurobiology of American Sign Language (ASL), this was extraordinarily frustrating (I cannot imagine the reaction of my Deaf colleagues). I believe that one of the most significant observations about human language is that it is modality-independent. As linguists have repeatedly shown, all of the relevant properties of linguistic organization found in spoken languages are found in sign languages: phonology, morphology, syntax, semantics (Sandler & Lillo-Martin, 2006). Deaf children raised by deaf parents learn sign language in the same way that hearing children learn spoken language, without instruction, including a babbling stage (Petitto & Marentette, 1991). Sign languages show syntactic priming just like spoken languages (Hall et al., 2015). Aphasia is similarly left-lateralized in sign and spoken languages (Hickok et al., 1996), and neuroimaging studies show that sign and spoken language activate the same brain areas when sensory-motor differences are factored out (Leonard et al., 2012; Matchin et al., 2017a). For instance, in the Mayberry and Halgren labs at UCSD we showed using fMRI that left hemisphere language areas in the superior temporal sulcus (aSTS and pSTS) show a correlation between constituent structure size and brain activation in deaf native signers of ASL (6W: six word lists; 2S: sequences of three two word phrases; 6S: six word sentences) (Matchin et al., 2017a). When I overlap these effects with similar structural contrasts in English (Matchin et al., 2017b) or French (Pallier et al., 2011), there is almost perfect overlap in the STS. Thus, both signed and spoken languages involve a left-lateralized combinatorial response to structured sentences in the STS. This is consistent with reports of a human-unique hemispheric asymmetry in the morphology of the STS (Leroy et al., 2015).

TOP: Matchin et al., in prep (ASL). BOTTOM: Pallier et al., 2011 (French).

Leonard et al. (2012), also from the Mayberry and Halgren labs, show that semantically modulated MEG activity in pSTS for auditory speech and for sign language is nearly identical in space and time.

All of these observations tell us that there is nothing important about language that must be expressed in the auditory-vocal modality. In fact, it is conceptually possible to imagine that in an alternate universe, humans predominantly communicate through sign languages, and blind communities sometimes develop strange “spoken languages” in order to communicate with each other. Modality-independence has enormous ramifications for our understanding of the evolution of language, as Chomsky has repeatedly noted (Berwick & Chomsky, 2015; this talk, starting at 3:00). In order to make the argument that human language is continuous with vocal learning in other animals, sign language must be satisfactorily accounted for, and it’s not clear to me how it can. This has social ramifications too. Deaf people still struggle for appropriate educational and healthcare resources, which I think stems in large part from ignorance about how sign languages are fully equivalent to spoken languages among the scientific and medical community.

When I tweeted at Jarvis pointing out the issues I saw with his talk, he responded skeptically:

At my invitation, he stopped by our poster, and we discussed our neuroimaging research on ASL. He appears to be shifting his opinion:

This reaffirms to me how important sign language is to our understanding of language in general, and how friendly debate is useful to make progress in understanding scientific problems. I greatly appreciate that Erich took the time to politely respond to my questions, come to our poster, and discuss the issues.

If you are interested in learning more about some of the issues facing the Deaf community in the United States, please visit Marla Hatrak’s blog or Gallaudet University’s Deaf Education resources.

4. Speculative science

Paško Rakić is a famous neuroscientist, and his keynote lecture at SfN gave a history of his work throughout the last several decades. I will only give one observation about the content of his work: he thinks that it is necessary to posit innate mechanisms when trying to understand the development of the nervous system. One of his major findings is that cortical maps are not emergent, but rather are derived from precursor “protomaps” that encode the topographical organization that ends up on the cortical surface (Rakić, 1988). Again, it seems as though some of the most serious and groundbreaking neuroscientists, both old and new, are thoroughly comfortable discussing innate and abstract properties of the nervous system, which means that Generative Grammar is in good company.

Rakić also made an interesting commentary on the current sociological state of affairs in the sciences. He discussed a previous researcher (I believe from the late 1800s) who performed purely qualitative work speculating about how certain properties of the nervous system developed. He said that this research, serving as a foundation for his own work, probably would be rejected today because it would be seen as too “speculative”. He mentioned how the term speculative used to be perceived as a compliment, as it meant that the researcher went usefully beyond the data, thinking about how the world is organized and developing a theory that would make predictions for future research (he had a personal example of this, in that he predicted the existence of a particular molecule that he didn’t discover for 35 years).

This comment resonated with me. I am always puzzled about the lack of interest in theory and the extreme interest in data collection and analysis: if science isn’t about theory, also known as understanding the world, then what is it about? I get the feeling that people are afraid to postulate theories because they are afraid to be wrong. But every scientific theory that has ever been proposed is wrong, or will eventually be shown to be wrong, at least with respect to certain details. The point of a theory is not to be right, it’s to be right enough. Then it can provide some insight into how the world works which serves as a guide to future empirical work. Theory is a problem when it becomes misguiding dogma; we shouldn’t be afraid of proposing, criticizing, and modifying or replacing theories.

The best way to do this is to have debates that are civil but vigorous. My interaction with Erich Jarvis regarding sign language is a good example of this. One of the things I greatly missed about this year’s SNL was the debate. I enjoy these debates, because they provide the best opportunity to critically assess a theory by finding a person with a different perspective who we can count on to find all of the evidence against a theory, saving us the initial work of finding this evidence ourselves. This is largely why we have peer review, even with its serious flaws – the reviewer acts in part as a debater, bringing up evidence or other considerations that the author hasn’t thought of, hopefully leading to a better paper. I hope that next year’s SNL has a good debate about an interesting topic. I also feel that the conference could do well to encourage junior researchers to debate, as there is nothing better for personal improvement in science than interacting with an opposing view to sharpen one’s knowledge and logical arguments. It might be helpful to establish ground rules for these debates, in order to ensure that they do not cross the line from debate to contentious argument.

5. Society for the Neurobiology of …

I have pretty much given up on hoping that the “Language” part of the Society for the Neurobiology of Language conference will live up to its moniker. This is not to say that SNL does not have a lot of fine quality research on the neurobiology of language – in fact, it has this in spades. What I mean is that there is little focus in the conference on integrating our work with people who spend their lives trying to figure out what language is: linguists and psycholinguists. I find great value in these fields, as language theory provides a very useful guide for my own research. I don’t always follow language theory to the letter, but rather take it as inspiration for the kinds of things one might find in the brain.

This year, there were some individual exceptions to this general rule of linguistic omission at the conference. I was pleased to see some posters and talks that incorporated language theory, particularly John Hale’s talk on syntax, computational modeling, and neuroimaging. He showed that the anterior and posterior temporal lobe are good candidates for basic structural processes, but not the IFG – no surprise, but good to see converging evidence (see Brennan et al., 2016 for details). But my interest in Hale’s talk only highlighted the trend towards omission of language theory at SNL, which can be well illustrated by looking at the keynote lectures and invited speakers at the conference over the years.

There are essentially three kinds of talks: (i) talks about the neurobiology of language, (ii) talks about (neuro)biology, and (iii) talks about non-language communication, cognition, or information processing. What’s missing? Language theory. Given that the whole point of our conference is the nature of human language, one would think that this is an important topic to cover. Yet I don’t think there has ever been a keynote talk at SNL about psycholinguistics or linguistics. I love dolphins and birds and monkeys, but doesn’t it seem a bit strange that we hear more about basic properties of non-human animal communication than about human language? Here’s the full list of keynote speakers at SNL for every conference in the past 9 years – not a single talk that is clearly about language theory (with the possible exception of Tomasello, although his talk was about very general properties of language with a lot of non-human primate data).

Michael Petrides: Recent insights into the anatomical pathways for language
Charles Schroeder: Neuronal oscillations as instruments of brain operation and perception
Kate Watkins: What can brain imaging tell us about developmental disorders of speech and language?
Simon Fisher: Building bridges between genes, brains and language

Karl Deisseroth: Optogenetics: Development and application
Daniel Margoliash: Evaluating the strengths and limitations of birdsong as a model for speech and language

Troy Hackett: Primate auditory cortex: principles of organization and future directions
Katrin Amunts: Broca’s region -- architecture and novel organizational principles

Barbara Finlay: Beyond columns and areas: developmental gradients and reorganization of the neocortex and their likely consequences for functional organization
Nikos Logothetis: In vivo connectivity: paramagnetic tracers, electrical stimulation & neural-event triggered fMRI

Janet Werker: Initial biases and experiential influences on infant speech perception development
Terry Sejnowski: The dynamic brain
Robert Knight: Language viewed from direct cortical recordings

Willem Levelt: Localism versus holism. The historical origins of studying language in the brain
Constance Scharff: Singing in the (b)rain
Pascal Fries: Brain rhythms for bottom-up and top-down signaling
Michael Tomasello: Communication without conventions

Susan Goldin-Meadow: Gestures as a mechanism of change
Peter Strick: A tale of two primary motor areas: “old” and “new” M1
Marsel Mesulam: Revisiting Wernicke’s area
Marcus Raichle: The restless brain: how intrinsic activity organizes brain function

Mairéad MacSweeney: Insights into the neurobiology of language processing from deafness and sign language
David Attwell: The energetic design of the brain
Anne-Lise Giraud: Modelling neuronal oscillations to understand language neurodevelopmental disorders

Argye Hillis: Road blocks in brain maps: learning about language from lesions
Yoshua Bengio: Bridging the gap between brains, cognition and deep learning
Ghislaine Dehaene-Lambertz: The human infant brain: A neural architecture able to learn language
Edward Chang: Dissecting the functional representations of human speech cortex

I was at most of these talks; most of them were great, and at least entertaining. But it seems to me that the great advantage of keynote lectures is to learn about something outside of one’s field that is relevant to it, and it seems to me that both neurobiology AND language fit this description. This is particularly striking given the importance of theory to much of the scientific work I described in this post. And I can think of many linguists and psycholinguists who would give interesting and relevant talks, and who are also interested in neurobiology and want to chat with us. At the very least, they would be entertaining. Here are just some that I am thinking of off the top of my head: Norbert Hornstein, Fernanda Ferreira, Colin Phillips, Vic Ferreira, Andrea Moro, Ray Jackendoff, and Lyn Frazier. And if you disagree with their views on language, well, I’m sure they’d be happy to have a respectful debate with you.

All told, this was a great conference season, and I’m looking forward to what the future holds for the neurobiology of language. Please let me know your thoughts on these conferences, and what I missed. I look forward to seeing you at SNL 2018, in Quebec City!


Check out my website, or follow me on Twitter: @wmatchin


Berwick, R. C., & Chomsky, N. (2015). Why only us: Language and evolution. MIT Press.

Berwick, R. C., Pietroski, P., Yankama, B., & Chomsky, N. (2011). Poverty of the stimulus revisited. Cognitive Science, 35(7), 1207-1242.

Berwick, R. C., Beckers, G. J., Okanoya, K., & Bolhuis, J. J. (2012). A bird’s eye view of human language evolution. Frontiers in Evolutionary Neuroscience, 4.

Brennan, J. R., Stabler, E. P., Van Wagenen, S. E., Luh, W. M., & Hale, J. T. (2016). Abstract linguistic structure correlates with temporal activity during naturalistic comprehension. Brain and Language, 157, 81-94.

Chomsky, C. (1986). Analytic study of the Tadoma method: Language abilities of three deaf-blind subjects. Journal of Speech, Language, and Hearing Research, 29(3), 332-347.

Gentner, T. Q., Fenn, K. M., Margoliash, D., & Nusbaum, H. C. (2006). Recursive syntactic pattern learning by songbirds. Nature, 440(7088), 1204-1207.

Hall, M. L., Ferreira, V. S., & Mayberry, R. I. (2015). Syntactic priming in American Sign Language. PLoS ONE, 10(3), e0119611.

Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: what is it, who has it, and how did it evolve? Science, 298(5598), 1569-1579.

Hickok, G., Bellugi, U., & Klima, E. S. (1996). The neurobiology of sign language and its implications for the neural basis of language. Nature, 381(6584), 699-702.

Hickok, G., & Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends in Cognitive Sciences, 4(4), 131-138.

Leonard, M. K., Ramirez, N. F., Torres, C., Travis, K. E., Hatrak, M., Mayberry, R. I., & Halgren, E. (2012). Signed words in the congenitally deaf evoke typical late lexicosemantic responses with no early visual responses in left superior temporal cortex. Journal of Neuroscience, 32(28), 9700-9705.

Leroy, F., Cai, Q., Bogart, S. L., Dubois, J., Coulon, O., Monzalvo, K., ... & Lin, C. P. (2015). New human-specific brain landmark: the depth asymmetry of superior temporal sulcus. Proceedings of the National Academy of Sciences, 112(4), 1208-1213.

Matchin, W., Villwock, A., Roth, A., Ilkbasaran, D., Hatrak, M., Davenport, T., Halgren, E. & Mayberry, M. (2017a). The cortical organization of syntactic processing in American Sign Language: Evidence from a parametric manipulation of constituent structure in fMRI and MEG. Poster presented at the 9th annual meeting of the Society for the Neurobiology of Language.

Matchin, W., Hammerly, C., & Lau, E. (2017b). The role of the IFG and pSTS in syntactic prediction: Evidence from a parametric study of hierarchical structure in fMRI. Cortex, 88, 106-123.

Monahan, P. J., & Idsardi, W. J. (2010). Auditory sensitivity to formant ratios: Toward an account of vowel normalisation. Language and Cognitive Processes, 25(6), 808-839.

Pallier, C., Devauchelle, A. D., & Dehaene, S. (2011). Cortical representation of the constituent structure of sentences. Proceedings of the National Academy of Sciences, 108(6), 2522-2527.

Parsons, T. (1990). Events in the Semantics of English (Vol. 5). Cambridge, MA: MIT Press.

Petitto, L. A., & Marentette, P. F. (1991). Babbling in the manual mode: Evidence for the ontogeny of language. Science, 251(5000), 1493.

Rakic, P. (1988). Specification of cerebral cortical areas. Science, 241(4862), 170.

Sandler, W., & Lillo-Martin, D. (2006). Sign language and linguistic universals. Cambridge University Press.

Tang, C., Hamilton, L. S., & Chang, E. F. (2017). Intonational speech prosody encoding in the human auditory cortex. Science, 357(6353), 797-801.