Thursday, June 30, 2016

Two quick reads

Here are two quick pieces:

The first is on an unexpected consequence of making academic life more family friendly. Here are the facts: women get pregnant. Men don't. Moreover, child bearing comes at an awkward time in the academic life cycle (i.e. right before tenure decision time). US universities have accommodated this by freezing tenure clocks for those choosing to start families, in effect allowing pregnancy to lengthen the tenure clock. The NYT piece reports on a study of how this worked out. The result is that it favored men. The reason is that the leave rule was applied equally to men and women in a family, allowing both to use the lengthening provision. Men were able to use this extra time more effectively than women to burnish their research records. The result: men gained disproportionately from the new liberal leave rule. The piece also discusses ways of "fixing" this, but all in all, the result is that things get complicated.

Here's my hunch: the problem arises because of how we insist on evaluating research. There is a kind of assumption that the bigger a CV the better it is. Line items matter too much. More is better. This really does handicap those that hit a dry patch, and given the biology of families, this means that on average women will have a tougher time of it than men if this is the criterion. We need a rethink here. Simple things like quality not quantity might help. But I suspect a better measure will arise if we shift from maximizing to satisficing: what do we consider a good/reasonable publication record? In fact, might too much publication be as bad as too little? What marks a contribution and what is just academic paper churning?

The second piece is by Frans de Waal on whether animals think. There is a line of thought (I have heard it expressed by colleagues) that denies that animals think because it identifies thinking with having linguistic capacity, and animals don't have that. Hence they cannot think. This, btw, is a standard Cartesian trope as well; animals are machines bereft of res cogitans. De Waal begs to differ, as indeed does Jerry Fodor, who notes (quite rightly IMO) the following in LOT (1975):
‘The obvious (and I should have thought sufficient) refutation of the claim that natural languages are the medium of thought is that there are non-verbal organisms that think.’

Not only do animals think, they do so systematically. Of course, having linguistic capacity changes how you think. But then so does picking up a new word for something that you had no explicit word for. So, language affects thought, but being without language does not entail being thoughtless.

But this is not what I wanted to highlight in this piece. De Waal, one of the most important animal cognition people in the world, notes the obvious here concerning human linguistic capacity and its non-continuity with what we find in animals:

You won’t often hear me say something like this, but I consider humans the only linguistic species. We honestly have no evidence for symbolic communication, equally rich and multifunctional as ours, outside our species. 
In other words, nothing does language like we do. There is no qualitative analogue to human linguistic capacity in the rest of nature. Period.

De Waal, however, makes a second important observation. Despite this unique human talent, there are "pieces" of it in other parts of animal cognition.

But as with so many larger human phenomena, once we break it down into smaller pieces, some of these pieces can be found elsewhere. It is a procedure I have applied myself in my popular books about primate politics, culture, even morality. Critical pieces such as power alliances (politics) and the spreading of habits (culture), as well as empathy and fairness (morality), are detectable outside our species. The same holds for capacities underlying language.
There is a version of this observation that points to something like the Minimalist Program as an important project: find out which pieces are special to us that allow for our linguistic capacities and which we share with other animals. Of course, the suggestion is that there will be pieces (or, if we are lucky, just one piece) that are special to us and that allow us to do linguistically what nothing else can. At any rate, De Waal is right: if one identifies the capacity for thought with the capacity for language then animals had better have (at least) rudimentary language. Of course, if we don't identify the two, as De Waal and Fodor urge, then there is nothing biologically untoward about one species of primate having a capacity unique among animals.

Wednesday, June 29, 2016

The end of science yet again

While incommunicado, I read another article (here) declaring the end of science as an institution, or at least declaring that it is very corrupt, maybe irreparably so, and so chock full of perverse incentives that it would be amazing if anything at all got done right. I confess that I am getting tired of these doom declarations and I suspect that they are largely BS, though reflecting an importantly wrong view of inquiry (see the end). Or, more accurately, I have no doubt that bad work gets done and that scientists share the same character traits as other mortals and so are influenced by the possibility of fame and fortune and that this influences how they carry on their work. But what I don’t see is any indication that we now live in a fallen state and that once there was a golden age during which truth and beauty alone motivated scientific inquiry. In other words, I have no reason to think that the quality of work done today is any different than that done before. There is just a lot more of it. Oddly, I think that the above could be considered an idiosyncratic perspective.

The author of the linked-to piece above, Jerome Ravetz, thinks otherwise. He sees corruption and malpractice everywhere. He identifies the problem as the rise of industrial science where “Gesellschaft” has replaced “Gemeinschaft” (love those German words; just what you need to add some suggestion of depth) and quality control has disappeared. Couple this with a very real job squeeze for scientists and “pathologies inevitably ensue,” most particularly chasing “impact” displaces the “self-sacrificing quest for scientific rigour.” This leads to “shoddy” and “sleazy” science where standards have so “slipped” and basic skills so “atroph[ied]” that most practitioners don't know that their work is “sub-standard.” Wow! The end of the scientific days! Sciencemageddon! What crap!!

Let me say why I can’t take this junk seriously anymore. Three reasons:

First, the empirical basis of these analyses is that lots of work in many domains doesn’t replicate. This is what John Ioannidis (see here) showed a while back. Much work in biomedicine, social psych, and neuroscience fails to replicate. Often the numbers cited are over 50% of this work. Ok, say that this is so. What I have never heard is whether this is a big failure rate, a low one or just par for the course. In other words, what should I expect the non-replication rate to be a priori? Should we expect science in general to run at an efficiency of better than 50%? Maybe a success rate of 10% is unbelievably good (especially in domains where we really don’t know much). I have heard that Theodore Sturgeon is reputed to have said that 90% of everything is junk. If so, a 50% replication rate would be amazing. So, absent specifying an expected base rate, these numbers really don’t mean much. They should serve to warn the uninitiated against taking every reported experiment at face value, but for any working scientist, this should not be news.
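To make the base rate point concrete, here is a minimal sketch of the standard positive-predictive-value arithmetic. Everything in it is an illustrative assumption rather than something taken from the studies mentioned above: the function names are made up, and the prior, power and significance threshold are chosen only to show how much the expected replication rate moves with the share of tested hypotheses that are true to begin with.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of 'significant' findings that reflect a true effect.

    prior: assumed fraction of tested hypotheses that are true (illustrative)
    power: assumed probability of detecting a true effect
    alpha: assumed false positive rate for a null effect
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


def expected_replication_rate(prior, power, alpha):
    """Chance that a published positive finding comes out positive again,
    assuming the replication has the same power and alpha as the original."""
    ppv = positive_predictive_value(prior, power, alpha)
    return ppv * power + (1 - ppv) * alpha


if __name__ == "__main__":
    # Hypothetical priors: the less we know in a field, the lower the prior.
    for prior in (0.5, 0.1, 0.01):
        rate = expected_replication_rate(prior, power=0.8, alpha=0.05)
        print(f"prior={prior:<5} expected replication rate={rate:.2f}")
```

On these made-up numbers, a field where one tested hypothesis in ten is true should expect roughly 53% of its positive findings to replicate even if everyone behaves impeccably. Whether 50% is scandalous or par for the course depends on a prior that nobody in this debate seems to specify.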

Second, I see no evidence that things have gotten worse in this regard. When I was a tot, I hazarded to read some of the earliest proceedings of the Royal Academy of Science. This was fun stuff. Many reports of odd creatures and other funny findings. In fact, aside from its entertainment value, most of this is, from our point of view, junk. Entertaining junk I grant you, but really not of current scientific value. My suspicion is that the ratio of important science to, ahem, questionable stuff is always roughly what it was then, at least in many domains. The stuff that survives and we covet is the smallest good bit sitting on a huge pile of detritus. Maybe it takes lots of junk to make the good stuff (who knows). But, I have seen nothing credibly arguing that today we produce more garbage than yesteryear. But if the problem with science today is that it is science done today, viz. big Gesellschaft science where impact rather than truth is the lure, then we should expect that all that ails us once did not. Color me very very skeptical.

Third, I am pretty sure that lots of the shoddiness is localized in particular areas. The logic of the modern decline should extend to the successful sciences and not be limited to the aspirational domains. So, do we have evidence that current practice in physics is shoddier? Do physicists rush to publish shoddy work more now than they did? Does the regular experimental submission to the Physical Review only replicate at a rate of 50%? How about cell biology or molecular genetics or chemistry? These are surely subject to the very same pressures as any other domain of scientific inquiry and, moreover, we have ways of comparing what is done now with what was done before because these fields have a before that we can compare our decline with. I, at least, have not heard of these domains suffering a replicability crisis. But if they are not, then it is very unlikely that the generic incentives that the doomsayers like Ravetz like to point to are really behind the problems in domains like social psych, neuroscience or bio-medicine. Not that there isn’t a clear difference between the “real” sciences and the ones most often criticized. There is, and here it is: we know something in physics and chemistry and genetics and cell biology and we know much less in the “crisis”-riven areas. We have real theories in physics and genetics and cell biology, theories that tell us something about the basic underlying mechanisms. This is not true in much of what we call “science” today. The problem with the poorer performing areas is not that the practitioners are shoddy or corrupt or venal or less skilled than their predecessors. The problem is that we are still largely ignorant of the basic lay of the causal land in these domains, and this is because we lack ideas not methods.[1] All of this brings me to my familiar refrain.

What really bugs me about this “end of science” doomology is the presupposition that failure to gain insight must be due to failure to apply the methods of inquiry correctly. The assumption is that there is such a method and that the only reason that we are failing (if we are) is that we are not applying it correctly. Given that the method is presupposed to be clear and simple (as well as domain general) then the only reason that it is not being applied right must be the personal failings of the investigators. They whore after the wrong gods: fame, fortune, impact. But this presupposition is poodle poop. There is no method and no mechanical recipe for inquiry that guarantees success if only conscientiously applied. The idea that there could be one is one of the most baleful legacies of Empiricism.

Empiricism is dedicated to the idea that what there is largely reflects what you can see. It’s all there in plain view if only you look carefully. There are no hidden forces or mechanisms. Empiricism is built on a faith in shallow explanation reflecting a surfacy metaphysics.[2] In such a world, the idea that there exists a Scientific Method makes sense, as does the idea that failure to apply it must reflect some kind of pathology. If, instead, you think that science is hard because it involves trying to locate hidden structure that is only imperfectly reflected in what you can see (experimentally or otherwise) then a high failure rate is to be expected until you discover ways of thinking (theory) that track this underlying reality. Only then might you be able to avoid being misled by what you see.

Let me finish with one more observation. Part of the doomsaying has been prompted by scientific overreach (aka scientism). Scientists love to preen that they are just motivated by the facts and not swayed by vulgar conceptions the way regular folk are. Scientists are hard headed and deserve respect, kudos, reward and deference because science has a method to check itself and so when science speaks it’s not just opinion. In other words, when it suits us, we scientists often trumpet the Scientific Method to claim deference. It’s what makes us “experts” and expertise trumps mere opinion. But if there is no Scientific Method then there is no expertise in virtue of having been produced by such a method. There is expertise, but it must be harder won and it is very limited. If this is right, then there is a lot less science out there than is advertised, and a lot less expertise than generally claimed. And that suits me just fine. That’s a doomsday I can both live with and rejoice in. That, however, has nothing to do with the status of actual scientific inquiry. It’s doing no worse than before, muddling along and saddled with all the problems it has always had.

[1] Though, this said, does anyone think that bio-medicine has made no progress? Would you rather give up modern methods for those from 1990, or 1950, or 1920? I am pretty sure that I wouldn’t. This would suggest that despite all the problems, at least in bio-medicine, we have learned something useful, even if many basic mechanisms are opaque.
[2] I stole “shallow explanation” from Chomsky who used this in another (related) context.

Monday, June 20, 2016

Classical case theory

So what’s classical (viz. GB) Case Theory (CCT) a theory of? Hint: it’s not primarily about overt morphological case, though given some ancillary assumptions, it can be (and has been) extended to cover standard instances of morphological case in some languages. Nonetheless, as originally proposed by Jean-Roger Vergnaud (JRV), it has nothing whatsoever to do with overt case. Rather, it is a theory of (some of) the filters proposed in “Filters and Control” by Chomsky and Lasnik (F&C).

What do the F&C filters do? They track the distribution of overt nominal expressions. (Overt) D/NPs are licit in some configurations and not in others. For example, they shun the subject positions of non-finite clauses (modulo ECM), and they don’t like being complements to Ns or As, nor complements to passivized verbs. JRV’s proposal, outlined in his famous letter to Chomsky and Lasnik, is that it is possible to simplify the F&C theory if we reanalyze the key filters as case effects; specifically if we assume that nominals need case and that certain heads assign case to nominals in their immediate vicinity. Note that JRV understood the kind of case he was proposing to be quite abstract. It was certainly not something evident from the surface morphology of a language. How do I know? Because F&C filters, and hence JRV’s CCT, were used to explain the distribution of all nominals in English and French and these two languages display very little overt morphology on most nominals. Thus, if CCT was to supplant filters (which was the intent) then the case at issue had to be abstract. The upshot: CCT always trucked in abstract case.

So what about morphologically overt case? Well, CCT can accommodate it if we add the assumption that abstract case, which applies universally to all nominals in all Gs to regulate their distribution, is morphologically expressed in some Gs (a standard GG maneuver). Do this and abstract case can serve as the basis of a theory of overt morphological case. But, and this is critical, the assumption that the mapping from abstract to concrete case can be phonetically pretty transparent is not a central feature of the CCT.[1]

I rehearse this history because it strikes me that lots of discussion of case nowadays thinks that CCT is a theory of the distribution of morphological case marking on nominals. Thus, it is generally assumed that a key component of CCT assigns nominative case to nominals in finite subject positions and accusative to those in object slots etc. From early on, many observed that this simple morphological mapping paradigm is hardly universal. This has led many to conclude that CCT must be wrong. However, this only follows if this is what CCT was a theory of, which, I noted above, it was not.

Moreover, and this is quite interesting actually, so far as I can tell the new case theorists (the ones that reject the CCT) have little to say about the topic CCT or F&C’s filters tried to address. Thus, for example, Marantz’s theory of dependent case (aimed at explaining the morphology) is weak on the distribution of overt nominals. This suggests that CCT and the newer Morphological Case Theory (MCT) are in complementary distribution: what the former takes as its subject matter and what the latter takes as its subject matter fail to overlap. Thus, at least in principle, there is room for both accounts; both a theory of abstract case (CCT) and a theory of morphological case (MCT). The best theory, of course, would be one in which both types of case are accommodated in a single theory (this is what the extension of the CCT to morphology hoped to achieve). However, were these two different, though partially related, systems, this would be an acceptable result for many purposes.[2]

Let’s return to the F&C filters and the CCT for a moment. What theoretically motivated them? We know what domain of data they concerned themselves with (the distribution of overt nominals).[3] But why have any filters at all?

F&C was part of the larger theoretical project of simplifying transformations. In fact, it was part of the move from construction based G rules to rules like move alpha (MA). Pre MA, transformations were morpheme sensitive and construction specific. We had rules like relative clause formation and passive and question formation. These rules applied to factored strings which met the rules’ structural descriptions (SDs). The rules applied to these strings to execute structural changes (SCs). The rules applied cyclically, could be optional or obligatory and could be ordered wrt one another (see here for some toy illustrations). The theoretical simplification of the transformational component was the main theoretical research project from the mid 1970s to the early-mid 1980s. The simplification amounted to factoring out the construction specificity of earlier rules, thereby isolating the fundamental displacement (aka, movement) property. MA is the result. It is the classical movement transformations shorn of their specificity. In technical terms, MA is a transformation without specified SDs or SCs. It is a very very simple operation and was a big step towards the merge based conception of structure and movement that many adopt today.

How were filters and CCT part of this theoretical program? Simplifying transformations by eliminating SDs and SCs makes it impossible to treat transformations as obligatory. What would it mean to say that a rule like MA is obligatory? Obliged to do what exactly?  So adopting MA means having optional movement transformations. But optional movement of anything anywhere (which is what MA allows) means wildly overgenerating. To regulate this overgeneration without SDs and SCs requires something like filters. Those in F&C regulated the distribution of nominals in the context of a theory in which MA could freely move them around (or not!). Filters make sure that these vacate the wrong places and end up in the right ones. You don’t move for case strictly speaking. Rather the G allows free movement (it’s not for anything as there are no SDs that can enforce movement) but penalizes structures that have nominals in the wrong places. In effect, we move the power of SDs and SCs from the movement rules themselves and put them into the filters. F&C (and CCT which rationalized them) outline one type of filter, Rizzi’s criterial conditions provide another variety. Theoretically, the cost of simplifying the rules is adding the filters.[4]  
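The division of labor between free movement and filters can be sketched in a few lines of code. What follows is a deliberately crude toy, not F&C’s or JRV’s formalization: the names `Position`, `move_alpha` and `case_filter` are invented for illustration, a “structure” is just a flat tuple of positions, and traces, locality and everything else of interest are ignored. The only point is that the movement rule itself has no conditions and the filter does all the regulating.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Position:
    label: str
    assigns_case: bool            # is an overt nominal licensed here? (toy stand-in for abstract case)
    filler: Optional[str] = None  # the overt nominal sitting in this position, if any


Structure = Tuple[Position, ...]


def move_alpha(structure: Structure) -> list:
    """Free, optional movement: return the input unchanged plus every
    structure obtained by moving one nominal into one empty position.
    No structural description, no structural change, no obligation."""
    outputs = [structure]
    for i, source in enumerate(structure):
        if source.filler is None:
            continue
        for j, target in enumerate(structure):
            if target.filler is None:
                moved = list(structure)
                moved[i] = replace(source, filler=None)
                moved[j] = replace(target, filler=source.filler)
                outputs.append(tuple(moved))
    return outputs


def case_filter(structure: Structure) -> bool:
    """Penalize any structure with an overt nominal in a caseless position."""
    return all(p.assigns_case for p in structure if p.filler is not None)


# Toy raising configuration: "__ seems [John to be smart]".
raising: Structure = (
    Position("matrix subject (finite)", assigns_case=True),
    Position("embedded subject (non-finite)", assigns_case=False, filler="John"),
)

candidates = move_alpha(raising)                      # overgenerates: stay put or move
licit = [s for s in candidates if case_filter(s)]     # the filter cleans up
```

In this toy, nothing forces “John” to move. Both the moved and the unmoved structures are generated, and only the one with “John” in the finite subject position survives the filter.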

So, we moved from complex to simple rules at the price of Gs with filters of various sorts. Why was this a step forward? Two reasons.

First, MA lies behind Chomsky’s unification of Ross’s Islands via Subjacency Theory (ST) (and, IMO, is a crucial step in the development of trace theory and the ECP). Let me elaborate. Once we reduce movement to its essentials, as MA does, then it is natural to investigate the properties of movement as such, properties like island sensitivity (a.o.). Thus, ‘On Wh Movement’ (OWM) demonstrates that MA as such respects islands. Or, to put this another way, ST is not construction specific. It applies to all movement dependencies regardless of the specific features being related. Or, MA serves to define what a movement dependency is and ST regulates this operation regardless of the interpretive ends the operation serves, be it focus or topic, or questions, or relativization or clefts or… If MA is involved, islands are respected. Or, ST is a property of MA per se, not the specific constructions MA can be “part” of.[5]

Second, factoring out MA from movement transformations and replacing SDs/SCs with filters focuses attention on the question of where these filters come from. Are they universal (part of FL/UG) or language specific? One of the nice features of CCT was that it had the feel of a (potential) FL/UG principle. CCT Case was abstract. The relations were local (government). Gs as diverse as those found in English, French, Icelandic and Chinese looked like they respected these principles (more or less). Moreover, were CCT right, then it did not look easily learnable given that it was empirically motivated by negative data. So, simplifying the rules of G led to the discovery of plausible universal features of FL/UG. Or, more cautiously, it led to an interesting research program: looking for plausible universal filters on simple rules of derivation.[6]

What should we make of all of this today in a more minimalist setting? Well, so far as I can tell, the data that motivated the F&C filters and the CCT, as well as the theoretical motivation of simplifying G operations, are still with us. If this is so, then some residue of the CCT reflects properties of FL/UG. And this generates a minimalist question: Is CCT linguistically proprietary? Why Case features at all? How, if at all, is abstract case related to (abstract?) agreement? What, if anything, relates CCT and MCT? How is case discharged in a model without the government relation? How is case related to other G operations? Etc. You know the drill.[7] IMO, we have made some progress on some of these questions (e.g. treating case as a by-product of Merge/Agree) and no progress on others (e.g. why there is case at all).[8] However, I believe research has been hindered, in part, by forgetting what CCT was a theory of and why it was such a big step forward.

Before ending, let me mention one more property of abstract case. In minimalist settings abstract case freezes movement. Or, more correctly, in some theories case marking a nominal makes it ineligible for further movement. This “principle” is a reinvention of the old GB observation that well formed chains have one case (marked on the head of the chain) and one theta role (marked on the foot). If this is on the right track (which it might not be) the relevant case here is abstract. So, for example, a quirky subject in a finite subject position in a language like Icelandic can no more raise than can a nominative marked subject. If we take the quirky case marked subject to be abstractly case marked in the same way as the nominative is, then this follows smoothly. Wrt abstract case (i.e. ignoring the morphology) both structures are the same. To repeat, so far as I know, this application of abstract case was not a feature of CCT.

To end: I am regularly told that CCT is dead, and maybe it is. But the arguments generally brought forward in obituary seem to me to be at right angles to what CCT intended to explain. What might be true is that extensions of CCT to include morphological case need re-thinking. But the original motivation seems intact and, from what I can tell, something like CCT is the only theory around to account for these classical data.[9] And this is important. For if this is right, then minimalists need to do some hard thinking in order to integrate the CCT into a more friendly setting.

[1] Nor, as I recall, did people think that it was likely to be true. It was understood pretty early on that inherent/quirky case (I actually still don’t understand the difference, btw) does not transparently reflect the abstract case assigned. Indeed, the recognized difference between structural case and inherent case signaled early on that whatever abstract case was morphologically, it was not something easily read off the surface.
[2] Indeed, Distributed Morphology might be the form that such a hybrid theory might take.
[3] Actually, there was a debate about whether only overt nominals were relevant. Lasnik had a great argument suggesting that A’-traces also need case marking. Here is the relevant data point: * The man1 (who/that) it was believed t1 to be smart. Why is this relative clause unacceptable even if we don’t pronounce the complementizer? Answer: the A’-t needs case. This, to my knowledge, is the only data against the idea that case exclusively regulates the distribution of overt nominal expressions. Let me know if there are others out there.
[4] Well, if you care about overgeneration. If you don’t, then you can do without filters or CCT.
[5] Whether this is an inherent property of movement rather than, say, overt movement, was widely investigated in the 1980s. As you all know, Huang argued that ST is better viewed as an SS filter rather than part of the definition of MA.
[6] I should add, that IMO, this project was tremendously successful and paved the way for the Minimalist Program.
[7] The two most recent posts (here and here) discuss some of these issues.
[8] Curiously, the idea that case and agreement are effectively the same thing was not part of CCT. This proposal is a minimalist one. Its theoretical motivation is twofold: first to try to reduce case and agreement to a common “mystery,” one being better than two. Second, because if case is a feature of nominals then probes are not the sole locus of uninterpretable features. Case is the quintessential uninterpretable feature. CCT understood it to be a property of nominals. This sits uncomfortably with a probe/goal theory in which all uninterpretable features are located in probes (e.g. phase heads). One way to get around this problem is to treat case as a by-product of the “real” agreement operation initiated by the probe.
From what I gather, the idea that case reduces to agreement is currently considered untenable. This does not bother me in the least given my general unhappiness with probe/goal theories. But this is a topic for another discussion.
[9] Reducing nominal distribution to syntactic selection is not a theory as the relevant features are almost always diacritical.

Saturday, June 18, 2016

Modern university life

Here are two articles on modern academic life that might interest you.

The first is on grad student unionization. I have heard many people argue that grad student unions would severely negatively affect the mentor-mentee relation that lies at the heart of grad education. How? By setting up an adversarial relationship between the two mediated by a bureaucracy (the union) whose interest is not fully in line with that of the grad student. I have never been moved by this, but I have been moved by the observation that grad student life is currently pretty hard, with a less than stellar prospect of landing a job at the end (see here for some discussion). The piece I link to goes over these arguments in some detail. Its conclusion is that the objections are largely bad. However, even were it true that grad student unions would change the prof-student mentoring relationship, it is not clear to me that this would not be a cost worth bearing. Grad students are in an extremely exploitable position. This is when unions make sense.

The second piece is about how the composition of university personnel has changed over the last several years. It confirms the observation that tenure track faculty has shrunk and that part-time faculty has risen. But it notes that the problem is likely not the growth in admin people or other non-prof personnel. It seems that this group has stayed relatively stable. This said, the paper does not investigate funding issues (are non-profs sucking up more of the money than they used to?) nor does it discuss how much money at universities is now being diverted from the core missions of teaching and research to the “entertainment” part of current university life (i.e. new gym facilities, art centers, fancy dorms, support staff for entrepreneurship, etc.). Here is the conclusion. I will keep my eye out for the promised sequel.

The results of this analysis suggest that the share of employees at colleges who are administrators has not been much higher in recent years than it was in 1987. There has been growth, though, in the other professionals employment category. This growth is potentially related to a growth of amenities and other programs outside of the teaching and research that have been the traditional focus of colleges and universities, although this is difficult to ascertain due to the broad nature of this category. An additional result in the analysis is that the share of faculty who are full-time employees has been declining. This decline has occurred within the public sector, the private sector, and the for-profit sector.

One limitation of the analysis here is that it considers only employment and not spending on salaries, amenities, or anything else. However, I plan to address spending by colleges and universities in a future Economic Commentary.