The trouble with theory

Don’t get me wrong. Without theory, we would not know what to look for, or why. We would not make predictions about patterns in nature. I would not have known to put unique marks on wasps, or to look at genetic relatedness. I would not have known where to expect sequence variation in genomes. I would not have known to predict that in social amoebae the altruism caste is environmentally, not genetically, determined. Theory gives us the delight of expectations confirmed. It tells us just which stone to turn over. It allows us to advance, to generalize, to take ideas from one area and apply them to another.

Theories do not always prove to be true. Theories can be shown to be false in a variety of ways. They may be logically inconsistent. They may not fit with other theories that have been extensively shown to be true. They may not fit with the data.

The data. There’s the rub. The point of theory is to tell us how the real world works. Theories may tell you what observations to make, but once you have observed, theory will predict what patterns you will find. If the data points do not fall in the directions predicted by the theory, the theory must be rejected. Of course sample sizes need to be adequate and statistics need to be unbiased and properly applied. One study failing to support a grand theory will not torpedo it, but many such studies will.

If you tell me that your theory must be true, so you modify it slightly every time data-based studies challenge the theory, then it is not a theory. It is a religion. A theory must predict pattern in the natural world. A theory must point to kinds of data that will test it. Theory has a close relationship to data. The best theories make clear, specific predictions. Data can be collected that test the theories, either supporting or falsifying them.

This is where we get to the trouble with theory. In a way, it is really the trouble with models. Theoreticians unfettered with any knowledge of reality see only the models. The math of one may be like the math of another, and preferred for some reason or another. They may substitute purple for red because they like it better, untroubled that in making the substitution they have made their model untestable. Often these theoreticians invade from physics or math. They are amazed at the mathematical simplicity of many of our models. They do not understand why we do not soar on their mathematical balloons. They rewrite and change with facility, often ignoring that they have buried key features of biology in their models. They may write a sib-based model, but hide or ignore this feature because it is no more important to them than any other mathematical fillip. They think only in the world of models, disdaining the hard-working empiricists.

You may think I’m being too harsh, but my own field of social evolution has had more than its fair share of such people recently. How many of us have had to waste time that could be spent advancing the field on this silliness? This is not to say there are not pilgrims from math or physics that come into the field and do real good, advancing important ideas. The difference is that they pay attention to the data side of things. They think about how models might be tested. They care about testing. They care about the real, natural world. They do not have a model with nowhere to take it.

In my book, the best theoreticians are grounded in data, using it as a springboard, and as a place they can return to for truth-testing their ideas. Charles Darwin is an obvious example of such a theoretician. There are lots of others and their ideas last. They are tested. They matter.

About Joan E. Strassmann

Evolutionary biologist, studies social behavior in insects & microbes, interested in education, travel, birds, tropics, nature, food; biology professor at Washington University in St. Louis
This entry was posted in New ideas.

43 Responses to The trouble with theory

  1. Jeremy Fox says:

    I’d suggest that theory is actually useful for all kinds of purposes. Making predictions about empirical data is an important one, but it’s far from the only important one. You identify some of these other purposes in your first paragraph (e.g., suggesting to the empirical investigator what questions to ask and what variables to measure), but there are others, some of which actually have no connection to data at all (for instance, “What’s the simplest model that can produce behavior X?” is a purely theoretical question, which has value in a theoretical context even if that simplest model is empirically unrealistic, and even if that behavior is never observed). For discussion of the many uses of models, I recommend Amy Hurford’s excellent blog Just Simple Enough: The Art of Mathematical Modelling (http://theartofmodelling.wordpress.com/), especially this post: http://theartofmodelling.wordpress.com/2012/03/08/making-the-list-checking-it-twice/ and the references therein.

    In my experience, one particularly important role of mathematical models is as a check on our logic and intuitions. Expressing your ideas about how the world works in a mathematical model forces you to explicitly and precisely specify your assumptions, and to logically derive all of the implications of those assumptions. Often, in attempting to mathematically formalize your ideas about how the world works, you discover that you were making unrecognized assumptions, or that your assumptions do not actually imply the predictions you thought they implied. In my experience, it is extremely common for even very smart scientists to make unrecognized assumptions and logical errors when reasoning purely verbally. This is a reflection of how difficult logical reasoning of any complexity is. For instance, the most important and influential ideas in ecology about how disturbances and fluctuating environmental conditions affect species diversity are logically flawed (http://oikosjournal.wordpress.com/2011/06/17/zombie-ideas-in-ecology/).

    Your post emphasizes that models are often false. True enough, but this observation does not have the purely-negative implications for theory that many empirically-minded scientists seem to think it has. Philosopher of science William Wimsatt has an excellent discussion, grounded in examples from the history of genetics, on the many ways in which false models can help empiricists learn about the world (available here: http://mechanism.ucsd.edu/teaching/models/Wimsatt.falsemodels.pdf). His main points are: models can be false in many different ways, models are often useful *because* they are false rather than despite being false, and different ways of being false are useful for different purposes and in different contexts. I wish more empirically-minded scientists would recognize these points and take them to heart.

    Your post also bemoans the influence of empirically-uninformed modelers from fields like physics. Without wanting to contradict this (you know your own field far better than I do), I would note that such “outsiders” often have made fundamentally important advances in other fields. For instance, in my own field of ecology, ecologists before 1972 widely assumed that “diversity” begets “stability” in ecological communities. Then an outsider, Robert May (trained as a physicist) came along and pointed out that this was nothing more than woolly thinking and unexamined intuition. In fact, in the simplest models, diversity actually inhibits rather than promotes stability. This work forced ecologists interested in this topic to start precisely defining their terms (which they had never done before, hence much of the woolly thinking), and to start thinking hard (for the first time!) about why real natural systems might deviate from the behavior of the simplest models, the “baseline case”.
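
The claim that diversity inhibits stability in the simplest models can be illustrated with a minimal sketch. It assumes the rule of thumb from May's 1972 random community matrix analysis: with S species, connectance C, interaction strengths of standard deviation sigma, and self-regulation fixed at -1, the community is almost certainly stable only when sigma * sqrt(S * C) < 1. The function names and the parameter values below are illustrative assumptions, not anything taken from the comment.

```python
import math


def may_complexity(n_species, connectance, sigma):
    """Return sigma * sqrt(S * C), the quantity in May's stability rule of thumb."""
    return sigma * math.sqrt(n_species * connectance)


def likely_stable(n_species, connectance, sigma):
    """True when the rule of thumb sigma * sqrt(S * C) < 1 is satisfied."""
    return may_complexity(n_species, connectance, sigma) < 1.0


# Hold connectance and interaction strength fixed (made-up values) and add species.
for n_species in (5, 20, 80, 320):
    value = may_complexity(n_species, connectance=0.2, sigma=0.3)
    print(n_species, round(value, 3), likely_stable(n_species, 0.2, 0.3))
```

With everything else held constant, adding species eventually pushes the community past the threshold, which is the "baseline case" that real systems then have to be explained against.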

    In my experience, many empirically-minded scientists dislike particular theoretical models, or theoretical models in general, because they misunderstand the purpose of those models. I don’t have the impression that you fall into this trap, but it would be a shame if readers who are so inclined were to misread your post as reinforcing their own misunderstandings of the many uses of theory. Again, the linked post from Amy Hurford’s blog is a great source for papers discussing this issue.

  2. Thanks for the thoughtful comments. There is certainly a lot of value in theory and in models, when they make clear, testable predictions about the real world.

    • Jeremy Fox says:

      That’s a very polite way of saying you don’t buy my comment at all! 😉

      Let me try to make some of those points more concretely, in the context of your own field. What real-world, testable predictions does the Price equation make? I’d say none. But I’d argue that the Price equation has still been very useful, even to empiricists, because it provides things like conceptual clarification and unification. Or consider simple game theoretical models like the Hawk-Dove game. Is the value of such models really in, or even mostly in, the testable predictions they make about specific real-world systems? (I doubt individuals of any real-world species literally play the Hawk-Dove game…)
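
For readers who have not met the Hawk-Dove game, here is a minimal sketch of the textbook version. It assumes the standard payoffs: a resource worth v, an injury cost c paid by the loser of a Hawk-Hawk fight, and an even split between two Doves. The function names and the numbers are hypothetical. When v < c, the Hawk frequency at which both strategies earn the same payoff is v / c.

```python
def payoff(strategy, p_hawk, v, c):
    """Expected payoff to `strategy` when a fraction p_hawk of opponents play Hawk."""
    if strategy == "hawk":
        # Against a Hawk: win or lose with equal chance, (v - c) / 2 on average.
        # Against a Dove: take the whole resource, v.
        return p_hawk * (v - c) / 2 + (1 - p_hawk) * v
    # A Dove gets nothing against a Hawk and shares the resource with a Dove.
    return (1 - p_hawk) * v / 2


def mixed_equilibrium(v, c):
    """Hawk frequency at which Hawk and Dove payoffs are equal (needs 0 < v < c)."""
    if not 0 < v < c:
        raise ValueError("a mixed equilibrium requires 0 < v < c")
    return v / c


v, c = 2.0, 4.0  # made-up illustrative values
p_star = mixed_equilibrium(v, c)
print("equilibrium Hawk frequency:", p_star)
print("Hawk payoff there:", payoff("hawk", p_star, v, c))
print("Dove payoff there:", payoff("dove", p_star, v, c))
```

The sketch makes the comment's point concrete: the model's output is a logical consequence of its payoff assumptions, whether or not any real animal plays exactly this game.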

      • Jeremy Fox says:

        And speaking of game theory, a John Maynard Smith line is relevant here:

        “Mathematics without natural history is sterile, but natural history without mathematics is muddled.”

  3. Markku K. says:

    Are you at all familiar with economics? I am myself a mathematician, to some degree familiar with both evolutionary biology and economics, and I must say that you biologists are still having a good time.

    Economics is ultimately based on models, and the canonical models are untestable, and most economists even think that they should not be tested, because the axioms are correct (as if), and the rest is maths. Please, don’t let biology go that way. Throw us out, if you need to.

    • Oh, yes, I am very familiar with economics, because my father, my sister, and my daughter all trained in that discipline. But they specialized in development, feminism, and health/education, fields where data matters. My daughter moved to data rich sociology. Biology has not reached the absurdity of econometrics. In fact, we are in sore need of new theory to explain the deluge of data we get from things like genome sequences. Thanks for the warning!

    • Julián García says:

      Markku, I was trained in economics but happen to work in evolutionary biology (because I am interested in ultimate causes of human behavior). There is a very important distinction that is not made in Joan’s argument, and matters a lot (at least in economics, and also in physics). One thing is economic theory, where questions are of the kind that Jeremy discusses, e.g., given that these and these are the assumptions, what would we expect for an outcome. This is very important in generating hypotheses, and usually is based on assumptions that are not realistic but very useful in checking rigorously how ideas are organized. One example of this kind of model is ‘the rational actor’ in economics.

      Another thing is model fitting. This tackles questions of the form: given a model, does it fit real data? Most of the questions here compare how different models fare in the real world. The main issue here is discarding models that are not good. But what is important in the context of the discussion is that these are another sort of model, formulated on the basis of theory, and specifically designed to be tested statistically.

      The two types of questions are of course related, but they develop somewhat separately (we know from biology that division of labour has many advantages!). They are different questions! My feeling as an outsider who has tried to read the biological literature is that these two different lines of work are somewhat conflated in biology. Therefore you end up with theoretical papers that feature, for instance, regression coefficients (essentially about fitting data, not making predictions!). This kind of mix-up is nowhere to be seen in other areas (like physics, or economics). See the example discussed here: http://j.mp/MwZkI1 (PDF) and the two types of questions discussed here: http://dx.doi.org/10.1016/j.jtbi.2005.04.026

      Markku’s assertion that most economic models are untestable ignores the now standard approach of behavioral economics. See for instance the book by Camerer (behavioral game theory) to see how economic theories are actually tested, and to see how theory and (statistical) models combine in order to produce an accurate picture.

  4. Erol Akcay says:

    Interesting post, Joan. I think Jeremy has said it well: theory is more than just a tool for giving empirical studies something to measure.

    But I want to highlight how factually wrong, and very wrong-headed, your post is. I’m a theoretician, and know lots of theoreticians. Almost none (in fact, none that I know personally) operate like the caricature you are drawing. Maybe we aren’t full time naturalists knowing every species in our favorite beetle genus, but none of us are “unfettered with any knowledge of reality” when we do our models, and all of us are quite eager to work with empiricists both in model development and testing. Of course, many a theoretician has moved into empirical territory themselves.

    As for the theoretician who won’t accept they are wrong: I’m sure there are intransigent theory types out there, but god knows sticking to a discredited theory is hardly endemic to, or even prevalent among theoreticians.

    I’m sure you have particular people in mind that you think are guilty of all these sins. Why not call them out and make your case with evidence? Why slander the entire theoretical profession with unsubstantiated claims about people’s knowledge and motivation, served with this hot rhetoric? What is a graduate student in a department without a full-time theoretician (i.e., most schools) supposed to take away from this post? That theoreticians are people with questionable motives who should only be trusted as far as one can throw them? How is this going to help the productive interplay between theory and data that we all (yes, all) want to promote?

    If you have problems with the way people operate, call them out (that would also give them a chance to respond). Otherwise, posts like this will simply reinforce an arbitrary and counterproductive disciplinary barrier.

    • Jeremy Fox says:

      I wasn’t going to say anything (unusual for me, I know 😉 ), but I’m with Erol on this: if you have specific people in mind, you should name them. I don’t think doing so would be rude at all–if anything, it would be more polite because it would give them an opportunity to respond. I agree with Erol that otherwise many readers, especially students, are likely to take away from your post a different and broader message than you intended, and a different and broader message than your post justifies.

      On the other hand, if you don’t have specific people in mind and really do mean your post to apply quite broadly, well, as I’ve already indicated I’m afraid I’ll have to respectfully but strongly disagree. It sounds like you are overgeneralizing from your own experience.

      And by the way, I make all these comments as an empiricist myself. Most of what I and my students do involves collecting and analyzing our own data.

    • Jeremy Fox says:

      And on the subject of people not admitting they’re wrong, I highly doubt that sin is any more common among theoreticians than among empiricists. Here, for instance, is a recent striking example of some of the most famous empirical plant ecologists in the world embarrassing themselves by casting about for any flimsy excuse to avoid admitting they were wrong:

      http://oikosjournal.wordpress.com/2012/03/22/trying-to-save-a-zombie-idea/

      • Jeremy, I like your comments. I’m a modeller and I’m forever saying models are wrong. They are supposed to be wrong, in that models are supposed to be an imperfect representation of reality. Why? Because reality is too hard to comprehend without some sort of simplification. That simplification can be done in one’s head, by writing a description of how reality operates, by collecting some data about reality while holding some aspects constant and ignoring variation in other aspects, or by writing a mathematical model. All of these approaches (and others) to simplifying reality can have different levels of detail, and all are potentially valuable. The interesting aspects lie where the different approaches agree and disagree. But if a modeller, or anyone else, thinks models are anything but wrong, they are kidding themselves.

  5. Pingback: Ecology blog roundup « theoretical ecology

  6. Owen says:

    Joan always states her case forcefully and clearly, so you can tell her if you think she’s wrong. If she made a generalization, she probably meant to.

    I think Joan said a few things that are either false or “fuzzy,” and a couple of things that are absolutely true.

    Things she said I think are false:

    “If you tell me that your theory must be true, so you modify it slightly every time data-based studies challenge the theory, then it is not a theory. It is a religion.”

    This seems not only false but an absolute misrepresentation of reality. If it were true, then the theory of natural selection is a religion because it has been modified and extended in various ways. For example, it was extended by Darwin and then Hamilton to explain altruism. Most neodarwinists furthermore accept natural selection as the primary truth of evolution. The argument could be made that a main difference between a scientific theory and a religious doctrine is that the former can be modified, extended, and revamped in order to explain additional phenomena.

    “Often these theoreticians invade from physics or math. They are amazed at the mathematical simplicity of many of our models.”

    If we are talking about Nowak et al. and the like, such authors often argue that inclusive fitness is too complicated in how it calculates fitness. John Maynard Smith (a former engineer) likewise made this argument in an interview with Richard Dawkins, where he said “it is a swine to calculate” and “I don’t understand why he drew back from the 63 model and presents his very much more difficult model.” He clearly didn’t understand the heuristic value of “inclusive fitness” as opposed to “genic selection” (http://www.webofstories.com/play/7292?o=S&srId=255653). He thought Hamilton did it because he was a student of Fisher and because he thought Fisher’s fundamental theorem was “the holy grail.” Of course, the alternative is that Hamilton foresaw the heuristic value of an organism-level selection approach, particularly for empiricists and naturalists who think most of the time about organisms.

    “If the data points do not fall in the directions predicted by the theory, the theory must be rejected… One study failing to support a grand theory will not torpedo it, but many such studies will.”

    This can be true. However, in other cases anomalies build up for some time, and yet this is not grounds for rejection until there is a better theory available. In the face of anomaly with no alternative, people usually just question the experiments.

    Fuzzy:

    “Theories do not always prove to be true.”

    Actually, theories never prove to be true, because theories cannot be proven (at least, that is pretty much the consensus).

    “The point of theory is to tell us how the real world works.”

    But sometimes the only way to do this is by first thinking about fantasy worlds. Take the famous Fisher quote, ‘No practical biologist interested in sexual reproduction would be led to work out the detailed consequences experienced by organisms having three or more sexes; yet what else should he do if he wishes to understand why the sexes are, in fact, always two?’

    “Theories can be shown to be false in a variety of ways. They may be logically inconsistent. They may not fit with other theories that have been extensively shown to be true.”

    First, I don’t see why we have to pretend like theories are falsified, when in reality what happens is one theory gradually replaces another. Falsification implies rejection of a theory based on a refutation. This is almost never the case, even in physics, where theories are more easily falsified (owing to more precise predictions).

    Second, this does not support the argument that the role of all theory is only to produce testable hypotheses. Progress can be made if a mathematician probes the internal consistency of a model. This exercise is useful even if not immediately producing a testable hypothesis. Further, searching for consilience, or lack thereof, between theories is an important way to support or probe theories. This is a purely theoretical endeavor.

    “How many of us have had to waste time that could be spent advancing the field on this silliness?”

    We can ask if there is anything good to come from this (Nowak et al.) controversy. I think some of the papers that discuss the assumptions of inclusive fitness in detail, which came as a response, are useful. Some of the assumptions have been discussed only in a few, somewhat obscure places. For example, I never fully grasped the different meanings of the assumption of “additivity” until I read one of these papers. After reading that paper and the references cited therein (Gardner, et al. 2011 and Frank 1998), I realized the term “additivity” is used in at least three distinct ways: 1) additivity of the fitness effects of helping behavior, as stems from Hamilton’s original formulation but not some more recent ones 2) additivity of effects of an allele in linear regression model vs. non-additive effects, and 3) the assumption of “constancy of average fitness effects” and hence a lack of change of non-additive effects with time (as might result if there is only additivity, hence “additivity,” or if the relevant non-additive factors do not change).

    Another good thing to come of it was the new phylogenetic analyses and discussions of the importance of monogamy for the evolution of eusociality. This work was definitely stimulated by Wilson. The problem for Wilson, though, was that his goal was to provide support for the importance of ecological factors. Alas, he would have been better off arguing that ecological factors are unimportant, so that people would test those.

    I think absolutely true:

    “Without theory, we would not know what to look for, or why.”

    “Theory gives us the delight of expectations confirmed. It tells us just which stone to turn over. It allows us to advance, to generalize, to take ideas from one area and apply them to another.”

  7. Jeremy Fox says:

    Oh, and as for Darwin as a model theoretician, grounded in natural history and sticking close to the data: Darwin was widely criticized in his own time for engaging in ungrounded speculation and for straying from the tried-and-true path of inductive generalization based on observational data. That isn’t to say Darwin didn’t care about natural history, or empirical data more generally (far from it!), but merely to point out that he was actually much *less* “grounded” in “data” than his many natural historian colleagues. He and Wallace were both inspired to think of natural selection not just, or even primarily, by their knowledge of natural history or any other “data”, but by reading Malthus (an economist). It’s also important to keep in mind what kind of data Darwin paid attention to. He himself placed much weight on data from *highly* artificial systems *much* removed from nature, in particular domestic plant and animal breeding. And over the course of his life, he probably devoted more time to experimenting on plants and animals at home than he did out in the field (though of course he also corresponded with many others about their field work). He even collaborated with mathematicians, for instance collaborating with a geometer to figure out how honeybees could build hexagonal honeycombs by following simple behavioral rules. So before we hold Darwin up as a model of what a theorist ought to be, let’s make sure we’re clear on exactly what he was.

    Joan: I know at this point I’m probably just piling on, but I find it hard to help it. I usually find myself nodding along in agreement with almost everything you write, and so it really struck me to encounter a post that just seems so off-base for reasons that have been much discussed in many venues. I’m totally speculating here, but like Owen I’m guessing this post is born primarily from your dislike of Nowak et al. I don’t like that paper either, at all. But if that speculation is at all close to the mark, then I really do think you’re badly over-generalizing from one rather atypical example of theoretical work in ecology and evolution.

    I hope you’ll take the time to respond to the comments you’ve been getting at greater length and detail than you have so far, and would read such a response with great interest.

  8. Don’t get me wrong. Did someone say that? I’m taking a great on-line course in creativity from http://www.tctc4.me/ and one of the things they say is important is to defer judgement and to try to take the perspective of others. Then, once all the data are in, consolidate. So that, and the intervention of real life now that I’m back in wonderful St. Louis, caused my recent silence.

    Hmm, would you ever read a blog entry entitled “The power of theory” or would you decide it is too boring to bother? I suppose in fairness I now need to write something on the trouble with data. Seriously, let’s work through the comments.

    Jeremy Fox http://oikosjournal.wordpress.com/ begins with his usual thoughtful and detailed response. I agree with most of it, and in particular like Amy Hurford’s blog. Look at her Fig. 1 on the importance of models http://theartofmodelling.wordpress.com/2012/01/01/overview/. Her first step is Biologically important question, and she ends with Conclusions with biological relevance. We need more modelers like Amy. Why did Jeremy think I wouldn’t like Amy’s work? Did I come across as against all modeling? That is not my view at all. But I would say I have run across quite a few modelers less interested in biological relevance and testability than Amy.

    Furthermore, I love the precision and clarity of mathematics. I am no enemy of clear thinking. Let’s just be really focused on what the assumptions of the model are, what they address and what they do not address. Your example from Lord May of Oxford is one where a field-migrating theoretician did something very useful. So, there are useful models and useless models. It is my distinct impression that useless models get published in higher tier journals than useless empirical results. This may be wrong, it may be many things, but it is my impression.

    Early in my career I visited a much older biologist in south Texas. He was a wasp expert. He could pluck a Polistes nest off a building, and hold it calmly as the furious wasps stung him repeatedly. I saw him pick up a velvet ant and let it sting him repeatedly under the thumb nail. His thumb swelled and turned red. He had a big trash can where he put wasp nests he had harvested. They were still partly alive, and had hatching workers crawling around, confused, if their brains or subesophageal ganglia are large enough for such an emotion. I wanted to know why he did what he did. He wanted to know what I wanted to learn about the wasps. He had biomass. I had theory. So I knew what to do with my wasps.

    OK, how about the Price equation? This is a theorem that has been incredibly useful to evolutionary biology (http://en.wikipedia.org/wiki/Price_equation). With it, breeding values and inclusive fitness, to name just two applications, are much more easily understood and measured. I love the Price equation. Obviously I was unclear if you think I don’t like the Price equation. Maybe the problem is that I appeared to mean small-sense direct applicability, on how many fish to count today, rather than more general-sense applicability, say, on why fish school.
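
For readers who have not seen the Price equation written out, a minimal numerical sketch may help. It assumes the usual discrete-generation form: the change in the mean trait equals Cov(w, z) / mean(w), the selection term, plus E(w * dz) / mean(w), the transmission term, where z is a parent's trait, w its number of offspring, and dz the average difference between its offspring and itself. The function names and the numbers are made up for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def price_terms(z, w, dz):
    """Return the (selection, transmission) terms of the Price equation.

    z  : trait value of each parent
    w  : fitness (offspring count) of each parent
    dz : mean offspring trait minus parent trait, for each parent
    """
    w_bar = mean(w)
    # Population covariance between fitness and trait: E[wz] - E[w]E[z].
    cov_wz = mean([wi * zi for wi, zi in zip(w, z)]) - w_bar * mean(z)
    # Fitness-weighted average of the parent-to-offspring change.
    exp_w_dz = mean([wi * di for wi, di in zip(w, dz)])
    return cov_wz / w_bar, exp_w_dz / w_bar


def direct_change(z, w, dz):
    """Change in the mean trait computed by brute force from the offspring."""
    offspring_total = sum(wi * (zi + di) for wi, zi, di in zip(w, z, dz))
    return offspring_total / sum(w) - mean(z)


# Made-up illustrative numbers, not data from any study.
traits = [1.0, 2.0, 3.0, 4.0]
fitness = [1.0, 1.0, 2.0, 4.0]
offspring_shift = [0.1, 0.0, -0.2, 0.0]

selection, transmission = price_terms(traits, fitness, offspring_shift)
print("selection term:   ", selection)
print("transmission term:", transmission)
print("sum of the terms: ", selection + transmission)
print("direct change:    ", direct_change(traits, fitness, offspring_shift))
```

The two terms always sum to the directly computed change, whatever numbers are supplied. That is the sense in which it is a theorem: it organizes a measurement rather than predicting one.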

    John Maynard Smith http://en.wikipedia.org/wiki/John_Maynard_Smith is a thinker as clear as they come. I once had a long conversation with him at meetings about conventions in queues, taking him away from old buddies. He said he’d rather talk science with someone he didn’t know than chat with old friends and proved his point.

    Economics has real problems, but it is not my field. I think the arguments of Julián García are very interesting and may be useful in identifying different kinds of model uses.

    Erol thinks I’m wrong-headed and have someone personal in mind. He seems to take this as a personal attack and this is too bad because he is charming. This interpretation is probably because we had dinner together at the Animal Behavior Society http://animalbehaviorsociety.org/ picnic in Albuquerque a week ago and I had nothing good to say about Roughgarden’s bizarre take on sexual selection which Erol turned out to be an author on. It is true that I don’t think Roughgarden’s paper did the field any good and I also think it would not have been published if reviewed anonymously. But I also think my points are more general, and about some kinds of models and not others. I think that was clear in my post from the very beginning. Do I think the very best theoreticians do both theory and empirical work? Yes. I’m married to one of the best, so I might be a little biased here. But I know others who are excellent and do mostly or entirely theory, like Steve Frank http://stevefrank.org/, Hanna Kokko http://www.anu.edu.au/BoZo/kokko/, or Stuart West http://www.zoo.ox.ac.uk/group/west/index.html.

    Hmmm, somehow it seems my post came across to some as arguing that empiricists are better than theoreticians. But I’m not staring into a trash can of wasps. I do not want to waste my time collecting data that address no theory. I’m arguing for communication between the two approaches to understanding how the world works. Nothing Jeremy has said argues for theory without any relevance. Nothing I said argues for data without theory. Has this been useful? Don’t know, but let’s see what Owen, always a thoughtful one, says.

    On the point of modifying theories when counter data appear, I am thinking of something specific, a smaller theory than natural selection, one I thought was wrong and tested multiple times, only to be told repeatedly that the theory was so true it could not be falsified by any data. Darwin said the opposite. I did not and do not want to clarify this example, and so I erroneously generalized. The main point is that, at the end of the day, there must be a way to prove a theory wrong. I know you can’t ever “prove” it right, now that we are all using some form of Popper’s talk.

    OK, I’m running out of steam. Owen would have gotten additivity if Dave had taught pop gen at Rice the way he will here. I don’t think the phylogenetic analyses followed Nowak. The Hughes paper preceded it, I think. I do not think the Nowak paper was useful in any sense.

    I suppose I could end on a point from all the psychological stuff I’ve been reading recently, on how little we know about how our brains decide to respond to things. We polarize, we read into things what we think is there, we are hesitant to embrace any form of nuance (not that I’m often accused of nuance). We need theory. We need data. There is useless theory. There are useless data. Maybe it is my view that the useless theory is more common in high profile journals these days than useless data. What we need is both.

    • Jeremy Fox says:

      Welcome back, and thanks for taking the time to reply at such length, Joan. I’m reassured that this is just a case of your post coming off somewhat differently than you intended. I’ve certainly had that happen to me, more than once.

      As to whether useless theory is more common than useless data in high-profile journals, I can’t say. Seems like one of those kinds of things where our selective human memories probably shouldn’t be trusted–better to go back and systematically count up the number of useless theory and data papers. 😉

      As to whether anyone would’ve read a post called “The power of theory” rather than “The trouble with theory”, I can only say “Write one and find out!” Over at Oikos Blog I’ve done posts like that. Although then again, I have also done plenty of intentionally-contrarian posts with intentionally-provocative titles, so I’m certainly not going to complain when someone else pursues the same strategy. 😉

  9. Pingback: In reply to The trouble with theory | Just Simple Enough: The Art of Mathematical Modelling

  10. Bruce Lyon says:

    I am mostly a field biologist but most of my research is based on testing concepts. When I teach Ecology I also focus mostly on concepts so I feel I have good appreciation for theory in its broadest sense. I also found myself agreeing with much of what Joan wrote so I wonder if people from different backgrounds take a different message from what she wrote. Perhaps a better title for the piece might have been The Trouble with Some Theories…(the bad ones).

    The first comment by Jeremy Fox makes a really important point that I think often falls through the cracks and needs to be emphasized repeatedly—that model assumptions are often key and, in some cases, more important than the predictions. A particularly clear example of this, and one I use for undergraduate teaching, is Gotelli’s (Primer of Ecology) treatment of the logistic population growth model. He shows that everything interesting derives from one simple biological assumption—linear negative density-dependence. With this assumption, everything else, including key predictions, is just fiddling with the math.
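
    For readers who want the algebra behind that claim, here is a minimal sketch. The symbols (population size N, intrinsic growth rate r, carrying capacity K, initial size N_0) are the usual textbook notation and are assumed here, not quoted from Gotelli. The single biological assumption is that per capita growth declines linearly with density; the logistic equation, its equilibria and its closed-form solution then follow by algebra alone.

    ```latex
    % Assumption: per capita growth rate falls linearly with N
    \[
      \frac{1}{N}\frac{dN}{dt} = r - \frac{r}{K}\,N = r\left(1 - \frac{N}{K}\right)
    \]
    % Multiply through by N to obtain the logistic equation
    \[
      \frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)
    \]
    % Equilibria (dN/dt = 0): N = 0 and N = K.
    % Closed-form solution for initial size N_0 > 0:
    \[
      N(t) = \frac{K}{1 + \dfrac{K - N_0}{N_0}\,e^{-rt}}
    \]
    ```

    If the linear assumption fails for a given system, nothing downstream of the first line is guaranteed to hold, however well the curve happens to fit.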

    My obsession with assumptions comes from an interesting experience I had during my thesis work on within-species brood parasitism in coots. (Female birds in many species lay eggs in each others’ nests). I was testing a clutch size model of Malte Andersson and Matts Eriksson that made very specific quantitative predictions about how hosts should alter their clutch size in response to brood parasitism: the prediction was that hosts should reduce their clutch size one half egg for each egg a brood parasite lays in their nest. Remarkably my data showed exactly that pattern: model supported! However, the model was based on the assumption of negative linear density dependence (basically logistic) and the prediction falls out of that. Digging deeper, I discovered that the key assumption of the model did not apply to my system (not even remotely close) and that my nice fit with the quantitative prediction was spurious. It turns out that I had a mix, roughly 50:50, of birds that showed no clutch size response and birds that reduced clutch size one egg for each egg received. The explanation I eventually discovered was more interesting than the simple clutch size model I started with, and I found evidence that the birds can recognize and count eggs.
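
    The arithmetic of that spurious fit can be sketched in one line. The notation is assumed for illustration only: p is the number of parasitic eggs a host receives, q is the fraction of hosts that reduce their clutch by one egg per parasitic egg, and the remaining 1 - q do not respond at all.

    ```latex
    % Expected change in host clutch size, pooled over both kinds of host
    \[
      \mathbb{E}[\Delta C \mid p] = q\,(-p) + (1 - q)\,(0) = -q\,p
    \]
    % With a roughly 50:50 mix, q \approx 1/2:
    \[
      \mathbb{E}[\Delta C \mid p] \approx -\tfrac{1}{2}\,p
    \]
    ```

    A pooled slope of about one half egg per parasitic egg is therefore what a mixture of non-responders and one-for-one responders would produce, even if no individual bird follows the half-egg rule.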

    On Jeremy’s other points about zombie ideas, there are times when it can go the other way. In some cases, theory can generate zombie-like effects as well (perhaps zombie is not the correct term for theory that falsely convinces people that a reasonable idea is illogical). In the early 1980’s I submitted a paper on Zahavi’s handicap principle. The paper was rejected because both reviewers pointed out that Maynard Smith had proven with theory that the handicap idea could not work. This premature death of the handicap persisted for a long time—largely due to Maynard Smith’s work—until Alan Grafen eventually killed the Maynard Smith zombie with new theory.

    • Wow! What a thoughtful response! I wish I had taken ecology from you! Oh, never mind, I’d need a time machine. I love it that it seems like melding models and theory with field studies is getting easier and easier. One area in desperate need of theory, and rigorous experimental methods is the microbiome studies that are so popular right now. How do we know when there is a new species, or when there is sequencing error? How do we know what is a difference between samples and what is a difference within a sample? That’s crazy about the Zahavi paper. But I understand. It is hard to push forward with unpopular ideas, even when you are right!

    • Erol Akcay says:

      Very well-taken points, Bruce: assumptions are indeed key. Math is just the language that takes assumptions and restates them in different forms (conclusions or predictions). If one does the math correctly, all the predictions are informationally equivalent to your assumptions. In other words, garbage in, garbage out. The key is to know what your assumptions are, and express them clearly in math *and* words.

      Of course, sometimes what people say in words is different than what they say in math. There doesn’t need to be any ill intent involved in that; my sense is that most of the time it happens because the author isn’t really on top of their own model. But you can only track down these discrepancies by actually going through the math. I don’t think many non-theoreticians do, and even amongst theoreticians, there is a bit of over-specialization, where people aren’t as engaged in frameworks outside their favored one.

      And the solution to this isn’t reducing the fancy math, or somehow making theoreticians do field work (most cases of this I know of actually come from people who do both empirical and theoretical work). The questions we work on in ecology and evolution are complex questions. They are hard. We need the math to solve them. The simplest possible math, yes, but that might still be quite complicated. A good example is Trivers, who probably is as bright as anyone in the field, and had so many seminal insights. But he shuns math, and the result is he gets lost in this parental investment paper. Turns out, discrepancies in parental investment don’t cause sexual selection; that’s not an empirical statement, it’s a mathematical one. So his prediction actually doesn’t follow from his assumptions. It took McNamara & Houston and Kokko and colleagues a fair bit of sophisticated modeling to show that. But in the intervening 30 years, Trivers’ argument got entrenched, so that even though these new results have been around for a decade now, they are still not widely appreciated by many people (even though Kokko and Jennions have a paper with a cute little box that has two fish talking to each other about PI and sexual selection in a totally non-mathy way).

      The other example Bruce mentions, about the handicap principle is also instructive. The nice thing about mathematics is that it gives you definitive answers to logical questions (is this at all possible or not, under which assumptions?). I’d say that’s an argument for more people engaging with theory, since any one prominent theoretician’s proclamation can be corrected more quickly. As it happens, this case is also an argument for more people reading and using ideas from other fields, too, since Zahavi could have avoided all the pain by simply citing Michael Spence, the economist, who had published his signaling paper prior to Zahavi. So, essentially behavioral ecology waited until 1990 to rediscover a proof that was basically there for the taking in economics since 1973. (I don’t mean to diminish Grafen’s contribution; he did come up with an independent proof after all and did the pop-gen model, too, but it just would’ve been easier to not rediscover the wheel).

      The upshot is, I think we need to be more engaged with mathematical theory. Even field people do. I don’t mean to say that everyone should become a theoretician. Obviously, us theoreticians are most directly responsible for keeping quality of theory high. I do think we need to better police each other, and perhaps we need to have a few standards about reviewing modeling papers, like those for reviewing statistics in some journals. But if people expect that most people will just skip the math and go to the results, that diminishes the incentives to be extra-careful about the math and how it translates to biology. I think that increases the kinds of problems that Joan talks about.

  11. Erol Akcay says:

    Alright, first things first: I don’t think you are wrongheaded, I wrote that your post was (attack the idea, not the person, and all that). I still think it is. And I don’t think I took your post inordinately personally. You made some pretty strong allegations about theoreticians without specifying who, or even what kind, and I think those allegations are simply wrong for the vast majority of theoreticians I know, myself included of course. So I call you out and ask you to be specific and substantiate your claims. Jeremy does the same. I don’t see why this should seem personal to you. Since you mention it, I also had a lovely evening chatting with you and Dave at ABS, well at least up until the last bit (and that not because you attacked the ideas in that paper). I was hoping to run into you the next day to say goodbye, but only saw Dave. Hope to repeat it sometime soon, now that I know which topic to avoid.

    As for the substance of your post. It’s helpful that you clarify you are not against all modeling 🙂 But beyond that, I’m not sure what your point is, beyond perhaps keeping us pesky theoreticians on our toes. You say: “We need theory. We need data. There is useless theory. There are useless data.” and “What we need is both [good theory and good data]”. Well, no one I know (or know of) disagrees with those statements. I’m pretty sure all of the people who work in the little area that you have in mind (which is not hard to guess) don’t either.

    Of course, what people do disagree about (and what this is about) is which particular theories are useful and in what way, which particular pieces of data are meaningful, and what their meanings are, etc. And that’s not just fine, it’s healthy for an active field. If reasonable people agreed on every important question in a field, that field is essentially dead (either its problems are all solved, or they have collective delusion). I don’t think social evolution or behavioral ecology is dead yet. So that’s why I think it is important — if you want to have a productive discussion about the role of theory — to name names, point out features of theories you like, you don’t like, etc. and have an open discussion. You say your points were “about some kinds of models and not others”. Yes, *that* was clear, but how are we supposed to know which is which if you don’t tell us? And how are we supposed to know if we agree or disagree if we don’t know what exactly you are talking about?

    Now, you say you don’t want to go into more details of what got you riled up. Well, fine, it’s your choice and we have to live with it. It’s a shame, though.

  12. Doug Mock says:

    My simple view is that empiricism is slow (often plodding) and theoretical modeling is relatively fast (at least that’s how it seems to us field-based empiricists). Like a cumbersome wagon train that cannot tap into Mapquest for information on the hazards ahead (swollen rivers, hostile natives, etc.), it makes sense to send scouts on ahead to make semi-informed recommendations of how to proceed. But those scouts are very imperfect — having their own limitations and issues — and so their information can only be seen as suggestions. Sometimes they turn out to be right (e.g., Hamilton’s rule) and the population benefits; other times they turn out to be wrong and the whole wagon train goes way off track for awhile. But what choice do the empiricists have? If they do NOT glean what they can from the scouts and just concern themselves with their own data, they risk going round and round in ugly circles. Scouts may look flashy, but they cannot get the population relocated to a good settlement site on their own.

    This has to be a joint enterprise. When the data do not fit the elegant and oversimplified models, then that needs to be acknowledged. Modeling without testing is futile. Collecting data without much idea of what it can tell us is not likely to be highly informative. Bruce is right that disagreements between prediction and data should focus on the assumptions. And that is why the most useful models are those that make their assumptions the most transparent.

    • Jeremy Fox says:

      Modeling without testing is not futile. It actually has many useful purposes–it only seems futile to those who misunderstand the purpose of such modeling. Caswell 1988 Ecological Modelling is especially articulate (and forceful) on this point, noting that “theory has a life of its own”, independent of data.

      Similarly, data can be collected for many purposes (e.g., observational data to document an interesting pattern, vs. manipulative experiments to test hypothesized causes of that pattern), including purposes which are entirely independent of theory, and will seem flawed if the purpose for which they were collected is misunderstood.

      • But it still at the end of the day has as a goal to help explain the natural world.

      • Jeremy Fox says:

        Yes (usually–again, theory does have a life of its own…). It’s just that the phrase “helping to explain the natural world” too often gets interpreted in an overly-narrow way…

    • True, I think making assumptions transparent is crucial. Also, we empiricists would not know where to look without a theoretical eye. The theoreticians don’t know where to look without an empirical conscience, so it goes both ways. But at the end of the day, I still maintain that the point of the whole thing is to understand the natural world. We are not followers of Foucault. Missed you at the Behavior meetings!

  13. Pingback: Celebrating simple models | Michael McCarthy's Research

  14. My comment about the truthfulness of models has gotten a lot of fascinating comments. I agree with them. I think maybe I was trying to capture something different and used the wrong word. Of course I agree models are simplifications of reality. Some simplify in the way a map does, as Michael’s post suggested: http://mickresearch.wordpress.com/2012/06/26/celebrating-simple-models/. Others seem to obscure rather than illuminate, as the famous Nowak model does, claiming to not be about relatives, but having relatedness built into it. That is the kind of thing that bothers me.

    Another really interesting point was brought up by Jeremy Fox. That is the problem of how often the math is actually checked in the review process. Is this a problem or isn’t it? Why is our whole field built on the kindness of strangers, when checking theory is particularly onerous, even for excellent mathematicians? I know one really eminent theoretician that puts a limit on the papers reviewed per year, checks those carefully, but cannot possibly do all the papers someone might want to throw their way.

    • Jeremy Fox says:

      Is math checked (when it is checked) because it can be, whereas data analyses aren’t because reviewers don’t ordinarily have access to the raw data? I ask because I seem to recall hearing that, back in the days when people often could show all their data in a few tables, Sewall Wright would review papers by redoing all the analyses. Anyone know if that’s true?

      Re: checking math, we ecologists and evolutionary biologists, who use pretty basic math, have it much better than proper mathematicians. The computer-based proof of the four-color map theorem couldn’t be checked by humans at all. And Andrew Wiles’ 200-odd page proof of Fermat’s Last Theorem had to be farmed out in sections to something like 8 people, each the world’s leading expert on the area of mathematics used in that section. Those are unusual examples, of course.

      • Re checking maths: A reviewer of the dispersal model that I mention in my post (http://mickresearch.wordpress.com/2012/06/26/celebrating-simple-models/) just didn’t believe the maths because the result seemed counter-intuitive. S/he asked me to simulate it (which I did – and obtained the same result, although not as elegantly) – but that is a useful way to check even if someone doesn’t have the time or skill to work through the maths in detail.

      • Jeremy Fox says:

        That strikes me as an odd reason for a reviewer to object. I mean, what’s the point of checking the math if you then refuse to believe in your own checking? I’m glad the simulations proved convincing, though again I’m not sure why they should be any more convincing than the proofs.

        Unfortunately, certain kinds of models can be a bit tricky to simulate numerically (e.g., stochastic ODEs, models with discontinuous functions). And a good way to tell if the simulation is working properly is often to check it analytically. So I could imagine situations in which someone who can’t check the analytical results (or refuses to believe them!) could run into trouble trying to simulate them…

    • Jeremy Fox says:

      Re: models as maps, I use that analogy in my undergraduate quantitative methods class, to help ecology students (who often got into ecology in part because they thought it involved no math!) understand why they need to learn this stuff. It’s a very good and useful analogy. All models are false, just as all maps are (all maps necessarily omit some details and simplify or even distort others; you can’t have a map that’s a literal copy of the world itself). Further, this falsehood is what makes both models and maps useful (e.g., imagine trying to read a road map that included irrelevant details like lines of latitude, topography, individual buildings…). Different models include different details, and more or fewer details, because they have different purposes, just like maps (e.g., road maps vs. political maps vs. terrain maps), although which details to include often is less obvious when it comes to making models.

      • I agree that determining the appropriate level of complexity in models is often not obvious. Finding the balance between complexity and simplicity might be viewed like finding the place for a fulcrum to balance a beam. But I think it is more often a case of balancing a beam on a table – there are a range of models of varying complexity that can provide a useful balance. Even when building models for one particular purpose, it can be insightful to understand how predictions change with model complexity. But then we still have to figure out how big the table needs to be and where is it placed, which also might not be obvious…

      • Jeremy Fox says:

        Absolutely. Andrew Gelman just did a post on this in the context of “Occam’s Razor”, on how he doesn’t buy the notion that we should always seek simplicity in our models. The truth isn’t necessarily simple. Plus, no matter how simple or complex the truth is, it’s useful to understand the relationships among different approximations of varying complexity.

  15. Erol Akcay says:

    Well, as a theoretician who is kind of a stickler for mathematical and logical errors, I say that the peer-review process does have some problems when it comes to checking theory. My experience is that when I review papers, usually I am the only reviewer to have checked the math and (perhaps more importantly) asked whether the mathematical problem statement and conclusions match the verbal and biological statements. I have the sense that many modeling papers (especially if they are simulations or the “simpler” kind of math, which can be a deceptive kind of simplicity) don’t really undergo any rigorous check of their mathematical bits before publication. I think the saving grace is that in ecology and evolutionary biology, most people are quite conscientious about checking themselves, so real errors in math and simulations are rarer than they could be, but they do happen at a non-trivial frequency.

    In general, both for empirical and theoretical work, I think we need a bit more “post-publication peer review”. Specifically, a quick and painless way of correcting obvious errors in published papers, where the correction gets attached to the paper. Right now, there is a lot of friction in that process.

    • Good for you that you actually check the math. It is a thankless but important task. I think referees that do not check the math should at least say so explicitly that they did not do so. But should an editor insist that the math be checked just as the logic of a non-mathematical argument is? Or is the appropriate analogy the actual statistical analysis, which never gets checked? This is less creative and prone to error, I suppose.

      • Erol Akcay says:

        I think that the editor should insist that the math is checked. Actually, one explanation for my experience as a reviewer is that the editors when they send the paper to a theoretician like me, expect it to be reviewed for math and therefore pick the second reviewer to concentrate on other aspects. That is almost certainly going on, but I don’t think people ever are actually asked explicitly to check the math and some proportion of these papers may not go to any theory type at all. That said, I don’t think the most important problems are in the derivations (they tend to be correct); it’s the translation stage that is usually the most problematic. That stage can and should be reviewed by non-theoreticians, but it still requires delving into the mathematics to a degree.

        Some journals (like Animal Behaviour) ask in their questionnaire whether more specialist review of mathematical parts of the paper is required, but most journals don’t have a mechanism for that, I think. I will say, though, the problem with statistical analyses is probably worse, since they are much more widely used.

        Re: the Fawcett and Higginson paper, it confirms a lot of people’s experience with mathematical papers and arguments. People tend to not engage with them. (Theory type students are routinely trained by the cliche that for each equation in your slides, you lose half your audience.) It’s nice to finally have some data behind this. Also not surprising is that appendices don’t have any effect, which is both good (I think putting analysis into appendices does generally improve a paper), but also is dangerous in some ways, because many people simply don’t read the appendix, so sometimes don’t really know what’s actually done in a paper that they nonetheless use and cite. The Nowak et al paper is an example of this; if I had to guess, I’d probably say that less than 1 in 10 of its readers (and perhaps even citers) actually slogged through their appendix. There is also the fact that appendices in many journals seem to get reviewed even less thoroughly than the main paper, which obviously exacerbates the problem.

  16. My friend and fellow birder and first-rate geologist, Cin-Ty Lee has a nice post on building your model that is worth reading: http://www.downtoearthquestions.blogspot.com/

  17. Mike Fowler says:

    Fascinating post and discussion so far. Joan, I think your original post came across as too general in whom it was characterising, which is why it seems to have raised so many hackles. The comments have managed to clear up much of the confusion.

    I try to check maths and even simulations when I’m reviewing papers (in theoretical ecology). Most often, I find this highlights problems with relevant information missing from the Methods, meaning a study can’t be replicated based on the information presented. I have caught a couple of mathematical errors here and there though. And I have at least one mathematical typo that slipped past reviewers in one of my own publications.

    There’s an interesting early view article just up at PNAS, by Tim Fawcett and Andrew Higginson that is relevant to the discussion here:
    Heavy use of equations impedes communication among biologists
