We have wonderful undergraduates and we are failing them. We are failing in something important and I plan to fix it. That we are failing became very clear to me this past spring at their poster presentations. Generally the posters were lovely. The students well understood how to make and present a compelling poster. They outlined their main question and their results. They put their work in context. They understood what they were doing and why.
But not one of them had clearly done their own statistical analyses on their own data. Why not? What is so hard about statistics? After all, these students have all had calculus. Most had even taken a statistics class or used statistics in a class. How are we failing? Why is statistics seen as the last thing they do rather than the first?
I think it is because we have forgotten how to bake a cake. This fall we will be teaching the undergraduate researchers to bake just the cake they need for their research. Why should we do it this way? Why don’t we start with probability theory and move on to ANOVA, regression, mixed models, or whatever your favorites are? If you think about baking a cake the answer will be obvious. After all, we don’t teach cake baking by starting with the theory of heat and leavening agents. We hardly talk about when we do and don’t want to teach gluten its secret handshake. We just get out the recipe and bake it.
Once we have mastered one chocolate cake recipe, we might go on to others. We might make substitutions. In fact, my chocolate cake recipe served as my Emily and Julian’s wedding cake since it was the recipe best suited for a gluten-free flour substitution. Once you can make chocolate cake, you can try lemon cake, though it has its own tricks (I have a fabulous lemon cake secret). A few dozen cakes down the line and you might become interested in what exactly each ingredient and each step does. But how much harder it would be to start with the theory and then try to deduce an appropriate recipe!
So, this fall we are going to help students with their statistics. We are going to have them explain to us exactly what their questions are and how they are approaching them. We will work with them to be sure they have a good beginning grasp of experimental design. We will be sure they understand that at the heart of all their questions is variation: is the variation between treatments greater than the variation within treatments? If you don’t have at least two categories and at least two samples within each, you are in trouble.
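For readers who want to see what "variation between versus variation within" looks like as arithmetic, here is a minimal sketch of that ratio (the F statistic of a one-way ANOVA) worked out by hand. It is written in Python purely for illustration, not in the R we will hand to students. The function name `f_ratio` and the numbers are invented for this example.

```python
from statistics import mean


def f_ratio(groups):
    """Ratio of between-group to within-group variation (one-way ANOVA F).

    `groups` is a list of lists, one list of measurements per treatment.
    Hypothetical helper for illustration only.
    """
    k = len(groups)
    # The rule of thumb from the text: at least two categories,
    # and at least two samples within each.
    if k < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least two groups with at least two samples each")

    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation between treatments: how far each group mean sits
    # from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Variation within treatments: how far each measurement sits
    # from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Note: identical values within every group make ms_within zero,
    # and this sketch does not handle that case.
    return ms_between / ms_within


# Invented data: two treatments, four samples each.
control = [4.1, 5.0, 4.6, 5.3]
treated = [6.2, 7.1, 6.6, 7.3]
print(f_ratio([control, treated]))
```

A ratio well above 1 says the treatments differ by more than the samples within a treatment differ from each other. In R the same comparison is usually done with something along the lines of `summary(aov(y ~ group))`, which also reports a p-value for the ratio.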
We will simply give the students the R code for their experiments. They will play with it, and with graphing in ggplot2 or something similar, using invented data if they don’t yet have their own. They will understand the statistics for their own experiment. This will be their chocolate cake. From there they can compare with others in the class and gradually get familiar with other things R can do, with other kinds of questions, and with what exactly the test is doing.
But we will give them the recipe to begin with. How radical is that?