A really wonderful blog post by Stephen Senn, head of the Methodology and Statistics Group at the Competence Center for Methodology and Statistics in Luxembourg, sums up the philosophical problem I've always had with Bayesian inference in scientific studies. Basically, the question is: Where does the prior come from? Senn argues that it can't be your real prior, since you can't quantify your real prior.

Where else could your prior come from? Here's the list I thought of:

1. You could use a "standard" prior that a bunch of other people use because it's "noninformative" in some sense (e.g. a Jeffreys prior). See this Larry Wasserman blog post on some of the potential problems with that approach.

2. You could use some prior that comes from empirical data. This is the foundation of the "empirical Bayes" approach.

3. You could choose a bunch of different priors and see how sensitive the posterior is to the choice of prior. This could be done in a haphazard or a systematic way, and it's not immediately clear if one of those is always better than the other. The drawback of this approach is that it's a bit cumbersome, and hard to interpret.

4. You could choose a prior that is close to the answer you want to get. The less informative your data are, the closer your posterior will stay to that prior. This seems a bit scientifically dishonest. But I bet someone out there has tried it.

5. You can choose an "adversarial prior" that is similar to what you think someone who disagrees with your conclusion would say. (Thanks to Sean J. Taylor of Twitter for pointing this out.)

Have I missed any big ones?
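
To make options 3 through 5 a bit more concrete, here is a minimal sketch of a prior sensitivity check, assuming a toy Beta-Binomial setup where the posterior has a closed form. Everything in it is purely illustrative: the four priors, the made-up success counts, and the helper names `posterior_mean`, `sensitivity_table`, and `spread` are hypothetical and don't come from Senn's or Gelman's posts.

```python
# Toy Beta-Binomial sensitivity check.
# Assumption: a Beta(a, b) prior on a success probability with binomial data,
# so the posterior is Beta(a + successes, b + trials - successes).
# All priors and data sets below are invented for illustration.


def posterior_mean(a, b, successes, trials):
    """Posterior mean of the success probability under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


priors = {
    "flat Beta(1, 1)": (1.0, 1.0),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "wishful Beta(16, 4)": (16.0, 4.0),      # close to the answer you want
    "adversarial Beta(4, 16)": (4.0, 16.0),  # what a skeptic might pick
}

datasets = {
    "weak data (6 of 10)": (6, 10),
    "strong data (600 of 1000)": (600, 1000),
}


def sensitivity_table(priors, datasets):
    """Posterior mean for every (data set, prior) combination."""
    return {
        data_name: {
            prior_name: posterior_mean(a, b, successes, trials)
            for prior_name, (a, b) in priors.items()
        }
        for data_name, (successes, trials) in datasets.items()
    }


def spread(row):
    """Gap between the largest and smallest posterior mean across priors."""
    return max(row.values()) - min(row.values())


table = sensitivity_table(priors, datasets)
for data_name, row in table.items():
    print(data_name)
    for prior_name, mean in row.items():
        print(f"  {prior_name:<26} posterior mean = {mean:.3f}")
    print(f"  spread across priors = {spread(row):.3f}")
```

In this toy setup, the "wishful" and "adversarial" priors pull the posterior mean far apart when there are only 10 observations, and they nearly agree once there are 1000. That is the sense in which the choice of prior matters most exactly when the data say the least.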

Anyway, as always in stats, there's some element of intuition that can't be incorporated into the estimation in a systematic way.

Andrew Gelman, one of the high priests of Bayesianism, so to speak, had this to say about Senn's post:

> I agree with Senn’s comments on the impossibility of the de Finetti subjective Bayesian approach. As I wrote in 2008, if you could really construct a subjective prior you believe in, why not just look at the data and write down your subjective posterior. The immense practical difficulties with any serious system of inference render it absurd to think that it would be possible to just write down a probability distribution to represent uncertainty. I wish, however, that Senn would recognize “my” Bayesian approach (which is also that of John Carlin, Hal Stern, Don Rubin, and, I believe, others). De Finetti is no longer around, but we are!
>
> I have to admit that my own Bayesian views and practices have changed. In particular, I resonate with Senn’s point that conventional flat priors miss a lot and that Bayesian inference can work better when real prior information is used. Here I’m not talking about a subjective prior that is meant to express a personal belief but rather a distribution that represents a summary of prior scientific knowledge. Such an expression can only be approximate (as, indeed, assumptions such as logistic regressions, additive treatment effects, and all the rest, are only approximations too), and I agree with Senn that it would be rash to let philosophical foundations be a justification for using Bayesian methods. Rather, my work on the philosophy of statistics is intended to demonstrate how Bayesian inference can fit into a falsificationist philosophy that I am comfortable with on general grounds.

Cool.

__Update__: Thinking about it a little more, I don't think Senn's point really has any implications for study design. But it does seem to have implications for how a "client" (or reader of a paper) should treat a "Bayesian" researcher's results. Basically, a researcher doing Bayesian inference is not the same as a Bayesian agent in a model. A Bayesian agent in a model always uses her own prior, and thus always uses information optimally. A researcher doing Bayesian inference *cannot* use his own prior, and so may not be using information optimally. So using Bayesian inference shouldn't be a free ticket to respectability for research results.