Let’s Take the Con Out of Econometrics

Leamer, Edward E., “Let’s Take the Con Out of Econometrics,” The American Economic Review, Vol. 73, No. 1 (1983), pp. 31-43.

  • statistical inference is not a precise laboratory-style science (parable of farmers, birds, and shade)
    • econometricians can interpret data, but cannot usually perform controlled experiments
    • even with randomly selected samples, the bias of the estimators may be presumed small, but it cannot safely be assumed to be zero
    • the uncertainty surrounding sample selection falls as the sample size increases
    • the uncertainty surrounding model misspecification does not fall with increased sample size, and cannot be inferred from the data
      • One way to decrease this uncertainty is to collect data from two separate [non]experiments whose biases are independently distributed.  Averaging the two estimates yields a bias equal to the average of the individual biases, with only half the misspecification uncertainty.
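The variance arithmetic behind the two-experiment point can be checked with a small simulation (a minimal sketch with invented numbers: a true value of 1.0 and a misspecification-bias standard deviation of 0.5; sampling noise is ignored here since it already shrinks with sample size):

```python
import random

random.seed(0)

def biased_estimate(true_value, bias_sd):
    # One (non)experiment: the estimate is off by a random,
    # unknown misspecification bias that does not shrink with n.
    return true_value + random.gauss(0, bias_sd)

# Draw many single estimates, and many averages of two estimates
# whose biases are independently distributed.
true_value, bias_sd, trials = 1.0, 0.5, 100_000
singles = [biased_estimate(true_value, bias_sd) for _ in range(trials)]
pairs = [(biased_estimate(true_value, bias_sd) +
          biased_estimate(true_value, bias_sd)) / 2 for _ in range(trials)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# The average of two independently biased estimates carries roughly
# half the misspecification variance of a single estimate.
print(variance(singles))  # close to bias_sd**2 = 0.25
print(variance(pairs))    # close to half of that
```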
  • Only a model with infinite variables and infinite data is beyond all scrutiny
    • For any data set, there are infinitely many polynomial equations that fit the data points equally well.
    • For any experiment or nonexperiment, an infinite number of variables could plausibly affect the observed outcome (generating substantial degrees-of-freedom problems).
    • For a model with unlimited parameters, a finite data set admits infinitely many parameter estimates, each fitting the data to a different degree and appearing more or less believable.
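The point about polynomial fits is easy to demonstrate: given one exact fit, adding any multiple of a polynomial that vanishes at every data point produces another exact fit. A sketch with three invented data points:

```python
# Three data points, fit exactly by the quadratic y = x**2 + 1.
points = [(0.0, 1.0), (1.0, 2.0), (2.0, 5.0)]

def p1(x):
    return x**2 + 1

def p2(x, c=3.0):
    # (x)(x-1)(x-2) is zero at every data point, so adding any
    # multiple c of it leaves the fit to the data unchanged.
    return p1(x) + c * x * (x - 1) * (x - 2)

# Both polynomials fit the data perfectly...
for x, y in points:
    assert p1(x) == y and p2(x) == y

# ...yet they disagree everywhere off the data.
print(p1(3.0), p2(3.0))  # 10.0 vs 28.0
```

Since c can be anything, infinitely many polynomials fit the three points equally well; the data alone cannot choose among them.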
  • Prior assumptions are the key
    • All inferences rely on assumptions formed before looking at the data.
    • It is best to use assumptions that are generally accepted, that are convenient, and that generate the same results as the other assumptions in their (broad) class.
  • The Horizon Problem
    • Starting with a model and then adjusting the horizon until the model fits is a problem.
    • Starting with the data and then inferring a model is a problem because it is impossible to tell whether the data validates the data-inspired model.
    • Start with a model, determine beforehand the horizon that will be sufficient to validate it, and limit (but do not rule out) adding variables ex post.
  • Conclusions
    • Accept that all inferences rely on assumptions about which variables to include, how to collect the data, etc.
    • Make assumptions beforehand, and then show that the results are insensitive to reasonable variations in those assumptions.
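This prescription is the heart of Leamer's extreme bounds analysis: re-estimate the coefficient of interest under every defensible choice of control variables and report the full range of estimates. A minimal sketch on synthetic data (all variable names and parameter values here are invented for illustration):

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic data: y depends on the focus variable x and on a
# control z1 that is correlated with x; z2 is irrelevant.
x  = rng.normal(size=n)
z1 = 0.5 * x + rng.normal(size=n)
z2 = rng.normal(size=n)
y  = 1.0 + 2.0 * x + 1.0 * z1 + rng.normal(size=n)

controls = {"z1": z1, "z2": z2}
names = list(controls)

# Re-estimate the coefficient on x under every subset of controls.
estimates = {}
for k in range(len(names) + 1):
    for subset in combinations(names, k):
        cols = [np.ones(n), x] + [controls[s] for s in subset]
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates[subset or ("none",)] = beta[1]

# The extreme bounds: if this range is wide, the inference about x
# is fragile with respect to the choice of controls.
lo, hi = min(estimates.values()), max(estimates.values())
print(f"coefficient on x ranges from {lo:.2f} to {hi:.2f}")
```

Here the estimate swings between roughly 2.0 (z1 included) and 2.5 (z1 omitted), exposing how much the conclusion depends on the assumption of which variables to include.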