Identification and Credible Inference
In this section of the course we will introduce the concept of identification and – with the help of our prototype models – examine some examples of different strategies for establishing identification of economic models.
As a fan of cricket, one thing I have realized when explaining the rules to a newcomer is that the word “wicket” can refer to one of several different things. That’s confusing! Likewise, economists tend to use the word “identification” loosely, and it can mean one of two things depending on context:
- Let \(P_{\theta}\) be the distribution over data implied by model parameters \(\theta\in\Theta\). Formally, identification refers to whether the mapping between model and data can be inverted to a unique parameter (point identification) or set of parameters (partial identification). A model is point identified if for all \(\theta\) and \(\theta'\) in \(\Theta\): \[ P_{\theta}=P_{\theta'}\ \Rightarrow \theta = \theta' \]
- Informally, identification discussions typically focus on how a particular model is identified, which has more to do with the estimation strategy than it does with the question of identification.
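To make the formal definition concrete, here is a minimal sketch using two toy models that are not part of our prototype models and are assumed purely for illustration. In the first, \(Y\sim N(a+b,1)\) with \(\theta=(a,b)\), so two different parameter vectors with the same sum imply the same distribution and point identification fails. In the second, \(Y\sim N(\mu,1)\) with \(\theta=\mu\), and different parameters imply different distributions. The function names and the grid-based comparison are hypothetical conveniences; comparing densities on a grid is a numerical check, not a proof.

```python
import math


def normal_pdf(y, mean, sd=1.0):
    """Density of a Normal(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def density_sum_model(y, theta):
    """Toy model (assumed for illustration): Y ~ Normal(a + b, 1), theta = (a, b)."""
    a, b = theta
    return normal_pdf(y, a + b)


def density_location_model(y, theta):
    """Toy model (assumed for illustration): Y ~ Normal(mu, 1), theta = mu."""
    return normal_pdf(y, theta)


def same_distribution(density, theta, theta_prime, grid, tol=1e-12):
    """Numerically check P_theta = P_theta' by comparing densities on a grid."""
    return all(
        abs(density(y, theta) - density(y, theta_prime)) <= tol for y in grid
    )


# Evaluation points from -10 to 10 in steps of 0.1.
grid = [i / 10.0 for i in range(-100, 101)]

# Distinct parameters with the same sum: the implied distributions coincide,
# so the implication P_theta = P_theta' => theta = theta' fails.
sum_model_equivalent = same_distribution(
    density_sum_model, (1.0, 2.0), (2.5, 0.5), grid
)

# Distinct means: the implied distributions differ, which is consistent
# with point identification of mu.
location_model_equivalent = same_distribution(
    density_location_model, 3.0, 3.1, grid
)

print(sum_model_equivalent, location_model_equivalent)
```

In the first toy model only the sum \(a+b\) can be recovered from the data, so the set of \((a,b)\) pairs consistent with a given distribution is a line rather than a point.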
Out of pragmatism, most quantitative models already impose parametric restrictions on functional forms (such as utilities, production functions, wage equations, and distributions of unobservables). We will see that this can often mean that there are many ways to identify and estimate parameters, and so much of our effort is directed at the second (informal) sense of identification rather than the first (formal) one. When there are multiple paths to identification, it is worth considering whether one path might be considered more “credible” in the sense that it is more robust to mild extensions, or relies less heavily on strong functional form and/or distributional assumptions.
In my experience, one issue that can be confusing for students is that economists in some fields spend more time talking about identification than they do actually showing identification. In this chapter, we’ll show identification of our simple prototype models, and then consider extensions that show that this identification can sometimes rest on overly strong assumptions. Then we’ll consider some common strategies that can potentially reduce the reliance on these overly strong assumptions.
As we work through the examples, I will (reluctantly) apply the labels “credible” and “incredible” inference, but only to help convey a broader understanding of what this labeling even means. I would like to establish, by way of these examples, that “credible inference” – a phrase you will hear more broadly in the profession (Angrist and Pischke 2010) – is not a particularly useful taxonomy for quantitative economics. Credibility is subjective, and so-called “credible” inference strategies often contain quietly embedded structure and require much stronger assumptions before any portable lessons can be drawn from the exercise (we discussed examples at length in the previous chapter). A more productive – and philosophically neutral – strategy is to be clear and transparent about the assumptions under which key parameters are identified, how they are identified, and therefore which sources of data are most influential in determining the calculations from a given quantitative modeling exercise. That’s an honorable objective that satisfies the main imperative of the “credibility revolution”.