10  The Life-Cycle Savings Model

We’ll consider identification of the savings model with the following income process:

\[ \log(y_{n,t}) = \mu_{t} + \varepsilon_{n,t}\]

where

\[ \varepsilon_{n,t+1} = \rho \varepsilon_{n,t} + \eta_{n,t},\qquad \eta_{n,t}\sim\mathcal{N}(0,\sigma^2_\eta). \]

Collecting parameters, we want to identify

\[ (\mu,\rho,\sigma_\eta),\qquad (\beta,\sigma,\psi) \]

where the first block indicates parameters of the income process, and the second block determines preferences.

10.1 Identification of the Income Process

Assume that our data has a panel dimension, so that we see \((y_t,C_t,t)_{t=\tau_0}^{\tau_1}\) for some pair \((\tau_0,\tau_1)\). Remember that \(t\) indexes age in the model, so it is quite plausible that \(\tau_0\) and \(\tau_1\) are themselves random variables (this will be true in the data: we see panels of individuals at different ages and for different lengths of time).

First, as long as the support of the random variables (\(\tau_0,\tau_1\)) covers \(t=1\) through to \(T\), \(\mu\) is identified as the mean of log income at each age, \(\mu_t = \mathbb{E}[\log(y_{t})]\). We can then residualize log income to get \(\varepsilon_{t} = \log(y_t)-\mu_t\).

Second, consider the following variances and covariances (remember that the \(\eta\) terms are iid):

\[\begin{aligned} \mathbb{V}[\varepsilon_{t+1}] &= \rho^2\mathbb{V}[\varepsilon_{t}] + \sigma^2_{\eta} \\ \mathbb{C}(\varepsilon_{t},\varepsilon_{t+1}) &= \rho\mathbb{V}[\varepsilon_{t}] \\ \mathbb{C}(\varepsilon_{t},\varepsilon_{t+2}) &= \rho^2\mathbb{V}[\varepsilon_t] \end{aligned}\]

meaning that we can identify \(\rho\) and \(\sigma_\eta\) from this system of simultaneous equations.
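
To make the argument concrete, here is one way to solve the system. It is a sketch that assumes \(\rho\neq 0\) and \(\mathbb{V}[\varepsilon_t]>0\), so that the first-order covariance is nonzero. Take the ratio of the two covariances to get \(\rho\), then back out the innovation variance from the variance recursion:

\[ \rho = \frac{\mathbb{C}(\varepsilon_{t},\varepsilon_{t+2})}{\mathbb{C}(\varepsilon_{t},\varepsilon_{t+1})},\qquad \sigma^2_\eta = \mathbb{V}[\varepsilon_{t+1}] - \rho^2\mathbb{V}[\varepsilon_{t}]. \]

Since \(\mathbb{V}[\varepsilon_t]\) is itself observed, the ratio \(\mathbb{C}(\varepsilon_{t},\varepsilon_{t+1})/\mathbb{V}[\varepsilon_{t}]\) is a second, equally valid expression for \(\rho\).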

Note: Whether vs How

Note that, since these moments can be calculated at any age \(t\) and can extend to arbitrary lags, this model is over-identified. Which moments should we use in practice? Thinking inside the model, we will address this topic when we get to discussing minimum distance estimation. You might also like to think outside the model and think about (1) how real income processes might deviate from your stylized model and (2) what features of the data you most want the parameters \(\rho\) and \(\sigma^2_\eta\) to capture.

Example: Data from PSID

Example 10.1 In this example we’ll load PSID data from Arellano, Blundell, and Bonhomme (2018) and show how sample equivalents to the above moments might be calculated.

To begin, let’s load the data and pull out the variables we are interested in using. These are person identifiers (person), year, total income (y), savings (tot_assets1) and age. You should bear in mind that it is by no means trivial to measure total income and total assets in these data. The variables we are looking at are the product of a lot of data cleaning and careful choices by the authors.

using CSV, DataFrames, DataFramesMeta, Statistics
data = @chain begin 
    CSV.read("../data/abb_aea_data.csv",DataFrame,missingstring = "NA")
    @select :person :y :tot_assets1 :asset :age :year
end
19317×6 DataFrame
   Row │ person  y       tot_assets1  asset    age    year
       │ Int64   Int64   Int64        Float64  Int64  Int64
───────┼───────────────────────────────────────────────────
     1 │  12061  173100       605000  15500.0     65     98
     2 │  17118   54000        60000      0.0     49     98
     3 │  12630   61283       224000  39283.0     59     98
     4 │  12647   42300        28240      0.0     38     98
     5 │   5239   82275         7500      0.0     56     98
     6 │   2671   69501        48000   3600.0     35     98
     7 │  13027   68000       148000  20000.0     49     98
     8 │   6791   93758        80000    160.0     41     98
     9 │   6475   26581        23300      0.0     35     98
    10 │  18332   33785            0      0.0     42     98
    11 │   3856   55300       311000   5300.0     33     98
    12 │  19326   40200       105250      0.0     40     98
    13 │  21818   42500        13000      0.0     36     98
     ⋮ │      ⋮       ⋮            ⋮        ⋮      ⋮      ⋮
 19306 │   6617  115887       241000  21346.0     62    108
 19307 │    626  128600        98000      0.0     46    108
 19308 │   4795  105000       -68000      0.0     34    108
 19309 │   3223  120000       132000      0.0     47    108
 19310 │   8098   26527         4700      0.0     37    108
 19311 │   8954  144026       220000     25.0     46    108
 19312 │  12990  122665       220000      0.0     53    108
 19313 │   8782   55000        69000      0.0     31    108
 19314 │  13059   42728       -10000      0.0     26    108
 19315 │  13535   57000            0      0.0     26    108
 19316 │   3806   87000        74200      0.0     26    108
 19317 │  11085   74000       -50000      0.0     31    108
                                        19292 rows omitted

To map to the model, assume that agents begin (\(t=1\)) when aged 25 and live for 40 years (so the “terminal” period is at age 64). Thus, we should filter the data to look at only these ages.

@subset!(data,:age.>=25,:age.<=64);

Now let’s residualize log income by age, to get our estimate of \(\varepsilon_{n,t}\):

data = @chain data begin
    groupby(:age)
    @transform :eps = log.(:y) .- mean(log.(:y))
end;

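Next, here is a simple way of pairing each residual with the same person’s residual from other survey waves (by mutating the year, renaming, and merging). Note that, because we subtract from the year before merging, the columns epslag1 and epslag2 actually hold the residuals two and four years ahead of eps. This is the timing that matches the covariances \(\mathbb{C}(\varepsilon_t,\varepsilon_{t+k})\) above.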

d1 = @chain data begin
    @select :year :person :eps
    @transform :year = :year .- 2
    @rename :epslag1 = :eps
end

d2 = @chain data begin
    @select :year :person :eps
    @transform :year = :year .- 4
    @rename :epslag2 = :eps
end

data = @chain data begin
    innerjoin(d1 , on=[:person,:year])
    innerjoin(d2 , on=[:person,:year])
end;

An example of calculating covariances:

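moments = @chain data begin
    @combine begin 
        :c1 = cov(:eps,:epslag1)   # two-year gap
        :c2 = cov(:eps,:epslag2)   # four-year gap
    end
end;

Since the PSID interviews take place only every two years, c1 spans a two-year gap and c2 a four-year gap. In the model these covariances equal \(\rho^2\mathbb{V}[\varepsilon_t]\) and \(\rho^4\mathbb{V}[\varepsilon_t]\), so their ratio is \(\rho^2\) and we have to take its square root to estimate \(\rho\):

rho_est = sqrt(moments.c2[1] / moments.c1[1])
println("The estimate of rho is $(round(rho_est,digits=2))")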
The estimate of rho is 0.98
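
One way to check this logic is on simulated data, where the true parameters are known. The sketch below is not part of the original example: the helper `simulate_ar1`, the sample size, and the parameter values (\(\rho=0.95\), \(\sigma_\eta=0.2\), an initial standard deviation of 0.5) are all assumptions chosen for illustration. It simulates annual shocks, keeps only every second period to mimic biennial interviews, recovers \(\rho\) from the square root of the covariance ratio, and then backs out \(\sigma_\eta\) from the two-year variance recursion \(\mathbb{V}[\varepsilon_{t+2}] = \rho^4\mathbb{V}[\varepsilon_t] + (1+\rho^2)\sigma^2_\eta\).

```julia
using Random, Statistics

# Hypothetical helper: simulate N individuals over T annual periods from
# eps_{t+1} = rho * eps_t + eta_t, with eta_t ~ N(0, sigma_eta^2).
function simulate_ar1(N, T; rho, sigma_eta, sigma_1, rng)
    resid = zeros(N, T)
    resid[:, 1] .= sigma_1 .* randn(rng, N)
    for t in 2:T
        resid[:, t] .= rho .* resid[:, t-1] .+ sigma_eta .* randn(rng, N)
    end
    return resid
end

# Illustrative parameter values (assumptions, not estimates from the PSID).
rng = MersenneTwister(1234)
resid = simulate_ar1(200_000, 5; rho = 0.95, sigma_eta = 0.2, sigma_1 = 0.5, rng = rng)

# Biennial observation: only periods 1, 3 and 5 are seen.
e0, e2, e4 = resid[:, 1], resid[:, 3], resid[:, 5]

c1 = cov(e0, e2)            # rho^2 * V[eps_t] in the model
c2 = cov(e0, e4)            # rho^4 * V[eps_t] in the model
rho_hat = sqrt(c2 / c1)

# Two-year variance recursion solved for sigma_eta.
sigma_eta_hat = sqrt((var(e2) - rho_hat^4 * var(e0)) / (1 + rho_hat^2))

println("rho_hat = $(round(rho_hat, digits = 3)), sigma_eta_hat = $(round(sigma_eta_hat, digits = 3))")
```

With a large simulated sample both estimates should land close to the values used to generate the data. The same two-year recursion could be applied to the PSID residuals, using variances at ages two years apart.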

To assess this income process, let’s consider its ability to fit the life-cycle profile of the variance of log income:

using Plots
d = @chain data begin
    groupby(:age)
    @combine :var_income = var(log.(:y))
end
scatter(d.age,d.var_income,smooth = true,label = false)

The variance of log income seems to grow linearly with age, which would be hard for our income process to fit if either:

  1. We assume that \(\varepsilon\) is initially in its stationary distribution, since the variance is then constant in age; or
  2. \(\rho\) is far from 1, since that implies a concave path for the variance.
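
To see why, iterate the variance recursion from Section 10.1 forward from age \(t=1\). The closed form below is a short derivation under the model's assumptions (iid \(\eta\) and \(|\rho|<1\)), not an additional result from the data:

\[ \mathbb{V}[\varepsilon_{t}] = \rho^{2(t-1)}\mathbb{V}[\varepsilon_{1}] + \sigma^2_\eta\,\frac{1-\rho^{2(t-1)}}{1-\rho^2} \]

If \(\varepsilon_1\) starts in the stationary distribution, then \(\mathbb{V}[\varepsilon_1]=\sigma^2_\eta/(1-\rho^2)\) and the variance is the same at every age. Otherwise the age-to-age increment is

\[ \mathbb{V}[\varepsilon_{t+1}] - \mathbb{V}[\varepsilon_{t}] = \rho^{2(t-1)}\left(\sigma^2_\eta - (1-\rho^2)\mathbb{V}[\varepsilon_1]\right), \]

which shrinks geometrically with age, giving a concave profile when the initial variance is below its stationary level. Only as \(\rho\rightarrow 1\) do the increments approach the constant \(\sigma^2_\eta\), so that \(\mathbb{V}[\varepsilon_t] = \mathbb{V}[\varepsilon_1] + (t-1)\sigma^2_\eta\) grows linearly.
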

Exercise 10.1 Suppose that income processes also feature permanent differences in productivity among individuals, so that:

\[ \log(y_{n,t}) = \mu_t + \alpha_n + \varepsilon_{n,t} \]

where \(\varepsilon_{n,t}\) is defined as before, and \(\alpha_n\) is the individual fixed effect in log income. Assume that \(\alpha\perp \varepsilon_1\), \(\alpha \perp \eta_t\) for all \(t\), and define \(\sigma^2_\alpha = \mathbb{V}[\alpha]\).

  1. Show that you can identify this income process using additional covariances.
  2. Estimate the parameters \((\rho,\sigma^2_\alpha,\sigma^2_\eta)\) by following your identification argument using the PSID data from Example 10.1.
  3. How do your estimates compare to Example 10.1 where we ignored permanent individual heterogeneity?

10.2 Identification of Preference Parameters

This case is interesting because it more closely reflects how you are likely to approach identification in your own research.

In previous examples, we typically made use of analytical (i.e. “pencil and paper”) representations of optimal behavior as they relate to deeper parameters, and we used this for identification. That is harder to do here since we know we must solve for savings policies numerically.

Identification of more complicated models

Here are some steps to help you think through identification of your model.

  1. Could you obtain identification if you had a “perfect” data set or experiment? Do your data allow a second-best approximation that harnesses the intuition of this perfect alternative? Remember that to show identification you can dispense with the practical considerations of finite samples and zoom in on very specific comparisons within the population distribution.
  2. What kind of variation is in the data that you do have and how do the parameters determine individuals’ response to that variation? If necessary, you can play with numerical solutions of the model to develop your intuition here.
  3. Can you simplify your model in a way that highlights some of the key forces of identification?

These approaches all differ in their level of precision. Your main goal is to provide your audience and yourself with some credible and sensible intuition. For the savings model, let’s use some combination of strategies (2) and (3). First, note that when \(\psi=0\), individuals would run their assets down to zero in the final period. Thus, \(\psi\) is very clearly identified by average bequests at the end of the life-cycle.

Now, suppose we remove uncertainty from the model and impose the natural borrowing constraint, such that the Euler equation becomes:

\[ \beta(1+r)\left(\frac{C_{t+1}}{C_{t}}\right)^{-\sigma}=1 \]

Notice that \(\sigma\) determines the intertemporal elasticity of substitution: how individuals would substitute consumption across periods when there is variation in the price of doing so (\(r\)). What sources of variation do we have in this model? Only the income shocks \(\eta\). Without uncertainty, individuals then choose a consumption profile based on the effect of that shock on the net present value of income. The resultant path depends on \(\beta\), \(r\), and \(\sigma\), but importantly \(\beta\) and \(\sigma\) are not separately identified. Thus, uncertainty and borrowing constraints hold the key for separately identifying parameters in our setting.
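
Taking logs of the Euler equation above makes the problem explicit. As a sketch, setting aside the bequest motive (\(\psi=0\)) and assuming the constraint never binds:

\[ \log C_{t+1} - \log C_{t} = \frac{1}{\sigma}\log\big(\beta(1+r)\big) \]

With a single, constant \(r\), consumption growth reveals only the ratio \(\log(\beta(1+r))/\sigma\). Any pair \((\beta',\sigma')\) with the same ratio generates the same growth rate, and the lifetime budget constraint then delivers the same consumption level. In the knife-edge case \(\beta(1+r)=1\), consumption is flat for every value of \(\sigma\).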

Now, we know from experimenting with this model that as individuals accumulate assets, the risk of hitting the borrowing constraint diminishes and their behavior begins to more closely reflect the case without uncertainty: consumption responses are very close to linear with respect to cash in hand. Thus, to identify \(\beta\) and \(\sigma\) separately, we need to focus on potential nonlinearities in consumption behavior closer to the borrowing constraint. One example of a set of identifying moments would be:

  1. Mean assets at each age; and
  2. The covariance of changes in log consumption with log income conditional on different asset levels.

While the first set of moments should pin down \(\beta\) and \(\psi\) jointly by effectively matching average consumption profiles, the second set attempts to pin down the nonlinear effect that \(\sigma\) has on consumption at different wealth levels.
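
As a rough illustration of the second set of moments, the sketch below computes the covariance of consumption growth with log income within asset terciles. Everything here is an assumption made for illustration: the helper `cov_by_asset_bin`, the choice of terciles, and above all the data, which are synthetic placeholders constructed so that consumption responds more strongly to income at low asset levels. They are neither model output nor taken from the PSID.

```julia
using Random, Statistics

# Hypothetical helper: covariance of consumption growth with log income
# within quantile bins of assets. `probs` sets the bin boundaries.
function cov_by_asset_bin(dlogc, logy, assets; probs = [1/3, 2/3])
    cuts = quantile(assets, probs)
    # searchsortedfirst returns 1 for assets at or below the first cut point
    # and length(cuts) + 1 for assets above the last one.
    bin = [searchsortedfirst(cuts, a) for a in assets]
    return [cov(dlogc[bin .== b], logy[bin .== b]) for b in 1:(length(cuts) + 1)]
end

# Synthetic placeholder data (NOT model output, NOT the PSID): the sensitivity
# of consumption growth to income falls with assets by construction.
rng = MersenneTwister(42)
n = 30_000
assets = rand(rng, n)
logy = randn(rng, n)
dlogc = (0.8 .- 0.6 .* assets) .* logy .+ 0.1 .* randn(rng, n)

cov_moments = cov_by_asset_bin(dlogc, logy, assets)
println(round.(cov_moments, digits = 2))
```

In an estimation exercise, the same function would be applied both to the data and to data simulated from the model, and the moments compared at each asset bin. The number of bins is a design choice: finer bins isolate behavior near the borrowing constraint more sharply, at the cost of noisier moments.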

As you can see, this is a sensible intuitive approach to identification that does not offer an exact mapping between data and parameters.

Since \(\sigma\) determines both risk aversion and the intertemporal elasticity of substitution, some ideal data settings that would identify \(\sigma\) would include:

  1. The risk profile of asset portfolio choices (we don’t have this).
  2. Variation in the income risk faced by individuals (we don’t have this).
  3. Variation in the returns to saving either through \(r_t\) or through policy intervention (we don’t have this).

In general, then, there are many ways to identify \(\sigma\), but most are unavailable in our simple model, so we will have to be more careful with the moments we choose.

Whether vs How

Suppose you wanted to use this model to evaluate the effect of a pension program reform on savings behavior. Would you be comfortable forecasting counterfactuals with this kind of identification approach? What kind of variation in the data would help you feel that your identification approach was more credible?