Recitation 6: Introduction to the Bootstrap

We are going to compute bootstrapped standard errors for our estimates of the income process parameters.
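As a quick refresher, the income process we have in mind (and that the code below estimates) is

\[
\log(y_{it}) = \mu_{t} + \eta_{it}, \qquad \eta_{it} = \rho\,\eta_{i,t-1} + \varepsilon_{it}, \qquad \varepsilon_{it} \sim (0,\sigma_{\eta}^{2}),
\]

where \(\mu_{t}\) is the mean of log income at age \(t\) and the residual \(\eta_{it}\) follows an AR(1) with persistence \(\rho\) and innovation variance \(\sigma_{\eta}^{2}\).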

Setup: Loading the Data

Nothing new here relative to previous weeks: we load the data and keep observations with ages 25 to 64.

using CSV, DataFrames, DataFramesMeta, Statistics
data = @chain begin 
    CSV.read("../data/abb_aea_data.csv",DataFrame,missingstring = "NA")
    @select :person :y :tot_assets1 :asset :age :year
    @subset :age.>=25 :age.<=64
end
19139×6 DataFrame
   Row │ person       y  tot_assets1    asset    age   year
       │  Int64   Int64        Int64  Float64  Int64  Int64
───────┼────────────────────────────────────────────────────
     1 │  17118   54000        60000      0.0     49     98
     2 │  12630   61283       224000  39283.0     59     98
     3 │  12647   42300        28240      0.0     38     98
     4 │   5239   82275         7500      0.0     56     98
     5 │   2671   69501        48000   3600.0     35     98
     6 │  13027   68000       148000  20000.0     49     98
     7 │   6791   93758        80000    160.0     41     98
     8 │   6475   26581        23300      0.0     35     98
     9 │  18332   33785            0      0.0     42     98
    10 │   3856   55300       311000   5300.0     33     98
    11 │  19326   40200       105250      0.0     40     98
    12 │  21818   42500        13000      0.0     36     98
    13 │   7300  121508       178000  10008.0     59     98
     ⋮ │      ⋮       ⋮            ⋮        ⋮      ⋮      ⋮
 19128 │   6617  115887       241000  21346.0     62    108
 19129 │    626  128600        98000      0.0     46    108
 19130 │   4795  105000       -68000      0.0     34    108
 19131 │   3223  120000       132000      0.0     47    108
 19132 │   8098   26527         4700      0.0     37    108
 19133 │   8954  144026       220000     25.0     46    108
 19134 │  12990  122665       220000      0.0     53    108
 19135 │   8782   55000        69000      0.0     31    108
 19136 │  13059   42728       -10000      0.0     26    108
 19137 │  13535   57000            0      0.0     26    108
 19138 │   3806   87000        74200      0.0     26    108
 19139 │  11085   74000       -50000      0.0     31    108
                                          19114 rows omitted

Here is a function to calculate \(\hat{\mu}_{t}\), the mean of log income at each age:

function estimate_mu(data)
    μ = @chain data begin
        groupby(:age)
        @combine :μ = mean(log.(:y)) #<- mean of log income within each age group
        @orderby :age
        _.μ #<- extract just the vector of means, ordered by age
    end
    return μ
end

μ_est = estimate_mu(data)
40-element Vector{Float64}:
 10.760860880321463
 10.841247443701185
 10.902095326248284
 10.91448929395576
 10.998556154206424
 10.953465022792393
 11.100954062096019
 11.091165981452196
 11.084065028498708
 11.063159174262802
 11.164647085813545
 11.09762896621344
 11.0958958847166
  ⋮
 11.305097087283851
 11.30167368879086
 11.289951332574214
 11.218984368534626
 11.22669701047502
 11.157589262616021
 11.212633411833302
 11.139162326045438
 11.048452223663316
 11.020306496546475
 10.955869785866497
 10.995127001069111

A function to calculate residuals and get lagged residuals:

function get_resids(data)

    # residual: log income net of the age-specific mean
    data = @chain data begin
        groupby(:age)
        @transform :resid = log.(:y) .- mean(log.(:y))
    end

    # the panel is biennial, so one lag = two calendar years: shifting :year
    # forward by 2 and joining on (:person,:year) attaches the previous wave's
    # residual as :rlag1
    d1 = @chain data begin
        @select :year :person :resid
        @transform :year = :year .+ 2
        @rename :rlag1 = :resid
    end

    # same trick for the second lag (four calendar years back)
    d2 = @chain data begin
        @select :year :person :resid
        @transform :year = :year .+ 4
        @rename :rlag2 = :resid
    end

    # keep only observations for which both lags exist
    data = @chain data begin
        innerjoin(d1 , on=[:person,:year])
        innerjoin(d2 , on=[:person,:year])
    end
    return data
end
get_resids (generic function with 1 method)
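To make the lag construction concrete, here is a minimal illustration of the shift-and-join trick on a made-up two-person panel (the toy data below is hypothetical, not from our dataset):

toy = DataFrame(person = [1,1,1,2,2], year = [98,100,102,98,100],
                resid = [0.1,0.2,0.3,-0.1,-0.2])
lag1 = @chain toy begin
    @transform :year = :year .+ 2 #<- shift each observation forward one wave...
    @rename :rlag1 = :resid       #<- ...so :rlag1 lines up with the next wave's :resid
end
innerjoin(toy, lag1, on = [:person,:year]) #<- keeps only rows that have a valid lag

The join returns three rows (person 1 at years 100 and 102, person 2 at year 100): exactly the observations for which a one-wave lag exists.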

Then, finally, a function to calculate the moments we are going to target in the minimum distance step: the variance of the residual and its covariances with the first and second lags.

function get_moments(data)
    data = get_resids(data)
    return [var(data.resid), cov(data.resid,data.rlag1), cov(data.resid,data.rlag2)]
end

m = get_moments(data)
3-element Vector{Float64}:
 0.6042229503656219
 0.3937491407622511
 0.3537264122186026

Creating a Bootstrapped Sample

Recall from class that we have to be careful to resample at the correct unit level when bootstrapping. Since observations are correlated within person, we have to resample the person identifiers with replacement. Here’s a function to create a bootstrapped panel dataset by sampling individuals with replacement.

function draw_bootstrap_sample(data)
    id = unique(data.person) #<- create a list of all person identifiers in the data
    N = length(id)
    id_boot = id[rand(1:N,N)] #<- sample N individuals randomly with replacement
    d = DataFrame(person = id_boot, id_boot = 1:N) #<- add a unique identifier for each draw (:id_boot)
    boot_data = innerjoin(d,data,on=:person) 
    @transform!(boot_data,:id_old = :person, :person = :id_boot) #<- since we create lags using the person identifier, it must be unique for each draw
    return boot_data
end

db = draw_bootstrap_sample(data)
19078×8 DataFrame
   Row │ person  id_boot       y  tot_assets1    asset    age   year  id_old
       │  Int64    Int64   Int64        Int64  Float64  Int64  Int64   Int64
───────┼─────────────────────────────────────────────────────────────────────
     1 │   4489     4489   42300        28240      0.0     38     98   12647
     2 │   4158     4158   82275         7500      0.0     56     98    5239
     3 │   3180     3180   69501        48000   3600.0     35     98    2671
     4 │    937      937   68000       148000  20000.0     49     98   13027
     5 │   4278     4278   93758        80000    160.0     41     98    6791
     6 │   4855     4855   93758        80000    160.0     41     98    6791
     7 │    590      590   26581        23300      0.0     35     98    6475
     8 │   3136     3136   55300       311000   5300.0     33     98    3856
     9 │   4937     4937   55300       311000   5300.0     33     98    3856
    10 │   2966     2966   42500        13000      0.0     36     98   21818
    11 │   1300     1300  121508       178000  10008.0     59     98    7300
    12 │   3010     3010  121508       178000  10008.0     59     98    7300
    13 │   3144     3144  121508       178000  10008.0     59     98    7300
     ⋮ │      ⋮        ⋮       ⋮            ⋮        ⋮      ⋮      ⋮       ⋮
 19067 │   2635     2635  128600        98000      0.0     46    108     626
 19068 │   1170     1170  105000       -68000      0.0     34    108    4795
 19069 │   3701     3701  105000       -68000      0.0     34    108    4795
 19070 │   5023     5023  105000       -68000      0.0     34    108    4795
 19071 │   1703     1703   26527         4700      0.0     37    108    8098
 19072 │   2410     2410   26527         4700      0.0     37    108    8098
 19073 │   4654     4654  144026       220000     25.0     46    108    8954
 19074 │   1207     1207   55000        69000      0.0     31    108    8782
 19075 │   4153     4153   55000        69000      0.0     31    108    8782
 19076 │   2238     2238   42728       -10000      0.0     26    108   13059
 19077 │   3722     3722   42728       -10000      0.0     26    108   13059
 19078 │   4025     4025   42728       -10000      0.0     26    108   13059
                                                           19053 rows omitted

Bootstrapping for a Weighting Matrix

Suppose we would like to weight our minimum distance criterion using the inverse of the variance of each statistic. A simple way to compute this variance is to use the bootstrap. Here is code to do that:

using Random
function boot_moment_variance(data,B ; seed = 1010)
    M = zeros(3,B) #<- one column of moments per bootstrap replication
    Random.seed!(seed) #<- fix the seed so the draws are reproducible
    for b in axes(M,2)
        db = draw_bootstrap_sample(data)
        M[:,b] = get_moments(db)
    end
    V = cov(M') #<- covariance of the moments across replications
    return V
end

V_mom = boot_moment_variance(data,100)
3×3 Matrix{Float64}:
 0.00162775   0.000432301  0.000423284
 0.000432301  0.000378095  0.000195091
 0.000423284  0.000195091  0.00035493

We could use the inverse of this matrix as the weighting matrix, or just the inverse of its diagonal.
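For example, a minimal sketch of both options (Diagonal and diag come from the LinearAlgebra standard library, which is loaded in the next section anyway):

using LinearAlgebra
W_opt  = inv(V_mom)                 #<- full inverse: the "optimal" weighting matrix
W_diag = Diagonal(1 ./ diag(V_mom)) #<- inverse of the diagonal only, ignoring covariances

Either can be passed as W to the estimation routine below.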

Bootstrapping the Parameters

To bootstrap the parameters we first need to write some simple code to estimate the income process.
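To see where the model moments come from: under the AR(1) assumption, stationarity implies

\[
\mathbb{V}(\eta_{it}) = \frac{\sigma_{\eta}^{2}}{1-\rho^{2}} \equiv v, \qquad \mathrm{Cov}(\eta_{it},\eta_{i,t-k}) = \rho^{k} v.
\]

Because the panel is biennial, the first and second observed lags correspond to \(k=2\) and \(k=4\), which is why model_moms returns \([v,\ \rho^{2}v,\ \rho^{4}v]\).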

using LinearAlgebra, Optim
function model_moms(ρ,σ)
    v = σ^2 / (1 - ρ^2) #<- stationary variance of the AR(1)
    return [v, ρ^2 * v, ρ^4 * v] #<- variance plus the two- and four-year autocovariances
end

function min_distance_obj(x,m0,W)
    m1 = model_moms(x[1],x[2])
    dm = m1 .- m0 #<- gap between model and data moments
    return dm' * W * dm #<- weighted quadratic form
end

function estimate_income_process(data, W)
    μ = estimate_mu(data)
    m0 = get_moments(data)
    res = optimize(x->min_distance_obj(x,m0,W),[0.9,0.1],Newton(),autodiff=:forward) #<- Newton's method with forward-mode autodiff
    return (; μ, ρ = res.minimizer[1], ση = res.minimizer[2])
end

r = estimate_income_process(data,I(3))
(μ = [10.760860880321463, 10.841247443701185, 10.902095326248284, 10.91448929395576, 10.998556154206424, 10.953465022792393, 11.100954062096019, 11.091165981452196, 11.084065028498708, 11.063159174262802  …  11.289951332574214, 11.218984368534626, 11.22669701047502, 11.157589262616021, 11.212633411833302, 11.139162326045438, 11.048452223663316, 11.020306496546475, 10.955869785866497, 10.995127001069111], ρ = 0.8619354766570221, ση = 0.3888428287852483)

Now we can apply the exact same routine as before. In this case, let’s return the entire bootstrapped sample of parameters instead of just the variance.

function boot_parameters(data,B,W ; seed = 2020)
    mu_b = zeros(40,B) #<- one column of age means per replication (40 ages, 25 to 64)
    ρ_b = zeros(B)
    σ_b = zeros(B)
    Random.seed!(seed)
    for b in eachindex(ρ_b)
        db = draw_bootstrap_sample(data)
        r = estimate_income_process(db,W)
        mu_b[:,b] = r.μ
        ρ_b[b] = r.ρ
        σ_b[b] = r.ση
    end
    return mu_b, ρ_b, σ_b
end

W_opt = inv(V_mom)

mu_b, ρ_b, σ_b = boot_parameters(data,100,W_opt)
([10.73580760699535 10.816247401707118 … 10.826997992362788 10.823451250771296; 10.80821234096262 10.838035193746032 … 10.838646491668136 10.860638767368634; … ; 10.932575248306065 10.976627931101024 … 10.993294394103312 10.905486358310869; 10.987084615895972 10.949352984020404 … 10.90096845528756 11.097782268019063], [0.8758970696492175, 0.8749031842046412, 0.876266168284345, 0.8842489702570284, 0.8895021308118279, 0.8813061168077119, 0.8674746704022943, 0.8631370148641132, 0.869562261996464, 0.8691779540595619  …  0.873224740029997, 0.8858033021530047, 0.8832392563927576, 0.8705757274838543, 0.8793051161142941, 0.8883241567846492, 0.9232806638603563, 0.871593632447117, 0.8752252490344966, 0.9017572628768009], [0.3631400069730627, 0.34742241728061213, 0.352626326206807, 0.34117177871761845, 0.33614178219528, 0.343324062252806, 0.3863865751127758, 0.3778308056994547, 0.373132407007769, 0.37179238682244176  …  0.37077420410018186, 0.3435244271756953, 0.339663045471818, 0.36043023196833307, 0.3503088163031108, 0.3481738289608236, 0.2622013601425302, 0.35282338167619254, 0.35564318535399464, 0.3018655922610499])

This would give us standard errors:
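In other words, for each parameter \(\theta\) we take the standard deviation of the estimates across the \(B\) bootstrap replications,

\[
\widehat{se}(\hat{\theta}) = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\big(\hat{\theta}_{b}-\bar{\theta}\big)^{2}}, \qquad \bar{\theta} = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}_{b},
\]

which is what std computes (it applies the \(B-1\) correction by default).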

se_mu = std(mu_b,dims=2)
@show se_ρ = std(ρ_b)
@show se_σ = std(σ_b)
se_ρ = std(ρ_b) = 0.015419484415975956
se_σ = std(σ_b) = 0.029651429799185287
0.029651429799185287

Exercise

Try comparing the variance of these estimators under the estimated optimal weighting matrix versus the identity matrix. Does it make an appreciable difference to the estimated distribution of the parameter estimates?
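One possible starting point (a sketch reusing the functions from above, with B = 100 as before):

mu_I, ρ_I, σ_I = boot_parameters(data,100,I(3)) #<- identity weighting matrix
@show std(ρ_b), std(ρ_I) #<- compare the spread of the two bootstrap distributions
@show std(σ_b), std(σ_I)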