Assignment 2: Setup for Estimating a Search Model

Setup:

Consider the following extension of the undirected search model. Let \(X_{n}\) be a vector of demographics for person \(n\):

\[ X_{n} = [1,\ C_{n},\ F_{n},\ R_{n}] \]

where \(C_{n}\) is a dummy variable that indicates if an individual has a college degree, \(F_{n}\) is a dummy variable indicating that an individual is female, and \(R_{n}\) is a dummy that indicates if person \(n\) reports their race as not “white”. Define a new set of parameters that depend on these observables:

The flow value of unemployment is \(b(X) = X\gamma_{b}\)
The probability of job destruction is \[ \delta(X) = \frac{\exp(X\gamma_{\delta})}{1+\exp(X\gamma_{\delta})} \]
The probability of a job offer is \[ \lambda(X) = \frac{\exp(X\gamma_{\lambda})}{1+\exp(X\gamma_{\lambda})} \]
The discount factor \(\beta\) is fixed at 0.995.
Wage offers are drawn from a log-normal distribution: log wage offers are normal with mean \(\mu(X) = X\gamma_{\mu}\) and standard deviation \(\sigma(X) = \exp(X\gamma_{\sigma})\)
Log wages are observed with measurement error: \[ \log(W^{o}_{n}) = \log(W_{n}) + \zeta_{n} \] where \(\zeta_{n}\sim\mathcal{N}(0,\sigma^2_{\zeta})\).
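
As a minimal sketch of how these definitions map a demographic vector into model primitives, the code below evaluates each index and applies the logistic or exponential transform. The container `γ` (a named tuple of length-4 coefficient vectors), the function name `model_params`, and all numerical values are hypothetical and chosen only for illustration.

```julia
using LinearAlgebra  # for dot

# Logistic map from the real line into (0, 1); equals exp(x) / (1 + exp(x))
logistic(x) = 1 / (1 + exp(-x))

# Hypothetical helper: γ holds one coefficient vector per primitive,
# each ordered like X = [1, C, F, R]
function model_params(X, γ)
    return (
        b = dot(X, γ.b),            # flow value of unemployment
        δ = logistic(dot(X, γ.δ)),  # job destruction probability
        λ = logistic(dot(X, γ.λ)),  # job offer probability
        μ = dot(X, γ.μ),            # mean of log wage offers
        σ = exp(dot(X, γ.σ)),       # std. dev. of log wage offers
    )
end

# Illustrative values only (not estimates)
γ = (b = [5.0, 0.0, 0.0, 0.0],
     δ = [-3.0, -0.5, 0.0, 0.0],
     λ = [-1.0, 0.3, 0.0, 0.0],
     μ = [2.5, 0.4, -0.1, -0.1],
     σ = [-0.7, 0.0, 0.0, 0.0])

X = [1.0, 1.0, 0.0, 1.0]  # college degree, not female, race reported as not "white"
p = model_params(X, γ)
```

The logistic transform keeps \(\delta\) and \(\lambda\) strictly between 0 and 1 and the exponential keeps \(\sigma\) positive, so an optimizer can search over unrestricted values of \(\gamma\).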

So the parameters of the model are:

\[ \theta = (\gamma_{b},\gamma_{\delta},\gamma_{\lambda},\gamma_{\mu},\gamma_{\sigma},\sigma^2_{\zeta}) \]

We are going to estimate this model on CPS data. The code below imports the data and imputes hourly wages for workers who are not paid by the hour. It also restricts the sample to January observations so that the data form a single cross-section, although you could choose a different month if you wanted. Finally, it converts weekly unemployment durations to monthly.

using CSV, DataFrames, DataFramesMeta, Statistics

data = CSV.read("../data/cps_00019.csv",DataFrame)
data = @chain data begin
    @transform :E = :EMPSTAT.<21
    @transform @byrow :wage = begin
        if :PAIDHOUR==0
            return missing
        elseif :PAIDHOUR==2
            if :HOURWAGE<99.99 && :HOURWAGE>0
                return :HOURWAGE
            else
                return missing
            end
        elseif :PAIDHOUR==1
            if :EARNWEEK>0 && :UHRSWORKT.<997
                return :EARNWEEK / :UHRSWORKT
            else
                return missing
            end
        end
    end
    @subset :MONTH.==1
    @select :AGE :SEX :RACE :EDUC :wage :E :DURUNEMP
    @transform :DURUNEMP = round.(:DURUNEMP .* 12/52) #<- we convert weekly unemployment durations to monthly since we have a monthly model
end

61364×7 DataFrame

61339 rows omitted

Row	AGE	SEX	RACE	EDUC	wage	E	DURUNEMP
	Int64	Int64	Int64	Int64	Float64?	Bool	Float64
1	72	1	100	81	missing	true	231.0
2	66	1	100	111	missing	true	231.0
3	61	2	100	111	missing	true	231.0
4	52	2	200	73	20.84	true	231.0
5	19	2	200	73	10.0	true	231.0
6	56	2	200	111	25.0	true	231.0
7	22	2	200	81	9.5	true	231.0
8	23	2	100	124	missing	true	231.0
9	24	2	100	124	missing	true	231.0
10	59	2	200	111	missing	true	231.0
11	53	1	200	81	missing	true	231.0
12	24	2	200	73	missing	true	231.0
13	60	1	100	124	missing	true	231.0
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
61353	41	1	100	111	missing	true	231.0
61354	41	2	100	73	missing	true	231.0
61355	38	1	100	73	missing	true	231.0
61356	29	2	100	73	missing	true	231.0
61357	71	2	100	73	12.0	true	231.0
61358	45	1	100	92	21.25	true	231.0
61359	41	1	100	73	missing	true	231.0
61360	42	1	100	111	missing	true	231.0
61361	43	2	100	123	missing	true	231.0
61362	17	1	100	60	missing	true	231.0
61363	32	2	100	81	missing	true	231.0
61364	30	2	100	81	missing	true	231.0

Part 1

Following your notes from class, write a function that, given a set of parameters, solves for the reservation wage at each unique combination of the variables in \(X\) (there are 8 in total, since \(C\), \(F\), and \(R\) are each binary).
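
As a rough starting point, the sketch below enumerates the 8 demographic cells and solves a single reservation wage by bisection. It assumes the reservation wage satisfies \(w^* = b + \frac{\beta\lambda}{1-\beta(1-\delta)}\int_{w^*}(w-w^*)dF(w)\), which is a standard discrete-time condition when the wage is paid in the current period and the job ends with probability \(\delta\) at the end of it. Check this against the equation and timing in your class notes before relying on it. The names `surplus` and `reservation_wage` and the parameter values are hypothetical.

```julia
using Distributions

# All 8 combinations of X = [1, C, F, R]; C varies fastest, then F, then R,
# so the cell for (C, F, R) sits at index 1 + C + 2F + 4R
Xgrid = vec([[1.0, c, f, r] for c in 0:1, f in 0:1, r in 0:1])

# Expected surplus ∫_{w*}^∞ (w - w*) dF(w) for log-normal offers, in closed form
function surplus(wres, μ, σ)
    z = (log(wres) - μ) / σ
    return exp(μ + σ^2 / 2) * ccdf(Normal(), z - σ) - wres * ccdf(Normal(), z)
end

# Bisection on g(w) = w - b - k * surplus(w), which is increasing in w.
# Assumes the root is positive; otherwise the result stays at the lower bracket.
function reservation_wage(b, δ, λ, μ, σ; β = 0.995, tol = 1e-10)
    k = β * λ / (1 - β * (1 - δ))
    g(w) = w - b - k * surplus(w, μ, σ)
    lo, hi = 1e-8, 1.0
    while g(hi) < 0          # expand the bracket until it contains the root
        hi *= 2
    end
    while hi - lo > tol
        mid = (lo + hi) / 2
        g(mid) < 0 ? (lo = mid) : (hi = mid)
    end
    return (lo + hi) / 2
end

# Illustrative parameter values only
wres = reservation_wage(5.0, 0.02, 0.3, 2.5, 0.5)
```

Applying the solver to the primitives implied by each element of `Xgrid` gives a length-8 vector of reservation wages that can be looked up by cell index when evaluating the likelihood.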

Part 2

Write a function that takes a single observation from the cross-section and calculates the log-likelihood of that observation given the model solution, current parameters, and observables \(X_{n}\).

Show the output from a function call to prove that it works, then use the @time macro to test how long it takes.

Hint:

Relative to your notes from class, you will need to integrate out the measurement error in wages here. Let \(\phi(x;\mu,\sigma)\) and \(\Phi(x;\mu,\sigma)\) be the normal pdf and cdf with mean \(\mu\) and standard deviation \(\sigma\). Writing \(x\) for the true log wage, the likelihood of observing a log wage \(\log(W^{o})\) for an employed worker will be:

\[ f(\log(W^{o})|E,X) = \int_{\log(w^*)}^{\infty}\frac{\phi(x;\mu(X),\sigma(X))}{1-\Phi(\log(w^*);\mu(X),\sigma(X))}\phi(\log(W^{o})-x;0,\sigma_{\zeta})dx \]

You will want to use a package like QuadGK to evaluate this integral numerically.
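
A minimal sketch of this integral using QuadGK and Distributions is below. It works in logs, integrating over the true log wage \(x\) from the log reservation wage to infinity. The function name `wage_density` and its argument names are hypothetical, and the numbers in the call are for illustration only.

```julia
using Distributions, QuadGK

# Density of an observed log wage for an employed worker:
# truncated-normal true log wage convolved with normal measurement error
function wage_density(logW_obs, logwres, μ, σ, σζ)
    offer = Normal(μ, σ)      # distribution of log wage offers
    noise = Normal(0.0, σζ)   # measurement error in log wages
    integrand(x) = pdf(offer, x) * pdf(noise, logW_obs - x)
    val, err = quadgk(integrand, logwres, Inf)  # err is the estimated error
    return val / ccdf(offer, logwres)           # renormalize for truncation
end

# Illustrative call: observed log wage 2.7, log reservation wage 2.0
dens = wage_density(2.7, 2.0, 2.5, 0.5, 0.2)
```

When \(\sigma_{\zeta}\) is very small the integrand is sharply peaked near \(\log(W^{o})\), so it may be worth checking the error estimate returned by `quadgk` or splitting the integration range at the observed log wage.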

Part 3

Write a function that iterates over every observation in the data and calculates the log-likelihood of the data given parameters.

Show the output from a function call to prove that it works, then use the @time macro to test how long it takes.

Hint:

You may find that these functions work faster if you pull the data you need out of DataFrame format and save it as arrays or vectors with known type. For example, I would recommend creating a flag for missing wage data and a default value for those missing wages, and iterating over those objects:

wage_missing = ismissing.(data.wage)
wage = coalesce.(data.wage,1.)
# create a named tuple with all variables to conveniently pass to the log-likelihood:
d = (;logwage = log.(wage),wage_missing,E = data.E) #<- you will need to add your demographics as well.

(logwage = [0.0, 0.0, 0.0, 3.0368742168851663, 2.302585092994046, 3.2188758248682006, 2.2512917986064953, 0.0, 0.0, 0.0  …  0.0, 0.0, 2.4849066497880004, 3.056356895370426, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], wage_missing = Bool[1, 1, 1, 0, 0, 0, 0, 1, 1, 1  …  1, 1, 0, 0, 1, 1, 1, 1, 1, 1], E = Bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1  …  1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
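
One possible way to add the demographics is sketched below. It assumes the usual IPUMS CPS codes (`EDUC` of 111 or above for a bachelor's degree or more, `SEX == 2` for female, `RACE == 100` for white); please confirm these against the codebook for your extract. To keep the example self-contained, `data` here is a small stand-in built from the first rows of the table above rather than the full DataFrame, and the `group` index is a hypothetical convenience for looking up one of the 8 cells.

```julia
# Stand-in for the first rows of `data` (plain vectors instead of a DataFrame)
data = (SEX   = [1, 1, 2, 2, 2],
        RACE  = [100, 100, 100, 200, 200],
        EDUC  = [81, 111, 111, 73, 73],
        wage  = [missing, missing, missing, 20.84, 10.0],
        E     = [true, true, true, true, true])

# Demographic dummies (code values are assumptions; check the codebook)
C = data.EDUC .>= 111   # college degree
F = data.SEX .== 2      # female
R = data.RACE .!= 100   # race reported as not "white"

# Cell index in 1:8, with C varying fastest, then F, then R
group = 1 .+ C .+ 2 .* F .+ 4 .* R

wage_missing = ismissing.(data.wage)
wage = coalesce.(data.wage, 1.0)
d = (; logwage = log.(wage), wage_missing, E = data.E, C, F, R, group)
```

Storing an integer cell index alongside the dummies means the reservation wage for each observation can be read from a precomputed length-8 vector instead of being solved again inside the loop.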

A Disclaimer for IPUMS CPS data

These data are a subsample of the IPUMS CPS data available from cps.ipums.org. Any use of these data should be cited as follows:

Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles, J. Robert Warren, Daniel Backman, Annie Chen, Grace Cooper, Stephanie Richards, Megan Schouweiler, and Michael Westberry. IPUMS CPS: Version 11.0 [dataset]. Minneapolis, MN: IPUMS, 2023. https://doi.org/10.18128/D030.V11.0

The CPS data file is intended only for exercises as part of ECON8208. Individuals are not to redistribute the data without permission. Contact ipums@umn.edu for redistribution requests. For all other uses of these data, please access data directly via cps.ipums.org.