Assignment 2: Setup for Estimating a Search Model

Setup:

Consider the following extension of the undirected search model. Let \(X_{n}\) be a vector of demographics for person \(n\):

\[ X_{n} = [1,\ C_{n},\ F_{n},\ R_{n}] \]

where \(C_{n}\) is a dummy variable indicating that an individual has a college degree, \(F_{n}\) is a dummy variable indicating that an individual is female, and \(R_{n}\) is a dummy variable indicating that person \(n\) reports a race other than “white”. Define a new set of parameters that depend on these observables:

  • The flow value of unemployment is \(b(X) = X\gamma_{b}\)
  • The probability of job destruction is \[ \delta(X) = \frac{\exp(X\gamma_{\delta})}{1+\exp(X\gamma_{\delta})} \]
  • The probability of a job offer is \[ \lambda(X) = \frac{\exp(X\gamma_{\lambda})}{1+\exp(X\gamma_{\lambda})} \]
  • The discount factor \(\beta\) is fixed at 0.995.
  • Wage offers are drawn from a log-normal distribution: log wage offers are normal with mean \(\mu(X) = X\gamma_{\mu}\) and standard deviation \(\sigma(X) = \exp(X\gamma_{\sigma})\)
  • Log wages are observed with measurement error: \[ \log(W^{o}_{n}) = \log(W_{n}) + \zeta_{n} \] where \(\zeta_{n}\sim\mathcal{N}(0,\sigma^2_{\zeta})\).

So the parameters of the model are:

\[ \theta = (\gamma_{b},\gamma_{\delta},\gamma_{\lambda},\gamma_{\mu},\gamma_{\sigma},\sigma^2_{\zeta}) \]
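To make the mapping from \(\theta\) and \(X\) to the model primitives concrete, here is a minimal sketch. The helper names `logistic` and `primitives`, the named-tuple layout of `θ`, and the numerical values are all hypothetical choices for illustration, not part of the assignment. I also store the measurement error as a standard deviation `σζ` rather than the variance, which is only a convenience.

```julia
using LinearAlgebra: dot

# Logistic transform: exp(x) / (1 + exp(x)), which maps any real index into (0, 1).
logistic(x) = 1 / (1 + exp(-x))

# Hypothetical helper: map the coefficient vectors and one demographic
# vector X = [1, C, F, R] into the model primitives for that type of person.
function primitives(θ, X)
    b = dot(X, θ.γb)             # flow value of unemployment
    δ = logistic(dot(X, θ.γδ))   # job destruction probability
    λ = logistic(dot(X, θ.γλ))   # job offer probability
    μ = dot(X, θ.γμ)             # mean of log wage offers
    σ = exp(dot(X, θ.γσ))        # std. dev. of log wage offers (always positive)
    return (; b, δ, λ, μ, σ)
end

# Illustrative values only (these are NOT estimates or recommended starting values).
θ = (γb = zeros(4), γδ = [-3.0, 0, 0, 0], γλ = [-1.0, 0, 0, 0],
     γμ = [2.5, 0.3, 0, 0], γσ = [-0.7, 0, 0, 0], σζ = 0.2)

p = primitives(θ, [1, 1, 0, 0])  # a college-educated, male, white individual
```

The logistic and exponential transforms mean that every coefficient can range over the whole real line while \(\delta\), \(\lambda\) and \(\sigma\) stay in their admissible ranges, which is convenient for unconstrained optimization.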

We are going to estimate this model on CPS data. Here is code to import the data and impute hourly wages for workers who are not paid by the hour. The code also restricts the sample to January observations so that the data form a single cross-section, although you could choose a different month if you wanted. I also convert weekly unemployment durations to monthly.

using CSV, DataFrames, DataFramesMeta, Statistics

data = CSV.read("../data/cps_00019.csv",DataFrame)
data = @chain data begin
    @transform :E = :EMPSTAT.<21
    @transform @byrow :wage = begin
        if :PAIDHOUR==0
            return missing
        elseif :PAIDHOUR==2
            if :HOURWAGE<99.99 && :HOURWAGE>0
                return :HOURWAGE
            else
                return missing
            end
        elseif :PAIDHOUR==1
            if :EARNWEEK>0 && :UHRSWORKT.<997
                return :EARNWEEK / :UHRSWORKT
            else
                return missing
            end
        end
    end
    @subset :MONTH.==1
    @select :AGE :SEX :RACE :EDUC :wage :E :DURUNEMP
    @transform :DURUNEMP = round.(:DURUNEMP .* 12/52) #<- we convert weekly unemployment durations to monthly since we have a monthly model
end
61364×7 DataFrame (61339 rows omitted)

  Row  AGE    SEX    RACE   EDUC   wage      E     DURUNEMP
       Int64  Int64  Int64  Int64  Float64?  Bool  Float64
    1  72     1      100    81     missing   true  231.0
    2  66     1      100    111    missing   true  231.0
    3  61     2      100    111    missing   true  231.0
    4  52     2      200    73     20.84     true  231.0
    5  19     2      200    73     10.0      true  231.0
    6  56     2      200    111    25.0      true  231.0
    7  22     2      200    81     9.5       true  231.0
    8  23     2      100    124    missing   true  231.0
    9  24     2      100    124    missing   true  231.0
   10  59     2      200    111    missing   true  231.0
   11  53     1      200    81     missing   true  231.0
   12  24     2      200    73     missing   true  231.0
   13  60     1      100    124    missing   true  231.0
    ⋮
61353  41     1      100    111    missing   true  231.0
61354  41     2      100    73     missing   true  231.0
61355  38     1      100    73     missing   true  231.0
61356  29     2      100    73     missing   true  231.0
61357  71     2      100    73     12.0      true  231.0
61358  45     1      100    92     21.25     true  231.0
61359  41     1      100    73     missing   true  231.0
61360  42     1      100    111    missing   true  231.0
61361  43     2      100    123    missing   true  231.0
61362  17     1      100    60     missing   true  231.0
61363  32     2      100    81     missing   true  231.0
61364  30     2      100    81     missing   true  231.0

Part 1

Following your notes from class, write a function that, given a set of parameters, solves for the reservation wage for each unique combination of the variables in \(X\) (there are 8 total).
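One possible way to organize the 8 cases is to enumerate the demographic vectors once and give each a position in a list. The names `X_grid` and `type_index` below are hypothetical, and the sketch only builds the grid; it does not solve for the reservation wage.

```julia
# All 2^3 = 8 combinations of the dummies (C, F, R), each with a leading 1.
# The comprehension builds a 2×2×2 array; vec flattens it with C varying fastest.
X_grid = vec([[1, c, f, r] for c in 0:1, f in 0:1, r in 0:1])

# Hypothetical index map: the position in X_grid of the type with dummies (c, f, r).
type_index(c, f, r) = 1 + c + 2f + 4r

X_grid[type_index(1, 0, 1)]  # the vector [1, 1, 0, 1]
```

With a layout like this, the reservation wages can be stored in a length-8 vector and looked up by index, so the model is solved once per type rather than once per observation.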

Part 2

Write a function that takes a single observation from the cross-section and calculates the log-likelihood of that observation given the model solution, current parameters, and observables \(X_{n}\).

Show the output from a function call to prove that it works, then use the @time macro to test how long it takes.
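A note on timing: the first call to a Julia function includes compilation, so it helps to call the function once before timing it. The sketch below uses a stand-in function, `toy_loglike`, which is hypothetical and has nothing to do with the search model.

```julia
# Stand-in for a log-likelihood function (hypothetical, for timing illustration only).
toy_loglike(x) = sum(log1p(abs(xi)) for xi in x)

x = randn(10_000)
toy_loglike(x)        # first call: triggers compilation
@time toy_loglike(x)  # second call: reports run time and allocations without compilation
```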

Hint:

Relative to your notes from class, you will need to integrate out the measurement error in wages. Let \(\phi(\cdot;\mu,\sigma)\) be the normal pdf with mean \(\mu\) and standard deviation \(\sigma\), let \(\Phi\) be the corresponding cdf, and let \(x\) denote the log of the true wage. The likelihood of observing a log wage \(\log(W^{o})\) will be:

\[ f(\log(W^{o})|E,X) = \int_{\log(w^*)}^{\infty}\frac{\phi(x;\mu(X),\sigma(X))}{1-\Phi(\log(w^*);\mu(X),\sigma(X))}\phi(\log(W^{o})-x ; 0,\sigma_{\zeta})dx \]

You will want to use a package like QuadGK to evaluate this integral numerically.
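As a rough sketch of how the numerical integration might look, the function below integrates over the log of the true wage from the log reservation wage to infinity. The function name `obs_logwage_density`, its argument names, and the use of Distributions.jl for the normal pdf and cdf are my assumptions here; only QuadGK is suggested above.

```julia
using QuadGK, Distributions

# Density of an observed log wage for an employed worker, integrating out
# classical measurement error. `logw_res` is the log of the reservation wage
# and `σζ` is the standard deviation (not the variance) of the measurement error.
function obs_logwage_density(logw_obs, logw_res, μ, σ, σζ)
    offer = Normal(μ, σ)                  # distribution of log wage offers
    noise = Normal(0.0, σζ)               # measurement error in log wages
    accept_prob = ccdf(offer, logw_res)   # 1 - Φ(log w*; μ, σ)
    # Integrand in terms of x, the log of the true (accepted) wage.
    integrand(x) = pdf(offer, x) * pdf(noise, logw_obs - x)
    val, _ = quadgk(integrand, logw_res, Inf)  # quadgk returns (integral, error estimate)
    return val / accept_prob
end

obs_logwage_density(2.6, 2.0, 2.5, 0.5, 0.2)
```

One sanity check is that as the reservation wage becomes very low, truncation stops mattering and the result should approach the density of a normal with mean \(\mu\) and standard deviation \(\sqrt{\sigma^2+\sigma_{\zeta}^2}\).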

Part 3

Write a function that iterates over every observation in the data and calculates the log-likelihood of the data given parameters.

Show the output from a function call to prove that it works, then use the @time macro to test how long it takes.

Hint:

You may find that these functions work faster if you pull the data you need out of DataFrame format and save it as arrays or vectors with known type. For example, I would recommend creating a flag for missing wage data and a default value for those missing wages, and iterating over those objects:

wage_missing = ismissing.(data.wage)
wage = coalesce.(data.wage,1.)
# create a named tuple with all variables to conveniently pass to the log-likelihood:
d = (;logwage = log.(wage),wage_missing,E = data.E) #<- you will need to add your demographics as well.
(logwage = [0.0, 0.0, 0.0, 3.0368742168851663, 2.302585092994046, 3.2188758248682006, 2.2512917986064953, 0.0, 0.0, 0.0  …  0.0, 0.0, 2.4849066497880004, 3.056356895370426, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], wage_missing = Bool[1, 1, 1, 0, 0, 0, 0, 1, 1, 1  …  1, 1, 0, 0, 1, 1, 1, 1, 1, 1], E = Bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1  …  1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
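For the demographics, one option is to build the three dummies as vectors and combine them into an integer type index between 1 and 8. The sketch below runs on toy vectors rather than the real data, and the codings it relies on (SEX equal to 2 for female, RACE equal to 100 for white, EDUC of 111 or above for a bachelor's degree or higher) are my assumptions about the IPUMS CPS codes, so check them against the codebook. The name `xtype` is hypothetical.

```julia
# Toy stand-ins for the columns data.SEX, data.RACE and data.EDUC.
SEX  = [1, 2, 2]
RACE = [100, 200, 100]
EDUC = [81, 73, 111]

# ASSUMED IPUMS CPS codings (verify against the codebook before using):
F = SEX .== 2        # female
R = RACE .!= 100     # race reported as anything other than white
C = EDUC .>= 111     # bachelor's degree or higher

# Integer type index in 1:8 for each person, with C varying fastest.
xtype = 1 .+ C .+ 2 .* F .+ 4 .* R

# Named tuple of plain vectors with known element types.
demo = (; C, F, R, xtype)
```

Storing an integer index like this means the likelihood loop can look up each person's reservation wage and parameters directly instead of rebuilding \(X_{n}\) for every observation.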

A Disclaimer for IPUMS CPS data

These data are a subsample of the IPUMS CPS data available from cps.ipums.org. Any use of these data should be cited as follows:

Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles, J. Robert Warren, Daniel Backman, Annie Chen, Grace Cooper, Stephanie Richards, Megan Schouweiler, and Michael Westberry. IPUMS CPS: Version 11.0 [dataset]. Minneapolis, MN: IPUMS, 2023. https://doi.org/10.18128/D030.V11.0

The CPS data file is intended only for exercises as part of ECON8208. Individuals are not to redistribute the data without permission. Contact for redistribution requests. For all other uses of these data, please access data directly via cps.ipums.org.