As the preview of the data shows, the observations cover January-March 2018. Here is a quick snippet of code to see how many observations we have on average per person:
@chain data begin
    groupby(:CPSIDP)
    @combine :T = length(:EMPSTAT)
    @combine :average = mean(:T) :frac_panel = mean(:T .> 1)
end
1×2 DataFrame
 Row │ average  frac_panel
     │ Float64  Float64
─────┼─────────────────────
   1 │ 1.83191    0.582806
So we see that about 58% of the individuals in this sample can be found in more than one month of the data, with roughly 1.8 monthly observations per person on average.
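To make the two statistics concrete, here is a minimal sketch of the same pipeline on a small made-up table. The table `toy`, its values, and the name `panel_stats` are assumptions for illustration only; they are not part of the CPS extract.

```julia
using DataFramesMeta, Statistics

# Made-up data: person 1 appears in 3 months, person 2 in 1, person 3 in 2
toy = DataFrame(CPSIDP  = [1, 1, 1, 2, 3, 3],
                EMPSTAT = [10, 10, 21, 10, 12, 10])

panel_stats = @chain toy begin
    groupby(:CPSIDP)
    @combine :T = length(:EMPSTAT)      # one row per person: number of months observed
    @combine begin
        :average    = mean(:T)          # mean number of observations per person
        :frac_panel = mean(:T .> 1)     # share of people observed more than once
    end
end
```

Here the per-person counts are 3, 1, and 2, so `average` is 2.0 and `frac_panel` is 2/3.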
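The @chain macro is available through the package DataFramesMeta (which re-exports it from Chain.jl) and is a convenient syntax for composing operations into one block: each step receives the result of the previous step as its first argument. For example:
@chain x begin
    func1(y1)
    func2(y2)
    func3(y3)
end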
is equivalent to
func3(func2(func1(x,y1),y2),y3)
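A minimal sketch that checks this equivalence on numbers. The functions `add`, `times`, and `minus` are hypothetical helpers defined only for this illustration.

```julia
using DataFramesMeta  # makes @chain available

# Hypothetical two-argument helpers, defined only for this example
add(x, y)   = x + y
times(x, y) = x * y
minus(x, y) = x - y

chained = @chain 2 begin
    add(3)      # add(2, 3)    -> 5
    times(4)    # times(5, 4)  -> 20
    minus(1)    # minus(20, 1) -> 19
end

nested = minus(times(add(2, 3), 4), 1)

chained == nested  # true: both evaluate to 19
```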
Calculating some moments
You may find the codebook useful for understanding particular variables. We have already limited the data to individuals who are working (EMPSTAT==10), have a job but did not work last week (EMPSTAT==12), or are unemployed (EMPSTAT==21).
Suppose we wanted to use the panel dimension to measure transition rates. Here is a simple way to do that, using only transitions between January and February.
data[!,:E] .= data.EMPSTAT .< 21  # <- code the employment variable
data_jan = @chain data begin
    @subset :MONTH .== 1
    @select :CPSIDP :AGE :SEX :EDUC :RACE :E
    @rename :E_lag = :E
end
data_merged = @chain data begin
    @subset :MONTH .== 2
    @select :CPSIDP :E
    innerjoin(data_jan, on = :CPSIDP)
end
41262×7 DataFrame
   Row │ CPSIDP          E      AGE    SEX    EDUC   RACE   E_lag
       │ Int64           Bool   Int64  Int64  Int64  Int64  Bool
───────┼─────────────────────────────────────────────────────────
     1 │ 20161200000201   true     72      1     81    100   true
     2 │ 20180100000301   true     66      1    111    100   true
     3 │ 20180100000302   true     61      2    111    100   true
     4 │ 20170100000901   true     23      2    124    100   true
     5 │ 20170100000902   true     24      2    124    100   true
     6 │ 20170100001001   true     59      2    111    200   true
     7 │ 20170100001002   true     53      1     81    200   true
     8 │ 20171200001201   true     24      2     73    200   true
     9 │ 20161200000801   true     60      1    124    100   true
    10 │ 20161200000802   true     57      2    123    100   true
    11 │ 20170100001401  false     50      2     73    200  false
    12 │ 20170100001403   true     18      1     81    200   true
    13 │ 20170100001405   true     29      1     50    200   true
   ⋮   │       ⋮           ⋮      ⋮      ⋮      ⋮      ⋮      ⋮
 41251 │ 20161107451001   true     59      1     92    100   true
 41252 │ 20161107451901   true     59      1     91    100   true
 41253 │ 20171107237801   true     45      1     91    100   true
 41254 │ 20171107237802   true     37      2    123    100   true
 41255 │ 20171207232201   true     41      1    111    100   true
 41256 │ 20171207232202   true     41      2     73    100   true
 41257 │ 20161107452301   true     38      1     73    100   true
 41258 │ 20161107452302   true     29      2     73    100   true
 41259 │ 20170107445501   true     41      1     73    100   true
 41260 │ 20171207232601   true     42      1    111    100   true
 41261 │ 20171207232602   true     43      2    123    100   true
 41262 │ 20171207232604   true     17      1     60    100   true
                                            41237 rows omitted
So now we can calculate the overall transition rate out of unemployment:
So here we’re estimating a very low separation rate and a pretty high hazard rate out of unemployment.
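As a rough sketch of how such rates can be computed from a table shaped like data_merged, here is one possible approach on a small made-up table. The table `toy_merged`, its values, and the names `rates`, `separation_rate`, and `finding_rate` are assumptions for illustration; the numbers below are not estimates from the CPS data.

```julia
using DataFramesMeta, Statistics

# Made-up stand-in for data_merged: E_lag is January status, E is February status
toy_merged = DataFrame(E_lag = [true, true, true, true, false, false],
                       E     = [true, true, true, false, true, false])

# Within each January status, the share of people whose status changed by February
rates = @chain toy_merged begin
    groupby(:E_lag)
    @combine :transition_rate = mean(:E .!= :E_lag)
end

# Pick out each rate by the lagged status rather than relying on group order
separation_rate = only(rates.transition_rate[rates.E_lag])    # employed -> unemployed
finding_rate    = only(rates.transition_rate[.!rates.E_lag])  # unemployed -> employed
```

In this toy table one of four employed people separates (0.25) and one of two unemployed people finds a job (0.5).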
Observable heterogeneity
Next we’ll define a very simple education classification (Bachelor’s degree or not) and race classification (white vs non-white), and use groupby to calculate rates separately by demographics:
What do these differences in transition rates tell you about how we should extend the simple model with homogeneous parameters?
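The grouping step described above might look like the following sketch, again on a made-up table. It assumes that EDUC codes of 111 and above denote a Bachelor's degree or more and that RACE code 100 denotes white; please check both against the codebook. The column names `BA` and `WHITE` and the table `toy_demo` are hypothetical.

```julia
using DataFramesMeta, Statistics

# Made-up stand-in for data_merged (values are not from the CPS extract)
toy_demo = DataFrame(EDUC  = [111, 73, 124, 81, 73, 111, 92, 123],
                     RACE  = [100, 100, 200, 200, 100, 100, 200, 100],
                     E_lag = [true, true, false, false, true, false, true, true],
                     E     = [true, false, true, false, true, true, true, true])

by_group = @chain toy_demo begin
    @transform begin
        :BA    = :EDUC .>= 111   # assumption: 111+ means Bachelor's degree or more
        :WHITE = :RACE .== 100   # assumption: 100 means white
    end
    groupby([:BA, :WHITE, :E_lag])
    @combine begin
        :transition_rate = mean(:E .!= :E_lag)  # share changing status within the cell
        :n               = length(:E)           # cell size, to flag thin cells
    end
end
```

Reporting the cell size alongside each rate is a deliberate choice here: with several demographic cells, some rates rest on few observations and should be read with care.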
A Disclaimer for IPUMS CPS data
These data are a subsample of the IPUMS CPS data available from cps.ipums.org. Any use of these data should be cited as follows:
Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles, J. Robert Warren, Daniel Backman, Annie Chen, Grace Cooper, Stephanie Richards, Megan Schouweiler, and Michael Westberry. IPUMS CPS: Version 11.0 [dataset]. Minneapolis, MN: IPUMS, 2023. https://doi.org/10.18128/D030.V11.0
The CPS data file is intended only for exercises as part of ECON8208. Individuals are not to redistribute the data without permission. Contact ipums@umn.edu for redistribution requests. For all other uses of these data, please access data directly via cps.ipums.org.