Assignment 1

Setup: Loading the Data

Here is code to load the dataset and do a little cleaning / filtering.

The sample is all mothers in the PSID who are unmarried at the time of their first childbirth.

using CSV, DataFrames, DataFramesMeta

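data = @chain begin
    # Read the panel, treating the string "NA" as missing
    CSV.read("../children-cash-transfers/data/MainPanelFile.csv",DataFrame,missingstring = "NA")
    # Keep only the variables used in this assignment
    @select :MID :year :wage :hrs :earn :SOI :CPIU :WelfH :FSInd
    # Restrict the sample to 1985-2010
    @subset :year.>=1985 :year.<=2010
    # AFDC indicator: true when WelfH is positive
    @transform :AFDC = :WelfH.>0
    @rename :FS = :FSInd
end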

89747×10 DataFrame

89722 rows omitted

Row	MID	year	wage	hrs	earn	SOI	CPIU	WelfH	FS	AFDC
	Int64	Int64	Float64?	Int64?	Float64?	Int64	Float64	Float64?	Int64?	Bool?
1	4031	1990	missing	missing	missing	43	0.758793	0.0	0	false
2	4031	1991	missing	missing	missing	43	0.790786	0.0	0	false
3	4031	1992	missing	missing	missing	43	0.814835	0.0	1	false
4	4031	1993	missing	missing	missing	43	0.839034	0.0	0	false
5	4031	1994	missing	0	0.0	43	0.860812	1704.0	1	true
6	4031	1995	missing	0	0.0	43	0.88496	1704.0	1	true
7	4031	1996	missing	0	0.0	43	0.910948	1704.0	1	true
8	4031	1997	missing	missing	missing	43	0.932244	missing	missing	missing
9	4031	1998	missing	0	0.0	43	0.946664	0.0	1	false
10	4031	1999	missing	missing	missing	43	0.967426	missing	missing	missing
11	4031	2000	3.33333	120	400.0	43	1.0	0.0	0	false
12	4031	2001	missing	missing	missing	43	1.02817	missing	missing	missing
13	4031	2002	missing	0	0.0	43	1.04457	0.0	0	false
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
89736	9308002	1999	missing	missing	missing	39	0.967426	missing	missing	missing
89737	9308002	2000	missing	missing	missing	39	1.0	missing	missing	missing
89738	9308002	2001	missing	missing	missing	39	1.02817	missing	missing	missing
89739	9308002	2002	missing	missing	missing	39	1.04457	missing	missing	missing
89740	9308002	2003	missing	missing	missing	39	1.06857	missing	missing	missing
89741	9308002	2004	missing	missing	missing	39	1.09708	missing	missing	missing
89742	9308002	2005	missing	missing	missing	39	1.13401	missing	missing	missing
89743	9308002	2006	missing	missing	missing	39	1.17054	missing	missing	missing
89744	9308002	2007	missing	missing	missing	39	1.20414	missing	missing	missing
89745	9308002	2008	missing	missing	missing	39	1.25008	missing	missing	missing
89746	9308002	2009	missing	missing	missing	39	1.24608	missing	missing	missing
89747	9308002	2010	missing	missing	missing	39	1.26647	missing	missing	missing

You may be unfamiliar with some of these commands, which come from DataFrames and DataFramesMeta. In particular, think of the @chain macro as a way to compose functions: each step receives the result of the previous step as its first argument. For example:

d1 = @chain d2 begin
    func1(x)
    func2(y)
    func3(z)
end

is equivalent to calling:

d1 = func3(func2(func1(d2,x),y),z)
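To make the equivalence concrete, here is a small sketch on a plain vector rather than the PSID data. The vector xs and the particular steps are made up for illustration. It assumes the Chain package is available (DataFramesMeta re-exports @chain, so loading DataFramesMeta works too).

```julia
using Chain  # `using DataFramesMeta` also brings @chain into scope

xs = [1, 2, 3, 4]

chained = @chain xs begin
    first(3)            # previous result goes in as the first argument: first(xs, 3)
    map(x -> x^2, _)    # `_` marks where the previous result goes when it is not the first argument
    sum                 # a bare function name is called on the previous result
end

# The same computation written as nested calls
nested = sum(map(x -> x^2, first(xs, 3)))

println(chained == nested)
```

Both forms compute the same value; the chained version simply reads top to bottom in the order the steps are applied.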

If you want to understand these macros better, Google is your friend!

Question 1

Calculate average welfare participation (AFDC) by year and plot it. What do you think happened with welfare participation in 1996 and after? If you don’t know the historical context, a quick search online or a read of this paper should help you out.

If you are new to Julia, here is average hours calculated and plotted to get you started.

using StatsPlots, Statistics

d = @chain data begin
    groupby(:year)
    @combine :Hours = mean(skipmissing(:hrs))
    @subset .!isnan.(:Hours)
end

@df d plot(:year,:Hours, legend = :none, linewidth = 2)
xlabel!("Year")
ylabel!("Average Hours")
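The same groupby / @combine pattern works on a Bool column with missing values, because the mean of a Bool vector is the share of true entries. The sketch below uses a tiny made-up data frame (toy is not the PSID data) just to show the mechanics.

```julia
using DataFrames, DataFramesMeta, Statistics

# Toy data, NOT the PSID sample: two years, one missing participation value
toy = DataFrame(
    year = [1995, 1995, 1995, 1996, 1996],
    AFDC = [true, false, missing, true, true],
)

rates = @chain toy begin
    groupby(:year)
    # skipmissing drops the missing entry; mean of the remaining Bools is the share that are true
    @combine :Participation = mean(skipmissing(:AFDC))
end

println(rates)
```

In the toy data, 1995 has one participant among two observed values, and 1996 has two among two.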

Question 2

Now write code to:

1. Deflate earnings by CPI (CPIU).
2. Calculate annual average earnings for each individual (identified by MID).
3. Drop individuals with fewer than 10 years of data.
4. Categorize individuals by whether their average earnings are below or above the median across individuals.
5. Plot average AFDC participation in each year for individuals in each of these two categories.

Do you think this pattern is likely to be generated by a model without persistent unobserved heterogeneity? No strictly correct answer here, just curious to read what you think.

In case it helps, here is code for the first three steps. You could edit this to add additional operations to the chain or work with d directly.

d = @chain data begin
    @transform :earn = :earn ./ :CPIU
    groupby(:MID)
    @combine :T = sum(.!ismissing.(:earn)) :earn = mean(skipmissing(:earn)) 
    @subset :T .>= 10
end

1089×3 DataFrame

1064 rows omitted

Row	MID	T	earn
	Int64	Int64	Float64
1	4031	10	40.0
2	4179	16	6089.12
3	7030	11	4500.53
4	41007	12	16147.7
5	41008	11	0.0
6	45030	11	14374.3
7	45031	11	17693.5
8	47031	11	15946.7
9	84005	18	30769.6
10	105030	14	4633.61
11	106173	13	17714.4
12	122173	13	56275.2
13	126003	19	6705.08
⋮	⋮	⋮	⋮
1078	6843006	19	1805.93
1079	6843173	19	5005.44
1080	6845005	19	19795.4
1081	6849005	19	2450.55
1082	6849188	15	23259.0
1083	6853003	19	4856.42
1084	6862005	11	18899.4
1085	6862008	19	26634.8
1086	6864002	19	8695.92
1087	6864003	18	13627.5
1088	6867013	13	389.097
1089	6872171	17	3294.07
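For the remaining two steps, one possible approach is to flag each individual relative to the median and then join that flag back onto the person-year panel. The sketch below is only an illustration on made-up data: avg stands in for the per-person data frame above, panel for the full data, and the column name above_median is a hypothetical choice.

```julia
using DataFrames, DataFramesMeta, Statistics

# Toy per-person averages (made-up numbers standing in for the data frame built above)
avg = DataFrame(MID = [1, 2, 3, 4], earn = [100.0, 5000.0, 12000.0, 30000.0])

# Flag individuals whose average earnings exceed the median across individuals
avg = @transform avg :above_median = :earn .> median(:earn)

# Toy person-year panel with the participation indicator
panel = DataFrame(
    MID  = [1, 1, 2, 2, 3, 3, 4, 4],
    year = [1995, 1996, 1995, 1996, 1995, 1996, 1995, 1996],
    AFDC = [true, true, true, false, false, false, false, false],
)

by_group = @chain panel begin
    innerjoin(avg, on = :MID)          # attach each person's category; drops anyone not in avg
    groupby([:above_median, :year])
    @combine :Participation = mean(skipmissing(:AFDC))
    sort([:above_median, :year])
end

println(by_group)
```

With StatsPlots loaded, a call along the lines of @df by_group plot(:year, :Participation, group = :above_median) should then draw one line per category.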

Question 3

This question is to familiarize you with the module Transfers.jl, which will enable you to calculate post-tax and transfer income for individuals given their earnings, non-labor income, state, year, and family size. The function budget in this module takes the arguments:

E: monthly earnings (either real or nominal)
N: monthly non-labor income (real or nominal)
SOI: the SOI code for state of residence
year: calendar year
num_kids: the number of children
cpi: set to 1. if E and N are nominal
p: equal to 0 if no programs, 1 if food stamps, 2 if food stamps + welfare.

For example the function call:

Transfers.budget(500.,0.,23,2000,2,1.,2)

calculates net income for a mother in Michigan (SOI code 23) with 2 kids, nominal labor income of $500 a month in the year 2000, and no non-labor income.

include("../children-cash-transfers/src/Transfers.jl")

Transfers.budget(500.,0.,23,2000,2,1.,2)

978.5408333333334

Create a graph that represents total net transfers for a single mother with two kids in the years 1990 and 2000 and in the states of Mississippi and New York. Depict these transfers as a function of earnings between the values of 0 and $1,000 a month (nominal). You can assume that all households are receiving both food stamps and welfare.

What do you make of the differences in these transfers across states and over time?
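One way to organize the calculation is to evaluate the budget function on a grid of earnings and subtract the income that went in, reading "net transfers" as net income minus earnings and non-labor income. The sketch below is not the real module: toy_budget is a hypothetical stand-in with the same positional arguments as Transfers.budget and a made-up benefit schedule, included only so the code runs on its own. The helper net_transfer is likewise an assumption, not part of Transfers.jl.

```julia
# Hypothetical stand-in for Transfers.budget (same argument order).
# The benefit schedule is MADE UP and ignores SOI, year, num_kids and cpi;
# pass Transfers.budget instead to use the actual program rules.
function toy_budget(E, N, SOI, year, num_kids, cpi, p)
    p == 0 && return E + N
    benefit = max(0.0, 400.0 - 0.5 * E)  # made-up guarantee of 400, reduced by 0.5 per unit of earnings
    return E + N + benefit
end

# Net transfers: net income minus the earnings and non-labor income that went in
net_transfer(budget, E, N, SOI, year, num_kids, cpi, p) =
    budget(E, N, SOI, year, num_kids, cpi, p) - E - N

earnings = 0.0:50.0:1000.0   # nominal monthly earnings from 0 to 1,000

# The state code and year are passed through unchanged; 23 and 2000 mirror the example call above
transfers = [net_transfer(toy_budget, E, 0.0, 23, 2000, 2, 1.0, 2) for E in earnings]

println(first(transfers), " ", last(transfers))
```

Replacing toy_budget with Transfers.budget and looping over the relevant SOI codes and years would give one curve per state-year pair, which can then be plotted against earnings.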