I have an inner for-loop in R which I have identified as significant bottleneck in my code. The script simulates the effect of a time-varying policy on individuals prior to adulthood. The outer loop runs over a list of cohorts (yob = 1910,...,1930 etc.) that I would like to study. The inner loop counts from ages from a = 5 to a = 17. CSL.details is a data.table that contains the details of each law that I am studying in form of the variables I grab, which vary by year = birthyear + a. To understand the overall effects of the policy by birth cohort, I need to track ca_years1, ca_years2, ca_years3, and ca_years4 for each a.
ages = seq.int(5,17)
state = "Massachusetts"
yob = seq.int(1910, 1930)
for (birthyear in yob){
ca_years1 = 0; ca_years2 = 0; ca_years3 = 0; ca_years4 = 0;
for (a in ages){
thisyear = birthyear + a
# Grab each law for given state and year and implement exemption permit
thislaw <- CSL.details[statename == state & yob == birthyear & thisyear == year]
if (nrow(thislaw) == 0) next
exempt_workpermit = (ca_years2 >= thislaw$workyrs & a >= thislaw$workage & thislaw$workage > 0)
exempt_yearstodropout = (ca_years3 >= thislaw$earlyyrs & a >= thislaw$earlyyrs_condition & thislaw$earlyyrs > 0)
exempt_cont = ((ca_years2 + ca_years4) >= thislaw$contyrs & thislaw$contyrs > 0)
# Increment each law when school is required
if(thislaw$entryage <= a & a < thislaw$exitage){
ca_years1 = ca_years1 + 1
if(!exempt_workpermit){ca_years2 = ca_years2 + 1}
if(!exempt_yearstodropout){ca_years3 = ca_years3 + 1}
}
if(thislaw$contage > a &
a >= thislaw$workage &
!exempt_cont &
thislaw$workage > 0 &
!(thislaw$entryage <= a & a < thislaw$exitage & !exempt_workpermit)
){ca_years4 = ca_years4 + 1}
}
CSL.exposures[statename == state & yob == birthyear]$ca_years1 = ca_years1
CSL.exposures[statename == state & yob == birthyear]$ca_years2 = ca_years2
CSL.exposures[statename == state & yob == birthyear]$ca_years3 = ca_years3
CSL.exposures[statename == state & yob == birthyear]$ca_years4 = ca_years4
}
Is there a data.table solution for replacing the inner-loop? I am an intermediate R coder and it is a bit difficult to think of how to get started. Although I would prefer data.table exclusively, I am open to dplyr-type solutions if they significantly speed up the code.
Edit: here is an example of what CSL.detail looks like, as a copy-pasted data.table.
statename year yob statefip entryage exitage earlyyrs earlyyrs_condition workage workyrs contage contyrs statecompschoolyr
1: Massachusetts 1913 1800 25 7 16 4 14 14 4 16 0 1852
2: Massachusetts 1913 1801 25 7 16 4 14 14 4 16 0 1852
3: Massachusetts 1913 1802 25 7 16 4 14 14 4 16 0 1852
4: Massachusetts 1913 1803 25 7 16 4 14 14 4 16 0 1852
5: Massachusetts 1913 1804 25 7 16 4 14 14 4 16 0 1852