I have a data frame df that contains many field names for a series of years:
                        field
    year description
    1993 bar0        a01arb92
         bar1        a01svb92
         bar2        a01fam92
         bar3             a08
         bar4        a01bea93
Then, for every year, I have a Stata file that has id as a column and, as additional columns, some (or all) of the field names listed in df. For example, 1993.dta could be:
    id  a01arb92  a01svb92  a08  a01bea93
     0         1         1    1         1
     0         1         1    1         2
For every year, I need to check whether all the fields listed in df really exist (as columns) in the corresponding file. I would then like to save the result back into the original data frame. Is there a nice way to do this without iterating over every single field?
Expected Output:
                        field  exists
    year description
    1993 bar0        a01arb92       1
         bar1        a01svb92       1
         bar2        a01fam92       0
         bar3             a08       1
         bar4        a01bea93       1
This is the result if, for example, every field except a01fam92 exists as a column in the 1993 file.
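To make the setup reproducible, here is a minimal sketch of the data and of the kind of per-year check I have in mind. It rests on some assumptions: df is indexed by a (year, description) MultiIndex, the columns of each file are stood in for by an in-memory dict called available, and the names columns_of and mark_existing are hypothetical helpers made up for this illustration. The sketch loops over years only, not over individual fields, and uses Series.isin for the membership test.

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the one above (assumed layout: MultiIndex of year/description).
index = pd.MultiIndex.from_tuples(
    [(1993, "bar0"), (1993, "bar1"), (1993, "bar2"), (1993, "bar3"), (1993, "bar4")],
    names=["year", "description"],
)
df = pd.DataFrame(
    {"field": ["a01arb92", "a01svb92", "a01fam92", "a08", "a01bea93"]},
    index=index,
)


def columns_of(path):
    """Hypothetical helper: column names of one Stata file.

    Note that this reads the whole file just to get its columns.
    It is not called in this demo, which uses the dict below instead.
    """
    return set(pd.read_stata(path).columns)


# Stand-in for {year: columns_of(f"{year}.dta")}, using the example file above.
available = {1993: {"id", "a01arb92", "a01svb92", "a08", "a01bea93"}}


def mark_existing(df, available):
    """Return a copy of df with an 'exists' column (1 if the field is a column
    in that year's file, 0 otherwise). Years missing from `available` get 0."""
    years = df.index.get_level_values("year")
    exists = np.zeros(len(df), dtype=bool)
    for year, cols in available.items():
        # Rows of this year whose field name is among the file's columns.
        exists |= (years == year) & df["field"].isin(list(cols)).to_numpy()
    return df.assign(exists=exists.astype(int))


result = mark_existing(df, available)
print(result)
```

With the example 1993 file, this marks a01fam92 with 0 and the other four fields with 1, which matches the expected output above.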