How can I create a new indicator variable, let's call it WorkingFamily, which is equal to 1 if any individual in the household is employed and 0 otherwise?

Individuals with the same SERIAL are in the same household.

Individuals with EMPSTAT==10 or EMPSTAT==12 are employed.

> dput(head(IPUMS.SDdata.MC))
structure(list(YEAR = c(2016L, 2016L, 2016L, 2016L, 2016L, 2016L
), SERIAL = c(89076L, 89077L, 89078L, 89079L, 89080L, 89104L), 
HWTSUPP = c(30187500L, 30183100L, 28600900L, 21051300L, 31378100L, 
17928900L), ASECFLAG = c(1L, 1L, 1L, 1L, 1L, 1L), COUNTY = c(6073L, 
6073L, 6073L, 6073L, 6073L, 6073L), MONTH = c(3L, 3L, 3L, 
3L, 3L, 3L), PERNUM = c(1L, 1L, 1L, 1L, 1L, 3L), WTSUPP = c(30187500L, 
30183100L, 28600900L, 21051300L, 31378100L, 17497400L), FAMSIZE = c(1L, 
1L, 1L, 1L, 1L, 4L), EMPSTAT = c(32L, 32L, 32L, 32L, 32L, 
0L), HIMCAID = c(2L, 2L, 2L, 2L, 2L, 2L), PID = c("2016 3 89076 1", 
"2016 3 89077 1", "2016 3 89078 1", "2016 3 89079 1", "2016 3 89080 1", 
"2016 3 89104 3"), WTSUPP2 = c(3018.75, 3018.31, 2860.09, 
2105.13, 3137.81, 1749.74)), .Names = c("YEAR", "SERIAL", 
"HWTSUPP", "ASECFLAG", "COUNTY", "MONTH", "PERNUM", "WTSUPP", 
"FAMSIZE", "EMPSTAT", "HIMCAID", "PID", "WTSUPP2"), row.names = c(174187L, 
174188L, 174189L, 174190L, 174191L, 174248L), class = "data.frame")

Rick
  • We really do not want to type in your data. Instead of an image, please provide a text form of the data that we can cut and paste. Ideally, that would be created with `dput` from an R data.frame. – G5W Mar 06 '17 at 21:00
  • Did you try anything? This seems more like a "please do this for me" type question than a proper programming question. Please see how to create a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and be sure to include sample input data (not in an image) and the desired output for that input so solutions can be tested and verified. – MrFlick Mar 06 '17 at 21:01

1 Answer

I'm assuming the difficulty you're having is grouping by serial. You can use data.table to make this easy to do. You should really include what you've tried and what you're stuck on though.

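library(data.table)
# Toy data: two households (serial 1 and 2) with four people each.
# One person in household 1 is employed (EMPSTAT 10).
dt = data.table(serial = c(rep(1,4), rep(2,4)), empstat = c(10,rep(0,7)))

What the data.table looks like before adding Employed:

   serial empstat
1:      1      10
2:      1       0
3:      1       0
4:      1       0
5:      2       0
6:      2       0
7:      2       0
8:      2       0

Then you can run this to check, household by household, whether anyone has an EMPSTAT of 10 or 12 (the employed codes given in the question).

# := adds the column by reference; by = .(serial) evaluates any() within each household
dt[ , "Employed" := ifelse(any(empstat %in% c(10,12)),1,0), by = .(serial)]
dt
   serial empstat Employed
1:      1      10        1
2:      1       0        1
3:      1       0        1
4:      1       0        1
5:      2       0        0
6:      2       0        0
7:      2       0        0
8:      2       0        0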
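
Applied to the column names from the question (SERIAL, EMPSTAT, and the employed codes 10 and 12), the same grouping could look like the sketch below. The data frame `people` and its values are made up for illustration; only the column names, the codes, and the name WorkingFamily come from the question.

```r
library(data.table)

# Hypothetical stand-in for the real data: three households.
people <- data.frame(
  SERIAL  = c(1L, 1L, 2L, 2L, 3L),
  EMPSTAT = c(10L, 32L, 32L, 0L, 12L)
)

# Work on a data.table copy so the original data.frame is left untouched.
people_dt <- as.data.table(people)

# 1 if anyone in the household (same SERIAL) has EMPSTAT 10 or 12, else 0.
people_dt[, WorkingFamily := as.integer(any(EMPSTAT %in% c(10L, 12L))), by = SERIAL]

print(people_dt)
```

Using `as.integer(any(...))` instead of `ifelse(...)` is only a stylistic choice here: `any()` already returns a single TRUE or FALSE per household, which converts directly to 1 or 0.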
Kristofersen
  • Thanks, Kristofersen. I was having trouble grouping by SERIAL, and library(data.table) is exactly what I needed to simplify this. For others who use this solution with their own data: make sure you first define a new object with data.table(your data name). I kept getting "unused argument" errors because my data was not recognized as a data.table. – Rick Mar 06 '17 at 21:35
  • @Rick glad it worked. You can change an existing data.frame to a data.table by running `setDT(dat)` – Kristofersen Mar 06 '17 at 21:36
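
The conversion step discussed in these comments can be sketched as follows. `setDT()` converts an existing data.frame to a data.table by reference, so no new object has to be assigned. The data frame `dat` and its values are hypothetical; the column names are the ones from the question.

```r
library(data.table)

# Hypothetical data frame with the question's column names.
dat <- data.frame(SERIAL = c(1L, 1L, 2L), EMPSTAT = c(12L, 0L, 32L))

# setDT() changes `dat` itself into a data.table; no reassignment is needed.
setDT(dat)
stopifnot(is.data.table(dat))

# The data.table syntax with := and by now works on `dat` directly.
dat[, WorkingFamily := as.integer(any(EMPSTAT %in% c(10L, 12L))), by = SERIAL]
print(dat)
```

Without the conversion, the same `[` call would be dispatched to the data.frame method, which does not accept a `by` argument; that matches the "unused argument" errors described above.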