Thanks in advance. I have a data frame of family members and their relationship to the "head of household", and I'd like to count the number of unique combinations of family structures.
I can achieve this (likely in a roundabout way) by converting the data to wide format and using ddply and count, but this treats identical family structures that are listed in a different order as distinct. For example:
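library(plyr)   # ddply; load before dplyr so dplyr's count() is the one in use
library(dplyr)  # count, ungroup, %>%
# One row per person; familyGroup identifies the household
familyMember <- c("son", "son", "Head of household",
                  "daughter", "grandmother", "Head of household", "son",
                  "Head of household", "son", "son",
                  "daughter", "grandmother", "Head of household", "son")
familyGroup <- c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4)
families <- data.frame(familyMember, familyGroup)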
Note that familyGroups 2 and 4 have exactly the same family structure in the same order, while familyGroups 1 and 3 have the same family structure but in a different order. I then use ddply (from plyr) to create an index that numbers the family members within each family group:
familiesIndex <- ddply(families, .(familyGroup), mutate,
                       index = paste0('family', seq_along(familyGroup)))
Next I reshape to wide format:
familiesIndex_reshape <- reshape(familiesIndex, idvar = "familyGroup", timevar="index", direction = "wide")
Finally, I use count to get the number of unique combinations:
# dplyr::count is named explicitly because plyr also exports a count()
familiesIndex_reshape_Unique <- dplyr::count(familiesIndex_reshape,
                                             familyMember.family1,
                                             familyMember.family2,
                                             familyMember.family3,
                                             familyMember.family4) %>% ungroup()
This puts familyGroups 1 and 3 into separate combinations. I'd like these two groups to be counted as the same structure regardless of the order of their members. Thanks so much, again.
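To make the desired result concrete, here is a minimal base R sketch of what an order-insensitive count could look like. It is only an illustration under stated assumptions, not necessarily the best approach: the names structureKey and structureCounts are hypothetical, and it assumes no family member label itself contains the ", " separator.

```r
# Example data copied from the question above.
familyMember <- c("son", "son", "Head of household",
                  "daughter", "grandmother", "Head of household", "son",
                  "Head of household", "son", "son",
                  "daughter", "grandmother", "Head of household", "son")
familyGroup <- c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4)

# Hypothetical name: one key per family group. Sorting the members
# before pasting makes the key independent of row order.
structureKey <- tapply(familyMember, familyGroup,
                       function(members) paste(sort(members), collapse = ", "))

# Hypothetical name: how many family groups share each key.
structureCounts <- table(as.vector(structureKey))
print(structureCounts)
```

With the example data, familyGroups 1 and 3 end up with the same key, as do familyGroups 2 and 4, so each of the two distinct structures is counted twice.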