My data looks like this:
HH_ID  INDUSTRY        FREQUENCY
1002   NURSE           2
1002   DOCTOR          1
1003   NOT APPLICABLE  3
1004   ENGINEER        1
1004   CLERK           1
1004   NURSE           1
That is dataset df1. In another dataset it looks like this:
HH_ID  INDUSTRY        AGE
1002   NURSE           26
1002   NURSE           25
1002   DOCTOR          34
1003   NOT APPLICABLE  40
1003   NOT APPLICABLE  28
1003   NOT APPLICABLE  23
1004   ENGINEER        35
1004   CLERK           40
1004   NURSE           24
The other dataset, with age, is called df2. I want a dataset that looks like this:
HH_ID  INDUSTRY        FREQUENCY
1002   NURSE           2
1003   NOT APPLICABLE  3
1004   CLERK           1
In other words, I want to create another dataset, df3, that gives the INDUSTRY with the maximum FREQUENCY for each HH_ID. If this is not possible because no single industry has the highest frequency for a HH_ID (as with 1004, where every industry has a frequency of 1), I want it to select the INDUSTRY for that HH_ID based on the age of the household members in the other dataset, df2. In the desired output, 1004 gets CLERK, which is the industry of its oldest member (age 40). I am working in R. I have tried the data.table package, but it didn't work. Please help.
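One possible way to get there is sketched below. It uses base R rather than data.table, and it assumes that ties in FREQUENCY are broken by the oldest member in df2, since that is what produces CLERK for 1004 in the desired output. The column name MAX_AGE and the intermediate objects max_age and merged are illustrative names, not part of the original data. If two industries tie on both frequency and maximum age, this sketch simply keeps whichever comes first after sorting.

```r
# Rebuild the two example datasets from the question.
df1 <- data.frame(
  HH_ID     = c(1002, 1002, 1003, 1004, 1004, 1004),
  INDUSTRY  = c("NURSE", "DOCTOR", "NOT APPLICABLE",
                "ENGINEER", "CLERK", "NURSE"),
  FREQUENCY = c(2, 1, 3, 1, 1, 1),
  stringsAsFactors = FALSE
)

df2 <- data.frame(
  HH_ID    = c(1002, 1002, 1002, 1003, 1003, 1003, 1004, 1004, 1004),
  INDUSTRY = c("NURSE", "NURSE", "DOCTOR",
               "NOT APPLICABLE", "NOT APPLICABLE", "NOT APPLICABLE",
               "ENGINEER", "CLERK", "NURSE"),
  AGE      = c(26, 25, 34, 40, 28, 23, 35, 40, 24),
  stringsAsFactors = FALSE
)

# Oldest member for each HH_ID / INDUSTRY pair in df2.
# MAX_AGE is a hypothetical helper column used only for tie-breaking.
max_age <- aggregate(AGE ~ HH_ID + INDUSTRY, data = df2, FUN = max)
names(max_age)[names(max_age) == "AGE"] <- "MAX_AGE"

# Attach the tie-breaker to df1; all.x = TRUE keeps every df1 row
# even if a pair has no match in df2 (MAX_AGE is then NA).
merged <- merge(df1, max_age, by = c("HH_ID", "INDUSTRY"), all.x = TRUE)

# Sort so that, within each household, the highest FREQUENCY comes first
# and ties are ordered by the oldest member (assumed tie-break rule).
# order() puts NA values last by default.
merged <- merged[order(merged$HH_ID, -merged$FREQUENCY, -merged$MAX_AGE), ]

# Keep the first row per household and drop the helper column.
df3 <- merged[!duplicated(merged$HH_ID), c("HH_ID", "INDUSTRY", "FREQUENCY")]
rownames(df3) <- NULL

print(df3)
```

If the tie should instead go to the youngest member, changing `-merged$MAX_AGE` to an ascending sort on a minimum-age column would be the corresponding adjustment.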