I need to add a variable to my dataframe that gives each person's work status for a given year, based on their monthly work status.
The work status can change for each person every month, so I want the work status that is present for >50% of the year.
However, I cannot figure out how to do this. Does anybody have any suggestions?
I have these variables for each observation:
- Workstatus (20 different codes for the different work statuses)
- Year (2012-2019)
- Month of each year
I guess I need to group the observations and then apply some condition, so that for a given year, e.g. 2012, the work status code that is present for >50% of that year is the value that is returned.
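To make the idea concrete, here is a minimal sketch of the grouping described above, assuming the data is a pandas dataframe with one row per person per month. The column names `person_id`, `year`, `month`, `workstatus` and `status_year`, and the function name `majority_status`, are hypothetical and only for illustration. The sketch also assumes that ">50% of the year" means more than half of the months observed for that person in that year; if it should mean more than 6 calendar months, the comparison would be against 6 instead.

```python
import pandas as pd


def majority_status(df, keys=("person_id", "year"), status="workstatus"):
    """Add a 'status_year' column holding the status seen in >50% of months.

    All column names are hypothetical placeholders, not taken from real data.
    """
    keys = list(keys)
    # Number of months each status occurs within each person-year.
    counts = df.groupby(keys + [status]).size().reset_index(name="n_months")
    # Total number of months observed within each person-year.
    counts["total"] = counts.groupby(keys)["n_months"].transform("sum")
    # Keep only statuses covering strictly more than half of those months.
    # At most one status per person-year can satisfy this.
    major = counts.loc[counts["n_months"] > counts["total"] / 2, keys + [status]]
    major = major.rename(columns={status: "status_year"})
    # Left merge: person-years without a majority status get a missing value.
    return df.merge(major, on=keys, how="left")


# Made-up example: person 1 has status 3 for 7 of 12 months,
# person 2 is split 6/6 and therefore has no majority status.
df = pd.DataFrame({
    "person_id": [1] * 12 + [2] * 12,
    "year": [2012] * 24,
    "month": list(range(1, 13)) * 2,
    "workstatus": [3] * 7 + [5] * 5 + [1] * 6 + [2] * 6,
})
result = majority_status(df)
```

In this sketch a person-year with no status above 50% ends up with a missing value rather than an arbitrary code, which seems safer than silently picking the most frequent one.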
Thank you so much!