I am analysing a longitudinal survey (https://microdata.worldbank.org/index.php/catalog/3712) with around 2k participating households (the number drops with each round). There were 11 waves/rounds, each split into around 6-8 datasets by question theme. To analyse it, I need it in proper panel data format, with one file per theme that combines all the waves.
Please see the Excel snippets (with most columns removed for simplicity) of how it looks: Round 1 vs round 9 (the levels of the categorical variables have changed names, from full names to just numbers, but it is the same question). Basically, the format looks something like this:
head(round1.csv)
ID | INCOME SOURCE | ANSWER | CHANGE |
---|---|---|---|
101 | 1.Business | 1. YES | 3. Reduced |
101 | 2.Pension | 2. NO | |
102 | 1.Business | 1. YES | 2. No change |
102 | 2. Assistance | 1. YES | 3. Reduced |
So far I have only been analysing separate waves on their own, but I do not know how to:
- Combine so many data frames together.
- Convert it to a format where each ID appears only once per wave. I used spread to reshape single files for modelling. I think I can imagine what the data frame would look like if the only question was whether they receive the income source (maybe like this?:
WAVE | ID | Business | Pension |
---|---|---|---|
1 | 101 | 1. YES | 2. NO |
1 | 102 | 1. YES | 1. YES |
2 | 101 | 2. NO | 1. YES |
2 | 102 | 1. YES | 1. YES |
), but I do not understand how it is supposed to look when the change in that income source is included as well.
- How to deal with weights - there are weights added to one of the files for each wave. Some are missing, and they change per wave, as fewer households agree to participate each round. I am happy to filter and only use households that participated in every round, to make it easier.
I looked for an answer here: "Panel data, from wide to long with multiple variables" and "Transpose a wide dataset into long with multiple steps", but I think my problem is different.
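To make the three points above concrete, here is a minimal sketch of what I imagine the stacking, reshaping and filtering could look like, using dplyr and tidyr on toy data. The column name INCOME_SOURCE, the already-harmonised labels, and household 103 are assumptions made up for illustration; they are not the real survey files, and the weights are not handled here.

```r
library(dplyr)
library(tidyr)

# Toy stand-ins for two rounds (hypothetical values, labels already harmonised).
round1 <- data.frame(
  ID            = c(101, 101, 102, 102, 103, 103),
  INCOME_SOURCE = c("Business", "Pension", "Business", "Pension", "Business", "Pension"),
  ANSWER        = c("YES", "NO", "YES", "YES", "NO", "YES"),
  CHANGE        = c("Reduced", NA, "No change", "Reduced", NA, "No change")
)
# Household 103 drops out in round 2.
round2 <- data.frame(
  ID            = c(101, 101, 102, 102),
  INCOME_SOURCE = c("Business", "Pension", "Business", "Pension"),
  ANSWER        = c("NO", "YES", "YES", "YES"),
  CHANGE        = c(NA, "No change", "Reduced", "No change")
)

# 1. Stack the rounds; the list names become the WAVE column.
long <- bind_rows(list(`1` = round1, `2` = round2), .id = "WAVE")

# 2. One row per household per wave; each income source gets its own
#    ANSWER_* and CHANGE_* column.
panel <- long %>%
  pivot_wider(
    id_cols     = c(WAVE, ID),
    names_from  = INCOME_SOURCE,
    values_from = c(ANSWER, CHANGE)
  )

# 3. Keep only households observed in every wave (balanced panel).
n_waves <- n_distinct(long$WAVE)
balanced <- panel %>%
  group_by(ID) %>%
  filter(n_distinct(WAVE) == n_waves) %>%
  ungroup()
```

With this layout the change question simply becomes a second set of columns (CHANGE_Business, CHANGE_Pension) next to the answer columns, but I am not sure this is the right structure for panel models.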
I am a student, definitely not an expert, so I apologise for my skill-level.