I am trying to create a panel dataset for multiple variables across years. My data is in the following format in RStudio:
2012 2012 2012 2012 2013 2013 2013 .....
Name Var1 Var2 Var3......Var10 Var1 Var2...Var10.....
X 1 4 20 .......
Y 2 7 25 ......
Z 3 9 26 ...... and so on for each variable across years
I want to create a panel dataset for these that looks like
Name Year Var1 Var2 Var3....Var10
X 2012
Y 2012
Z 2012
X 2013
Y 2013
Z 2013
I have created the panel in the following steps:
Merged top two rows so the variables read as - 2012.Var1, 2012.Var2
Used the gather function to create a separate dataset for each variable as
Name Year Var1 X 2012.Var1 1 Y 2012.Var2 2 X 2013.Var1 and so on
I then combined the sheets to form one dataframe in the desired format
For step 1:
colnames(df) <- paste(colnames(df), df[1, ])
For step 2:
Var1 <- df %>% gather(year, Var1, `2012.Var1`,`2012.Var2`and so on)
Creating a Dataset for Var1
- did the same for all the variables
For step 3:
new_data = cbind(Var1,Var2, ..., Var10) # Combining Different Datasets
names(new_data)
While the code worked to create the dataframe, it is however, extremely tedious and time consuming when the variables are large in number and spread across years. I'm looking for an easier way to create the same dataframe?