I was reading this question: How do I "fill down"/expand observations with respect to a time variable?
I need the same thing for my dataset.
The answers there point to this one: Complete column with group_by and complete. I tried to replicate the code from those answers, but it didn't work.
My dataset looks like this (this is a simplification: the real dataset has more variables, and its dimensions are 631230 obs. of 21 variables):
df
Year ID Name Brunch Sales Wages Labor productivity
2014 1750941579 JEN A 3 2 1.5
2015 1750941579 JEN A 4 2 2
2016 1750941579 JEN A 6 4 1.5
2017 1750941579 JEN A 8 4 2
2018 1750941579 JEN A 8 4 2
2014 1303477204 MIC B 6 2 3
2015 1303477204 MIC B 8 4 2
So I used this code:
DF <- complete(df, ID, Year = full_seq(Year, period = 1), fill = list(`Labor productivity` = 0))
and got something like this:
Year ID Name Brunch Sales Wages Labor productivity
2014 1750941579 JEN A 3 2 1.5
2015 1750941579 JEN A 4 2 2
2016 1750941579 JEN A 6 4 1.5
2017 1750941579 JEN A 8 4 2
2018 1750941579 JEN A 8 4 2
2014 1303477204 MIC B 6 2 3
2015 1303477204 MIC B 8 4 2
2016 1303477204 #¿NOMBRE? B 0 0 NaN
2017 1303477204 NA NA NA NA NA
2018 1303477204 NA NA NA NA NA
It completed the panel, as I wanted, but is there a way to keep Name, Brunch, and the other columns not listed here?
I don't mind if the quantitative variables (Sales, Wages) are NA or 0, but I need to keep the qualitative variables (Name and Brunch) that are associated with each ID.
I also tried this code from the second link, adapted to my dataset:
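To make the goal concrete, here is a minimal sketch of one approach that might do this, assuming Name and Brunch never change within an ID. It wraps the ID-level columns in tidyr's nesting() so that complete() only expands over combinations that actually occur. The toy data is copied from the simplified table above, and the column types are an assumption.

```r
library(tidyr)
library(tibble)

# Toy data copied from the simplified table; column types are assumed.
df <- tribble(
  ~Year, ~ID,        ~Name, ~Brunch, ~Sales, ~Wages, ~`Labor productivity`,
  2014,  1750941579, "JEN", "A",     3,      2,      1.5,
  2015,  1750941579, "JEN", "A",     4,      2,      2,
  2016,  1750941579, "JEN", "A",     6,      4,      1.5,
  2017,  1750941579, "JEN", "A",     8,      4,      2,
  2018,  1750941579, "JEN", "A",     8,      4,      2,
  2014,  1303477204, "MIC", "B",     6,      2,      3,
  2015,  1303477204, "MIC", "B",     8,      4,      2
)

# nesting() keeps only the ID/Name/Brunch combinations present in the data,
# so every new Year row inherits the Name and Brunch of its ID.
# Columns not listed in `fill` (Sales, Wages) are left as NA in the new rows.
DF <- complete(
  df,
  nesting(ID, Name, Brunch),
  Year = full_seq(Year, period = 1),
  fill = list(`Labor productivity` = 0)
)
```

On the real dataset, any other column that is constant per ID would presumably need to go inside nesting() as well. If some of those columns do vary within an ID, this sketch would create extra combinations rather than a clean panel.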
DF <- df %>%
  group_by(Year, ID) %>%
  summarise(`Labor Productivity` = n()) %>%
  ungroup() %>%
  complete(Year, ID, fill = list(`Labor Productivity` = 1))
but I only get this message: summarise() regrouping output by 'Year' (override with .groups argument)
and the output dataset looks like this:
Year ID Name Labor productivity
2014 1750941579 JEN 1
2014 1303477204 MIC 1
2015 1750941579 JEN 1
2015 1303477204 MIC 1
2016 1750941579 JEN 1
2016 1303477204 MIC 1
And so on... (dimensions: 631230 obs. of 3 variables)
So, second question: What's wrong with this code?
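As far as I can tell, two things are going on here. The summarise() line is only an informational message about grouping, not an error. More importantly, summarise() returns just the grouping columns plus the new summary, and n() counts rows per Year/ID pair, so the other columns are dropped and the real values are replaced by a count. A minimal sketch on a cut-down version of the toy data (the column name n_rows is made up for illustration):

```r
library(dplyr)
library(tibble)

# Cut-down toy data; values copied from the simplified table above.
df <- tribble(
  ~Year, ~ID,        ~Name, ~Sales,
  2014,  1750941579, "JEN", 3,
  2015,  1750941579, "JEN", 4,
  2014,  1303477204, "MIC", 6,
  2015,  1303477204, "MIC", 8
)

counts <- df %>%
  group_by(Year, ID) %>%
  # n() counts the rows in each Year/ID group; it does not keep Sales.
  # .groups = "drop" silences the regrouping message and ungroups the result.
  summarise(n_rows = n(), .groups = "drop")

# Only the grouping columns and the new summary column are left.
names(counts)
```

Because each Year/ID pair has exactly one row, the count is 1 everywhere, and filling the missing combinations with 1 afterwards makes them indistinguishable from the observed ones.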