I would like to create new rows in a data.frame for all missing years for each group (firm and type). The dataframe looks as follows:
minimal <- data.frame(firm = c("A","A","A","B","B","B","A","A","A","B","B","B"),
type = c("X","X","X","X","X","X","Y","Y","Y","Y","Y","Y"),
year = c(2000,2004,2007,2010,2008,2001,2002,2003,2007,2000,2001,2008),
value = c(1,3,7,9,9,2,3,3,7,5,9,15)
)
Dataframe:
firm type year value
A X 2000 1
A X 2004 3
A X 2007 7
B X 2010 9
B X 2008 9
B X 2001 2
A Y 2002 3
A Y 2003 3
A Y 2007 7
B Y 2000 5
B Y 2001 9
B Y 2008 15
Now, what I want to get is the following: I can see in the data that the minimum year is 2000 and the maximum is 2010. I want to add a row for each missing year for each combination of firm-type. E.g. for firm A and type X, I would like to add rows such that it looks like this:
Final output:
firm type year value
A X 2000 1
A X 2004 3
A X 2007 7
A X 2001 1
A X 2002 1
A X 2003 1
A X 2005 3
A X 2006 3
A X 2008 7
A X 2009 7
A X 2010 7
Additionally, I want to write the value from the previous year into the column 'value' for the missing row for all subsequent years until a new non-missing row appears (as seen in the final output example).
I have not yet come up with any useful code, but what I have found so far is the following which might be the right direction:
setDT(minimal)[, .SD[match(2000:2010, year)],
by = c("firm","type")]
I don't really understand the concept of setDT and .SD, but this creates at least one row for each firm type combination. However, there is not content for year.
Thanks a lot in advance!