I have a data frame, in which data for different countries are listed vertically with this pattern:
country | time | value
I want to transform it in a data frame, in which each row is a specific time period, and every column is the value relative to that country. Data are monthly.
time | countryA-value | countryB-value |countryC-value
Moreover, not all periods are present, when data is missing, the row is just absent, and not filled with NA or similar. I thought to two possible solutions, but they seem too complicated and inefficient. I do not write here the code,
If the value in a cell of the column "time" is more than one month after the cell above, while the cells to the left are the same (i.e. the data pertains to the same country), then we have a gap. I have to fill the gap and to this recursively until all missing dates are included. At this point I have for each country the same number of observations, and I can simply copy a number of cells equal to the number of observations.
Drawbacks: it does not seem very efficient.
I could create a list of time periods using the command
allDates <- seq.Date(from = as.Date('2020-02-01'), to = as.Date('2021-01-01'), by = 'month')-1)
Then I look up the table about each period of allDates for each subset of the table of each country. If the value exist, copy the value, if there is not, fill with NA.
Drawbacks: I have no idea of which function I could use to this purpose.
Below the code to create a small table with two missing rows, namely data2
data <- data.frame(matrix(NA, 24, 3))
colnames(data) <- c("date", "country", "value")
data["date"] <- rep((seq.Date(from = as.Date('2020-02-01'), to = as.Date('2021-01-01'), by = 'month')-1), 2)
data["country"] <- rep(c("US", "CA"), each = 12)
data["value"] <- round(runif(24, 0, 1), 2)
data2 <- data[c(-4,-5),]