I have data in long format (each row holds one date, one ID and several variables; see the code below) and I would like to build an array in R from it.
test_df <- data.frame("dates"=c(19801230,19801231,19801231,19810101), "ID"=c(101,101,102,102), "var1"=0:3, "var2"=5:8)
If I focus on a single variable, I can always create a wide table with one row per date and one column per ID holding the corresponding value. What I would like instead is to build an array automatically, so that all variables sit in one object with an ordered time dimension.
In the example of test_df, I would like to obtain two tables bound together (an array), where the first table holds the values of var1 and the second those of var2. Both tables should have the dates 19801230, 19801231 and 19810101 as row indices and the IDs 101 and 102 as column indices, which allows them to be bound together in an array (with NAs for missing values).
I could run an lapply over the IDs or over the dates and merge the output lists into an array, but it seems complicated to make the dimensions match (different IDs are present on different dates). Do you have suggestions?
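For reference, here is a minimal sketch of the result described above, using base R only. It pre-allocates an NA-filled array over all dates and IDs and fills it by matrix indexing, so the dimensions match even when some ID is missing on some date. The names vars, dates, ids, arr and idx are illustrative, and the sketch assumes each (date, ID) pair appears at most once in the data.

```r
# Example data from the question
test_df <- data.frame(dates = c(19801230, 19801231, 19801231, 19810101),
                      ID    = c(101, 101, 102, 102),
                      var1  = 0:3,
                      var2  = 5:8)

# Illustrative names (not from the original post)
vars  <- c("var1", "var2")
dates <- sort(unique(test_df$dates))   # ordered time dimension
ids   <- sort(unique(test_df$ID))

# NA-filled array: dates x IDs x variables
arr <- array(NA_real_,
             dim = c(length(dates), length(ids), length(vars)),
             dimnames = list(dates    = as.character(dates),
                             ID       = as.character(ids),
                             variable = vars))

# Row and column position of every observation in the long table
idx <- cbind(match(test_df$dates, dates), match(test_df$ID, ids))

# Fill one slice per variable; cells without an observation stay NA
# (assumes each date/ID pair occurs at most once)
for (k in seq_along(vars)) {
  arr[cbind(idx, k)] <- test_df[[vars[k]]]
}

arr[, , "var1"]   # dates in rows, IDs in columns
```

Because the array is indexed by name, something like arr["19801231", , ] returns all IDs and variables for a single date.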
The only closely related question I have found asks the reverse of this one, and it did not help me much.