0

I get always a data table of this format:

set.seed(123)
dt <- data.table(date = seq(as.Date('2020-01-01'), by = '1 day', length.out = 365),
                 Germany = rnorm(365, 2, 1), check.names = FALSE)

The data table dt is only a counterexample. Every day of a month has a value (price), but it is also possible that a day has an NA-value.

... can be understood as any value.

What I want to do is, to change the format of this data table into the following form (colors can be ignored):

enter image description here

The first column defines the months of the year and all other columns are declaring the dates, e.g. price on the 2nd Jan is 2.11 or on the 31th Sept is 1.78. How can I do this?

markus
  • 25,843
  • 5
  • 39
  • 58
MikiK
  • 398
  • 6
  • 19

2 Answers2

1

Extract month and date in separate column and get the data in wide format.

library(dplyr)
dt %>%
  mutate(month = format(date, '%b'), 
         date = format(date, '%d')) %>%
  tidyr::pivot_wider(names_from = date, values_from = Germany)

If you want to do this in data.table you can use dcast.

library(data.table)
dcast(dt[,`:=`(month = format(date, '%b'), date = format(date, '%d'))], 
               month~date, value.var = 'Germany')
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Thank you for your fast answer. If I use the first approach with ```dplyr``` package and define it as a new data table ```dt.test```, I get the following error message: **Error in exists(what, where, inherits = FALSE) invalid first argument** – MikiK Oct 19 '20 at 07:27
  • Are you using the same data? Seems like an issue with the environment. Can you restart R and try again? – Ronak Shah Oct 19 '20 at 07:32
  • Now it's working. Thank you!! – MikiK Oct 19 '20 at 07:39
1

Here is another way to solve it using data.table. Note that month.abb is a built-in variable in r. Using factor function and month.abb as levels allows to sort the months properly.

library(data.table)

dcast(dt[, month := factor(months(date, abbr = TRUE), month.abb)], 
      month ~ mday(date), value.var = 'Germany')

 #    month     1     2    3    4    5    6    7    8    9   10   11    12 ...
 # 1:   Jan  1.44  1.77 3.56 2.07 2.13 3.72 2.46 0.73 1.31 1.55 3.22  2.36
 # 2:   Feb  1.70  2.90 2.88 2.82 2.69 2.55 1.94 1.69 1.62 1.31 1.79  0.73
 # 3:   Mar  2.38  1.50 1.67 0.98 0.93 2.30 2.45 2.05 2.92 4.05 1.51 -0.31
 # ...