I'm trying to reshape hourly climatic data, but I can't get it right. The data has one day variable (365 levels, +/- 1 depending on the year), one hour variable (24 levels), and one numeric temperature variable (about 8760 observations).

head(df)
####         .day .hour temperature
#### 2 2013-01-01     1          19
#### 3 2013-01-01     2          19
#### 4 2013-01-01     3          18
#### 5 2013-01-01     4          18

My expected output is a data.frame like this, but instead of the ones (the lengths) I need the temperature values inside:

        .day 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
1 2013-01-01 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
2 2013-01-02 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
3 2013-01-03 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
4 2013-01-04 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

This output was generated with dcast(.day~.hour). I also tried tidyr, with no success. How can I do this? And what if some lines are missing somewhere (a missing day, etc.)? Thanks.

agenis
  • What you are trying to do is reformat long data to wide format. `tidyr` has the function `spread`, which is appropriate for this purpose. The help file has sufficient examples here: https://cran.r-project.org/web/packages/tidyr/tidyr.pdf#page.14 – Frash Oct 20 '16 at 10:49
  • @Frash OK, now I get this error with `spread`: *Error: Duplicate identifiers for rows (2138, 2161), (7178, 7179)*. I'll try to solve it, then let you know if it works. – agenis Oct 20 '16 at 12:04
  • OK, that's probably because of the hour change twice a year. – agenis Oct 20 '16 at 12:17
  • @Frash OK, now it works fine! Thanks a lot. You can post it as an answer if you want. – agenis Oct 20 '16 at 12:58
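
The "Duplicate identifiers" error in the comments above means that some (.day, .hour) pair occurs on more than one row. The thread does not say how the asker resolved it, so the following is only a sketch of one possible approach in base R, using a hypothetical toy data frame and an assumed choice of averaging the duplicated readings:

```r
# Hypothetical toy data: hour 2 appears twice on one day, as can happen
# when clocks go back at the end of daylight saving time
df <- data.frame(
  .day = as.Date(rep("2013-10-27", 4)),
  .hour = c(1, 2, 2, 3),
  temperature = c(12, 11, 10, 10)
)

# Flag every row whose (.day, .hour) pair occurs more than once
key <- df[c(".day", ".hour")]
dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
df[dup, ]

# Assumed fix (not necessarily what the asker did): average the
# duplicated readings so that each (.day, .hour) pair is unique
df_unique <- aggregate(temperature ~ .day + .hour, data = df, FUN = mean)
df_unique
```

Once every (.day, .hour) pair is unique, `spread` should no longer report duplicate identifiers. Keeping only the first reading with `df[!duplicated(key), ]` would be an alternative if averaging is not appropriate.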

1 Answer

To reformat data from long to wide format, we can use the tidyr function spread. The help file has sufficient examples here: http://cran.r-project.org/web/packages/tidyr/tidyr.pdf#page.14

library(tidyr)
spread(df, .hour, temperature, fill = NA) # fill any missing data with NA
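
To make the call above concrete, here is a small worked sketch. The toy data frame is hypothetical (two days, three hours, with one hourly row deliberately left out); only the column names come from the question. The second call uses `pivot_wider`, which supersedes `spread` in tidyr 1.0.0 and later, and is shown as an equivalent alternative rather than as part of the original answer.

```r
library(tidyr)

# Hypothetical toy data: hour 1 is missing on the second day
df <- data.frame(
  .day = as.Date(c("2013-01-01", "2013-01-01", "2013-01-01",
                   "2013-01-02", "2013-01-02")),
  .hour = c(0, 1, 2, 0, 2),
  temperature = c(19, 19, 18, 17, 16)
)

# One row per day, one column per hour; the missing hour becomes NA
wide <- spread(df, .hour, temperature, fill = NA)
wide

# Equivalent reshaping with the newer interface
wide2 <- pivot_wider(df, names_from = .hour, values_from = temperature)
wide2
```

The hour values become column names ("0", "1", "2"), so they need backticks or `[[ ]]` when accessed, for example `wide[["1"]]`.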

A comprehensive tour of other options available to effect the same changes is given here: https://stackoverflow.com/a/9617424/2724299
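
The question also asks about a day that is missing entirely. `fill = NA` covers hours missing within a day that is present, but a day with no rows at all produces no row in the wide output. A possible sketch, assuming tidyr's `complete` is used to add the absent (.day, .hour) combinations before reshaping; the toy data is hypothetical:

```r
library(tidyr)

# Hypothetical toy data: 2013-01-02 has no rows at all
df <- data.frame(
  .day = as.Date(c("2013-01-01", "2013-01-01", "2013-01-03", "2013-01-03")),
  .hour = c(0, 1, 0, 1),
  temperature = c(19, 19, 17, 16)
)

# Add a row for every calendar day between the first and last date,
# crossed with the hours already present; new rows get NA temperature
full <- complete(df, .day = seq(min(.day), max(.day), by = "day"), .hour)

# The missing day now appears as a row of NA values
wide <- spread(full, .hour, temperature)
wide
```

For a full year of data, replacing the bare `.hour` with an explicit `.hour = 0:23` would also guarantee a column for an hour that never occurs in the data.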

Frash