dcast data having id.vars variable with levels of unequal length

Question

I have a data frame like this one:

  date_time            id        Temp
2012-02-27 00:12:30   |  Sens_1  |  25.2
2012-02-27 00:13:00   |  Sens_1  |  25.5
2012-02-27 00:13:30   |  Sens_1  |  26.3
2012-02-27 00:14:00   |  Sens_1  |  26.3
2012-02-27 00:14:30   |  Sens_1  |  26.3
           :               :          :
2012-02-27 00:12:30   |  Sens_2  |  35.4
2012-02-27 00:13:00   |  Sens_2  |  36.5
2012-02-27 00:13:30   |  Sens_2  |  38.8
2012-02-27 00:14:30   |  Sens_2  |  36.3
          :                 :         :
2012-02-27 00:12:30   |  Sens_3  |  15.4
2012-02-27 00:13:00   |  Sens_3  |  15.5
2012-02-27 00:13:30   |  Sens_3  |  16.8
2012-02-27 00:14:30   |  Sens_3  |  16.3
          :                 :         :

And I want to get another data frame that looks like this:

      date_time           Sens_1 |  Sens_2  |  Sens_3
2012-02-27 00:12:30   |  25.2    |  35.4    |  15.4
2012-02-27 00:13:00   |  25.5    |  36.5    |  15.5
2012-02-27 00:13:30   |  26.3    |  38.8    |  16.8
2012-02-27 00:14:00   |  26.3    |   NA     |   NA
2012-02-27 00:14:30   |  26.3    |  36.3    |  16.3
           :               :          :          :

How can I achieve that? Despite that in the piece of df I put above all the three variables got the same "date_time" value (and so it is in most of the cases) there are certain rows where any subgroup of variables can have a certain "date_time" value and the rest won't.

You can use spread function from dplyr. – thisisrg Feb 21 '17 at 04:06 — thisisrg, Feb 21 '17 at 04:06

Sathish · Accepted Answer · 2017-02-21T23:18:55.053

Here is the one-liner:

library( 'data.table' )
setDT( df1 )  # convert data frame to data table by reference
dcast( df1, date_time ~ id, value.var = 'Temp', fun = function( x ) x, fill = NA_real_ )
#              date_time Sens_1 Sens_2 Sens_3
# 1: 2012-02-27 00:11:00   23.0   34.6   14.6
# 2: 2012-02-27 00:11:30   23.4   33.4   14.4
# 3: 2012-02-27 00:12:00   34.6     NA   13.6
# 4: 2012-02-27 00:12:30   35.4     NA   15.4
# 5: 2012-02-27 00:13:00   25.5   36.5   15.5
# 6: 2012-02-27 00:13:30   26.3   38.8   16.8

Data:

df1 <- structure(list(date_time = c("2012-02-27 00:11:00", "2012-02-27 00:11:30", 
                                    "2012-02-27 00:12:00", "2012-02-27 00:12:30", "2012-02-27 00:13:00", 
                                    "2012-02-27 00:13:30", "2012-02-27 00:11:00", "2012-02-27 00:11:30", 
                                    "2012-02-27 00:12:00", "2012-02-27 00:12:30", "2012-02-27 00:13:00", 
                                    "2012-02-27 00:13:30", "2012-02-27 00:11:00", "2012-02-27 00:11:30", 
                                    "2012-02-27 00:12:00", "2012-02-27 00:12:30", "2012-02-27 00:13:00", 
                                    "2012-02-27 00:13:30"), 
                      id = c("Sens_1", "Sens_1", "Sens_1", "Sens_1", "Sens_1", "Sens_1", "Sens_2", "Sens_2",
                             "Sens_1", "Sens_1", "Sens_2", "Sens_2", "Sens_3", "Sens_3", "Sens_3", "Sens_3", 
                             "Sens_3", "Sens_3"),
                      Temp = c(23, 23.4, 24.6, 25.2, 25.5, 26.3, 34.6, 33.4, 34.6, 35.4, 36.5, 38.8, 14.6,
                               14.4, 13.6, 15.4, 15.5, 16.8)), .Names = c("date_time", "id", "Temp"),
                 row.names = c(NA, -18L), class = "data.frame")

Thanks for the answer @Sathish I still having the same problem though. I probably wasn't specific enough when I first posted the question, but my doubt is about when I get values in a certain row that not all the new columns will have measured, and so each won't have the same lenght, wish gives me problems to rearrange or create a new dataframe. — NAOS, Feb 21 '17 at 19:38
Please edit your question with your doubts. I would suggest editing the title like "reshaping data having `value.vars` variable with levels of unequal length". I do not know if somebody will tag you with an existing answer and make it as duplicate again. It does not rest with me. I posted an edited solution. Please take a look. — Sathish, Feb 21 '17 at 22:50
title like "dcast data having `id.vars` variable with levels of unequal length". - more specific — Sathish, Feb 21 '17 at 22:58
The solution you proposed works perfectly! Thanks very much for helping and all the extra advices. — NAOS, Feb 22 '17 at 15:49
No problem. Make sure you have `datetime` column has unique values. If you have duplicates in `datetime` column, then this solution will not work. Note that `Sens1` is supposed to be 8 rows, but condensed to 6 rows, because of duplicates in `datetime` column. — Sathish, Feb 22 '17 at 17:38

dcast data having id.vars variable with levels of unequal length

1 Answers1