0

I am relatively new to R. I am working with a dataset that has multiple datapoints per timestamp, but they are in multiple rows. I am trying to make a single row for each timestamp with a columns for each variable.

Example dataset

Time    Variable    Value
10      Speed       10
10      Acc         -2
10      Energy      10
15      Speed       9
15      Acc         -1
20      Speed       9
20      Acc         0
20      Energy      2

I'd like to convert this to

Time    Speed    Acc    Energy
10      10       -2     10
15      9        -1     (blank or N/A)
20      8         0     2

These are measured values so they are not always complete.

I have tried ddply to extract each individual value into an array and recombine, but the columns are different lengths. I have tried aggregate, but I can't figure out how to keep the variable and value linked. I know I could do this with a for loop type solution, but that seems a poor way to do this in R. Any advice or direction would help. Thanks!

Jakemcp
  • 9
  • 1

2 Answers2

0

I assume data.frame's name is df

library(tidyr)
spread(df,Variable,Value)
Tomas H
  • 713
  • 4
  • 10
0

Typically a job for dcast in reshape2.First, we make your example reproducible:

df <- structure(list(Time = c(10L, 10L, 10L, 15L, 15L, 20L, 20L, 20L),    
Variable = structure(c(3L, 1L, 2L, 3L, 1L, 3L, 1L, 2L), .Label = c("Acc", 
"Energy", "Speed"), class = "factor"), Value = c(10L, -2L, 10L, 
9L, -1L, 9L, 0L, 2L)), .Names = c("Time", "Variable", "Value"),
class = "data.frame", row.names = c(NA, -8L))

Then:

library(reshape2)
dcast(df, Time ~ ...)

Time Acc Energy Speed
10  -2     10    10
15  -1     NA     9
20   0      2     9

With dplyr you can (cosmetics) reorder the columns with:

library(dplyr)
dcast(df, Time ~ ...) %>% select(Time, Speed, Acc, Energy)
Time Speed Acc Energy
10    10  -2     10
15     9  -1     NA
20     9   0      2
Community
  • 1
  • 1
Vincent Bonhomme
  • 7,235
  • 2
  • 27
  • 38
  • This method works great. I had trouble with it at first becasue it was using another column "Date" as column names. I used dcast(df, Time+Date ~...) and it worked. Thank you for taking the time to reply. I will use reproducible examples in the future. – Jakemcp May 27 '16 at 16:06