0

Updated Dataset:

DateTime            Object.Name    Object.Value
6/22/2017 21:11     DaHum          Normal
6/22/2017 12:59     DaHum          Alarm
6/16/2017 18:48     DaHum          Normal
6/16/2017 14:33     DaHum          Alarm
6/15/2017 18:46     DaHum          Normal
7/28/2017 8:00      ZON-1          58.56
7/28/2017 8:00      MA-H           51.66
7/28/2017 8:00      ZON-2          72.00
7/28/2017 8:00      ZON-4          70.00
7/28/2017 8:00      ZON-3          72.00
7/28/2017 7:45      PH             0.00
7/28/2017 7:45      OA             79.50
7/28/2017 7:45      SP             50.00
7/28/2017 7:45      ZON-1          32.47
7/28/2017 7:45      ZON-3          70.00
7/28/2017 7:45      CC             55.81

Hi I have dataframe in the below format:

I need to convert all the values under Object_Name to column names. The Object_Names have duplicate values, i.e the same Name is repeated with different timestamp.

The data type of Object_Value is alphanumeric, so when passed in R, it either takes as a Factor or a Character

Hence based on timestamp, I need to convert all the Object_Name row values to column name

  Date         Time    Object_Name    Object_Value
  7/28/2017    08:00    A1            58.56
  7/28/2017    08:00    A2            51.66
  .
  .
  .
  7/28/2017    08:30    A1            60.2
  7/28/2017    08:30    A2            65.2
  .
  .
  7/30/2017    08:30    B1            On
  7/30/2017    09:30    B1            Off

I need the output as below:

  Date         Time     A1        A2     B1
  7/28/2017    08:00    58.5     51.6    -
  7/28/2017    08:30    60.2     65.2    -
  7/30/2017    08:30      -        -     On
  7/30/2017    09:30      -        -     Off

Code so far:

JCI <- read.csv("JCIS2.csv",header = T, stringsAsFactors=FALSE)

JCI$Object.Value <- as.numeric(JCI$Object.Value)

library(reshape2)
JCI_Reshape <- dcast(JCI_Unique, Date...Time ~ Object.Name, value.var = "Object.Value", fun.aggregate = mean)
Anagha
  • 3,073
  • 8
  • 25
  • 43
  • 1
    Just use `dcast` from `reshape2` or `data.table` or `spread` from `tidyr` – akrun Aug 17 '17 at 06:26
  • I have tried using dcast, if I convert the object_Value to numeric type. I get NAs for non-numeric. If I use it as a character variable, I am getting all NAs. And without aggregation function(Mean) I'm getting error,. Is there any aggregation function i can use – Anagha Aug 17 '17 at 06:29

1 Answers1

0
library(tidyr) 
spread(JCI, Object_Name,Object_Value)
        Date  Time    A1    A2  B1
1: 7/28/2017 08:00 58.56 51.66  NA
2: 7/28/2017 08:30  60.2  65.2  NA
3: 7/30/2017 08:30    NA    NA  On
4: 7/30/2017 09:30    NA    NA Off

Data:

 dput(JCI)
structure(list(Date = structure(c(1L, 1L, 1L, 1L, 2L, 2L), .Label = c("7/28/2017", 
"7/30/2017"), class = "factor"), Time = structure(c(1L, 1L, 2L, 
2L, 2L, 3L), .Label = c("08:00", "08:30", "09:30"), class = "factor"), 
    Object_Name = structure(c(1L, 2L, 1L, 2L, 3L, 3L), .Label = c("A1", 
    "A2", "B1"), class = "factor"), Object_Value = structure(c(2L, 
    1L, 3L, 4L, 6L, 5L), .Label = c("51.66", "58.56", "60.2", 
    "65.2", "Off", "On"), class = "factor")), .Names = c("Date", 
"Time", "Object_Name", "Object_Value"), class = c("data.table", 
"data.frame"), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x0000000007b70788>)
Prasanna Nandakumar
  • 4,295
  • 34
  • 63
  • Throwing an error: Error: Duplicate identifiers for rows (194894, 194895), (194268, 194269), (193878, 193879), (193642, 193643), (193524, 193525), (193446, 193447), (193406, 193407), (192936, 192937), (195908, 195909), (195558, 195559), (195246, 195247),..... – Anagha Aug 17 '17 at 06:40
  • Can you produce a reproducible example of your dataset as an update to ur question. – Prasanna Nandakumar Aug 17 '17 at 06:41
  • Please find the updated dataset – Anagha Aug 17 '17 at 06:57