1

I want to read a .csv file with columns for date and time into a data frame. This is my raw data:

10/05/18,0:00:03,SP-PALL-01,0.05
10/05/18,0:00:03,SP-PDALL-05,0.1
10/05/18,0:00:03,PT33-PT34,0.21249676
10/05/18,0:00:03,PT32-PT36,0.42838383
10/05/18,0:00:03,SP-PDAH-03,0.6
10/05/18,0:00:03,PT32-PT34,0.60584164
10/05/18,0:00:03,SP-PDAH-04,0.7
10/05/18,0:00:03,PT-31,2.4700246
10/05/18,0:00:03,PT32-PT31,12.394566
10/05/18,0:00:03,PT-32,14.782079

I used a variation of this answer:

setClass('tagDate')
setClass('tagTime')
setAs("character","tagDate", function(from) as.Date(from, format="%d/%m/%y"))
setAs("character","tagTime", function(from) as.POSIXct(from, format="%H:%M:%S"))
df <- read.csv('allTags-cleanup-copy.csv',header = FALSE, colClasses=c('tagDate','tagTime','character','real'))
names(df) <- c('tagDate', 'tagTime', 'tagName', 'tagValue')
head(df)

This is my result:

     tagDate             tagTime     tagName  tagValue
1 2018-05-10 2018-05-10 00:00:03  SP-PALL-01 0.0500000
2 2018-05-10 2018-05-10 00:00:03 SP-PDALL-05 0.1000000
3 2018-05-10 2018-05-10 00:00:03   PT33-PT34 0.2124968
4 2018-05-10 2018-05-10 00:00:03   PT32-PT36 0.4283838
5 2018-05-10 2018-05-10 00:00:03  SP-PDAH-03 0.6000000
6 2018-05-10 2018-05-10 00:00:03   PT32-PT34 0.6058416

For the time column, I tried variations of separator and format strings, but no luck.

Can you please help me parse this data set successfully?

cogitoergosum
  • 2,309
  • 4
  • 38
  • 62
  • 1
    I'm slightly confused... in your data, for example `10/05/18,0:00:03,SP-PALL-01,0.05`, doesn't `10/05/18` correspond to `tagDate`, and it has `/` as a separator at the moment? Can you show us an end result that you desire? – Kim May 10 '18 at 04:58
  • Apologies! Date is formatted correctly; time is not. I edited the question. – cogitoergosum May 10 '18 at 05:00
  • @cogitoergosum Whats your expectation for `Time` column? it seems to be working as well. It adding current date with `Time`. – MKR May 10 '18 at 05:05
  • I want the parsing to result _only_ in `HH:MM:SS` format. – cogitoergosum May 10 '18 at 05:06
  • @cogitoergosum You time data is already in `HH:MM:SS` format. Why can't you just keep it as `character`? – MKR May 10 '18 at 05:16
  • @MKR I am going to do some time calculations later. – cogitoergosum May 10 '18 at 05:17
  • @cogitoergosum Okay. Perhaps using `hms` package to keep your `Time` as `difftime` can be another option to consider. Have a look at answer added by my. – MKR May 10 '18 at 05:29

2 Answers2

1

The lubridate library relieves you of any need to do setClass in advance. This is a shortcut: first read the df with colClasses as all characters. Then

library(tidyverse) ## For %>% pipes
library(magrittr)  ## For %<>% pipes (convenience)
library(lubridate) ## For parsing dates

df %<>%
  dplyr::mutate(
    tagDate = as.Date(parse_date_time(tagDate, 'mdy')),
    tagTime = format(parse_date_time(tagTime, 'HMS'),
                     format = '%H:%M:%S')
  )

Basically, you just change the format of the display for tagTime.

Kim
  • 4,080
  • 2
  • 30
  • 51
  • Just curious, can `setClass` be at all used for this requirement? I can work with this snippet of your's. – cogitoergosum May 10 '18 at 05:11
  • 1
    Hmm, I haven't tried putting your data into an actual csv, but maybe try `setAs("character","tagTime", function(from) format(as.POSIXct(from, format="%H:%M:%S"), format = '%H:%M:%S'))` and see if it works? – Kim May 10 '18 at 05:13
1

Probably using hms package is an option to convert time column from character and store as difftime.

library(hms)

setClass('tagDate')
setClass('tagTime')
setAs("character","tagDate", function(from) as.Date(from, format="%d/%m/%y"))
setAs("character","tagTime", function(from) as.hms(from)) #Changing to difftime
df <- read.csv(textConnection(text),header = FALSE,
                  colClasses=c('tagDate','tagTime','character','real'))
names(df) <- c('tagDate', 'tagTime', 'tagName', 'tagValue')
head(df)

#       tagDate  tagTime     tagName   tagValue
# 1  2018-05-10 00:00:03  SP-PALL-01  0.0500000
# 2  2018-05-10 00:00:03 SP-PDALL-05  0.1000000
# 3  2018-05-10 00:00:03   PT33-PT34  0.2124968
# 4  2018-05-10 00:00:03   PT32-PT36  0.4283838
# 5  2018-05-10 00:00:03  SP-PDAH-03  0.6000000
# 6  2018-05-10 00:00:03   PT32-PT34  0.6058416
# 7  2018-05-10 00:00:03  SP-PDAH-04  0.7000000
# 8  2018-05-10 00:00:03       PT-31  2.4700246
# 9  2018-05-10 00:00:03   PT32-PT31 12.3945660
# 10 2018-05-10 00:00:03       PT-32 14.7820790

Data:

text <- "10/05/18,0:00:03,SP-PALL-01,0.05
10/05/18,0:00:03,SP-PDALL-05,0.1
10/05/18,0:00:03,PT33-PT34,0.21249676
10/05/18,0:00:03,PT32-PT36,0.42838383
10/05/18,0:00:03,SP-PDAH-03,0.6
10/05/18,0:00:03,PT32-PT34,0.60584164
10/05/18,0:00:03,SP-PDAH-04,0.7
10/05/18,0:00:03,PT-31,2.4700246
10/05/18,0:00:03,PT32-PT31,12.394566
10/05/18,0:00:03,PT-32,14.782079"
MKR
  • 19,739
  • 4
  • 23
  • 33