
I want to read a CSV file separated by ";" which contains four columns, such as:

16/12/2006;17:24:00;0;1
16/12/2006;17:25:00;2;3
16/12/2006;17:26:00;4;5

but I want a data frame with 3 columns rather than 4 (that is, the date and time from the first two columns merged into a single column).

So far, I have come up with this code to read the data, inspired by "Specify custom Date format for colClasses argument in read.table/read.csv". I would then merge the two columns somehow.

setClass("myDate")
setAs("character","myDate", function(from) as.Date(from, format="%d/%m/%Y") )
setClass("myTime")
setAs("character","myTime", function(from) as.Date(from, format="%H:%M:%S") )

data <- read.table(file = "file.csv", header = FALSE, sep = ";", colClasses =  c("myDate", "myTime", "numeric", "numeric"))

However, column V2 of the resulting data frame does not contain the time; it shows a date instead:

          V1         V2 V3 V4
1 2006-12-16 2016-03-04  0  1
2 2006-12-16 2016-03-04  2  3
3 2006-12-16 2016-03-04  4  5

Is the myTime class badly defined? If so, how should I change it?

Harald
  • Did you try as.POSIXlt instead of as.Date? You are reading times here, not dates. Also, IMO I'd read the columns in raw, paste them together, and then convert with a POSIX function, like here: http://stackoverflow.com/questions/35624659/how-to-find-the-difference-of-time-time-taken-to-process-a-file-in-r/35625097#35625097 – Buggy Mar 04 '16 at 10:53
  • Thank you for sharing the link, it was very helpful! – Harald Mar 04 '16 at 11:08
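
To illustrate the point made in the first comment, here is a minimal sketch of why as.Date cannot hold the time column. It assumes base R only; the variable names t_as_date and stamp are made up for illustration, and tz = "UTC" is an assumption because the question does not state a time zone.

```r
# as.Date() returns a Date, which stores a calendar day only.
# A time-only string supplies no day, month, or year, so those fields
# default to the current date and the time itself is discarded.
t_as_date <- as.Date("17:24:00", format = "%H:%M:%S")
print(class(t_as_date))   # "Date"
print(t_as_date)          # the date on which this code is run

# Parsing the date and the time together keeps both parts.
# tz = "UTC" is an assumption, not something the question specifies.
stamp <- as.POSIXct("16/12/2006 17:24:00",
                    format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
print(format(stamp, "%Y-%m-%d %H:%M:%S"))
```

This would explain the 2016-03-04 values in V2: they match the day the question was posted, which was presumably the day the code was run.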

1 Answer


Is there a particular reason why you want to do this during the import, and not after? It seems much easier to import the 4 columns, merge the date and time together using paste, and then use the lubridate package and its dmy_hms function to convert to proper date-time:

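library(lubridate)
# Read all four columns as they are
data <- read.table(file = "file.csv", header = FALSE, sep = ";")
# Join date and time into one string, e.g. "16/12/2006 17:24:00"
data$date_time <- paste(data$V1, data$V2)
# Parse day/month/year hour:minute:second into a date-time
data$date_time <- dmy_hms(data$date_time)
# Drop the original date and time columns
data[1:2] <- list(NULL)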
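
If adding a package is not an option, the same steps could be sketched in base R roughly as follows. This is only a sketch under assumptions: the three sample rows from the question are inlined through the text argument so that it runs without file.csv, the names csv_text, raw and result are hypothetical, and tz = "UTC" is assumed because the question does not state a time zone.

```r
# Sample rows from the question, inlined so the sketch runs without file.csv
csv_text <- "16/12/2006;17:24:00;0;1
16/12/2006;17:25:00;2;3
16/12/2006;17:26:00;4;5"

# Read the date and time as plain character columns
raw <- read.table(text = csv_text, header = FALSE, sep = ";",
                  colClasses = c("character", "character",
                                 "numeric", "numeric"))

# Paste the two columns, then parse them in a single step.
# tz = "UTC" is an assumption; adjust it to where the data was recorded.
date_time <- as.POSIXct(paste(raw$V1, raw$V2),
                        format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Build the three-column data frame with the merged column first
result <- data.frame(date_time = date_time, V3 = raw$V3, V4 = raw$V4)
str(result)
```

Building a new data frame, rather than deleting columns in place, puts date_time first, which is closer to the layout the question asks for.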
radiumhead