I have a CSV file with content like this:

VA1,VA2,2014-05-24,,2014-05-22 15:50:16
VA2,VA1,2014-05-24,2014-05-26,2014-05-22 15:50:16

How can I read it? By default, read.csv does not recognize dates in YYYY-MM-DD format. I've tried read.zoo, but I am not sure how to:

  1. indicate that two different formats of date & time are used: YYYY-MM-DD and YYYY-MM-DD HH:MM:SS;
  2. indicate that empty values are possible.

Here is what I've tried:

library(zoo)
colClasses <- c("factor", "factor", "Date", "Date", "Date")
fmt <- "%Y-%m-%d"
z <- read.zoo("file.csv", header = FALSE, sep = ",", quote = "", format = fmt, tz = "", colClasses = colClasses)
LA_
  • Possible dupe of [this](http://stackoverflow.com/questions/13022299/specify-date-format-for-colclasses-argument-in-read-table-read-csv) or [this](http://stackoverflow.com/questions/18390674/automatically-detect-date-columns-when-reading-a-file-into-a-data-frame). However, perhaps `fread` + `fasttime` is faster than a `read.table/zoo` hack. – Henrik Mar 27 '15 at 20:46

2 Answers

You can't use read.zoo to read that sort of data. It's meant for time series. Try the following instead; no packages are needed. The code below is written to be self-contained, but the text = Lines part could be replaced with the filename, e.g. read.table("myfile.dat", ...whatever...):

Lines <- "VA1,VA2,2014-05-24,,2014-05-22 15:50:16
VA2,VA1,2014-05-24,2014-05-26,2014-05-22 15:50:16"

DF <- read.table(text = Lines, sep = ",", as.is = TRUE, na.strings = "")
transform(DF, V3 = as.Date(V3), V4 = as.Date(V4), V5 = as.POSIXct(V5))

giving:

   V1  V2         V3         V4                  V5
1 VA1 VA2 2014-05-24       <NA> 2014-05-22 15:50:16
2 VA2 VA1 2014-05-24 2014-05-26 2014-05-22 15:50:16
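
As a possible variant of the same idea (a sketch, not part of the code above), the conversion can also be requested at read time through colClasses, which accepts "Date" and "POSIXct". The name DF2 is made up for illustration, and the POSIXct column is interpreted in the local time zone:

```r
# Sketch only: the inline text stands in for the real file.
Lines <- "VA1,VA2,2014-05-24,,2014-05-22 15:50:16
VA2,VA1,2014-05-24,2014-05-26,2014-05-22 15:50:16"

# na.strings = "" turns the empty field into NA before the Date conversion runs.
DF2 <- read.csv(text = Lines, header = FALSE, na.strings = "",
                colClasses = c("character", "character", "Date", "Date", "POSIXct"))

# Inspect the resulting column classes.
str(DF2)
```

This only works when every value is in a format that as.Date and as.POSIXct recognize by default, which is the case for the two formats in the question.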
G. Grothendieck

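You should be able to read the table with read.csv and then convert the date columns, such as columns 3 and 4, to date-time objects with as.POSIXct (rather than strptime, whose result is a list and cannot be assigned to a data frame column this way):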

For column 3:

data[,3] = as.POSIXct(as.character(data[,3]))

For column 4:

data[,4] = as.POSIXct(as.character(data[,4]))
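
To illustrate why the strptime route runs into trouble, here is a minimal sketch. The sample values and the names x, lt, ct and d are made up for illustration, and UTC is assumed only to keep the example reproducible:

```r
x <- c("2014-05-24", "2014-05-26")   # made-up sample values

# strptime returns a POSIXlt object, which is stored as a list of
# components (sec, min, hour, ...), not as one value per date-time.
lt <- strptime(x, format = "%Y-%m-%d", tz = "UTC")
typeof(lt)

# as.POSIXct collapses that list into a plain numeric vector.
ct <- as.POSIXct(lt)
typeof(ct)

# A POSIXct vector can replace a data frame column directly.
d <- data.frame(V3 = x, stringsAsFactors = FALSE)
d[, "V3"] <- ct
class(d$V3)
```

Empty fields may still need attention: reading the file with na.strings = "" so they arrive as NA is probably the safest option before converting.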
xraynaud
  • Thanks. But it returns the following warning: `Warning In `[<-.data.frame`(`*tmp*`, , "mycolumnname", value = list(sec = c(NA_real_, : suggested 11 variables to replace 1 variable` (I'm not sure about the exact English wording, since my R reports errors in my local language). As a result, the values are converted to `numeric`. The same code with `as.Date` works well and returns `Date`. – LA_ Mar 27 '15 at 20:54
  • OK, my bad, it doesn't work like that with strptime, as the object it creates is a list. I'm updating the answer to something that works. – xraynaud Mar 27 '15 at 21:31