
When I use the following `read.zoo` call it works fine until I add the last line (my source is a CSV file, but here it is in a format for reproducing):

library(zoo)
 Lines <- "fdatetime,Consumption
    1,27/03/2015 01:00,0.04
    2,27/03/2015 02:00,0.04"


> z <- read.zoo(text = Lines, tz = "", format = "%d/%m/%Y %H:%M", sep = ",")
Error in read.zoo(text = Lines, tz = "", format = "%d/%m/%Y %H:%M", sep = ",") : 
  index has bad entry at data row 51

What's wrong with the last line? If I delete it, everything works!

> data.table::fread(file.choose(), verbose = TRUE)
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.000001 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... ','
Detected 3 columns. Longest stretch was from line 2 to line 30
Starting data input on line 2 (either column names or first row of data). First 10 characters: 1,25/03/20
Some fields on line 2 are not type character (or are empty). Treating as a data row and using default column names.
Count of eol: 51 (including 0 at the end)
Count of sep: 102
nrow = MIN( nsep [102] / ncol [3] -1, neol [51] - nblank [0] ) = 51
Type codes (   first 5 rows): 143
Type codes (+ middle 5 rows): 143
Type codes (+   last 5 rows): 143
Type codes: 143 (after applying colClasses and integer64)
Type codes: 143 (after applying drop or select (if supplied)
Allocating 3 column slots (3 - 0 dropped)
Read 51 rows. Exactly what was estimated and allocated up front
   0.000s (  0%) Memory map (rerun may be quicker)
   0.000s (  0%) sep and header detection
   0.000s (  0%) Count rows (wc -l)
   0.001s (100%) Column type detection (first, middle and last 5 rows)
   0.000s (  0%) Allocation of 51x3 result (xMB) in RAM
   0.000s (  0%) Reading data
   0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
   0.000s (  0%) Coercing data already read in type bumps (if any)
   0.000s (  0%) Changing na.strings to NA
   0.001s        Total
  • Have you checked if your last line has an \r\n (Carriage Return) at the end? maybe there are one or two empty lines at the end of the file. – BerndGit Feb 21 '16 at 14:45
  • The above does not produce an error for me (on Linux). Most likely the issue has to do with carriage return in the original file as pointed out by @BerndGit. – nrussell Feb 21 '16 at 14:48
  • As can be seen there is no carriage return. I tried it when using even 100 and more lines and the problem is in this specific line. Other files with same format works great. – Avi Feb 21 '16 at 14:55
  • @G. Grothendieck, In this case there is no need for header=TRUE – Avi Feb 21 '16 at 14:56
  • Are you able to read the file in with other functions? For example, does `data.table::fread("/path/to/actual/file", verbose = TRUE)` mention anything unusual? – nrussell Feb 21 '16 at 14:59
  • I use ts1<-read.csv (file.choose()) and when I see the content in R and in Notepad++ it looks good with no added spaces or characters and same as other files that were loaded OK. This files works great till I added this specific line. – Avi Feb 21 '16 at 15:01
  • Try using `data.table::fread` with `verbose = TRUE` specifically. Most likely you aren't going to actually *see* characters like `\r\n` by inspecting the file manually. – nrussell Feb 21 '16 at 15:05
  • Please find at the end of question body results for data.table::fread("/path/to/actual/file", verbose = TRUE) – Avi Feb 21 '16 at 15:10
  • @Avi, Good point about header=TRUE not being needed. I just tried it and in fact your `read.zoo` code worked for me so I can't reproduce the problem. – G. Grothendieck Feb 21 '16 at 15:24
  • If you copy the data and code from this question and paste it into your R session does it work in that case? – G. Grothendieck Feb 21 '16 at 15:29
  • No it doesn't work neither by using it as is nor by using it from CSV file. Any suggestion? – Avi Feb 21 '16 at 16:26
  • When I delete line 51 - like a magic, everything is OK. When line 51 is even a larger file - error, amazing!!!!! – Avi Feb 21 '16 at 17:05
  • Could it be related to daylight saving? I see that at least [some countries switched from Standard time to DST `27/03/2015 02:00`](http://www.timeanddate.com/time/dst/2015a.html). DST may cause some surprises (see [some related Q&A](http://stackoverflow.com/search?tab=votes&q=%5br%5d%20daylight%20saving)). – Henrik Feb 21 '16 at 18:32
  • 1
  • See [this Q&A](http://stackoverflow.com/questions/27361500/trouble-finding-non-unique-index-entries-in-zooreg-time-series). Same error as you: "_index has bad entries at data rows_". From the answer: "The referenced rows have timestamps that coincide with the changeover from Standard to DST, so these times do not exist in the US Eastern timezone" – Henrik Feb 21 '16 at 18:42
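Henrik's diagnosis can be checked directly: `27/03/2015 02:00` falls in a spring-forward gap in some timezones (Israel, for example, moved its clocks from 01:59 to 03:00 on that date; the asker's actual local zone is an assumption here). With `tz = ""`, R parses the index in the local zone, and a nonexistent local time can come back as `NA`, which is exactly what `read.zoo` rejects as a bad index entry. A minimal sketch:

```r
# 27 March 2015 02:00 never occurred in a zone whose clocks jumped
# from 01:59 straight to 03:00 that night (Asia/Jerusalem is used as
# an assumed example zone). The result for a time inside the gap is
# platform-dependent: NA on some systems, a shifted time on others.
as.POSIXct("27/03/2015 02:00", format = "%d/%m/%Y %H:%M", tz = "Asia/Jerusalem")

# UTC has no DST, so the same timestamp always parses to a valid time.
as.POSIXct("27/03/2015 02:00", format = "%d/%m/%Y %H:%M", tz = "UTC")
```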

1 Answer

Thanks to @Henrik, the solution is to specify `tz`, as follows:

# ts1 is the data frame read in with read.csv
z <- read.zoo(ts1, tz = "UTC", format = "%d/%m/%Y %H:%M", sep = ",")
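Applied to the reproducible data from the question, the fix looks like this (a sketch; `tz = "UTC"` sidesteps the DST gap that made `tz = ""` fail):

```r
library(zoo)

Lines <- "fdatetime,Consumption
1,27/03/2015 01:00,0.04
2,27/03/2015 02:00,0.04"

# UTC has no daylight saving transitions, so every timestamp parses
z <- read.zoo(text = Lines, tz = "UTC", format = "%d/%m/%Y %H:%M", sep = ",")
z
```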
  • 4
    Since the problem is the switch to daylight savings time you could alternately use chron instead of POSIXct as it has no time zones or daylight savings time: `library(chron); z <- read.zoo(text = Lines, FUN = as.chron, format = "%d/%m/%Y %H:%M", sep = ",")` – G. Grothendieck Feb 21 '16 at 23:06