
I have difficulty converting between data frames and zoo objects: meaningful column names are not reliably preserved, and the univariate and multivariate cases behave inconsistently:

library(zoo)

#sample data, two species counts over time
t = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04"))
n1 = c(4, 5, 9, 7)  #counts of Lepisma saccharina
n2 = c(2, 6, 0, 11) #counts of Thermobia domestica
df = data.frame(t, n1, n2)
colnames(df) <- c("Date", "Lepisma saccharina", "Thermobia domestica")

#converting to zoo loses column names in univariate case...
> z1 <- read.zoo(df[,1:2]) #time series for L. saccharina
> colnames(z1)
NULL
> colnames(z1) <- c("Lepisma saccharina") #can't even set column name manually
Error in `colnames<-`(`*tmp*`, value = "Lepisma saccharina") : 
  attempt to set colnames on object with less than two dimensions
#... but not in multivariate case
> z2 <- read.zoo(df) #time series for both species
> colnames(z2)
[1] "Lepisma saccharina"  "Thermobia domestica"

To go back from a zoo object to a data frame in the original format, it's not enough to use as.data.frame since it won't include a Date column (the dates end up in the rownames): more work is needed.

zooToDf <- function(z) {
    df <- as.data.frame(z) 
    df$Date <- time(z) #create a Date column
    rownames(df) <- NULL #so row names not filled with dates
    df <- df[,c(ncol(df), 1:(ncol(df)-1))] #reorder columns so Date first
    return(df)
}

This works great on the multivariate case, but clearly can't recover a meaningful column name in the univariate case:

> df2b <- zooToDf(z2)
> df2b
        Date Lepisma saccharina Thermobia domestica
1 2012-01-01                  4                   2
2 2012-01-02                  5                   6
3 2012-01-03                  9                   0
4 2012-01-04                  7                  11

> df1b <- zooToDf(z1)
> df1b
        Date z
1 2012-01-01 4
2 2012-01-02 5
3 2012-01-03 9
4 2012-01-04 7

Is there a simple way to handle both univariate and multivariate cases? It seems z1 needs to remember the column name somehow.

Silverfish
  • Note to self: the basic problem of "drop" sometimes causing inconsistency between one and multi variable cases arises often with data frames, not just on conversion to zoo objects. See the FAQs for `data.table` http://datatable.r-forge.r-project.org/datatable-faq.pdf where the developers note "In `[.data.frame` we very often set `drop=FALSE`. When we forget, bugs can arise in edge cases where single columns are selected and all of a sudden a vector is returned rather than a single column data.frame. In `[.data.table` we took the opportunity to make it consistent and drop drop." – Silverfish Oct 31 '13 at 21:36
  • What is referred to as inconsistency in the question is how R works even without zoo. In fact, zoo is consistent with how R works. If it did not work that way it would be inconsistent. – G. Grothendieck Jun 12 '19 at 13:53
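The base R behaviour that both comments refer to can be reproduced without zoo. A minimal sketch, where the data frame `d` and its columns are made up purely for illustration:

```r
# Hypothetical two-column data frame, for illustration only
d <- data.frame(a = 1:3, b = 4:6)

v <- d[, "a"]                # one column selected: dropped to a plain vector
s <- d[, "a", drop = FALSE]  # drop = FALSE keeps a one-column data frame

is.data.frame(v)  # FALSE
colnames(v)       # NULL, a plain vector has no column names
is.data.frame(s)  # TRUE
colnames(s)       # "a"
```

This is the same mechanism that leaves z1 in the question without column names.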

5 Answers


If you don't want to drop dimensions, use drop=FALSE:

R> (z1 <- read.zoo(df[,1:2], drop=FALSE))
           Lepisma saccharina
2012-01-01                  4
2012-01-02                  5
2012-01-03                  9
2012-01-04                  7

You can do something like write.zoo if you want to include the zoo index as a column in your data.frame:

zoo.to.data.frame <- function(x, index.name="Date") {
  stopifnot(is.zoo(x))
  xn <- if(is.null(dim(x))) deparse(substitute(x)) else colnames(x)
  setNames(data.frame(index(x), x, row.names=NULL), c(index.name,xn))
}
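A short usage sketch for the helper above, rebuilding the question's sample data so that it runs on its own. The variable name `z1d` is introduced here only for illustration; it shows what the helper falls back to when dimensions have been dropped.

```r
library(zoo)

# Sample data from the question
df <- data.frame(
  Date = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04")),
  "Lepisma saccharina"  = c(4, 5, 9, 7),
  "Thermobia domestica" = c(2, 6, 0, 11),
  check.names = FALSE
)

# Helper copied from the answer above
zoo.to.data.frame <- function(x, index.name = "Date") {
  stopifnot(is.zoo(x))
  xn <- if (is.null(dim(x))) deparse(substitute(x)) else colnames(x)
  setNames(data.frame(index(x), x, row.names = NULL), c(index.name, xn))
}

z1 <- read.zoo(df[, 1:2], drop = FALSE)  # one-column zoo, column name kept
z2 <- read.zoo(df)                       # two-column zoo

names(zoo.to.data.frame(z1))  # "Date" plus the species name
names(zoo.to.data.frame(z2))  # "Date" plus both species names

# With the default drop = TRUE the series is a plain vector, so the helper
# can only use the name of the R variable ("z1d"), not the species name
z1d <- read.zoo(df[, 1:2])
names(zoo.to.data.frame(z1d))
```

So the helper handles both cases, but the species name only survives in the univariate case when the zoo object was created with drop=FALSE.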

UPDATE:

After trying to edit your question for brevity, I thought of an easy way to create df2b to your specifications (this will also work for z1 if you don't drop dimensions):

R> (df2b <- data.frame(Date=time(z2), z2, check.names=FALSE, row.names=NULL))
        Date Lepisma saccharina Thermobia domestica
1 2012-01-01                  4                   2
2 2012-01-02                  5                   6
3 2012-01-03                  9                   0
4 2012-01-04                  7                  11
Joshua Ulrich

To convert from data frame to zoo use read.zoo:

library(zoo)
z <- read.zoo(df)

Also note the drop argument, among others, documented in ?read.zoo.

To convert from zoo to data frame, including the index, use fortify.zoo:

fortify.zoo(z, name = "Date")

(If ggplot2 is loaded then you can just use fortify.)
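A sketch of how this could cover both the univariate and multivariate cases from the question, combining drop=FALSE with fortify.zoo. It assumes a zoo version in which fortify.zoo has a `names` argument; as far as I can tell, the `name = "Date"` spelling above reaches that same argument through R's partial argument matching.

```r
library(zoo)

# Sample data from the question
df <- data.frame(
  Date = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04")),
  "Lepisma saccharina"  = c(4, 5, 9, 7),
  "Thermobia domestica" = c(2, 6, 0, 11),
  check.names = FALSE
)

z1 <- read.zoo(df[, 1:2], drop = FALSE)  # univariate, column name kept
z2 <- read.zoo(df)                       # multivariate

# Assumption: 'names' is the argument that sets the index column name
df1b <- fortify.zoo(z1, names = "Date")
df2b <- fortify.zoo(z2, names = "Date")

names(df1b)
names(df2b)
```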

As mentioned in the comments below the question, the question and some of the other answers are either outdated or rest on significant misunderstandings. I suggest reviewing https://cran.r-project.org/web/packages/zoo/vignettes/zoo-design.pdf, which discusses the design philosophy of zoo, including consistency with R itself. zoo would certainly be a lot harder to use if you had to remember one set of defaults for R and another for zoo.

G. Grothendieck

There's a newer simple solution to this using the timetk package. It will convert several time series formats, including xts and zoo, to tibbles. Simply wrap in as.data.frame to get a data frame.

timetk::tk_tbl(zoo::read.zoo(df))
# A tibble: 4 x 3
  index      `Lepisma saccharina` `Thermobia domestica`
  <date>                    <dbl>                 <dbl>
1 2012-01-01                    4                     2
2 2012-01-02                    5                     6
3 2012-01-03                    9                     0
4 2012-01-04                    7                    11
hmhensen

I would take a slightly roundabout route: first write the zoo object to a CSV file, then read it back in as a data.frame. The index column is named "Index" by default, but you can change that with the index.name argument.

library(zoo)
date <-
  seq.Date(
    from = as.Date("2017-01-01"),
    to = as.Date("2017-01-10"),
    by = "days"
  )
value <- seq.int(from = 100, length.out = length(date))
vzoo <- zoo(value, date)
write.zoo(
  vzoo,
  index.name = "Date",
  file = "tmp.txt",
  sep = ",",
  col.names = TRUE
)
vzoo.df <- read.csv("tmp.txt", sep = ',')
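Applied to the question's data, the same round trip might look like the sketch below. The use of tempfile() and check.names = FALSE is my own addition, not part of the answer above: by default read.csv rewrites column names that contain spaces, which would lose the species names.

```r
library(zoo)

# Sample data from the question
df <- data.frame(
  Date = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04")),
  "Lepisma saccharina"  = c(4, 5, 9, 7),
  "Thermobia domestica" = c(2, 6, 0, 11),
  check.names = FALSE
)
z2 <- read.zoo(df)

# Temporary file instead of a fixed "tmp.txt" in the working directory
tmp <- tempfile(fileext = ".csv")
write.zoo(z2, file = tmp, index.name = "Date", sep = ",", col.names = TRUE)

# check.names = FALSE keeps names such as "Lepisma saccharina" intact
back <- read.csv(tmp, check.names = FALSE)
back$Date <- as.Date(back$Date)  # the index comes back as text
unlink(tmp)

names(back)
```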
0

You can simply wrap fortify.zoo in as.data.frame and assign the result to a new object:

z2 = as.data.frame(fortify.zoo(z, name = "Date"))

Ami