I must be missing something trivial, but I can't see why the `cast` call below turns my data frame into a list. I would like the output to be a data frame if possible.

It starts as

 str(d)
     'data.frame':   12 obs. of  4 variables:
     $ credit_id: num  12 12 12 12 18 ...
     $ Date     : Date, format: "2003-06-30" "2003-09-30" "2003-12-31" ...
     $ value    : num  840 854 847 834 4831 ...
     $ varb     : chr  "sales" "sales" "sales" "sales" ...

then I try to cast it

d<-cast(d,Date+credit_id~varb)

and I get

str(d)

    List of 4
     $ Date     : Date[1:9], format: "2003-06-30" "2003-06-30" "2003-09-30" ...
     $ credit_id: num [1:9] 12 18 12 18 12 ...
     $ ebitda   : num [1:9] NA NA NA NA NA ...
     $ sales    : num [1:9] 840 4831 854 4670 847 ...
     - attr(*, "row.names")= int [1:9] 1 2 3 4 5 6 7 8 9
     - attr(*, "idvars")= chr [1:2] "Date" "credit_id"
     - attr(*, "rdimnames")=List of 2
      ..$ :'data.frame':    9 obs. of  2 variables:
      .. ..$ Date     : Date[1:9], format: "2003-06-30" "2003-06-30" "2003-09-30" ...
      .. ..$ credit_id: num [1:9] 12 18 12 18 12 ...
      ..$ :'data.frame':    2 obs. of  1 variable:
      .. ..$ varb: chr [1:2] "ebitda" "sales"

Full code below. Thanks in advance.

    library(reshape)  # provides cast()
    d<-structure(list(credit_id = c(12, 12, 12, 12, 18, 18, 2073, 2073,
    2103, 2103, 1776, 1776), Date = structure(c(12233, 12325, 12417,
    12508, 12233, 12325, 15552, 15552, 15552, 15552, 15552, 15552
    ), class = "Date"), value = c(839.8, 853.9, 846.9, 833.7, 4831.2,
    4670, 54.1, 995, 90.944, 1092.8, 81.2, 1348.2), varb = c("sales",
    "sales", "sales", "sales", "sales", "sales", "ebitda", "sales",
    "ebitda", "sales", "ebitda", "sales")), .Names = c("credit_id",
    "Date", "value", "varb"), row.names = c(606799L, 606800L, 606801L,
    606802L, 606805L, 606806L, 1131814L, 1131822L, 1131950L, 1131958L,
    1132034L, 1132042L), class = "data.frame")
    head(d)
    str(d)
    d<-cast(d,Date+credit_id~varb)
    head(d)
    str(d)
Aidan
  • `class(d)` reveals that it is still a `data.frame`, which is a special form of a `list`. However, it is first of class `cast_df`, which has its own `str` method so that is why it looks different. – James Sep 27 '12 at 09:20
  • thanks @James. Maybe I should not be so scared of lists then :) – Aidan Sep 27 '12 at 09:45
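The point made in the comments can be checked without any package. The sketch below is only a stand-in: it builds a small data frame by hand and prepends a `cast_df` class to mimic the class vector James describes. It is not the object `cast` actually returns, and the column values are a few rows borrowed from the question.

```r
# Hypothetical stand-in for the result of cast(): an ordinary data frame
# with an extra leading class, as described in the comment above.
x <- data.frame(Date = as.Date(c("2003-06-30", "2003-09-30")),
                credit_id = c(12, 12),
                sales = c(839.8, 853.9))
class(x) <- c("cast_df", "data.frame")

is.data.frame(x)  # TRUE: the object still inherits from "data.frame"
is.list(x)        # TRUE: every data frame is a list of columns

# Dropping the extra class leaves a plain data frame.
plain <- x
class(plain) <- "data.frame"
class(plain)
```

Because the object still inherits from `data.frame`, functions that expect a data frame should generally accept it; only methods written for the extra class (such as its `str` method) behave differently.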

1 Answer

Use `dcast` from the reshape2 package. As `?dcast` explains, `dcast` returns a data frame.

library(reshape2) # provides dcast()
set.seed(001) # Generating some data.
credit_id <- sample(c(12,14,13,11), 10, TRUE)
Date  <- seq(Sys.Date(), length.out=10, by="1 day")
value <- rnorm(10,1000,50)
varb <- sample(c("ebitda", "sales"), 10, TRUE)

d <- data.frame(credit_id, Date, value, varb) # this is something like your df
str(d)

'data.frame':   10 obs. of  4 variables:
 $ credit_id: num  14 14 13 11 12 11 11 13 13 12
 $ Date     : Date, format: "2012-09-27" "2012-09-28" ...
 $ value    : num  959 1024 1037 1029 985 ...
 $ varb     : Factor w/ 2 levels "ebitda","sales": 1 2 1 1 2 2 2 1 2 1

d2 <- dcast(d, Date + credit_id ~ varb, value.var = "value") # name the value column explicitly
str(d2)

'data.frame':   10 obs. of  4 variables:
 $ Date     : Date, format: "2012-09-27" "2012-09-28" ...
 $ credit_id: num  14 14 13 11 12 11 11 13 13 12
 $ ebitda   : num  959 NA 1037 1029 NA ...
 $ sales    : num  NA 1024 NA NA 985 ...

d and d2 look like:

d
   credit_id       Date     value   varb
1         14 2012-09-27  958.9766 ebitda
2         14 2012-09-28 1024.3715  sales
3         13 2012-09-29 1036.9162 ebitda
4         11 2012-09-30 1028.7891 ebitda
5         12 2012-10-01  984.7306  sales
6         11 2012-10-02 1075.5891  sales
7         11 2012-10-03 1019.4922  sales
8         13 2012-10-04  968.9380 ebitda
9         13 2012-10-05  889.2650  sales
10        12 2012-10-06 1056.2465 ebitda

d2
         Date credit_id    ebitda     sales
1  2012-09-27        14  958.9766        NA
2  2012-09-28        14        NA 1024.3715
3  2012-09-29        13 1036.9162        NA
4  2012-09-30        11 1028.7891        NA
5  2012-10-01        12        NA  984.7306
6  2012-10-02        11        NA 1075.5891
7  2012-10-03        11        NA 1019.4922
8  2012-10-04        13  968.9380        NA
9  2012-10-05        13        NA  889.2650
10 2012-10-06        12 1056.2465        NA

Next time make sure to provide some data by using dput(dataframe) to make your question reproducible.
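For completeness, here is a minimal sketch of the same call applied to a few rows taken from the question's `dput` output, assuming reshape2 is installed. Only five of the twelve rows are used, and `value.var = "value"` is spelled out so that `dcast` does not have to guess the value column.

```r
library(reshape2)  # provides dcast()

# Five rows copied from the question's data (dates are days since 1970-01-01).
d <- data.frame(
  credit_id = c(12, 12, 18, 2073, 2073),
  Date      = as.Date(c(12233, 12325, 12233, 15552, 15552),
                      origin = "1970-01-01"),
  value     = c(839.8, 853.9, 4831.2, 54.1, 995),
  varb      = c("sales", "sales", "sales", "ebitda", "sales")
)

# One row per Date/credit_id pair, one column per level of varb.
d2 <- dcast(d, Date + credit_id ~ varb, value.var = "value")

class(d2)  # a plain "data.frame", with no extra cast_df class
d2
```

Pairs that have no `ebitda` entry get `NA` in that column, just as in the output above.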

Jilber Urbina
  • Thanks @Jilber, it's good to know about dcast. Sorry for not making things reproducible. I used dput and copied the result into the d<-structure(list(credit_id = c(12, 12, 12, ...... line from my question. I thought that was the way to use dput. I'll read over the posting guide again. – Aidan Sep 27 '12 at 09:47
  • Yes, that's the way to use `dput`. I almost forgot to mention that reshape is the older version of reshape2; see [this](http://stackoverflow.com/questions/12377334/reshape-vs-reshape2-in-r) question for an explanation. Glad to be useful. – Jilber Urbina Sep 27 '12 at 09:55