
I am trying to convert a table to a data frame.

Example:

tbl <- structure(c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L), .Dim = c(4L, 2L), .Dimnames = structure(list(
c("1", "2", "3", "4"), colNames = c("2013 3", "2014 12")), .Names = c("", "colNames")), class = "table")

   colNames
    2013 3 2014 12
  1      1       1
  2      0       0
  3      0       0
  4      0       0

Converting it to a data frame produces a completely different data structure. Why?

as.data.frame(tbl)

  Var1 colNames Freq
1    1   2013 3    1
2    2   2013 3    0
3    3   2013 3    0
4    4   2013 3    0
5    1  2014 12    1
6    2  2014 12    0
7    3  2014 12    0
8    4  2014 12    0
scs
  • I'm not convinced this is a dup @Jilber Urbina since the OP wants to know _why_ – hrbrmstr Oct 10 '18 at 21:14
  • From `?as.data.frame.table`: "The as.data.frame method for objects inheriting from class "table" can be used to convert the array-based representation of a contingency table to a data frame containing the classifying factors and the corresponding entries (the latter as component named by responseName)." I think that just means it is a design choice. (That said, it doesn't really answer the "why".) – Andrew Lavers Oct 10 '18 at 21:21
  • Not clear what "why" means here. What alternative is there that would behave consistently for tables with more or fewer than two dimensions, like `table(mtcars[9:11])`? – Frank Oct 10 '18 at 21:46
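
To illustrate the point in the last comment, here is a small sketch using the built-in `mtcars` data. The names `tab3` and `long3` are made up for this example. A three-way table has no single "rectangular" form, but the long layout still works: one column per classifying factor plus `Freq`, and one row per cell.

```r
# A three-dimensional contingency table (am x gear x carb)
tab3 <- table(mtcars[9:11])

# The long format scales to any number of dimensions
long3 <- as.data.frame(tab3)

names(long3)                     # one column per dimension, then Freq
nrow(long3) == prod(dim(tab3))   # one row per cell of the table
```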

1 Answer


Well, the precise reason "why" is that this is how as.data.frame.table is written. Here is its source code (just enter that name in an R console with no other punctuation to see it):

function(x, row.names = NULL, ..., responseName = "Freq", 
         stringsAsFactors = TRUE, sep = "", base = list(LETTERS))  {

  ex <- quote(
    data.frame(
      do.call(
        "expand.grid", 
        c(
          dimnames(provideDimnames(x, sep = sep, base = base)), 
          KEEP.OUT.ATTRS = FALSE, 
          stringsAsFactors = stringsAsFactors)
      ), 
      Freq = c(x), row.names = row.names)
  )
  names(ex)[3L] <- responseName
  eval(ex)

}
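
As a rough sketch of what that function body amounts to for the table in the question, the two steps can be run by hand. The names `grid` and `long` are only for illustration: `expand.grid()` builds every combination of the dimnames, and `c()` flattens the counts in the same column-major order.

```r
# The table from the question
tbl <- structure(
  c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L),
  .Dim = c(4L, 2L),
  .Dimnames = list(c("1", "2", "3", "4"), colNames = c("2013 3", "2014 12")),
  class = "table"
)

# Step 1: one row per combination of row label and column label.
# The unnamed first dimension gets the default name "Var1".
grid <- do.call(
  "expand.grid",
  c(dimnames(tbl), KEEP.OUT.ATTRS = FALSE, stringsAsFactors = TRUE)
)

# Step 2: attach the counts, flattened in column-major order
long <- data.frame(grid, Freq = c(tbl))

all.equal(long, as.data.frame(tbl))
```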

Ultimately, what you have with:

tbl <- structure(
  c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L), 
  .Dim = c(4L, 2L), 
  .Dimnames = structure(
    list(
      c("1", "2", "3", "4"), 
      colNames = c("2013 3", "2014 12")
    ), 
    .Names = c("", "colNames")
  ), 
  class = "table"
)

is an integer vector with some attributes. When you type tbl and hit <ENTER> in an R console, R calls print.table() (enter print.table with no other punctuation in an R console to see its source), which goes through some hoops to print what you see as a "rectangular" data structure.
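
A quick way to check this for yourself, as a minimal sketch using only base R:

```r
# The table from the question
tbl <- structure(
  c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L),
  .Dim = c(4L, 2L),
  .Dimnames = list(c("1", "2", "3", "4"), colNames = c("2013 3", "2014 12")),
  class = "table"
)

attributes(tbl)   # dim, dimnames and class
unclass(tbl)      # the same data without the "table" class: an integer matrix
as.vector(tbl)    # the underlying values, in column-major order
```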

To get your desired result, just do what the print function ultimately does (in a less straightforward way) and treat the table as a matrix:

as.data.frame.matrix(tbl)
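
For reference, a short sketch of what that call returns on the table from the question (the name `wide` is just for illustration). The row labels become row names, and the column labels are kept as they are, spaces included:

```r
# The table from the question
tbl <- structure(
  c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L),
  .Dim = c(4L, 2L),
  .Dimnames = list(c("1", "2", "3", "4"), colNames = c("2013 3", "2014 12")),
  class = "table"
)

wide <- as.data.frame.matrix(tbl)

names(wide)       # column labels of the table
rownames(wide)    # row labels of the table
wide[["2013 3"]]  # non-syntactic names need [[ ]] or backticks
```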

or using tidyverse idioms:

library(magrittr)  # provides the %>% pipe

as.data.frame(tbl) %>% 
  tidyr::spread(colNames, Freq)
##   Var1 2013 3 2014 12
## 1    1      1       1
## 2    2      0       0
## 3    3      0       0
## 4    4      0       0
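
In tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`. A sketch of the equivalent call, assuming a tidyr version that has `pivot_wider()` installed (the name `wide` is just for illustration):

```r
# The table from the question
tbl <- structure(
  c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L),
  .Dim = c(4L, 2L),
  .Dimnames = list(c("1", "2", "3", "4"), colNames = c("2013 3", "2014 12")),
  class = "table"
)

# Long format first, then one column per level of colNames
wide <- tidyr::pivot_wider(
  as.data.frame(tbl),
  names_from = colNames,
  values_from = Freq
)

names(wide)
```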
hrbrmstr