I have the following data frame:

dat <- data.frame(toys = c("yoyo", "doll", "duckie", "tractor", "airplaine", "ball", "racecar", "dog", "jumprope", "car", "elephant", "bear", "xylophone", "tank", "checkers", "boat", "train", "jacks", "truck", "whistle", "pinwheel"),
                  price = c(1.22, 2.75, 1.85, 5.97, 6.47, 2.16, 7.13, 4.57, 1.46, 5.18, 3.16, 4.89, 7.11, 6.45, 4.77, 8.04, 6.71, 2.31, 6.21, 0.98, 0.87))

I now want to get all combinations of toys when selecting 7 to 14 toys at a time. Following this thread (Unordered combinations in R), I'm using the combinations function from the arrangements package:

library(arrangements)
combs <- lapply(7:14, combinations, x = dat$toys)

str(combs) shows a list of length 8 in which each element is a two-dimensional factor, e.g.

test <- combs[[1]]
dim(test)

However, when I convert a list element to a data frame, I get a data frame with only one column, whereas I would expect 7 columns from as.data.frame(combs[[1]]).

If I use an integer or character vector in the combinations function above, all works as expected, e.g. with:

combs2 <- lapply(7:14, combinations, x = as.character(dat$toys)) # or
combs3 <- lapply(7:14, combinations, x = 1:21)

test2 <- as.data.frame(combs2[[1]])
test3 <- as.data.frame(combs3[[1]])

I get a proper data frame with several columns.

Why is my code working with integers and characters, but not with factors?

deschen
  • How do you convert list elements to data frame? How does it work as expected with character or integer vector ? – Ronak Shah Nov 28 '19 at 15:04
  • I adjusted my post above. Hope it makes it more clear now. – deschen Nov 28 '19 at 15:48
  • I tried the above code but I wasn't able to replicate the result. `as.data.frame(combs[[1]])` returned a seven-column data frame. – SDS0 Nov 28 '19 at 16:42
  • To clarify, this was for the factor case, not the string/int case. – SDS0 Nov 28 '19 at 16:43
  • Are you sure? I just retried and again get a one-column data frame with 813960 rows for the factor case. (I just edited my post to make the examples less ambiguous, i.e. I changed the object names to combs/combs2/combs3.) – deschen Nov 28 '19 at 17:01

1 Answer

When you call combinations, the underlying C code sets the dim attribute on the output. When the input is a character, numeric or integer vector, the result becomes a matrix, which you can then turn into a data.frame.

We can try this in R for characters and integers (like you have shown):

x = 1:4
attr(x,"dim") <- c(2,2)
class(x)
[1] "matrix"
dim(data.frame(x))
[1] 2 2

x = as.character(1:4)
attr(x,"dim") <- c(2,2)
class(x)
[1] "matrix"
dim(data.frame(x))
[1] 2 2

Note that in both cases above you get the correct dimensions and the class matrix back. With a factor, R doesn't complain, but you get a two-dimensional factor instead:

x = factor(1:4)
attr(x,"dim") <- c(2,2)
class(x)
[1] "factor"
str(x)
Factor[1:2, 1:2] w/ 4 levels "1","2","3","4": 1 2 3 4

However, it is not a matrix although it looks like one:

x
     [,1] [,2]
[1,] 1    3   
[2,] 2    4   
Levels: 1 2 3 4
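
One way to see the difference is to compare what the two objects inherit from. The sketch below uses only base R; `m` is a name introduced here for illustration. Because the factor carries an explicit class attribute, setting dim does not make it inherit from "matrix", so the matrix method of as.data.frame is presumably never reached.

```r
# A factor with a dim attribute (as in the answer above)
x <- factor(1:4)
attr(x, "dim") <- c(2, 2)

# An integer vector with the same dim attribute, for comparison
m <- 1:4
attr(m, "dim") <- c(2, 2)

# The integer vector has no class attribute, so its implicit class is matrix
inherits(m, "matrix")   # TRUE

# The factor keeps its explicit class attribute, so it is not treated as a matrix
inherits(x, "matrix")   # FALSE

# The dim attribute is still stored on the factor, next to levels and class
names(attributes(x))
```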

Converting it into a data.frame then fails:

as.data.frame(x)
   x.1  x.2
1    1    3
2    2    4
3 <NA> <NA>
4 <NA> <NA>
Warning message:
In format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,  :
  corrupt data frame: columns will be truncated or padded with NAs
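
If you already have such a two-dimensional factor, a possible way out is to rebuild a character matrix from it before converting. This is a minimal base R sketch, not something the arrangements package does for you; `m` and `df` are names made up for the example.

```r
# A two-dimensional factor like the one combinations() returns for factor input
x <- factor(c("a", "b", "c", "d"))
attr(x, "dim") <- c(2, 2)

# Rebuild a plain character matrix with the same dimensions
m <- array(as.character(x), dim = dim(x))

# A character matrix converts to a data frame with one column per matrix column
df <- as.data.frame(m, stringsAsFactors = FALSE)
dim(df)   # 2 2
```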

My guess is that you were very lucky with the combinations of 7 to 14. If you try lower numbers, it fails:

data.frame(combinations(dat$toys,5))
Error in `[.default`(xj, i, , drop = FALSE) : subscript out of bounds
data.frame(combinations(dat$toys,2))
# throws the same error as above
StupidWolf
  • Thanks for your explanation. From your point of view, is that behaviour intended or rather a bug? And would you have any idea how to get a data frame result with factors as well? – deschen Nov 29 '19 at 16:52
  • @deschen, I think it's not meant for factors. A matrix of factors doesn't really make sense, because every row-wise and column-wise function would need to carry along the full set of levels of the matrix. How does that help you? If you really want a data.frame of factors, you need to get the characters first and then go through every column, making it a factor with the original levels. – StupidWolf Nov 29 '19 at 21:05
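
The approach described in the last comment could be sketched roughly as follows. It assumes the arrangements package is installed and uses a shortened toy vector to keep the output small; `toys`, `m` and `df` are names chosen for this example.

```r
library(arrangements)

# A small factor standing in for dat$toys
toys <- factor(c("yoyo", "doll", "duckie", "tractor", "ball"))

# Run combinations() on the character values so the result is a plain matrix
m <- combinations(x = as.character(toys), k = 3)

# Convert to a data frame without creating factors yet
df <- as.data.frame(m, stringsAsFactors = FALSE)

# Turn every column back into a factor with the original levels
df[] <- lapply(df, factor, levels = levels(toys))

str(df)
```

For the original question, the same steps could be wrapped in a function and applied over 7:14 with lapply.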