
Hi, and apologies for the title; I don't know how to describe my issue. I have a table like this:

  color.1 color.2 color.3 color.4
1     red    blue     red     red
2    blue                    blue
3   green            blue   green
4    blue                    blue

I'd like to know how many times each tuple is repeated. For example, in this case it would be:

red blue red red = 1
blue         blue = 2
green    blue green = 1

I tried using the expand function and summary, but neither worked.

EDIT: I just discovered that the table function does something similar to what I want, but not in the table format that I want. Is it even possible to do this with a built-in function, or with any package?

user3276768
  • It might help future searchers if you change the word "tuple" to the word "row," or at least get the word "row" in there. I can imagine people searching for something like "count duplicate rows" and not finding this question. – shadowtalker Mar 12 '15 at 12:47
  • And on that point, there are a few other, possibly more elegant (depending on your use case and your coding style), solutions in the answers [here](http://stackoverflow.com/q/18201074) and [here](http://stackoverflow.com/a/16905545) – shadowtalker Mar 12 '15 at 12:51

2 Answers


It's not quite the format you asked for, but simply running table(df) should give you the data you want. Here's my dummy example:

> xx <- data.frame(A = c("a", NA, "a", "c"), B = c("b", "d", "a", NA))
> table(xx)
   B
A   a b d
  a 1 1 0
  c 0 0 0
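
To get closer to a one-row-per-combination layout, the contingency table can be flattened with as.data.frame(). The following is only a sketch built on the same dummy data. Note that table() drops rows containing NA by default; the useNA = "ifany" argument keeps them. The names flat, seen, and seen_na are made up for this illustration.

```r
# Same dummy data as in the example above.
xx <- data.frame(A = c("a", NA, "a", "c"), B = c("b", "d", "a", NA))

# Flatten the contingency table to long format: one row per (A, B) pair.
flat <- as.data.frame(table(xx))

# Keep only the combinations that actually occur in the data.
seen <- subset(flat, Freq > 0)
print(seen)

# Rows containing NA are dropped by default; useNA = "ifany" keeps them.
seen_na <- subset(as.data.frame(table(xx, useNA = "ifany")), Freq > 0)
print(seen_na)
```

With many columns the full table has one cell per possible combination of levels, so it can get large; the collapse-then-count approach avoids that.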

Danny

You can collapse rows first and then apply table:

> table(apply(d,1,paste,collapse=' '))

     blue - - blue green - blue green   red blue red red 
                 2                  1                  1 

where d is your sample data set,

d <- structure(list(color.1 = structure(c(3L, 1L, 2L, 1L), .Label = c("blue", 
"green", "red"), class = "factor"), color.2 = structure(c(2L, 
1L, 1L, 1L), .Label = c("-", "blue"), class = "factor"), color.3 = structure(c(3L, 
1L, 2L, 1L), .Label = c("-", "blue", "red"), class = "factor"), 
    color.4 = structure(c(3L, 1L, 2L, 1L), .Label = c("blue", 
    "green", "red"), class = "factor")), .Names = c("color.1", 
"color.2", "color.3", "color.4"), class = "data.frame", row.names = c(NA, 
-4L))

PS: Here is a much more elegant and efficient implementation of the same idea, suggested by David Arenburg in the comments:

table(do.call(paste, d))
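
If the counts are wanted as a data frame with the original columns kept separate (closer to the table format asked for in the question), one possible base R sketch uses aggregate(). This is an alternative technique, not part of the answer above. The data are rebuilt here with "-" standing in for the empty cells, and the helper column name n and the result name counts are arbitrary choices for this illustration.

```r
# Sample data mirroring the question; "-" marks the empty cells.
d <- data.frame(
  color.1 = c("red", "blue", "green", "blue"),
  color.2 = c("blue", "-", "-", "-"),
  color.3 = c("red", "-", "blue", "-"),
  color.4 = c("red", "blue", "green", "blue"),
  stringsAsFactors = TRUE
)

# Add a helper column of ones, then sum it within each distinct
# combination of the remaining columns.
counts <- aggregate(n ~ ., data = transform(d, n = 1), FUN = sum)
print(counts)
```

Each row of counts is one distinct combination of colors, with n holding how many times it appears in d.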
Marat Talipov