Let's say I have a table of dimensions N × M. I want to find a systematic way to rank the columns by how much each one increases the number of unique rows, preferably in R.

Brian Tompsett - 汤莱恩
Mouad_Seridi
  • 1
    Please provide a small example data and expected output based on that. – akrun Sep 16 '15 at 16:08
  • 1
    It's good practice to provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) when you ask a question. – Jaap Sep 16 '15 at 16:08

2 Answers

1

Try this example:

#dummy data
df <- data.frame(a = c(1, 1, 1, 1),
                 b = c(1, 2, 3, 4),
                 c = c(1, 2, 2, 4))
#   a b c
# 1 1 1 1
# 2 1 2 2
# 3 1 3 2
# 4 1 4 4

# reorder the columns by their number of distinct values, most distinct first
n_unique <- sapply(df, function(col) length(unique(col)))
df[, order(n_unique, decreasing = TRUE)]
#   b c a
# 1 1 1 1
# 2 2 2 1
# 3 3 2 1
# 4 4 4 1
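
The ordering above counts distinct values in each column on its own, so two columns that carry overlapping information can both rank highly. One possible way to rank by contribution to unique rows is a greedy forward pass: at each step, add the column that yields the most unique rows together with the columns already chosen. This is only a sketch; `greedy_column_rank` is a hypothetical helper name, not something from base R or from this answer.

```r
# Hypothetical helper: greedily order columns so that each step adds the
# column giving the largest number of unique rows so far.
greedy_column_rank <- function(df) {
  remaining <- names(df)
  chosen <- character(0)
  n_rows <- integer(0)
  while (length(remaining) > 0) {
    # unique-row count if each remaining column were added to the chosen set
    counts <- sapply(remaining, function(col) {
      nrow(unique(df[, c(chosen, col), drop = FALSE]))
    })
    best <- remaining[which.max(counts)]  # ties go to the leftmost column
    chosen <- c(chosen, best)
    n_rows <- c(n_rows, max(counts))
    remaining <- setdiff(remaining, best)
  }
  # names give the ranking, values give the cumulative unique-row count
  setNames(n_rows, chosen)
}

df <- data.frame(a = c(1, 1, 1, 1),
                 b = c(1, 2, 3, 4),
                 c = c(1, 2, 2, 4))
greedy_column_rank(df)
```

With the dummy data, `b` alone already separates all four rows, so the remaining columns add nothing and their relative order is decided only by the tie-breaking rule.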
zx8754
0
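library(dplyr)

# tibble() replaces the deprecated data_frame()
test = tibble(a = c(1, 1, 1),
              b = c(1, 2, 2),
              c = c(1, 2, 3))

# unique rows of the full table
base = test %>% distinct()

# for each column, count how many unique rows are lost when it is dropped
nrow(base) -
  names(base) %>% sapply(function(name)
    base %>%
      select(-all_of(name)) %>%  # replaces the deprecated select_()
      distinct() %>%
      nrow())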
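
For readers who would rather avoid the dplyr dependency, the same leave-one-column-out idea can be sketched in base R. The function name `unique_rows_lost` is made up for illustration, and the sketch assumes a data frame with at least two columns.

```r
# Hypothetical helper: number of unique rows lost when each column is dropped
unique_rows_lost <- function(df) {
  total <- nrow(unique(df))
  sapply(names(df), function(col) {
    # unique rows remaining once `col` is removed
    total - nrow(unique(df[, names(df) != col, drop = FALSE]))
  })
}

test <- data.frame(a = c(1, 1, 1),
                   b = c(1, 2, 2),
                   c = c(1, 2, 3))

# columns whose removal costs the most unique rows come first
sort(unique_rows_lost(test), decreasing = TRUE)
```

Here only dropping `c` merges two rows, so `c` scores 1 while `a` and `b` score 0.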
bramtayl