2

Suppose we have generated a matrix A in which each column contains one of the combinations of n elements taken k at a time, so its dimensions are k × choose(n,k). Such a matrix is produced by the command combn(n,k). What I would like to get is another matrix B with dimensions (n-k) × choose(n,k), where each column B[,j] contains the n-k elements excluded from A[,j].

Here is an example of the method I use to get matrix B. Do you think it is safe to use? Is there another way?

n <- 5 ; k <- 3
(A <- combn(n,k))
(B <- combn(n,n-k)[,choose(n,k):1])

Another example

x<-c(0,1,0,2,0,1) ; k<- 4
(A <- combn(x,k))
(B <- combn(x,length(x)-k)[,choose(length(x),k):1])

That previous question of mine is part of this problem.
Thank you.

gd047

3 Answers

4

Using Musa's idea:

B <- apply(A,2,function(z) x[is.na(pmatch(x,z))])

For the first example:

B <- apply(A,2,function(z) (1:n)[is.na(pmatch((1:n),z))])
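
To see why the pmatch call copes with duplicates, it may help to trace a single column by hand. The sketch below assumes the second example from the question and takes z to be the first column of combn(x, 4); the names idx and left_over are only illustrative.

```r
x <- c(0, 1, 0, 2, 0, 1)
z <- c(0, 1, 0, 2)           # first column of combn(x, 4)

# With the default duplicates.ok = FALSE, each position of z can be
# matched at most once, so repeated values in x are consumed one by one.
idx <- pmatch(x, z)          # 1 2 3 4 NA NA

# Elements of x that found no partner in z are the excluded ones.
left_over <- x[is.na(idx)]   # 0 1
```

Note that pmatch compares the values as character strings and also allows partial matches, so with values such as 1 and 10 this approach may need extra care.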
gd047
2

Use the setdiff function:

N <- 5
m <- 2    
A <- combn(N,m)
B <- apply(A,2,function(S) setdiff(1:N,S))
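
The setdiff result can also serve as a reference for checking the reversed-column trick from the question. The following is only a sketch: reversal_matches is a hypothetical helper, and the check covers just the small values of n and k listed, not a general proof.

```r
# Hypothetical helper: TRUE when reversing the columns of combn(n, n - k)
# gives the same complements as applying setdiff() to each column of A.
reversal_matches <- function(n, k) {
  A <- combn(n, k)
  B_rev <- combn(n, n - k)[, choose(n, k):1, drop = FALSE]
  B_ref <- apply(A, 2, function(S) setdiff(seq_len(n), S))
  # apply() returns a plain vector when n - k == 1; restore the matrix shape
  B_ref <- matrix(B_ref, nrow = n - k)
  all(B_rev == B_ref)
}

# Check every k with 1 <= k < n for n = 2, ..., 6
checks <- sapply(2:6, function(n) {
  all(sapply(1:(n - 1), function(k) reversal_matches(n, k)))
})
print(checks)
```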

MODIFIED: The above works only when the vectors have unique values. For the second example, we write a replacement for setdiff that can handle duplicate values. We use rle to count the number of occurrences of each element in the two sets, subtract the counts, and then invert the run-length encoding:

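diffdup <- function(x,y){
  # Count how often each value occurs in x and in y
  rx <- do.call(data.frame,rle(sort(x)))
  ry <- do.call(data.frame,rle(sort(y)))
  # Line up the counts by value, keeping every value that occurs in x
  m <- merge(rx,ry,by='values',all.x=TRUE)
  # Values absent from y have a count of zero there
  m$lengths.y[is.na(m$lengths.y)] <- 0
  # Subtract the counts and expand them back into a vector
  rz <- list(values=m$values,lengths=m$lengths.x-m$lengths.y)
  inverse.rle(rz)
}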

x<-c(0,1,0,2,0,1) ; k<- 4
A <- combn(x,k)
B <- apply(A,2,function(z) diffdup(x,z))
Jyotirmoy Bhattacharya
  • Thanks. How must it be modified in order to work for the 2nd example too? – gd047 Mar 24 '10 at 15:18
  • Modified to add a solution for the second problem too. – Jyotirmoy Bhattacharya Mar 25 '10 at 01:28
  • Instead of this combination you could just reverse gd047 solution: `apply(A,2,function(S) x[setdiff(1:N,S)])` where `N<-length(x)`. – Marek Mar 25 '10 at 08:14
  • @marek. Tried it on the original post's second example but it doesn't work (assuming that i got the question right). The elements of S here are the values chosen while 1:N are potential indices. Would it make sense to take their set difference? – Jyotirmoy Bhattacharya Mar 25 '10 at 10:08
  • I was thinking about `N<-length(x); m<-k; (A<-combn(N,m)); apply(A,2,function(S) x[setdiff(1:N,S)])`, but the disadvantage of this is that we don't get `A` with elements of `x`. – Marek Mar 25 '10 at 11:10
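
The index-based idea from the comments can be extended so that both A and B hold the values of x. This is a sketch under the assumption that combinations are taken over positions of x rather than over its values; idx is an illustrative name, and the data are the second example from the question.

```r
x <- c(0, 1, 0, 2, 0, 1); k <- 4

# Combinations of positions 1..length(x), one per column
idx <- combn(length(x), k)

# Selected values and excluded values for each combination of positions
A <- apply(idx, 2, function(S) x[S])
B <- apply(idx, 2, function(S) x[-S])
```

Because the split is done by position, duplicate values in x need no special handling. One caveat: apply returns a plain vector rather than a one-row matrix when a column function yields a single element, so the case length(x) - k == 1 would need the shape restored.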
1

Here is a more general solution (you can replace X with any vector containing unique entries):

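X <- 1:n
# Keep the elements of ref that do not occur in the column x
B <- apply(A, 2, function(x, ref) ref[!ref %in% x], ref = X)
# Every column yields n-k values, so apply() already returns a matrix
# (a plain vector if n-k is 1); do.call(cbind, B) would fail on a non-list.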

In your previous question x and y were not sets, but as long as the columns of A are proper sets (no repeated values), the code above should work.

teucer
  • Thank you but in most cases there will be duplicates as was the case in the referenced question. – gd047 Mar 22 '10 at 09:14
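
The limitation raised in the comment can be seen on one column of the question's second example. In this small sketch z is the first column of combn(x, 4), and by_value and by_set are illustrative names.

```r
x <- c(0, 1, 0, 2, 0, 1)
z <- c(0, 1, 0, 2)          # first column of combn(x, 4)

# Both approaches test membership by value, so every copy of 0 and 1
# is removed, not just the copies that z actually used.
by_value <- x[!x %in% z]    # numeric(0)
by_set   <- setdiff(x, z)   # numeric(0)
```

The expected complement here would be the two unused elements 0 and 1, which is why a count-aware or position-based method is needed when x has duplicates.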