R programming. Dataframe and subsets

Question

How do I select only the unique sets from a data frame, where one set = one row of the data frame? What is the syntax in R? I want set semantics. See this example:

Output:

1 1 2

1 2 3

Here row1 and row2 form the same set, {1,2}, so I need only one copy of such rows.

This is my code for the apriori algorithm. The function trim(data,r) is what I have been trying as a solution, but it isn't working.

uniqueItemSets<-function(data){

    #unique items in basket
    items <- c()

        for(j in c(1:ncol(data))){
            items <- c(items,unique(data[,j]))      
        }

        items <- unique(items)
    #return(as.list(items))
    return(items)
}
F_itemset<-function(data,candidate,sup){

    count <- rep(0,nrow(candidate))

    for(i in c(1:nrow(data))){          #every transaction
        for(j in c(1:nrow(candidate))){ #every candidate itemset
            x <- candidate[j,]
            #x <- uniqueItemSets(x)

            y <- data[i,]
            #y <- uniqueItemSets(y)     
            if(all(x %in% y)){
                count[j] <- count[j] + 1
            }
        }       
    }
    #pruning
    pp<-cbind(candidate,count)
    pp<-as.data.frame(pp)
      pp<-subset(pp,pp$count>=sup)

    return(pp)
}
#k-itemset :k-value
makeItemSet<- function(candidate,k){

    l<-combn(candidate,k,simplify=FALSE)
    return(l)
} 

aprio<-function(data,sup,conf,kmax){

    C <- uniqueItemSets(data)
    C <- as.data.frame(C)
    for(k in c(2:kmax))
    {
        F <- F_itemset(data,C,sup)
        F$count <- NULL
        if(nrow(F)<k){
            break;
        }
        F<-t(F)
        C <- combn(F,k,simplify=FALSE)
        C <- as.data.frame(C)
        C <- t(C)   #transpose
        C<-unique(C)
        trim(C,1)
    }
    return(F)
}

new <- data.frame()
trim<-function(data,r)
{
    x<-as.data.frame(data[r,])
    c<-c()
    for(j in c(1:ncol(x))){
        c<-c(c,x[,j])
    }
    c<-unique(c)
    if(r+1<=nrow(data)){
        for(i in c((r+1):nrow(data))){
            t<-c()
            for(j in c(1:ncol(data))){
                t<-c(t,data[i,j])
            }
            t<-unique(t)
            if(all(t %in% c) && all(c %in% t))
            {
                data[-i,]
            }
        }
        new <- as.data.frame(data)
        if(r+1 < nrow(data)){
            trim(data[r+1:nrow(data),],r+1)
        }
    }

}

If you are wondering why all the down votes: 1. SO is not here for doing your homework. 2. You show no code effort. Reading [this post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) will help for future questions — phiver, Mar 25 '18 at 07:48

score 0 · Answer 1 · answered Mar 25 '18 at 12:49

You can use apply() with MARGIN = 1 to run a function row-wise. The one thing to be aware of is that apply() returns each row's result as a column, so you need to transpose the outcome with t() to get the rows back in the orientation you need.

d <- data.frame(number1 = c(1,1,1),
                number2 = c(1,2,2),
                number3 = c(2,1,3))

# next two statements can be run in one line of code if you want
d_sort <- t(apply(d, 1, sort))

# get rid of duplicate rows
unique(d_sort)

     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    1    2    3
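
To see why the transpose is needed, here is a small sketch on a hypothetical two-row data frame (the name m and its values are made up for illustration and are not from the original post). With a non-square input, the flipped shape that apply() returns is easy to spot:

```r
# Hypothetical 2-row, 3-column data frame; rows are (3, 1, 2) and (1, 2, 3)
m <- data.frame(a = c(3, 1), b = c(1, 2), c = c(2, 3))

# apply() over rows stores each sorted row as a COLUMN of the result
s <- apply(m, 1, sort)
dim(s)          # 3 rows, 2 columns: one column per original row

# t() puts the sorted values back in one row per original row
m_sort <- t(s)
dim(m_sort)     # 2 rows, 3 columns

# Both rows sort to the same values, so only one row survives
m_unique <- unique(m_sort)
nrow(m_unique)
```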

answered Mar 25 '18 at 12:49 · phiver

It helps reduce my data frame. But consider this: row1: 1 1 2, row2: 1 2 2. After applying your code, both rows still remain, so the redundancy is still there. I need something like: if all(x %in% y) && all(y %in% x), then keep only one row among all such rows. You can see this approach in the trim function of my code. @phiver – Nikhil Saini Mar 25 '18 at 16:23
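
The set comparison described in this comment could be sketched as follows. This is one possible approach and not the answerer's method: each row is reduced to a key built from its sorted distinct values, and duplicated() then drops every row whose key has already been seen. The names row_key and d_unique are illustrative, the example data is assumed (the two rows from the comment plus the row 1 2 3 from the question), and the sketch assumes all columns share one type so that apply() does not reformat values when it converts the data frame to a matrix.

```r
# Assumed example data; rows are (1, 1, 2), (1, 2, 2) and (1, 2, 3)
d <- data.frame(number1 = c(1, 1, 1),
                number2 = c(1, 2, 2),
                number3 = c(2, 2, 3))

# One key per row: the row's distinct values, sorted and pasted together.
# Rows 1 and 2 both give "1 2"; row 3 gives "1 2 3".
row_key <- apply(d, 1, function(x) paste(sort(unique(x)), collapse = " "))

# Keep only the first row for each distinct set of values
d_unique <- d[!duplicated(row_key), , drop = FALSE]
d_unique
```

Two rows share a key exactly when all(x %in% y) && all(y %in% x) holds for them. If the items are strings that may themselves contain spaces, a different collapse separator would be needed to keep the keys unambiguous.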
