Suppose I have the following data frame, data:

V1   V2
A    3
A    2  
A    1
B    2
B    3
C    4
C    3
C    1
C    2

Now I want to extract information for each level of V1 (A, B and C). For example, if I want the sum of the V2 values within each level of V1, what should the code be? The output I want is:

      V1    V2
      A     6
      B     5
      C     10

I tried lapply and sapply, but they do not give the information I want. I also tried sapply(data, unique), which made no sense.

Also (this may be a bit trickier), if I want to see the unique values of V2 that appear in every level of V1, how do I do it? Thanks!

madmathguy
  • Can you show the expected output, as it is not clear? Do you need `library(data.table);setDT(data)[, if(uniqueN(V1)>1) .SD ,.(V2)]`? – akrun Jul 08 '16 at 04:27
  • Do you just want `unique(data)` ? Or maybe this is helpful - http://stackoverflow.com/questions/18201074/find-how-many-times-duplicated-rows-repeat-in-r-data-frame/18201245 ? – thelatemail Jul 08 '16 at 04:52
  • @thelatemail, actually the link you gave is not exactly what I want. I want how many values each of A,B & C has and what values are common in them. – madmathguy Jul 08 '16 at 05:14

3 Answers

I think this is what you want, in that it will find unique values which are common across different groups:

Common V2 values in each level of V1

Reduce(intersect, split(dat$V2, dat$V1))
#[1] 3 2

Common V1 values in each level of V2

Reduce(intersect, split(dat$V1, dat$V2))
#[1] "C"
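
For anyone who wants to check this on the example, here is a minimal self-contained sketch. It assumes the question's data frame is rebuilt under the name `dat` with `V1` stored as character; the intermediate name `by_level` is only for illustration. `split` gives one vector of V2 values per level of V1, and `Reduce` folds `intersect` over that list so only values present in every level survive.

```r
# Rebuild the example from the question (the name `dat` is an assumption)
dat <- data.frame(
  V1 = c("A", "A", "A", "B", "B", "C", "C", "C", "C"),
  V2 = c(3, 2, 1, 2, 3, 4, 3, 1, 2),
  stringsAsFactors = FALSE
)

# One vector of V2 values per level of V1:
# A -> 3 2 1, B -> 2 3, C -> 4 3 1 2
by_level <- split(dat$V2, dat$V1)

# Fold intersect() across the list; only values found in every level remain
common <- Reduce(intersect, by_level)
common
# [1] 3 2
```

The result keeps the order in which the values appear in the first level, which is why it prints 3 before 2.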
thelatemail

Using data.table, we can find the unique values in 'V2' that are common across 'V1'.

library(data.table)
setDT(data)[,uniqueN(V1)==uniqueN(data$V1) , by = V2][(V1)]$V2
#[1] 3 2

and the common 'V1' in each unique element of 'V2'

setDT(data)[, if(uniqueN(V1)==1) .SD , by = V2]$V1
#[1] "C"
akrun

Maybe this is helpful

output <- aggregate(data=df,V2~.,FUN=paste)

To extract the values of V2 that are present in all the levels of V1, use this:

Reduce(intersect,output$V2)
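
The first part of the question asks for the sum of V2 within each level of V1, which the `aggregate` call above does not compute. A minimal base R sketch, assuming the question's data frame is rebuilt as `df`, swaps `paste` for `sum`. The names `sums` and `values` are only for illustration; the second step lists the distinct V2 values per level as numbers rather than strings.

```r
# Rebuild the example from the question (the name `df` is an assumption)
df <- data.frame(
  V1 = c("A", "A", "A", "B", "B", "C", "C", "C", "C"),
  V2 = c(3, 2, 1, 2, 3, 4, 3, 1, 2),
  stringsAsFactors = FALSE
)

# Sum of V2 within each level of V1
sums <- aggregate(V2 ~ V1, data = df, FUN = sum)
sums
#   V1 V2
# 1  A  6
# 2  B  5
# 3  C 10

# Distinct V2 values taken by each level of V1, kept as a named list
values <- lapply(split(df$V2, df$V1), unique)
```

`values` stays a list because the levels have different numbers of values; calling `unlist` on it would flatten everything into a single vector.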
user2100721
  • Thanks @user2100721 , it is almost what I want. The only thing is it is giving output in list when I am trying to extract the `V2` values and using the `unlist` function is producing one single vector. – madmathguy Jul 08 '16 at 05:05
  • Yes, that is perfectly fine. For the 2nd part, I want to extract the unique values which are present in all levels. In this case A:3,2,1,B:2,3,C:4,3,1,2. So the unique values will be 2 & 3. How to get them? – madmathguy Jul 08 '16 at 05:25
  • Okay, let me illustrate with the example I provided. From the first part of my query, I want to show that A takes values 3,2,1, B takes values 2,3 and C takes values 4,3,1,2. For the 2nd part, we can see that the values 2 & 3 are common to A, B & C. I want the output as 2 & 3. – madmathguy Jul 08 '16 at 05:44