multiple columns comparison

Question

I have a data.frame that looks like this:

 Col1   Col2  
  a     3.4   
  a     3.4      
  d     3.2   
  c     3.2

I would like the following output:

 Col1  Col2  
  a    3.4      
  d    3.2   
  c    3.2

In other words, the row with "a" in "Col1" should appear only once, because it is duplicated exactly. The rows for "d" and "c" should both be kept even though they share the same value in "Col2", because they are different entities ("d" is different from "c").

Can anyone help me please?

score 5 · Answer 1 · answered Jan 14 '13 at 14:08

Try this:

DF <- read.table(text=" Col1   Col2
  a     3.4
  a     3.4
  d     3.2
  c     3.2 ", header=TRUE)
# Collapse Col2 to its unique value(s) within each Col1 group
aggregate(Col2~Col1, unique, data=DF)
#   Col1 Col2
# 1    a  3.4
# 2    c  3.2
# 3    d  3.2
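
One caveat worth noting: `aggregate` groups by `Col1` rather than comparing whole rows, so it matches the requested output only when each `Col1` value has a single `Col2` value. A small sketch of the difference, using a made-up data frame `DF2` (not from the question) in which "a" has two different values:

```r
# Hypothetical data: "a" appears with two different Col2 values
DF2 <- data.frame(Col1 = c("a", "a", "d"),
                  Col2 = c(3.4, 3.5, 3.2),
                  stringsAsFactors = FALSE)

# Grouping by Col1 yields one row per Col1 value
by_group <- aggregate(Col2 ~ Col1, data = DF2, FUN = unique)

# Comparing whole rows keeps every distinct (Col1, Col2) pair
by_row <- DF2[!duplicated(DF2), ]

nrow(by_group)  # 2
nrow(by_row)    # 3
```

If the goal is to drop exact row repeats only, the row-wise comparison is probably the safer reading of the question.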

Jilber Urbina

It is (indeed) just an alternative. I agree with you that `duplicated` is the more straightforward and correct approach; there is no special reason to use this alternative. – Jilber Urbina Jan 14 '13 at 17:01
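
For reference, the `duplicated` approach mentioned in this comment could be applied to the question's data roughly as follows. This is a minimal sketch; the data frame is rebuilt with `data.frame()` instead of `read.table()` purely for brevity, and `result` is an illustrative name:

```r
# The data frame from the question
DF <- data.frame(Col1 = c("a", "a", "d", "c"),
                 Col2 = c(3.4, 3.4, 3.2, 3.2),
                 stringsAsFactors = FALSE)

# duplicated() flags rows that are identical to an earlier row in every column
duplicated(DF)          # FALSE  TRUE FALSE FALSE

# Keep only the rows that are not repeats; the original row order is preserved
result <- DF[!duplicated(DF), ]
result$Col1             # "a" "d" "c"
```

Unlike the `aggregate` output, the rows stay in their original order (a, d, c), which matches the output requested in the question.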

score 4 · Accepted Answer · answered Jan 14 '13 at 14:07

> df <- read.table(header=TRUE, text='
+  label value
+      A     4
+      B     3
+      C     6
+      B     3
+      B     1
+      A     2
+      A     4
+      A     4
+ ')
> unique(df[duplicated(df),]) # Distinct rows that occur more than once
  label value
4     B     3
7     A     4
> df[duplicated(df),] # Every repeat of an earlier row (first occurrences excluded)
  label value
4     B     3
7     A     4
8     A     4
> df[!duplicated(df),] # Keeps the first occurrence of each row, dropping repeats
  label value
1     A     4
2     B     3
3     C     6
5     B     1
6     A     2
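
As a possible shorthand for the last call, `unique()` on a data frame should give the same result, and `duplicated()` can also be restricted to a subset of columns. A sketch using the same example data (the names `deduped` and `first_per_label` are only illustrative):

```r
df <- data.frame(label = c("A", "B", "C", "B", "B", "A", "A", "A"),
                 value = c(4, 3, 6, 3, 1, 2, 4, 4),
                 stringsAsFactors = FALSE)

# unique() on a data frame drops repeated rows, like df[!duplicated(df), ]
deduped <- unique(df)
identical(deduped, df[!duplicated(df), ])   # TRUE

# Passing only some columns to duplicated() compares on those columns alone;
# here the first row seen for each label is kept
first_per_label <- df[!duplicated(df["label"]), ]
first_per_label$label                        # "A" "B" "C"
```

For the question as asked, where whole rows must match, comparing all columns (the default) is the relevant case.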
