
I have a data.frame that looks like this:

 Col1   Col2  
  a     3.4   
  a     3.4      
  d     3.2   
  c     3.2 

I would like the following output:

 Col1  Col2  
  a    3.4      
  d    3.2   
  c    3.2 

In other words, the value "a" in "Col1" should appear only once, because its row is repeated exactly. The rows for "d" and "c" should both be kept, even though they share the same value in "Col2", because they are different entities ("d" is different from "c").

Can anyone help me please?

Bfu38

2 Answers


Try this:

DF <- read.table(text=" Col1   Col2  
  a     3.4   
  a     3.4      
  d     3.2   
  c     3.2 ", header=TRUE)
# One row per Col1 value, holding the distinct Col2 value(s) of that group
aggregate(Col2 ~ Col1, data=DF, FUN=unique)
#   Col1 Col2
# 1    a  3.4
# 2    c  3.2
# 3    d  3.2
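One caveat with this approach, sketched here on a hypothetical data frame `DF2` that is not part of the question: if a `Col1` value has more than one distinct `Col2` value, `unique` returns results of different lengths per group. As far as I can tell, `aggregate` then keeps one row per `Col1` value and stores `Col2` as a list column, rather than giving one row per distinct pair.

```r
# Hypothetical data (not from the question): "a" has two different Col2 values.
DF2 <- data.frame(Col1 = c("a", "a", "d"), Col2 = c(3.4, 3.5, 3.2))

res <- aggregate(Col2 ~ Col1, data = DF2, FUN = unique)

# Still one row per Col1 value; inspect how Col2 is stored.
str(res)

# Deduplicating whole rows keeps (a, 3.4) and (a, 3.5) as separate rows.
print(unique(DF2))
```

For the data in the question this does not matter, since every `Col1` value has a single `Col2` value.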
Jilber Urbina
    It is indeed just an alternative. I agree with you that `duplicated` is the more straightforward and correct approach; there is no special reason to prefer `aggregate` here. – Jilber Urbina Jan 14 '13 at 17:01
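A minimal sketch of the `duplicated` approach on the question's data. The names `by_row` and `by_col2` are only illustrative. It contrasts comparing whole rows, which is what the question asks for, with comparing `Col2` alone, which would wrongly drop "c".

```r
# Rebuild the data frame from the question.
DF <- read.table(text = "Col1 Col2
a 3.4
a 3.4
d 3.2
c 3.2", header = TRUE)

# Compare whole rows: only the second "a 3.4" row counts as a repeat.
by_row <- DF[!duplicated(DF), ]

# Compare Col2 only: "c" is dropped too, because 3.2 was already seen for "d".
by_col2 <- DF[!duplicated(DF$Col2), ]

print(by_row)
print(by_col2)
```

`by_row` keeps the "a", "d" and "c" rows, matching the requested output, while `by_col2` keeps only "a" and "d".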
> df <- read.table(header=TRUE, text='
+  label value
+      A     4
+      B     3
+      C     6
+      B     3
+      B     1
+      A     2
+      A     4
+      A     4
+ ')
> unique(df[duplicated(df),]) # Distinct rows that occur more than once
  label value
4     B     3
7     A     4
> df[duplicated(df),] # Every repeat of a row (the first occurrence is not flagged)
  label value
4     B     3
7     A     4
8     A     4
> df[!duplicated(df),] # Keeps the first occurrence of each distinct row
  label value
1     A     4
2     B     3
3     C     6
5     B     1
6     A     2
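As a follow-up sketch on the same example data: `unique()` applied to a data frame should give the same rows as `df[!duplicated(df), ]`, and the `fromLast` argument of `duplicated()` keeps the last occurrence of each row instead of the first. The names `first_kept`, `via_unique` and `last_kept` are only illustrative.

```r
df <- read.table(header = TRUE, text = "
label value
A 4
B 3
C 6
B 3
B 1
A 2
A 4
A 4
")

# Keep the first occurrence of each distinct row, in two ways.
first_kept <- df[!duplicated(df), ]
via_unique <- unique(df)
print(identical(first_kept, via_unique))

# Keep the last occurrence of each distinct row instead.
last_kept <- df[!duplicated(df, fromLast = TRUE), ]
print(last_kept)
```

Either way the result has one row per distinct (label, value) pair; only the surviving row names differ.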
Harpal