Here's a test dataframe:

x = c("a", "b")
y = c(1,1,2,3,4,4,5,6)
z = c(20,30,20,40,10,30,20,40)
Data = data.frame(x,y,z)
  x y  z
1 a 1 20
2 b 1 30
3 a 2 20
4 b 3 40
5 a 4 10
6 b 4 30
7 a 5 20
8 b 6 40

So there are two samples (a and b), and each y value either occurs in only one sample or is shared between both. I want to extract the rows of data whose y values are unique (not shared).

unique(Data$y) only gives me the vector of distinct y values, with the duplicates removed. Instead, I want the full rows for only those y values that are not repeated anywhere in the dataframe. How do I do this?

EDIT: Expected output would be a dataframe containing only those rows which contain y values that are not repeated in the original dataframe (not shared between a and b)

  x y  z
1 a 2 20
2 b 3 40
3 a 5 20
4 b 6 40
Jaap

1 Answer

You can use duplicated to index the dataframe. Note that this keeps the first row for each y value and drops only the later repeats.

Data[!duplicated(Data$y),]

gives

  x y  z
1 a 1 20
3 a 2 20
4 b 3 40
5 a 4 10
7 a 5 20
8 b 6 40
Remko Duursma
    That basically does the same thing as unique though, and takes out the duplicate so that you're left with one of each of all of the y values. Instead, I want JUST those that are NOT repeated. So I want something to return just the rows with y values 2,3,5,6. (1 and 4 are repeated). – Tori Sindorf Oct 27 '15 at 23:44
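
One way to get what the comment describes (only the rows whose y value appears exactly once) is to flag duplicates scanning from both ends of the vector and drop every flagged row. This is a minimal sketch rather than part of the original answer. It relies on the fromLast argument of base R's duplicated, and the names repeated and result are made up for illustration.

```r
# Rebuild the test dataframe from the question (x is recycled to length 8)
x <- c("a", "b")
y <- c(1, 1, 2, 3, 4, 4, 5, 6)
z <- c(20, 30, 20, 40, 10, 30, 20, 40)
Data <- data.frame(x, y, z)

# duplicated() marks the second and later occurrences of a value;
# with fromLast = TRUE it marks every occurrence except the last.
# Combining both with | marks every row whose y value occurs more than once.
repeated <- duplicated(Data$y) | duplicated(Data$y, fromLast = TRUE)

# Keep only the rows whose y value is not repeated anywhere
result <- Data[!repeated, ]

# Optional: renumber the rows 1..n to match the expected output in the question
rownames(result) <- NULL
print(result)
```

With the test data this should leave the four rows with y values 2, 3, 5 and 6. Without the rownames reset, the original row numbers (3, 4, 7, 8) are kept.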