Delete rows with duplicated record in two different columns

Question

I would like to delete rows which contain the same string in columns C1 and C3:

My df input:

C1       C2      C3
14-130n  NE03   14-130n
23-401n  NE05   21-130n
43-123n  NE04   43-121n

My expected final output:

C1       C2      C3
23-401n  NE05   21-130n
43-123n  NE04   43-121n

I tried `final <- df[!(df[,1] = df[,3]),]`, but it does not work. Any ideas? Cheers!

`=` won't work. You need `==`. – vrajs5, Jun 17 '14 at 08:57

score 4 · Accepted Answer · answered Jun 17 '14 at 08:55

For example:

df[!df$C1==df$C3,]

Where df:

df <- read.table(text='C1       C2      C3
14-130n  NE03   14-130n
23-401n  NE05   21-130n
43-123n  NE04   43-121n',header=TRUE,stringsAsFactors=FALSE)

If the columns are factors, coerce them to character before comparing:

df[as.character(df$C1)!=as.character(df$C3),]
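
A minimal sketch of the factor case, assuming the data was read with stringsAsFactors=TRUE (the default before R 4.0.0). The names df_f, direct, keep and final are only illustrative. Comparing two factors whose level sets differ raises an error, which is why both columns are converted to character first:

```r
# Hypothetical factor version of the example data (df_f is an illustrative name)
df_f <- read.table(text='C1       C2      C3
14-130n  NE03   14-130n
23-401n  NE05   21-130n
43-123n  NE04   43-121n', header=TRUE, stringsAsFactors=TRUE)

# Direct comparison fails here because C1 and C3 have different level sets;
# tryCatch captures the error message instead of stopping the script
direct <- tryCatch(df_f$C1 == df_f$C3, error=function(e) conditionMessage(e))

# Coercing to character compares the labels themselves
keep <- as.character(df_f$C1) != as.character(df_f$C3)
final <- df_f[keep, ]
```

With the example data, final should hold the NE05 and NE04 rows.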

score 2 · Answer 2 · answered Jun 17 '14 at 08:58

Would final <- subset(df, C1!=C3) serve the purpose?

Ricky

A `for` loop can also "serve the purpose". The question is not only "what" but also "how". See [here](http://stackoverflow.com/questions/9860090/in-r-why-is-better-than-subset) for reference. – David Arenburg, Jun 17 '14 at 09:10
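
The linked discussion is mainly about subset() relying on non-standard evaluation, which makes it awkward inside functions. One practical difference that is easy to check is how the two forms treat missing values. A small sketch, assuming a copy of the data with one NA added (df_na and the result names are illustrative, not from the question):

```r
# Example data with one missing value added for illustration
df_na <- data.frame(C1=c("14-130n", "23-401n", NA),
                    C2=c("NE03", "NE05", "NE04"),
                    C3=c("14-130n", "21-130n", "43-121n"),
                    stringsAsFactors=FALSE)

# subset() drops rows where the condition evaluates to NA
by_subset <- subset(df_na, C1 != C3)

# `[` with a logical index returns an all-NA row for each NA in the condition
by_bracket <- df_na[df_na$C1 != df_na$C3, ]

# Wrapping the condition in which() discards the NA positions
by_which <- df_na[which(df_na$C1 != df_na$C3), ]
```

If the columns may contain NA, it is worth deciding explicitly which of these behaviours is wanted.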
