Delete rows with identical variables in R

Question

I'm currently trying to subset data to a smaller size and I'm having a problem with the coding part, as I'm a complete newbie in coding.

I'm trying to get rid of all rows whose value in a given column also appears in another row. For example, the code should eliminate all rows that share a value in column 3, "var 2". The duplicated function would only remove the second entry with "0", but I'd like to remove both entries with "0".

Appreciate your help! https://i.stack.imgur.com/esfSB.jpg

Do not post your data as an image; please learn how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) — Jaap, Jul 20 '16 at 11:28

score 1 · Answer 1 · answered Jul 20 '16 at 11:26

You could use the dplyr library for this kind of data manipulation; it's a neat and very helpful library. Assuming the data frame is stored in a variable called data_frame, the following code solves your problem:

library(dplyr)
# Group the rows by var2, then keep only the groups that contain a single row
data_frame <- as_tibble(data_frame) %>%
  group_by(var2) %>%
  filter(n() == 1)

I am storing the result in the same variable. You could use another variable name to keep the original data frame intact.
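For readers without dplyr, the same "keep only groups of size one" idea can be sketched in base R with ave(). This is a different technique from the dplyr code above, and the toy data frame and the name unique_rows are assumptions made up for illustration, not the asker's data.

```r
# Hypothetical toy data; the column names are assumptions, not the asker's data
data_frame <- data.frame(
  var1 = c("a", "b", "c", "d", "e"),
  var2 = c(0, 1, 0, 2, 3)
)

# For every row, count how many rows share its var2 value
group_size <- ave(seq_along(data_frame$var2), data_frame$var2, FUN = length)

# Keep the rows whose var2 value occurs exactly once
unique_rows <- data_frame[group_size == 1, ]
print(unique_rows)
```

Both rows with var2 equal to 0 are dropped, and the result is stored under a new name so the original data frame stays intact.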

score 0 · Answer 2 · answered Jul 20 '16 at 11:34

Here we use table to count how often each value occurs, then keep only the rows whose value is not among those that occur more than once.

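# Count the occurrences of each value in Var2
counts <- table(data$Var2)
# Drop rows whose value occurs more than once (as.numeric assumes Var2 is numeric)
data[!data$Var2 %in% as.numeric(names(counts[counts > 1])), ]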
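A minimal sketch of the table approach on made-up data may help here. The data frame below and the names counts, repeated and result are assumptions for illustration. The comparison is done on character strings instead of converting back with as.numeric, which is a small variation intended to keep the same line usable for non-numeric columns.

```r
# Hypothetical toy data standing in for the asker's data frame
data <- data.frame(
  id   = 1:5,
  Var2 = c(0, 1, 0, 2, 3)
)

# Count how often each Var2 value occurs
counts <- table(data$Var2)

# Names of the values that occur more than once (always character strings)
repeated <- names(counts[counts > 1])

# Compare as character so the same line also works for non-numeric columns
result <- data[!as.character(data$Var2) %in% repeated, ]
print(result)
```

Here repeated contains only "0", so the two rows with Var2 equal to 0 are removed and the rows with id 2, 4 and 5 remain.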

catastrophic-failure

score 0 · Answer 3 · answered Jul 20 '16 at 11:43

We can also combine duplicated with a second call that uses fromLast=TRUE, so that every occurrence of a repeated value is flagged and removed, not just the later ones.

df1[with(df1, !(duplicated(var2) | duplicated(var2, fromLast = TRUE))), ]
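To see why two calls are needed, here is a small sketch on made-up data. The data frame and the names dup_forward, dup_backward and result are hypothetical; only duplicated and its fromLast argument come from the answer.

```r
# Hypothetical toy data; var2 contains the value 0 twice
df1 <- data.frame(
  var1 = c("a", "b", "c", "d", "e"),
  var2 = c(0, 1, 0, 2, 3)
)

# Flags the second and later occurrences: FALSE FALSE TRUE FALSE FALSE
dup_forward <- duplicated(df1$var2)

# Scanning from the end flags the earlier occurrences: TRUE FALSE FALSE FALSE FALSE
dup_backward <- duplicated(df1$var2, fromLast = TRUE)

# A row is dropped if either scan flags it, so both rows with 0 disappear
result <- df1[!(dup_forward | dup_backward), ]
print(result)
```

Using duplicated alone would keep the first row with 0, which is the behaviour the question wants to avoid.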

akrun
