0

I have a data frame with one column like this:

col1
line1
line1
line2

I tried to remove the duplicates using this:

df2 <- df[!duplicated(df), ]

but it returns a large factor instead of a data frame with the duplicates removed. The output of str() looks something like this:

str(df2)
 Factor w/ 7472 levels
Teres
  • Welcome to Stack Overflow! We ask that for questions involving troubleshooting code that you provide a reproducible example. You can use `dput()` to share the data. – Hack-R Oct 15 '16 at 13:20

2 Answers

2

When your data frame has just one column, row subsetting with [ drops the result to a plain vector by default (here, a factor). You need to use drop = FALSE to get a data frame back:

df2 <- df[!duplicated(df), , drop = FALSE]

Another option is to use the unique function:

df2 <- unique(df)

The result of both approaches is the same:

> df2
   col1
1 line1
3 line2
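
To see the difference that drop makes, here is a minimal sketch. It assumes the data from the question and assumes col1 was read in as a factor, which stringsAsFactors = TRUE reproduces (that was the default before R 4.0). The names dropped and kept are only illustrative.

```r
# Hypothetical one-column data frame modelled on the question.
# stringsAsFactors = TRUE is an assumption that mimics the factor column.
df <- data.frame(col1 = c("line1", "line1", "line2"),
                 stringsAsFactors = TRUE)

# Default drop = TRUE: a single remaining column is simplified to a vector.
dropped <- df[!duplicated(df), ]

# drop = FALSE keeps the data frame structure.
kept <- df[!duplicated(df), , drop = FALSE]

class(dropped)  # "factor"
class(kept)     # "data.frame"
nrow(kept)      # 2
```

If col1 were a character column instead, dropped would be a character vector rather than a factor, but the fix would be the same.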
h3rm4n
  • You don't need drop except when there's only 1 column (and if there's only 1 column why would you want a data.frame?). – Hack-R Oct 15 '16 at 13:25
  • `drop = FALSE` is indeed only needed when you have one column in your dataframe (which is the case as OP described) – h3rm4n Oct 15 '16 at 13:28
0
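# Example data: a second column x is added, so df has two columns
col1 <- c("line1",
          "line1",
          "line2")

df <- data.frame(col1 = col1, x = c(1, 2, 3))

# Keep only the first row for each distinct value of col1
df1 <- df[!duplicated(df$col1), ]
df1
#    col1 x
# 1 line1 1
# 3 line2 3

# With more than one column, the result is still a data frame
class(df1)
# [1] "data.frame"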

Hack-R