Selecting only unique values from a comma separated string

Question

I have a data frame that looks like this:

A B C
1 M 1,2
2 M 1,5,5,5
3 M 4,5,7,7

I want to keep only the unique values within each row of column C, to get this:

A B C
1 M 1,2
2 M 1,5
3 M 4,5,7

score 6 · Accepted Answer · edited Jan 09 '18 at 13:53


Assuming your C column is a character vector, as in this sample:

dd <- read.table(text="A B C
1 M 1,2
2 M 1,5,5,5
3 M 4,5,7,7", header = TRUE, stringsAsFactors = FALSE)

You can use strsplit to split each value on the comma, then paste the unique pieces back together. This code stores the result in a new column D:

dd$D <- sapply(strsplit(dd$C, ",", fixed = TRUE), function(x) 
    paste(unique(x), collapse = ","))
dd
#   A B       C     D
# 1 1 M     1,2   1,2
# 2 2 M 1,5,5,5   1,5
# 3 3 M 4,5,7,7 4,5,7
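
If you would rather overwrite C than add a new column, which is what the question's desired output shows, the same idea can be wrapped in a small function. This is only a sketch: `unique_csv` is a made-up helper name, not a base R function, and `vapply` is swapped in for `sapply` so the result is always a character vector.

```r
# Sample data from the answer above
dd <- read.table(text = "A B C
1 M 1,2
2 M 1,5,5,5
3 M 4,5,7,7", header = TRUE, stringsAsFactors = FALSE)

# unique_csv is a hypothetical helper name (not part of base R).
# It splits each string on `sep`, drops repeated pieces while keeping
# first-seen order, and pastes the remaining pieces back together.
unique_csv <- function(x, sep = ",") {
  parts <- strsplit(x, sep, fixed = TRUE)
  vapply(parts, function(p) paste(unique(p), collapse = sep), character(1))
}

# Overwrite C in place instead of creating D
dd$C <- unique_csv(dd$C)
dd$C
# [1] "1,2"   "1,5"   "4,5,7"
```

Note that `unique` keeps the first occurrence of each value, so the order of the remaining values is unchanged.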

edited Jan 09 '18 at 13:53

zx8754


answered May 16 '16 at 18:11

MrFlick



In case C is a factor, use `as.character(dd$C)` – Eric Lecoutre May 16 '16 at 18:20
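
To illustrate the comment above: a minimal sketch, assuming the data was read with `stringsAsFactors = TRUE` (an assumption made here to force the factor case). `strsplit` only accepts character input, so the column is converted first.

```r
# Same sample data, but read so that C becomes a factor
dd <- read.table(text = "A B C
1 M 1,2
2 M 1,5,5,5
3 M 4,5,7,7", header = TRUE, stringsAsFactors = TRUE)

is.factor(dd$C)
# [1] TRUE

# strsplit(dd$C, ...) would fail with "non-character argument",
# so convert the factor to character before splitting
dd$D <- sapply(strsplit(as.character(dd$C), ",", fixed = TRUE), function(x)
    paste(unique(x), collapse = ","))
dd$D
# [1] "1,2"   "1,5"   "4,5,7"
```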
