data<-data.frame(x=c("a,b","c","a,b","d,e,f,g"))
        x
1     a,b
2       c
3     a,b
4 d,e,f,g

I would like to split the comma-separated values in column x and write each unique value into a new column y. How can I do this? Thank you! Column y should look like this:

  y
1 a
2 b
3 c
4 d
5 e
6 f
7 g
agstudy
YTW
  • Regex is not really needed here. Something like `unique(scan(text=as.character(data$x), sep=",", what=""))` would probably do it. `strsplit()` would be another option. – Rich Scriven Jul 11 '16 at 20:48
  • Or using `strsplit`. For example: `unlist(strsplit(as.character(data$x),","))` – agstudy Jul 11 '16 at 20:49
  • If the data is just comma separated, there is no need for a regex, really. Otherwise, it could look like `y <- unique(unlist(str_extract_all(data$x, "[^,]+")))` or something more specific. – Wiktor Stribiżew Jul 11 '16 at 20:51
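
The suggestions in these comments can be combined into a short base-R sketch that produces the data frame asked for in the question. The names `parts`, `parts_scan` and `result` are made up for illustration, and the sketch assumes the values contain no spaces around the commas (wrap the split result in `trimws()` if they do).

```r
data <- data.frame(x = c("a,b", "c", "a,b", "d,e,f,g"))

# Split every entry on the literal comma and flatten the list into one vector.
# as.character() guards against x being stored as a factor.
parts <- unlist(strsplit(as.character(data$x), ",", fixed = TRUE))

# Alternative from the comments: scan() reads the same fields directly.
parts_scan <- scan(text = as.character(data$x), sep = ",", what = "", quiet = TRUE)

# Keep each value once, in order of first appearance, as column y.
result <- data.frame(y = unique(parts))
print(result)
```

Both `parts` and `parts_scan` hold the same nine values here; `unique()` is what reduces them to the seven rows of `y`.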

2 Answers

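# stringsAsFactors = TRUE is needed in R >= 4.0.0, where strings are no longer
# converted to factors by default and levels(d$x) would otherwise return NULL
d <- data.frame(x = c("a,b", "c", "a,b", "d,e,f,g"), stringsAsFactors = TRUE)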

> levels(d$x)
[1] "a,b"     "c"       "d,e,f,g"

> e <- as.character(levels(d$x))
> e
[1] "a,b"     "c"       "d,e,f,g"
> 

> f <- strsplit(e,",")
> f
[[1]]
[1] "a" "b"

[[2]]
[1] "c"

[[3]]
[1] "d" "e" "f" "g"

> unique(unlist(f))
[1] "a" "b" "c" "d" "e" "f" "g"
Ram K
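
One caveat with deduplicating the whole strings first: `levels()` (or `unique()`) on the unsplit column only removes identical strings, so an individual value can still appear more than once after splitting. A small sketch with a made-up input (`x` and `pieces` are illustrative names, not from the question) shows why a final `unique()` on the split values is the safer step:

```r
# Hypothetical input in which "a" occurs in two different strings.
x <- c("a,b", "a,c", "a,b")

# Deduplicating the strings drops the repeated "a,b" but keeps "a,c".
pieces <- unlist(strsplit(unique(x), ",", fixed = TRUE))
pieces           # "a" "b" "a" "c"

# Deduplicating after the split removes the repeated "a".
unique(pieces)   # "a" "b" "c"
```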

A tidyr solution:

library(tidyr)
data %>% dplyr::mutate(x = strsplit(as.character(x), ",")) %>% unnest(x) %>% unique()

or (thanks to @alistaire)

data %>% separate_rows(x) %>% unique()
Ben Bolker