I want to use separate_rows to split a character column into as many rows as there are characters:

df
# A tibble: 1 x 2
  speaker A_aoi
  <chr>   <chr>
1 ID01.B  B*B*B

I know the tidyverse function separate_rows can be used for this purpose:

library(dplyr)
library(tidyr)
df %>% 
   separate_rows(A_aoi, sep = "")

Surprisingly (to me), however, the result includes a row, the first one, that should not be there:

# A tibble: 6 x 2
  speaker A_aoi
  <chr>   <chr>
1 ID01.B  ""       # <--- should not be included
2 ID01.B  "B"  
3 ID01.B  "*"  
4 ID01.B  "B"  
5 ID01.B  "*"  
6 ID01.B  "B"

How can the sep pattern be reformulated? I've tried sep = "\\*|[A-Z]", to no avail.

Reproducible data:

structure(list(speaker = "ID01.B", A_aoi = "B*B*B"), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))
Chris Ruehlemann

1 Answer

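I think this will work. The negative lookahead (?!^) matches the empty position between characters but not the start of the string, so no leading empty string should be produced: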

df %>% 
  tidyr::separate_rows(A_aoi, sep = "(?!^)")

Answer found here: Split string into array of character strings
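
If the lookahead pattern behaves differently in your tidyr version, a regex-free alternative may be worth trying. The sketch below is not the approach above: it swaps in base R's strsplit(), which splits a string into single characters when given an empty split pattern, and then expands the resulting list column with tidyr::unnest(). The name out is just an illustrative variable.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- structure(list(speaker = "ID01.B", A_aoi = "B*B*B"),
                row.names = c(NA, -1L),
                class = c("tbl_df", "tbl", "data.frame"))

out <- df %>%
  # strsplit(x, "") returns a list with one character vector per input row
  mutate(A_aoi = strsplit(A_aoi, "")) %>%
  # unnest() expands the list column into one row per character
  unnest(cols = A_aoi)

out
```

Because strsplit() with an empty pattern never emits a leading empty string, no extra filtering step should be needed here.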

Will Oldham