3

I have a column with special character ";" that I want to separate into new rows without loosing information of other columns.

let's say, this is my df:

col1  year
A;B    2010
A      2010
B      2011
B;C    2012

the desired result:

col1  year
A      2010
B      2010
A      2010
B      2011
B      2012
C      2012

I know how to seperate col1 using strsplit :

unlist(strsplit(df[,1], ";")

but this does not keep the information of other columns.

Cina
  • 9,759
  • 4
  • 20
  • 36

3 Answers3

3

We can use separate_rows

library(tidyr)
separate_rows(df, col1, sep = ";")

If we want to use strsplit, then have to replicate the rows based on the length of list elements (lengths)

lst1 <- strsplit(df$col1, ";")
df1 <- df[rep(seq_len(nrow(df)), lengths(lst1)),]
df1$col1 <- unlist(lst1)
akrun
  • 874,273
  • 37
  • 540
  • 662
3

Another option is to use unnest:

library(dplyr)
library(tidyr)

df %>% 
  mutate(col1=strsplit(col1, ";")) %>% 
  unnest(col1)
  col1   year
  <chr> <int>
1 A      2010
2 B      2010
3 A      2010
4 B      2011
5 B      2012
6 C      2012
TarJae
  • 72,363
  • 6
  • 19
  • 66
2

A base R way.

do.call(rbind.data.frame, Map(cbind, strsplit(dat$col1, ";"), dat$year)) |>
  setNames(names(dat))
#   col1 year
# 1    A 2010
# 2    B 2010
# 3    A 2010
# 4    B 2011
# 5    B 2012
# 6    C 2012

Data:

dat <- structure(list(col1 = c("A;B", "A", "B", "B;C"), year = c(2010L, 
2010L, 2011L, 2012L)), class = "data.frame", row.names = c(NA, 
-4L))
jay.sf
  • 60,139
  • 8
  • 53
  • 110