1

I have a data structure like:

X1     X2          X3
AA     x1;y1       100
BB                 200
CC     x3;y3;z3    300
DD     x4;y4       400

Reproducible data

df <- data.frame(
  X1 = c("AA", "BB", "CC", "DD"), 
  X2 = c("x1;y1", "", "x3,y3,z3", "x4;y4"),
  X3 = c("100", "200", "300", "400")
)

How can i distribute second column to first column as distribute. output:

Xf     X3
AA     100
x1     100
y1     100
BB     200
CC     300
x3     300
y3     300
z3     300
DD     400
x4     400
y4     400

Thanks.

ersan
  • 393
  • 1
  • 9
  • 2
    Paste 2 columns with sep=";", then look into this post: https://stackoverflow.com/questions/15347282/split-delimited-strings-in-a-column-and-insert-as-new-rows – zx8754 Sep 24 '20 at 11:51

1 Answers1

3

Paste first 2 columns, then separate into rows:

library(tidyr)
library(dplyr)

df %>% 
  transmute(new = paste(X1, X2, sep = ";"), X3) %>% 
  separate_rows(new, sep = "[;,]") 


# # A tibble: 12 x 2
#    new   X3   
#    <chr> <fct>
#  1 AA    100  
#  2 x1    100  
#  3 y1    100  
#  4 BB    200  
#  5 x2    200  
#  6 CC    300  
#  7 x3    300  
#  8 y3    300  
#  9 z3    300  
# 10 DD    400  
# 11 x4    400  
# 12 y4    400 

Edit: filter out the blank rows - ""

df %>% 
  transmute(new = paste(X1, X2, sep = ";"), X3) %>% 
  separate_rows(new, sep = "[;,]") %>% 
  filter(new != "")

# # A tibble: 11 x 2
#    new   X3   
#    <chr> <fct>
#  1 AA    100  
#  2 x1    100  
#  3 y1    100  
#  4 BB    200  
#  5 CC    300  
#  6 x3    300  
#  7 y3    300  
#  8 z3    300  
#  9 DD    400  
# 10 x4    400  
# 11 y4    400  
zx8754
  • 52,746
  • 12
  • 114
  • 209