How to expand a dataframe base on values in a column

Question

I have multiple values in certain rows within a column in a dataframe. I would like to have a dataframe with a new row for each row that contains multiple values for a single column. I have the gotten the values separated by am now certain how to go forward. Any thoughts?

Here is an example:

## input
tibble(
code = c(
85310,
47730,                       
61900,                           
93110,                          
"56210,\r\n70229",           
"93110,\r\n93130,\r\n93290"),
vary2 = LETTERS[1:6])

## desired output
tibble(
code = c(85310, 47730, 61900, 93110, 56210, 70229, 
         93110, 93130, 93290),
vary2 = c('A', 'B', 'C', 'D', 'E', 'E', 'F', 'F', 'F')
)

## one unsuccesful approach

tibble(
  code = c(
    85310,
    47730,                       
    61900,                           
    93110,                          
    "56210,\r\n70229",           
    "93110,\r\n93130,\r\n93290"),
  vary2 = LETTERS[1:6]) %>% 
  separate(col = 'code', into = LETTERS[1:3], sep = ',\\r\\n')

akrun · Answer 1 · 2018-10-04T20:58:29.520

3

We can use separate_rows

library(tidyverse)
df1 %>% 
  separate_rows(code, sep="[,\r\n]+") 
# A tibble: 9 x 2
#  code  vary2
#  <chr> <chr>
#1 85310 A    
#2 47730 B    
#3 61900 C    
#4 93110 D    
#5 56210 E    
#6 70229 E    
#7 93110 F    
#8 93130 F    
#9 93290 F

As @KerryJackson mentioned in the comments, if we don't specify the sep, the algo will automatically pick up all the delimiters (in case we want to limit this to a particular delimiter- better to use sep)

df1 %>% 
  separate_rows(code)

edited Oct 04 '18 at 20:58

answered Oct 04 '18 at 18:59

akrun

874,273
37
540
662

1

It's . . . perfect! – elliot Oct 04 '18 at 19:02
1

If you don't specify the separator, you don't need to filter, so you can have an even shorter code `separate_rows(df, code)` – Kerry Jackson Oct 04 '18 at 19:52

How to expand a dataframe base on values in a column

1 Answers1