How to create new rows in R data frame based on values in cells?

Question

I have the following data.frame:

id name shoes
1  Pete "nike; adidas; puma"
2  Anna nike

base on this data frame I would like to create the followin data.frame

id name shoes
1 Pete  nike
1 Pete  adidas
1 Pete  puma
2 Ana   nike

Any help would be appreciated.

Please share your initial data.frame by using `dput()`. – Martin Gal Jan 15 '23 at 15:12 — Martin Gal, Jan 15 '23 at 15:12

jdobres · Accepted Answer · 2023-01-15T16:20:18.883

5

The tidyverse's separate_rows function is perfect for this. By default, it separates a string whenever it encounters any non-alphanumeric characters, which includes ; in your example.

library(tidyverse)

df <- data.frame(name = c('Pete', 'Anna'), shoes = c('nike; adidas; puma', 'nike'))

df_long <- df %>% 
  separate_rows(shoes)

  name  shoes 
  <chr> <chr> 
1 Pete  nike  
2 Pete  adidas
3 Pete  puma  
4 Anna  nike

edited Jan 15 '23 at 16:20

answered Jan 15 '23 at 15:19

jdobres

11,339
1
17
37

1

It should be **tidyr**'s `separate_rows()`. Saying `tidyverse` is not precise. – Darren Tsai Jan 15 '23 at 16:06
Fair point. Notably, the tidyverse as a whole imports magrittr's `%>%` operator (through dplyr, I believe). While I know these are separate packages with different functions, I tend to just import the entire tidyverse suite, since most common workflows will use multiple components from it anyway. – jdobres Jan 15 '23 at 16:17
Nice discussion. `tidyr` also imports magrittr's `%>%` operator, so even if you `library(tidyr)` solely, your code can still work. In my experience on SO, I will say that I'm using a `tidyverse` solution when I need to library more than one packages that are included in `tidyverse` to finish a job. – Darren Tsai Jan 16 '23 at 08:33

score 0 · Answer 2 · answered Jan 15 '23 at 15:58

An approach using unnest after strsplit

library(dplyr)
library(tidyr)

df %>% 
  mutate(shoes = strsplit(shoes, "; ")) %>% 
  unnest(shoes)
# A tibble: 4 × 3
     id name  shoes
  <int> <chr> <chr>
1     1 Pete  nike
2     1 Pete  adidas
3     1 Pete  puma
4     2 Anna  nike

Data

df <- structure(list(id = 1:2, name = c("Pete", "Anna"), shoes = c("nike; adidas; puma",
"nike")), class = "data.frame", row.names = c(NA, -2L))

How to create new rows in R data frame based on values in cells?

2 Answers2

Data