
I want to repeat every row that has "emphysema/chronic bronchitis" in column 1. And for each such repetition, I want to have "Emphysema" in column 2 for one of the 2 rows, and "Chronic Bronchitis" in column 2 for the other one. I'm not sure if I'm explaining it well though - sorry for the confusion.

This is how it currently is:

Column 1                       Column 2
skin cancer                    Skin Neoplasms
rectal cancer                  Rectal Neoplasms
emphysema/chronic bronchitis   NA

This is how I want it to be:

Column 1             Column 2
skin cancer          Skin Neoplasms
rectal cancer        Rectal Neoplasms
emphysema            Emphysema
chronic bronchitis   Chronic Bronchitis

Nivi
  • `rbind(df, df[23, ])` – user2974951 Oct 27 '21 at 06:19
  • Is there a general rule for what you want to repeat? Do you always just want to repeat row 23? Does it matter where the newly created row winds up? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Oct 27 '21 at 06:24
  • `df[c(seq_len(nrow(df)), 23), ]`. Just a set of integer row indices with an extra 23. – IRTFM Oct 27 '21 at 06:25
  • Based on the comment by @user2974951 a general function to do this is: `rep_rows <- function(x, rows) {rbind(x, x[rows, ])}`. You use it like this: `rep_rows(df, 23)` or you can have several rows repeating: `rep_rows(df, c(23, 24, 30:34))` – benimwolfspelz Oct 27 '21 at 09:15
  • Thank you for all your suggestions, they've been really helpful. I just realised I do not just have to repeat a particular row (e.g. row 23). I'm not sure if what I want to do is possible, but here it is: I want to repeat every row that has "emphysema/chronic bronchitis" in column 1. And for each such repetition, I want to have "Emphysema" in column 2 for one of the 2 rows, and "Chronic Bronchitis" in column 2 for the other one. I'm not sure if I'm explaining it well though - sorry for the confusion – Nivi Oct 27 '21 at 10:51
  • `tidyr::separate_rows()` may be the solution for you. See this question for an example https://stackoverflow.com/questions/61036933/separate-rows-on-string-preserving-original – pyg Oct 27 '21 at 11:19
  • @Nivi Based on the comments, it might be worthwhile for you to edit your question and clarify the context. For example, do you have a column 1 where a row value might have multiple diagnoses (e.g., emphysema and chronic bronchitis) and you want those included in different rows in column 2? A more detailed description of what you currently have in terms of data, and what you want to have in the end (this is the "reproducible example" mentioned above) will really help out a lot here. It doesn't need to be complicated - even a few rows of made-up example data to start with. – Ben Oct 27 '21 at 12:46
  • Yep, I've edited the question now! `separate_rows()` might help, but I want to split up only rows with a specific value in a column. I do not want to split up all the rows in column 1. I only want to split up rows that have "emphysema/chronic bronchitis" in column 1, for instance – Nivi Oct 27 '21 at 15:01

1 Answer


It's not entirely clear from the description, but this might help you.

Say, you have a data.frame that looks like this:

                          col1             col2
1                  skin cancer   Skin Neoplasms
2                rectal cancer Rectal Neoplasms
3 emphysema/chronic bronchitis             <NA>

And you want to split entries that have a slash (/), such as between emphysema and chronic bronchitis.

You can use separate_rows from tidyr with "/" as the separator. You can then use coalesce to fill the missing values in column 2 with the corresponding entries from column 1. As far as I can tell, this matches your expected output apart from capitalization.

library(tidyverse)

df %>%
  # one row per diagnosis: split col1 on the slash
  separate_rows(col1, sep = "/") %>%
  # keep col2 where present, otherwise fall back to col1
  mutate(col2 = coalesce(col2, col1))

Output

  col1               col2              
  <chr>              <chr>             
1 skin cancer        Skin Neoplasms    
2 rectal cancer      Rectal Neoplasms  
3 emphysema          emphysema         
4 chronic bronchitis chronic bronchitis
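
If you only want to split rows with a specific value in column 1 (as mentioned in the comments), and want the new column 2 entries capitalized, something along these lines might work. This is only a sketch under assumptions: the data frame is rebuilt from the example table above, `to_split` is a hypothetical vector of the exact values to split, and `id` and `target` are temporary helper columns introduced just for this illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Example data mirroring the table in this answer
df <- data.frame(
  col1 = c("skin cancer", "rectal cancer", "emphysema/chronic bronchitis"),
  col2 = c("Skin Neoplasms", "Rectal Neoplasms", NA),
  stringsAsFactors = FALSE
)

# Hypothetical: exact col1 values that should be split into separate rows
to_split <- c("emphysema/chronic bronchitis")

# Remember the original row position and flag the rows to split
flagged <- df %>%
  mutate(id = row_number(), target = col1 %in% to_split)

# Split only the flagged rows on the slash
split_part <- flagged %>%
  filter(target) %>%
  separate_rows(col1, sep = "/")

# Rows that are left untouched
kept_part <- flagged %>%
  filter(!target)

result <- bind_rows(kept_part, split_part) %>%
  arrange(id) %>%  # put the split rows back where the original row was
  # fill col2 only for split rows that had no value, in title case
  mutate(col2 = if_else(target & is.na(col2), str_to_title(col1), col2)) %>%
  select(-id, -target)

result
```

With the example data, `result` should have "Emphysema" and "Chronic Bronchitis" in col2 for the two new rows, while any other row containing a slash would be left as it is. Adding more values to `to_split` extends this to other combined diagnoses.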
Ben