
I have a df, such that a row looks like:

name  datesemployed        university
Kate  Oct 2015 – Jan 2016  Princeton

What I want to do is repeat the entire row once for each year in the range given by the variable datesemployed.

In this case, there would be two rows: one for 2015 and one for 2016.

I've tried to clean the variable first, but I'm having a tough time even with that:

library(stringr)
# Split each date range on the en dash into two columns (start, end)
df3 <- str_split_fixed(df$datesemployed, "–", 2)
df <- cbind(df3, df)
Yu Na

1 Answer

We can use separate_rows from tidyr, specifying sep as a regular expression that matches zero or more spaces, an en dash (–), and then zero or more spaces. This gives one row for each endpoint of the range:

library(dplyr)
library(tidyr)
df %>%
  separate_rows(datesemployed, sep = "\\s*–\\s*")
#    name datesemployed university
#1 Kate      Oct 2015  Princeton
#2 Kate      Jan 2016  Princeton

data

df <- structure(list(name = "Kate", datesemployed = "Oct 2015 – Jan 2016", 
    university = "Princeton"), class = "data.frame", row.names = c(NA, 
-1L))
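
Splitting on the dash returns the two endpoints of the range rather than every year in it. One possible way to get a row per year is sketched below, assuming every datesemployed value contains at least one four-digit year (a value such as "Present" would need separate handling). The second row of the example data and the year column are hypothetical, added only for illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

# First row is from the question; the second row is hypothetical,
# added only to illustrate a range spanning more than two years.
df <- data.frame(
  name = c("Kate", "Sam"),
  datesemployed = c("Oct 2015 – Jan 2016", "Mar 2015 – Jun 2017"),
  university = c("Princeton", "Yale"),
  stringsAsFactors = FALSE
)

# Assumption: every value contains at least one four-digit year.
out <- df %>%
  mutate(year = lapply(
    # Pull out every four-digit year found in the string
    str_extract_all(datesemployed, "\\d{4}"),
    # Build the integer sequence from the first year to the last year
    function(y) seq(as.integer(y[1]), as.integer(y[length(y)]))
  )) %>%
  # Expand the list column so each year gets its own copy of the row
  unnest(year)

out
```

With this data, Kate's row is repeated for 2015 and 2016, and the hypothetical second row is repeated for 2015, 2016, and 2017. The original datesemployed string is kept on every row, so it can be dropped or parsed further as needed.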
akrun
  • Is there any way this code could be adapted if the range spanned more than one year, i.e. if the range of years was something like 2015-2017? I tried playing around with it but haven't found an elegant adaptation – Yu Na May 09 '20 at 20:46
  • @YuNa Can you please post as a new question so that it becomes easier to understand – akrun May 09 '20 at 20:47
  • Posted! Thanks for your help: https://stackoverflow.com/questions/61703818/repeat-row-based-on-time-range-variable-r – Yu Na May 09 '20 at 21:14