
I am struggling with a very simple thing. I have a data frame in which the column Comments holds information specific to individual field observations (times, in this case), and a single row can hold several of them. For example:

data <- data.frame("Species" = c("TURPHI", "EMBHER", "ANTTRI"), 
                       "Date" = c("2020/06/03", "2020/06/03", "2020/06/03"), 
                       "Comments" = c("21:00;23:00;23:45", "22:01", "21:51"), 
                  stringsAsFactors = FALSE)
> data
  Species       Date          Comments
1  TURPHI 2020/06/03 21:00;23:00;23:45
2  EMBHER 2020/06/03             22:01
3  ANTTRI 2020/06/03             21:51

I need to split each row that has more than one element in the Comments column. Elements are delimited by ;. In the example above, row 1 has to be split into 3 rows, each with its own time, like this:

> data
  Species       Date  Time
1  TURPHI 2020/06/03 21:00
2  TURPHI 2020/06/03 23:00
3  TURPHI 2020/06/03 23:45
4  EMBHER 2020/06/03 22:01
5  ANTTRI 2020/06/03 21:51

Thanks a lot!!

2 Answers


You can try this (you only need to give the position of the column to split, and you can save the result in a new data frame):

library(tidyverse)

df1 <- separate_rows(data,3,sep = ';')

Output:

# A tibble: 5 x 3
  Species Date       Comments
  <chr>   <chr>      <chr>   
1 TURPHI  2020/06/03 21:00   
2 TURPHI  2020/06/03 23:00   
3 TURPHI  2020/06/03 23:45   
4 EMBHER  2020/06/03 22:01   
5 ANTTRI  2020/06/03 21:51 
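
If you also want the column to be called Time, as in the desired output of the question, a minimal sketch along these lines might work. It assumes the tidyr package is installed, refers to the column by name instead of by position, and renames it with base R; the name df1 is simply reused from above.

```r
library(tidyr)

# Example data from the question
data <- data.frame(
  Species  = c("TURPHI", "EMBHER", "ANTTRI"),
  Date     = c("2020/06/03", "2020/06/03", "2020/06/03"),
  Comments = c("21:00;23:00;23:45", "22:01", "21:51"),
  stringsAsFactors = FALSE
)

# One row per ";"-delimited element of Comments
df1 <- separate_rows(data, Comments, sep = ";")

# Rename the split column to match the desired output
names(df1)[names(df1) == "Comments"] <- "Time"
df1
```

Referring to the column by name keeps the call working if the column order of the data frame changes later.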
Duck

In base R you can use strsplit to split data$Comments by ;, repeat the rows with rep, and then combine the result with cbind.

x <- strsplit(data$Comments, ";", fixed = TRUE)
cbind(data[rep(seq_len(nrow(data)), lengths(x)),-3], time=unlist(x))
#    Species       Date  time
#1    TURPHI 2020/06/03 21:00
#1.1  TURPHI 2020/06/03 23:00
#1.2  TURPHI 2020/06/03 23:45
#2    EMBHER 2020/06/03 22:01
#3    ANTTRI 2020/06/03 21:51
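
The repeated rows keep row names such as 1.1 and 1.2. If you would rather have them numbered 1 to 5 and the new column called Time, as in the question, one possible variation is sketched below. The names idx and out are only illustrative, and data.frame is used in place of cbind so that stringsAsFactors can be set explicitly.

```r
# Example data from the question
data <- data.frame(
  Species  = c("TURPHI", "EMBHER", "ANTTRI"),
  Date     = c("2020/06/03", "2020/06/03", "2020/06/03"),
  Comments = c("21:00;23:00;23:45", "22:01", "21:51"),
  stringsAsFactors = FALSE
)

# Split each Comments entry on the literal ";"
x <- strsplit(data$Comments, ";", fixed = TRUE)

# Repeat each row index once per element found in that row
idx <- rep(seq_len(nrow(data)), lengths(x))

# Keep the other columns and attach the flattened times
out <- data.frame(data[idx, c("Species", "Date")],
                  Time = unlist(x),
                  stringsAsFactors = FALSE)

# Replace row names like "1.1" with a plain 1..n sequence
rownames(out) <- NULL
out
```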
GKi