
I have two data frames; one is a list of individual ID codes for my animals with an associated emergence date (31 rows), and the other is a list of weight data dates for these individuals (8474 rows). I would like to work out the time difference for each weight data row from the emergence date, to investigate weight gain since emergence for each individual animal.

I would like to add a new column to the weights dataframe that has the emergence date associated with the correct individual, so that each row has the individual ID, the emergence date for the individual, and the weights date. I can then calculate the difference in days.

I have looked at different methods for joining data frames, but I can't find one that attaches the matching emergence date to each row of the weights data.

emergence date data

weights date data

I would like a new column in the weights data that contains the emergence date for each individual

Zheyuan Li
    Welcome to Stack Overflow! You're most likely to get the help you need if you post a sample of your data (try `dput`) and some code that illustrates what you've tried so far and why it didn't work. See [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for guidance on how to create a minimal reproducible example in R. – A. S. K. Jul 31 '19 at 16:45

1 Answer

I think you're looking for `merge()`. I've made sample data frames similar to yours to illustrate.

# Emergence dates: one row per individual.
# Note: the time zone argument is `tz`; `origin` is silently ignored for character input.
EmDates <- as.POSIXct(c("06/28/2019", "06/29/2019", "06/30/2019"), tz = "EST", format = "%m/%d/%Y")
Ind_ID <- c(1, 2, 3)
Emerg <- data.frame(EmDates, Ind_ID)

# Weigh dates: one row per weighing, with the IDs in a different order
WeighDates <- as.POSIXct(c("07/28/2019", "07/29/2019", "07/30/2019"), tz = "EST", format = "%m/%d/%Y")
Ind_ID <- c(3, 2, 1)
Weigh <- data.frame(WeighDates, Ind_ID)

# Match rows on the shared Ind_ID column
Target_DF <- merge(Emerg, Weigh, by = "Ind_ID")
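To get from the merged data frame to the day difference asked about in the question, something like the following sketch might work. It is an illustration only: the sample values and the `DaysSinceEmergence` column name are made up, it uses `Date` rather than `POSIXct` on the assumption that only the calendar day matters, and it puts the weights data first with `all.x = TRUE` so that every weight row is kept.

```r
# Hypothetical sample data; column names follow the example above.
Emerg <- data.frame(
  Ind_ID  = c(1, 2, 3),
  EmDates = as.Date(c("2019-06-28", "2019-06-29", "2019-06-30"))
)
Weigh <- data.frame(
  Ind_ID     = c(3, 2, 1, 1),
  WeighDates = as.Date(c("2019-07-28", "2019-07-29", "2019-07-30", "2019-08-06"))
)

# Weights first, so the result has one row per weighing;
# all.x = TRUE keeps weight rows even if an ID has no emergence date.
Target_DF <- merge(Weigh, Emerg, by = "Ind_ID", all.x = TRUE)

# Days elapsed between emergence and each weighing (assumed column name).
Target_DF$DaysSinceEmergence <- as.numeric(
  difftime(Target_DF$WeighDates, Target_DF$EmDates, units = "days")
)

print(Target_DF)
```

Because the emergence table has one row per individual, each weight row picks up exactly one emergence date, however many times that individual was weighed.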

Alternatively, the dplyr join family can help here and offers a few more options. `left_join()` keeps every row of the left (first) data frame and adds the matching columns from the right (second) data frame. The other join types (`inner_join()`, `right_join()`, `full_join()`) select rows differently.

library(dplyr)

# Emergence dates: one row per individual (time zone is set with `tz`, not `origin`)
EmDates <- as.POSIXct(c("06/28/2019", "06/29/2019", "06/30/2019"), tz = "EST", format = "%m/%d/%Y")
Ind_ID_Emerg <- c(1, 2, 3)
Emerg <- data.frame(EmDates, Ind_ID_Emerg)

# Weigh dates: the ID column has a different name in this data frame
WeighDates <- as.POSIXct(c("07/28/2019", "07/29/2019", "07/30/2019"), tz = "EST", format = "%m/%d/%Y")
Ind_ID_Weigh <- c(3, 2, 1)
Weigh <- data.frame(WeighDates, Ind_ID_Weigh)

# The named vector maps the left key column to the right key column
Target_DF <- left_join(Emerg, Weigh, by = c("Ind_ID_Emerg" = "Ind_ID_Weigh"))
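Since the goal in the question is to add the emergence date to the weights data, it may be more convenient to put the weights data frame on the left. The sketch below assumes that arrangement. The sample values, the unmatched individual 4, and the `DaysSinceEmergence` column are all hypothetical and only there to show how `left_join()` and `inner_join()` differ.

```r
library(dplyr)

# Hypothetical sample data; individual 4 has weights but no emergence date.
Emerg <- data.frame(
  Ind_ID_Emerg = c(1, 2, 3),
  EmDates      = as.Date(c("2019-06-28", "2019-06-29", "2019-06-30"))
)
Weigh <- data.frame(
  Ind_ID_Weigh = c(3, 2, 1, 4),
  WeighDates   = as.Date(c("2019-07-28", "2019-07-29", "2019-07-30", "2019-07-31"))
)

# Left join: every weight row is kept; unmatched IDs get NA for EmDates.
Target_DF <- Weigh %>%
  left_join(Emerg, by = c("Ind_ID_Weigh" = "Ind_ID_Emerg")) %>%
  # Subtracting two Date columns gives a difference in days.
  mutate(DaysSinceEmergence = as.numeric(WeighDates - EmDates))

# Inner join: only weight rows with a matching emergence date survive.
Matched_DF <- inner_join(Weigh, Emerg, by = c("Ind_ID_Weigh" = "Ind_ID_Emerg"))

print(Target_DF)
print(nrow(Matched_DF))
```

With the left join, rows for an individual without an emergence date stay in the result with `NA` in the new columns, which makes missing emergence records easy to spot before analysing weight gain.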
Greg