In my data frame there are rows with the same ID but different values for test year and age. I'd like to collapse the duplicate rows into one row per ID and put the differing values into separate columns.
I'm new to R and have been struggling with this for a while.
This is the data frame:
> df
   id project testyr1 testyr2 age1 age2
1 16S      AS    2008      NA   29   NA
2 32S      AS    2004      NA   30   NA
3 37S      AS      NA    2011   NA   36
4 50S      AS    2004      NA   23   NA
5 50S      AS    1998      NA   16   NA
6 55S      AS    2007      NA   28   NA
testyr1 should have the earliest year and testyr2 the latest year. age1 should be the younger age and age2 the older age.
The output should be:
   id project testyr1 testyr2 age1 age2
1 16S      AS    2008      NA   29   NA
2 32S      AS    2004      NA   30   NA
3 37S      AS      NA    2011   NA   36
4 50S      AS    1998    2004   16   23
6 55S      AS    2007      NA   28   NA
I tried to write a loop but don't know how to end it:
df.undup <- c()
for (i in 1:nrow(df)){
  if i == i+1
  df$testyr1 != NA {
    testyr2 = max(testyr1)
    testyr1 = min(testyr1)
    nage2 = max(nage1)
    nage1 = min(nage1)
  }
  else{
    testyr2 = max(testyr2)
    testyr1 = min(testyr2)
    nage2 = max(nage2)
    nage1 = min(nage2)
  }
}
Any help would be greatly appreciated.
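For reference, the collapse rule described above (per ID, earliest year and younger age in the first columns, latest year and older age in the second) might be sketched in base R without an explicit loop. This is only a sketch under some assumptions that go beyond the question: an ID that appears once is left untouched, `project` is taken to be constant within an ID, every duplicated ID has at least one non-NA year and age, and the helper name `collapse_id` is made up for illustration.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  id      = c("16S", "32S", "37S", "50S", "50S", "55S"),
  project = c("AS", "AS", "AS", "AS", "AS", "AS"),
  testyr1 = c(2008, 2004, NA, 2004, 1998, 2007),
  testyr2 = c(NA, NA, 2011, NA, NA, NA),
  age1    = c(29, 30, NA, 23, 16, 28),
  age2    = c(NA, NA, 36, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse all rows of one id into a single row.
collapse_id <- function(g) {
  # Assumption: an id with a single row is kept exactly as it is.
  if (nrow(g) == 1) return(g)

  # Pool every year and every age recorded for this id.
  yrs  <- c(g$testyr1, g$testyr2)
  ages <- c(g$age1, g$age2)

  # Assumption: project is the same on every row, so take the first row.
  out <- g[1, ]
  out$testyr1 <- min(yrs,  na.rm = TRUE)  # earliest year
  out$testyr2 <- max(yrs,  na.rm = TRUE)  # latest year
  out$age1    <- min(ages, na.rm = TRUE)  # younger age
  out$age2    <- max(ages, na.rm = TRUE)  # older age
  out
}

# Split by id, collapse each piece, and stack the pieces back together.
df.undup <- do.call(rbind, lapply(split(df, df$id), collapse_id))
rownames(df.undup) <- NULL
```

With the example data, the two 50S rows end up as a single row with testyr1 = 1998, testyr2 = 2004, age1 = 16 and age2 = 23, while the other rows are unchanged. Note that `split()` orders the groups by sorted ID, so the row order can differ from the input when the IDs are not already sorted.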