I like to merge two datasets (Generaldata, monthlyresults) but at the same time there schould be another condition fulfilled (Startdate=colnames_of_monthlyresults).
Example (please see codes at the end of the question):
Generaldata columns: Code; Startdate (e.g.201601)
Code: 1,2,3
Startdate: 201601, 201511, 201512
Monthlyresults columns: Code; Result201201 (e.g 5%) ; Result201202; ... ; Result2011604
Code: 1,2,3
Result201511: 0, 20 , 0
Result201512: 6, 30, 5
Result201601: 8, 40, 10
Desired result: Code; Result_at_startdate
Code: 1,2,3
Result_startdate: 8, 20, 5
I tried to do this in two steps: First I merged both dataset with the following Code:
Testresults<- merge(x=Generaldata, y=Monthlyresults, by.y = "Code", by.x= "Code",
all.x = TRUE, incomparables = NA)
As a result I received a dataframe with too much columns (results for every month over 4 years), but I am only interested in the value with the result of every Person at the startdate (and enddate later). How could I achieve this?
I tried it to include both conditions in the by= element of the merge function, but the problem is that I need to refer only to the columnames and not the rowentries in the result dataset. I also tried it with a ifelse function, but then I have the problem that the format is not identical. The entry of startdate column ("201601") is not equal to the colname ("R201601", because a columnname cannot start with a number). In Excel I would probably combine the if and vlookup function to achieve it but my dataset is to large to work with Excel.
I'm an R beginner and would be very happy if someone can help me.
Example R Codes:
Generaldata<- data.frame("Code"=c(1,2,3), "Startdate"= c(201511, 201512, 201601))
Monthlyresults<- data.frame ("Code"=c(1,2,3), "R201511"=c(0,20,0), "R201512"=c(6,30,5), "R201601"=c(8,40,10) )
Testresult <-merge(x=Generaldata, y=Monthlyresults, by.y = "Code", by.x= "Code", all.x = TRUE, incomparables = NA)
#Testresult have all columns of Monthlyresults, but I like to get only the result of every person at the startdate
Desired_result<- data.frame ("Code"=c(1,2,3), "Result_startdate"= c(8,20,5))
show(Desired_result)