I'm learning R, so any help is welcome.
I need to clean my data by removing the duplicated project-raingauge combinations. My real data has many more variables, but here is a simplified version:
ID_project <- c(1,1,1,1,2,2,2,2)
ID_raingauge <- c("A","B","B","B","A","A","B","C")
COMB_check <- c("|","|","ok","ok","|","ok","|","|")
score <- c(0.7,0.5,1.2,0.3,0.4,0.1,0.6,1.4)
mydata <- data.frame(ID_project,ID_raingauge,COMB_check,score)
ID_project ID_raingauge COMB_check score
======
1 A | 0.7
1 B | 0.5
1 B ok 1.2
1 B ok 0.3
2 A | 0.4
2 A ok 0.1
2 B | 0.6
2 C | 1.4
Some combinations of ID_project and ID_raingauge have more than one score. The repeated combinations are flagged in COMB_check: the first row in which a combination appears has COMB_check = "|", and every later row with the same combination has COMB_check = "ok".
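To make the flagging rule concrete, here is a minimal base R sketch that reproduces COMB_check from the two ID columns using duplicated(). The name `flag` is introduced only for illustration, and the sketch assumes the flag depends on row order alone.

```r
ID_project <- c(1, 1, 1, 1, 2, 2, 2, 2)
ID_raingauge <- c("A", "B", "B", "B", "A", "A", "B", "C")
COMB_check <- c("|", "|", "ok", "ok", "|", "ok", "|", "|")

# duplicated() is FALSE the first time a project-raingauge pair appears
# and TRUE for every later row with the same pair.
pairs <- data.frame(ID_project, ID_raingauge)
flag <- ifelse(duplicated(pairs), "ok", "|")  # `flag` is a hypothetical name

# Compare the recomputed flag with the COMB_check column from the example
print(identical(flag, COMB_check))
```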
I want the same data with only one row per combination (ID_project-ID_raingauge), keeping the row with the highest score. The expected result would be:
ID_project ID_raingauge COMB_check score
======
1 A | 0.7
1 B ok 1.2
2 A | 0.4
2 B | 0.6
2 C | 1.4
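One possible way to get this result, sketched in base R: sort the rows so that the highest score comes first within each project-raingauge pair, then drop the duplicated pairs so only that first row survives. The names `ord`, `sorted` and `result` are introduced here for illustration. The sketch assumes ties do not matter; if two rows share the top score, the one that comes first after sorting is kept.

```r
ID_project <- c(1, 1, 1, 1, 2, 2, 2, 2)
ID_raingauge <- c("A", "B", "B", "B", "A", "A", "B", "C")
COMB_check <- c("|", "|", "ok", "ok", "|", "ok", "|", "|")
score <- c(0.7, 0.5, 1.2, 0.3, 0.4, 0.1, 0.6, 1.4)
mydata <- data.frame(ID_project, ID_raingauge, COMB_check, score)

# Order by project, then raingauge, then score in descending order
ord <- order(mydata$ID_project, mydata$ID_raingauge, -mydata$score)
sorted <- mydata[ord, ]

# Keep only the first row of each project-raingauge pair,
# which is the highest-scoring one after the sort above
result <- sorted[!duplicated(sorted[, c("ID_project", "ID_raingauge")]), ]
rownames(result) <- NULL  # reset row names after subsetting

print(result)
```

Sorting first and then calling duplicated() keeps every other column of the winning row intact, which matters when the real data has many more variables than this example.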
Thank you in advance