I am new to the group and R. I posted a question earlier, and am very grateful for the help provided, but realised that I was not very clear. So here is hopefully a better attempt to clarify my problem. Thanks in advance to anyone that can assist.
I have a dataset which contains racehorse names along with each racehorses performance rating, 'DaH', (an integer) and the type of race the horses ran in 'Code',(character).
The dataset has 7 horses with performance ratings (DaH) for the last 3 races, DaH1 to DaH3, where DaH1 is the performance rating for the last/most recent race and DaH3 was the horses performance rating from 3 races ago. Likewise the Codes, Code1 to Code3, provide the Code that the horse ran in.
So looking at the dataset below you can see that horse2 ran in a code C race in its last race and achieved a performance rating DaH1 of 124.
I import the horserace data from a provider of horse racing data.
> racehorse_data
Code1 DaH1 Code2 DaH2 Code3 DaH3
1 C 0 C 124 C 127
2 C 124 C 117 C 0
3 C 121 C 125 C 0
4 C 123 C 120 C 124
5 C 0 C 125 H 122
6 H 122 C 0 C 137
7 C 110 H 115 C 127
structure(list(Code1 = c("C", "C", "C", "C", "C", "H", "C"),
DaH1 = c(0L, 124L, 121L, 123L, 0L, 122L, 110L), Code2 = c("C",
"C", "C", "C", "C", "C", "H"), DaH2 = c(124L, 117L, 125L,
120L, 125L, 0L, 115L), Code3 = c("C", "C", "C", "C", "H",
"C", "C"), DaH6 = c(127L, 0L, 0L, 124L, 122L, 137L, 127L)), class = "data.frame", row.names = c(NA,
-7L))
I need to calculate the mean DaH rating for each horse once I have stripped out any performance rating that was achieved in a code that was not a 'C' code. To clarify: The problem i'm having is that I need to remove any performance ratings, DaH, where the Code of the race was not a 'C' code race.
So for example you can see that horse 6 ran in a 'H' code race in its last race. This needs to be removed from the dataset so that just the two races where horse 6 achieved DaH2=0 and DaH3=137 remain to give a mean DaH rating of 68.5 for its two code C races.
I have been playing with the MELT command trying to achieve a reconfigured dataframe that will enable me to filter out the DaH ratings where the code of the race was not equal to 'C'. But i'm not managed to achieve this.
It occured to me that I could isolate each horses DaH and Code as a seperate dataframe and work through each horse seperately but this seems more complicated than i'd like.
My ideal output would be a dataframe such as:
> racehorse_data
DaH1 DaH2 DaH3 Mean
1 0 124 127 83.7
2 124 117 0 80.3
3 121 125 0 82
4 123 120 124 122.3
5 0 125 na 62.5
6 na 0 137 68.5
7 110 na 127 118.5
Or perhaps a seperate vector for each horse
horse 1 0 124 127 83.7
horse 2 124 117 0 80.3
etc
Many Thanks Graham