I have a dataframe as follows:
Symptom                                              number
Abdominal pain\n Swallowing probs\n Back issues\n    22
Abdominal pain\n                                     12
Back issues \n Vomiting \n                           14
Back issues\n                                        5
There is always a \n at the end of each symptom phrase. The symptom phrase itself can be anything, so I don't want to search for these terms specifically, but rather for any term before (or between) a \n.
I would like to average the number for each symptom so that I end up with:
Symptom             Avg
Abdominal pain      17
Swallowing probs    22
Back issues         13.67
Vomiting            14
I don't know how to group by the individual terms with dplyr. I have tried
SypmAvg<- df %>% group_by(grepl("(?\\n.*\\n)|($.*?\\n)",df$Symptom)%>% summarise(mean=mean(number)
but it just crashes my computer so I don't even get to see the error. Can anyone help? Is it just a regex issue or is there a better way to do this?
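One possible approach, offered as a sketch rather than a definitive answer: instead of grouping on a regex match, split each Symptom value into one row per phrase and then group as usual. The example below assumes that the \n in the data is a real newline character inside each string, that the tidyr package is available alongside dplyr, and that the data frame looks like the sample above. The data frame and the result name SympAvg are illustrative reconstructions, not the original data.

```r
library(dplyr)
library(tidyr)

# Example data reconstructed from the question (an assumption: "\n" is a
# real newline character inside each Symptom string).
df <- data.frame(
  Symptom = c(
    "Abdominal pain\n Swallowing probs\n Back issues\n",
    "Abdominal pain\n",
    "Back issues \n Vomiting \n",
    "Back issues\n"
  ),
  number = c(22, 12, 14, 5),
  stringsAsFactors = FALSE
)

SympAvg <- df %>%
  separate_rows(Symptom, sep = "\n") %>%  # one row per symptom phrase
  mutate(Symptom = trimws(Symptom)) %>%   # remove stray spaces around phrases
  filter(Symptom != "") %>%               # drop the empty piece left by the trailing "\n"
  group_by(Symptom) %>%
  summarise(Avg = mean(number))

print(SympAvg)
```

If the data instead contains a literal backslash followed by n (two characters), the separator would need to be escaped as sep = "\\\\n". Note that with the sample data, Back issues appears in three rows (22, 14 and 5), so its mean comes out at about 13.67.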