I would like to use stringr and rebus to remove parts of strings in a dataframe. Specifically, I would like to remove everything from the first space that is followed by a digit through to the end of the string.
The following is my dataframe:
df <- data.frame(ID = 1:8,
                 Medication = c("FOLIC ACID 5MG TABLET", "RIBAVIRIN 200MG TAB", "ACARBOSE 50MG TABLET",
                                "AmLODIPine 5MG TABLET", "MAGNESIUM TRISILICATE MIXTURE 200ML",
                                "RESONIUM 15G/60ML SUSPENSION", "CALCIUM & VIT D TABLET", NA))
My desired dataframe is:
df_new <- data.frame(ID = 1:8,
                     Medication = c("FOLIC ACID", "RIBAVIRIN", "ACARBOSE",
                                    "AmLODIPine", "MAGNESIUM TRISILICATE MIXTURE",
                                    "RESONIUM", "CALCIUM & VIT D TABLET", NA))
I tried the following code, but it only removes the drug strength (e.g. 5MG), not the dosage form that follows it (e.g. TABLET):
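library(dplyr)    # mutate() and %>%
library(stringr)  # str_replace()
library(rebus)    # SPC, DGT, WRD, %R%, one_or_more(), or()

df %>% mutate(Medication = str_replace(Medication,
                                       pattern = SPC %R%
                                         one_or_more(DGT) %R%
                                         one_or_more(WRD) %R%
                                         or(one_or_more(SPC), one_or_more(WRD)),
                                       replacement = ""))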
How can I change the pattern so that everything from the drug strength to the end of the string is removed?
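For reference, here is a minimal sketch of the rule described above, written as a plain regular expression in base R rather than with rebus. It assumes that "a number" means a single digit immediately after whitespace and that everything after that point can be discarded. The names `meds` and `cleaned` are only for illustration.

```r
# Sample values copied from the dataframe above.
meds <- c("FOLIC ACID 5MG TABLET", "RIBAVIRIN 200MG TAB", "ACARBOSE 50MG TABLET",
          "AmLODIPine 5MG TABLET", "MAGNESIUM TRISILICATE MIXTURE 200ML",
          "RESONIUM 15G/60ML SUSPENSION", "CALCIUM & VIT D TABLET", NA)

# "\\s\\d.*$": one whitespace character, one digit, then anything up to the end.
# sub() replaces only the first match, which here always runs to the end of the string.
# Strings without a "space + digit" sequence, and NA values, are returned unchanged.
cleaned <- sub("\\s\\d.*$", "", meds)

print(cleaned)
```

The same pattern string could presumably be passed to `str_replace()` in place of the rebus expression, though I have only sketched the base R form here.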