I have copied some data describing cholera cases in regions of Yemen from an online database into a text file. The name of each region is given in both English and Arabic in a single string. I would like to remove the Arabic in R and be left with just the English names.
This is what the English/Arabic string looks like when read into R:
regions <- c("Al Hudaydah الحديدة", "Hajjah حجة")
I would like to be left with just the English names:
"Al Hudaydah" "Hajjah"
I have tried using
str_replace_all(regions, "[^[:alnum:]]", "")
and replace_non_ascii(regions)
but neither gives me what I'm looking for.
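For reference, here is a minimal sketch of the kind of result I am after. It assumes the English part only ever uses printable ASCII characters, which the data may not guarantee (a name with an accented Latin letter would be mangled). The helper name strip_non_ascii is made up for illustration, and the Arabic is written with \u escapes so the example does not depend on how the file is encoded.

```r
# Hypothetical helper: delete every character outside printable ASCII
# (space through tilde), then trim the whitespace left at the ends.
strip_non_ascii <- function(x) {
  ascii_only <- gsub("[^\\x20-\\x7E]", "", x, perl = TRUE)
  trimws(ascii_only)
}

# Same two regions as above, with the Arabic given as Unicode escapes.
regions <- c(
  "Al Hudaydah \u0627\u0644\u062d\u062f\u064a\u062f\u0629",
  "Hajjah \u062d\u062c\u0629"
)

english <- strip_non_ascii(regions)
print(english)
```

I am not sure whether this is a sensible or robust way to do it, so better approaches are welcome.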
Any ideas?
Thanks!