I have the following data:

authors <- c("Fernando Carré", "Adrüne Coça", "Pìso Därço")

I want to convert the non-English (accented) characters to their ASCII equivalents while keeping the spaces. This is what I have tried:

gsub("[^[:alnum:]]","",authors)

But it returns:

[1] "FernandoCarré" "AdrüneCoça"    "PìsoDärço" 

It should return:

"Fernando Carre" "Adrune Coca" "Piso Darco"

Any help will be greatly appreciated.

Manu

1 Answer

Thanks to Onyambu for the correction: the following statement from my original answer is not correct.

"The expression [[:alnum:]] is made for the package stringr only. It cannot be used in other packages. Hence we can use"

That said, here is what I got from the console:

> authors <- c("Fernando Carré", "Adrüne Coça", "Pìso Därço")
> iconv(authors ,to="ASCII//TRANSLIT")
[1] "Fernando Carre" "Adrune Coca"    "Piso Darco"    
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS
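
The result of `iconv(..., to = "ASCII//TRANSLIT")` depends on the iconv implementation of the platform, so the output above may not be reproduced on another system. As a possible alternative, here is a minimal sketch that assumes the stringi package is installed (it is not used elsewhere in this thread) and relies on the ICU "Latin-ASCII" transliterator, which should behave the same across platforms:

```r
# Assumption: the stringi package is available (install.packages("stringi")).
library(stringi)

authors <- c("Fernando Carré", "Adrüne Coça", "Pìso Därço")

# Transliterate Latin letters with diacritics to plain ASCII letters.
# Spaces and unaccented letters are left untouched.
ascii_authors <- stri_trans_general(authors, "Latin-ASCII")

print(ascii_authors)
# expected values: "Fernando Carre" "Adrune Coca" "Piso Darco"
```

Because the transliteration is done by ICU rather than by the system's iconv, this should not produce the stray `'`, `"` and backtick characters that some iconv implementations emit.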
s20012303
  • The expression `[[:alnum:]]` is a POSIX regex and can be used in R without any package, so your first statement is incorrect. It cannot be used here, though, since this problem relates to encoding and not translation/substitution – Onyambu Apr 30 '21 at 01:12
  • I'm afraid that the answer is not correct. The result was: ``[1] "Fernando Carr'e" "Adr\"une Coca" "P`iso D\"arco"`` – Manu Apr 30 '21 at 01:17