I scrape some information from a website like this:
library(rvest)

x <- "http://www.transfermarkt.com/wasserman-media-group/beraterfirmenuebersicht/berater?sort=gesamtwert.desc"
web <- read_html(x)

# Extract the player names from the tooltip links
footballers <- web %>%
  html_nodes(".spielprofil_tooltip") %>%
  html_text()
footballers
The output looks like this:
[1] "Cristiano Ronaldo" "James RodrĂguez" "Ăngel Di MarĂa" "Diego Costa" "William Carvalho" "Eliaquim Mangala" "Ezequiel Garay" "Falcao"
[9] "Thiago Silva" "André Gomes" "João Moutinho" "Carlos Vela" "Bernardo Silva" "Fábio Coentrão" "Pepe" "Ivan Cavaleiro"
[17] "Giovani dos Santos" "Miguel Veloso" "Danilo" "Pizzi" "Anwar El Ghazi" "Rúben Neves" "Adrián López" "Ahmed Hassan"
[25] "Danny" "Nélson Oliveira" "Ricardo Quaresma" "Gonçalo Guedes" "Nélson Semedo" "Wallace" "Anderson" "Bruno Gama"
[33] "Sidnei" "Hugo Viana" "Hélder Costa" "Tiago" "Bruno Alves" "Bebé" "José Sá" "Hélder Postiga"
[41] "Simão" "José Bosingwa" "Ederson" "Duda" "André Geraldes" "Pelé" "Filipe Oliveira" "Diogo Jota"
[49] "Burgui" "Edinho" "Alberto RodrĂguez" "Moreno" "Ricardo Carvalho" "Tiago Sá" "VĂtor Gomes" "Mário SĂ©rgio"
[57] "Rafael Márquez" "Júlio Alves" "Marcão" "Cândido Costa" "Diego Oliveira" "Rafa" "Valdir" "César Peixoto"
[65] "Ricardo Carvalho" "Jorge Ribeiro" "Lucas Ferrugem" "Nunes" "Pedrinha" "Dong-Hyun Kim" "Wênio" "Henrique Hilário"
[73] "Jorge Andrade" "Derlei" "Abel" "Petit" "Costinha" "Nuno EspĂrito Santo" "Paulo Ferreira" "Fábio Faria"
[81] "Deco" "Jorge LuĂs" "JoĂŁo Alves" "Fabiano Rossato" "Mantorras" "Bruno" "Bruno Tiago" "LuĂs Loureiro"
[89] "Xadas" "VitĂł"
As you can see, there is an encoding problem, so I used the following statement:
repair_encoding(footballers)
Best guess: UTF-8 (100% confident)
[1] "Cristiano Ronaldo" "James Rodríguez" "Ángel Di María" "Diego Costa" "William Carvalho" "Eliaquim Mangala" "Ezequiel Garay" "Falcao"
[9] "Thiago Silva" "André Gomes" "Jo\032o Moutinho" "Carlos Vela" "Bernardo Silva" "Fábio Coentr\032o" "Pepe" "Ivan Cavaleiro"
[17] "Giovani dos Santos" "Miguel Veloso" "Danilo" "Pizzi" "Anwar El Ghazi" "Rúben Neves" "Adrián López" "Ahmed Hassan"
[25] "Danny" "Nélson Oliveira" "Ricardo Quaresma" "Gonçalo Guedes" "Nélson Semedo" "Wallace" "Anderson" "Bruno Gama"
[33] "Sidnei" "Hugo Viana" "Hélder Costa" "Tiago" "Bruno Alves" "Bebé" "José Sá" "Hélder Postiga"
[41] "Sim\032o" "José Bosingwa" "Ederson" "Duda" "André Geraldes" "Pelé" "Filipe Oliveira" "Diogo Jota"
[49] "Burgui" "Edinho" "Alberto Rodríguez" "Moreno" "Ricardo Carvalho" "Tiago Sá" "Vítor Gomes" "Mário Sérgio"
[57] "Rafael Márquez" "Júlio Alves" "Marc\032o" "Cândido Costa" "Diego Oliveira" "Rafa" "Valdir" "César Peixoto"
[65] "Ricardo Carvalho" "Jorge Ribeiro" "Lucas Ferrugem" "Nunes" "Pedrinha" "Dong-Hyun Kim" "W\032nio" "Henrique Hilário"
[73] "Jorge Andrade" "Derlei" "Abel" "Petit" "Costinha" "Nuno Espírito Santo" "Paulo Ferreira" "Fábio Faria"
[81] "Deco" "Jorge Luís" "Jo\032o Alves" "Fabiano Rossato" "Mantorras" "Bruno" "Bruno Tiago" "Luís Loureiro"
[89] "Xadas" "Vitó"
Some names were repaired correctly, but some accented characters were not (for example, "João" comes back as "Jo\032o"). Does anybody know how to handle the encoding properly in R? I get a similar issue when I deal with Polish characters.
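One thing that might be worth checking is whether stating the encoding at parse time changes anything. Below is a minimal sketch that runs without a network connection. It assumes the page really is UTF-8, as the repair_encoding() guess above suggests. The `html` string and `demo_names` are hypothetical stand-ins for the real page and for `footballers`.

```r
library(rvest)

# Hypothetical one-element stand-in for the real page, written with \u escapes
# so that the example does not depend on the encoding of this script.
html <- paste0(
  "<html><head><meta charset=\"utf-8\"></head><body>",
  "<a class=\"spielprofil_tooltip\">Jo\u00e3o Moutinho</a>",
  "</body></html>"
)

# Parse the raw UTF-8 bytes and state the encoding explicitly
# (assumption: the real page is UTF-8, as repair_encoding() guessed).
page <- read_html(charToRaw(enc2utf8(html)), encoding = "UTF-8")

demo_names <- page %>%
  html_nodes(".spielprofil_tooltip") %>%
  html_text()

# Inspect how R has marked the result and which locale the session uses.
Encoding(demo_names)
Sys.getlocale("LC_CTYPE")
```

For the real page the analogous call would be read_html(x, encoding = "UTF-8"). Whether that avoids the \032 substitutions on a non-UTF-8 locale is not something this sketch confirms.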
Any help would be appreciated!