I have a CSV with Czech characters in it that looks like this:
id,address,city
660999,Vršovická 10,Praha
676838,Valentova 50,Praha 4
676858,Husova 6740,Pardubice
677971,Lipová 10,Třebíč
678304,Jana Ziky 10/1955,Ostrava
...
When I import into RStudio everything looks fine if I view it using the View() function.
But when I print the values in the terminal, the accented characters come out garbled:
xl = read.csv("some_csv.csv")
head(xl)
id address city
1 660999 Vršovická 10 Praha
2 676838 Valentova 50 Praha 4
3 676858 Husova 6740 Pardubice
4 677971 Lipová 10 TÅ™ebÃÄ
5 678304 Jana Ziky 10/1955 Ostrava
When I check the encoding with Encoding(xl[1,2]), for example, it says "unknown".
I also have Russian data with the same exact problem.
I've tried switching to Sys.setlocale("LC_CTYPE", "czech") and Sys.setlocale("LC_CTYPE", "russian") and importing under those settings, but they behave the same.
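One thing that might be worth checking is whether declaring the encoding at import changes what gets printed. This assumes the file is saved as UTF-8, which is not confirmed above. Below is a minimal sketch in which a temporary file stands in for some_csv.csv; the names tmp, rows and xl_utf8 are made up for illustration:

```r
# Sketch only: assumes the CSV is UTF-8 encoded (not confirmed in the post).
# A temporary file stands in for some_csv.csv.
tmp <- tempfile(fileext = ".csv")
rows <- c("id,address,city",
          "677971,Lipov\u00e1 10,T\u0159eb\u00ed\u010d")

# Write the raw UTF-8 bytes, bypassing any locale conversion.
con <- file(tmp, open = "wb")
writeLines(enc2utf8(rows), con, useBytes = TRUE)
close(con)

# Declare the encoding so non-ASCII strings are marked as UTF-8 on import.
xl_utf8 <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)

Encoding(xl_utf8$city)  # shows how R has marked the string
print(xl_utf8)
```

If the file turns out to be in some other encoding, the fileEncoding argument of read.csv, which re-encodes the input rather than only marking it, may be the more appropriate thing to try.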
I'm using RStudio version 0.98.501 with R version 3.0.2 on Windows 7. A colleague on a separate computer is having the same problem.
Is there anything I can do to make these characters display correctly in the terminal?