
I have a CSV with Czech characters in it that looks like this:

id,address,city
660999,Vršovická 10,Praha
676838,Valentova 50,Praha 4
676858,Husova 6740,Pardubice
677971,Lipová 10,Třebíč
678304,Jana Ziky 10/1955,Ostrava
...

When I import it into RStudio, everything looks fine if I view it with the View() function.

[Screenshot: View() in RStudio]

But when I print the values in the terminal, the accented characters come out garbled.

xl = read.csv("some_csv.csv")
head(xl)

      id              address      city 
1 660999       Vršovická 10     Praha     
2 676838         Valentova 50     Praha 4     
3 676858          Husova 6740   Pardubice     
4 677971           Lipová 10     TÅ™ebÃ­Ä     
5 678304    Jana Ziky 10/1955   Ostrava     

When I check the encoding with, for example, Encoding(xl[1,2]), it says "unknown".

I also have Russian data with exactly the same problem.

I've tried switching the locale with Sys.setlocale("LC_CTYPE", "czech") and Sys.setlocale("LC_CTYPE", "russian") and importing under those settings, but the output is the same.
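
The garbled "TÅ™ebÃ­Ä" is the pattern you get when UTF-8 bytes are decoded as Windows-1252, so one check that may be worth running is to declare the encoding at import time. Below is a minimal sketch, assuming the file really is UTF-8 (nothing above confirms this). The temporary file and its single sample row are illustrative stand-ins for some_csv.csv; `encoding` and `stringsAsFactors` are existing read.csv arguments.

```r
# Illustrative sample only: one row from the CSV above, written as UTF-8 bytes.
tmp <- tempfile(fileext = ".csv")
sample_lines <- c("id,address,city",
                  "677971,Lipov\u00e1 10,T\u0159eb\u00ed\u010d")
con <- file(tmp, open = "wb")
writeLines(enc2utf8(sample_lines), con, useBytes = TRUE)
close(con)

# Assumption: the file is UTF-8. `encoding` only marks the strings as UTF-8;
# it does not re-encode the input.
xl <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)

Encoding(xl$city)  # check whether the strings are now marked
print(xl)
unlink(tmp)
```

If the real file is in a different encoding (for example a Windows code page), this marking would be wrong, so the sketch only applies under the UTF-8 assumption.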

I'm using RStudio version 0.98.501 with R version 3.0.2 on Windows 7. A colleague on a separate computer is having the same problem.

Is there anything I can do to make these characters display correctly in the terminal?

Dirk Calloway
