
I installed Microsoft R Open 3.4.0 on a Red Hat Enterprise Linux 7.3 machine following the instructions at https://mran.microsoft.com/documents/rro/installation/ . R starts up and seems to work fine at first sight. However, when I try to list the files in a directory with the command

files <- list.files(path="/home/username/directory_name/", pattern="*.Rda",, full.names=T, recursive=FALSE)

I get the error

'translateCharUTF8' must be called on a CHARSXP
Execution halted

On my local Windows machine the command works fine. Googling the error turns up almost nothing, except suggestions that the installation might be broken.

The strange thing is that if I copy and paste the command into R and execute it, it does not work. But if I paste it into R and change it in a way that should not affect its result, such as adding spaces, it might run. E.g. changing it to

files <- list.files(path = "/home/username/directory_name/", pattern = "*.Rda",, full.names = T, recursive = FALSE)

might work, might fail with the same error, or might execute, but typing "files" afterwards might then return

[1]Error: 'getCharCE' must be called on a CHARSXP

When using R from the R-foundation (https://www.r-project.org/, installed via EPEL), I get the same error and behaviour.

The command sessionInfo() returns the following:

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.3 (Maipo)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

uname -mrs gives me:

Linux 3.10.0-514.el7.x86_64 x86_64

Any help would be greatly appreciated, best regards

Stefan

Hong Ooi
stefan.braun
  • Do you have files with names containing accented characters? Also, note that `pattern` is supposed to be a regex, not a glob. – Hong Ooi Aug 23 '17 at 10:49
  • Does it work if you're using `pattern = glob2rx("*.Rda")`? – hannes101 Aug 23 '17 at 11:15
  • I second what Hong and hannes101 say above. A couple more things: if you copied all the files from your Linux box to the Windows box and ran the same command there, what would the outcome be? Also, it seems that this error can happen when there is memory corruption (in addition to "weird" characters), but since you see the error with both MRO and CRAN R, I doubt that's it. Finally: what happens if you reboot the Linux box and then try again? – Niels Berglund Aug 23 '17 at 11:25
  • Additionally, I think there's one comma too many in `files <- list.files(path="/home/username/directory_name/", pattern="*.Rda",, full.names=T, recursive=FALSE)` so this should be `files <- list.files(path="/home/username/directory_name/", pattern=glob2rx("*.Rda"), full.names=TRUE, recursive=FALSE)` – hannes101 Aug 23 '17 at 12:11
  • Thank you very much! I indeed had files with accented characters. After renaming them such that they do not contain accented characters anymore, it works. Using pattern = glob2rx("*.Rda") still results in the error for the files with accented characters. The second comma in the command was a copy and paste error from me, in the code I tried to execute there was no second comma. – stefan.braun Aug 23 '17 at 12:28
  • Does simply renaming my files suffice or should I be worried that there is still something wrong? – stefan.braun Aug 23 '17 at 12:34
  • Perhaps you could also play around with the locale settings. Since I suspect this concerns German umlauts, it might help to change the locale settings of the R session. https://stackoverflow.com/questions/24245094/import-csv-files-containing-german-umlauts-into-r#24279538 – hannes101 Aug 23 '17 at 13:07
  • Sorry for taking some time to get back to it. The answer from @HongOoi solved my problem. If you could post that as an answer I will mark it as solving my problem. Thanks! – stefan.braun Sep 25 '17 at 13:39
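The point raised in the comments, that `pattern` is a regular expression rather than a shell glob, can be illustrated with a small sketch. The file names below are hypothetical and only serve the example; `glob2rx()` comes from the `utils` package, and the hand-written regex is one possible equivalent, not the only one.

```r
# Hypothetical file names, used only for illustration.
candidates <- c("model.Rda", "model.Rda.bak", "notes_Rda.txt", "data.rda")

# glob2rx() translates a shell-style glob into an anchored regular expression.
glob_regex <- utils::glob2rx("*.Rda")
print(glob_regex)

# A hand-written alternative: a literal dot, "Rda", then end of string.
explicit_regex <- "\\.Rda$"

# Both patterns select the same names from the hypothetical list.
matched_glob <- grepl(glob_regex, candidates)
matched_explicit <- grepl(explicit_regex, candidates)
print(candidates[matched_glob])
```

Either pattern could then be passed as the `pattern` argument of `list.files()`. As noted in the thread, this alone does not avoid the error when file names contain accented characters.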

1 Answer


Your files have names containing accented characters. Changing them to pure-ASCII names should fix the problem.

For example:

unicode_and_raw_filename = paste0("/tmp/\u1234", as.raw("A"))
training_rows <<- read.csv(unicode_and_raw_filename, header = FALSE)

produces:

'translateCharUTF8' must be called on a CHARSXP, but got 'raw'

It looks like R's internals have some spaghetti code in the charset conversion for Latin2, ISO_8859-2, UTF-8, and CP1252: https://stat.ethz.ch/R-manual/R-devel/library/base/html/iconv.html
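To find which names would need changing, a minimal sketch along the following lines might help. The file names are hypothetical stand-ins for what `list.files()` would return, the `"_"` replacement is an arbitrary choice, and the sketch assumes the names are UTF-8 encoded.

```r
# Hypothetical file names; in practice these would come from list.files().
file_names <- c("results.Rda", "sch\u00f6n.Rda", "caf\u00e9.Rda")

# A name counts as non-ASCII if converting it to ASCII fails (iconv gives NA).
is_non_ascii <- is.na(iconv(file_names, from = "UTF-8", to = "ASCII"))
print(file_names[is_non_ascii])

# Propose pure-ASCII names: each byte that cannot be converted becomes "_".
ascii_names <- iconv(file_names, from = "UTF-8", to = "ASCII", sub = "_")
print(ascii_names)

# The actual rename is left commented out, since it modifies files on disk:
# file.rename(file_names[is_non_ascii], ascii_names[is_non_ascii])
```

Because `sub` replaces bytes rather than characters, a single accented letter may turn into more than one underscore; choosing more readable replacements by hand is just as valid.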

Eric Leschinski
Hong Ooi