Background

I have a function that takes a Tibble as input (or a data.frame, it doesn't matter) and produces a custom Markdown table in a text file. The data comes from a RESTful API (AirTable, if you must know); is UTF-8 encoded; already contains Unicode characters (such as ¥ and €); and is processed into a Tibble via functions in the httr, jsonlite and tibble packages. I have confirmed via the base Encoding function that the data in the Tibble columns is UTF-8.

Edit: I am running R 3.5.1 on Windows 10.

Problem

When I use cat to print data in the Tibble out to a file, it works as expected. The currency symbols and any other crazy thing in the text are printed just fine. (Although, curiously, the resulting file encoding seems to be ANSI.)

However... when creating the Markdown table, I am attempting to translate a logical column as a blank string when FALSE and as a ☑ character when TRUE. This symbol is not in the data, so I need to write it in there with the function. However, it always literally prints out to the file as the string <U+2611>.
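A minimal sketch of the translation step described above, in case it helps to reproduce the issue. The names `done`, `check`, `cells`, `rows` and `out_file` are hypothetical stand-ins, not the actual function or data:

```r
# Hypothetical logical column standing in for the real Tibble column.
done <- c(TRUE, FALSE, TRUE)

# U+2611 is the checked-box character; FALSE becomes an empty string.
check <- "\u2611"
cells <- ifelse(done, check, "")

# One single-column Markdown table row per value.
rows <- paste0("| ", cells, " |")

# Write to a temporary file. On the setup described in the question
# (R 3.5.1 on Windows 10) this is the step where <U+2611> reportedly
# appears in the file instead of the character itself.
out_file <- tempfile(fileext = ".md")
cat(rows, file = out_file, sep = "\n")
```

Dropping the `file` argument from the `cat` call sends the same rows to the console, which is the case that reportedly works.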

The really curious thing is, if I tell cat to print to the console instead of the file, changing nothing else... it works. I'm bewildered.

What I Have Tried

First, I tried using the intToUtf8 function, passing in the decimal representation of the symbol (9745). I tried using this directly in the cat statement, and I also tried saving the result to a variable first and then passing that into the cat statement.

Then, I tried just directly copy-pasting the character into a string in the R file. As above, I tried passing it directly and indirectly via a variable.

Lastly, I read this: "Print unicode character string in R", and used an escaped Unicode sequence to insert the character. Again, I tried two ways (directly in the cat statement and indirectly as a variable), but the result is the same.
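For reference, a small sketch comparing two of the constructions described above. It suggests, without proving anything about the Windows behaviour, that the way the character is built is probably not what differs, since both produce the same UTF-8 marked string. The variable names are hypothetical:

```r
# Build the character from its decimal code point (9745 == 0x2611).
from_code_point <- intToUtf8(9745)

# Build the same character from an escaped Unicode sequence.
from_escape <- "\u2611"

# Both strings hold the same three UTF-8 bytes: e2 98 91.
bytes_code_point <- charToRaw(from_code_point)
bytes_escape     <- charToRaw(from_escape)

# Both are marked as UTF-8 by R.
enc_code_point <- Encoding(from_code_point)
enc_escape     <- Encoding(from_escape)

same_string <- identical(from_code_point, from_escape)
```

A literal pasted into the R file additionally depends on the encoding the source file is saved and read in, so it is left out of this comparison.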

(I have not tried the stringi package as suggested in the above answer, but I'm not having quite the same issue that individual was having so I'm not sure that'd fare any better.)

Zelbinian
  • I have chased down some answers re: how to get that cat function to do the proper thing with the unicode character I'm defining in the R file. (Basically, my Windows locale is getting in the way.) I had originally titled my question to be about WHY cat behaves differently with these sources, rather than focusing on the solution of encoding that one char, but was persuaded otherwise by SO's question suggestions :p But I think that's the interesting thing to solve here. – Zelbinian Nov 29 '18 at 22:01
  • You could try `Encoding(foo) <- "unknown"` where `foo` is the input to `cat()` when writing to a file. That should work if R options are set so that `getOption("encoding")` returns `"native.enc"`. – mvkorpel Dec 31 '18 at 11:02
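The suggestion in the last comment could be sketched roughly as follows. This is an untested illustration of the idea, not a confirmed fix; `foo` and `out_file` are hypothetical names, and the approach assumes `getOption("encoding")` returns `"native.enc"`, as the comment states:

```r
# The string that will be passed to cat(); here just the check mark.
foo <- "\u2611"

# The suggestion relies on the default connection encoding.
conn_encoding <- getOption("encoding")

# Remove the UTF-8 mark. The underlying bytes are left unchanged;
# R simply stops treating the string as declared UTF-8.
Encoding(foo) <- "unknown"

# Write to a temporary file, as in the question.
out_file <- tempfile(fileext = ".md")
cat(foo, file = out_file)

# Inspect the raw bytes that ended up in the file.
written <- readBin(out_file, what = "raw", n = 16)
```

If the idea works as the comment expects, `written` should contain the UTF-8 bytes of the character rather than the ASCII text `<U+2611>`.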
