I want to generate UTF-8 HTML output from a data frame using kable. I know there are many similar questions on Stack Overflow, but I still can't find a solution to this problem. For example:
library(knitr)
kable("ب", format = "html")
generates:
<table>
<thead>
<tr>
<th style="text-align:left;"> x </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;"> <U+0628> </td>
</tr>
</tbody>
</table>
R is running on Windows with the following session info:
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.38
loaded via a namespace (and not attached):
[1] compiler_4.0.3 tools_4.0.3 highr_0.8 xfun_0.30
and the following system locale:
Sys.getlocale()
[1] "LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY=English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252"
I've tried setting my locale to "en_US.UTF-8", but it seems this isn't supported on Windows. I also tried Sys.setlocale("LC_CTYPE", "arabic"), but it didn't help.
I know how to convert the text in the table to HTML numeric character references (like &#xxxx;), but this makes for an awkward HTML file.
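For reference, the workaround I mean looks roughly like the sketch below. The helper name to_html_entities is my own (hypothetical, not part of knitr), and I'm assuming that passing escape = FALSE to kable is enough to stop it from re-escaping the ampersands. I write the character as "\u0628" here so the example does not depend on the encoding of the source file.

```r
# Hypothetical helper (not part of knitr): replace every non-ASCII character
# in each string with an HTML numeric character reference such as &#x628;
to_html_entities <- function(x) {
  vapply(x, function(s) {
    cp <- utf8ToInt(enc2utf8(s))                # Unicode code points of the string
    ascii <- intToUtf8(cp, multiple = TRUE)     # one single-character string per code point
    refs <- sprintf("&#x%X;", cp)               # hexadecimal character references
    paste(ifelse(cp > 127, refs, ascii), collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

escaped <- to_html_entities("\u0628")           # the Arabic letter from the example above
print(escaped)

# escape = FALSE keeps kable from turning "&" into "&amp;"
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(escaped, format = "html", escape = FALSE))
}
```

This does produce a table that browsers render correctly, but every non-ASCII cell ends up as a string of references instead of readable text, which is what I'd like to avoid.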
Is there a good solution for this? Or is it better to use a non-Windows system for working with UTF-8?