I'm trying to understand the vagaries of printing Unicode in R on Windows 7. I want to include a Unicode character in a data frame so that the ≥ symbol shows up in plots and in CSV exports.
Using RStudio, I can run:
library(dplyr)
library(ggplot2)
test <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), stringsAsFactors = FALSE)
test
# Build a text label for each value of x; "\u2265" is the escape for the ≥ symbol
test <- test %>% mutate(x2 = ifelse(x == 1, "1 day",
                             ifelse(x == 2, "2 days",
                             ifelse(x >= 3, "\u2265 3 days", NA))))
test
  x y       x2
1 1 4    1 day
2 2 5   2 days
3 3 6 = 3 days
However, if I do
table(test$x2)
I get
≥ 3 days    1 day   2 days
       1        1        1
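Since `table()` shows the symbol but printing the data frame does not, my guess is that the character is stored correctly and only gets lost when it is converted for display. Here is a minimal sketch of how one could check this, assuming the string is created with the `\u2265` escape as above (the variable names `label`, `enc` and `first_cp` are just for illustration):

```r
# Hypothetical check: is the greater-than-or-equal character actually stored?
label <- "\u2265 3 days"

# Declared encoding of the string; strings built from \u escapes are marked UTF-8
enc <- Encoding(label)

# Unicode code point of the first character (U+2265 is 8805 in decimal)
first_cp <- utf8ToInt(label)[1]

print(enc)
print(sprintf("U+%04X", first_cp))
```

If this reports UTF-8 and U+2265, the data itself is presumably intact, and the problem would lie in how the console, the graphics device, or the CSV writer converts it to the native Windows encoding.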
When I export the data frame to CSV and open the file in Excel, the ≥ symbol is shown as =. I also want to plot my data with ggplot2, using the ≥ symbol in an axis label. Unfortunately, when I run
ggplot(test, aes(x = x2, y = y)) + geom_point()
the label is drawn as '= 3 days' rather than '≥ 3 days'. I know that I can set the labels explicitly with
+ scale_x_discrete(labels = c("\U2265 3 days", "1 day", "2 days"))
and the symbol is then drawn correctly, but I'd much rather take the labels from the data frame itself.
I have three questions:
- How can I get the ≥ symbol to print in ggplot2 using entries from a data frame?
- How can I export a data frame to CSV and keep the ≥ symbol, rather than having it degrade to '='?
- What have I failed to understand about printing Unicode on Windows? (Optional, and potentially a lot.)
I understand that these problems don't arise on macOS or Linux, but unfortunately neither is an option at work.