I'm producing a set of graphs in two languages with ggplot2 (Hadley Wickham). I could produce them in two separate workflows by renaming variables inside the original dataset. Instead, I wish to modify a ggplot object: I wish to first produce the graphs in English and then translate the labels into French. How could/should I change the legend keys inside the ggplot object? And how can I then sort the legend keys?
The reason I am exploring this approach is that I would like my plot colours and symbols to be the same in English and French, while at the same time having the legend keys ordered alphabetically. The problem is that French and English legend keys do not have the same alphabetical order (Spain versus Espagne). Compare the legend keys obtained from the MWE: the legend keys are ordered alphabetically in the English legend, but incorrectly in the French legend.
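The mismatch in ordering can be seen without plotting anything. The following is a small base R sketch on a hypothetical four-country subset; the vector names `en` and `fr` are made up for this illustration and are not part of the MWE.

```r
# English names and their French translations, element by element
en <- c("Austria", "Finland", "Germany", "Spain")
fr <- c("Autriche", "Finlande", "Allemagne", "Espagne")

# Alphabetical order of the English names
en_sorted <- en[order(en)]

# The same countries, ordered by their French names instead
en_by_fr <- en[order(fr)]

print(en_sorted)
print(en_by_fr)
```

Germany moves to the front once the French names drive the ordering, which is why a legend built from the English factor levels looks unsorted in French.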
Replacing the xlab, ylab, ggtitle, and modifying the styles of the axes labels (e.g. number formatting) is rather straightforward, so my focus really is on the legend keys and their order of listing inside the legend.
A MWE with lots of names to illustrate the tediousness of having to copy names several times in the approach below (once to group, another time for colour, and again for shape, etc.):
df <- structure(list(year = c("2006", "2007", "2006", "2007", "2006",
"2007", "2006", "2007", "2006", "2007", "2006", "2007", "2006",
"2007", "2006", "2007", "2006", "2007", "2006", "2007", "2006",
"2007", "2006", "2007", "2006", "2007", "2006", "2007", "2006",
"2007", "2006", "2007", "2006", "2007", "2006", "2007", "2006",
"2007", "2006", "2007"), country = c("Australia", "Australia",
"Austria", "Austria", "Belgium", "Belgium", "Canada", "Canada",
"Denmark", "Denmark", "Finland", "Finland", "France", "France",
"Germany", "Germany", "Greece", "Greece", "Italy", "Italy", "Japan",
"Japan", "Netherlands", "Netherlands", "New Zealand", "New Zealand",
"Norway", "Norway", "Portugal", "Portugal", "Spain", "Spain",
"Sweden", "Sweden", "Switzerland", "Switzerland", "United Kingdom",
"United Kingdom", "United States", "United States"), value = c(33,
33, 33, 33, 30, 30, 34, 34, 30, 30, 33, 33, 28, 29, 27, 27, 40,
39, 35, 35, 35, 35, 27, 27, 33, 33, 27, 27, 37, 37, 32, 32, 31,
31, 32, 31, 32, 32, 33, 33)), .Names = c("year", "country", "value"
), row.names = c(NA, -40L), class = "data.frame")
library("ggplot2")
# English version: legend keys follow the alphabetical order of the factor levels
ggplot(data = df, aes(x = year, y = value, group = country, colour = country)) +
  geom_line(size = 0.5) + geom_point(size = 1)
ggsave("stackoverflow-1.png", plot = last_plot())
# French version: the labels are applied in the order of the English factor
# levels, so the legend keeps the English alphabetical order
ggplot(data = df,
       aes(x = year, y = value,
           group = factor(country, labels = c(
             "Australie", "Autriche", "Belgique", "Canada", "Danemark",
             "Finlande", "France", "Allemagne", "Grèce", "Italie", "Japon",
             "Pays-Bas", "Nouvelle-Zélande", "Norvège", "Portugal", "Espagne",
             "Suède", "Suisse", "Royaume-Uni", "États-Unis")),
           colour = factor(country, labels = c(
             "Australie", "Autriche", "Belgique", "Canada", "Danemark",
             "Finlande", "France", "Allemagne", "Grèce", "Italie", "Japon",
             "Pays-Bas", "Nouvelle-Zélande", "Norvège", "Portugal", "Espagne",
             "Suède", "Suisse", "Royaume-Uni", "États-Unis")))) +
  geom_line(size = 0.5) + geom_point(size = 1) +
  theme(legend.title = element_blank())
ggsave("stackoverflow-2.png", plot = last_plot())
I would like a method that does not break if I use only a subset of the variables (countries in the example). The most convenient and least error-prone approach would be to define a mapping like this:
list("A Cuckoo Land" = "Un Pays Idyllique", # This mapping is not used
"Australia" = "Australie",
"Austria" = "Autriche",
"Belgium" = "Belgique",
"Canada" = "Canada",
"Denmark" = "Danemark",
"Finland" = "Finlande",
"France" = "France",
"Germany" = "Allemagne",
"Greece" = "Grèce",
"Italy" = "Italie",
"Japan" = "Japon",
"Netherlands" = "Pays-Bas",
"New Zealand" = "Nouvelle-Zélande",
"Norway" = "Norvège",
"Portugal" = "Portugal",
"Spain" = "Espagne",
"Sweden" = "Suède",
"Switzerland" = "Suisse",
"United Kingdom" = "Royaume-Uni",
"United States" = "États-Unis")
and substitute, within the legend keys, every occurrence of the left-hand side with the right-hand side. (Even better if the method can handle a trilingual approach, e.g. a mapping like "Belgium" = c("Belgique", "Bélgica").)
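To make the intent concrete, here is a rough sketch of one possible direction, not a confirmed solution. It assumes that the `breaks` and `labels` arguments of `scale_colour_discrete()` can carry the translation and the reordering while the colours stay tied to the English factor levels. The names `en2fr`, `d`, `used`, `p_en` and `p_fr` are hypothetical, and `d` is a small stand-in for `df`.

```r
library("ggplot2")

# Hypothetical subset of the mapping above; the unused entry is harmless
en2fr <- c("A Cuckoo Land" = "Un Pays Idyllique",
           "Germany"       = "Allemagne",
           "Austria"       = "Autriche",
           "Spain"         = "Espagne",
           "Finland"       = "Finlande")

# Small stand-in for df containing only some of the countries
d <- data.frame(
  year    = rep(c("2006", "2007"), times = 4),
  country = rep(c("Austria", "Finland", "Germany", "Spain"), each = 2),
  value   = c(33, 33, 33, 33, 27, 27, 32, 32)
)

# English plot: colours are assigned from the English factor levels
p_en <- ggplot(d, aes(x = year, y = value, group = country, colour = country)) +
  geom_line() + geom_point()

# Keep only the mappings for countries present in the data,
# then order them by their French label
used <- en2fr[names(en2fr) %in% unique(d$country)]
used <- used[order(used)]

# breaks = English keys in French alphabetical order; labels = French text
p_fr <- p_en +
  scale_colour_discrete(breaks = names(used), labels = unname(used), name = NULL)
```

Two caveats apply to this sketch. `order()` on accented labels such as "États-Unis" depends on the collation locale, so the sort may need to run under a French locale. If `shape` is mapped as well, the matching shape scale would presumably need the same `breaks` and `labels` for the legends to stay merged. A trilingual variant could keep one such named vector per language.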