The format= string is correct, but I think maydin's link to a question whose answers include Sys.setlocale(.) is likely going to be the best approach. It would help others reproduce your working environment if you posted what Sys.getlocale("LC_TIME") returns.
Here's some sample data, and my guess at the locale:
d <- structure(list(rok = c("2017", "2018", "2018", "2019"), mies = c("sty", "lut", "wrz", "gru")), class = "data.frame", row.names = c(NA, -4L))
d
# rok mies
# 1 2017 sty
# 2 2018 lut
# 3 2018 wrz
# 4 2019 gru
Sys.setlocale("LC_TIME", "Polish")
# [1] "Polish_Poland.1250"
as.Date(paste(d$rok, d$mies, "01", sep = "-"), format = "%Y-%b-%d")
# [1] "2017-01-01" "2018-02-01" "2018-09-01" "2019-12-01"
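If you would rather not change the session's locale permanently, the Sys.setlocale(.) approach can be wrapped so that the previous LC_TIME is restored afterwards. This is only a minimal sketch: parse_with_locale is a hypothetical helper name, not something from base R, and locale names are platform-dependent ("Polish" is a Windows-style name; on Linux or macOS it is typically something like "pl_PL.UTF-8", if installed at all).

```r
# Hypothetical helper (not part of base R): parse dates under a temporary
# LC_TIME setting, then restore whatever was set before.
parse_with_locale <- function(x, format, locale) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) when the locale is unavailable
  new <- suppressWarnings(Sys.setlocale("LC_TIME", locale))
  if (identical(new, "")) stop("locale not available: ", locale)
  as.Date(x, format = format)
}

# The "C" locale exists everywhere and uses English month abbreviations
parse_with_locale("2017-Jan-01", "%Y-%b-%d", "C")
```

For the sample data above, the call would presumably be parse_with_locale(paste(d$rok, d$mies, "01", sep = "-"), "%Y-%b-%d", "Polish"), with the locale name adjusted to whatever your system accepts.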
As a workaround, or if the locale approach does not work for some of the months, you can make a vector of all of your abbreviated months and then match your $mies against it. I'm going to guess Polish (informed via https://web.library.yale.edu/cataloging/months), which leads me to:
month.abb.polish <- c("sty", "lut", "mar", "kwi", "maj", "cze", "lip", "sie", "wrz", "paź", "lis", "gru") # note 1
### this portion is just to prove that it works even if locale is wrong
Sys.setlocale("LC_TIME", "English_United States.1252")
# [1] "English_United States.1252"
as.Date(paste(d$rok, d$mies, "01", sep = "-"), format = "%Y-%b-%d")
# [1] NA NA NA NA
### this is the workaround
paste(d$rok, match(d$mies, month.abb.polish), "01", sep = "-")
# [1] "2017-1-01" "2018-2-01" "2018-9-01" "2019-12-01"
as.Date(paste(d$rok, match(d$mies, month.abb.polish), "01", sep = "-"), format = "%Y-%m-%d")
# [1] "2017-01-01" "2018-02-01" "2018-09-01" "2019-12-01"
Note:
I do not speak Polish, so my guess at "paź" (and any of the other months) may not be quite right; in fact, I may have inferred incorrectly and this is a different language altogether ... my apologies. The point of the "workaround" part of the answer, though, is that it doesn't matter what locale is set, nor whether the abbreviations are technically correct: all it requires is that you know which abbreviations in the dataset correspond to which month-numbers. This solution is one-for-one, so if the data is inconsistent and uses different abbreviations for the same month, a different approach would be necessary.
Multiple candidates could be handled as a lookup table. This version of month.abb.polish has more than one candidate for some month-numbers (I arbitrarily added "jan" and "feb" as two English variants, to fill it out, plus provided two abbreviations for "październik").
month.abb.polish <- c("sty"=1, "jan"=1, "lut"=2, "feb"=2, "mar"=3, "kwi"=4, "maj"=5, "cze"=6, "lip"=7, "sie"=8, "wrz"=9, "paz"=10, "paź"=10, "lis"=11, "gru"=12)
month.abb.polish[d$mies]
# sty lut wrz gru
# 1 2 9 12
paste(d$rok, month.abb.polish[d$mies], "01", sep = "-")
# [1] "2017-1-01" "2018-2-01" "2018-9-01" "2019-12-01"
as.Date(paste(d$rok, month.abb.polish[d$mies], "01", sep = "-"), format = "%Y-%m-%d")
# [1] "2017-01-01" "2018-02-01" "2018-09-01" "2019-12-01"
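One thing to watch with either lookup: an abbreviation that is missing from the table silently becomes NA. Below is a rough sketch of a wrapper that normalises case and whitespace and stops on anything unrecognised. The names months_to_date and month_lookup are hypothetical, and the table is deliberately shortened to a few entries for illustration.

```r
# Hypothetical, shortened lookup table (abbreviation -> month number)
month_lookup <- c("sty" = 1, "jan" = 1, "lut" = 2, "feb" = 2, "wrz" = 9, "gru" = 12)

# Hypothetical helper: normalise the abbreviations, look them up, and
# fail loudly if any of them is missing from the table.
months_to_date <- function(year, abb, lookup) {
  key <- tolower(trimws(abb))
  num <- unname(lookup[key])
  unknown <- unique(key[is.na(num)])
  if (length(unknown) > 0) {
    stop("unrecognised month abbreviation(s): ", paste(unknown, collapse = ", "))
  }
  # zero-pad the month and give the format explicitly
  as.Date(sprintf("%s-%02d-01", year, as.integer(num)), format = "%Y-%m-%d")
}

months_to_date(c("2017", "2018"), c(" Sty", "FEB"), month_lookup)
# [1] "2017-01-01" "2018-02-01"
```

Whether to stop, warn, or just return NA for unknown abbreviations is a judgment call; stopping makes inconsistent data visible early.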