Given a sample CSV file (downloadable from here), I read it with the code below:

df <- read.csv(file = './report1.csv', header = TRUE)
head(df, 5)

Out: [image: first five rows of df]

As you can see, the date column is in the format %d %B %Y (based on the reference table from this link), but its data type is factor.

So I tried as.Date(date, format = '%d %B %Y') to convert it to a Date.

library(dplyr)      # provides %>% and mutate()
library(lubridate)  # provides dmy()

df %>%
  mutate(date = as.Date(date, format = '%d %B %Y'))
  # mutate(date = dmy(date))

But as you can see, every value in that column becomes NA. Meanwhile, mutate(date = dmy(date)) works well.

[image: output with NA in the date column]

Could someone explain what is wrong with the as.Date(...) call? Thanks.
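A quick diagnostic for this kind of NA result is to look at which month names the session itself uses for %B. The following is only a sketch in base R; the variable names (lc_time, march_here, roundtrip) are illustrative and not part of the original code.

```r
# Which locale governs date parsing and formatting in this session?
lc_time <- Sys.getlocale("LC_TIME")
print(lc_time)

# The month name that %B produces in this locale. If it is not "March",
# an English string such as "13 March 2020" will not match %B.
march_here <- format(as.Date("2020-03-13"), "%B")
print(march_here)

# Round trip: a string built with this locale's own month name parses back.
roundtrip <- as.Date(format(as.Date("2020-03-13"), "%d %B %Y"),
                     format = "%d %B %Y")
print(roundtrip)
```

If march_here is anything other than "March", the format string is fine and the mismatch is between the English month names in the data and the session's LC_TIME setting.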

Data:

dput(head(df, 5))

Out:

structure(list(date = structure(c(31L, 43L, 55L, 67L, 79L), .Label = c("1 April 2020", 
"1 August 2020", "1 December 2020", "1 February 2021", "1 January 2021", 
"1 July 2020", "1 June 2020", "1 March 2021", "1 May 2020", "1 November 2020", 
"1 October 2020", "1 September 2020", "10 April 2020", "10 August 2020", 
"10 December 2020", "10 February 2021", "10 January 2021", "10 July 2020", 
"10 June 2020", "10 May 2020", "10 November 2020", "10 October 2020", 
"10 September 2020", "11 April 2020", "11 August 2020", "11 December 2020", 
"11 February 2021", "11 January 2021", "11 July 2020", "11 June 2020", 
"11 March 2020", "11 May 2020", "11 November 2020", "11 October 2020", 
"11 September 2020", "12 April 2020", "12 August 2020", "12 December 2020", 
"12 February 2021", "12 January 2021", "12 July 2020", "12 June 2020", 
"12 March 2020", "12 May 2020", "12 November 2020", "12 October 2020", 
"12 September 2020", "13 April 2020", "13 August 2020", "13 December 2020", 
"13 February 2021", "13 January 2021", "13 July 2020", "13 June 2020", 
"13 March 2020", "13 May 2020", "13 November 2020", "13 October 2020", 
"13 September 2020", "14 April 2020", "14 August 2020", "14 December 2020", 
"14 February 2021", "14 January 2021", "14 July 2020", "14 June 2020", 
"14 March 2020", "14 May 2020", "14 November 2020", "14 October 2020", 
"14 September 2020", "15 April 2020", "15 August 2020", "15 December 2020", 
"15 February 2021", "15 January 2021", "15 July 2020", "15 June 2020", 
"15 March 2020", "15 May 2020", "15 November 2020", "15 October 2020", 
"15 September 2020", "16 April 2020", "16 August 2020", "16 December 2020", 
"16 February 2021", "16 January 2021", "16 July 2020", "16 June 2020", 
"16 March 2020", "16 May 2020", "16 November 2020", "16 October 2020", 
"16 September 2020", "17 April 2020", "17 August 2020", "17 December 2020", 
"17 February 2021", "17 January 2021", "17 July 2020", "17 June 2020", 
"17 March 2020", "17 May 2020", "17 November 2020", "17 October 2020", 
"17 September 2020", "18 April 2020", "18 August 2020", "18 December 2020", 
"18 February 2021", "18 January 2021", "18 July 2020", "18 June 2020", 
"18 March 2020", "18 May 2020", "18 November 2020", "18 October 2020", 
"18 September 2020", "19 April 2020", "19 August 2020", "19 December 2020", 
"19 February 2021", "19 January 2021", "19 July 2020", "19 June 2020", 
"19 March 2020", "19 May 2020", "19 November 2020", "19 October 2020", 
"19 September 2020", "2 April 2020", "2 August 2020", "2 December 2020", 
"2 February 2021", "2 January 2021", "2 July 2020", "2 June 2020", 
"2 March 2021", "2 May 2020", "2 November 2020", "2 October 2020", 
"2 September 2020", "20 April 2020", "20 August 2020", "20 December 2020", 
"20 February 2021", "20 January 2021", "20 July 2020", "20 June 2020", 
"20 March 2020", "20 May 2020", "20 November 2020", "20 October 2020", 
"20 September 2020", "21 April 2020", "21 August 2020", "21 December 2020", 
"21 February 2021", "21 January 2021", "21 July 2020", "21 June 2020", 
"21 March 2020", "21 May 2020", "21 November 2020", "21 October 2020", 
"21 September 2020", "22 April 2020", "22 August 2020", "22 December 2020", 
"22 February 2021", "22 January 2021", "22 July 2020", "22 June 2020", 
"22 March 2020", "22 May 2020", "22 November 2020", "22 October 2020", 
"22 September 2020", "23 April 2020", "23 August 2020", "23 December 2020", 
"23 February 2021", "23 January 2021", "23 July 2020", "23 June 2020", 
"23 March 2020", "23 May 2020", "23 November 2020", "23 October 2020", 
"23 September 2020", "24 April 2020", "24 August 2020", "24 December 2020", 
"24 February 2021", "24 January 2021", "24 July 2020", "24 June 2020", 
"24 March 2020", "24 May 2020", "24 November 2020", "24 October 2020", 
"24 September 2020", "25 April 2020", "25 August 2020", "25 December 2020", 
"25 February 2021", "25 January 2021", "25 July 2020", "25 June 2020", 
"25 March 2020", "25 May 2020", "25 November 2020", "25 October 2020", 
"25 September 2020", "26 April 2020", "26 August 2020", "26 December 2020", 
"26 February 2021", "26 January 2021", "26 July 2020", "26 June 2020", 
"26 March 2020", "26 May 2020", "26 November 2020", "26 October 2020", 
"26 September 2020", "27 April 2020", "27 August 2020", "27 December 2020", 
"27 February 2021", "27 January 2021", "27 July 2020", "27 June 2020", 
"27 March 2020", "27 May 2020", "27 November 2020", "27 October 2020", 
"27 September 2020", "28 April 2020", "28 August 2020", "28 December 2020", 
"28 February 2021", "28 January 2021", "28 July 2020", "28 June 2020", 
"28 March 2020", "28 May 2020", "28 November 2020", "28 October 2020", 
"28 September 2020", "29 April 2020", "29 August 2020", "29 December 2020", 
"29 January 2021", "29 July 2020", "29 June 2020", "29 March 2020", 
"29 May 2020", "29 November 2020", "29 October 2020", "29 September 2020", 
"3 April 2020", "3 August 2020", "3 December 2020", "3 February 2021", 
"3 January 2021", "3 July 2020", "3 June 2020", "3 March 2021", 
"3 May 2020", "3 November 2020", "3 October 2020", "3 September 2020", 
"30 April 2020", "30 August 2020", "30 December 2020", "30 January 2021", 
"30 July 2020", "30 June 2020", "30 March 2020", "30 May 2020", 
"30 November 2020", "30 October 2020", "30 September 2020", "31 August 2020", 
"31 December 2020", "31 January 2021", "31 July 2020", "31 March 2020", 
"31 May 2020", "31 October 2020", "4 April 2020", "4 August 2020", 
"4 December 2020", "4 February 2021", "4 January 2021", "4 July 2020", 
"4 June 2020", "4 March 2021", "4 May 2020", "4 November 2020", 
"4 October 2020", "4 September 2020", "5 April 2020", "5 August 2020", 
"5 December 2020", "5 February 2021", "5 January 2021", "5 July 2020", 
"5 June 2020", "5 March 2021", "5 May 2020", "5 November 2020", 
"5 October 2020", "5 September 2020", "6 April 2020", "6 August 2020", 
"6 December 2020", "6 February 2021", "6 January 2021", "6 July 2020", 
"6 June 2020", "6 March 2021", "6 May 2020", "6 November 2020", 
"6 October 2020", "6 September 2020", "7 April 2020", "7 August 2020", 
"7 December 2020", "7 February 2021", "7 January 2021", "7 July 2020", 
"7 June 2020", "7 March 2021", "7 May 2020", "7 November 2020", 
"7 October 2020", "7 September 2020", "8 April 2020", "8 August 2020", 
"8 December 2020", "8 February 2021", "8 January 2021", "8 July 2020", 
"8 June 2020", "8 May 2020", "8 November 2020", "8 October 2020", 
"8 September 2020", "9 April 2020", "9 August 2020", "9 December 2020", 
"9 February 2021", "9 January 2021", "9 July 2020", "9 June 2020", 
"9 May 2020", "9 November 2020", "9 October 2020", "9 September 2020"
), class = "factor"), immobility = c(0.407428566, 1.553824969, 
3.033190503, 4.950250769, 7.40514975), cases = c(4.753590191, 
5.241747015, 5.690359454, 5.924255797, 6.150602768), icu = c(4.407936176, 
4.543724109, 4.72143235, 5.005429015, 5.280731862), deaths = c(0.922324625, 
1.315709119, 1.69429665, 2.057554016, 2.403912035)), row.names = c(NA, 
5L), class = "data.frame")
ah bon
  • Can you provide a `dput(head(df,20))` of your data ? – Benson_YoureFired Apr 29 '22 at 14:57
  • I can't seem to recreate this problem when I download the data from your provided link. Date imports as a character and correctly reformats to a date (even when I force it to a factor, then run the code). – jpsmith Apr 29 '22 at 14:58
  • I guess this is a `locale` issue: the month names in your local `locale` are not January, February, March, etc. – Darren Tsai Apr 29 '22 at 15:09
  • You have to keep in mind that converting month names depends on the locale. For example, as I'm running on a German locale, `as.Date("13 March 2020", format = '%d %B %Y')` returns NA, while using the German `as.Date("13 März 2020", format = '%d %B %Y')` gives the correct "2020-03-13". To check your locale you could do `Sys.getlocale("LC_TIME")`. A possible fix would be to switch to an English locale for converting the dates. – stefan Apr 29 '22 at 15:12
  • `Sys.getlocale("LC_TIME")` returns `"zh_CN.UTF-8"` for my laptop. – ah bon Apr 29 '22 at 15:43
  • I just added `dput(head(df, 5))`, @Benson_YoureFired – ah bon Apr 29 '22 at 15:46
  • [strptime, as.POSIXct and as.Date return unexpected NA](https://stackoverflow.com/questions/13726894/strptime-as-posixct-and-as-date-return-unexpected-na) – Henrik Apr 29 '22 at 15:55
  • `df %>% mutate(date=strptime(date, '%d %B %Y'))`? It returns `NA`s for the whole column – ah bon Apr 29 '22 at 16:04
  • ahbon, what happens when you follow the code in Henrik's linked answer? Something like `loc <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C") ; head(as.Date(df$date, format="%d %B %Y")); Sys.setlocale("LC_TIME", loc)` – r2evans Apr 29 '22 at 19:56
  • It works, but is it possible to set it in pipe? @r2evans – ah bon Apr 30 '22 at 00:09
  • try `... %>% withr::with_locale(new=c(LC_TIME="C"), as.Date(as.Date(date, format = '%d %B %Y')))` – r2evans Apr 30 '22 at 00:19
  • I get an error with your code. It seems we should use: `withr::with_locale(new=c(LC_TIME="C"), df %>% mutate(date=as.Date(date, format="%d %B %Y")))` – ah bon Apr 30 '22 at 00:59
  • `df %>% withr::with_locale(new=c(LC_TIME="C"), as.Date(as.Date(date, format = '%d %B %Y')))` returns `Error in withr::with_locale(., new = c(LC_TIME = "C"), as.Date(as.Date(date, : The parameter is not used (as.Date(as.Date(date, format = "%d %B %Y")))`. – ah bon Apr 30 '22 at 01:00
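The locale switch discussed in the comments can also be wrapped in a small function so that it can be called inside `mutate()`. This is a minimal sketch using base R only; `parse_english_date` is a hypothetical helper name, not something from the thread or from any package, and it assumes the month names in the data are English.

```r
# Hypothetical helper (not from the thread): parse English "day month year"
# strings regardless of the session's LC_TIME setting.
parse_english_date <- function(x, format = "%d %B %Y") {
  old <- Sys.getlocale("LC_TIME")
  # Restore the caller's locale when the function exits, even on error.
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  # The "C" locale uses English month names for %B.
  Sys.setlocale("LC_TIME", "C")
  # as.character() makes the factor-to-string step explicit.
  as.Date(as.character(x), format = format)
}

# Small stand-in for the factor column in the question.
dates <- factor(c("11 March 2020", "12 March 2020", "1 April 2020"))
parsed <- parse_english_date(dates)
print(parsed)
```

With dplyr loaded, this would be used as `df %>% mutate(date = parse_english_date(date))`. The `withr::with_locale(new = c(LC_TIME = "C"), df %>% mutate(...))` form from the comment above does the same save-and-restore around the whole pipeline instead of a single column.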
