My dataset contains the following dates, which are currently stored as a factor:

Interview_Date = c("Monday 23rd May 2005", "Tuesday 24th May 2005", 
                   "Wednesday 25th May 2005", "Thursday 26th May 2005",
                   "Friday 27th May 2005", "Saturday 28th May 2005",
                   "Sunday 29th May 2005", "Monday 30th May 2005",
                   "Tuesday 31st May 2005", "Wednesday 1st June 2005",
                   "Thursday 2nd June 2005", "Friday 3rd June 2005",
                   "Saturday 4th June 2005", "Sunday 5th June 2005")

I'm having trouble converting them into dates. When I tried

as.Date(dataframe$Interview_Date, format = "%A%d%B%Y")

the result was "NA". I need the column to be recognised as dates so that I can create a boxplot:

boxplot(EU_Opinion ~ Interview_Date,
    data = dataframe,
    xlab = "Date",
    ylab = "EU Opinion") 

This currently doesn't work because Interview_Date is a factor variable. What should I do? Or is there another way to create the boxplot?

Paul H

3 Answers

You can remove the ordinal suffix (st, nd, rd, th) from each string and then convert the result with as.Date(). No format code matches the suffix, which is why the original call returned NA.

as.Date(sub("(?<=\\d)\\D+?\\b", "", x, perl = TRUE), "%A %d %B %Y")

# [1] "2005-05-23" "2005-05-24" "2005-05-25" "2005-05-26" "2005-05-27" "2005-05-28" "2005-05-29"
# [8] "2005-05-30" "2005-05-31" "2005-06-01" "2005-06-02" "2005-06-03" "2005-06-04" "2005-06-05"

The format codes are:

  • %A : Full weekday name in the current locale. (Also matches abbreviated name on input.)
  • %d : Day of the month as decimal number (01–31).
  • %B : Full month name in the current locale. (Also matches abbreviated name on input.)
  • %Y : Year with century.
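
Because %A and %B are matched against the current locale, the same call can return NA on a machine whose LC_TIME is not English. A minimal sketch of one way to guard against that, assuming the strings always use English names; the narrower regex, the temporary switch to the "C" locale, and the variable names are illustrative choices, not part of the answer above:

```r
# Illustrative subset of the dates from the question, stored as a factor
x <- factor(c("Monday 23rd May 2005", "Wednesday 1st June 2005",
              "Thursday 2nd June 2005", "Friday 3rd June 2005"))

# %A and %B use the current locale's names, so switch LC_TIME to "C"
# (English names) for the duration of the parse
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

# Drop only a literal st/nd/rd/th that directly follows a digit
cleaned <- sub("(?<=\\d)(st|nd|rd|th)\\b", "", as.character(x), perl = TRUE)
dates <- as.Date(cleaned, format = "%A %d %B %Y")

Sys.setlocale("LC_TIME", old_locale)  # restore the previous setting

# Fail loudly if any string did not parse
stopifnot(!anyNA(dates))
format(dates)
# [1] "2005-05-23" "2005-06-01" "2005-06-02" "2005-06-03"
```

The explicit as.character() is not strictly required, but it makes the factor-to-string step visible.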

Data

x <- c("Monday 23rd May 2005", "Tuesday 24th May 2005", "Wednesday 25th May 2005", "Thursday 26th May 2005", "Friday 27th May 2005", "Saturday 28th May 2005","Sunday 29th May 2005", "Monday 30th May 2005", "Tuesday 31st May 2005", "Wednesday 1st June 2005", "Thursday 2nd June 2005", "Friday 3rd June 2005", "Saturday 4th June 2005", "Sunday 5th June 2005")
Darren Tsai

Using lubridate, whose dmy() parses these strings directly, ordinal suffixes included:

library(tidyverse)
library(lubridate)
df <- data.frame(Interview_Date = c("Monday 23rd May 2005", "Tuesday 24th May 2005", "Wednesday 25th May 2005", "Thursday 26th May 2005", "Friday 27th May 2005", "Saturday 28th May 2005","Sunday 29th May 2005", "Monday 30th May 2005", "Tuesday 31st May 2005", "Wednesday 1st June 2005", "Thursday 2nd June 2005", "Friday 3rd June 2005", "Saturday 4th June 2005", "Sunday 5th June 2005"))
df <- df %>% 
  mutate(new_interview_Date = dmy(Interview_Date))
glimpse(df)
# Rows: 14
# Columns: 2
# $ Interview_Date     <fct> Monday 23rd May 2005, Tuesday 24th May 2005, Wednesday 25th ...
# $ new_interview_Date <date> 2005-05-23, 2005-05-24, 2005-05-25, 2005-05-26, 2005-05-27,...
user63230

You could use gsub() with a regular expression to strip the ordinal suffix before parsing.

as.Date(gsub("(.*\\d)\\D{1,2}(.*)", "\\1\\2", x), format="%A %e %B %Y")
# [1] "2005-05-23" "2005-05-24" "2005-05-25" "2005-05-26" "2005-05-27" "2005-05-28"
# [7] "2005-05-29" "2005-05-30" "2005-05-31" "2005-06-01" "2005-06-02" "2005-06-03"
# [13] "2005-06-04" "2005-06-05"
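
To tie this back to the boxplot in the question, here is a rough end-to-end sketch. The EU_Opinion values are made up purely for illustration, and the code assumes English day and month names (hence the LC_TIME setting). With the raw strings as a factor the boxes are ordered alphabetically ("Friday ..." first); once the column is a Date they are ordered chronologically.

```r
# Assumption: the strings use English names, so parse under the "C" locale
Sys.setlocale("LC_TIME", "C")

# Hypothetical data: two made-up EU_Opinion scores per interview date
dataframe <- data.frame(
  Interview_Date = factor(rep(c("Friday 3rd June 2005",
                                "Monday 23rd May 2005",
                                "Tuesday 24th May 2005"), each = 2)),
  EU_Opinion = c(4, 6, 3, 5, 2, 7)
)

# Strip the ordinal suffix, then parse the cleaned strings as dates
cleaned <- gsub("(.*\\d)\\D{1,2}(.*)", "\\1\\2",
                as.character(dataframe$Interview_Date))
dataframe$Interview_Date <- as.Date(cleaned, format = "%A %e %B %Y")

# The formula interface groups by the distinct dates, in chronological order
boxplot(EU_Opinion ~ Interview_Date,
        data = dataframe,
        xlab = "Date",
        ylab = "EU Opinion")
```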

Data:

x <- c("Monday 23rd May 2005", "Tuesday 24th May 2005", "Wednesday 25th May 2005", "Thursday 26th May 2005", "Friday 27th May 2005", "Saturday 28th May 2005","Sunday 29th May 2005", "Monday 30th May 2005", "Tuesday 31st May 2005", "Wednesday 1st June 2005", "Thursday 2nd June 2005", "Friday 3rd June 2005", "Saturday 4th June 2005", "Sunday 5th June 2005")
jay.sf