0

I have a column that has data as such

df <- data.frame(request_time = c("2020-12-31 13:05:00", NULL, "2020-11-14 02:04:01")

I want to split the request_time column to extract just the the date. Hoping to have a new column called request_date.

I was trying to do the following:

df$request_date <- as.Date(df$request_time) 

But this returns an error saying the "character string is not in a standard unambioguous format" Im assuming dates due to the NULLS present. How can I get past this?

Sanam
  • 93
  • 7
  • 3
    Are you sure your values are NULL? Because you can't have a NULL value in your vector. If you run your sample code (after adding a missing closing parenthesis), you see it only has two rows. Did you mean NA values rather than NULL? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input that can be used to test and verify possible solutions. – MrFlick Oct 12 '21 at 19:33

3 Answers3

1

We could use as_date function together with ymd from lubridate package:

library(dplyr)
library(lubridate)
df %>% 
  mutate(request_time = ymd(as_date(request_time)))

output:

  request_time
1   2020-12-31
2   2020-11-14

OR

library(tidyr)
df %>% 
  separate(request_time, c("date", "time"), sep=" ", remove = FALSE)
        request_time       date     time
1 2020-12-31 13:05:00 2020-12-31 13:05:00
2 2020-11-14 02:04:01 2020-11-14 02:04:01
TarJae
  • 72,363
  • 6
  • 19
  • 66
0

Just extract the date using str_extract:

library(stringr)
library(dplyr)
f %>%
  mutate(request_time = str_extract(request_time, "[0-9-]+"))
  request_time
1   2020-12-31
2   2020-11-14

in base R:

f$request_time <- str_extract(f$request_time, "[0-9-]+")
Chris Ruehlemann
  • 20,321
  • 4
  • 12
  • 34
0

The NULL part is not clear. If it is a string "NULL", the as.Date should return NA. Otherwise, NULL cannot exist as such. It may be a list column (which is not clear)

df$request_time <- as.Date(df$request_time)

-output

df$request_time
[1] "2020-12-31" NA           "2020-11-14"

data

df <- data.frame(request_time = c("2020-12-31 13:05:00", "NULL", "2020-11-14 02:04:01"))
akrun
  • 874,273
  • 37
  • 540
  • 662