2

I have multiple character columns (around 20) that I would like to change all to date formats and drop the time using r. I've tried loops, mutate and apply.

Here is some sample data using just two columns

col1 = c("2017-04-01 23:00:00", "2017-03-03 00:00:01", "2017-04-02 
00:00:01")
col2 = c("2017-04-10 08:41:49", "2017-04-10 08:39:48", "2017-04-10 
08:41:51")
df <- cbind(col1, col2)

I've tried:

df <- df %>% mutate(df, funs(ymd))

and

df <- df %>% mutate(df, funs(mdy))

Both gave me an error. I've also tried putting all column names in a list and do a

for(i in namedlist) {
as_date(df[i])
glimpse(df)
}

That didn't work either.

I've tried to use the answer from Convert multiple columns to dates with lubridate and dplyr and that did not work either. That posts wanted certain variables to be converted. I want all of my variables to be converted so the var command doesn't apply.

Any suggestions to do this efficiently? Thank you.

Jordan
  • 1,415
  • 3
  • 18
  • 44
  • 1
    Check `mutate_all` or `mutate_at`. – AntoniosK Jan 29 '18 at 12:25
  • So from duped link, change the `mutate_at` to `mutate_all` – Sotos Jan 29 '18 at 12:59
  • @Sotos. It doesn't drop the time like my question specified. – Jordan Jan 29 '18 at 13:03
  • That has nothing to do with your question! Try `data.frame(df) %>% mutate_all(funs(as.Date))` – Sotos Jan 29 '18 at 13:27
  • @Sotos. I'm not trying to make you emotional here. I thought I was clear. Convert strings to dates and drop the time. Says it in the first two lines of the original question. Thanks for the code. R states `Evaluation error: character string is not in a standard unambiguous format.` – Jordan Jan 29 '18 at 13:32
  • Emotional? Nahh... I even helped you with the code. I don't know where you got that impression ( I m assuming the exclamation mark on my previous comment, but that was just to emphasize the different problem you are facing). As for your new problem, It does not happen with the data you shared. Update your question with the new problem (and data that gives that problem) and I will reopen – Sotos Jan 29 '18 at 13:42
  • @Sotos. It was the exclamation. My apologies to you for assuming. I will update with new data. – Jordan Jan 29 '18 at 13:49
  • I reopened the question. I ll give it a shot as soon as you update – Sotos Jan 29 '18 at 14:14
  • @Sotos. I really appreciate you helping an r rookie. I can't seem to produce reproducible code that matches the error I have. I do think I have found a different way. I think I will delete this question to keep people from wasting there time on it. That ok with you? – Jordan Jan 29 '18 at 17:35

2 Answers2

3

If you're applying over all columns, you can do a very short call with lapply. I'll pass it here using data.table:

library( data.table )
setDT( df )

df <- df[ , lapply( .SD, as.Date ) ]

On your test data, this gives:

> df
         col1       col2
1: 2017-04-01 2017-04-10
2: 2017-03-03 2017-04-10
3: 2017-04-02 2017-04-10

NOTE: your test data actually results in a matrix, so you need to convert it to a data.frame first (or directly to a data.table).

You can do the same thing with just base R, but I personally like the above solution better:

df <- as.data.frame( lapply( df, as.Date ) )

> df
        col1       col2
1 2017-04-01 2017-04-10
2 2017-03-03 2017-04-10
3 2017-04-02 2017-04-10
rosscova
  • 5,430
  • 1
  • 22
  • 35
  • Thanks @rosscova. When I use that on the actual data which is a `data frame`, I get an error that states `Error: unsupported index type: list` – Jordan Jan 29 '18 at 12:49
  • Do you have a list column in your data frame? Check the output of `sapply(df,class)`, which should tell you the class of all columns. – rosscova Jan 29 '18 at 20:18
  • Thanks @rosscova. They are all characters. I found a way using Alteryx to reclass all of the columns into date formats though. Thank you for responding. – Jordan Jan 30 '18 at 12:55
  • You're welcome @Jordan. If the solution to your problem isn't shown here, you should post it as an answer (ie: answer your own question) so that others can find it too. – rosscova Jan 30 '18 at 20:40
  • Understand. But I used a workflow within the Alteryx software suite. I don't think I can attach the workflow and it if I could, the data would be too big to attach with it. Do you suggest I answer the question and describe the workflow? – Jordan Feb 01 '18 at 15:54
  • I see, you switched workflows entirely, that makes sense. You're probably right that putting an Alteryx workflow as an answer to an R question wouldn't be appropriate. Glad you got it working! – rosscova Feb 01 '18 at 23:26
2

EDIT: This time with the right wildcards for the as.Date function. I also added a reproducible example:

library(dplyr)

df <- data.frame(date_1 = c("2019-01-01", "2019-01-02", "2019-01-03"),
                 date_2 = c("2019-01-04", "2019-01-05", "2019-01-06"),
                 value = c(1,2,3),
                 stringsAsFactors = F)

str(df)

date_cols <- c("date_1", "date_2")

df_2 <- df %>%
   mutate_at(vars(date_cols), funs(as.Date(., "%Y-%m-%d")))

str(df_2)
AT90
  • 51
  • 7
  • Hey~are you sure your code works in your computer? I got lots of NA with your code. – Darren Tsai Mar 21 '19 at 13:33
  • My bad @DarrenTsai, I did not properly specify the wildcards for the `as.Date` function ("%Y-%m-%d"). The current answer should work :) – AT90 Mar 24 '19 at 10:42
  • seems `funs()` is soft deprecated, but you can use `mutate_at(vars(date_cols), list(~ as.Date(., "%Y-%m-%d")))` instead – Brian D Jan 08 '20 at 16:35