My data frame:

a1 <- c("a","a","b","b","c","d","e","e")
b2 <- c("01.01.2015", "02.02.2015", "14.02.2012", "16.08.2008", "17.06.2003", "31.01.2015", "07.01.2022", "09.05.2001")
c3 <- c("1a", "2b", "3c", "4d", "5e", "6f", "7g", "8h")
d3 <- c(1:8)

df2 <- data.frame(a1,b2,c3,d3, stringsAsFactors = F)

My code.

library(dplyr)
library(magrittr)

test <- df2 %>%
    group_by(a1) %>% 
    as.Date(b2, format = "%d.%m.%Y")

Error in as.Date.default(., b2, format = "%d.%m.%Y") : do not know how to convert '.' to class “Date”

Well, I tried without the pipe:

df$b2 <- as.Date(df$b2, format = "%d.%m.%Y")

Error in df$b2 : object of type 'closure' is not subsettable

First: why do I get two different error messages when, as far as I understand, I am doing the same thing in both cases?

Second: why can't I convert my column to a date?!

I should add that I am aware I can use mutate to convert the column to a date. But I wonder why my approach is not working.

four-eyes

1 Answer

Do the transformation within mutate:

df2 %>%
   group_by(a1) %>%
   mutate(b2=as.Date(b2, format = "%d.%m.%Y"))
#    a1         b2    c3    d3
#  (chr)     (date) (chr) (int)
#1     a 2015-01-01    1a     1
#2     a 2015-02-02    2b     2
#3     b 2012-02-14    3c     3
#4     b 2008-08-16    4d     4
#5     c 2003-06-17    5e     5
#6     d 2015-01-31    6f     6
#7     e 2022-01-07    7g     7
#8     e 2001-05-09    8h     8
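
As for why the original pipe fails, here is a minimal sketch, assuming a cut-down version of `df2` with only two columns. `x %>% f(y)` is evaluated as `f(x, y)`, so the whole data frame ends up as the first argument of `as.Date()`, and `b2` is never looked up as a column. The names `piped` and `converted` are only illustrative.

```r
library(dplyr)

# Cut-down stand-in for df2 (assumption: two columns are enough to show the point)
df2 <- data.frame(
  a1 = c("a", "a", "b"),
  b2 = c("01.01.2015", "02.02.2015", "14.02.2012"),
  stringsAsFactors = FALSE
)

# The pipe inserts df2 as the first argument: as.Date(df2, b2, format = ...).
# as.Date() has no method for a data frame, so it stops with an error,
# which is captured here as a message string.
piped <- tryCatch(
  df2 %>% as.Date(b2, format = "%d.%m.%Y"),
  error = function(e) conditionMessage(e)
)
print(piped)

# mutate() receives the data frame and evaluates b2 inside it,
# so as.Date() only ever sees the character vector.
converted <- df2 %>% mutate(b2 = as.Date(b2, format = "%d.%m.%Y"))
print(class(converted$b2))
```

The same applies with `group_by(a1)` in front: the grouped data frame, not the column, is what gets handed to `as.Date()`.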

If we need to do only the transformation, we don't need to group by 'a1'.

mutate(df2, b2= as.Date(b2, format= "%d.%m.%Y"))
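
The second error in the question most likely comes from the object name rather than from `as.Date()`: the data frame is called `df2`, but the non-pipe attempt refers to `df`. A small sketch, assuming a fresh session in which no object named `df` has been created, and using a two-row stand-in for `df2`:

```r
# Two-row stand-in for df2 (assumption: only the date column matters here)
df2 <- data.frame(b2 = c("01.01.2015", "31.01.2015"), stringsAsFactors = FALSE)

# With no user-defined df, the name resolves to stats::df(), the density
# function of the F distribution. Subsetting a function with $ gives
# "object of type 'closure' is not subsettable".
print(is.function(df))

# Referring to the actual data frame works in base R, no dplyr needed.
df2$b2 <- as.Date(df2$b2, format = "%d.%m.%Y")
print(df2$b2)
```

So the two attempts do not do the same thing, which is why the messages differ: one passes a data frame to `as.Date()`, the other tries to subset a function.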

By using the %<>% operator from magrittr, we can transform in place.

df2 %<>%
  mutate(b2= as.Date(b2, format= "%d.%m.%Y"))
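
To make the in-place idea concrete, the sketch below compares `%<>%` with an ordinary pipe followed by assignment back to the same name. `make_df()` is a hypothetical helper used only to build two identical inputs; it is not part of dplyr or magrittr.

```r
library(dplyr)
library(magrittr)

# Hypothetical helper: returns a fresh copy of a small example data frame
make_df <- function() {
  data.frame(b2 = c("01.01.2015", "09.05.2001"), stringsAsFactors = FALSE)
}

x <- make_df()
y <- make_df()

# Compound assignment pipe: pipes x into mutate() and assigns the result to x
x %<>% mutate(b2 = as.Date(b2, format = "%d.%m.%Y"))

# The same thing written out with %>% and an explicit assignment
y <- y %>% mutate(b2 = as.Date(b2, format = "%d.%m.%Y"))

print(identical(x, y))
```

Either form is fine; `%<>%` only saves repeating the object name.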

  
ah bon
akrun
  • That works, yes. However, I do not understand why my approach is not working?! – four-eyes Oct 29 '15 at 07:56
  • @Chrissl In the `dplyr` framework, we use either `mutate/transmute` to change/create a new column or `summarise` to get a summary output per group. – akrun Oct 29 '15 at 07:58
  • ok. What is the particular advantage in using the dplyr package here. It is more typing, and does the same?! – four-eyes Oct 29 '15 at 08:01
  • @Chrissl For this case, you don't need the `group_by` step, I am just trying to use your code to show where you went wrong. – akrun Oct 29 '15 at 08:02
  • @Chrissl Regarding the advantage part, some people say that it is easier to read when using `%>%`. There might be some truth behind that because when I showed the dplyr code and some other code that gives the same result to a python guy (with no experience with R), he could understand it better with dplyr. For me, it is subjective. – akrun Oct 29 '15 at 08:06
  • Ah. So when I want to change the type of a column (i.e. `chr` to date), I have to use a combination of `%>%` and `%<>%` when I keep on piping stuff. Like `test <- df2 %>% group_by(a1) %<>% mutate(b2 == as.Date(b2, format = "%d.%m.%Y") %>% do more %>% do more %>% do more...` – four-eyes Oct 29 '15 at 08:12
  • @Chrissl You don't need to use `%<>%`. You can assign it to the same object or a new object i.e. `test <- df2 %>% group_by(a1) %>% mutate(b2= as.Date(b2, format='%d.%m.%Y'))` – akrun Oct 29 '15 at 08:14
  • akrun, thanks for sticking with me. Then I still do not understand why `test <- df2 %>% group_by(a1) %>% as.Date(b2, format = "%d.%m.%Y")` is not working.... It should be the same as `df2 %>% group_by(a1) %>% mutate(b2=as.Date(b2, format = "%d.%m.%Y"))` – four-eyes Oct 29 '15 at 08:16
  • @Chrissl What I meant is that without using `test <- ` or `df2 <-`, you can update the `df2` by using the `%<>%` operator. – akrun Oct 29 '15 at 08:24