
I would like to get a column that has the earliest date in each row from multiple date columns.

My dataset looks like this:

df = data.frame(
  x_date = as.Date(c("2016-1-3", "2016-3-5", "2016-5-5")),
  y_date = as.Date(c("2016-2-2", "2016-3-1", "2016-4-4")),
  z_date = as.Date(c("2016-3-2", "2016-1-1", "2016-7-1"))
)

+---+------------+------------+------------+
|   | x_date     | y_date     | z_date     |
+---+------------+------------+------------+
| 1 | 2016-01-03 | 2016-02-02 | 2016-03-02 |
| 2 | 2016-03-05 | 2016-03-01 | 2016-01-01 |
| 3 | 2016-05-05 | 2016-04-04 | 2016-07-01 |
+---+------------+------------+------------+

I would like to get something like the following column.

+---+---------------+
|   | earliest_date |
+---+---------------+
| 1 | 2016-01-03    |
| 2 | 2016-01-01    |
| 3 | 2016-04-04    |
+---+---------------+

This is my code, but it returns the single earliest date across all columns and rows instead of one date per row.

library(dplyr)
df %>% dplyr::mutate(earliest_date = min(x_date, y_date, z_date))

Toshihiro

3 Answers


One option is pmin, which returns the element-wise ("parallel") minimum of its arguments, i.e. one value per row:

df %>% 
   mutate(earliest_date = pmin(x_date, y_date, z_date))
#    x_date     y_date     z_date   earliest_date
#1 2016-01-03 2016-02-02 2016-03-02    2016-01-03
#2 2016-03-05 2016-03-01 2016-01-01    2016-01-01
#3 2016-05-05 2016-04-04 2016-07-01    2016-04-04
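
To make the difference from the question's code concrete, here is a minimal sketch on plain vectors. The names x, y, z, overall and per_row are only illustrative; the values are the question's dates written with zero-padded months and days.

```r
# Same dates as the question's three columns
x <- as.Date(c("2016-01-03", "2016-03-05", "2016-05-05"))
y <- as.Date(c("2016-02-02", "2016-03-01", "2016-04-04"))
z <- as.Date(c("2016-03-02", "2016-01-01", "2016-07-01"))

# min() pools every value and returns a single minimum;
# inside mutate() that one value is recycled to every row
overall <- min(x, y, z)

# pmin() compares position by position and returns one value per position
per_row <- pmin(x, y, z)

print(overall)  # 2016-01-01
print(per_row)  # 2016-01-03, 2016-01-01, 2016-04-04
```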

If only the new column is needed, use transmute instead of mutate:

df %>%
    transmute(earliest_date = pmin(x_date, y_date,z_date))
akrun
  • This is what I wanted to do! [pmin()](http://stackoverflow.com/questions/28070878/r-use-min-within-dplyrmutate) is the function I need to use. Thank you very much. – Toshihiro Aug 23 '16 at 23:55
  • Additionally, when I used pmin() on rows with missing values (NAs), I needed ifelse() to deal with the NAs. However, ifelse() dropped the Date class and returned plain doubles. To keep the Date class, I tried the safe.ifelse() proposed [here](http://stackoverflow.com/questions/6668963/how-to-prevent-ifelse-from-turning-date-objects-into-numeric-objects), and it works fine. – Toshihiro Aug 24 '16 at 01:34
  • @Toshihiro There is an `na.rm` argument in `pmin`. By default it is `FALSE`; to ignore NAs, set it to `TRUE`, i.e. `pmin(x_date, y_date, z_date, na.rm = TRUE)` – akrun Aug 24 '16 at 03:33
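
As a rough illustration of the two comments above, the following sketch uses made-up vectors with missing values (x and y are hypothetical, not the question's data). It shows the default NA behaviour of pmin, the effect of `na.rm = TRUE`, and how ifelse() loses the Date class.

```r
# Hypothetical dates with missing values, for illustration only
x <- as.Date(c("2016-01-03", NA, "2016-05-05"))
y <- as.Date(c("2016-02-02", "2016-03-01", NA))

# Default (na.rm = FALSE): the result is NA wherever any input is NA
with_na <- pmin(x, y)

# na.rm = TRUE: NAs are ignored, and the result is still of class Date
without_na <- pmin(x, y, na.rm = TRUE)

# ifelse() strips attributes, so the Date class is lost
via_ifelse <- ifelse(is.na(x), y, x)

print(with_na)
print(without_na)
print(class(via_ifelse))  # "numeric"
```

With `na.rm = TRUE`, a position is NA only when every input is NA there, so no ifelse() workaround is needed for this case.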

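You can use apply over the rows to get the minimum date. Note that apply coerces the data frame to a character matrix, so the result is a character vector rather than a Date; this still works because dates formatted as yyyy-mm-dd sort chronologically as strings.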

apply(df, 1, min)

#[1] "2016-01-03" "2016-01-01" "2016-04-04"

Alternatively, use pmin with do.call, which passes each column as a separate argument and returns a Date vector:

do.call(pmin, df)

#[1] "2016-01-03" "2016-01-01" "2016-04-04"
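
A small sketch of how either result might be used as Date values; by_apply and by_pmin are illustrative names, and the `na.rm = TRUE` argument is an optional addition rather than part of the answer above.

```r
df <- data.frame(
  x_date = as.Date(c("2016-01-03", "2016-03-05", "2016-05-05")),
  y_date = as.Date(c("2016-02-02", "2016-03-01", "2016-04-04")),
  z_date = as.Date(c("2016-03-02", "2016-01-01", "2016-07-01"))
)

# apply() returns character strings, so convert back to Date explicitly
by_apply <- as.Date(apply(df, 1, min))

# c(df, na.rm = TRUE) builds the argument list: one entry per column plus na.rm
by_pmin <- do.call(pmin, c(df, na.rm = TRUE))

print(class(by_apply))  # "Date"
print(class(by_pmin))   # "Date"
```

The do.call form works for any number of date columns without naming them, which can be convenient when the column set is not fixed.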
Ronak Shah

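If you want the output as a data frame with one row per original column, you need to reshape the data to long format first. Note that grouping by variable gives the earliest date within each column, not within each row.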

library(reshape2)
library(dplyr)  # needed for %>%, group_by and summarize
melt(df) %>% group_by(variable) %>% summarize(earliest_date = min(value))
Robin Gertenbach