0

I would like to create a new column called maxdate and mindate from a list of date columns assuming 4 date columns and has missing values.

The solution below only gives the max/min of the rows of the columns. I am interested in finding max/min date across the columns.

df$maxdate <- apply (df[1:4], 1, max, na.rm = TRUE)
df <- data.frame(
  col1 = c("11/09/1999", "11/09/1999", "11/09/1999", "11/09/1999", "11/09/1999"),
  col2 = c("01/01/2000", "01/01/2000", "01/01/2000", "01/01/2000", "01/01/2000"),
  col3 = c("12/09/1961", "10/03/1995", "30/03/1992", "25/05/1992", "25/05/1992"),
  col4 = c("01/01/1930", "01/01/1939", "01/01/1942", "01/01/1936", "01/01/1937")
)

sample data

col1          col2        col3      col4

11/09/1999  01/01/2000  12/09/1961  01/01/1930
11/09/1999  01/01/2000  10/03/1995  01/01/1939
11/09/1999  01/01/2000  30/03/1992  01/01/1942
11/09/1999  01/01/2000  25/05/1992  01/01/1936
11/09/1999  01/01/2000  25/05/1992  01/01/1937
megmac
  • 547
  • 2
  • 11
kaffy
  • 15
  • 5
  • Could you please share some reproducible data using `dput`? – Quinten May 17 '23 at 09:50
  • Use `sapply`, or `apply(..., 2, ...)`. In `dplyr`, use `across`. In `collapse`, `fmin` and `fmax` are vectorized over columns. – Maël May 17 '23 at 09:50
  • Hey @kaffy, could you edit your question to include some sample data? You dont have to use your real data, there are plenty of ways to provide sample data which != real data - see here for some tips: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – jpsmith May 17 '23 at 17:48
  • hello @jpsmith thanks so much for the tip, please see the sample data in the question – kaffy May 17 '23 at 18:12
  • @Ben thanks much! Yes a new column called maxdate with a maximum date from across the date columns. That is one max date from all the 4 columns in a column maxdate – kaffy May 18 '23 at 01:59

2 Answers2

0
library(dplyr)

df = data.frame(date1 = c("2023-05-11", "2023-04-12","2023-07-13","2023-01-14","2023-05-15"),
                date2 = c("2023-04-11", "2023-07-12","2023-09-13","2023-05-14","2023-12-15"),
                date3 = c("2023-08-11", "2023-06-12","2023-08-13","2023-08-14","2023-05-15"),
                date4 = c("2023-01-11", "2023-05-12","2023-05-13","2023-12-14","2023-05-15"))

df <- df  %>% mutate_all(as.Date)

# edit: removed rowwise and added na.rm=TRUE, as you seem to want the max from all rows, disregarding NAs?
df <- df %>%  mutate(max_date = max(date1, date2,date3,date4, na.rm=TRUE))

df
  • thanks, your solution result returns a max for each rows of the column. I only want one max date across the four dates columns in a new columns – kaffy May 17 '23 at 12:03
  • Okay, do you mean the max from all four columns and all rows? I am editing my answer to do that, then. – Simon Krenn May 19 '23 at 03:01
0

Like this?

df <- data.frame(
  col1 = c("11/09/1999", "11/09/1999", "11/09/1999", "11/09/1999", "11/09/1999"),
  col2 = c("01/01/2000", "01/01/2000", "01/01/2000", "01/01/2000", "01/01/2000"),
  col3 = c("12/09/1961", "10/03/1995", "30/03/1992", "25/05/1992", "25/05/1992"),
  col4 = c("01/01/1930", "01/01/1939", "01/01/1942", "01/01/1936", "01/01/1937")
)


maxdate <- lapply(df, function (x) max(x, na.rm = TRUE))




> maxdate
$col1
[1] "11/09/1999"

$col2
[1] "01/01/2000"

$col3
[1] "30/03/1992"

$col4
[1] "01/01/1942"

Note that I did not assign it to a column as the number of rows != the number of columns so you cannot add it to your dataframe.

megmac
  • 547
  • 2
  • 11