Finding the Date at which maximum occured

Question

In the below dataframe, is it possible to find the Date at which maximum occured by groups

df
Date        Var    Value
27/9/2019    A       56
28/9/2019    A       50
1/10/2019    B       90
2/10/2019    B       100

df1      Max         Date          Mean
A        56        27/9/2019        53
B        100       2/10/2019        95

I tried with ````df1 <- df %>% group_by(Var) %>% summarise(Date = df$Date[which.max(df$Value)])```` But could not get — Dev P, Oct 13 '19 at 03:17

score 2 · Answer 1 · answered Oct 13 '19 at 05:25

2

We can group_by Var, calculate the mean of Value and select the row with maximum value.

library(dplyr)

df %>%
  group_by(Var) %>%
  mutate(Mean = mean(Value)) %>%
  slice(which.max(Value))

#  Date      Var   Value  Mean
#  <fct>     <fct> <int> <dbl>
#1 27/9/2019 A        56    53
#2 2/10/2019 B       100    95

answered Oct 13 '19 at 05:25

Ronak Shah

377,200
20
156
213

This is efficient and perhaps qualifying the `mutate` as `dplyr::mutate(Mean = mean(Value))` might make it explicit. +1. – deepseefan Oct 13 '19 at 05:31

score 0 · Answer 2 · edited Jun 20 '20 at 09:12

There might be a better way to do this. Since you want a summary which reduce multiple values down to a single value. You could *_join the output of the filter with the summary table as follows:

library(dplyr)
df1 <- df %>%
  group_by(Var) %>%
  filter(Value == max(Value)) %>%
  select(df1=Var, Max=Value, Date)

df2 <-df %>%
  group_by(Var) %>%
  summarise_at(.vars = vars(Value),
               .funs = c(mean="mean", sd="sd")) 
df2 %>%
  left_join(df1, by = "Var") %>%
  select(Var, Value, Date, mean, sd)

# -------------------------------------------------------------------------

# # A tibble: 2 x 5
#   Var   Value Date       mean    sd
#   <chr> <dbl> <chr>     <dbl> <dbl>
# 1 A        56 27/9/2019    53  4.24
# 2 B       100 2/10/2019    95  7.07

Data

df <- data.frame(
  Date = c("27/9/2019", "28/9/2019", "1/10/2019", "2/10/2019"), 
  Var = c("A", "A", "B", "B"), 
  Value = c(56, 50, 90, 100), stringsAsFactors = F
)

Hope that is what you want.

thanks. Is it not possible by summarise. Because I need to find mean, sd and for the groups as well. I was able to do that. So basically my code goes like this ````df1 <- df %>% group_by(Var) %>% summarise(Std = round(sd(Value),2),Mean = round(mean(Value),2),Max = round(max(Value),2),filter(Value == max(Value)) %>% arrange(Value,Var, Date))```` But not working — Dev P, Oct 13 '19 at 03:43
@DevP, I edited my answer in a way I taught you might use `summarise`. — deepseefan, Oct 13 '19 at 05:37

hello_friend · Accepted Answer · 2019-10-13T11:23:22.447

Base R, split-apply-combine (Edited):

# Create df, ensure date vec has appropriate type: 

df <- data.frame(

  Date = as.Date(c("27/9/2019", "28/9/2019", "1/10/2019", "2/10/2019"), "%d/%m/%y"),

  Var = c("A", "A", "B", "B"), 

  Value = c(56, 50, 90, 100),

  stringsAsFactors = F
)

# Split df by "Var" values: 

split_applied_combined <- lapply(split(df, df$Var), function(x){

# Calculate the max date: 

    max_date <- x$Date[which(x$Value == max(x$Value))]

    # Calculate the mean: 

    mean_val <- mean(x$Value)

    # Calculate the std_dev: 

    sd_val <- sd(x$Value)

   # Combine vects into df: 

    summarised_df <- data.frame(max_date, mean_val, sd_val)

    }
  )

# Combine list back into dataframe:

split_applied_combined <- do.call(rbind, 

                          # Store df name as vect:

                                  mapply(cbind,

                                         "Var" = names(split_applied_combined),

                                         split_applied_combined,

                                         SIMPLIFY = FALSE))

Dplyr alternative:

require("dplyr")

# Group by var, summarise data, store return object as a dataframe: 

summarised_df <- 

  df %>% 

  group_by(Var) %>% 

  summarise(max_date_per_group = max(Date), mean_val_per_group = mean(Value), sd_per_group = sd(Value)) %>% 

  ungroup()

Finding the Date at which maximum occured

3 Answers3

Data