2

I have a data frame df1. I would like to find the minimum turning point at each column, where the value before and after the minimum point is larger than it. For example in x=c(2,5,3,6,1,1,1), I would like to determine that the minimum turning point is at 3, but with the min function, I am only able to find the minimum point which is 1. If there is no minimum point, I would like to get NA. Thanks.

> df
structure(list(x = c(2, 5, 3, 6, 1, 1, 1), y = c(6, 9, 3, 6, 
3, 1, 1), z = c(9, 3, 5, 1, 4, 6, 2)), row.names = c(NA, -7L), class = c("tbl_df", 
"tbl", "data.frame"))


df1>
  x     y     z
  2     6     9
  5     9     3
  3     3     5
  6     6     1
  1     3     4
  1     1     6
  1     1     2

Desired result as shown below.

df2>
 x      y      z
 3      3      1
kaix
  • 305
  • 3
  • 10
  • How do you calculate min turning point? How do you get 3 for `x=c(2,5,3,6,1,1,1)` ? – Ronak Shah Dec 17 '20 at 09:27
  • For x, the turning point is at 3 as the value before and after 3 is larger than it. Thanks. – kaix Dec 17 '20 at 09:30
  • `sapply(df, function(x) x[which.max(diff(x) < 0) + 1])` could work. Though `z` is not clear- it also should also be equal to `3`, no? As 9 > 3. Also, are you looking for the value in the at turning point or it's location? – David Arenburg Dec 17 '20 at 09:33
  • @DavidArenburg, I would like to get the value and also the location of the turning point. For z, 3 is also a turning point but I would like to get the point at 1 as 1 is smaller than 3. Thanks. – kaix Dec 17 '20 at 09:38
  • So why for `x` you also don't want 1 as it is smaller than 3? Not sure I follow your logic – David Arenburg Dec 17 '20 at 09:39
  • 1
    For x, the 1 is not the turning point as there is no value larger than 1 at the 6th row. While in z, the value before and after 1 is larger than it (5,1,4). – kaix Dec 17 '20 at 09:42
  • I think the criteria can be simplified to `min values except those in first & last rows`! Am I right? – AnilGoyal Dec 17 '20 at 09:47

3 Answers3

4

You can use lead and lag to compare current value with previous and next value.

library(dplyr)

df %>% summarise(across(.fns = ~min(.x[which(lag(.x) > .x & lead(.x) > .x)])))

#     x     y     z
#  <dbl> <dbl> <dbl>
#1     3     3     1
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Ronak, but the user wants the results `3, 3, 1` instead of `3, 3, 3`? – AnilGoyal Dec 17 '20 at 09:40
  • Thanks. Is it possible to find the location (which row) the minimum turning point is at? – kaix Dec 17 '20 at 09:50
  • What if there are two (of course same) turning points? Which position to return in that case? – AnilGoyal Dec 17 '20 at 09:53
  • I f there are repeated, I would like to get the row that comes first. Thanks. – kaix Dec 17 '20 at 09:58
  • 2
    To get the position you could use : `df %>% summarise(across(.fns = ~which(.x == min(.x[which(lag(.x) > .x & lead(.x) > .x)]))[1]))` @kaix – Ronak Shah Dec 17 '20 at 10:14
  • If I have x=c(8,9,3,3,7) , how do I get the position of turning point at 4th row. By using 'df %>% summarise(across(.fns = ~which(.x == min(.x[which(lag(.x) >= .x & lead(.x) > .x)]))[1]))' ,I was able to get the position at row 3. – kaix Dec 18 '20 at 04:34
  • I think `df %>% summarise(across(.fns = ~tail(which(.x == min(.x[which(lag(.x) > .x & lead(.x) > .x)])), 1)))` should work. – Ronak Shah Dec 18 '20 at 05:09
  • Thanks, this work for x, but when I did it with y=c(6,9,3,6,3,1,1) , it returns the row at 5 because its the tail for minimum value 3. – kaix Dec 18 '20 at 08:56
3

You can use diff, get the sign than diff again to get the valleys. Use min to get the lowest valey.

#Value
sapply(df, function(x) min(x[1+which(diff(sign(diff(x))) == 2)]))
#x y z 
#3 3 1

#Position
sapply(df, function(x) {
    tt <- 1+which(diff(sign(diff(x))) == 2)
    tt[which.min(x[tt])] })
#x y z 
#3 3 4 

But this will work only in case the valley is one position wide.

Am more robust solution will be using the function from Finding local maxima and minima:

peakPosition <- function(x, inclBorders=TRUE) {
  if(inclBorders) {y <- c(min(x), x, min(x))
  } else {y <- c(x[1], x)}
  y <- data.frame(x=sign(diff(y)), i=1:(length(y)-1))
  y <- y[y$x!=0,]
  idx <- diff(y$x)<0
  (y$i[c(idx,F)] + y$i[c(F,idx)] - 1)/2
}

#Value
sapply(df, function(x) min(x[ceiling(peakPosition(-x, FALSE))]))
#x y z 
#3 3 1 

#Position
sapply(df, function(x) {
    tt <- peakPosition(-x, FALSE)
    tt[which.min(x[floor(tt)])] })
#x y z 
#3 3 4 

An alternative would be to use rle:

x <- c(8,9,3,3,8,1,1)
y <- rle(x)
i <- 1 + which(diff(sign(diff(y$values))) == 2)
min(y$values[i]) #Value
#[1] 3
j <- which.min(y$values[i])
1+sum(y$lengths[seq(i[j])-1]) #First Position
#[1] 3
sum(y$lengths[seq(i[j])]) #Last Position
#[1] 4
GKi
  • 37,245
  • 2
  • 26
  • 48
  • 1
    Very nice+1, I was thinking exactly about that! There's an old [related question](https://stackoverflow.com/questions/6836409/finding-local-maxima-and-minima) around. – jay.sf Dec 17 '20 at 10:06
  • @GKi, if I have a column x with x=c(8,9,3,3,8,1,1), is it possible to get the minimum value 3 at position 4. As after the 3 in row 4, the next value 8 is greater than it. – kaix Dec 18 '20 at 13:53
  • @kaix I have added a variant using `rle` which will give 3 for the value and 3 for the first and 4 for the last position. – GKi Dec 21 '20 at 07:34
1

Alternate approach

df %>% summarise_all(~ifelse(min(.)==last(.) | min(.) == first(.), min(.[. != last(.) & . != first(.)]), min(.)))

  x y z
1 3 3 1

For returning the row_nums

df %>% mutate_all(~ifelse(min(.)==last(.) | min(.) == first(.), min(.[. != last(.) & . != first(.)]), min(.))) %>%
  mutate(id = row_number()) %>% left_join(df %>% mutate(id = row_number()), by = "id") %>%
  mutate(x_r = ifelse(x.x == x.y, row_number(), 0),
         y_r = ifelse(y.x == y.y, row_number(), 0),
         z_r = ifelse(z.x == z.y, row_number(), 0)) %>%
  select(ends_with("r")) %>% summarise_all(~min(.[. != 0]))

  x_r y_r z_r
1   3   3   4
```
AnilGoyal
  • 25,297
  • 4
  • 27
  • 45