-3

I have a data frame which looks like this:

data frame picture

highopen_percent_change <- c(0.285, -2.156, 0.030, -0.184, -0.005, 0,092, -0.092, 0, -0.0563, -0.020, -0.174, -0.492, 0.201, -0.005)
IDcombinatie_nr <- c(47, 41, 42, 45, 41, 42, 48, 32, 44, 48, 32, 48, 32, 48, 28, 12, 32)
df <- data.frame(highopen_percent_change, IDcombinatie_nr)

In this data frame I want to know the smallest positive and negative value of highopen_percent_change per IDcombinatie_nr. So per IDcombinatie_nr I want to know what the smallest positive and negative highopen_percent_change values are. This value should not be zero, since some values of highopen_percent_change are zero in my data frame.

So, this is what I want:

For 47 the lowest positive value is 0.05 and the lowest negative value is -0.1.

For 41 the lowest positive value is 0.02 and the lowest negative value is -0.4.

For 42 the lowest positive value is 0.3 and the lowest negative value is -0.2.

I hope you can help me.

Luc
  • 7
  • 1
  • 5
  • 1
    Please provide [a minimal, reproducible example](https://stackoverflow.com/q/5963269/10068985) using `dput()`. – Darren Tsai Apr 15 '20 at 10:53
  • 2
    the number of the data points in the two columns doesn't add up: `Error in data.frame(highopen_percent_change, IDcombinatie_nr) : arguments imply differing number of rows: 17, 19` – Chris Ruehlemann Apr 15 '20 at 11:16

3 Answers3

0

Without providing data, try:

require(dplyr)

minpositive = function(x) min(x[x > 0])
maxnegative = function(x) max(x[x < 0])

df %>% 
   group_by(IDcombinatie_nr) %>% 
   summarise(Value = c(minpositive(highopen_percent_change),
                       maxnegative(highopen_percent_change))
Tur
  • 604
  • 4
  • 9
  • For other ways to solve this, check https://stackoverflow.com/questions/25314336/extract-the-maximum-value-within-each-group-in-a-dataframe – Tur Apr 15 '20 at 11:02
  • Thank you for the response, what does the x mean in your answer? – Luc Apr 15 '20 at 11:07
  • It defines the object you use in the function, as you can see it is used highopen_percent_change df column as x in minpositive/maxnegative in the sumarise – Tur Apr 15 '20 at 11:09
0

In R base, you can use tapply plus subsetting:

min.pos <- tapply(df$highopen_percent_change[df$highopen_percent_change > 0],
                 df$IDcombinatie_nr[df$highopen_percent_change > 0], min)

min.neg <- tapply(df$highopen_percent_change[df$highopen_percent_change < 0],
                 df$IDcombinatie_nr[df$highopen_percent_change < 0], min)
Chris Ruehlemann
  • 20,321
  • 4
  • 12
  • 34
0

Using data.table, remove rows with 0 values, group by ID and sign (>0), get minimum, lastly reshape from long-to-wide using dcast:

library(data.table)

# updated example data:
dt <- data.table(
  highopen_percent_change = c(0.285, -2.156, 0.030, -0.184, -0.005, 0.092, -0.092, 0, -0.0563, -0.020, -0.174, -0.492, 0.201, -0.005, 0.530, -0.445),
  IDcombinatie_nr = c(47, 41, 42, 45, 41, 42, 48, 32, 44, 48, 32, 48, 32, 48, 28, 12))

# result
dcast(dt[highopen_percent_change != 0, .(myMin = min(highopen_percent_change)), 
         by = .(IDcombinatie_nr, 
                grp = ifelse(highopen_percent_change > 0, "minPositive", "minNegative"))], 
      IDcombinatie_nr ~ grp, value.var = "myMin")
#    IDcombinatie_nr minNegative minPositive
# 1:              12     -0.4450          NA
# 2:              28          NA       0.530
# 3:              32     -0.1740       0.201
# 4:              41     -2.1560          NA
# 5:              42          NA       0.030
# 6:              44     -0.0563          NA
# 7:              45     -0.1840          NA
# 8:              47          NA       0.285
# 9:              48     -0.4920          NA
zx8754
  • 52,746
  • 12
  • 114
  • 209