
I need a quick hint on how to calculate, for each row, the sum of the values in several columns (here named A, B, C) that are greater than or equal to a row-specific threshold (stored in column key).

df <- data.frame(
  key = c(0.5, 0.8, 0.2),
  A = c(0.7, 0.6, NA),
  B = c(0.7, 0.8, 0.9),
  C = c(0.1, NA, NA)
)

The result can be obtained with a for loop, but I am looking for a more efficient way.

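df$solution <- NA
for (i in seq_len(nrow(df))) {
  threshold <- df[i, "key"]
  # Select only the value columns; 2:ncol(df) would also pick up "solution"
  values <- df[i, c("A", "B", "C")]
  # Sum the values that meet the row's threshold, ignoring NAs
  df[i, "solution"] <- sum(values[values >= threshold], na.rm = TRUE)
}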

> df
  key   A   B   C solution
1 0.5 0.7 0.7 0.1      1.4
2 0.8 0.6 0.8  NA      0.8
3 0.2  NA 0.9  NA      0.9

I found some examples here, here and here where the threshold is a predefined value, but I can’t make them work for my case.

JerryTheForester

1 Answer

df$solution <- rowSums(df[-1] * (df[-1] >= df[, 1]), na.rm = TRUE)
df
  key   A   B   C solution
1 0.5 0.7 0.7 0.1      1.4
2 0.8 0.6 0.8  NA      0.8
3 0.2  NA 0.9  NA      0.9
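
For readers who want to see why the one-liner works, here is a step-by-step sketch of the same idea. It is not part of the original answer: the names value_cols, vals and keep are made up for illustration, and the value columns are assumed to be exactly A, B and C. Naming them explicitly also avoids picking up a solution column that already exists from an earlier run.

```r
df <- data.frame(
  key = c(0.5, 0.8, 0.2),
  A = c(0.7, 0.6, NA),
  B = c(0.7, 0.8, 0.9),
  C = c(0.1, NA, NA)
)

# Assumption: the value columns are listed by name (hypothetical helper variable)
value_cols <- c("A", "B", "C")
vals <- as.matrix(df[value_cols])

# df$key has one entry per row, so it is recycled down each column:
# element [i, j] is compared with key[i]
keep <- vals >= df$key

# TRUE/FALSE act as 1/0, so values below the threshold contribute 0;
# NA entries stay NA and are dropped by na.rm = TRUE
df$solution <- rowSums(vals * keep, na.rm = TRUE)
df
```

The comparison relies on R recycling the key vector column by column, which lines up with the rows only because key has exactly nrow(df) elements.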
Onyambu