
I have a problem summing across columns based on a condition. Here, summing across columns means d23+d24+d25+d26+..+d31 for each row. Below is part of my data frame.

      d23  d24  d25   d26    d27    d28    d29    d30    d31
854 -0.60 4.11 8.52  0.90  -7.99 -10.27  -8.32  -6.79 -11.71
855 -1.14 2.66 8.14  0.11  -8.96 -11.25  -9.17  -7.84 -12.53
856 -1.16 0.71 5.45 -1.65 -10.72 -11.18 -11.58 -10.44 -14.29
857  0.08 5.36 9.59 -0.22  -9.79  -9.47  -9.44  -7.67 -10.57
858 -0.95 4.86 8.18 -4.03 -12.15 -11.19 -11.37  -9.47 -13.90
859 -0.70 3.72 8.60  1.87  -6.99  -9.77  -7.84  -6.20 -11.31

As you can see, there are positive and negative values. I want to sum across columns as follows: if a value is positive, set it to zero; if it is negative, take its absolute value. Finally, sum the results across columns and store them in a new column.

Any idea how I can do that?

Yabin Da

1 Answer


One possibility could be:

colSums(abs(df) * (df < 0))

  d23   d24   d25   d26   d27   d28   d29   d30   d31 
 4.55  0.00  0.00  5.90 56.60 63.13 57.72 48.41 74.31 
tmfmnk
  • This looks like a row sum instead of a column sum. I edited my question to make it clearer. – Yabin Da Jun 28 '19 at 21:53
  • You can replace `colSums()` with `rowSums()`: `rowSums(abs(df) * (df < 0))`. – tmfmnk Jun 28 '19 at 21:55
  • Can I use this for other conditions? Say, if the value is greater than 5, then set the value as the original value minus 5. Otherwise, set this value to be zero. Then sum across columns. – Yabin Da Jun 28 '19 at 22:04
  • Sure, for that you can use something like `rowSums(((df > 5) * df - 5) * (df > 5))`. – tmfmnk Jun 28 '19 at 22:17