I've started writing an image processing program that draws on several image processing algorithms, mostly from René Schulte's work. While benchmarking, I noticed that of all the effects I could find from different sources, the code for applying a 'Softlight' effect was the slowest. I'm not good at optimizing equations, but I suspect the filter is based on a formula that repeats terms for no reason.

Could this be rewritten as something shorter or faster?

// Basically, b is from Image A, and t is from Image B
int csoftLight(float b, float t)
{
    b /= 255;
    t /= 255;

    return (int)((t < 0.5) ? 255 * ((1 - 2 * t) * b * b + 2 * t * b) : 255 * ((1 - (2 * t - 1)) * b + (2 * t - 1) * (Math.Pow(b, 0.5))));
}

[Edit - Results using the soft light equation Mohammed Hossain found for Photoshop]

// Input: 137 and 113

// Byte version:
int other = ((byte)((B < 128) ? (2 * ((A >> 1) + 64)) * ((float)B / 255) : (255 - (2 * (255 - ((A >> 1) + 64)) * (float)(255 - B) / 255))));
// Returns 116    


// float version:
int res = (int)((t < 0.5) ? 255 * ((1 - 2 * t) * b * b + 2 * t * b) : 255 * ((1 - (2 * t - 1)) * b + (2 * t - 1) * (Math.Pow(b, 0.5))));
// Returns 129
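To make the 116 easier to trace, here is the byte formula above written out as a C function with the intermediate values noted. This is a sketch: the name `soft_light_byte` is hypothetical, and it assumes A = 137 and B = 113, which is the assignment that reproduces the result shown.

```c
#include <stdint.h>

/* Hypothetical name. Byte formula from the edit above, split into steps. */
static uint8_t soft_light_byte(uint8_t A, uint8_t B)
{
    int m = (A >> 1) + 64;               /* A = 137: 68 + 64 = 132 */

    if (B < 128)                         /* B = 113 takes this branch */
        return (uint8_t)(2 * m * ((float)B / 255));   /* 264 * 0.4431 = 116.98, truncated to 116 */

    return (uint8_t)(255 - 2 * (255 - m) * (float)(255 - B) / 255);
}
```

Calling it with the arguments swapped, `soft_light_byte(113, 137)`, takes the other branch and gives 130, which is much closer to the float version's 129. That suggests the mismatch may come from the argument order rather than from the arithmetic.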

[Edit]

Here's the quickest algorithm, based on Mohammed Hossain's answer:

int csoftLight(byte A, byte B)
{
    return (int)((A < 128) ? (2 * ((B >> 1) + 64)) * ((float)A / 255) : (255 - (2 * (255 - ((B >> 1) + 64)) * (float)(255 - A) / 255)));          
}
Léon Pelletier
  • I'd hope that compilers would do common subexpression elimination, but when floating-point gets involved... – nneonneo Mar 21 '13 at 18:42
  • And googling the name of the guy that added this code to the original source returns nothing. I'll end up calling the coder! – Léon Pelletier Mar 21 '13 at 18:49
  • Arithmetic operations on floats are more expensive than on integers. If it's possible to change the parameter types to int, then I can provide a solution. – Vano Maisuradze Mar 21 '13 at 19:09
  • @Vano Maisuradze I wonder if a version that quickly rounds to 1/1000, by taking ints as input, multiplying them by 1000, and dividing by 1000 at the end, would be faster. Maybe? I just don't know to what extent a stored float is bigger than a stored int in memory. Though I could google it. – Léon Pelletier Mar 21 '13 at 19:26
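The integer-only idea raised in these comments could look roughly like the following. It is a sketch in C, assuming byte inputs and the argument order of the `csoftLight(byte A, byte B)` version in the question; the name `soft_light_int` is hypothetical. Because it truncates with exact integer arithmetic, its result may differ by one from the float-based formula for some inputs.

```c
#include <stdint.h>

/* Hypothetical name. Integer-only variant of csoftLight(byte A, byte B). */
static uint8_t soft_light_int(uint8_t A, uint8_t B)
{
    int m = (B >> 1) + 64;

    if (A < 128)
        return (uint8_t)(2 * m * A / 255);        /* integer division truncates */

    /* Truncating 255 - x equals 255 - ceil(x) for x >= 0, so round x up. */
    int n = 2 * (255 - m) * (255 - A);
    return (uint8_t)(255 - (n + 254) / 255);
}
```

No scaling by 1000 is needed here, since the only division is by 255 and it can be done once at the end.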

1 Answer

This answer should help clarify things a bit: How does photoshop blend two images together?

One of the equations there is the soft light algorithm:

#define ChannelBlend_SoftLight(A,B) ((uint8)((B < 128)?(2*((A>>1)+64))*((float)B/255):(255-(2*(255-((A>>1)+64))*(float)(255-B)/255)))) //not very accurate

Most importantly, it avoids the costly square root operation. Less importantly, it uses bit-shift operators in place of division by 2 (which smart compilers should do on their own anyway). It also uses more integer operations than floating-point operations, which is faster.

Here is another formula (courtesy of its owners) which swaps the roles of the two variables, and it seems to work:

#define ChannelBlend_SoftLight(A,B) (uint8)(((A < 128) ? (2 * ((B >> 1) + 64)) * ((float) A / 255) : (255 - (2 * (255 - ((B >> 1) + 64)) * (float) (255 - A) / 255))));
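Since the formula works on one channel at a time, applying it to an image means calling it for every channel of every pixel. A minimal sketch, assuming both images are stored as equally sized interleaved 8-bit buffers; the names `soft_light_channel` and `soft_light_buffer` are hypothetical, and the function body is the second macro above written with `uint8_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name. Function form of the second macro above. */
static uint8_t soft_light_channel(uint8_t A, uint8_t B)
{
    int m = (B >> 1) + 64;
    return (uint8_t)((A < 128)
        ? 2 * m * ((float)A / 255)
        : 255 - 2 * (255 - m) * (float)(255 - A) / 255);
}

/* Hypothetical helper. Blends n channel values (for RGB, n = 3 * pixel count). */
static void soft_light_buffer(const uint8_t *a, const uint8_t *b,
                              uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = soft_light_channel(a[i], b[i]);
}
```

For a single RGB pixel, `n` is 3 and the three outputs are the R, G, and B values of the blended pixel.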
Mohammed Hossain
  • Wow, you got exactly the context of the equation. Megaupvote! – Léon Pelletier Mar 21 '13 at 19:23
  • Don't know what I'm doing wrong, but it doesn't return the correct image: `int csoftLight(byte A, byte B) { return ((byte)((B < 128) ? (2 * ((A >> 1) + 64)) * ((float)B / 255) : (255 - (2 * (255 - ((A >> 1) + 64)) * (float)(255 - B) / 255)))); }` – Léon Pelletier Mar 21 '13 at 20:04
  • You must call that for each channel of the pixel; if you are doing this in RGB, then you must call it three times (R,G,B). The end product is three values (R,G,B) that are the three channels of the output pixel. Are you currently doing this? – Mohammed Hossain Mar 21 '13 at 20:37
  • Yes, that's what I'm doing, channel by channel. I've added an edit where I'm showing my input and output using each method. – Léon Pelletier Mar 21 '13 at 21:11
  • I have found a better one; please see my updated answer and see if you find it better. – Mohammed Hossain Mar 21 '13 at 21:34
  • Thanks. However, isn't that identical to yours? – Léon Pelletier Mar 21 '13 at 21:46
  • Notice that the variable operations are switched (where A is used in the first equation, B is used in the second equation). Someone should probably update the first guy's answer that I linked...I'll update the answer with a 'conversion' from src/dest to A and B. – Mohammed Hossain Mar 21 '13 at 21:51
  • Ok! Working. Tested 5-6 times: it takes 282ms, versus 824ms for the old algorithm. Great improvement! – Léon Pelletier Mar 21 '13 at 22:06