5

I am profiling my simple 2D XNA game. I found that 4% of the entire running time is taken by the simple operation of adding two Colors together, one of them first multiplied by a float.

I need to call this method roughly 2000 times per frame (once for each tile on the map), which comes to 120,000 calls per second at XNA's 60 fps. Even a minimal speedup of a single call would have a large overall impact, yet I simply do not know how to make this more efficient.

    private void DoColorCalcs(float factor, Color color)
    {
        int mul = (int)Math.Max(Math.Min(factor * 255.0, 255.0), 0.0);
        tile.Color = new Color(
            (byte)Math.Min(tile.Color.R + (color.R * mul / 255), 255),
            (byte)Math.Min(tile.Color.G + (color.G * mul / 255), 255),
            (byte)Math.Min(tile.Color.B + (color.B * mul / 255), 255));

    }

EDIT: As suggested by Michael Stum:

    private void DoColorCalcs(float factor, Color color)
    {
        factor= (float)Math.Max(factor, 0.0);
        tile.Color = new Color(
            (byte)Math.Min(tile.Color.R + (color.R * factor), 255),
            (byte)Math.Min(tile.Color.G + (color.G * factor), 255),
            (byte)Math.Min(tile.Color.B + (color.B * factor), 255));
    }

This lowered the time usage from 4% to 2.5%.

PiotrK
  • 2
    The word you're looking for is optimize, not optimalize. :) – jalf Dec 18 '09 at 17:59
  • It sounds like a lot of calls for a simple 2D game. Can't you change the overall algorithm instead? By e.g. caching the results or avoiding doing everything every frame? –  Dec 18 '09 at 18:02
  • I need to recalculate dynamic lights for tiles (my lighting model includes light flickering, which changes the light intensity by Sin(time)*0.1 for a given tile in every frame) – PiotrK Dec 18 '09 at 18:05
  • 1
    Do you *need* to make it more efficient? Is the current performance actually a problem? You just said that it only takes 4% of the total execution time. That means that your "huge" speed impact is *at most* going to be 4%. It means that your app might run at 62.4 FPS instead of 60. Is that really worth the effort? – jalf Dec 18 '09 at 18:07
  • 3
    you can try to divide your tiles in 2 (4, 8) groups and create several threads - each thread computes the colors of one tile group. So you can use all the cores your processor is offering. – Simon Ottenhaus Dec 18 '09 at 18:12
  • Currently I get 56 fps on the target machine and I want to reach the full 60 fps. So yes, gaining 2 fps is worth the effort – PiotrK Dec 18 '09 at 18:18
  • is factor and color the same for all tiles? – Simon Ottenhaus Dec 18 '09 at 18:22
  • @Simon Ottenhaus: Unfortunately not. It's unique per tile – PiotrK Dec 18 '09 at 18:24
  • The solution is simple then: get a faster processor ;) – Simon Ottenhaus Dec 18 '09 at 18:26
  • You could try calculating tile colour on the GPU of course... – Martin Dec 19 '09 at 15:39
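
As a rough check on the frame-rate arithmetic in the comments above: if a routine takes a fraction p of the frame time and could be removed entirely with nothing else changing, the frame rate scales by 1/(1 - p). This is an idealised upper bound, not a measurement, and it assumes the 4% share also holds on the 56 fps target machine.

```latex
\[
  \mathrm{fps}_{\mathrm{new}} = \frac{\mathrm{fps}_{\mathrm{old}}}{1 - p},
  \qquad p = 0.04:\quad
  \frac{60}{0.96} = 62.5,
  \qquad
  \frac{56}{0.96} \approx 58.3
\]
```

Under those assumptions, eliminating this method completely would lift 56 fps to about 58.3 fps, so reaching 60 fps would need savings elsewhere as well.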

2 Answers

1

The obvious improvement would be to include the division operation (/ 255) in the calculation of mul, to reduce the divisions from 3 to a single division:

    private void DoColorCalcs(float factor, Color color)
    {
        float mul = Math.Max(Math.Min(factor * 255.0f, 255.0f), 0.0f) / 255f;
        tile.Color = new Color(
            (byte)Math.Min(tile.Color.R + (color.R * mul), 255),
            (byte)Math.Min(tile.Color.G + (color.G * mul), 255),
            (byte)Math.Min(tile.Color.B + (color.B * mul), 255));
    }

That being said, since you're replacing tile.Color, it may actually be faster to update its channels in place instead of constructing a new Color (though I'd profile this to see if it helps):

    private void DoColorCalcs(float factor, Color color)
    {
        // Note: this only compiles if tile.Color is a field. A struct returned
        // by a property is a copy and cannot be modified in place.
        float mul = Math.Max(Math.Min(factor * 255.0f, 255.0f), 0.0f) / 255f;
        tile.Color.R = (byte)Math.Min(tile.Color.R + (color.R * mul), 255);
        tile.Color.G = (byte)Math.Min(tile.Color.G + (color.G * mul), 255);
        tile.Color.B = (byte)Math.Min(tile.Color.B + (color.B * mul), 255);
    }

This avoids recalculating the alpha channel, and may reduce the number of instructions a bit.
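
One observation on the `mul` expression above: clamping `factor * 255` to the range 0..255 and then dividing by 255 is mathematically the same as clamping `factor` to 0..1, so the multiply and divide can be dropped altogether. The following is a minimal sketch of that idea, not code from the original post. `FloatBlend`, `ClampFactor`, and `AddScaled` are hypothetical names, and plain `byte` channels stand in for XNA's `Color` so the snippet has no XNA dependency.

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class FloatBlend
{
    // Clamps factor to [0, 1]. Mathematically equivalent to
    // clamp(factor * 255, 0, 255) / 255, without the multiply and divide.
    public static float ClampFactor(float factor)
    {
        return Math.Max(0f, Math.Min(1f, factor));
    }

    // Adds addend * mul to baseChannel and saturates the result at 255.
    // The cast to byte truncates any fractional part.
    public static byte AddScaled(byte baseChannel, byte addend, float mul)
    {
        return (byte)Math.Min(baseChannel + addend * mul, 255f);
    }
}
```

In `DoColorCalcs` this would be used as `float mul = FloatBlend.ClampFactor(factor);` followed by one `AddScaled` call per channel. Whether it is measurably faster is something to confirm with the profiler.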

Reed Copsey
  • (int)Math.Max(Math.Min(factor * 255.0, 255.0), 0.0) / 255 will result in 0 in most cases. So this doesn't help. – Simon Ottenhaus Dec 18 '09 at 18:06
  • Agree with Simon Ottenhaus: you are dividing an integer in 0..255 by 255, which gives you zero for every value except 255 – PiotrK Dec 18 '09 at 18:08
  • Ah, I forgot: I tried replacing the creation of a new Color object with setting its fields. It required a bit of a "bad approach", as I had to change tile.Color from a property to a public field. But it still had no speed impact (tested, with results rounded to 0.1 fps) – PiotrK Dec 18 '09 at 18:11
  • can't you change mul to a float then? Or would that be too inaccurate? – Michael Stum Dec 18 '09 at 18:12
  • You can do this, but you have to make mul a float. Since you're doing float-based math in every other computation, there should actually be a slight speed boost due to this. – Reed Copsey Dec 18 '09 at 18:13
  • I'd be shocked if the compiler didn't optimize away the division already. – jalf Dec 18 '09 at 18:15
  • Keep in mind that float operations take more time than integer calculations. If you use floats, you should replace the division with a multiplication by 1/255 = 0.0039215686f. Most processors can do one single-precision multiplication per clock cycle, but take much longer for a division. – Simon Ottenhaus Dec 18 '09 at 18:15
  • @PiotrK: Yeah, it's difficult to do this with structs as properties, but it can make a slight difference in some cases. It may not be enough here to really impact it, though, since it's just saving a single assignment. – Reed Copsey Dec 18 '09 at 18:15
  • @jalf: It won't, in his case, because of the casting. The compiler will optimize away certain things, but when you have mixed type math operations, it doesn't typically get optimized. – Reed Copsey Dec 18 '09 at 18:16
  • @Simon: The JIT usually makes that optimization for you ;) Constant division and multiplication with floats tends to get optimized out by the compiler and/or JIT (depending on the type) – Reed Copsey Dec 18 '09 at 18:16
1

My first question is, why floating point? If you're just scaling colors to fade them, you don't need a lot of precision. Example:

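    int factorTimes256; // the factor, pre-scaled to an integer from 0 to 256
    // Math.Min (not Math.Max) clamps the sum to 255; the cast narrows int to byte.
    tile.Color.R = (byte)Math.Min(255, tile.Color.R + (color.R * factorTimes256) / 256);
    // same for G and B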

In other words, represent your factor as an integer from 0 to 256, and do all the calculations on integers. You don't need more precision than 8 bits because the result is only 8 bits.
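
The integer approach described above can be sketched as a small self-contained helper. This is an illustration under stated assumptions rather than the answer's exact code: `FixedPointBlend`, `ToFixed`, and `AddScaled` are hypothetical names, and plain `byte` channels stand in for XNA's `Color`.

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class FixedPointBlend
{
    // Converts a float factor to an integer from 0 to 256, where 256 means 1.0.
    // The factor is clamped to [0, 1] before scaling.
    public static int ToFixed(float factor)
    {
        float clamped = Math.Max(0f, Math.Min(1f, factor));
        return (int)(clamped * 256f);
    }

    // Adds addend * factorTimes256 / 256 to baseChannel using integer math only,
    // then saturates the sum at 255 before narrowing to byte.
    public static byte AddScaled(byte baseChannel, byte addend, int factorTimes256)
    {
        int sum = baseChannel + (addend * factorTimes256) / 256;
        return (byte)Math.Min(255, sum);
    }
}
```

The conversion in `ToFixed` would be done once per tile, after which the three channel updates use only integer multiply, divide by a power of two, and compare. As with the other variants, profiling is the only way to tell whether it helps in this game.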

My second question is, did you say you went from 4% to 2.5% in this code? That's tiny. People who use profilers that only do instrumentation or sample the program counter are often satisfied with such small improvements. I bet you have other things going on that take a lot more time, that you could attack. Here's an example of what I mean.

Mike Dunlavey