I'm trying to edit pixels of a Bitmap (960x510), and I'm using the fastest method I could find. But it's still painfully slow (7 FPS).

unsafe
{
    BitmapData bitmapData = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), ImageLockMode.ReadWrite, bmp.PixelFormat);
    int bytesPerPixel = Image.GetPixelFormatSize(bmp.PixelFormat) / 8;
    int heightInPixels = bitmapData.Height;
    int widthInBytes = bitmapData.Width * bytesPerPixel;
    byte* PtrFirstPixel = (byte*)bitmapData.Scan0;
    for(int y = 0; y < heightInPixels; y++)
    {
        byte* currentLine = PtrFirstPixel + (y * bitmapData.Stride);
        for (int x = 0; x < widthInBytes; x = x + bytesPerPixel)
        {
            for (int i = 0; i < Shaders.Count; i++)
            {
                rgba color = new rgba(currentLine[x], currentLine[x + 1], currentLine[x + 2], -1);
                if (bytesPerPixel == 4) color.a = currentLine[x + 3];
                Shaders[i].pixel(new xy(x * bytesPerPixel % bmp.Width, x * bytesPerPixel / bmp.Height), color);
                currentLine[x] = (byte)color.r;
                currentLine[x + 1] = (byte)color.g;
                currentLine[x + 2] = (byte)color.b;
                if (bytesPerPixel == 4) currentLine[x + 3] = (byte)color.a;
            }
        }
    }
    bmp.UnlockBits(bitmapData);
}

The method inside Shaders[i].pixel changes the color.r, color.g, color.b and color.a values. How can I increase performance?

Flexan
  • have you checked this https://stackoverflow.com/questions/24701703/c-sharp-faster-alternatives-to-setpixel-and-getpixel-for-bitmaps-for-windows-f ? – Sven Bardos Sep 24 '22 at 22:21
  • @SvenBardos I've used it in the past. The problem with that solution is that while it does give a performance boost (but still only 10-13 FPS), I'm unable to modify the alpha layer correctly (same with bytes, which cuts off half the image). By the way, the Bitmap size is only 960x510. – Flexan Sep 24 '22 at 22:52

1 Answer

I do not code in C#, but you are doing too many operations per color channel and pixel. I am not sure what you are trying to do, but most likely you could optimize it a lot. For example:

  1. Why 3 loops?

    This kind of thing is usually done in 2 loops, with the last loop unrolled or done at once. Also, I assume Shaders.Count is the same as the color channel count. If so, why are you setting all the channels in each iteration of i?

  2. Why new without delete in the innermost loop?

    This is heap thrashing. I would move the new before the loops and the delete after them, or get rid of it completely. This might boost speed a lot.

  3. Why compute (x * bytesPerPixel % bmp.Width, x * bytesPerPixel / bmp.Height) every time?

    For each x, you are computing it heightInPixels*Shaders.Count times instead of once. It can be precomputed once into a LUT (look-up table) before the for loops instead of being computed again and again. This is a major speed boost.

    By the way, if a LUT is out of the question due to low memory, you could at least move the computation before the i loop. Also, if you change the bitmap's x resolution to a power of 2, you can get rid of all the / and % operations, because they become bit shifts and masks.

  4. Why if (bytesPerPixel == 4) currentLine[x + 3] = (byte)color.a; inside the loop?

    You can have 2 copies of the code, one for bytesPerPixel == 4 and one without it, so that the if moves before the loops. An if statement inside a heavy-duty loop can be a significant performance hit.

  5. Use multithreading to boost speed.

    Simply obtain your process affinity mask, get the number n of CPUs/cores, divide the image into n regions, and process each region in its own thread.
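Points 2 to 5 could be sketched in C# roughly as follows. This is only an illustration under assumptions: Rgba, IShader, InvertShader and PixelLoop are hypothetical names (the question does not show its rgba, xy or shader types), a plain byte[] stands in for the memory behind bitmapData.Scan0, and the parallel part assumes every shader is safe to call from several threads at once. Note that it chains all shaders on one color value and writes the result back once per pixel, which differs from the question's code if a shader leaves a channel outside 0-255.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-ins for the question's rgba and shader types.
public struct Rgba { public int R, G, B, A; }

public interface IShader
{
    // Assumed signature: mutates the color of the pixel at (x, y) in place.
    void Pixel(int x, int y, ref Rgba color);
}

// Example shader used only for demonstration.
public sealed class InvertShader : IShader
{
    public void Pixel(int x, int y, ref Rgba c)
    {
        c.R = 255 - c.R; c.G = 255 - c.G; c.B = 255 - c.B;
    }
}

public static class PixelLoop
{
    // buffer stands in for the locked bitmap memory; stride is the number of bytes per row.
    public static void Run(byte[] buffer, int width, int height, int stride,
                           int bytesPerPixel, IReadOnlyList<IShader> shaders)
    {
        // Point 4: choose the 3-byte or 4-byte loop once, outside every loop.
        Action<int> processRow;
        if (bytesPerPixel == 4) processRow = y => Row4(buffer, y, width, stride, shaders);
        else processRow = y => Row3(buffer, y, width, stride, shaders);

        // Point 5: rows are independent, so the runtime can spread them over the cores.
        // Assumption: shaders hold no shared mutable state.
        Parallel.For(0, height, processRow);
    }

    private static void Row4(byte[] buffer, int y, int width, int stride,
                             IReadOnlyList<IShader> shaders)
    {
        int i = y * stride;
        for (int x = 0; x < width; x++, i += 4)
        {
            // Points 2 and 3: Rgba is a struct (no heap allocation), and x and y
            // are plain loop counters, so no % or / is needed per pixel.
            var c = new Rgba { R = buffer[i], G = buffer[i + 1], B = buffer[i + 2], A = buffer[i + 3] };
            for (int s = 0; s < shaders.Count; s++) shaders[s].Pixel(x, y, ref c);
            buffer[i] = (byte)c.R; buffer[i + 1] = (byte)c.G;
            buffer[i + 2] = (byte)c.B; buffer[i + 3] = (byte)c.A;
        }
    }

    private static void Row3(byte[] buffer, int y, int width, int stride,
                             IReadOnlyList<IShader> shaders)
    {
        int i = y * stride;
        for (int x = 0; x < width; x++, i += 3)
        {
            // Alpha is set to -1 when the format has no alpha byte, as in the question.
            var c = new Rgba { R = buffer[i], G = buffer[i + 1], B = buffer[i + 2], A = -1 };
            for (int s = 0; s < shaders.Count; s++) shaders[s].Pixel(x, y, ref c);
            buffer[i] = (byte)c.R; buffer[i + 1] = (byte)c.G; buffer[i + 2] = (byte)c.B;
        }
    }
}
```

With a real Bitmap, the same row functions would work on the pointer obtained from LockBits instead of an array. If the shaders are not thread-safe, replacing Parallel.For with an ordinary for loop over y keeps the other savings.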

Spektre
  • 1. The Shaders count is not the same as any loop. It's just a list containing interfaces and it could be 1 or 1000. 2. I have created 1 instance of both xy and rgba and set the values in the loop, +1 fps. 3. No idea what LUT means, but that is a good point. 4. Will implement, didn't think if statements had a big cost (I've added a lot of them and it didn't really affect performance) but every bit counts. 5. Have tried, the problem with that is that multiple threads are trying to call IShader.pixel on the same object at the same time, and locking doesn't increase performance. – Flexan Sep 25 '22 at 11:22
  • @Flexan LUT means look up table ... – Spektre Sep 25 '22 at 12:21
  • Implemented 2, 3 and 4 and getting average of 28 FPS. I can acquire 33 FPS average without the method call, and 40 FPS average without the innermost for loop. The problem is that I need to execute the pixel method on all members of the Shaders collection, so I'm not sure if I can make it better. – Flexan Sep 25 '22 at 13:41
  • @Flexan 28 fps from 7 is nice ... without knowing what exactly you are doing, and without experience with your programming environment, I cannot help any further ... sometimes changing the pixel format of the resulting bitmap helps (so fewer conversions or fewer memory operations are needed). There is still room for minor improvements: you have `x+=bytesPerPixel` and `x+0 x+1 x+2`, which can be converted to a few `x++` operations, which is one less addition, but do not expect too much of a boost from it – Spektre Sep 25 '22 at 14:39
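The running-index idea from the last comment might look like the sketch below. RowWalk and InvertRow are hypothetical names, the channel inversion is just a placeholder operation, and the row is a plain byte[] rather than locked bitmap memory. The JIT may already produce similar code for the indexed form, so any gain should be measured rather than assumed.

```csharp
public static class RowWalk
{
    // Inverts the first three channels of every pixel in one row,
    // advancing a single index instead of computing x, x + 1, x + 2.
    public static void InvertRow(byte[] row, int width, int bytesPerPixel)
    {
        int skip = bytesPerPixel - 3; // bytes left untouched after the three color channels
        int i = 0;
        for (int x = 0; x < width; x++)
        {
            row[i] = (byte)(255 - row[i]); i++;
            row[i] = (byte)(255 - row[i]); i++;
            row[i] = (byte)(255 - row[i]); i++;
            i += skip; // step over the alpha byte, if there is one
        }
    }
}
```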