I have a problem with reading an 8bit grayscale bmp. I can get the info from the header and read the palette, but I can't map pixel values to the palette entries. Here I have found how to read the pixel data, but not how to use it for a bmp with a palette. I am a beginner. My goal is to read only one row of pixels at a time.

Code:

#include <iostream>
#include <fstream>
using namespace std;

int main(int arc, char** argv)
{   const char* filename="Row_tst.bmp";
    remove("test.txt");
    ofstream out("test.txt",ios_base::app);//file for monitoring the results

    FILE* f = fopen(filename, "rb");
    unsigned char info[54];
    fread(info, sizeof(unsigned char), 54, f); // read the header

    int width = *(int*)&info[18];
    int height = *(int*)&info[22];

    unsigned char palette[1024]; //read the palette
    fread(palette, sizeof(unsigned char), 1024, f);
    for(int i=0;i<1024;i++)
    {   out<<"\n";
        out<<(int)palette[i];
    }

    int paletteSmall[256]; //1024-byte palette won't be needed in the future
    for(int i=0;i<256;i++)
    {   paletteSmall[i]=(int)palette[4*i];
        out<<paletteSmall[i]<<"\n";
    }

    int size = width;

    //for(int j=0;j<height;j++)
    {   unsigned char* data = new unsigned char[size];
        fread(data, sizeof(unsigned char), size, f);
        for(int i=0;i<width;i++) 
        {   cout<<"\n"<<i<<"\t"<<paletteSmall[*(int*)&data[i]];
        }
        delete [] data;
     }

    fclose(f);

    return 0;
}

What I get in test.txt seems fine: first the values from 0 0 0 0 to 255 255 255 0 (palette), then the values from 0 to 255 (paletteSmall).

The problem is that I can't map pixel values to the color table entries. My application crashes, with symptoms suggesting that it tried to access a nonexistent element of an array. If I understand correctly, a pixel in a bmp with a color table should contain the index of a color table entry, so I have no idea why it doesn't work. I ask for your help.

beginner
  • A footnote on your terminology: "8bit grayscale bmp" – there is no such thing. As you found out, it is just a regular palettized image (where the palette happens to contain just gray values). Paranoia is Good! Do not assume the palette contains 256 entries (the value is in the header); create your gray value from the 3 RGB values, not assuming that they are all the same. Your code is not complete so I cannot check, but be aware that individual scan lines are padded up to DWORD lengths, which is vital if your image width is not a multiple of 4. – Jongware May 29 '15 at 09:43
  • _"8bit grayscale bmp"_ thing - right. Also, thank you for the other hints. However, I believe they do not apply to my case. I am writing an application to process bmps generated as output by some other software. The bmps are always the same - always multiple of 4 (needed for FFT algorithm) and always the same "grey" palette. Thanks once again! – beginner May 29 '15 at 10:06
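
For readers whose images are not guaranteed to have a width that is a multiple of 4, the row padding mentioned in the first comment can be sketched as follows. The function name `bmpRowStride` is made up for this illustration; the rounding rule is the usual one for uncompressed BMP rows, each of which occupies a whole number of 4-byte DWORDs.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes one scan line occupies in the file,
// including the padding that rounds each row up to a multiple of 4 bytes.
std::size_t bmpRowStride(std::size_t width, std::size_t bitsPerPixel)
{
    // Round the row length in bits up to a whole number of 32-bit units,
    // then convert that count to bytes.
    return (width * bitsPerPixel + 31) / 32 * 4;
}
```

With this, a row loop would read `bmpRowStride(width, 8)` bytes per row and use only the first `width` of them. For an 8-bit image whose width is already a multiple of 4, the stride equals the width, which is why the code in the question gets away without it.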

1 Answer

You are forcing your 8-bit values to be read as int:

cout<<"\n"<<i<<"\t"<<paletteSmall[*(int*)&data[i]];

The amount of casting indicates you were having problems here and probably resorted to adding one cast after another until "it compiled". As it turns out, compiling without errors is not the same as working without errors.

What happens here is that you force a read of sizeof(int) bytes (4 on most current platforms) starting at data[i], so the resulting value will almost always exceed the 256 entries of paletteSmall. (In addition, the last few reads are invalid under all circumstances, because they fetch bytes from beyond the valid range of data.)

Because the image data itself is 8-bit, all you need here is

cout<<"\n"<<i<<"\t"<<paletteSmall[data[i]];

No casts necessary; data is an unsigned char * so its values are limited from 0 to 255, and paletteSmall is exactly the correct size.
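
To make the difference concrete, here is a minimal sketch rather than a drop-in replacement for the code in the question. The names `mapRow` and `fourByteIndex` are invented for this illustration. The first function performs the plain byte lookup; the second reproduces what the cast effectively asked for, using `memcpy` only so that the demonstration itself stays well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Map one row of 8-bit palette indices to gray values.
// An unsigned char index is always in 0..255, so a 256-entry table
// cannot be overrun.
std::vector<int> mapRow(const std::vector<unsigned char>& row,
                        const int (&paletteSmall)[256])
{
    std::vector<int> gray;
    gray.reserve(row.size());
    for (unsigned char index : row)
        gray.push_back(paletteSmall[index]);
    return gray;
}

// Build an "index" from four consecutive bytes starting at position i,
// which is roughly what *(int*)&data[i] does on a machine with 4-byte ints.
std::uint32_t fourByteIndex(const std::vector<unsigned char>& row, std::size_t i)
{
    assert(i + 4 <= row.size());  // the cast in the question has no such guard
    std::uint32_t value = 0;
    std::memcpy(&value, row.data() + i, sizeof value);
    return value;
}
```

For a row such as `{0, 10, 200, 255}`, `mapRow` looks up entries 0, 10, 200 and 255, while `fourByteIndex(row, 0)` yields a number far above 255 regardless of byte order, which is the out-of-range access that crashed the program.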


On Casting

Normally your compiler will complain if you tell it flat out to treat a value of one type as if it were another type altogether. By using a cast, you are telling it "Trust me. I know what I am doing."

This can lead to several problems if you actually do not know :)

For example: a line such as your own

int width = *(int*)&info[18];

appears to work because it returns the proper information, but that is in fact a happy accident.

The array info contains separate unsigned char values, and you tell your compiler that there is an int stored starting at position #18; it trusts you and reads an integer. In doing so it assumes that (1) the number of bytes that you want to combine into an integer is in fact the number of bytes that it itself uses for an int (sizeof(int)), and (2) the individual bytes are in the same order as it uses internally (Endianness).

If either of these assumptions is false, you can get surprising results; and almost certainly not what you wanted.

The proper procedure is to look up in the BMP file format specification how the value for width is stored, and then use that information to get the data you want. In this case, width is "stored in little-endian format" at offset 18 as 4 bytes. With that, you can use this instead:

int width = info[18]+(info[19]<<8)+(info[20]<<16)+(info[21]<<24);

No assumptions on how large an int is (except that it needs to be at least 4 bytes), and no assumption on the byte order (shifting a value 'internally' does not depend on endianness).
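
The same idea can be wrapped in a small helper so that every header field is read the same way. This is only a sketch: `readLE32` is a name made up here, and it deliberately works on `std::uint32_t` so that the top byte can be shifted into place without touching the sign bit of a plain int.

```cpp
#include <cstdint>

// Hypothetical helper: assemble a 32-bit value from four bytes stored
// least-significant byte first (little-endian), as in a BMP header.
std::uint32_t readLE32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

With the header from the question, the width and height would be read as `readLE32(info + 18)` and `readLE32(info + 22)`. The result is the same on little-endian and big-endian machines, because the shifts operate on values, not on memory layout.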

So why did it work anyway (at least, on your computer)? The most common size for an int in this decade is 4 bytes. The most popular CPU type happens to store multi-byte values in the same order as they are stored inside a BMP. Add that together, and your code works, on most computers, in this decade. A happy accident.

The above may not be true if you want to compile your code on another type of computer (such as an embedded ARM system that uses another endianness), or when the compiler you use has a smaller size for int (which by now would be a very old compiler) or a larger one (just wait another 10 years or so), or if you want to adjust your code to read other types of files (which will have parameters of their own, and the endianness used is one of them).

Jongware
  • Thank you for your help. It works now. _The amount of casting indicates you were having problems here and probably resolved to adding one cast after another until "it compiled"._ Exactly. I need to read about casting, apparently, because I don't really understand your entire answer. – beginner May 29 '15 at 09:44
  • A comment on **On Casting** edit: Thank you once again. And once again, I need to learn a lot. I don't understand why `int width = info[18]+(info[19]<<8)+(info[20]<<16)+(info[21]<<24);`. Why not `int width = info[18]+info[19]+info[20]+info[21];`? I suppose this is a stupid question, but I really don't get it. The only thing I see is that each one of the consecutive _unsigned chars_ is somehow enriched with an additional byte, compared to the previous one. – beginner May 30 '15 at 20:33
  • @beginner: That's basically exactly the same as `148` - now that ain't `1+4+8` either. The difference with computers is that the base is `256`, not `10`. Being able to count in other bases is, well, not exactly *required* - but it helps if you know the basics. – Jongware May 30 '15 at 20:41
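
The place-value point in the last comment can be checked with a few lines of code. This is an illustrative sketch with a made-up function name, `fromDigits`; it combines digits given least-significant first, which is the order of the bytes in a BMP header.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: combine digits (least-significant first) in a given base.
// Each digit is multiplied by the weight of its position: 1, base, base*base, ...
std::uint64_t fromDigits(const std::vector<unsigned>& digits, unsigned base)
{
    std::uint64_t value = 0;
    std::uint64_t weight = 1;
    for (unsigned d : digits) {
        value += d * weight;
        weight *= base;
    }
    return value;
}
```

In base 10, the digits 8, 4, 1 (least-significant first) give 8 + 4*10 + 1*100 = 148, not 8 + 4 + 1 = 13. In base 256, the bytes 0x80, 0x02, 0, 0 give 128 + 2*256 = 640, whereas simply adding them gives 130. Shifting left by 8, 16 and 24 bits is the same as multiplying by 256, 256² and 256³.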