3

As far as I know, read() and write() exist so we can read and write raw bytes directly from or to a file, and I was taught that the equivalent of a byte in C++ is unsigned char. So why do they take char pointers as parameters?

Also, do take a look at this function from a "bmp file image reader" library I found:

bool BMPImage::readInfo()
{
    //...

    //read bmp and dib headers
    unsigned char header[28] = {0};
    _ifs->read((char*)header, 28);
    _width    = *(int*)&header[18]; //width is located in [18] and is 4 bytes size
    _height   = *(int*)&header[22]; //height is located in [22] and is 4 bytes size
    _bpp      = (unsigned char) *(short*)&header[28]; //bpp is located in [28] and is 2 bytes size
    _channels = _bpp / 8; //set num channels manually

    //...

Why does the _ifs->read() line work anyway? The cast from unsigned char to char forces loss of data, no?

ichramm
McLovin
  • 3
    "The cast from unsigned char to char forces loss of data, no?" - no. –  Jul 19 '17 at 15:01
  • `char` and `unsigned char` have the same size, which is 1 byte. The only difference is signedness: `unsigned char` is always >= 0; plain `char` can also be signed (but doesn't have to be). – Ben Steffan Jul 19 '17 at 15:02
  • There is no practical difference between `char` and `unsigned char`. But since most people use `char` in their programs, it is only natural that the API accepts the most commonly used type. – SergeyA Jul 19 '17 at 15:04
  • 1
    [Can I turn unsigned char into char and vice versa?](https://stackoverflow.com/q/15078638/1460794) – wally Jul 19 '17 at 15:06
  • Even better question: why isn't it a void*? That way I can implicitly read data into any pointer without having to do reinterpret_cast. – Calmarius Jul 27 '19 at 18:38

3 Answers

1

In C and C++, the standards do not specify whether char is signed or unsigned, and implementations are free to implement it as either. There are separate types signed char (guaranteed to hold at least the range [-127,127]) and unsigned char (guaranteed to hold at least the range [0,255]). char has the same range and representation as one of them, but which one is implementation-defined. In C++ it nevertheless remains a distinct type from both.
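As a minimal sketch of how to check this on a given compiler (the function name `plainCharIsSigned` is made up for illustration), the standard type traits can report the signedness of plain char and confirm that the three character types are distinct:

```cpp
#include <limits>
#include <type_traits>

// char is its own type, even though it shares its range with either
// signed char or unsigned char on any given implementation.
static_assert(!std::is_same<char, signed char>::value, "char is distinct from signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char is distinct from unsigned char");

// All three character types occupy exactly one byte.
static_assert(sizeof(char) == 1 && sizeof(signed char) == 1 && sizeof(unsigned char) == 1,
              "character types are one byte");

// Hypothetical helper: true if plain char is signed on this implementation.
constexpr bool plainCharIsSigned() {
    return std::numeric_limits<char>::is_signed;
}
```

The result differs between platforms, so portable code should not depend on it.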

Given that the ASCII character set only contains values 0 to 127, it makes sense that, historically, a single signed byte would have been seen as adequate for holding a single character, while still using the same convention as larger types, where integral types are signed by default unless explicitly declared as unsigned.

David Scarlett
0

Given that char and unsigned char have the same size, there should be no data loss when converting between them.

That said, keep in mind that fstream is just a specialization of std::basic_fstream for char:

// from <fstream>
typedef basic_fstream<char>         fstream;

You can create your own type for unsigned char, like this:

typedef basic_fstream<unsigned char> ufstream; 
ichramm
  • Are c-style casts fine when doing these file io operations? Also, why do these functions still use char and not unsigned char? or even void* as the c equivalents? – McLovin Jul 19 '17 at 15:17
0

was taught that the equivalent of a byte in c++ is unsigned char

I don't know what byte is, but you can use char to represent a byte just fine.

so why do [fstream.read and fstream.write] take char pointers as parameters?

fstream is an alias of std::basic_fstream<char>. std::basic_fstream is a template, and all of its operations deal with its specified char_type. Since that char_type is char, all operations deal with char, not unsigned char.

You could use basic_fstream<unsigned char> as Juan suggested, but it's more involved than that. You will also need to specialize char_traits<unsigned char>, which is the second (defaulted) template argument of basic_fstream<unsigned char>.

The cast from unsigned char to char forces loss of data, no?

No. Accessing unsigned char through a char* loses no data. In fact, accessing any type through char* will not lose data.
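To illustrate, here is a small sketch (the helper name `readBytes` is hypothetical, not part of any library) that reads into an unsigned char buffer through `std::istream::read`. The cast only changes the pointer's type; the bytes that end up in the buffer are exactly the ones in the stream, including values above 127:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to `count` bytes from `in` and returns
// them as unsigned char. The result is shorter if the stream ends early.
std::vector<unsigned char> readBytes(std::istream& in, std::size_t count) {
    std::vector<unsigned char> buffer(count);
    // read() expects char*. Reinterpreting the pointer does not touch the
    // stored bytes, so nothing is lost by going through char*.
    in.read(reinterpret_cast<char*>(buffer.data()),
            static_cast<std::streamsize>(count));
    // gcount() reports how many bytes the last read actually extracted.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

The same function works with a `std::ifstream` opened with `std::ios::binary`, since it only relies on the `std::istream` interface.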


This on the other hand:

*(int*)&header[18]

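has undefined behaviour unless the buffer is aligned such that header[18] happens to be located at the boundary required by int. I see no such guarantee in the definition of the array, and some architectures do not support unaligned memory access at all. Even with suitable alignment, reading an unsigned char array through an int lvalue breaks the strict aliasing rule, because no int object lives at that address.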
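One common way to avoid the cast is to copy the bytes out with `std::memcpy`, or to assemble the value byte by byte. The sketch below uses two hypothetical helpers, `loadInt32` and `loadLittleEndian32`, and assumes the fields are 32 bits wide. BMP stores its fields in little-endian order, so the second helper gives the same result regardless of the host's byte order. Note also that the quoted snippet indexes header[28] in a 28-element array, which is out of bounds; covering the 2-byte field at offset 28 would need a buffer of at least 30 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies a 32-bit value out of a byte buffer.
// memcpy has no alignment requirement and does not violate aliasing rules.
// The bytes are interpreted in the host's byte order.
std::int32_t loadInt32(const unsigned char* bytes, std::size_t offset) {
    std::int32_t value;
    std::memcpy(&value, bytes + offset, sizeof value);
    return value;
}

// Hypothetical helper: builds a 32-bit value from four little-endian bytes,
// independent of the host's byte order.
std::uint32_t loadLittleEndian32(const unsigned char* bytes, std::size_t offset) {
    return static_cast<std::uint32_t>(bytes[offset])
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}
```

With a 30-byte header buffer, the width could then be read as `loadLittleEndian32(header, 18)` and the height as `loadLittleEndian32(header, 22)`.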

eerorika