11

I'm writing some code that uses the fstream read() function and this function expects a char* as the buffer. Later on, I want to work with the bytes in this buffer as unsigned chars, so I'm either going to have to: 1. declare the buffer as a char* and then do static_casts for each element later, 2. declare the buffer as unsigned char* and then do a reinterpret_cast when I pass it to the read function, or 3. declare the buffer as a char* and also create a casted pointer that I use for accessing the buffer as an unsigned char.

Here's a snippet:

char* buf = new char[512];
unsigned char* ubuf = reinterpret_cast<unsigned char*>(buf);

    fstream myfile;

    myfile.open("foo.img");

    myfile.seekg(446);
    myfile.read(buf, 16);
    //myfile.read(reinterpret_cast<char*>(buf), 16);

int bytes_per_sector = ubuf[1] << 8 | ubuf[0];
...

I like this way because I only have to cast once and I can just access the buffer as either type without doing a cast every time. But, Is this a good practice? Is there anything that can go wrong here? Using reinterpret_cast makes me a little nervous because I don't normally use it and I've been told to be careful with it so many times.

shwoseph
  • 249
  • 3
  • 12
  • 1
    This is one of the very few cases where `reinterpret_cast` is actually safe and makes sense. – Konrad Rudolph Feb 01 '15 at 00:38
  • Just a small side-issue: In your code, it seems there's no justification for allocating the buffer dynamically instead of automatically. And as you do a `reinterpret_cast` because none of the others would work, why not do a c-style cast instead? It's equally safe (or unsafe), and might be sufficiently less bulky that one pointer (`unsigned char*` or `char*`, depending on how much you need either) does not result in too much (if any) additional code. – Deduplicator Feb 01 '15 at 00:42
  • 5
    @Deduplicator Ugh. Please don’t recommend using C-style casts. Do consider those deprecated in C++. It’s safe *in this situation*, but it’s a much simpler rule to just ban them outright, and to avoid the potential of confusion. And the `reinterpret_cast`, being more explicit, makes the code more readable as well, since it clearly tells the reader which cast is being performed. – Konrad Rudolph Feb 01 '15 at 00:55
  • @KonradRudolph: More to read does not mean more readability. While in most cases C-style-casts in C++ should be eschewed, the rationale (which is quite sound in most cases) *really* breaks down here badly. Also, always remember that function-style casts are just C-style casts writ different. – Deduplicator Feb 01 '15 at 00:58
  • 4
    @Deduplicator The C++ casts *replace* C cast. Use of C cast is neither helpful nor justified since there is always a more specific C++ cast. There is no reason to ever use a C cast. Your "might be sufficiently less bulky" doesn't make sense since 1. a C cast would just do what the appropriate C++ cast would and 2. in this case that's nothing. – philipxy Feb 01 '15 at 01:06
  • @philipxy: If you reason from the generated machine-code being the same that I'm out-to-lunch about new-style-casts being bulky, then why aren't you programming in assember, where you can much easier avoid that kind of bulk than in a HLL? Let's go back to the C++ source-code level, and all new-style-casts are much bulkier than the old-style, whether re-branded as function-style or not. – Deduplicator Feb 01 '15 at 01:14
  • 1
    @Deduplicator The point both philipxy and I were making is that `reinterpret_cast` is explicit and therefore increases readability even if not type safety. A `reinterpret_cast` has a well-defined meaning. A C-style cast, by contrast, doesn’t. It can mean any of a number of things, so using it in code obscures the actual semantics from the reader. That’s generally acknowledged to be an incredibly bad idea. – Konrad Rudolph Feb 01 '15 at 01:27
  • @Deduplicator I suggest that learn about C++ casts, the as-if rule and the relevant user-visible time and space guarantees by the language definition for casting. – philipxy Feb 01 '15 at 01:29
  • @KonradRudolph: If you argue that always when there's a more verbose option, a concise one "obscures the semantics", I wonder how you get anything done. Engineering is about trade-offs, and whether you use a C-Style-cast, a function-style-cast or a new-style-cast to cast a character-pointer of one flavor to another, there's no added clarity in being thoroughly verbose. Yes, I acknowledged that in most situations using a new-style-cast adds clarity and safety, but you really should respect the limits of applicability for any general advice. – Deduplicator Feb 01 '15 at 01:42
  • @Deduplicator No, I am not arguing that. On the contrary, you will find that I *consistently* argue for the most concise option possible ([here’s a salient example](http://stackoverflow.com/a/263247/1968)). But that’s not the case here: *in this case*, conciseness damages readability. You say that I shouldn’t over-generalise advice and I’d generally agree but as I’ve said in my initial comment, here’s a great situation where a simple, absolute rule (don’t use any C-style casts, full stop) beats a more nuanced, complex rule because it’s easy to enforce and has no real drawbacks. – Konrad Rudolph Feb 01 '15 at 02:23
  • @KonradRudolph: I don't completely agree, though only for the case where it's abundantly clear what type the source is, and thus what the cast does. (Which does not mean I don't agree that hard-and-fast rules are better, especially as enforcement can be automated, and that this rule doesn't have remarkably few downsides.) Also, as *function-style-casts have identical semantics*, they must be seen as under the same indictment, and that's not quite always *practical*. – Deduplicator Feb 01 '15 at 02:45
  • 1
    ... Now, if I had to re-design C++, I would love to downgrade c-style-cast to implicit, function-style to implicit+any ctor+conversion-operator (it's a shame `implicit_cast` isn't in C++14), and make the other 4 kinds (static, const, dynamic, reinterpret) short and concise variants. – Deduplicator Feb 01 '15 at 02:46

1 Answers1

10

In this case a reinterpret_cast is fine, for 2 reasons:

  1. The (signed) char and unsigned char types are required to have the same "representation and alignment". This means that there is not going to be a difference in the data (it will be bit-per bit exact), or how long the buffer is interpreted to be.

  2. File reading functions typically use char* just as a generic data-access type. They can't use void* because the type void has an specifically undefined length and representation. char, however, does. So they can use it to read/write a series of bytes.

Actually, file-functions are oftentimes intended to have data be reinterpreted as/from something else. It allows you to have a struct like

typedef struct structSomeStruct
{
    char name[8]; // a basic, fixed length, string
    unsigned long i; // a 32 bit value
    float x, y, z, w;
} SomeStruct;

Or

class SomeStruct
{
    public:
        char name[8];
        unsigned long i;
        float x, y, z, w;
        
        SomeStruct()
        {
            // ...
        }
};

And store it to a file using something like:

SomeStruct st;

// populate the st data structure

// file.write(char* start_of_data, size_t number_of_bytes);
file.write(reinterpret_cast<char*>(&st), sizeof(SomeStruct));

And read it similarly.

user202729
  • 3,358
  • 3
  • 25
  • 36
Wolfgang Skyler
  • 1,338
  • 1
  • 13
  • 26