34

I am asking this question for two different languages: C and C++.

What is best practice when calling functions that have an opposite integer sign expectation to what we require in our code?

For example:

uint32       _depth;                        // uint32 = DWORD
int          depth;

_BitScanForward(&_depth, (uint32)input);    // DWORD, DWORD
depth = (int)_depth;

_BitScanForward is expecting DWORD (uint32) parameters. The variable input is of int16 type and I need to process the result _depth as an int32 in my code.

  1. Do I need to care about casting input as shown? I know the complier will probably do it for me, but what is best practice?
  2. Is it acceptable to declare _depth as int32 and therefore avoid having to cast it afterwards as shown?

NOTE:

My comment about the complier is based on experience. I wrote code that compiled with no warnings in VS but crashed on execution. Turned out I was calling a function with an incorect width int. So I don't leave this topic up to the compiler any more.

EDIT:

The answers are helpful, thanks. Let me refine my question. If there are no width issues, i.e. the function is not expecting a narrower int than what is being passed in (obvioulsy will fail), then is it okay to rely on the compiler to handle sign and width differences?

IamIC
  • 17,747
  • 20
  • 91
  • 154
  • 2
    "Can I declare _depth as int32" Well, does anything stop you from doing that? – Daniel Daranas Sep 26 '16 at 11:40
  • 2
    "Probably" do it for you? – Lightness Races in Orbit Sep 26 '16 at 11:43
  • 14
    Which language are you using? C or C++? These are two, distinct languages. Pick one! – Lightness Races in Orbit Sep 26 '16 at 11:44
  • @DanielDaranas yeah, of course I can. I'm asking *should* I. – IamIC Sep 26 '16 at 11:44
  • @LightnessRacesinOrbit I haven't tested them all. I know in Dephi (not C, but a compiler), it warns if types aren't cast in code. – IamIC Sep 26 '16 at 11:45
  • @IamIC you actually asked if you _could_. – Daniel Daranas Sep 26 '16 at 11:46
  • 2
    @DanielDaranas Definition of "could" = "Used to indicate ability or permission". Permission as in acceptablness. But I changed my question just for you ;) – IamIC Sep 26 '16 at 11:48
  • 1
    This question might be helpful to you: [Should this compile? Overload resolution and implicit conversions](http://stackoverflow.com/q/8914986/96780) – Daniel Daranas Sep 26 '16 at 11:49
  • My personal guideline is to [Make all type conversions explicit](http://thedeepbluecpp.blogspot.com.es/2014/01/rule-2-make-all-type-conversions.html), and minimize them, of course. The best type conversion is no type conversion. I resist to put that as an answer, though, because it borders the "primarily opinion-based" territory. – Daniel Daranas Sep 26 '16 at 11:51
  • @DanielDaranas I'm currently going with explicit and readable. But it seems tedious. – IamIC Sep 26 '16 at 11:53
  • 2
    Whenever you find explicit casting tedious, ask yourself if you could avoid it by choosing appropriate types in the first place, i.e., prefer option 2. As I said, the type conversion that will surely not "fail" is no type conversion at all. – Daniel Daranas Sep 26 '16 at 11:55
  • @DanielDaranas totally agreed. I always try to use types that functions expect. However, as you know, libraries are inconsistent. – IamIC Sep 26 '16 at 11:58
  • 6
    Please ask for _one_ language at a time. If the answer is different for each of the languages you name, that makes a mess of things! Surely you are in actual fact using one or the other, so just name that one. – Lightness Races in Orbit Sep 26 '16 at 13:20
  • @LightnessRacesinOrbit Fair enough. – IamIC Sep 26 '16 at 13:22
  • In C, the first `DWORD` in the comment `_BitScanForward(&_depth, (uint32)input); // DWORD, DWORD` mis-leads. `_BitScanForward()` takes two arguments: `unsigned long *`, `unsigned long`. The first is an _address_, not a `DWORD`. "_BitScanForward is expecting DWORD (uint32) parameters" is inconsistent with MS documentation. – chux - Reinstate Monica Sep 26 '16 at 16:42
  • 1
    @Danh: If the implementation documents the behavior of an intrinsic with that name, code should behave according to such documentation. The Standard imposes no requirements on what meaning if any, an implementation may attach to that name, but behavior should only be considered "undefined" on implementations which don't define it. – supercat Sep 26 '16 at 16:53
  • @supercat I forget that function is a compiler instrinsic – Danh Sep 27 '16 at 00:46

4 Answers4

38

I would strongly recommend to hide that function into a custom wrapper function which agrees with your preferred API (and within this function do proper explicit casting). In the case of using compiler-specific functions this has the additional advantage that it will be much easier to port it to different compilers (should you ever want to do that), by just re-implementing that wrapper function.

chtz
  • 17,329
  • 4
  • 26
  • 56
12

It is very important to write an explicit cast when going from any integer type that is narrower than int to any integer type that is the same width or wider than int. If you don't do this, the compiler will first convert the value to int, because of the "integer promotion" rules, and then to the destination type. This is almost always wrong, and we wouldn't design the language this way if we were starting from scratch today, but we're stuck with it for compatibility's sake.

System-provided typedefs like uint16_t, uint32_t, WORD, and DWORD might be narrower, wider, or the same size as int; in C++ you can use templates to figure it out, but in C you can't. Therefore, you may want to write explicit casts for any conversion involving these.

zwol
  • 135,547
  • 38
  • 252
  • 361
  • 1
    When deciding that short unsigned values should promote to signed, the authors of the C Standard noted that the majority of then-current compilers would reliably behave in arithmetically-correct fashion for constructs like `unsigned mul(unsigned short x, unsigned short y) { return x*y; }` even for results in the range INT_MAX+1 to UINT_MAX. For some reason, however, the maintainers of gcc have decided it's more useful to have the "optimizer" generate code that's only reliable for values up to INT_MAX. – supercat Sep 26 '16 at 16:48
  • @zwol so in the example I give, where the ints are of the same width but only one is signed, is it still best practice to write the code as I have, or is this unnecessary? – IamIC Sep 26 '16 at 17:26
  • @supercat so would `malloc(sizeof(MyType) * i)` where i is an int and sizeof returns an unsigned long long... would this require casting of the i? – IamIC Sep 26 '16 at 17:34
  • 1
    @IamIC It is still best practice in your case because `uint32` and `int` might not always be the same size; for instance if you ever port the code to another environment. Re `malloc`, unfortunately the only reliable way to do that operation is rather more complicated than fits in a single expression, see http://lteo.net/blog/2014/10/28/reallocarray-in-openbsd-integer-overflow-detection-for-free/ (read the *entire thing*). – zwol Sep 26 '16 at 17:37
  • 1
    @IamIC: The only way that expression would cause trouble would be if the product exceeded the maximum value for size_t. If "int" were 16 bits but "size_t" were 8 bits, and "sizeof(MyType)" were 200, and "i" were 240, then the multiplication would yield Undefined Behavior rather than computing 0xC8u * 0xF0u (i.e. 0xBB80) and passing "malloc" a value of 0x80 (128). On the other hand, adding casts so code would pass malloc() a value of 128 rather than invoking UB in the multiplication probably wouldn't be very useful. – supercat Sep 26 '16 at 17:54
  • @supercat that is what i suspected. I would like to think the compiler will sensibly perform casting as required, but your first comment about the gcc compiler casts doubts on this idea. – IamIC Sep 26 '16 at 18:11
  • 1
    @IamIC In my opinion, the problem supercat cites is not nearly as important as the problem of getting values sign-extended where you expected them to be zero-extended or vice versa. – zwol Sep 26 '16 at 18:19
  • @zwol I agree. This certainly is an educational discussion for me. – IamIC Sep 26 '16 at 18:34
  • 1
    @zwol: I'd say the fact that code may sometimes misbehave bizarrely in rare but unpredictable circumstances is a bigger problem than code which behaves in a fashion which is deterministic and logical but doesn't fit the application's needs. The interesting "optimizations" gcc does on the above function don't merely produce unexpected values--if e.g. gcc recognizes that one argument will always be 65535, it may conclude that the other argument can't exceed 32768 and omit other code which would only be relevant if such a condition arose. – supercat Sep 26 '16 at 18:53
  • Could you, please, add examples of issues you could encounter by not explicitly casting from `< int` to `>= int`? Shouldn't the `< int` to `int` and `int` to `>= int` both be losless conversions? Then why would such a two-step process produce a different result? – relatively_random Oct 03 '16 at 11:57
4

Well It kind of depends on your usage etc:

If I can use the type which is needed I just use the type.

If not: Your compiler should warn you in the cases where you implicitly convert datatypes which may result in over/underflows. So I have those warnings on usually and change the implicit conversion to explicit ones.

There I have 2 different approaches:

If I am like 100% sure that I never over/underflow the boundaries between signed/unsigned int I use static_cast. (usually for conversion of different APIs. Like size() returning int vs size_t).

When I am not sure or it may be possible I am beyond the boundaries I use boost::numeric_cast. This throws an exception when you cast beyond boundaries and thus shows when this happens.

The approach with the exceptions adheres to the practice to fail hard/crash/terminate if something goes wrong instead of continuing with corrupt data and then crash somewhere else or do other things with undefined data.

Hayt
  • 5,210
  • 30
  • 37
1

First your compiler will make the casts implicit and will give you a warning on any meaningful warning level.

Both casts you perform are casts where the compiler (or your coworkers) cannot easily decide if they are correct, therefor an explicit casting or explicit conversion with a boundary test is best practice. Which you choose depends on your knowledge of the data. The safest way is to check boundary conditions. The cheapest way is to simply cast (in C++ please use static_cast not C-style casts).

user6556709
  • 1,272
  • 8
  • 13
  • I'm unclear why my casts would be unclear to the compiler / a human. Please explain. – IamIC Sep 26 '16 at 12:14
  • 2
    Your first cast from signed to unsigned, this is always unclear for the compiler. Since it doesn’t know if your signed variable is negative or positive. The second cast is unsigned to signed. Here the compiler cannot determine, if the value of your unsigned int is greater than the maximal signed value. Therefor it has to warn you. Your coworkers have the same problem. By only looking at a small part of code, they cannot decide, if you are doing the right thing. – user6556709 Sep 26 '16 at 12:22
  • The only alternative would be to cast to a wider int. Are you suggestion that I don't cast? I don't think the compiler will widen the input because they function won't accept it. – IamIC Sep 26 '16 at 12:27
  • 1
    @IamIC: by using C++ `reinterpret_cast` you will tell on source level, that you want to use the same bit pattern as [u]int, wanting to go for example from `int32_t -1` to `uint32_t 0xFFFFFFFF` in terms of value. Reinterpret_cast will not compile to any instruction, it's just source-level change of type of value. This may help, if you have some badly designed API, which is for example returning `int size()`, but it was never meant to return negative number, etc... Overall the best practice is of course to avoid any conversions and use the proper type all the way trough, in other case cast+test. – Ped7g Sep 26 '16 at 12:39
  • @Ped7g that really helps! I guess in C the answer would be to cast as I've shown, rather than leaving it as implicit. – IamIC Sep 26 '16 at 12:43
  • 1
    @IamIC the C is not as strongly typed as C++, still **explicit** (was implicit :/ :D )C-like cast is better than compiler warning. It shows the programmer paid attention and verified the cast does produce expected result in all cases. But overall I don't see any reason why to use C today, when C++ is available. – Ped7g Sep 26 '16 at 13:03
  • @Ped7g I'm writing functions for PostgreSQL. In accepts only C, not C++. Could you copy your comments into an answer, please? Ps. I meant implicit ;) – IamIC Sep 26 '16 at 13:12
  • 1
    @lamlC I would never suggest to remove the casts. Casts to different signedness should always be written explicit. Only to make sure you (some weeks later), your compiler and your coworker understand, you really want to do this conversion and you know this is free of side effects. – user6556709 Sep 26 '16 at 13:12