Casting numeric data types in C

Question

I saw this question on SO about casting, and the answer specifically mentioned numeric types. I thought I understood this till now, when I have to do it.

I have two int's I want to add, divide by two and save to another int. So I create a temporary variable called 'bucket' that's one datatype bigger, add the two ints and shift right by one.

I've come up with several ways to do this, all seem to work, but several are, I think, unnecessary. Please look this over and let me know where I misunderstood K&R.

Case1:

bucket      = ( (long) ADCBUF0 + (long) ADCBUF7) >> 1;
rem_volt       = bucket;

Case 2:

bucket      = ( (long) ADCBUF1 + (long) ADCBUF8) >> 1;
loc_volt       = (unsigned int) bucket;

Case 3:

bucket      = (long) ADCBUF2 + (long) ADCBUF9;
current     = (unsigned int) (bucket >> 1);

Case 4:

bucket      = ADCBUF3 + ADCBUFA;
dac_A         = (unsigned int) (bucket >> 1);

Case 5:

bucket      = (long) (ADCBUF4 + ADCBUFB) >> 1;
dac_B       = (unsigned int) bucket;

I think Case 4 or Case 5 is the correct way to do this, since Case 1 implicitly casts a long to an int, Case 2 I think is right, but more typing, Case 3 I think uneccessarily casts the ints to longs when only their sum needs to be cast.

Thanks for your thoughts.

EDIT: Thanks for the comments so far! To clarify, I am being sloppy about the unsigned vs signed, they should all be unsigned. I am trying to avoid an overflow on the addition. My long is 2 bytes bigger than my int. (So it's really much bigger than needed, I could just check the overflow bit.) I'm working on an embedded platform, I will never port this (I swear!) but I am trying to consider whomever comes after me and tries to puzzle out what I'm doing, so I'll use the divide by 2 rather than the bitshift.

You might want to try each of these with INT_MAX for the two inputs to see what happens. It won't necessarily tell you what the standard says, but it'll at least give you an idea of what your compiler thinks. — Laurence Gonsalves, Jan 13 '10 at 18:48
It would help if you declared the types of your variables in these examples. — Paul Stephenson, Jan 13 '10 at 19:05
you should also keep in mind that `int` and `long` may be of same size! — Christoph, Jan 13 '10 at 19:42
In general, I try to avoid mixing bitwise operators (like shift right) with arithmetic operators. If you want to divide by two, then write `/ 2`. Besides being clearer, bitwise operations on signed types often get you into trouble. Most compilers will optimize the divide to a shift if it really is faster for your platform. — Adrian McCarthy, Jan 14 '10 at 00:28

score 4 · Answer 1 · 2010-01-14T00:49:50.180

(You said that your variables are ints, a and b is what I've used below, and I'm using this assumption: int a, b;.)

The problem with taking the average of two numbers by adding and then dividing is you might overflow (as Laurence Gonsalves alluded to in his comment). If you know that the sum won't exceed INT_MAX, then using int is fine, and nothing more needs to be done:

int avg = (a + b) / 2;

If the sum might overflow, then moving up to a type which won't overflow, such as long, is fine; however, remember that long and int might be the same size (INT_MAX might equal LONG_MAX), in which case this doesn't help, except that INT_MAX would be much larger—at least 2,147,483,674.

int avg = ((long)a + b) / 2;

Note that when b is added to a, b is automatically converted to a long. You might need to cast the final result to int to avoid warnings (including from a lint program):

int avg = (int)(((long)a + b) / 2);

However, remember that integer limits are defined by the implementation; the real concern is whether the addition could ever be more than INT_MAX and casts are just a way to explicitly specify type and thus avoid that unintentional overflow.

When you want to divide by two, use / 2, don't try to get clever with bitshifts. Compilers have been smart enough to optimize this kind of thing for years, especially when you use constants, and code clarity is more important.

If ints, a+b will never overflow over INT_MAX as int is a signed type. At most the sign will change. Overflowing INT_MIN is the worry. If values are never negative(eg. ret<0 for error) you'd only need to cast to unsigned so that >>1 and /2 don't keep the sign. — , Jan 14 '10 at 00:14
jbcreix: I think you're confused. If *a* and *b* are INT_MAX, then `a+b` overflows, and overflow on signed integer types is undefined behavior. — , Jan 14 '10 at 00:43
It is *unsigned* integer types for which overflow isn't a problem, as they are required to behave wrt mod 2**n. — , Jan 14 '10 at 00:45

score 3 · Answer 2 · answered Jan 14 '10 at 00:00

3

This isn't answering the actual question you asked, but it might be useful, and I like bit-twiddling hacks...

It's possible to calculate the average (rounded down) of two unsigned integers, without overflowing or using an intermediate wider type, using some trickery:

avg = (a & b) + ((a ^ b) >> 1)

Why it works: for each column of the binary addition, a & b gives a "1" bit if the bits in that column sum to 2, and (a ^ b) >> 1 gives a "1" bit for the column to the right (which you can think of as 1/2 for the column you're adding) where the bits in that column sum to 1.

answered Jan 14 '10 at 00:00

Matthew Slattery

45,290
8
103
119

Too clever! Sounds like a great trick for the obfuscated C contest. – Adrian McCarthy Jan 14 '10 at 00:32
Well, you can implement a non-overflowing average in an arguably more readable way. For example, `a / 2 + b / 2 + (a % 2 & b % 2)` will work. If you prefer it in bit terms, it would turn into `(a >> 1) + (b >> 1) + (a & b & 1)` – AnT stands with Russia Jan 14 '10 at 04:22

AnT stands with Russia · Accepted Answer · 2010-01-14T07:40:53.357

You don't provide some important details, but under some reasonable assumptions it is 4 and 5 that are broken, while 1-3 are fairly OK.

I assume that you are insisting on the use of bigger intermediate integral type because you want to avoid overflow on addition. It is also not clear whether you are working with signed on unsigned types and why your are mixing them like that.

Anyway, the minimal proper way to perform the addition and shift would be

bucket = ((long) ADCBUF0 + ADCBUF7) >> 1;

Switching to bigger type in just one operand is sufficient. Of course, you can keep both casts if you like the more "symmetrical" look better.

Or you can do it without any casts at all (assuming that bucket is declared with "bigger" type)

bucket = ADCBUF0;
bucket = (bucket + ADCBUF7) >> 1;

As for the final assignment, if the recipient is an unsigned int by itself (this is completely unclear from your question), there's no real need for any cast, unless you are trying to suppress the compiler warnings. I assume that it is an unsigned int. In that case just

rem_volt = bucket;

will work. If it is not an unsigned int, then the explicit cast might make a difference, but there no way to say, until you provide this important detail.

It doesn't matter where you do the shift in your case (in the first or in the second statement), again, if bucket has "bigger" type. If bucket is declared with the same type as the ADC... variables, then the shift must be done early (in the first expression).

As for your variants: 1 is OK, but the second cast is technically redundant (matter of style). 2 is OK, with two excessive casts. 3 is also OK, with an excessive cast. 4 is broken, since it doesn't protect from overflow (if that was your intent). 5 is broken in the same way.

In other words, out of all your variants number 1 looks the best (again, assuming that the cast to unsigned int is redundant).

P.S. If you want to divide something by 2, the most reasonable way to do it is to use the division operator / and the constant 2 as a divisor. There's no meaningful reason to bring any shifts into the picture.

Also Matthew Slattery in his answer provided a non-overflowing way to calculate the average using bit-twiddling operations. It can also be done in another, (arguably) more readable way through

average = ADCBUF0 / 2 + ADCBUF7 / 2 + (ADCBUF0 % 2 & ADCBUF7 % 2);

or, if you prefer the bit-twiddling approach

average = (ADCBUF0 >> 1) + (ADCBUF7 >> 1) + (ADCBUF0 & ADCBUF7 & 1);

I don't think `ADCBUF0` is intended to be a hex constant because `U` isn't a hex digit. — Paul Stephenson, Jan 13 '10 at 19:07
You might want to mention that the "without any casts" way assumes that `bucket` is of type `long`. — caf, Jan 14 '10 at 03:53

score 1 · Answer 4 · answered Jan 13 '10 at 19:00

I would use 2 or 3. On the first line, you need to make sure the addition is done as a long to prevent overflow. On the 2nd line, specifying that you are doing a downcast prevents compiler warnings.

4 and 5 are actually broken on a platform where long is bigger than int.

score 0 · Answer 5 · answered Jan 13 '10 at 23:56

This has been stated in other anwers, but maybe not so explicitly. To answer your question: "where I misunderstood K&R":

an arithmetic operation between 2 different types is performed by 'promoting' the lower to the higher, and the result is carried at that level - so multiple casts between just 2 types are redundant
an assignment of a higher to a lower type is done by truncation - the cast doesn't change the data (it's only for compiler warnings)

My question to OP is how you thought "all seemed to work"? I could not find a way to make that happen (with meaningful data), even with some interpolation of your assumptions and examples. Did you test with meaningful data? From the names of your variables, it looks like you just took the sample code out of an application. That is not a good place to test something like the question at hand. It's easy to make a mistake when you have 10+ variables. Test different options in a dedicated test program with as little variation as possible.

In addition other comments, I'll mention that you can use

printf("Sizes: int=%d long=%d\n", sizeof(int), sizeof(long)); // note parens!

to check sizes without having to know about a limits.h or other include file to define the platform. Including this result in your original post would have cleared a question in some minds because on many current desktop platforms, an 'int' is the same size as a 'long' (using 'long long' for greater word size)

Casting numeric data types in C

5 Answers5