Extract upper and lower word of an unsigned 32-bit integer

Question

To extract the upper and lower word of an unsigned 32-bit integer and store each one in separate uint16_t-variables I do as follows (nbr is an unsigned 32-bit integer):

uint16_t lower_word = (uint16_t) nbr & 0x0000FFFF;  // mask desired word
uint16_t upper_word = (uint16_t) ((nbr & 0xFFFF0000) >> 16); // right-shift after masking

Is the explicit conversion to an uint16_t unnecessary? What other more effective ways, if there are any, do you recommend to obtain the desired results instead of this approach?

score 8 · Answer 1 · answered Dec 21 '18 at 10:33

The C type system is both subtle and dangerous. The explicit conversion may or may not be necessary. In case of (uint16_t) nbr & 0x0000FFFF specifically, the cast is not correct, assuming 32 bit CPU.

You cast before the operation takes place. Meaning that the operand nbr will get explicitly converted by the cast, and then immediately implicitly converted up to int by implicit integer promotion. The result will be of type int, which is signed. Harmless in this case but can cause trouble in other cases. By using the incorrect cast you made a signed int out of a uint32_t which was not the intention.

Overall you need to be aware of Implicit type promotion rules.

Although, there is an implicit lvalue conversion upon assignment back to uint16_t, which most of the time saves the day.

Also note that 0x0000FFFF is dangerous style. Hex literals are of the type where the value will fit, regardless of how many zeroes you put before the value. In this case, it is int which is signed. On a 16 bit system, 0x0000FFFF would give int but 0x00008000 would give unsigned int. (Check this weird bug for example: Why is 0 < -0x80000000?)

The best practice, rugged, portable, MISRA-C compliant code, is code which does not contain any implicit conversions at all:

uint32_t nbr = ...;
uint16_t lower_word = (uint16_t) (nbr & 0xFFFFUL);
uint16_t upper_word = (uint16_t) ((nbr >> 16) & 0xFFFFUL);

This assuming nbr is known to be uint32_t, otherwise it is best practice to cast that operand to uint32_t before casts.

The masks are not really necessary in this specific case, but in the general case, for example when masking out 4 bytes from a uint32_t.

@EricPostpischil As stated initially: "assuming 32 bit CPU". Obviously if `int` or `long` is 64 bit on some garage compiler, swap `UL` for `U`. Seriously, have you nothing better to do than stalking every post I make and come with language-lawyer remarks that are far beyond what any newbie will need to know? This answer is already far too detailed to be pedagogical, without covering portability to obscure garage compilers. — Lundin, Dec 21 '18 at 15:17

bruno · Accepted Answer · 2018-12-21T13:06:15.000

6

uint16_t lower_word = (uint16_t) nbr;
uint16_t upper_word = (uint16_t) (nbr  >> 16);

masks are useless

cast are necessary else the compiler could produce warning

{edit to take into account the remark of Lundin / Eric Postpischil}

for instance gcc -Wconversion produces a warning without the cast

edited Dec 21 '18 at 13:06

answered Dec 21 '18 at 10:30

bruno

32,421
7
25
37

I see your logic. The masking is seemingly unnecessary. – Abdel Aleem Dec 21 '18 at 10:37
The assignments do not violate any constraint in the C standard, so a compiler is not required to produce a warning, and most do not by default, even when “all” normal warnings are requested. GCC, for example, warns only when “-Wconversion” is explicitly requested; this warning is not in its set of “-Wall” warnings. – Eric Postpischil Dec 21 '18 at 12:44
@EricPostpischil yes, Lundin warned ( :) ) me about that in a comment on the next answer, but thank you – bruno Dec 21 '18 at 12:58

score 2 · Answer 3 · answered Dec 21 '18 at 10:30

2

No, you don't need the typecast and i would recommend not to use one. This is because it has a higher precedence than the & operator, so nbr is first converted to uint16_t and then masked. This is also why the second line would not work without the additional parentheses.

Apart from that the code is just fine, and there is no real reason to use a different approach. You could also do the shift first and then mask the value, but the resulting assembler code should be exactly the same.

answered Dec 21 '18 at 10:30

Felix G

674
1
7
17

1

@AbdelAleem Please don't "fix" the question once there are answers posted. Instead fix the actual code :) I will rollback your changes as it makes posted answers irrelevant. – Lundin Dec 21 '18 at 10:35
I disagree, typecast are needed because he wants the result in 16 bits and of course the code must compile without warnings – bruno Dec 21 '18 at 10:37
The code does compile without warnings, at least on gcc 8.2, even if you use -Wall -Wpedantic -Wextra. That said, the casts don't hurt either, at least if operator precedence is respected (so the additional parentheses should always be used) – Felix G Dec 21 '18 at 10:46
@FelixG -Wall at least is mandatory to compile for me, a compiler does not produce warning/error because it is not gentle but because that helps, and the programmer have to take them into account :) – bruno Dec 21 '18 at 10:51
1

@bruno i completely agree with you on that, and i actually use -Wall -Wextra on all my code. That's why i did a quick test, and found that gcc doesn't complain if the casts are omitted, even if i add -Wpedantic – Felix G Dec 21 '18 at 10:56
@FelixG oh yes, so strange ! – bruno Dec 21 '18 at 11:02
1

gcc has a separate warning for potentially dangerous type conversions called `-Wconversion`. It is not included in `-Wall` and `-Wextra`, likely because it gives a high amount of false positives together with spotting real bugs (like a static analyser). `-pedantic` only affects standard compliance and `uint16_t u16 = u32 >> 16;` is fine as far as the standard is concerned. – Lundin Dec 21 '18 at 11:08

score -1 · Answer 4 · answered Dec 21 '18 at 10:39

If you need to repeat this operation multiple times in your code, there is another way of doing so using unions :

typedef union _uplow                                                                                     
{                                                                                                        
  struct _reg {                                                                                          
    uint32_t low : 16;                                                                                   
    uint32_t up  : 16;                                                                                   
  } reg;                                                                                                 
  uint32_t word;                                                                                         
} uplow;

Declare your variable as follows :

uplow my_var;
my_var.word = nbr;

Use it like that :

printf ("Word : 0x%x\n Low : 0x%x\n Up  : 0x%x\n", my_var.word, my_var.reg.low, my_var.reg.up);

Output :

Word : 0xaaaabbbb
 Low : 0xbbbb
 Up  : 0xaaaa

This is horribly non-portable, both compiler-specific and CPU-specific. While the OP's code is 100% portable and much better than this. You could use a union between `uint16_t[2]` and `uint32_t` to make it less bad, but you would still have endianess dependence needlessly. — Lundin, Dec 21 '18 at 10:41
@Lundin yes, and so complicated at the same time. I down voted this answer, like I did with the now removed answer using the pointer, these two down votes are the only ones I did on S.O. but difficult to not do :( — bruno, Dec 21 '18 at 11:05

Extract upper and lower word of an unsigned 32-bit integer

4 Answers4