
I'm trying to debug existing code that formats a small integer as a 4-character hexadecimal C string. However, the behaviour appears to be inconsistent between positive and negative integers.

Here is the code:

char mystring[5];
mystring[4] = 0;

sprintf (mystring, "%04X", (char)(61));
// ---> mystring is "003D" [OK]
// ---> return value is 4 (chars written) [OK]
 
sprintf (mystring, "%04X", (char)(-61));
// ---> mystring is "FFFFFFC3" [NOT OK]
// ---> return value is 8 (chars written) [NOT OK]

In the second case, I have 8 characters written, despite the %04X format. What is going on? How can I limit the result to only 4 chars?

Silverspur
    `char` is promoted to an integer type. Signed types fill the extra space with copies of the sign bit, so you get `FF` in front. Note that the format string `%x` represents the `int` type. Try `%04hhX`. https://godbolt.org/z/jf4ao5 – Marek R Mar 04 '21 at 15:56
    @MarekR `%x` is for `unsigned int`, not `int`. The `char` argument is promoted to `int` (or `unsigned int` on some exotic systems). – eerorika Mar 04 '21 at 16:06

3 Answers


The "%04" tells sprintf only the minimum number of digits to use.

If the number needs more, it will get more so the output is not truncated.
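
A minimal sketch of that behaviour, assuming a 32-bit unsigned int. The helper name hex_min4 is hypothetical, and its 16-byte buffer is deliberately larger than any possible result so that nothing gets cut off:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats value with "%04X" into a buffer that is
// large enough for any 32-bit value (at most 8 hex digits + terminator).
std::string hex_min4(unsigned int value) {
    char buf[16];
    int written = std::snprintf(buf, sizeof buf, "%04X", value);
    // snprintf returns the number of characters the full output needs.
    assert(written >= 0 && static_cast<std::size_t>(written) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(written));
}
```

With this helper, hex_min4(0x3Du) is padded up to "003D", while hex_min4(0xFFFFFFC3u) comes back as the full 8 digits "FFFFFFC3".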

U. W.

That happens because of integral promotion rules. When passed to a variadic function such as sprintf, a char is promoted to an int. An int is usually represented as 32-bit two's complement, so a negative value like -61 becomes FFFFFFC3.
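
The promotion can be reproduced without any formatting at all. This is only a sketch: promoted_bits and check_promotion are hypothetical names, and the FFFFFFC3 figure assumes a 32-bit int.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: mimics what happens to a signed char argument
// before %X interprets it as an unsigned int.
unsigned int promoted_bits(signed char c) {
    int promoted = c;  // integral promotion keeps the value (-61 stays -61)
    // Conversion to unsigned int is modulo UINT_MAX + 1.
    return static_cast<unsigned int>(promoted);
}

// Hypothetical self-check of the two cases from the question.
void check_promotion() {
    assert(promoted_bits(61) == 61u);
    // UINT_MAX + 1 - 61, i.e. 0xFFFFFFC3 when unsigned int has 32 bits.
    assert(promoted_bits(-61) == UINT_MAX - 60u);
}
```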

Then, the width field, as in %04, specifies only the minimum width. When a value needs more digits than that, it is printed in full.

As a workaround, you can use the hh length field, which specifies that the original value was a char and should be treated as such.

sprintf (mystring, "%04hhX", -61);

- should output 00C3.

If I use sprintf (mystring, "%04hhX", (char)(-61)); as you suggest, I get 00C3 instead of FFC3. What is going on?

A char is in practice 1 byte (8 bits), so -61 is represented as C3. The 00 prefix comes from the padding requirement of 04. To get FFC3, use a 16-bit data type (e.g. short) with the h length modifier, for example "%04hX":

sprintf (mystring, "%04hX", -61);

- should output FFC3.
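
To compare the two length modifiers side by side, here is a small sketch. The helper names are hypothetical, the values are narrowed explicitly before the call so that the argument types match the format, and the results assume 8-bit char and 16-bit short:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: keeps only the low 8 bits, then pads to 4 digits.
std::string hex_as_byte(int value) {
    char buf[16];
    unsigned char narrowed = static_cast<unsigned char>(value);
    int n = std::snprintf(buf, sizeof buf, "%04hhX", narrowed);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return buf;
}

// Hypothetical helper: keeps the low 16 bits (assuming a 16-bit short).
std::string hex_as_short(int value) {
    char buf[16];
    unsigned short narrowed = static_cast<unsigned short>(value);
    int n = std::snprintf(buf, sizeof buf, "%04hX", narrowed);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return buf;
}
```

Under those assumptions hex_as_byte(-61) gives "00C3" and hex_as_short(-61) gives "FFC3".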

Alternatively, you can trim the unnecessary bits before formatting and treat the value as an unsigned int:

sprintf (mystring, "%04X", (-61 & 0xFFFF));

The bitwise-and operation (&) is useful for setting unnecessary bits to 0.
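
Wrapped into a function, the masking approach might look like the sketch below. The name hex16 is hypothetical. Because the masked value never exceeds 0xFFFF, the output is always exactly 4 digits and fits the 5-byte buffer from the question.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the low 16 bits of value as 4 hex digits.
std::string hex16(int value) {
    char buf[5];  // 4 hex digits + terminating '\0'
    // Convert to unsigned first, then clear everything above bit 15.
    unsigned int low16 = static_cast<unsigned int>(value) & 0xFFFFu;
    int n = std::snprintf(buf, sizeof buf, "%04X", low16);
    assert(n == 4);  // a value in 0..0xFFFF always yields 4 digits here
    return buf;
}
```

For example, hex16(-61) gives "FFC3" and hex16(61) gives "003D".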


Note that I'm mixing signed and unsigned int in this post. That is intentional and is OK to do. The behavior is implementation-defined, but it works in practice because all modern computers use two's complement integer representation. For example, the last example can be "improved" by using an unsigned value, (-61 & 0xFFFFu), but that has absolutely no effect on the end result.

rustyx
"%04X", (char)(61)

You have used the wrong format specifier. As a result, the behaviour of the program is undefined. On exotic systems, the behaviour may be inadvertently well defined, but probably not what you intended.

%X is for unsigned int. The char argument promotes (on most systems) to int for which the format specifier is not allowed. Regardless, format specifiers for int and unsigned int will treat the input as a multi-byte value. It just so happens that a 4 byte int represents the value -61 as FF'FF'FF'C3.

To ignore the high bytes of the promoted argument, you must use the length specifier in the format. hh is for signed char and unsigned char. Note that there is no numeric format specifier for char. Furthermore, there is no hex format for signed numbers. So, you should be using unsigned char. Here is a correct example:

unsigned char c = -61;
std::sprintf (mystring, "%04hhX", c);

And another, using signed decimal:

signed char c = -61;
std::sprintf (mystring, "%04hhd", c);

I have 8 characters written, despite the %04X format.

The width does not limit the number of characters. It is the minimum width to which the output is padded.

How can I limit the result to only 4 chars?

Use std::snprintf instead:

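// First pass: a null buffer is only allowed with size 0; returns the length needed.
int count = std::snprintf(nullptr, 0, "%04hhX", c);
assert(count >= 0 && static_cast<std::size_t>(count) < sizeof mystring);
// Second pass: writes at most sizeof mystring - 1 characters plus the terminator.
std::snprintf(mystring, sizeof mystring, "%04hhX", c);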
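
One caveat is worth checking before relying on the size limit alone: snprintf cuts the output off at the end, so when the value is too wide it is the leading digits that survive. A sketch with a hypothetical helper bounded_hex, assuming a 32-bit unsigned int:

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical helper: returns the text that fit into a 5-byte buffer,
// together with the length snprintf reports for the untruncated output.
std::pair<std::string, int> bounded_hex(unsigned int value) {
    char buf[5] = {};  // 4 characters + terminating '\0'
    int needed = std::snprintf(buf, sizeof buf, "%04X", value);
    assert(needed >= 0);  // a negative result would signal an encoding error
    return {std::string(buf), needed};
}
```

For 0xFFFFFFC3u this returns the text "FFFF" and a needed length of 8, which is why the return value should be compared against the buffer size, as the assert in the answer does.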

When I use your first suggestion with an unsigned char, I get 00C3 instead of FFC3. What is going on?

When -61 is converted to unsigned char, the resulting value is 195 (256 - 61). 195 is C3 in hexadecimal.
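
The arithmetic can be verified with a few lines. This is only a sketch: to_byte and check_to_byte are hypothetical names, and the numbers assume 8-bit bytes.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helper: conversion to unsigned char reduces the value
// modulo 256 (UCHAR_MAX + 1).
constexpr unsigned char to_byte(int value) {
    return static_cast<unsigned char>(value);
}

// Hypothetical self-check of the numbers quoted above.
void check_to_byte() {
    assert(to_byte(-61) == 195);   // 256 - 61
    assert(to_byte(-61) == 0xC3);  // 195 written in hexadecimal
}
```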


P.S. Consider using std::format if possible.

eerorika
  • Same comment as for @rustyx answer: when I use your first suggestion with an `unsigned char`, I get `00C3` instead of `FFC3`. What is going on? – Silverspur Mar 05 '21 at 09:47