1

When using the %a specifier with the printf family of functions, what is the longest possible string that results for a float and for a double?

Assume that float and double are IEEE-754 32-bit and 64-bit floating points.

To be more specific, how large does buf need to be for the following functions so that buf does not get overrun (note that sprintf will also write a null-terminator):

#include <stdio.h>

void StringFromFloat(char *buf, float f) {
    sprintf(buf, "%a", f);
}

void StringFromDouble(char *buf, double d) {
    sprintf(buf, "%a", d);
}

I am assuming that %A has no difference in max length.

Costava
  • 1
    **Never** use `sprintf()` if you don't know the max length. You know the size of the buffer, so use `snprintf()`. Or use `snprintf()` to get the length and `malloc()` an appropriate-length buffer. (On Linux use `asprintf()`...) – Andrew Henle Apr 23 '23 at 22:31
  • 1
    @AndrewHenle Using `snprintf()` instead of `sprintf()` is reasonable, yet not enough. Using `snprintf()` without checking its return value pushes the problem down the road. One idea would use `volatile int len = snprintf(...,n,...); assert(len >= 0 && (unsigned) len < n);` or perhaps something less severe. IAC, the allocation approach is certainly reasonable if one is not concerned about dynamic allocation overhead. Perhaps maybe a VLA? For me, using `char buf[sizeof(double)*CHAR_BIT + 1];` would certainly handle pedantic cases. Best answer depends on the larger unposted context. – chux - Reinstate Monica Apr 24 '23 at 14:12
  • @chux-ReinstateMonica All good points. I didn't put that much effort into my comment, though. One thing I do like to do with `snprintf()` - use a moderately large buffer with a fixed size on the "get the size of the necessary buffer" first call to `snprintf()`. If it does work and that buffer is large enough, then there's no need to run `snprintf()` twice, which might be an expensive operation. – Andrew Henle Apr 24 '23 at 15:11
  • @AndrewHenle I used to try to find a minimal maximum size (as OP apparently wants). Now I use `char buf[sizeof floating_point_object * CHAR_BIT + 1];`, call it good and move on. – chux - Reinstate Monica Apr 24 '23 at 15:29

3 Answers

3
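#include <float.h>
#include <math.h>   /* M_PI is a POSIX extension, not ISO C */
#include <stdio.h>

int main(void) {
    int c;
    /* %n stores the number of characters written so far; +1 adds room for '\0' */
    printf("%a%n\n", -M_PI, &c);
    printf("%d\n", c+1);
    printf("%a%n\n", -DBL_MAX, &c);
    printf("%d\n", c+1);
    printf("%a%n\n", -DBL_MIN, &c);
    printf("%d\n", c+1);
}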

emits

-0x1.921fb54442d18p+1
22
-0x1.fffffffffffffp+1023
25
-0x1p-1022
11

https://coliru.stacked-crooked.com/a/9c8e5f7a059b66a5

If we assume the format is standard, and that doubles are IEEE-754 64-bit, then the answer appears to be a buffer size of 25 (24 characters plus the trailing null).

The IEEE-754 64-bit type uses a sign bit (1 character), a 52-bit mantissa (13 hex characters), and an 11-bit exponent (up to 5 characters: a sign plus 4 decimal digits, since the exponent is printed in decimal). That matches our results above (-, fffffffffffff, and +1023), which implies that the other 6 bytes are ~standard formatting (0x1., p, and '\0').
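The same measurement can be made without `%n` by asking `snprintf()` for the length only. This is a minimal sketch, assuming IEEE-754 `float`/`double` and a C99 library; `hexfloat_len` is a hypothetical helper name, not something from the question. Note that a `float` argument is promoted to `double` when passed to a variadic function, so the float case is probed with a float value whose mantissa is full and whose exponent has three digits.

```c
#include <float.h>
#include <stdio.h>

/* Length of the "%a" rendering of d, excluding the null terminator.
   With a NULL buffer and size 0, snprintf() only counts characters. */
static int hexfloat_len(double d) {
    return snprintf(NULL, 0, "%a", d);
}

/* Candidate worst case for double: negative, 13 mantissa digits,
   4-digit exponent. */
static int longest_double_len(void) {
    return hexfloat_len(-DBL_MAX);
}

/* Candidate worst case for float: negative, 6 mantissa digits,
   3-digit negative exponent (hex float literal, C99). */
static int longest_float_len(void) {
    return hexfloat_len(-0x1.fffffep-126f);
}
```

The buffer size is the returned length plus one for the terminator. The values are only what one particular library produces; another implementation is allowed to format `%a` differently.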

Mooing Duck
  • 1
    Then for the 32-bit type using sign bit (1 character), a 23-bit mantissa (6 hex characters), and 8-bit exponent (4 char, via a 3-digit decimal number plus a sign character), the max length is 1 + 6 + 4 + 6 = 17? – Costava Apr 23 '23 at 18:14
  • 1
    Adding up your numbers, the max length for 64-bit type is 1 + 13 + 5 + 6 = 25 (which includes null terminator), not 26? – Costava Apr 23 '23 at 18:29
  • 2
    Note that exponent is written in decimal format, not hexa. 11 bit is still 5 characters (4 decimals + 1 for the exponent sign). – aka.nice Apr 23 '23 at 20:16
  • @Costava: Curious. You're right that in the output I only see 24 characters (and don't see the null). I'm not sure why `%n` reported 25. – Mooing Duck Apr 23 '23 at 22:16
  • 1
    " I'm not sure why %n reported 25." --> `"%n"` writes 24 in `c` - as expected. Code prints `c + 1`. – chux - Reinstate Monica Apr 23 '23 at 22:40
2

what is the longest possible string
how large does buf need to be

Usually a string is measured by its length, and the buffer size needed is 1 more than that length. The maximum length is discussed below. Add one for the size.


"%a" has various implementation-defined details. Because of that, code should be very careful about assuming the maximum length based on a calculation like the one below. Prudent code would allow for a few extra characters.

Reasonable estimate: the sum of:

  • longest significand length, perhaps of the form -0x1.xxx...xxx where x is a hexadecimal digit. Assuming FLT_RADIX==2: 1 /* sign */ + 2 /* 0x */ + 1 /* lead digit */ + 1 /* . */ + roundup((xxx_MANT_DIG-1)/4).

  • longest exponent, which is a decimal power of 2 for the value xxx_TRUE_MIN: 1 /* p */ + 1 /* sign */ + roundup(log10(-xxx_MIN_EXP + xxx_MANT_DIG)).


For common double:

significand length: `1 + 2 + 1 + 1 + ru((53-1)/4)` --> 18

exponent length: `1 + 1 + ru(log10(- -1021 + 53))` --> 6

sum: 24

For common float:

significand length: `1 + 2 + 1 + 1 + ru((24-1)/4)` --> 11

exponent length: `1 + 1 + ru(log10(- -125 + 24))` --> 5

sum: 16
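The arithmetic above can be sketched in code. This is an illustration under the same assumptions (FLT_RADIX == 2, output of the form -0x1.xxx...p-dddd); the helper names `decimal_digits` and `hexfloat_max_len` are hypothetical, and the digit count is done with integer division rather than `log10()`.

```c
#include <float.h>
#include <stdio.h>

/* Number of decimal digits needed to print a non-negative int. */
static int decimal_digits(int n) {
    int digits = 1;
    while (n >= 10) {
        n /= 10;
        digits++;
    }
    return digits;
}

/* Estimated maximum "%a" length, excluding the null terminator, for a
   binary floating type with mant_dig significand bits and minimum
   exponent min_exp (e.g. DBL_MANT_DIG and DBL_MIN_EXP). */
static int hexfloat_max_len(int mant_dig, int min_exp) {
    /* sign + "0x" + lead digit + '.' + hex digits of the fraction */
    int significand = 1 + 2 + 1 + 1 + (mant_dig - 1 + 3) / 4;
    /* 'p' + exponent sign + decimal digits of the largest magnitude */
    int exponent = 1 + 1 + decimal_digits(mant_dig - min_exp);
    return significand + exponent;
}
```

With the common IEEE-754 parameters, `hexfloat_max_len(DBL_MANT_DIG, DBL_MIN_EXP)` gives 24 and `hexfloat_max_len(FLT_MANT_DIG, FLT_MIN_EXP)` gives 16, matching the sums above. It remains an estimate for finite values only; NaN and infinity are not covered.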

Consider that NaNs may have a payload formatted in some interesting fashion that exceeds the above sums.

Example: double NAN w/payload: "-NAN4503599627370495". C specifies the payload and characters:[0-9A-Za-z_], but is silent on its length. One implementation used the payload as a decimal value.

1 /* sign */ + 3 /* NAN */ + 16 /* decimal digits */ = 20

A 52-bit payload can reasonably be expected to have no more than 52 payload characters. I have never seen one longer than the above 20.

1 /* sign */ + 3 /* NAN */ + 52 /* binary digits */ = 56

C spec about "%a", "%A":

A double argument representing a floating-point number is converted in the style [-]0xh.hhhhp±d, where there is one hexadecimal digit (which is nonzero if the argument is a normalized floating-point number and is otherwise unspecified) before the decimal-point character and the number of hexadecimal digits after it is equal to the precision; if the precision is missing and FLT_RADIX is a power of 2, then the precision is sufficient for an exact representation of the value; if the precision is missing and FLT_RADIX is not a power of 2, then the precision is sufficient to distinguish values of type double, except that trailing zeros may be omitted; if the precision is zero and the # flag is not specified, no decimal-point character appears. The letters abcdef are used for a conversion and the letters ABCDEF for A conversion. The A conversion specifier produces a number with X and P instead of x and p. The exponent always contains at least one digit, and only as many more digits as necessary to represent the decimal exponent of 2. If the value is zero, the exponent is zero.
A double argument representing an infinity or NaN is converted in the style of an f or F conversion specifier.
C17dr § 7.21.6.1 8

chux - Reinstate Monica
  • not sure if you need `MANT_DIG` on the exponent calculation. denormal numbers get rendered with a leading zero for me which seems appropriate given that's how they're represented. e.g. `pow(2, -1074)` gets rendered as `0x0.0000000000001p-1022`. – Sam Mason Apr 24 '23 at 10:50
  • @SamMason That is an implementation detail. The output could have been `"0x1.0000000000000p-1074"` as well as what you saw. C's `fprintf()` spec is loose here about if the first digit is zero or not and if trailing zeros are removed. Both variations are allowed. Since we are looking for a worst case, I am emphasizing the worst case various reasonable implementations might do. – chux - Reinstate Monica Apr 24 '23 at 13:57
  • AFAICT the whole question is an implementation detail! https://en.cppreference.com/w/c/io/fprintf does mention a leading zero digit for denormal numbers, but not sure where to look to see where this comes from – Sam Mason Apr 24 '23 at 14:17
  • 1
    @SamMason " where this comes from" --> Spec posted. – chux - Reinstate Monica Apr 24 '23 at 14:26
0

How large does buf need to be for the following functions so that buf does not get overrun?

Instead of trying to find the smallest maximum buffer size, consider passing the size into the function and letting snprintf() prevent any buffer overrun. @Andrew Henle

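#include <assert.h>
#include <stdio.h>

void StringFromFloat(size_t sz, char *buf, float f) {
  /* snprintf() returns the length it needed, or a negative value on error */
  volatile int len = snprintf(buf, sz, "%a", f);
  assert(len >= 0 && (unsigned) len < sz);
}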

If the calling code wants to pass in a buffer size that is expected to always work, compute the expected maximum size with various macros.
For LOG10_PR(), see this for more ideas.

#include <float.h>
/* Each comparison is parenthesized: + binds tighter than >. */
#define LOG10_PR(x) (1 + ((x)>9) + ((x)>99) + ((x)>999) + ((x)>9999) + ((x)>99999))
#define PRINTF_A_SIZE_FLT ( \
    5 /* -0x1. */ + (FLT_MANT_DIG-1+3)/4 + \
    2 /* p- */ + LOG10_PR(FLT_MANT_DIG - FLT_MIN_EXP) + \
    1 /* \0 */)

char buf[PRINTF_A_SIZE_FLT];
StringFromFloat(sizeof buf, buf, some_float);

Or instead, be generous and use the bit width of the object to scale the buffer size if a few extra bytes is not a concern.

char buf[sizeof some_float * CHAR_BIT + 1];
StringFromFloat(sizeof buf, buf, some_float);
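The comments on the question also describe a third option: format into a moderately sized fixed buffer first, and only call `snprintf()` a second time if that buffer turns out to be too small. A rough sketch of that idea follows; the function name `hexfloat_dup` and the 64-byte scratch size are assumptions made for illustration, not part of any answer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated "%a" rendering of d, or NULL on failure.
   The caller must free() the result. */
static char *hexfloat_dup(double d) {
    char scratch[64]; /* assumed large enough for the common case */
    int len = snprintf(scratch, sizeof scratch, "%a", d);
    if (len < 0) {
        return NULL; /* encoding or output error */
    }

    char *out = malloc((size_t)len + 1);
    if (out == NULL) {
        return NULL;
    }

    if ((size_t)len < sizeof scratch) {
        /* First call already produced the complete string. */
        memcpy(out, scratch, (size_t)len + 1);
    } else {
        /* Output was truncated; format again into the exact-size buffer. */
        snprintf(out, (size_t)len + 1, "%a", d);
    }
    return out;
}
```

This trades a heap allocation for not having to know the maximum length in advance, so it fits best where dynamic allocation overhead is not a concern.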
chux - Reinstate Monica