
I am trying to write a C program that reads a binary file and converts its contents to a given data type. I generate the binary file with the command `head -c 40000 /dev/urandom > data40.bin`. The program works for the data types int and char but fails for double. Here is the code for the double version:

void double_funct(int readFrom, int writeTo){
    double buffer[150];
    int a = read(readFrom,buffer,sizeof(double));
    while(a!=0){
        int size = 1;
        int c=0;

        for(c=0;c<size;c++){
            char temp[100];
            int x = snprintf(temp,100,"%f ", buffer[c]);
            write(writeTo, temp, x);
        }
        a = read(readFrom,buffer,sizeof(double));
    }
}

And this is the char function, which works:

void char_funct(int readFrom, int writeTo){
    char buffer[150];
    int a = read(readFrom,buffer,sizeof(char));
    while(a!=0){
        int size = 1;
        int c=0;

        for(c=0;c<size;c++){
            char temp[100]=" ";
            snprintf(temp,100,"%d ", buffer[c]);
            write(writeTo, temp, strlen(temp));
        }
        a = read(readFrom,buffer,sizeof(char));
    }
}

With char I expect `wc -w file` to report 40000 words (one byte per value), and that is what I get. With double I should get 5000 words from 40000 bytes of data, but instead I get a seemingly random count between 4000 and 15000.

I don't know what is wrong. The same code works for int, where I get 10000 words from 40000 bytes of data.

Jongware
simon44556
  • It is not safe to assume that `read()` reads the full number of bytes requested. It is essential to use the return value to determine how many bytes actually were read. – John Bollinger Feb 03 '17 at 18:32
  • Furthermore, `read()` will return `-1` if an error occurs. You do not account for that possibility, and if an error does occur you process who-knows-what data. – John Bollinger Feb 03 '17 at 18:34
  • This line: `int x = snprintf(temp,100,"%f ", buffer[c]);` takes a single character from the buffer and tries to convert it to a float. You might want to check the return value `x` that tells you how much was written into the buffer. But, this is certainly not what you want to be doing. – bruceg Feb 03 '17 at 18:35
  • Additionally, `write()` has similar characteristics to `read()` with respect to the number of bytes transferred, though in practice that's rarely an issue for local files. – John Bollinger Feb 03 '17 at 18:35
  • @bruceg in that version the buffer is a `double` array not a `char` array as in the other version. – Weather Vane Feb 03 '17 at 18:37
  • @JohnBollinger So it should be like this: http://pastebin.com/uWkFv1fv – simon44556 Feb 03 '17 at 18:44
  • @simon44556, it *could* be like that, but I wouldn't recommend it. If one wants to `read()` a specific number of bytes, one generally must perform the `read()` in a loop, so as to accumulate the desired number of bytes over multiple calls if necessary. It's usually counterproductive to bail just because you don't get the number you asked for in one call. – John Bollinger Feb 03 '17 at 18:53
  • This is only the double function the other one works it generates normal data – simon44556 Feb 03 '17 at 18:55
  • @JohnBollinger I should do it like this then: `while(read(readFrom,buffer,sizeof(char)) == sizeof(double) )` – simon44556 Feb 03 '17 at 18:56
  • @JohnBollinger I tried now with casting double to int and it does it correctly but I need it to output double http://prntscr.com/e44fhb – simon44556 Feb 03 '17 at 19:10
  • @simon44556, as for the reads and writes, no, what you're proposing still doesn't make sense. In the event that there is a partial read or write, you can't just discard or ignore the bytes that were successfully transferred. To do it correctly, you need to try to read or write *the rest* of the intended number of bytes, into / from the tail of the buffer. But I don't think partial transfers are your actual problem. – John Bollinger Feb 03 '17 at 19:25
  • @JohnBollinger Ok but it does something because when I cast it to int it does this kind of output and it writes the correct number of ints it only fails at double so the error must be in converting or in writing. Here is the output: http://prntscr.com/e44pl4 – simon44556 Feb 03 '17 at 19:31
  • @JohnBollinger Ok I fixed it like this http://prnt.sc/e44xoh – simon44556 Feb 03 '17 at 19:48
  • [Printf width specifier to maintain precision of floating-point value](http://stackoverflow.com/q/16839658/2410359) likely useful here. – chux - Reinstate Monica Feb 03 '17 at 20:27
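
The looping approach to `read()` described in the comments above could look roughly like the following. This is a minimal sketch, not code from the thread: `read_full` is a hypothetical helper name, and retrying on `EINTR` is an assumption about the desired error handling.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until `count` bytes have been
 * collected, end of file is reached, or a real error occurs.
 * Returns the number of bytes stored in buf (possibly fewer than count
 * at end of file), or -1 on error. */
static ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t got = 0;

    while (got < count) {
        /* Ask only for the part that is still missing, and store it
         * at the tail of what has been read so far. */
        ssize_t n = read(fd, p + got, count - got);
        if (n == 0)
            break;              /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* genuine error */
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

With such a helper, the loop in `double_funct` could run while `read_full(readFrom, buffer, sizeof(double))` returns exactly `sizeof(double)`, treating a shorter positive count as a truncated final value.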

1 Answer


The main problem seems to be that your `temp` array is not large enough for your printf format and data. IEEE-754 doubles have a decimal exponent range from -308 to +308. You're printing your doubles with format "%f", which produces a plain decimal representation. Since no precision is specified, the default precision of 6 applies. This may require as many as 1 (sign) + 309 (digits) + 1 (decimal point) + 6 (trailing decimal places) + 1 (terminator) chars (a total of 318), but you only have space for 100.

You print to your buffer using snprintf(), and therefore do not overrun the array bounds there, but snprintf() returns the number of bytes that would have been required, less the one required for the terminator. That's the number of bytes you write(), and in many cases that does overrun your buffer. You see the result in your output.
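
One way to guard against that, as a minimal sketch: clamp the value returned by `snprintf()` to what actually fits in the array before handing it to `write()`. The helper name `format_fixed` is hypothetical and not part of the original code; it only illustrates the return-value check.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format one double with "%f " into out[cap] and
 * return how many bytes are really stored there (excluding the
 * terminator), i.e. the largest count that is safe to pass to write(). */
static size_t format_fixed(char *out, size_t cap, double value)
{
    int needed;

    if (cap == 0)
        return 0;

    /* snprintf() returns the length the full text WOULD have had,
     * which can exceed cap - 1 when the output was truncated. */
    needed = snprintf(out, cap, "%f ", value);
    if (needed < 0)
        return 0;               /* encoding error: nothing usable */
    if ((size_t)needed >= cap)
        return cap - 1;         /* truncated: only cap - 1 bytes exist */
    return (size_t)needed;
}
```

A call such as `write(writeTo, temp, format_fixed(temp, sizeof temp, buffer[c]))` would then never read past the end of `temp`, although very large values would still be cut off in the output.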

Secondarily, you'll also see a large number of 0.000000 in your output, arising from rounding small numbers to six decimal places.

You would probably have better success if you change the format with which you're printing the numbers. For example, "%.16e " will give you output in exponential format with a total of 17 significant digits (one preceding the decimal point). That will not require excessive space in memory or on disk, and it will accurately convey all numbers, regardless of scale, supposing again that your doubles are represented per IEEE 754. If you wish, you can furthermore eliminate the (pretty safe) assumption of IEEE 754 format by employing the variation suggested by @chux in comments. That would be the safest approach.
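
A minimal sketch of that suggestion, using the `DBL_DECIMAL_DIG` variation from the comments: the helper name `format_exp` is hypothetical, and the fallback value of 17 assumes IEEE 754 binary64 doubles on compilers that predate C11.

```c
#include <float.h>
#include <stddef.h>
#include <stdio.h>

#ifndef DBL_DECIMAL_DIG         /* defined by C11 <float.h> */
#define DBL_DECIMAL_DIG 17      /* assumption: IEEE 754 binary64 */
#endif

/* Hypothetical helper: format one double in exponential notation with
 * enough significant digits to distinguish every distinct double.
 * Returns the number of bytes stored (excluding the terminator), or -1
 * if the text did not fit or an encoding error occurred. */
static int format_exp(char *out, size_t cap, double value)
{
    int n = snprintf(out, cap, "%.*e ", DBL_DECIMAL_DIG - 1, value);

    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

With 17 significant digits the longest finite output, for example `-1.7976931348623157e+308 `, is 25 characters, so a 32-byte array is ample where `"%f "` could need more than 300.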

One more thing: IEEE floating point supports infinities and multiple not-a-number values. These are very few in number relative to ordinary FP numbers, but it is still possible that you'll occasionally hit on one of these. They'll probably be converted to output just fine, but you may want to consider whether you need to deal specially with them.
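
For scale, and assuming IEEE 754 binary64: a double is an infinity or a NaN exactly when all 11 exponent bits are set, which is 1 in 2048 of all bit patterns, so 5000 random doubles would be expected to contain two or three of them. If they do need special handling, a minimal sketch might look like this. The helper name `format_any` and the chosen output spellings are assumptions for illustration only.

```c
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print NaNs and infinities with fixed spellings
 * and everything else in exponential notation.
 * Returns the number of bytes stored (excluding the terminator), or -1
 * if the text did not fit or an encoding error occurred. */
static int format_any(char *out, size_t cap, double value)
{
    int n;

    if (isnan(value))
        n = snprintf(out, cap, "nan ");
    else if (isinf(value))
        n = snprintf(out, cap, "%s ", value < 0 ? "-inf" : "inf");
    else
        n = snprintf(out, cap, "%.16e ", value);

    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

Each of these spellings is still a single whitespace-delimited word, so a count from `wc -w` would be unaffected.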

John Bollinger
  • Note: Off by 1. Need `buffer[318]`. --> "309 (digits)" for `1e308`. – chux - Reinstate Monica Feb 03 '17 at 20:23
  • `"%.15e"` is good, yet is insufficient to print some differing `double` as different text. Suggest `printf("%.*e\n", DBL_DECIMAL_DIG - 1, buffer[c]);` to print all different `double` differently. Or `printf("%a\n", buffer[c]);` – chux - Reinstate Monica Feb 03 '17 at 20:30
  • Thanks @chux. I've fixed my math and tweaked my suggested format in light of your comments. You're also quite right that it is possible to generalize so as to eliminate assumptions about floating-point representation, and I'm content for my answer to refer to your comment for details. – John Bollinger Feb 03 '17 at 23:11