I want to read the first 4096 bytes of an image file (a dd dump of a LUKS-encrypted device) and print them as hex. I tested first with an unsigned long, which has a value range from 0 to 18446744073709551615. But with this code

int main() {
    unsigned long c;

    FILE *fp = fopen("C:\\image.dd", "r");
    if (fp == NULL) {
        fprintf(stderr, "Can't read file");
        return 0;
    }

     while (!feof(fp)){                         // while not end of file
           c=fgetc(fp);                         // get a character/byte from the file
           printf("%02x ",c);                   // and show it in hex format
    }
    fclose(fp);

    return 0;
} 

I get this output:

4c 55 4b 53 ba be 00 01 61 65 73 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78 74 73 2d 70 6c 61 69 6e 36 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 73 68 61 32 35 36 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 40 31 ea 2e 93 28 55 cd 52 b6 c4 51 1e 0f b1 25 0e 2d 65 72 85 f2 41 97 b3 9b 76 ae 07 e5 53 ac 02 21 b4 ffffffff

The first 512 bytes of my image.dd look like this:

4C 55 4B 53 BA BE 00 01 61 65 73 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 78 74 73 2D 70 6C 61 69 
6E 36 34 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 73 68 61 32 35 36 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 40 
31 EA 2E 93 28 55 CD 52 B6 C4 51 1E 0F B1 25 0E 
2D 65 72 85 F2 41 97 B3 9B 76 AE 07 E5 53 AC 02 
21 B4 1A 6F 0C 8D E2 08 62 91 4D 22 3D CA A2 51 
19 0A 74 29 00 01 06 4B 32 38 38 33 34 64 34 66 
2D 36 62 32 64 2D 34 37 33 62 2D 62 34 63 65 2D 
33 31 38 32 36 65 64 61 65 39 63 39 00 00 00 00 
00 AC 71 F3 00 10 64 B8 37 E9 07 F3 84 51 CF 51 
23 E8 F2 8E 31 57 FE 2C DE D5 70 76 F2 1B B0 F8 
95 33 A6 BB E4 4F 91 A8 00 00 00 08 00 00 0F A0 
00 00 DE AD 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 02 00 00 00 0F A0 
00 00 DE AD 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 03 F8 00 00 0F A0 
00 00 DE AD 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 05 F0 00 00 0F A0 
00 00 DE AD 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 07 E8 00 00 0F A0 
00 00 DE AD 00 00 00 00 00 00 00 00 00 00 00 00

So this is weird. Why does the output end at this point with unsigned long, and why is the last value ffffffff?

Edit: I just realized that it doesn't matter whether I use unsigned long or something like unsigned int. The problem is the same.

johndoe
  • Why tag this as c++? – 2785528 Jul 22 '20 at 12:26
  • 1. `while (!feof(fp))` is [wrong](https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong). 2. You invoked *undefined behavior* by type mismatch in `printf()`. ("%x" requests `unsigned int` and `%lx` should be used for `unsigned long`) – MikeCAT Jul 22 '20 at 12:29

3 Answers

You have four errors.

First, you print the data without checking whether the read succeeded. The check belongs between c=fgetc(fp); and printf("%02x ",c);

Second, you pass the wrong type to printf(), which invokes undefined behavior. The format %02x requires unsigned int. (An int in the proper range is also OK, according to this answer.)

Third, you open a binary file in text mode. For this reason, reading stops at the first 0x1A (EOF) byte. To open a file in binary mode with fopen(), add b to the mode string.

Fourth, you are not limiting the read to the first 4096 bytes.

Note that fgetc() returns int, so using unsigned long is overkill.

Fixed code:

#include <stdio.h>

int main(void) {
    int i; // number of bytes read so far
    int c; // int, because fgetc() returns int (a byte value or EOF)

    FILE *fp = fopen("C:\\image.dd", "rb");     // open file in binary mode
    if (fp == NULL) {
        fprintf(stderr, "Can't read file\n");
        return 1;                               // report the failure to the caller
    }

    for (i = 0; i < 4096; i++) {                // read at most 4096 bytes
        c = fgetc(fp);                          // get a byte from the file
        if (c == EOF) break;                    // stop at end of file or on a read error
        printf("%02x ", c);                     // and show it in hex format
    }
    fclose(fp);

    return 0;
}
MikeCAT

There are several things going on.

Why did the read stop halfway?

There are two modes in which fopen can open a file: text mode ("rt") and binary mode ("rb"). By default ("r" alone), files open in text mode.

Your program opens the image file in text mode ("r").

In this mode, 0x1A (Ctrl+Z) is treated as EOF regardless of the actual file size. In your dump, the byte right after 21 B4 is 1A, which is exactly where the output stops. That caused fgetc to return EOF (-1) and feof to return a nonzero value.

Note that this quirk only occurs on Windows.

Why did ffffffff appear in the output?

fgetc returns an int (which is a signed value). It returns -1 when the file pointer is at EOF.

-1 converted to unsigned long is 0xFFFF_FFFF_FFFF_FFFFul (on a platform where unsigned long is 64 bits wide).

Then why did only 8 f's appear in the output instead of 16?

printf determines how to print by the format specifier (%02x in this case), not by actual value to print. "%x" will print an unsigned int (which is 32-bit type on your machine), so it showed only half of unsigned long.

How to fix these?

  • Open file in binary mode ("rb") to read the binary file.
  • Use a signed type (int or long) to receive fgetc result.
  • Also, instead of feof, you should check if c is EOF (-1). feof is not really reliable (for example, the file might get deleted after feof and before fgetc).
snipsnipsnip

The ffffffff you are seeing is an end-of-file (EOF) indicator. In your while-loop, you first check if the end of file is reached. After the last byte of the file is successfully read in, the EOF indicator is not set yet (it will only be set after an unsuccessful read). Then you try to read yet another byte, it fails, returning the EOF, and you print out this EOF indicator. Then, on the next iteration, you check and discover that EOF is set, and exit the loop.

By the way, on your machine, unsigned long is probably a 32-bit type (not 64-bit as you claim). It is common for C (and C++) compilers on Windows to have both int and long as 32-bit types.

heap underrun