
I am writing a hex dump program in C. I know there are tons of hex dump programs out there, but I wanted to write one for the experience. I have written the program in CodeBlocks, on Windows, but I can't seem to get it to work.

I am reading in a test program which is roughly 137,000 bytes, but reading stops after 417 bytes. When I compile the same code on Linux (it's only a console application and uses standard C libraries), it works perfectly and reports the correct number of bytes in the file. Does anyone have any idea why read() would not work on Windows, but works fine on Linux?

Below is an example of how I am reading in the file.

int main(int argc, char **argv)
{
    if (argc != 2) { return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { return 1; }

    unsigned char buffer[8];
    unsigned int bytes = 0;
    unsigned int total_bytes = 0;

    while ((bytes = read(fd, buffer, sizeof(unsigned char) * 8)) > 0) {
        ...
        total_bytes += bytes;
    }

    printf("Total Bytes: %d\n", total_bytes);

    return 0;
}
jsmith
  • Also, I will mention that I have tried multiple files, and I seem to have issues with all of them when I am on Windows. Still no problems on Linux, however. Thanks! – jsmith May 15 '19 at 03:46
  • According to the [`read (3)` man page](https://linux.die.net/man/3/read), `read` returns `-1` on error, not something based on not-`> 0`. You should check for a `-1` return value, and then inspect `errno`. – jww May 15 '19 at 04:01
  • Regarding the `> 0` condition, this is *very* problematic since `bytes` is an *unsigned* integer. Consider the unsigned equivalent of `-1` in a [two's complement](https://en.wikipedia.org/wiki/Two's_complement) system. And if you see e.g. [this `read` reference](http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html) you will see that it returns a value of type `ssize_t`, which is signed. – Some programmer dude May 15 '19 at 04:06
  • @jww Other than that, `> 0` is an okay condition. The loop will end if there's an error or end-of-file. – Some programmer dude May 15 '19 at 04:08
  • Good catch on that unsigned int. However, after testing, read() is definitely not returning -1. It is returning 0. Doing a little more digging, it seems that read is stopping on the _substitute character_, whose hex value in the ASCII table is 1A. – jsmith May 15 '19 at 04:16
  • OT: regarding: `if (argc != 2) { return 1; }` should tell the user what they should have input (via a USAGE statement) before calling the `return` – user3629249 May 15 '19 at 05:02
  • Regarding *Still no problems on Linux*: no problem, because Linux uses `ctrl-d` for EOF rather than `ctrl-z` – user3629249 May 15 '19 at 05:09
  • OT: regarding: `if (fd == -1) { return 1; }` 1) please follow the axiom: *only one statement per line and (at most) one variable declaration per statement.* 2) when an error indication is returned from a C library function, use `perror( "your error message")`; to output to `stderr` both your error message and the text reason the system thinks the error occurred – user3629249 May 15 '19 at 05:11
  • OT: when calling `read()` the code should be checking for all 3 return conditions: 1) <0 means an error occurred 2) ==0 means EOF or the other end of the connection closed the connection 3) >0 means that many characters were read. – user3629249 May 15 '19 at 05:15
  • OT: regarding: `while ((bytes = read(fd, buffer, sizeof(unsigned char) * 8)) > 0) {` This would be much more flexible if written as: `while ((bytes = read(fd, buffer, sizeof(buffer))) > 0) {` – user3629249 May 15 '19 at 05:17
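
The comments above about the unsigned `bytes` variable and the three `read()` return conditions can be illustrated with a small sketch. The helper names (`classify_read`, `unsigned_loop_continues`, `self_check`) are hypothetical and not part of the original program; the sketch only shows how a `-1` return behaves once it is stored in an `unsigned int`, and how the three cases could be told apart when the value is kept signed.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the three outcomes of a read()-style call. */
enum read_status { READ_ERROR, READ_EOF, READ_DATA };

/* Classify a signed return value: <0 error, ==0 end-of-file, >0 data. */
enum read_status classify_read(long ret)
{
    if (ret < 0) {
        return READ_ERROR;
    }
    if (ret == 0) {
        return READ_EOF;
    }
    return READ_DATA;
}

/* Mimics the question's loop test: the return value is stored in an
 * unsigned int before being compared with 0. */
int unsigned_loop_continues(int ret)
{
    unsigned int bytes = (unsigned int)ret; /* -1 converts to UINT_MAX */
    return bytes > 0;
}

/* Hypothetical self-check documenting the behaviour described above. */
void self_check(void)
{
    assert(unsigned_loop_continues(-1) == 1);      /* error looks like data */
    assert((unsigned int)-1 == UINT_MAX);
    assert(unsigned_loop_continues(0) == 0);       /* EOF still ends the loop */
    assert(classify_read(-1) == READ_ERROR);
    assert(classify_read(0) == READ_EOF);
    assert(classify_read(8) == READ_DATA);
}
```

With an unsigned variable, an error return would keep the loop running instead of ending it, which is why a signed type such as `ssize_t` (or `int` on Windows) is the safer choice for the result of `read()`.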

1 Answer


I have found the answer in the post linked below. They were having the issue with stdin, though. Apparently the substitute character (0x1A) is the same as CTRL+Z on Windows, which a file opened in text mode treats as end-of-file. So read() returned 0 when it reached that byte, and my loop ended early.

C reading (from stdin) stops at 0x1a character
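
A minimal sketch of one way to avoid the problem, assuming the cause is the text-mode handling described above: open the file in binary mode by adding `O_BINARY` to the `open()` flags. The function name `count_bytes_binary` is hypothetical, and the `#ifndef` fallback is an assumption made so that the same source also builds on systems such as Linux, where `O_BINARY` is usually not defined and no text-mode translation takes place.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: on platforms without O_BINARY, files are already read
 * byte-for-byte, so a flag value of 0 changes nothing. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: count the bytes in a file, reading it in binary
 * mode so that a 0x1A byte is treated as ordinary data.
 * Returns the byte count, or -1 on error. */
long count_bytes_binary(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY | O_BINARY);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    unsigned char buffer[8];
    long total_bytes = 0;
    int bytes; /* signed, so an error return of -1 stays negative */

    while ((bytes = (int)read(fd, buffer, sizeof(buffer))) > 0) {
        /* a hex dump would format the contents of buffer here */
        total_bytes += bytes;
    }
    if (bytes < 0) {
        perror("read");
        total_bytes = -1;
    }

    close(fd);
    return total_bytes;
}
```

A caller could print the result with `printf("Total Bytes: %ld\n", count_bytes_binary(argv[1]));`. With the standard I/O functions, `fopen(path, "rb")` requests the same binary behaviour.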

jsmith