0

I want to make a very simple console application that replaces the Unix command wc -c. For this, I need simply to count the bytes into the file and output them in the console.

If my application is not provided with arguments, it reads them from the standard input (this part I've already made working). However, if I decide to supply more arguments (valid and existing file names), I should count each one of them their bytes.

So far so good. Here is the code

const int __OPEN_ERROR = 48;

int main(int argc, char* argv[]) {

    if(argc == 1) {
        char buf[3];
        int count = 0, lines = 0;
        while(read(0, buf, 1) != 0) {
            count++;
            if(buf[0] == '\n')
                lines++;
        }

        printf("Count:%d\n", count);
        printf("Lines:%d\n", lines);
    }
    else {
        int fd, bytes = 0;
        char buf[3];

        for (int arg = 1; arg <= argc; arg++) {
            if ((fd = open(argv[arg], O_RDONLY) < 0)) {
                exit(__OPEN_ERROR);
            }
            while (read(fd, buf, 1) != 0) { // <-- this here leads to infinite loop.
                printf("%s", buf);
                bytes++;
            }

            printf("%d %s", bytes, argv[arg]);
        }
    }

    exit(0);
}

Please don't mind how badly this code is written, this is my first day exploring C, for now everything I want is working solution.

Notice how both reads are same but latter one falls into infinite loop, for a reason I cannot understand. And yes, opening the file succeeds.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
kuskmen
  • 3,648
  • 4
  • 27
  • 54
  • 2
    `read` and `open` can return -1 (on failure). You need to handle that case. Read *carefully* [open(2)](http://man7.org/linux/man-pages/man2/open.2.html), [read(2)](http://man7.org/linux/man-pages/man2/read.2.html) and [errno(3)](http://man7.org/linux/man-pages/man3/errno.3.html). – Basile Starynkevitch Feb 10 '18 at 21:41
  • I know, however this doesn't solve the problem, I am trying it first to work, then polish (error handling, removing duplicates and etc.) – kuskmen Feb 10 '18 at 21:43
  • 2
    `argv[argc]` will *always* be a null pointer. Yet you attempt to use it to open a file. – Some programmer dude Feb 10 '18 at 21:43
  • [Use the `gdb` debugger](https://sourceware.org/gdb/current/onlinedocs/gdb/) and also [strace(1)](http://man7.org/linux/man-pages/man1/strace.1.html) – Basile Starynkevitch Feb 10 '18 at 21:44
  • @Someprogrammerdude, what do you mean, when I evaluate it, it is correctly file's name? Am I wrong? – kuskmen Feb 10 '18 at 21:44
  • 1
    `arg <= argc` should be `arg < argc` – Barmar Feb 10 '18 at 21:44
  • 4
    `printf("%s", buf)` causes undefined behavior, because there's no null terminator in `buf`. – Barmar Feb 10 '18 at 21:45
  • 2
    Tangential: Note that you shouldn't create function or variable names that start with an underscore, in general. [C11 §7.1.3 Reserved identifiers](https://port70.net/~nsz/c/c11/n1570.html#7.1.3) says: — _All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use._ — _All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces._ See also [What does double underscore (`__const`) mean in C?](https://stackoverflow.com/a/1449301/15168) – Jonathan Leffler Feb 10 '18 at 21:45
  • 1
    You need to set `bytes` back to `0` when you open each file. You also need `close(fd)` after the loop. – Barmar Feb 10 '18 at 21:47
  • Okay, thank you guys, will check the suggestions and promise not to break the standard of C once I get familiar with, @Barmar thanks for the catch. – kuskmen Feb 10 '18 at 21:48
  • 1
    If you're reading a single character (byte), consider using stdio, `fopen`, and `c = fgetc(fd)` rather than `open`, `read(fd, buf, 1)`. That also avoids thinking that `buf` is a null-terminated string. – Schwern Feb 10 '18 at 21:48
  • Both the C and the the POSIX specifications explicitly says that `argv[argc]` is a null pointer. If it's not then your system is badly broken. – Some programmer dude Feb 10 '18 at 21:48
  • 1
    Don't forget to `close(fd)` what you `open()` successfully. – Jonathan Leffler Feb 10 '18 at 21:48
  • 1
    You guys are amazing – kuskmen Feb 10 '18 at 21:48
  • @Schwern `fgetc()` is for `stdio` streams, not Unix FD. – Barmar Feb 10 '18 at 21:49
  • @Someprogrammerdude, I still don't know what you mean, how am I supposed to get each argument then :/ – kuskmen Feb 10 '18 at 21:50
  • 1
    To get each argument (other than the program name), use: `for (int arg = 1; arg < argc; arg++)` as the loop, using `<` rather than `<=`. In general, using `<=` for an upper bound of a loop in C is indicative of a problem, or sometimes indicative that an array is defined `SomeType array[LIMIT+1];` and then you can use `<= LIMIT` in the loop. – Jonathan Leffler Feb 10 '18 at 21:53
  • 1
    @kuskmen Change the loop condition as Barmar told you. The `argc` variable is the number of *elements* in the `argv` array, it's not the max valid index into the array. – Some programmer dude Feb 10 '18 at 21:53
  • 1
    'Tis oddly asymmetric that you count lines for standard input and not for files explicitly named on the command line. Note too that if you have 3 files on the command line (and the code is working), then you will report the cumulative bytes read at the end of each file — because you don't set `bytes = 0;` inside the loop. Either don't print the intermediate values of `bytes` (so print the total bytes after the loop) or reset the value to zero before each file is processed (or do both — have a per-file count and a total count and print appropriately). – Jonathan Leffler Feb 10 '18 at 21:57

1 Answers1

1

Your loops are wrong, should be:

char buf;
...
while (read(fd, buf, 1) == 1) {
    printf("%c", buf);
    bytes++;
}

read lets you read bytes from file descriptor. The given number say n is a request, this means that read reads at most n bytes, and the returned value is exactly the number of bytes effectively read. As you want to read 1 byte, then test if it read 1 byte. Second, as you read bytes (or char) one by one, then the result is one char and should be print as. %c tells printf to print the value as a char (%s is to print C-strings).

The other error is the control of arg loop, should be:

for (int arg = 1; arg < argc; arg++) // strict test...arg from 0 to argc-1 (included) or argc (excluded)

You must also reset the byte count to 0 on each arg looping, and close each unused file, thus:

for (int arg = 1; arg < argc; arg++) {
    char buf;
    int bytes = 0;
    int fd;
    if ((fd = open(argv[arg], O_RDONLY) < 0)) {
        exit(__OPEN_ERROR);
    }
    while (read(fd, buf, 1) == 1) {
        printf("%c", buf); // optionnal, counting may not print...
        bytes++;
    }
    close(fd);
    printf("%d %s", bytes, argv[arg]);
}
Jean-Baptiste Yunès
  • 34,548
  • 4
  • 48
  • 69