As pointed out by @kaylum in the comment, the immediate problem that breaks the defined behavior of your code is your use of state before it has been assigned a value in:
if (state == in && ...
state is a variable declared with automatic storage duration. Until state is explicitly assigned a value, its value is indeterminate. Using state while its value is indeterminate results in Undefined Behavior. See: C11 Standard - 6.7.9 Initialization(p10) and J.2 Undefined Behavior
Once you invoke Undefined Behavior in your code, the defined execution is over and your program can do anything from appearing to run correctly to SegFaulting. See: Undefined, unspecified and implementation-defined behavior
The simple fix is to initialize int state = out; to begin with. (You will start in the out state in order to ignore leading whitespace before the first word.)
You have a similar problem with your variable X, which is not initialized and is used while its value is indeterminate in x_count[i] = X;
Moreover, it is unclear what you intend to do with int X to begin with. It is clear from your desired output:
(words number)
1 XXX
2 XXXXX
3 XX
4
5 X
12345 (charcacters number)
that you want to output one 'X' per character (to indicate the word length for your histogram), but there is no need to store anything in a variable X to do that; you simply need to output one character 'X' for each character in the word. Additionally, the empty entry for 4 in your output does not make much sense, as your state-variable state should prevent counting empty words. You would never have been in an empty word.
Compounding the confusion is your check for a backspace '\b' character when you check EOF and other whitespace characters for end of word. It looks more likely that you intended '\n', but through an off-by-one-key typo you have '\b' instead. That is conjecture that you will have to add details to clarify...
A Word-Length Histogram
K&R provides very good exercises and the use of a state-loop is a very good place to start. Rather than using multiple nested loops to inch-worm over each word and skip over potentially multiple whitespace characters, you simply keep a state-variable (state in your case) to track whether you are in a word reading characters, or before the first word, between words or after the last word reading whitespace. While you can simplify the check for whitespace by including ctype.h and using the isspace() function, a manual check of multiple whitespace characters is fine.
While defining in and out macros of 1/0 is fine, simply using a variable and assigning 0 for out or non-zero for in works as well. Since you are keeping a character-count to output a length number of 'X' characters, you can just use your character count variable as your state-variable. It will be zero until you read the first character in a word, and then you would reset it to zero after outputting your length number of 'X's to prepare for the next word.
Initializing all variables, and reading either from the filename given as the first argument to the program, or from stdin by default if no argument is given, you can do something similar to:
#include <stdio.h>

int main (int argc, char **argv) {

    int cc = 0,     /* character count (length) */
        wc = 0;     /* word count */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }

    for (;;) {                      /* loop continually */
        int c = fgetc(fp);          /* read character from input stream */
        if (c == EOF || c == ' ' || c == '\t' || c == '\n') {   /* check EOF or ws */
            if (cc) {                       /* if characters counted */
                printf ("%3d : ", wc++);    /* output word count */
                while (cc--)                /* loop char count times */
                    putchar ('X');          /* output X */
                putchar ('\n');             /* output newline */
                cc = 0;                     /* reset char count */
            }
            if (c == EOF)                   /* if EOF -- bail */
                break;
        }
        else        /* otherwise, normal character */
            cc++;   /* add to character count */
    }

    if (fp != stdin)    /* close file if not stdin */
        fclose (fp);
}
(note: the character-count cc variable is used as the state-variable above. You can use an additional variable like state if that is more clear to you, but think through the way using cc above accomplishes the same thing. Also note the change and use of '\n' instead of '\b', as the literal backspace character is rarely encountered in normal input, though it can be generated, while a '\n' is encountered every time the Enter key is pressed. If you actually want to check for the backspace character, you can add it to the conditional.)
Example Input File
$ cat dat/histfile.txt
my dog has fleas
my alligator has none
Example Use/Output
Using a heredoc for input:
$ cat << eof | ./bin/wordlenhist
> my dog has fleas
> my alligator has none
> eof
0 : XX
1 : XXX
2 : XXX
3 : XXXXX
4 : XX
5 : XXXXXXXXX
6 : XXX
7 : XXXX
Redirecting from a file for input:
$ ./bin/wordlenhist < dat/histfile.txt
0 : XX
1 : XXX
2 : XXX
3 : XXXXX
4 : XX
5 : XXXXXXXXX
6 : XXX
7 : XXXX
Passing the filename as an argument, so the file is opened and read within your program, is another option:
$ ./bin/wordlenhist dat/histfile.txt
0 : XX
1 : XXX
2 : XXX
3 : XXXXX
4 : XX
5 : XXXXXXXXX
6 : XXX
7 : XXXX
Lastly, you can input directly on stdin and generate a manual EOF by pressing Ctrl+d on Linux or Ctrl+z on Windows. (note: you will have to press the key combination twice -- can you figure out why?) E.g.
$ ./bin/wordlenhist
my dog has fleas my alligator has none 0 : XX
1 : XXX
2 : XXX
3 : XXXXX
4 : XX
5 : XXXXXXXXX
6 : XXX
7 : XXXX
(also note where the first line of output is placed -- this will help you answer the last question)
If you would like to add a comment below and clarify your intent for int X; and x_count[i] = X; and the use of '\b', I'm happy to help further. Look things over and let me know if you have any questions.