1

My code is the following:

#include <stdio.h>

#define SIZE 10000000

int main() {
    char c, word[SIZE];
    int len = 0, i, count = 0, w = 0;
    while ((c = getc(stdin)) != EOF) {
        if (c != ' ') {
            word[len] = c;
            len++;
        } else {
            for (i = 0; i < len; i++) {
                if ((word[i] >= 48 && word[i] <= 57) || (word[i] >= 65 && word[i]<= 90) || (word[i] >= 97 && word[i] <= 122))
                    count++;
                if (count == len)
                    w++;
            } 
            count = 0;
            len = 0;
        }
    }
    printf("%d", w);

    return 0;
}

It counts the number of special words in a line. A special word is a word that contains only the characters A-Z, a-z, or 0-9; if a word contains even one other character, it is not a special word. My algorithm for finding such words counts the characters in A-Z, a-z, and 0-9 in the word and compares that count to the length of the word; if they match, the word is special.

My code has some problems:

1) It ignores the last word in the line, as there is no ' ' (space) after the last word.
2) It does strange things (outputs a wrong number) when there are more lines in the input.

What I want it to do is write the number of special words in each line on a separate line, like this:

input:

dog cat23 banana
$money dollars 352

output:

3
2

How do I do that? For the first problem I thought of writing if (c == EOF) in the loop, but it doesn't work. I would just use fgets, but each word may have at most 10000000 characters, so if several words in one line have that many characters, how can an array hold that much memory?

chqrlie
OrangeBike
  • Most compilers won't let you run with variables (arrays) of 10 MB on the stack. The program will usually crash as it tries to allocate too much space on the stack. You normally have to make such huge variables into static or file scope variables, or use dynamic memory allocation – Jonathan Leffler May 10 '20 at 18:16
  • you do not need to read the whole word. One char is everything you need even for any size word. Your algorithm is just simply bad. – 0___________ May 10 '20 at 18:19
  • You should be using the library function `isalnum()` not magic numbers. – Weather Vane May 10 '20 at 18:21
  • You should look for blanks and newlines (and probably tabs too), and demand that the words consist only of characters other than these. Note that the `<ctype.h>` header provides the `isalnum()` function that provides a simple test for whether a character is alphanumeric: a letter or a digit. – Jonathan Leffler May 10 '20 at 18:21
  • [`getc` returns an **`int`**](https://stackoverflow.com/questions/35356322/difference-between-int-and-char-in-getchar-fgetc-and-putchar-fputc) – Antti Haapala -- Слава Україні May 10 '20 at 18:21
  • @P__J__ how can I do that without reading the whole word? – OrangeBike May 10 '20 at 18:34
  • Wow, you've taken the old rule of *Don't Skimp on Buffer Size* to the extreme. Considering the longest word in the non-medical unabridged dictionary is 29-characters (requiring 30 to accommodate `'\0'`). For the unabridged medical dictionary, the longest word is 45-characters (requiring 46 to store as a string). You are more than covered with a 2K buffer (e.g. `2048` char) – David C. Rankin May 11 '20 at 01:39
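The comments above recommend `isalnum()` over the numeric ranges. A minimal sketch of what that swap could look like is below; the names `is_alnum_by_range` and `is_special_word` are hypothetical, not from the question. It assumes the default "C" locale and an ASCII-compatible character set, where both tests should agree on every 7-bit character.

```c
#include <ctype.h>

/* The range test from the question, written with character literals
 * instead of 48, 57, 65, ... (assumes letters are contiguous, as in ASCII). */
int is_alnum_by_range(int c) {
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/* Hypothetical helper: returns 1 if `word` is non-empty and consists only
 * of letters and digits, 0 otherwise. */
int is_special_word(const char *word) {
    if (*word == '\0')
        return 0;
    for (; *word != '\0'; word++) {
        /* Cast to unsigned char: passing a negative value other than EOF
         * to isalnum() is undefined behavior. */
        if (!isalnum((unsigned char)*word))
            return 0;
    }
    return 1;
}
```

This version still needs the whole word in memory, so it only illustrates the character test; the answers below show how to classify a word one character at a time.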

4 Answers

4

You have a number of problems and things you are not handling:

  • You don't check for newlines (or any whitespace other than space), so they will be considered part of a word (making the first or last word on a line non-special).
  • You only output at the end, not after each newline (related to the point above).
  • You don't deal with multiple spaces, so two consecutive spaces will be treated as if they have a zero-length special word between them
  • You have "magic" constants to check for letter/digits -- you would be better off using <ctype.h> to get isalnum and isspace to check for things
  • You needlessly copy the word to an on-stack buffer (with no check for overflow), and then iterate over the word a second time to check for special/non-special. You'd be better off doing the check as you read the characters of the word.

Putting all that together, you want just a simple loop that runs over the input once, tracking what it is you're currently looking at (space between words, a special word, or a non-special word), and updating and outputting the count appropriately:

#include <ctype.h>
#include <stdio.h>

int main() {
    enum { SPACE, SPECIAL, NONSPECIAL } state = SPACE;
    int special_count = 0, ch;
    do {
        ch = getchar();
        if (ch == EOF || isspace(ch)) {
            if (state == SPECIAL)
                special_count++;
            state = SPACE;
        } else if (isalnum(ch)) {
            if (state == SPACE)
                state = SPECIAL;
        } else {
            state = NONSPECIAL;
        }
        if (ch == EOF || ch == '\n') {
            if (special_count > 0)
                printf("%d\n", special_count);
            special_count = 0;
        }
    } while(ch != EOF);
}

A couple of points here:

  • we use a state variable to keep track of what we were looking at as of the last character.
  • we use a do-while loop (instead of a while) as we actually want to do something with the EOF.
  • we only print out counts for lines with at least one special word to avoid printing 0s for blank lines (as will often happen at the end, if the last line ends with a newline). If we want to print out 0 for lines with at least one non-special word, we'll need an additional flag or counter to track the presence of non-special words.
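The last point mentions an extra flag for printing 0 on lines that contain only non-special words. One possible way to add it is sketched below. The function name `print_special_counts`, the `words_seen` flag, and the `FILE *` parameters are assumptions made for illustration; they are not part of the answer's code.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads `in` once and writes one count per line that
 * contains at least one word (special or not). Blank lines print nothing. */
void print_special_counts(FILE *in, FILE *out) {
    enum { SPACE, SPECIAL, NONSPECIAL } state = SPACE;
    int special_count = 0;
    int words_seen = 0;   /* extra flag: has this line contained any word? */
    int ch;
    do {
        ch = getc(in);
        if (ch == EOF || isspace(ch)) {
            if (state == SPECIAL)
                special_count++;
            state = SPACE;
        } else {
            words_seen = 1;
            if (!isalnum(ch))
                state = NONSPECIAL;      /* one bad character spoils the word */
            else if (state == SPACE)
                state = SPECIAL;         /* first character of a new word */
        }
        if (ch == EOF || ch == '\n') {
            if (words_seen)
                fprintf(out, "%d\n", special_count);
            special_count = 0;
            words_seen = 0;
        }
    } while (ch != EOF);
}
```

Calling `print_special_counts(stdin, stdout)` from `main` would give the same behavior as the loop above, except that a line such as `$$$` now prints 0 instead of nothing.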
Chris Dodd
  • state machines and long if ... else ..if ...else ladders are definitely something which is easy to read. Especially if the state machine is not as trivial as here. – 0___________ May 10 '20 at 19:25
  • True -- if the state machine is much more complex than this, you definitely want to use a DSL like [flex](https://en.wikipedia.org/wiki/Flex_(lexical_analyser_generator)) – Chris Dodd May 10 '20 at 19:29
3

You have many problems in your code, most of which have been detailed in Chris Dodd's answer. Also note that c must have type int, not char in order to reliably detect end of file.
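To illustrate the point about `int`: `getc` returns either an `unsigned char` value converted to `int`, or `EOF`, which is negative. With `char c`, a byte of value 0xFF will typically compare equal to `EOF` where `char` is signed, and where `char` is unsigned the comparison never succeeds. A minimal sketch of the safe pattern follows; the helper name `count_bytes` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in `in` up to end of file. */
long count_bytes(FILE *in) {
    long n = 0;
    int c;  /* int, not char: must hold every unsigned char value plus EOF */
    while ((c = getc(in)) != EOF)
        n++;
    return n;
}
```

Because `c` is an `int`, a 0xFF byte in the input is read as 255 and is counted like any other byte rather than being mistaken for end of file.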

You can use a state machine as Chris proposed or a series of loops as detailed below:

#include <ctype.h>
#include <stdio.h>

int main() {
    int count = 0;
    for (;;) {
        int c = getchar();
        if (isalnum(c)) {
            while (isalnum(c = getchar()))
                continue;
            if (c == EOF || isspace(c))
                count++;
        }
        while (c != EOF && !isspace(c))
            c = getchar();
        if (c == EOF || c == '\n') {
            printf("%d\n", count);
            count = 0;
            if (c == EOF)
                break;
        }
    }
    return 0;
}
chqrlie
0

You do not need to store any words at all. You can get the answer by keeping a little state: whether a word has started, and whether an out-of-scope character has been found in it. Increment the answer when you find a space or a newline, and print the answer only when you find a newline. Here is the code:

#include <stdio.h>
#include<stdbool.h>

int main()
{
    char c;
    int f=0;
    int ans=0;
    while(scanf("%c",&c)!=EOF)
    {
        if(c==' ')
        {
            if(f==1)
                ans++;
            f=0;
        }
        else if(c=='\n')
        {
            if(f==1)
                ans++;
            printf("%d\n",ans);
            ans=0;
            f=0;
        }
        else
        {
            if(!((c >= 48 && c<= 57) || (c >= 65 && c<= 90) || (c >= 97 && c<= 122)))
                f=2;
            else if(f==0)
                f=1;
        }
    }
    return 0;
}
Reshad
  • This code does not work. For `"dog cat23 banana\n$money dollars 352"` on stdin it gives 3, but the answer is 5. – 0___________ May 10 '20 at 18:46
  • How does it not work? In "dog cat23 banana\n$money dollars 352", only "banana\n$money" is not special, right? – Reshad May 11 '20 at 19:26
0

Here is some quickly written, rough code (it can be written better, but today my brain is still sleeping), but it shows the idea.

#include <ctype.h>
#include <stdio.h>

typedef enum
{
    WHITESPACE,
    INSIDEWORD,
    ENDOFTHELINE,
}STATES;


int main()
{
    STATES state = WHITESPACE;
    unsigned nwords = 0;
    int wordvalid = 0;
    int ch;
    int readnext = 0;


    ch = getc(stdin);

    do
    {
        while(!readnext)
        {
            if(ch == '\n' || ch == EOF) 
            {
                state = ENDOFTHELINE;
            }
            switch(state)
            {
                case WHITESPACE:
                    if(!isspace(ch)) 
                    {
                        state = INSIDEWORD;
                        wordvalid = 1;
                    }
                    else 
                    {
                        readnext = 1;
                    }
                    break;
                case INSIDEWORD:
                    if(isspace(ch))
                    {
                        nwords += wordvalid;
                        state = WHITESPACE;
                        wordvalid = 0;
                    }
                    else
                    {
                        if(wordvalid)
                        {
                            if(!isalnum(ch))
                            {
                                wordvalid = 0;
                            }
                        }
                    }
                    readnext = 1;
                    break;
                case ENDOFTHELINE:
                    nwords += wordvalid;
                    printf("valid words: %d\n", nwords);
                    nwords = 0;
                    readnext = 1;
                    wordvalid = 0;
                    if(ch != EOF) state = WHITESPACE;
                    break;
            }
        }
        ch = getc(stdin);
        readnext = 0;
    }while(ch != EOF || state != ENDOFTHELINE);
}

You can test it here: https://godbolt.org/z/Fvv7NM

0___________
  • Overly complicated and confusing (no need for a double-while loop) with absolutely no explanation as to what it is doing or why. Just dropping a bunch of code might help someone finish their homework, but won't help them learn. – Chris Dodd May 10 '20 at 18:51
  • thanks, but the answer should not be the sum of valid words. for that input the output should be: 3 2 not 5 – OrangeBike May 10 '20 at 18:54
  • @OrangeBike now it counts in lines – 0___________ May 10 '20 at 19:13