I'm currently writing an assignment for uni that reads data from either a file or stdin, tokenises all the words, and prints a list of the words that occur together with the count of each word.

Example of desired output:

Good bye occurred 5 times
This occurred 3 times
Hello occurred 1 time

Example of current output:

This
1
is
1
fun!

1
This
1
is
1
fun!

Don't mind the formatting of the output. That is an issue to be fixed later.
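
The blank line after `fun!` in the output above is consistent with the newline that fgets keeps at the end of each line staying attached to the last token, because strtok is only told to split on spaces. A minimal sketch of one possible fix, assuming words are separated by any whitespace; `split_words` is a hypothetical helper name, not part of the original program:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place on spaces, tabs, carriage
   returns and newlines, stores up to `max_tokens` token pointers in `out`,
   and returns how many tokens were stored. */
size_t split_words(char *line, char *out[], size_t max_tokens)
{
    static const char delims[] = " \t\r\n";
    size_t n = 0;

    assert(line != NULL && out != NULL);

    for (char *tok = strtok(line, delims);
         tok != NULL && n < max_tokens;
         tok = strtok(NULL, delims))
    {
        out[n++] = tok;   /* points into `line`, no trailing newline */
    }
    return n;
}
```

With the newline in the delimiter set, the last word on a line compares equal to the same word elsewhere, which also matters for the duplicate check.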

I have a running program that uses a linked list declared as follows:

typedef struct node
{
    char word[50];
    int count;
    struct node * next;
}words;

The linked list is initialised as follows:

words * head = NULL;
head = malloc(sizeof(words));

Two pointers are then assigned to the list:

words * current = head;
words * search = head;

What I'm struggling with is the following piece of code:

while (!feof(input_file))
{
    while(current->next != NULL)
    {
        current = current-> next;
    }

    //Test-loop for tokenisation
    while(fgets(buffer,512, input_file) != NULL)
    {
        //Declaration of char token
        char* token;

        //Declaration of flag
        int duplicate_word = 1;
        //Test for-loop
        for (token = strtok(buffer, " "); token != NULL; token = strtok(NULL, " "))
        {
            char duplication_check_token[60];
            strcpy(duplication_check_token, token);

            while(search != NULL)
            {
                char duplication_check_search[60];
                strcpy(duplication_check_search, current -> word);
                if (strcmp(duplication_check_search, duplication_check_token) == 0)
                {
                    search->count++;
                    duplicate_word = 0;
                    break;
                }
                search = search -> next;
            }

            if (duplicate_word != 0)
            {
                while(current->next != NULL)
                {
                    current = current-> next;

                }

                current = malloc(sizeof(words));
                strcpy(current -> word, token);
                current -> count = 1;
                current -> next = NULL;

                //Test print
                printf("%s\n", token);
                printf("%d\n", current -> count);
            }

        }



    }

When debugging, the program never seems to walk through the entire linked list in the while (search != NULL) loop.

What part of the logic am I getting wrong?

Thank you for any help!

E.Bille
See [Why is `while(!feof(..))` always wrong](https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong) – Pablo Feb 15 '18 at 01:25
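
To illustrate the point of the linked question: feof() only reports end of file after a read has already failed, so the read call itself should drive the loop. A minimal sketch, assuming a 512-byte buffer as in the question; `count_chunks` is a hypothetical name used only for this example:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts how many times fgets succeeds on `in`.
   Each success is one line, or one 511-character piece of a longer line.
   Returns -1 if reading stopped because of an I/O error. */
long count_chunks(FILE *in)
{
    char buffer[512];
    long chunks = 0;

    assert(in != NULL);

    /* The loop condition is the read itself, not feof(). */
    while (fgets(buffer, sizeof buffer, in) != NULL)
    {
        chunks++;
    }

    /* Only after the read has failed is it meaningful to ask why. */
    if (ferror(in))
    {
        return -1;
    }
    return chunks;
}
```

In the question's code the inner `while (fgets(...) != NULL)` loop already has this shape, so the outer `while (!feof(input_file))` loop could probably be dropped.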

2 Answers


You have a break in your loop, and that is probably why you don't go through the whole list. If it triggers, it means the word was already found in your list.

Add this before the loop:

int a = 0;

And then add the commented lines in the loop:

++a; // before the if.
if (strcmp(duplication_check_search, duplication_check_token) == 0)
{
    search->count++;
    duplicate_word = 0;
    printf("%d\n", a); // to check what was the index of the item causing the break
    getchar(); // pause until next keypress
    break;
}
Antonin GAVREL

For two reasons. Firstly, search is only ever set to the head of the list once, when it is declared, so once it has run off the end (or stopped at a match) it is never rewound, and later tokens are not compared against the whole list. I think this:

        while(search != NULL)
        {
            char duplication_check_search[60];
            strcpy(duplication_check_search, current -> word);
            if (strcmp(duplication_check_search, duplication_check_token) == 0)
            {
                search->count++;
                duplicate_word = 0;
                break;
            }
            search = search -> next;
        }

should probably be this:

        for (search = head; search != NULL; search = search->next)
        {
            char duplication_check_search[60];
            /* Compare against the node being visited (search), not current. */
            strcpy(duplication_check_search, search -> word);
            if (strcmp(duplication_check_search, duplication_check_token) == 0)
            {
                search->count++;
                duplicate_word = 0;
                break;
            }
        }
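
The same search can be pulled out into a small function, which removes the need for the temporary copies and the flag. This is only a sketch under the question's struct definition; `find_word` is a hypothetical name. Note that the comparison reads `search->word`, the node being visited:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct node
{
    char word[50];
    int count;
    struct node *next;
} words;

/* Hypothetical helper: returns the node whose word equals `token`,
   or NULL if no node in the list matches. */
words *find_word(words *head, const char *token)
{
    assert(token != NULL);

    /* Always start from the head, so every call sees the whole list. */
    for (words *search = head; search != NULL; search = search->next)
    {
        if (strcmp(search->word, token) == 0)
        {
            return search;
        }
    }
    return NULL;
}
```

A caller could then increment `count` on the returned node, or add a new node when the result is NULL.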

And secondly, because there is only the one head element in the list! Your code allocates a word struct and assigns it to current, but current is never inserted into your list. I think this:

            while(current->next != NULL)
            {
                current = current-> next;

            }

            current = malloc(sizeof(words));
            strcpy(current -> word, token);
            current -> count = 1;
            current -> next = NULL;

was meant to be this:

            current = malloc(sizeof(words));
            strcpy(current -> word, token);
            current -> count = 1;
            current -> next = head;
            head = current;

The while (current... loop isn't necessary; it's far easier to add new elements to the head of the list.
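
Adding at the head can be sketched as a function that takes the address of the head pointer. This assumes the list starts out empty (`head` is NULL) rather than with a pre-allocated, uninitialised first node; `push_word` is a hypothetical name, and snprintf is used here instead of strcpy so that an over-long token is truncated rather than overflowing the buffer:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct node
{
    char word[50];
    int count;
    struct node *next;
} words;

/* Hypothetical helper: allocates a node for `token` with count 1 and
   links it in as the new head. Returns the new node, or NULL if malloc
   fails (the list is left unchanged in that case). */
words *push_word(words **head, const char *token)
{
    assert(head != NULL && token != NULL);

    words *node = malloc(sizeof *node);
    if (node == NULL)
    {
        return NULL;
    }

    /* Copies at most 49 characters and always writes the terminator. */
    snprintf(node->word, sizeof node->word, "%s", token);
    node->count = 1;
    node->next = *head;   /* old head becomes the second element */
    *head = node;         /* new node is now reachable from the list */
    return node;
}
```

It would be called as `push_word(&head, token)` when the search finds no match.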

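Also, I trust that you realise that if a line in your file has a "word" of fifty characters or more, really really bad things will happen, right? The word array holds 49 characters plus the terminating '\0', so strcpy() will write past its end. You might want to use strncpy() rather than strcpy(), keeping in mind that strncpy() does not add the terminator when the source fills the whole buffer.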
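
A bounded copy along those lines might look like the following. It is a sketch, not the only way to do it; `copy_word` is a hypothetical helper, and it reports truncation so the caller can decide what to do with over-long tokens:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `src` into `dst`, writing at most
   `dst_size` bytes including the terminator. Returns 1 if the whole
   string fit, 0 if it was truncated (or if dst_size is 0). */
int copy_word(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    if (dst_size == 0)
    {
        return 0;
    }

    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not guarantee this */

    return strlen(src) < dst_size;
}
```

With the question's struct, the call would be something like `copy_word(current->word, sizeof current->word, token)`.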

Whilom Chime