
My assignment is to create a function that:

Prints a table indicating the number of occurrences of each different word in the text, in the same order in which they appear.

I'm using gets() to read in the string and a function to tokenize each word in the string and store it in a 2-dimensional array. I need help figuring out how to write a function that checks the array for duplicates.

Here's the tokenizer function:

void tokeniz(char *array)
{
    char words[arraySize][arraySize] = { NULL };

    const char s[2] = " ";
    char *token;
    int i = 0;

    token = strtok(array, s);

    while (token != NULL)
      {
        strcpy(words[i], token);
        token = strtok(NULL, s);
        i++;
      }
    wotable(words);
}

Earlier in the program, I have a function to count the number of times each character appears in the string (pre-tokenization). Would I be able to repurpose some of that code?

void alpha(char *array)
{
    char character = 'a';
    int numberOf = 0, tcc = 0;

    for (character = 'a'; character <= 'z'; character++)
    {
        tcc = (tcc + numberOf);
        numberOf = 0;
        for (int i = 0; i < arraySize; i++)
            if (*(array + i) == character)
                numberOf++;
        if (numberOf != 0)
            printf("\nNumber of %c's:\t\t%d\n", character, numberOf);
    }
    printf("\nTotal character count:\t%d\n\n- - - - - - - - - - - - - -", tcc);
}
  • Without showing that code it is hard to say, but it is unlikely that you can. – Soren May 03 '16 at 20:02
  • You should rarely, if ever, use `gets()`; read [link here](http://stackoverflow.com/questions/1694036/why-is-the-gets-function-so-dangerous-that-it-should-not-be-used) for an explanation of why. Use `fgets()` instead. – Fjotten May 03 '16 at 20:09
  • Somehow I feel you have your `alpha` routine backwards. Imagine you want to write the same thing for words; would you loop over your local dictionary file and test every word in it to see if it's in your input string? – Jongware May 03 '16 at 20:12
  • 1) Using `gets()` 2) Initializing with `NULL` when `0` or `'\0'` is needed 3) `strcpy()` with no length protection are all weak practices best avoided. – chux - Reinstate Monica May 03 '16 at 21:25
  • The function `gets()` was deprecated for some years and was removed completely in the C11 standard. Your compiler should have told you about that. Strongly suggest replacing `gets()` with `fgets()` (be sure to read the man page, because the parameters are completely different). – user3629249 May 06 '16 at 06:20
  • Suggest that the `s` array (a very poor name) contain tab, space, period, exclamation mark, colon, semicolon, single quote, double quote, and newline. The code will need to handle the newline when you switch to using `fgets()` (read the man page). – user3629249 May 06 '16 at 06:28
  • If you know how to use `malloc()` and `realloc()`, then strongly suggest using them rather than a fixed-size 2D array. Suggest using `strdup()` to extract each token from the input data. Suggest `struct wordsAndCount { char *word; size_t wordCount; };` for each entry in the array of words; then keeping count, etc. will be relatively easy. Determining if a word is already in the array would be a simple loop using `strcmp()` to see if a new word matches a word already in the array. – user3629249 May 06 '16 at 06:35
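
Taken together, the comments above suggest reading with `fgets()`, splitting on a wider delimiter set, and finding repeats with a `strcmp()` loop. Below is a minimal sketch of that approach using fixed-size arrays. The names `wordEntry`, `find_word`, `count_words`, `print_table`, `report_line` and the limits `MAX_WORDS` and `MAX_LEN` are assumptions made for illustration and are not part of the original program.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 100   /* assumed limits, not taken from the question */
#define MAX_LEN   32

struct wordEntry {
    char word[MAX_LEN];
    int  count;
};

/* Return the index of word in table[0..used-1], or -1 if it is absent. */
int find_word(const struct wordEntry *table, int used, const char *word)
{
    for (int i = 0; i < used; i++)
        if (strcmp(table[i].word, word) == 0)
            return i;
    return -1;
}

/* Split line in place and count each word, keeping first-appearance order.
   Returns the number of distinct words stored in table. */
int count_words(char *line, struct wordEntry *table, int max_words)
{
    const char *delims = " \t\n.,;:!\"'";
    int used = 0;

    for (char *tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims)) {
        char word[MAX_LEN];

        /* snprintf truncates an overlong token instead of overflowing */
        snprintf(word, sizeof word, "%s", tok);

        int idx = find_word(table, used, word);
        if (idx >= 0) {
            table[idx].count++;
        } else if (used < max_words) {
            strcpy(table[used].word, word);   /* same size, so this fits */
            table[used].count = 1;
            used++;
        }
    }
    return used;
}

/* Print one row per distinct word. */
void print_table(const struct wordEntry *table, int used)
{
    printf("%-20s %s\n", "Word", "Count");
    for (int i = 0; i < used; i++)
        printf("%-20s %d\n", table[i].word, table[i].count);
}

/* Read one line with fgets (bounded, unlike gets) and print its table.
   Returns 0 on success, -1 if no line could be read. */
int report_line(FILE *in)
{
    char line[1024];
    struct wordEntry table[MAX_WORDS];

    if (fgets(line, sizeof line, in) == NULL)
        return -1;
    print_table(table, count_words(line, table, MAX_WORDS));
    return 0;
}
```

Calling `report_line(stdin)` from `main()` would replace the `gets()` call and the tokenizer. Note that `strcmp()` is case-sensitive, so "The" and "the" are counted as different words, and that treating the single quote as a delimiter splits contractions such as "don't".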

2 Answers


No, you will not be able to repurpose some of that code.

user31264

The following code:

  1. cleanly compiles
  2. needs a few lines of code added in a couple of places
  3. is a bit wasteful as each new word results in a call to realloc()
  4. properly checks for errors
  5. could use readline() in place of fgets() (be sure to read and understand the man page for readline())
  6. could separate the freeing of the memory as part of a cleanup function, so that code only needs to be written once.
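
Points 3 and 6 can be illustrated on their own. This is a minimal sketch, assuming a hypothetical `struct wordTable` wrapper and hypothetical helpers `tableAppend()` and `tableFree()` (none of these appear in the listing): the capacity doubles when the array is full, so `realloc()` is not called for every new word, and a single cleanup function replaces the repeated free loops.

```c
#include <stdlib.h>
#include <string.h>

struct wordsAndCount {
    char  *pWord;
    size_t count;
};

/* Hypothetical container: the array plus how much of it is in use. */
struct wordTable {
    struct wordsAndCount *items;
    size_t used;
    size_t capacity;
};

/* Append a copy of word with count 1; returns 0 on success, -1 on failure.
   Capacity doubles when full, so realloc runs O(log n) times overall. */
int tableAppend(struct wordTable *t, const char *word)
{
    if (t->used == t->capacity) {
        size_t newCap = t->capacity ? t->capacity * 2 : 8;
        struct wordsAndCount *tmp = realloc(t->items, newCap * sizeof *tmp);
        if (!tmp)
            return -1;          /* t->items is unchanged and still owned by t */
        t->items = tmp;
        t->capacity = newCap;
    }

    size_t len = strlen(word) + 1;
    char *copy = malloc(len);   /* portable stand-in for strdup() */
    if (!copy)
        return -1;
    memcpy(copy, word, len);

    t->items[t->used].pWord = copy;
    t->items[t->used].count = 1;
    t->used++;
    return 0;
}

/* Release every word and the array itself; safe to call on an empty table. */
void tableFree(struct wordTable *t)
{
    for (size_t i = 0; i < t->used; i++)
        free(t->items[i].pWord);
    free(t->items);
    t->items = NULL;
    t->used = 0;
    t->capacity = 0;
}
```

A table would start as `struct wordTable t = { NULL, 0, 0 };`, and every error path could then call `tableFree(&t)` before exiting.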

And now the code:

#define _POSIX_C_SOURCE 200809L  // strdup() is POSIX; it only became standard C in C23

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BUF_LEN (1024)

struct wordsAndCount
{
    char *pWord;
    size_t  count;
};


// note: following assumes words are not continued across multiple lines
int main( int argc, char *argv[] )
{
    if( 2 != argc)
    {
        fprintf( stderr, "USAGE: %s <inputFileName>\n", argv[0]);
        exit( EXIT_FAILURE );
    }

    // implied else, correct number of command line arguments

    FILE *fp = NULL;
    if( NULL == (fp = fopen( argv[1], "r") ) )
    { // fopen failed
        perror( "fopen for input file failed" );
        exit( EXIT_FAILURE );
    }

    // implied else, fopen successful

    struct wordsAndCount *table = NULL;  // array of entries, in order of first appearance
    size_t   countWords = 0;

    char buffer[ MAX_BUF_LEN ];
    char *token = NULL;
    const char *delimiters = ",.;: '\"\n";

    while( fgets( buffer, sizeof buffer, fp ) )
    {
        token = strtok( buffer, delimiters );
        while( NULL != token )
        {
            // if word already in table[] <-- need to add code for this:
            // loop over table[0..countWords-1] with strcmp() and, on a match,
            // set found to 1 and foundIndex to the index of the matching entry
            int    found = 0;
            size_t foundIndex = 0;

            if( found )
            {
                table[foundIndex].count++;
            }
            else
            {
                // new word: grow the table by one entry
                struct wordsAndCount *temp = realloc( table, (countWords+1) * sizeof *temp );
                if( !temp )
                { // then realloc failed
                    perror( "realloc failed" );
                    fclose( fp );

                    while( countWords )
                    { // free entries countWords-1 down to 0
                        free( table[--countWords].pWord );
                    }
                    free( table );
                    exit( EXIT_FAILURE );
                }

                // implied else, realloc successful

                table = temp;

                table[countWords].pWord = strdup( token );
                if( !table[countWords].pWord )
                { // then strdup failed
                    perror( "strdup failed" );
                    fclose( fp );

                    while( countWords )
                    {
                        free( table[--countWords].pWord );
                    }
                    free( table );
                    exit( EXIT_FAILURE );
                }

                // implied else, strdup successful

                table[countWords].count = 1;
                countWords++;
            }

            token = strtok( NULL, delimiters );
        } // end while tokens
    } // end while more lines in input file

    // print words and counts <-- need to add code for this

    fclose( fp );
    while( countWords )
    {
        free( table[--countWords].pWord );
    }
    free( table );
    return EXIT_SUCCESS;
} // end function: main
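
The two places marked "need to add code for this" could be filled in along these lines. This is only a sketch, assuming `table` is a plain array of `struct wordsAndCount`; the helper names `findWord()` and `printTable()` are hypothetical and not part of the listing above.

```c
#include <stdio.h>
#include <string.h>

struct wordsAndCount {
    char  *pWord;
    size_t count;
};

/* Return the entry whose word equals token, or NULL when it is not yet present. */
struct wordsAndCount *findWord(struct wordsAndCount *table, size_t countWords,
                               const char *token)
{
    for (size_t i = 0; i < countWords; i++)
        if (strcmp(table[i].pWord, token) == 0)
            return &table[i];
    return NULL;
}

/* Print one row per entry, in the order the words were first stored. */
void printTable(FILE *out, const struct wordsAndCount *table, size_t countWords)
{
    for (size_t i = 0; i < countWords; i++)
        fprintf(out, "%-20s %zu\n", table[i].pWord, table[i].count);
}
```

Inside the token loop, a non-NULL result from `findWord(table, countWords, token)` would mean incrementing that entry's `count`, and NULL would mean appending the token as a new entry. After the input loop, `printTable(stdout, table, countWords)` would produce the table. Because new words are always appended at the end, the rows come out in order of first appearance.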
user3629249