Implementing Split function in C

Question

I'm trying to write a C function that is similar to split in Java. When I do what I do at main, instead of add function, it works perfectly. But I couldn't understand why it does not work with add function.

#include <stdio.h>
#include <string.h>
char *words[50] = { NULL };

void add(const char *word) {
    static int i = 0;
    words[i] = word;
    i++;
}


int main( void )
{
    char string[65];
    char *tokenPtr; 

    fgets(string, 65, stdin); 
    tokenPtr = strtok( string, " " ); 
    add(tokenPtr);

    int i = 0;
    while ( tokenPtr != NULL ) {
        add(tokenPtr);
        tokenPtr = strtok( NULL, " " ); 
    }

    int i;
    for(i = 0; words[i] != NULL; i++)
        puts(words[i]);

    return 0; 
}

This is just a little piece of my actual code, there are some reasons why I need to do it in another function.

I would remove the first call to `add`, since it will duplicate the first word. Are there other problems? If so, you should describe the problems that you're seeing. — user3386109, Mar 21 '15 at 00:24
@BinaryJudy `i` is static variable, so it is initialize once, not repeatedly. — ikh, Mar 21 '15 at 00:40
I think there's no reason for `add` function. Removing it will make program better. — ikh, Mar 21 '15 at 00:42
@ikh Oops, you're right. I looked right over "static". `i` will get incremented. — DigitalNinja, Mar 21 '15 at 00:43
1) multiple definition of int i in main. 2) the first token is being added twice due to the extraneous second line after the call to fgets. also of interest, the magic number '65' should be #define'd and the fgets second parameter should be either the #define name or sizeof string. while the current code will never overflow the words[] array, that could happen as the code changes. so the add function should be checking for that possibility — user3629249, Mar 21 '15 at 02:01

score 4 · Accepted Answer · answered Mar 21 '15 at 00:49

4

Remove the first add calling.

fgets(string, 65, stdin); 
tokenPtr = strtok( string, " " ); 
// add(tokenPtr); // remove

Since you'll add the first token in the next while loop.

Also, you should remove duplication of int i.

// int i; // <-- why is it there?
for(i = 0; words[i] != NULL; i++)
    puts(words[i]);

answered Mar 21 '15 at 00:49

ikh

10,119
1
31
70

oh, yes. I had made a few changes before I put it here. The last piece of code was in another function, so I forgot to change the name of the variable. – errorist Mar 21 '15 at 10:35

score 1 · Answer 2 · answered Mar 21 '15 at 05:02

In addition to removing the first call to add and the duplicate declaration of i, there are a couple of other considerations. While there is nothing wrong with using while to parse the tokens, with strtok in particular, a for loop nicely handles both the initial call, and each subsequent call with NULL.

Similarly, while there is nothing wrong with the static declaration for i in add, passing the index as a parameter to the function can provide access to the count in main(). This adds a definite count of the tokens as well as flexibility should you later decide to allocate/reallocate words dynamically.

While there is nothing wrong with using a for loop to iterate indefinitely while relying on a NULL test on the pointer, if you did pass the token index as a parameter to add, you could iterate over the definite range 0 < i < token index and insure you do not attempt a read beyond the end of words in the event that you have 49 tokens. (which a test for could be added)

Finally, (and this is just a nit), when declaring variables to use as positive counters, indexes, etc, consider using size_t instead of int.

There are many approaches to this problem, no one more right than the other, as long as correct. With all the considerations, a slight revision to the code might look like:

#include <stdio.h>
#include <string.h>

char *words[50] = { NULL };

void add (const char *word, size_t idx) {
    words[idx] = word;
}

int main( void )
{
    char string[65];
    char *tokenPtr; 
    size_t tokIdx = 0;      /* token index  */
    size_t i = 0;           /* use size_t if non-negative counter */

    fgets(string, 65, stdin); 

    for (tokenPtr = strtok (string, " "); tokenPtr && *tokenPtr; tokenPtr = strtok (NULL, " "))
        add (tokenPtr, tokIdx++);

    for (i = 0; i < tokIdx; i++)
        printf (" words[%zu]  %s\n", i, words[i]);

    return 0; 
}

Output

$ echo "how many tokens are in this line eight?" | ./bin/strtokadd
 words[0]  how
 words[1]  many
 words[2]  tokens
 words[3]  are
 words[4]  in
 words[5]  this
 words[6]  line
 words[7]  eight?

I couldn't understand " tokenPtr && *tokenPtr " part in for loop. tokenPtr check if it is NULL but what's *tokenPtr for? — errorist, Mar 21 '15 at 10:47
`*tokenPtr` insures that it is not pointing at the `null-terminating` character at the end if the string. (depending on what you pass a delimiters, this becomes an issue) — David C. Rankin, Mar 21 '15 at 20:11

Implementing Split function in C

2 Answers2

Linked