
I want to split my string into tokens using strtok and then sort the tokens using bubble sort. Typing 'b' to 'z' works fine, but typing 'a' causes a problem.

Here's my code:

#include <stdio.h>
#include <string.h>

int main(int argc, char const *argv[]) {
    char str[10000];
    char *token[100] = { 0 };
    const char s[5] = { ' ', ',', ';', ':', '.' };
    gets(str);
    int i = 0;
    token[i] = strtok(str, s);
    while (token[i] != NULL) {
        i++;
        token[i] = strtok(NULL, s);
    }
    for (int j = 0; j < i; j++) {
        printf("%s\n", token[j]);
    }

    printf("i = %d\n", i);
    char *temp;
    int exchanged = 1;
    for (int j = 0; exchanged && j < i - 1; j++) {
        exchanged = 0;
        for (int k = 0; k < i - 1 - j; k++) {
            if (strcmp(token[k], token[k + 1]) > 0) {
                temp = token[k];
                token[k] = token[k + 1];
                token[k + 1] = temp;
                exchanged = 1;
            }
        }
    }

    printf("\nAfter sort\n");
    for (int j = 0; j < 100; j++) {
        printf("%s\n", token[j]);
    }
    printf("\n");
}

Here's my result: [two screenshots of the program output, not reproduced]

It works well when I type 'b' to 'z', but goes wrong when I type 'a'. Why does that happen?

chqrlie
  • You may want to read this: [Why not upload images of code/data/errors when asking a question?](https://meta.stackoverflow.com/q/285551/12149471) – Andreas Wenzel May 20 '22 at 07:30
  • Obligatory: [Why is the gets function so dangerous that it should not be used?](https://stackoverflow.com/q/1694036/2505965) – Oka May 20 '22 at 07:30
  • Have you tried running your code line by line in a debugger while monitoring the values of all variables, in order to determine at which point your program stops behaving as intended? If you did not try this, then you may want to read this: [What is a debugger and how can it help me diagnose problems?](https://stackoverflow.com/q/25385173/12149471) You may also want to read this: [How to debug small programs?](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) – Andreas Wenzel May 20 '22 at 07:31
  • The second argument to `strtok` should be a null-terminated string. `s` is not a null-terminated string. See: [Undefined Behaviour](https://en.cppreference.com/w/c/language/behavior). – Oka May 20 '22 at 07:34
  • You invoke undefined behaviour. According to [strtok manpage](https://man7.org/linux/man-pages/man3/strtok_r.3.html) `delim` is supposed to be a pointer to a string. You do not provide a valid string as you do not include a terminating 0 byte. – Gerhardh May 20 '22 at 07:35
  • You should avoid declaring huge arrays on the stack, since that may lead to stack overflow. – Lundin May 20 '22 at 07:47

1 Answer

The string of separators const char s[5] = { ' ', ',', ';', ':', '.' }; is not null terminated. Passing it to strtok() has undefined behavior.

If you use a brace enclosed initializer list, you must specify the null terminator either explicitly as:

const char s[] = { ' ', ',', ';', ':', '.', '\0' };

or implicitly by giving the array a length of 6 bytes:

const char s[6] = { ' ', ',', ';', ':', '.' };

But it is much simpler to define this string as:

const char s[] = " ,;:.";
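
As a sanity check, the three definitions can be compared byte for byte. This is only a sketch: the function name check_separator_definitions and the local array names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: aborts via assert() if the three ways of
   defining the separator string do not produce the same array. */
void check_separator_definitions(void) {
    const char a[] = { ' ', ',', ';', ':', '.', '\0' };  /* explicit terminator */
    const char b[6] = { ' ', ',', ';', ':', '.' };        /* b[5] is zero-initialized */
    const char c[] = " ,;:.";                             /* literal appends the terminator */

    /* Each array holds 5 separators plus the null terminator. */
    assert(sizeof a == 6 && sizeof b == 6 && sizeof c == 6);
    assert(memcmp(a, b, sizeof a) == 0);
    assert(memcmp(a, c, sizeof a) == 0);
    assert(strlen(c) == 5);
}
```

Calling check_separator_definitions() from main should return silently. By contrast, an array declared with length 5 has no room for the terminator, so calling strlen() or strtok() on it reads past the end of the array.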

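Also note that gets() was removed from the C Standard in C11 because it has no way to limit how many bytes it writes to the destination array. You should not use this function. Use fgets() instead and add \n to the s string, since fgets() keeps the trailing newline in the buffer.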
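
Adding \n to the separators is one way to deal with the newline that fgets() stores. Another common approach is to strip it right after reading. The sketch below assumes two hypothetical helpers, strip_newline and read_line, which are not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cuts the string at the first newline, if any. */
void strip_newline(char *s) {
    s[strcspn(s, "\n")] = '\0';  /* strcspn returns strlen(s) when no '\n' is found */
}

/* Hypothetical helper: reads one line of at most size - 1 characters.
   Returns 1 on success, 0 on end of file or read error. */
int read_line(char *buf, size_t size, FILE *fp) {
    assert(buf != NULL && size > 0 && fp != NULL);
    if (fgets(buf, (int)size, fp) == NULL) {
        return 0;
    }
    strip_newline(buf);
    return 1;
}
```

With these helpers, gets(str) could be replaced by read_line(str, sizeof str, stdin), and the separator string would not need to contain \n.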

You should also test that i does not exceed the maximum index value in token, to avoid accessing elements beyond the array boundaries.
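
One way to keep the bounds check in a single place is to wrap the tokenizing loop in a function that takes the array capacity as a parameter. This is a minimal sketch; the name split_tokens and its signature are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits str in place with strtok() and stores at most
   max_tokens pointers in tokens[]. Returns the number of pointers stored.
   Tokens beyond max_tokens are silently dropped. */
size_t split_tokens(char *str, const char *delims,
                    char *tokens[], size_t max_tokens) {
    assert(str != NULL && delims != NULL && tokens != NULL);
    size_t count = 0;
    for (char *tok = strtok(str, delims);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delims)) {
        tokens[count++] = tok;  /* count < max_tokens is guaranteed here */
    }
    return count;
}
```

A caller would pass the real capacity of its array, for example split_tokens(str, s, token, 100) for char *token[100], and then loop only over the returned count.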

Here is a modified version:

#include <stdio.h>
#include <string.h>

int main(int argc, char const *argv[]) {
    char str[10000];
    char *token[100];
    char *tok;
    const char s[] = " ,;:.\n";
    if (!fgets(str, sizeof str, stdin)) {
        fprintf(stderr, "no input\n");
        return 1;
    }
    int i = 0;
    tok = strtok(str, s);
    while (tok != NULL) {
        if (i == 100) {
            fprintf(stderr, "too many tokens at %s\n", tok);
            break;
        }
        token[i++] = tok;
        tok = strtok(NULL, s);
    }
    for (int j = 0; j < i; j++) {
        printf("%s\n", token[j]);
    }

    printf("i = %d\n", i);
    int exchanged = 1;
    for (int j = 0; exchanged && j < i - 1; j++) {
        exchanged = 0;
        for (int k = 0; k < i - 1 - j; k++) {
            if (strcmp(token[k], token[k + 1]) > 0) {
                char *temp = token[k];
                token[k] = token[k + 1];
                token[k + 1] = temp;
                exchanged = 1;
            }
        }
    }

    printf("\nAfter sort\n");
    for (int j = 0; j < i; j++) {
        printf("%s\n", token[j]);
    }
    printf("\n");
    return 0;
}

Note that all the tokens pointed to by the token string pointers are inside the str array. If you overwrite this array, for example with a second call to fgets(), the tokens will be corrupted. You can allocate separate strings to avoid this with:

    token[i++] = strdup(tok);

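To avoid memory leaks, free the copies before leaving the function with for (int j = 0; j < i; j++) free(token[j]); which requires including <stdlib.h>. Note that strdup() is a POSIX function that only became part of the C Standard in C23.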
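
The copy-then-free pattern can be sketched as a pair of functions. To stay portable to compilers without strdup(), this sketch swaps in a hypothetical stand-in, dup_string, built on malloc(). The names copy_tokens and free_tokens are also made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical portable stand-in for strdup(): returns a heap copy or NULL. */
char *dup_string(const char *s) {
    size_t len = strlen(s) + 1;  /* include the null terminator */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Splits str in place and stores heap copies of at most max_tokens tokens.
   Stops early if an allocation fails. Returns the number of copies stored. */
size_t copy_tokens(char *str, const char *delims,
                   char *tokens[], size_t max_tokens) {
    assert(str != NULL && delims != NULL && tokens != NULL);
    size_t count = 0;
    char *tok = strtok(str, delims);
    while (tok != NULL && count < max_tokens) {
        char *copy = dup_string(tok);
        if (copy == NULL) {
            break;
        }
        tokens[count++] = copy;
        tok = strtok(NULL, delims);
    }
    return count;
}

/* Frees the first count copies and clears the pointers. */
void free_tokens(char *tokens[], size_t count) {
    for (size_t j = 0; j < count; j++) {
        free(tokens[j]);
        tokens[j] = NULL;
    }
}
```

Because each token is a separate allocation, the input buffer can be reused for another fgets() call without corrupting earlier tokens. Sorting by swapping pointers works unchanged on the copies.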

chqrlie