C-Lang: Segmentation Fault when working with string in for-loop

Question

We recently began studying strings in C at university, and as homework I was given the task of writing a program that removes extra words.

While writing the program, I ran into a problem iterating through a string that I could only work around in a hacky way. I would like to understand the actual cause with your help, since I cannot find the error myself.

The problem is this: when I use strlen(buffer) directly as the for-loop condition, the code compiles and runs without errors. When I instead use the __act_buffer_len variable, which holds the value of strlen(buffer), I get a segmentation fault at runtime.

I tried several other approaches, but only the one described above worked for me.

// deletes words with <= 2 letters
char* _delete_odd(const char* buffer, char delim)
{
    int __act_buffer_len = strlen(buffer);

    // for debugging purposes
    printf("__actbuff: %d\n", __act_buffer_len);
    printf("sizeof: %d\n", sizeof(buffer));
    printf("strlen: %d\n", strlen(buffer));

    char* _newbuff = malloc(__act_buffer_len + 1); // <- new buffer without words with less than 2 unique words
    char* _tempbuff; // <- used to store current word

    int beg_point = 0;
    int curr_wlen = 0;
    for (int i = 0; i < strlen(buffer); i++)       // no errors at runtime, app runs well
    // for (int i = 0; i < __act_buffer_len; i++)  // <- segmentation fault when loop is reaching a space character
    // for (int i = 0; buffer[i] != '\0'; i++)     // <- also segmentation fault at the same spot
    // for (size_t i = 0; i < strlen(buffer); i++) // <- even this gives a segmentation fault which is totally confusing for me
    {
        printf("strlen in loop %d\n", i);
        if (buffer[i] == delim)
        {
            char* __cpy;
            memcpy(__cpy, &buffer[beg_point], curr_wlen); // <- will copy a string starting from the beginning of the word til its end

            // this may be commented for testing purposes
            __uint32_t __letters = __get_letters(__cpy, curr_wlen); // <- will return number of unique letters in word
            if (__letters > 2) // <- will remove all the words with less than 2 unique letters
            {
                strcat(_newbuff, __cpy);
                strcat(_newbuff, " ");
            }

            beg_point = i + 1; // <- will point on the first letter of the word
            curr_wlen = buffer[beg_point] == ' ' ? 0 : 1; // <- if the next symbol after space is another space, than word length should be 0
        } 
        else curr_wlen++;
    }
    return _newbuff;
}

In short, the code above finds each delimiter character in the string and counts the number of unique letters in the word before that delimiter.
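The helper that counts unique letters is not shown in the post, so its exact behavior is unknown. As a rough sketch only, a helper of that kind might look like the following; the name `count_unique_letters` and the choice to count every distinct byte value (case-sensitive, not restricted to alphabetic characters) are assumptions made for illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the helper that is not shown in the post:
   counts the distinct byte values among the first len bytes of word. */
uint32_t count_unique_letters(const char *word, size_t len)
{
    unsigned char seen[UCHAR_MAX + 1] = {0}; /* one flag per possible byte value */
    uint32_t unique = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)word[i];
        if (!seen[c]) {
            seen[c] = 1;
            unique++;
        }
    }
    return unique;
}
```

Taking an explicit length means the word does not have to be null-terminated, which matches how the question passes a length alongside the pointer.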

All pointers must be initialized. You are creating pointer variables but you are not setting them to point to valid memory. You can use malloc() or just create character arrays of a certain size to hold your buffers. — siride, Feb 12 '22 at 14:07
Names starting with two underscores are reserved for the implementation, any use of them is UB. `%d` is a wrong format specifier for `sizeof(anything)` and `strlen(anything)`, UB. `char* __cpy; memcpy(__cpy, ...` is UB because `__cpy` is not initialised. — n. m. could be an AI, Feb 12 '22 at 14:11
thanks for your help, guys. Solved the problem by initializing __cpy with malloc. — KatanaMajestyt, Feb 12 '22 at 14:21
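To illustrate the point made in the comments above, here is a minimal sketch of copying one word into memory that the pointer actually owns. The function name `copy_word` is illustrative and not part of the original program; the sketch assumes the caller passes a valid start offset and length.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, null-terminated copy of len bytes starting at
   src + start, or NULL if allocation fails. The caller must free() it.
   copy_word is an illustrative name, not part of the original program. */
char *copy_word(const char *src, size_t start, size_t len)
{
    char *word = malloc(len + 1);   /* +1 leaves room for the terminator */
    if (word == NULL)
        return NULL;
    memcpy(word, src + start, len);
    word[len] = '\0';               /* memcpy does not add a terminator */
    return word;
}
```

For the debugging output, `size_t` values such as the results of `strlen` and `sizeof` are printed with `%zu`, for example `printf("strlen: %zu\n", strlen(buffer));`.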

score 0 · Accepted Answer · answered Feb 12 '22 at 14:28

My mistake was that __cpy was an uninitialized pointer, so memcpy wrote through an indeterminate address. Also, as @n.1.8e9-where's-my-sharem. stated, I shouldn't give variables names that start with two underscores.

The final code:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// deletes words with <= 2 unique letters
// (_get_letters is defined elsewhere in the program)
char* _delete_odd(const char* buffer, char delim)
{
    size_t _act_buffer_len = strlen(buffer);
    char* _newbuff = malloc(_act_buffer_len + 1); // <- new buffer for the kept words, +1 for the terminating '\0'
    if (_newbuff == NULL)
        return NULL;
    _newbuff[0] = '\0'; // <- strcat needs a valid (empty) string to append to

    int beg_point = 0;
    int curr_wlen = 0;
    for (size_t i = 0; i < _act_buffer_len; i++)
    {
        if (buffer[i] == delim)
        {
            char* _cpy = malloc(curr_wlen + 1);
            if (_cpy == NULL)
            {
                free(_newbuff);
                return NULL;
            }
            memcpy(_cpy, &buffer[beg_point], curr_wlen); // <- copies the word from its first letter to its last
            _cpy[curr_wlen] = '\0'; // <- memcpy does not terminate the copy, but strcat needs a terminator

            // this may be commented for testing purposes
            uint32_t _letters = _get_letters(_cpy, curr_wlen); // <- returns the number of unique letters in the word
            if (_letters > 2) // <- keeps only words with more than 2 unique letters
            {
                strcat(_newbuff, _cpy);
                strncat(_newbuff, &delim, 1); // <- re-adds the delimiter after a kept word
            }

            beg_point = i + 1; // <- points at the first letter of the next word
            curr_wlen = 0;     // <- the next word starts with length 0

            free(_cpy);
        }
        else curr_wlen++;
    }
    // note: a last word that is not followed by delim is never examined
    return _newbuff;
}

Thanks for helping me
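One case the loop above does not cover is a final word with no delimiter after it. A possible variant, sketched here under assumptions, treats the end of the string as a delimiter and writes into the output by index instead of using `strcat`. The names `filter_words` and `unique_letters` are hypothetical, and `unique_letters` is only a stand-in for the helper that is not shown in the post.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the helper not shown in the post (an assumption):
   counts distinct byte values among the first len bytes of word. */
static uint32_t unique_letters(const char *word, size_t len)
{
    unsigned char seen[UCHAR_MAX + 1] = {0};
    uint32_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)word[i];
        if (!seen[c]) {
            seen[c] = 1;
            n++;
        }
    }
    return n;
}

/* Keeps only words with more than two distinct characters, joined by delim.
   Returns a malloc'ed string the caller must free(), or NULL on failure. */
char *filter_words(const char *buffer, char delim)
{
    size_t len = strlen(buffer);
    char *out = malloc(len + 1);   /* the result is never longer than the input */
    if (out == NULL)
        return NULL;

    size_t out_len = 0;
    size_t start = 0;
    for (size_t i = 0; i <= len; i++) {          /* <= so the '\0' ends the last word */
        if (i == len || buffer[i] == delim) {
            size_t wlen = i - start;
            if (wlen > 0 && unique_letters(buffer + start, wlen) > 2) {
                if (out_len > 0)
                    out[out_len++] = delim;      /* separator between kept words */
                memcpy(out + out_len, buffer + start, wlen);
                out_len += wlen;
            }
            start = i + 1;
        }
    }
    out[out_len] = '\0';
    return out;
}
```

Because each word is copied straight from the input, no per-word allocation is needed, and runs of consecutive delimiters simply produce empty words that are skipped.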

If this solved your issue, please mark your question as answered. More info here: https://meta.stackexchange.com/questions/147531/how-mark-my-question-as-answered-on-stack-overflow — Erwol, Feb 12 '22 at 19:29
You may want to see [this discussion](https://stackoverflow.com/q/1449181/2472827) regarding your identifier names. You should use [#include <stdint.h>](https://pubs.opengroup.org/onlinepubs/009696899/basedefs/stdint.h.html) or `inttypes.h` to get `uint32_t`. — Neil, Feb 22 '22 at 16:16
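As a small illustration of the last comment, the standard `uint32_t` type comes from `<stdint.h>`, and `<inttypes.h>` (which includes `<stdint.h>`) adds matching printf format macros such as `PRIu32`. The function name `format_letter_count` below is hypothetical and exists only for this sketch.

```c
#include <inttypes.h>   /* PRIu32; also includes <stdint.h> for uint32_t */
#include <stdio.h>

/* Writes "letters: <count>" into out using the portable PRIu32 macro and
   returns what snprintf returns. format_letter_count is an illustrative name. */
int format_letter_count(char *out, size_t size, uint32_t count)
{
    return snprintf(out, size, "letters: %" PRIu32, count);
}
```

Using `PRIu32` avoids guessing whether `uint32_t` is `unsigned int` or `unsigned long` on a given platform.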
