C program to remove all occurrences of a WORD in string

Question

The output of my code is incorrect. For example, if I input "joy is joyful" and then ask to remove the word "joy", the output should be " is joyful", but instead the output is the same as the input.

Here is the full code:

#include<stdio.h>
#include<conio.h>
#include<string.h>

void print(char string[100]);

main()
{
    char string[100], remove[100];
    int stringLen, removeLen, i, j, k, l, count, location, sLen, ij, counter = 0;
    
    printf("Enter any string: ");
    gets(string);
    
    printf("Enter word to remove: ");
    gets(remove);
    
    printf("\nString before removing '%s': ", remove);
    print(string);
    
    stringLen = strlen(string);
    sLen = stringLen;
    removeLen = strlen(remove);
    
    for(i=0; i<stringLen; i++)
    {
        count = 0;
        
        for(j=0; j<removeLen; j++)
        {
            if(string[i+j] == remove[j])
            {
                count++; 
                location = i;
                ij = i+j;
            }
        }
        
        if(count == removeLen)
        {
            if(string[ij+1] == '\0' && string[ij+1] == ' ' && string[ij+1] == '\n')
            {
                counter = count;
            }
            
            else
            {
                counter = count - 1;
            }
        }
        
        if(counter == removeLen)
        {
            for(l=0; l<count; l++)
            {
                for(k=location; k<sLen; k++)
                {
                    string[k] = string[k+1];
                }
            
                sLen--;
            }
        }
    }
    
    printf("\n\nString after removing '%s':", remove);
    print(string);
    
    getch(); 
    return 0;
}

void print(char string[100])
{
    printf("\n%s", string);
}

I tried making this part:

if(count == removeLen)
{
    if(string[ij+1] == '\0' && string[ij+1] == ' ' && string[ij+1] == '\n')
    {
        counter = count;
    }
    
    else
    {
        counter = count - 1;
    }
}

To this and it worked:

if(count == removeLen)
{
    if(string[ij+1] != '\0' && string[ij+1] != ' ' && string[ij+1] != '\n')
    {
        counter = count - 1;
    }
    
    else
    {
        counter = count;
    }
}

What seems to be the problem with the original one?

Have you stepped through the code with a debugger to see what it is doing? Also you are going out of bounds here since `i+j` will always be larger than `stringLen` in the end and that is going to be a problem. — Sami Kuhmonen, Aug 16 '20 at 05:09
See [Why gets() is so dangerous it should never be used!](https://stackoverflow.com/questions/1694036/why-is-the-gets-function-dangerous-why-should-it-not-be-used) — David C. Rankin, Aug 16 '20 at 05:22
You might be interested in a C library function called [strstr](https://linux.die.net/man/3/strstr) — selbie, Aug 16 '20 at 05:29

score 1 · Answer 1 · answered Aug 16 '20 at 09:28

First of all, never, ever, EVER use gets(). It is so prone to exploitation by buffer overrun that it has been removed from the C library beginning with C11. For more discussion see: [Why gets() is so dangerous it should never be used!](https://stackoverflow.com/questions/1694036/why-is-the-gets-function-dangerous-why-should-it-not-be-used)

In your word removal, you are not worrying about removing leading or trailing whitespace before or after the word you remove, and you only remove that word if it is not a substring in a larger word or a word followed by punctuation. (This is fine, but in isolating and removing words you will generally want to take what is left into consideration.)

You can simplify what you are attempting to do and reduce the complete algorithm to a single loop over the characters in the string. You simply keep three indexes (or counters if you want to think of it that way). You need a read-index, the next character to be read. You need a write-index, the next location in the string to be written. And finally you need a remove-index into the characters of the substring to be removed.

Here you simply loop over the characters in the string with your read-index. Your read and write indexes begin the same. If a letter matches the first letter in your remove substring, you increment your remove-index and loop again. If a sequence of characters matches all characters in your remove substring, on the next iteration your substring index will be at its nul-terminating character.

Now you can test whether the next character under the read-index in your string is whitespace (using isspace() from <ctype.h>) or whether you are at the end of your original string. If either case is true, you simply subtract the substring length from your write-index and continue on, effectively removing the substring from your original string. No nested loops are needed; you are essentially working through each character of the original, keeping track of where you are (the state) with the substring index.

A short example approaching it this way could be something like the following. The function remove_substr() reads the characters in str and removes each isolated occurrence of substr within it, updating the original str in-place:

int remove_substr (char *str, const char *substr)
{
    if (!strstr (str, substr))              /* if substr not found in str */
        return 0;                           /* return 0 - nothing replaced */
    
    size_t  sslen = strlen (substr),        /* length of substr */
            i = 0, j = 0, n = 0;            /* read, write, substr indexes */
    
    do {                                    /* loop over str (including '\0') */
        if (!substr[n]) {                   /* substr found (at substr '\0') */
            /* if at end of str or whitespace */
            if (!str[i] || isspace((unsigned char)str[i]))
                j -= sslen;                 /* subtract sslen from write index */
            n = 0;                          /* reset substr index */
        }
        str[j++] = str[i];                  /* copy from read to write index */
        if (str[i] == substr[n])            /* if char matches substr */
            n++;                            /* increment substr counter */
    } while (str[i++]);                     /* exit after '\0' processed */
    
    return 1;   /* return replacements made */
}

A simple int was chosen as the return type: 0 indicates that no removals took place, and 1 indicates that occurrences of substr were removed from str.

A short example calling the function could be:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAXC 1024

/* insert function here */

int main (void) {
    
    char str[MAXC] = "",                    /* storage for string */
        substr[MAXC] = "";                  /* storage for substring */
    
    fputs ("enter string: ", stdout);       /* prompt for string */
    if (!fgets (str, sizeof str, stdin))    /* read/validate input */
        return 1;
    str[strcspn(str, "\n")] = 0;            /* overwrite '\n' with '\0' */
    
    fputs ("enter substr: ", stdout);       /* ditto for substr */
    if (!fgets (substr, sizeof substr, stdin))
        return 1;
    substr[strcspn(substr, "\n")] = 0;
    
    if (remove_substr (str, substr))        /* remove all substr in str */
        printf ("\nresult: '%s'\n", str);   /* output updated str if removals */
    else
        puts ("\nno replacements made");    /* otherwise output no replacements */
}

Simply run the program and you will be prompted to input the string and the substring to remove. Currently each of the strings used is limited to MAXC (1024 characters); adjust to meet your needs, but don't skimp on buffer size.

Example Use/Output

$ ./bin/str_rm_substr
enter string: joy is joyful
enter substr: joy

result: ' is joyful'

A more complicated example:

$ ./bin/str_rm_substr
enter string: joy is joyful, joy is full of joy
enter substr: joy

result: ' is joyful,  is full of '
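The result above still carries the whitespace that surrounded each removed word. If that matters, one option is a second in-place pass that collapses whitespace runs and trims both ends. The following is only a minimal sketch of that idea; squeeze_spaces() is a hypothetical helper name, not part of the code above:

```c
#include <ctype.h>
#include <stddef.h>

/* hypothetical helper (an assumption, not from the answer above):
 * collapse each run of whitespace to a single space and trim leading
 * and trailing whitespace, in place; returns the new length of s */
size_t squeeze_spaces (char *s)
{
    size_t r = 0, w = 0;        /* read and write indexes */
    int pending = 0;            /* whitespace seen since the last kept char */

    for (; s[r]; r++) {
        if (isspace ((unsigned char)s[r])) {
            pending = 1;        /* remember it, but do not copy it yet */
        }
        else {
            if (pending && w > 0)   /* one space between words only */
                s[w++] = ' ';
            pending = 0;
            s[w++] = s[r];
        }
    }
    s[w] = '\0';                /* trailing whitespace is never written */

    return w;
}
```

Called on the string " is joyful,  is full of ", this leaves "is joyful, is full of" in the buffer.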

There are many ways to write a function like this. You can use combinations of strtok() to tokenize a copy of your original string, checking whether each token matches your substr to remove. You can inch-worm down your string using multiple loops to scan forward to find the first letter in your substr and then loop to see if it matches. You can also use combinations of strspn() and strcspn() to do the same inch-worm technique, letting those functions handle the looping for you. There are probably half a dozen or so valid approaches.
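As one illustration of the inch-worm idea, here is a minimal sketch that lets strstr() find each candidate and memmove() close the gap. It is a different technique from the single-loop function above, and remove_word() is a hypothetical name chosen for this example. It assumes a "word" is delimited by whitespace or the ends of the string on both sides, so "joy" inside "joyful" or "enjoy" is left alone:

```c
#include <ctype.h>
#include <string.h>

/* hypothetical helper (an assumption for illustration): remove every
 * occurrence of word that has whitespace or a string end on both sides;
 * returns the number of occurrences removed */
size_t remove_word (char *str, const char *word)
{
    size_t wlen = strlen (word), removed = 0;
    char *p = str;

    if (wlen == 0)                      /* nothing sensible to remove */
        return 0;

    while ((p = strstr (p, word)) != NULL) {
        /* boundary checks before and after the match */
        int start_ok = (p == str) || isspace ((unsigned char)p[-1]);
        int end_ok = (p[wlen] == '\0') || isspace ((unsigned char)p[wlen]);

        if (start_ok && end_ok) {
            /* shift the tail, including its '\0', down over the word */
            memmove (p, p + wlen, strlen (p + wlen) + 1);
            removed++;
        }
        else {
            p++;                        /* inside a larger word, move on */
        }
    }

    return removed;
}
```

With "joy is joyful, joy is full of joy" and the word "joy" this removes three occurrences and leaves " is joyful,  is full of ". Returning a count rather than 0 or 1 is a design choice of this sketch, so the caller can tell how many removals took place.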

Look things over and let me know if you have questions.

I remember having used *gets()* once in 1987. Please don't shoot me. — Déjà vu, Aug 16 '20 at 09:34
We don't shoot any more -- it's a hanging offense now `:)` (we tried the electric chair -- but that made such a mess out of new C programmers others complained -- pity, I didn't mind the barbecue smell, but it did remind me of the Tom Hanks movie "The Green Mile") — David C. Rankin, Aug 16 '20 at 09:37

score 0 · Answer 2 · answered Aug 16 '20 at 07:18

Here

if(string[ij+1] == '\0' && string[ij+1] == ' '

you test if a character is both a nul and a space.

That will never be true. In other words, the whole if-statement is useless, as it always takes the false path.
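A small sketch may make the difference concrete. The three helper names below are made up for illustration only; they isolate the original test, the test that was presumably intended, and the negated form of the version the asker reported as working:

```c
/* hypothetical helpers, named here only for illustration */

/* the original test: a single char would have to equal three
 * different values at once, so this always evaluates to 0 */
int is_end_and (char c)
{
    return c == '\0' && c == ' ' && c == '\n';
}

/* presumably what was intended: the char is any one of the three */
int is_end_or (char c)
{
    return c == '\0' || c == ' ' || c == '\n';
}

/* the asker's working condition, negated; by De Morgan's laws this
 * gives the same result as is_end_or() for every input */
int is_end_not_and (char c)
{
    return !(c != '\0' && c != ' ' && c != '\n');
}
```

This is why swapping the branches and using != with && worked: it is the logical negation of the || test, not of the original && test.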

answered by Support Ukraine

score 0 · Answer 3 · answered Aug 16 '20 at 07:20

The problem is the condition if(string[ij+1] == '\0' && string[ij+1] == ' ' && string[ij+1] == '\n') together with the decrement of counter. The condition is never true, so counter is always set to count - 1, and after that you will never get into this code:

if(counter == removeLen)
{
    for(l=0; l<count; l++)
    {
        for(k=location; k<sLen; k++)
        {
            string[k] = string[k+1];
        }
    
        sLen--;
    }
}

So remove this code:

if(count == removeLen)
{
    if(string[ij+1] == '\0' && string[ij+1] == ' ' && string[ij+1] == '\n')
    {
        counter = count;
    }
    
    else
    {
        counter = count - 1;
    }
}

And it will work.
