56

What is the easiest and most efficient way to remove spaces from a string in C?

Wolf
Tyler Treat

16 Answers

110

Easiest and most efficient don't usually go together…

Here's a possible solution for in-place removal:

void remove_spaces(char* s) {
    char* d = s;
    do {
        while (*d == ' ') {
            ++d;
        }
    } while (*s++ = *d++);
}
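
A minimal usage-oriented sketch of the same loop, assuming the caller passes a writable buffer (an array or heap memory, never a string literal). The `assert` precondition and the extra parentheses around the assignment are additions for this sketch; the parentheses only silence compiler warnings about assignment used as a condition.

```c
#include <assert.h>
#include <stddef.h>

/* Same algorithm as above: d reads ahead, s marks where the next kept
   character is written. */
void remove_spaces(char *s) {
    assert(s != NULL);          /* precondition added for this sketch */
    char *d = s;
    do {
        while (*d == ' ') {
            ++d;                /* skip a run of spaces */
        }
    } while ((*s++ = *d++));    /* copy one char; stops after copying '\0' */
}
```

A call such as `char a[] = "  a b  c "; remove_spaces(a);` should leave `a` holding `abc`. The loop ends because the value of the assignment is the character just copied, and the terminating null character is zero.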
Wolf
Aaron
  • 5
    What happens if the input source was initialized from a string literal? – Suppressingfire Nov 13 '09 at 01:11
  • 11
    @Suppressingfire: assuming you mean `RemoveSpaces("blah");`, and not `char a[] = "blah"; RemoveSpaces(a);`, then undefined behaviour. But that's not the fault of this code. It is not recommended to pass a read-only string to a function which is documented to modify the string passed to it (by, for example, removing spaces) ;-) – Steve Jessop Nov 13 '09 at 01:25
  • 4
    I think you should do *i = '\0'; in the end. – Nick Louloudakis Nov 12 '13 at 14:14
  • 8
    `*i = 0` and `*i = '\0'` are the same :) – Uxío Mar 31 '15 at 14:31
  • Could slightly simplify with `do { *i = *j; if(*i != ' ') i++; } while(*j++ != 0)`. Then no need for final `*i = 0;` – chux - Reinstate Monica May 20 '15 at 20:41
  • Could simplify further by using `source` for `j` and ditching the `j` entirely. I see no reason to preserve the inbound `source` value unless the function is to return the original base address (such as the family of `str` functions in the standard library, which itself may not be such a bad idea). – WhozCraig May 29 '15 at 06:58
  • 3
    How... How is this working? I am new to C and pointers so a walkthrough of what's happening would be much appreciated – starscream_disco_party Feb 14 '16 at 20:57
  • 2
    @starscream_disco_party take two pointers to the same address. While the second pointer (d) points to a space char, just increment d. This in effect moves past one or more space characters, but it doesn't touch the pointer passed into the function (s). After that, you assign the character d points to into the location s points to (which is still just past the last kept character, or at the first element). After this assignment happens, both pointers are incremented. Clever. It needs double parens around the assignment to avoid compiler warnings, though – Joe McDonagh Nov 09 '19 at 06:14
  • 2
    @JoeMcDonagh Can you tell me on what conditions the while loop is going to break? After each iteration, s and d pointers increments. When d reaches '\0' then it assigns '\0' to where the s is pointing to and after that both s and d increments by one location after that how the while terminates? – strikersps Feb 25 '20 at 16:23
  • 3
    @strikersps assignment returns the assigned value normally in languages like C, so when you hit the end of the string and assign the null character, it will be interpreted as falsy and the loop will break. I love this piece of code. It just so dense and beautiful! – Joe McDonagh Feb 25 '20 at 16:49
  • @JoeMcDonagh Yeah it's a clever piece of code. My doubt is, let's say I have a string " blah" which has a 3 leading whitespaces, and I run the above function on this string after the first iteration `d` pointer will be ahead of the `s`, now in each iteration `*s = *d` first then `s++` and `d++`, then if `*s` is not falsy then `do{}` executes again that's how the code works I suppose. So, after each assignment `*s = *d` and as `d` is ahead, and `while()` will break only when `*s` is falsy which is checked after `s++` and `d++` so don't you think `d` will go out-of-bound? – strikersps Feb 25 '20 at 17:21
  • 1
    @strikersps first off just want to point out that increment operator takes precedence over assignment. what i would suggest you do to understand this piece of code is add printf statements to each iteration of the loop to show you the values in the pointers on each iteration. – Joe McDonagh Feb 26 '20 at 21:11
  • The **breaking condition** for this do-while loop comes from *"An assignment expression has the value of the left operand after the assignment."* – Chris Tang Mar 05 '20 at 12:14
  • If you're having trouble figuring out how this snippet works, add `printf("%p %c = %p %c\n", (void *)s, *s, (void *)d, *d)` before the `} while` and check it out! Also, for the breaking condition, the null character gets assigned through s from d and is [hence the value of the expression](https://stackoverflow.com/questions/16567622/what-is-the-result-of-an-assignment-expression-in-c?noredirect=1&lq=1). We have `while(0)` and the loop ends. – Angaj Sharma Sep 22 '22 at 16:28
21

Here's a very compact, but entirely correct version:

do while(isspace(*s)) s++; while(*d++ = *s++);

And here, just for my amusement, are code-golfed versions that aren't entirely correct, and get commenters upset.

If you can risk some undefined behavior, and never have empty strings, you can get rid of the body:

while(*(d+=!isspace(*s++)) = *s);

Heck, if by space you mean just space character:

while(*(d+=*s++!=' ')=*s);

Don't use that in production :)
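
For comparison, here is a hedged sketch of the first (well-defined) version wrapped in a function. The name `strip_whitespace` is made up for this example, and the cast to `unsigned char` is added because `isspace` is only defined for values in that range (or `EOF`).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical wrapper: s is the read pointer, d the write pointer. */
void strip_whitespace(char *s) {
    assert(s != NULL);
    char *d = s;
    do {
        /* isspace('\0') is false, so this never runs past the terminator */
        while (isspace((unsigned char)*s)) {
            s++;
        }
    } while ((*d++ = *s++));    /* each pointer is modified once per expression */
}
```

Unlike the golfed variants, no pointer is both incremented and dereferenced without sequencing in the same expression, so there is no undefined behavior to risk.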

Kornel
  • Interesting, the first two function on my machine. But I guess all of these are undefined, since using s++ and *s in one statement results in undefined behavior? – Andomar Nov 13 '09 at 00:54
  • make sure that you aren't beyond the end of the string when dereferencing it. – Casey Nov 13 '09 at 00:59
  • 1
    @Andomar: First one is completely safe and sound. Last two are sketchy indeed (tested in GCC4.2). – Kornel Nov 13 '09 at 01:50
  • 1
    Calling it "sound" is perhaps a bit too polite. All 3 versions are completely unreadable, for no performance gained. [Apple agrees](https://blog.codecentric.de/en/2014/02/curly-braces/) that braces are unnecessary. I mean, what is many million dollars in losses and all the programmers in the world laughing at you, compared to the sheer agony involved in writing braces? – Lundin May 21 '15 at 11:35
  • Why is it necessary to risk undefined behaviour, when you could solve that risk using the comma operator and a `for` loop? – autistic May 24 '15 at 00:41
  • stumbled upon this tonight looking for the most efficient way to do it as an exercise- i love it! (the first one i mean). parens around the assignment will squash warnings. – Joe McDonagh Nov 09 '19 at 06:17
20

As we can see from the answers posted, this is surprisingly not a trivial task. When faced with a task like this, it would seem that many programmers choose to throw common sense out the window, in order to produce the most obscure snippet they possibly can come up with.

Things to consider:

  • You will want to make a copy of the string, with spaces removed. Modifying the passed string is bad practice, it may be a string literal. Also, there are sometimes benefits of treating strings as immutable objects.
  • You cannot assume that the source string is not empty. It may contain nothing but a single null termination character.
  • The destination buffer can contain any uninitialized garbage when the function is called. Checking it for null termination doesn't make any sense.
  • Source code documentation should state that the destination buffer needs to be large enough to contain the trimmed string. Easiest way to do so is to make it as large as the untrimmed string.
  • The destination buffer needs to hold a null terminated string with no spaces when the function is done.
  • Consider if you wish to remove all white space characters or just spaces ' '.
  • C programming isn't a competition over who can squeeze in as many operators on a single line as possible. It is rather the opposite, a good C program contains readable code (always the single-most important quality) without sacrificing program efficiency (somewhat important).
  • For this reason, you get no bonus points for hiding the insertion of null termination of the destination string, by letting it be part of the copying code. Instead, make the null termination insertion explicit, to show that you haven't just managed to get it right by accident.

What I would do:

void remove_spaces (char* restrict str_trimmed, const char* restrict str_untrimmed)
{
  while (*str_untrimmed != '\0')
  {
    if(!isspace((unsigned char)*str_untrimmed)) /* isspace needs <ctype.h> and an unsigned char value */
    {
      *str_trimmed = *str_untrimmed;
      str_trimmed++;
    }
    str_untrimmed++;
  }
  *str_trimmed = '\0';
}

In this code, the source string "str_untrimmed" is left untouched, which is guaranteed by using proper const correctness. It does not crash if the source string contains nothing but a null termination. It always null terminates the destination string.

Memory allocation is left to the caller. The algorithm should only focus on doing its intended work. It removes all white spaces.

There are no subtle tricks in the code. It does not try to squeeze in as many operators as possible on a single line. It will make a very poor candidate for the IOCCC. Yet it will yield pretty much the same machine code as the more obscure one-liner versions.

When copying something, you can however optimize a bit by declaring both pointers as restrict, which is a contract between the programmer and the compiler, where the programmer guarantees that the destination and source do not overlap. This allows more efficient optimization, since the compiler can then copy straight from source to destination without having to allow for a write through one pointer changing what is read through the other.
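
If being able to pass the same buffer as both source and destination matters more than the `restrict` optimization, one possible variant simply drops the qualifier. This is a sketch under that assumption; the name `remove_spaces_relaxed` is hypothetical, and the `unsigned char` cast is added because `isspace` requires it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical variant without restrict: dst may equal src.
   The write position never overtakes the read position, so in-place use is safe.
   dst must still be large enough for the result, as documented above. */
void remove_spaces_relaxed(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while (*src != '\0')
    {
        if (!isspace((unsigned char)*src))
        {
            *dst = *src;
            dst++;
        }
        src++;
    }
    *dst = '\0';   /* explicit null termination, as in the original */
}
```

With this version, both `remove_spaces_relaxed(out, in)` on separate buffers and `remove_spaces_relaxed(buf, buf)` are valid calls. Calling the `restrict` version with the same pointer twice would break its contract.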

Lundin
  • Why use the `restrict` keyword? There is no reason you shouldn't be able to pass the same pointer as source and destination, and your code supports that. – chqrlie Oct 12 '16 at 21:20
  • @chqrlie It could be removed, certainly, at the expense of slower code in the generic use-case. I don't think I have benchmarked this code, but I suspect it shouldn't make that much of a difference. – Lundin Oct 13 '16 at 06:40
  • 1
    This is the most sensible answer I've seen. It's clear, concise and well understandable by a beginner! Thank you. – PageMaker Feb 03 '21 at 18:24
  • I'd replace `str_untrimmed` by `scattered` and `str_trimmed` by `condensed`. – Wolf Aug 12 '21 at 12:47
  • 1
    @Wolf Good for you. Now kindly stop vandalizing people's posts with minor superfluous edits or to change the coding style to your personal preference. You apparently have too high rep for getting edits reviewed, or you'd have an edit ban incoming. – Lundin Aug 12 '21 at 14:38
  • @Lundin Thanks for letting me know that this kind of edits is considered harmful. Nevertheless, `trim` is the word that indicates (in most languages/libraries) removing leading and trailing spaces from strings. – Wolf Aug 12 '21 at 15:22
9

In C, you can replace some strings in-place, for example a string returned by strdup():

char *str = strdup(" a b c ");

char *write = str, *read = str;
do {
   if (*read != ' ')
       *write++ = *read;
} while (*read++);

printf("%s\n", str);
free(str); /* strdup() allocates, so release the memory when done */

Other strings are read-only, for example string literals. You'd have to copy those to a newly allocated area of memory and fill the copy by skipping the spaces:

const char *oldstr = " a b c ";

char *newstr = malloc(strlen(oldstr)+1);
char *np = newstr;
const char *op = oldstr;
do {
   if (*op != ' ')
       *np++ = *op;
} while (*op++);

printf("%s\n", newstr);
free(newstr); /* release the copy when done */

You can see why people invented other languages ;)

Andomar
  • Your second example forgets to properly terminate the destination string. – caf Nov 13 '09 at 00:20
  • ..and your first example doesn't do the right thing at all (eg if the string starts off with two non-space characters). – caf Nov 13 '09 at 00:21
  • @caf: The while loop will run for the \0 terminator, because it's `while (*(op++))` and not `while (*(++op))` – Andomar Nov 13 '09 at 00:22
  • That's true, which means the it's still buggy, because it skips the first character regardless of whether it's a space or not. – caf Nov 13 '09 at 00:45
  • You can common up the loop here: `void copyExceptSpace(char*, const char*);`, `void removeSpace(char *s) { copyExceptSpace(s,s); }`, `char *dupExceptSpace(const char *s) { char *n = malloc(strlen(s)+1); if (n) copyExceptSpace(n,s); return n; }`. Or something like that. – Steve Jessop Nov 13 '09 at 01:33
  • Why mix the algorithm "remove spaces" together with memory allocation? There is no reason to do so. Avoid strdup() because it isn't standard. Never cast the result from malloc(). – Lundin May 21 '15 at 09:36
  • both these code examples leak memory. Fix by calling free() on the memory pointer returned strdup() /malloc(). That leak has been in these examples for nearly nine years. – Stephen Kellett May 29 '18 at 20:20
2

If you are still interested, this function removes spaces from the beginning of the string only. I just had it working in my code:

void removeSpaces(char *str1)  
{
    char *str2; 
    str2=str1;  
    while (*str2==' ') str2++;  
    if (str2!=str1) memmove(str1,str2,strlen(str2)+1);  
}
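
As a hedged extension of the same `memmove` idea, the sketch below also drops trailing spaces. The function name `trim_spaces` is made up for this example and is not part of the answer above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical companion to removeSpaces(): strips leading and trailing ' '. */
void trim_spaces(char *str)
{
    assert(str != NULL);
    char *start = str;
    while (*start == ' ') {
        start++;                         /* skip leading spaces */
    }
    size_t len = strlen(start);
    while (len > 0 && start[len - 1] == ' ') {
        len--;                           /* ignore trailing spaces */
    }
    memmove(str, start, len);            /* regions may overlap: memmove, not memcpy */
    str[len] = '\0';
}
```

Spaces in the middle of the string are kept, so `"  a b  "` should become `"a b"`.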
Alfredo
2
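#include <ctype.h>

char * remove_spaces(const char * source, char * target)
{
     /* Copy every non-whitespace character, then terminate the result. */
     for (; *source != '\0'; source++)
     {
        if (!isspace((unsigned char)*source))
             *target++ = *source;
     }
     *target = '\0';
     return target;   /* points at the terminator of the result */
}

Notes: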

  • This doesn't handle Unicode.
quark
  • 3
    Won't this skip the first character? – Aaron Nov 13 '09 at 00:17
  • 2
    You should cast the value passed to `isspace` to `unsigned char`, since that function is defined to accept a value either in the range of `unsigned char`, or EOF. – caf Nov 13 '09 at 00:22
  • 2
    It still removes the first character, and fails if it is called with `target` contating '\0' in its first element (I don't get what is the purpose of checking its contents). Changing the `while(*source++ && *target) {...}` to `do {...} while(*source++);` seems to work fine. – mMontu May 24 '12 at 14:53
  • 1
    Did you mean `ctype.h`? – Spikatrix May 20 '15 at 14:29
  • 3
    1) Fails to remove an initial space in `source`. 2) Never appends a terminating null character to `target` if `source == ""`. 3) Depends on value in `target[0]`. – chux - Reinstate Monica May 20 '15 at 20:51
  • 4) `return target;` ?? – BLUEPIXY Mar 03 '17 at 04:15
1
#include<stdio.h>
#include<string.h>
int main(void)
{
  size_t n;
  char str[]="        Nar ayan singh              ";
  printf("strlen str:%zu\n",strlen(str));
  /* drop leading spaces; memmove because source and destination overlap */
  while(str[0]==' ')
   {
     memmove(str,str+1,strlen(str));
   }
  printf("strlen str:%zu\n",strlen(str));
  /* drop trailing spaces */
  n=strlen(str);
  while(n>0 && str[n-1]==' ')
    n--;
  str[n]='\0';
  printf("str:%s ",str);
  printf("strlen str:%zu\n",strlen(str));
  return 0;
}
Praveen
1

The easiest and most efficient way to remove spaces from a string is to simply remove the spaces from the string literal. For example, use your editor to 'find and replace' "hello world" with "helloworld", and presto!

Okay, I know that's not what you meant. Not all strings come from string literals, right? Supposing this string you want spaces removed from doesn't come from a string literal, we need to consider the source and destination of your string... We need to consider your entire algorithm, what actual problem you're trying to solve, in order to suggest the simplest and most optimal methods.

Perhaps your string comes from a file (e.g. stdin) and is bound to be written to another file (e.g. stdout). If that's the case, I would question why it ever needs to become a string in the first place. Just treat it as though it's a stream of characters, discarding the spaces as you come across them...

#include <stdio.h>

int main(void) {
    for (;;) {
        int c = getchar();
        if (c == EOF) { break;    }
        if (c == ' ') { continue; }
        putchar(c);
    }
}

By eliminating the need for storage of a string, not only does the entire program become much, much shorter, but theoretically also much more efficient.

autistic
  • 2
    The question does not mention string literals at all. But you have to assume that a string literal can be passed to the function. And what if the input comes from somewhere else, for example you are writing some sort of text parser. – Lundin May 21 '15 at 11:41
  • When questioning the efficiency of a program we must consider the entire program, not just a small part of it. That is what I'm trying to get across here, and I think you missed that point, @Lundin. – autistic May 21 '15 at 23:50
1
/* Function to remove all spaces from a given string.
   https://www.geeksforgeeks.org/remove-spaces-from-a-given-string/
*/
void remove_spaces(char *str)
{
    int count = 0;
    for (int i = 0; str[i]; i++)
        if (str[i] != ' ')
            str[count++] = str[i];
    str[count] = '\0';
}
fja0568
0

Code taken from zString library

/* search for character 's' */
int zstring_search_chr(const char *token,char s){
    if (!token || s=='\0')
        return 0;

    for (;*token; token++)
        if (*token == s)
            return 1;

    return 0;
}

char *zstring_remove_chr(char *str,const char *bad) {
    char *src = str , *dst = str;

    /* validate input */
    if (!(str && bad))
        return NULL;

    while(*src)
        if(zstring_search_chr(bad,*src))
            src++;
        else
            *dst++ = *src++;  /* assign first, then increment */

    *dst='\0';
    return str;
}

Code example

  Example Usage
      char s[]="this is a trial string to test the function.";
      const char *d=" .";
      printf("%s\n",zstring_remove_chr(s,d));

  Example Output
      thisisatrialstringtotestthefunction

Have a look at the zString code, you may find it useful: https://github.com/fnoyanisi/zString

fnisi
  • Why do you repeatedly check if the passed parameter is NULL, over and over again? The superfluous NULL check makes this the least efficient version of all posted. Why not use standard `strpbrk` instead of your home-brewed version? And where is the const correctness? – Lundin Mar 07 '16 at 12:46
  • Right, the first `if` statement could be removed and checkes could well be done within the logical test part of the `for` loop, thanks for that, I will look into that.....>> Why not use standard `strpbrk` instead of your home-brewed version? Just wrote this code (the whole zString thing) for fun, and tried not to use standard functions at all. So, no harm to say it is a _fun project_, but this of course should not stop anybody from contributing the code – fnisi Mar 08 '16 at 20:12
  • unlike what the comment says, `zstring_search_chr` does not return the *index of a chr*, its `char*` argument should be `const` qualified. The function `zstring_remove_chr` is quite inefficient. – chqrlie Oct 12 '16 at 21:28
  • @chqrlie, updated comments and the code for `zstring_remove_chr()`. I would love to see a more efficient version of `zstring_remove_chr()` or some recommendations from you. thanks – fnisi Oct 27 '16 at 21:42
  • 1
    You could post the code on http://codereview.stackexchange.com . I shall write a review if you do. There is indeed a few ideas for improvement. – chqrlie Oct 28 '16 at 12:52
0

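That's the easiest I could think of:

char message[50];
size_t i, j;
if (fgets(message, 50, stdin) != NULL) {
    for( i = 0, j = 0; message[i] != '\0'; i++){
        message[i-j] = message[i];   /* a copied space is overwritten by the next character */
        if(message[i] == ' ')
            j++;
    }
    message[i-j] = '\0';   /* terminate at the new length, not the old one */
}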
nAiN
0

Here is the simplest thing I could think of. Note that this program uses the first command-line argument (argv[1]) as the line to delete spaces from.

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

/*The function itself with debug printing to help you trace through it.
  Returns a newly allocated string (NULL on failure); the caller frees it.*/

char* trim(const char* str)
{
    size_t len = strlen(str);      /* strlen, not sizeof: sizeof(str) is the size of a pointer */
    char* res = malloc(len + 1);
    size_t index = 0;

    if (res == NULL)
        return NULL;
    res[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        if (str[i] != ' ')
        {
            res[index] = str[i];
            index++;
        }
        res[index] = '\0';         /* keep res a valid string for the debug output */
        printf("End of iteration %zu\n", i);
        printf("Here is the initial line: %s\n", str);
        printf("Here is the resulting line: %s\n", res);
        printf("\n");
    }
    return res;
}

int main(int argc, char* argv[])
{
    //trim function test

    if (argc < 2) {
        fprintf(stderr, "usage: %s \"text with spaces\"\n", argv[0]);
        return 1;
    }
    const char* line = argv[1];
    printf("Here is the line: %s\n", line);

    char* res = trim(line);
    if (res == NULL)
        return 1;

    printf("\nAnd here is the formatted line: %s\n", res);

    free(res);
    return 0;
}
0

This is implemented on a microcontroller and it works. It should avoid all problems; it is not a smart way of doing it, but it will work :)

void REMOVE_SYMBOL(char* string, uint8_t symbol)
{
  uint32_t size = LENGHT(string); // simple string length function, made my own, since original does not work with string of size 1
  uint32_t i = 0;
  uint32_t k = 0;
  uint32_t loop_protection = size*size; // guards against a loop that never terminates
  while(i<size)
  {
    if(string[i]==symbol)
    {
      k = i;
      while(k<size)
      {
        string[k]=string[k+1];
        k++;
      }
    }
    if(string[i]!=symbol)
    {
      i++;
    }
    loop_protection--;
    if(loop_protection==0)
    {
      i = size;
      break;
    }
  }
}
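
For comparison, a single-pass sketch with the same parameters, assuming only that `string` is a writable null-terminated buffer. The name `remove_symbol_linear` is hypothetical. It needs no length helper and no loop guard, because each character is read exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical linear-time variant of REMOVE_SYMBOL. */
void remove_symbol_linear(char *string, uint8_t symbol)
{
    assert(string != NULL);
    char *write = string;                      /* next position to fill */
    for (const char *read = string; *read != '\0'; read++) {
        if ((uint8_t)*read != symbol) {
            *write++ = *read;                  /* keep everything except symbol */
        }
    }
    *write = '\0';                             /* terminate at the new length */
}
```

The shifting version moves the whole tail of the string for every match, which is quadratic in the worst case; this one does a constant amount of work per character.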
  • *This is implemented in micro controller and it works*: I'm afraid not. This solution is very inefficient (quadratic time complexity) and incorrect: instead of setting the null terminator, it duplicates the last byte of the string. The `loop_protection` kludge was added to attempt to fix the infinite loop on strings containing only spaces. It does not fix the issue and may even backfire on long strings. Study much simpler solutions from the other answers. – chqrlie Aug 12 '21 at 13:04
0

While this is not as concise as the other answers, it is very straightforward to understand for someone new to C, adapted from the Calculix source code.

char* remove_spaces(char * buff, int len)
{
    int i=-1,k=0;
    while(1){
        i++;
        if((i==len)||(buff[i]=='\0')||(buff[i]=='\n')||(buff[i]=='\r')) break; /* check the bound before reading buff[i] */
        if((buff[i]==' ')||(buff[i]=='\t')) continue;
        buff[k]=buff[i];
        k++;
    }
    buff[k]='\0';
    return buff;
}
-1

I assume the C string is in a fixed memory, so if you replace spaces you have to shift all characters.

The easiest way seems to be to create a new string, iterate over the original one, and copy only the non-space characters.
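
A minimal sketch of that idea, assuming heap allocation is acceptable. The helper name `copy_without_spaces` is made up for this example, and the caller is responsible for calling `free` on the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of src without ' '
   characters, or NULL if allocation fails. The original is left untouched. */
char *copy_without_spaces(const char *src)
{
    assert(src != NULL);
    char *dst = malloc(strlen(src) + 1);   /* result is never longer than src */
    if (dst == NULL) {
        return NULL;
    }
    char *out = dst;
    for (; *src != '\0'; src++) {
        if (*src != ' ') {
            *out++ = *src;                 /* copy only non-space characters */
        }
    }
    *out = '\0';
    return dst;
}
```

Because the source is `const` and never modified, this also works when the input is a string literal.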

stefanB
-1

I came across a variation of this question where you need to reduce each run of multiple spaces to a single space that "represents" the run.

This is my solution:

char str[] = "Put Your string Here.....";

int copyFrom = 0, copyTo = 0;

printf("Start String %s\n", str);

while (str[copyTo] != 0) {
    if (str[copyFrom] == ' ') {
        str[copyTo] = str[copyFrom];
        copyFrom++;
        copyTo++;

        while ((str[copyFrom] == ' ') && (str[copyFrom] !='\0')) {
            copyFrom++;
        }
    }

    str[copyTo] = str[copyFrom];

    if (str[copyTo] != '\0') {
        copyFrom++;
        copyTo++;
    }
}

printf("Final String %s\n", str);

Hope it helps :-)
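
The same collapsing can also be sketched with a read pointer and a write pointer. This is an alternative formulation, not the code above, and the name `squeeze_spaces` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: collapses each run of ' ' into a single ' ', in place. */
void squeeze_spaces(char *str)
{
    assert(str != NULL);
    char *write = str;
    char prev = '\0';                          /* last character that was kept */
    for (const char *read = str; *read != '\0'; read++) {
        if (*read == ' ' && prev == ' ') {
            continue;                          /* drop repeated spaces */
        }
        prev = *read;
        *write++ = *read;
    }
    *write = '\0';
}
```

Leading and trailing runs are reduced to one space rather than removed, matching the behavior described above.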

Jacob G.
eyaler