-2

I have a problem with printing a string in C (well, the string that *ptr points to).

I have the following code:

char *removeColon(char *word) {
    size_t wordLength;
    char word1[MAXLENGTH];

    wordLength = strlen(word);
    wordLength--;
    memcpy(word1, word, wordLength);
    printf("word1: %s\n", word1);
    return *word1;
}

I ran this with word = "MAIN:" (the value of word comes from strtok on a string read from a file). It works fine until the printf, where the result is:

word1: MAIN╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠

and then there is an exception and everything breaks.

Any thoughts?

chqrlie
  • 131,814
  • 10
  • 121
  • 189
Elad Edri
  • 331
  • 2
  • 9
  • 4
    null terminator, (guess) – Martin James Mar 20 '16 at 01:10
  • Can you please provide more details? I tried to add '\0' to the end of word1, but it didn't help. – Elad Edri Mar 20 '16 at 01:11
  • If it's not, I may read the question later. – Martin James Mar 20 '16 at 01:11
  • 1
    When using memcpy() you need to append a null byte '\0' to the string, before doing any output with it.*(word1+wordLength +1)=0; is one way to do it. – Arif Burhan Mar 20 '16 at 01:13
  • C does not have a string type. It is all convention of the library using a sequence of `char` (i.e. an array). `memcpy` is not called `str(ing)cpy` for a good reason. – too honest for this site Mar 20 '16 at 01:13
  • 1
    C 'strings' are null-terminated char arrays. If you don't put the null at the end, the char array stops being a C-string and becomes undefined behaviour. – Martin James Mar 20 '16 at 01:13
  • A screenful of crap followed by a segfault is not unexpected behavior. – Martin James Mar 20 '16 at 01:15
  • 1
    `return *word1;` is an error. Your compiler would tell you about this. You must fix the things the compiler tells you about before trying to run your program. – M.M Mar 20 '16 at 01:16
  • ^^ oh, and all the other bugs, like returning pointers to local vars :( – Martin James Mar 20 '16 at 01:17
  • `gcc -Wall -Wextra file.c -o file` and possibly with the `-Werror` and/or `-g` flag as well. – RastaJedi Mar 20 '16 at 01:46
  • 1
    @MartinJames: *a screenful of crap followed by a segfault* **is** a good example of undefined behavior. Btw, the OP is only attempting to return a pointer to local storage: in reality, he is returning a `char`, which is incorrect given the function return type. I am so fed up with compilers that accept such errors with or even without a mere warning message. – chqrlie Mar 20 '16 at 01:46
  • @chqrlie it also accepts it even if there is a proper prototype that doesn't have an empty parameter list or a proper? What I'm getting at is that whole "non-prototype definition" thing--is it properly prototyped so the compiler knows about the type and number of arguments? – RastaJedi Mar 20 '16 at 01:49
  • @EladEdri [This](http://stackoverflow.com/questions/35923938/having-issues-when-printing-an-array-of-chars/35924131#35924131) has a decent explanation about null-terminated strings. – RastaJedi Mar 20 '16 at 01:51
  • @RastaJedi: I am not referring to the prototype, even just the return type is wrong: with `clang -Weverything` I only get a warning: `remlast.c:15:12: warning: incompatible integer to pointer conversion returning 'char' from a function with result type 'char *'; remove * [-Wint-conversion]`. `-Werror` is definitely useful. – chqrlie Mar 20 '16 at 02:05
  • ╠ is 0xCC in codepage 437, and [MSVC fills 0xCC to uninitialized memory to help debugging](https://stackoverflow.com/q/370195/995714). That means you've accessed uninitialized memory. You can find tons of questions about ╠ and 0xCC here on SO – phuclv Aug 18 '18 at 10:55

4 Answers4

1

You must make sure (1) each string is nul-terminated, and (2) you are not attempting to modify a string-literal. You have many approaches you can take. A simple approach to remove the last character (any char) with strlen:

char *rmlast (char *s)
{
    if (!*s) return s;      /* return if empty-string */
    s[strlen (s) - 1] = 0;  /* overwrite last w/nul   */
    return s;
}

(you can also use the string.h functions strchr (searching for 0), strrchr (searching for your target char, if passed), strpbrk (searching for one of several chars), etc.. to locate the last character as well)

Or you can do the same thing with pointers:

char *rmlast (char *s)
{
    if (!*s) return s;  /* return if empty-string */
    char *p = s;

    for (; *p; p++) {}  /* advance to end of str  */
    *--p = 0;           /* overwrite last w/nul   */

    return s;
}

You can also pass the last character of interest if you want to limit removal to any specific character and make a simple comparison in the function before overwriting it with a nul-terminating character.

Look over both and let me know if you have any questions.

David C. Rankin
  • 81,885
  • 6
  • 58
  • 85
  • Isn't 0 (an integer) different than '\0'? – RastaJedi Mar 20 '16 at 05:01
  • No, the `nul-character` `'\0'` has the ASCII value of `0` -- there is no difference. It just takes more typing to type the four characters `'\0'` than one `0` `:)` – David C. Rankin Mar 20 '16 at 05:27
  • So it's just the character representation. I wasn't sure if the integer zero was somehow represented by something other than the "null byte". Obviously '0' is totally different but that didn't even need to be stated. – RastaJedi Mar 20 '16 at 05:31
  • Yep, you got it ASCII `'\0'` is `0`, like just like ASCII `'0'` is `0x30` (decimal `48)`. Take a quick look at [**ASCIItable.com**](http://www.asciitable.com/). – David C. Rankin Mar 20 '16 at 05:44
1

Your function removeColon should either

  • operate in place and modify the string passed as an argument
  • be given a destination buffer and copy the shortened string to it or
  • allocate memory for the shortened string and return that.

You copy just the characters into the local array, not the null terminator, nor do you set one in the buffer, passing this array to printf("%s", ...) invokes undefined behavior: printf continues printing the buffer contents until it finds a '\0' byte, it even goes beyond the end of the array, invoking undefined behavior, printing garbage and eventually dies in a crash.

You cannot return a pointer to an automatic array because this array becomes unavailable as soon as the function returns. Dereferencing the pointer later will invoke undefined behavior.

Here is a function that works in place:

char *removeColon(char *word) {
    if (*word) word[strlen(word) - 1] = '\0';
    return word;
}

Here is one that copies to a destination buffer, assumed to be long enough:

char *removeColon(char *dest, const char *word) {
    size_t len = strlen(word);
    memcpy(dest, word, len - 1);
    dest[len - 1] = '\0';
    return dest;
}

Here is one that allocates memory:

char *removeColon(const char *word) {
    size_t len = strlen(word);
    char *dest = malloc(len);
    memcpy(dest, word, len - 1);
    dest[len - 1] = '\0';
    return dest;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
0
wordLength = strlen(word);

You have to include the null terminator in the length, because every string has a terminating character whose ASCII value is 0, spelled \0 in C. Also, use the str... family of functions instead of mem..., since the former is intended for null terminated strings, but the latter for arrays. In addition, you cannot return a local stack allocated array. Based on the code of the function, it sounds like you're removing the last character. If that is the case, it is better to do

void remlast(char *str)
{
    str[strlen(str) - 1] = '\0';
}

Note that this does not work on empty strings.

lost_in_the_source
  • 10,998
  • 9
  • 46
  • 75
0

You copy over wordLength bytes, but you fail to add a null terminating byte. Because word1 is uninitialized prior to this copy, the remaining bytes are undefined.

So when printf attempts to print the string, it doesn't find a null terminator and keeps reading until it finds a null byte somewhere outside the bounds of the array. This is undefined behavior.

After copying the bytes, you need to manually add the null terminator:

memcpy(word1, word, wordLength);
word1[wordLength] = '\0';

Also, you're returning a pointer to a local variable. When the function returns, that variable is out of scope, and dereferencing that pointer is also undefined behavior.

Rather than making word1 a local array, you can allocate memory dynamically for it:

char *word1 = malloc(strlen(word));

If you do this, you'll need to free this memory somewhere in the calling function. The other option is to have the caller pass in a buffer of the proper size:

void removeColon(char *word, char *word1) {
dbush
  • 205,898
  • 23
  • 218
  • 273