0

I have a C string and I'd like to shorten it, so that it gets cut from the first occurrence of '$'. Here's my code:

int char_search(char exp[], int s, char what) {
    int i, occurrence=-1;
    for (i=0; i < s && occurrence == -1; ++i)
        if (exp[i] == what)
            occurrence = i;

    return occurrence;
}

int shorten(char *exp, int maxlength, char *exp_new) {
    int l, i;
    l = char_search(exp, maxlength, '$');
    exp_new = (char *) malloc((l+1)*sizeof(char));
    exp_new[l] = '\0';
    for (i = 0; i<l; i++) 
        exp_new[i] = exp[i];

    return l;
}

The problem is it starts to overwrite the exp_new pointer address, and only copies the first char to the actual array. Also, exp_new returns NULL for some reason. (The string lengths might not be correct, but that shouldn't screw up the whole thing.)

  • OT: `shorten()` misses to test the result of `char_search()` against `-1`. – alk Nov 12 '13 at 10:16
  • I'll take care of that, that shouldn't be the cause of error either. However, thanks for the reminder! – balinthaller Nov 12 '13 at 10:18
  • 1
    [Please don't cast the return value of `malloc()` in C](http://stackoverflow.com/a/605858/28169). Also, use [`strchr()`](http://linux.die.net/man/3/strchr) when you need it. Finally, consider not scaling allocation sizes by `sizeof (char)` which is always 1 anyway. – unwind Nov 12 '13 at 10:19

4 Answers4

1

When you want to return a pointer through a function parameter you need to use a double indirection:

int shorten(char *exp, int maxlength, char **exp_new)

A better signature would be

char* shorten(char *exp, int maxlength, int *length)

where you return the result directly.

Edit: When the function name "shorten" describes what it is ment to do the obvious result should be the shortened string. The length of that string resulting from the search function is a negligible result. The function could be changed so that the length parameter could accept a NULL pointer. In that case you could omitting a length result if it's not interesting for the caller. But that depends on the environment.

harper
  • 13,345
  • 8
  • 56
  • 105
  • I'd better replace "*better*" with "*other*". As there is no obvious reason why the 2nd should be the preferable way to go. – alk Nov 12 '13 at 10:13
  • A _better_ signature would be `some_error_code shorten(const char* exp, int maxlength, char** exp_new, int* new_length);`. There are two error cases in the function that need to be checked. Generally in C program design, you want to reserve the return value for error codes. – Lundin Nov 12 '13 at 10:45
  • There are more than two error cases: negative maxlength, NULL or invalid pointer for the string that should be shortened, NULL or invalid ptr for the out paramter(s), out of memory. Interfaces are contracts and if the contracts says "the function indicates problem with a NULL return value" it is concluded. – harper Nov 12 '13 at 21:36
0

Several issues here:

  1. To return a value from a function

    • pass an address to an instance of what shall be returned and in the function dereference the address to store the data.
    • return the value from the function via return.
  2. When addressing array elements no negative values are needed and a simple int might not be wide enough, so use size_t. If as in your case -1 should also be used use ssize_t. Both such types are guaranteed by the C/POSIX Standard to be wide enough to address all of the machines memory and with this every possible array element.

  3. In C don't cast the result of malloc()/calloc()/realloc().

  4. Declare const what is constant. This helps the compiler to optimise and you to write better code.

  5. Always test functions which could fail whether they did. This applies to system calls as well as to your own functions.

    Error checking is missing for:

    • malloc()
    • char_search()

Taking into account the points above your code might be changed to look like the following:

 ssize_t char_search(const char * exp, const size_t s, const char what) 
 {
   ssize_t occurrence = -1;

   for (size_t i = 0; i < s && occurrence == -1; ++i)
   {
     if (exp[i] == what)
     {
       occurrence = i;
     }
   }

   return occurrence;
}

ssize_t shorten(const char * exp, const size_t maxlength, char ** pexp_new) 
{
  ssize_t l = char_search(exp, maxlength, '$');
  if ((NULL != pexp_new) && (-1 != l)) /* This way if passing in NULL 
                                      as 3rd argument you just get l. */
  {
    *pexp_new = malloc((l + 1) * sizeof **pexp_new);
    if (NULL = *pexp_new)
    {
      l = -1; /* Indicate failure. */
    }
    else
    {
      size_t i = 0;

      (*pexp_new)[l] = '\0';

      for (; i < l; ++i) 
      {
        (*pexp_new)[i] = exp[i];
      }
    }
  }

  return l;
}
alk
  • 69,737
  • 10
  • 105
  • 255
0

There are plenty of bugs in your code.

l = char_search(exp, maxlength, '$'); You need to check the result so that it isn't -1.

(char *) malloc((l+1)*sizeof(char)); You need to check the result of malloc. Also, never cast the result of malloc.

for (i = 0; i<l; i++) This doesn't copy the null termination, it is a bit strange that you null terminate first and copy the data afterwards. I would replace the loop with memcpy(exp_new, exp, l); and after that execute exp_new[l] = '\0';.

The true source of the problem is obscure variable names. Nobody, including yourself, can tell what l is supposed to mean. Call it str_length or something, so that the reader can easily tell that "ah yes this is the length of the string, so it doesn't include null termination". Furthermore l and 1 looks exactly the same on some code editors, making it an bad choice of variable name in general.

Also, exp should be const char* for both functions.

char *exp_new stores a pointer on the stack. That pointer only exists throughout the execution of the function. If you want to return the allocated memory to the caller, you must change the parameter to char** exp_new.

Generally however, returning pointers to malloc'd memory is somewhat bad and dangerous practice. Make sure that the memory is freed somewhere.

Lundin
  • 195,001
  • 40
  • 254
  • 396
  • Referring the `0`-termination: There is this line `exp_new[l] = '\0';` in the OP's code. – alk Nov 12 '13 at 10:45
  • @alk Hmm, ah yeah so it isn't a bug, just weird order. At the point where the null termination is added, the data will be `"junkjunkjunk\0". And then the copy replaces the junk with data afterwards. You normally overwrite the junk first, then append the null. – Lundin Nov 12 '13 at 10:53
  • 1
    Edited the post so that it doesn't state that this is a bug, just a bit obscure coding. – Lundin Nov 12 '13 at 10:54
0

Also, exp_new returns NULL for some reason. (The string lengths might not be correct, but that shouldn't screw up the whole thing.)

This is a bit worrying. You dismiss the fact that exp_new "returns" NULL, when this is a major warning sign that something isn't right.

The last thing I would expect is that writing to a null pointer would produce useful results.

Are you describing the problem properly?

If you want to return the exp_new to the caller you won't do it like this. You have to declare it as char **, and assign the locally allocated string

char *local_exp_new = malloc(...) ;
*exp_new = local_exp_new ;
user1338
  • 225
  • 1
  • 6