0

I'm trying to break a string str containing the bar symbol | into an array of strings(output), with the delimiter being |, which will not be included in the array of strings. There will be 20 elements in output. This code belongs to a function that will return output pointer, which is why I need dynamic allocation.

I'm trying to do this without using the sscanf() function.

Example: if str is "abc|def||jk" then this is what output should look like at the end (less than 20 elements for demonstration purpose):

output[0]=="abc"

output[1]=="def"

output[2]==""

output[3]=="jk"

However, I always get an error exit code, something like:

Process finished with exit code -1073740940 (0xC0000374)

When debugging I found out that the first string element is parsed correctly into output, but the second element is produced correctly sometimes, and other times I ran into trouble.

Code below:

char **output = (char**) calloc(20, 20*sizeof(char));
int begin = 0;
int end = 1;
// index that counts output element
int arrayIndex = 0;
int i;
for(i = 0; i < strlen(str); i++) {
    end = i;
    bool endOfString = false;
    // there is no '|' symbol at the end of the string
    if (*(str+i) == '|' || (endOfString = (i+1)==strlen(str))) {
        end = endOfString ? (end+1):end;

        // problem here. Assembly code poped up when debugging (see image below)
        char *target = (char*) calloc(end-begin+1, sizeof(char));

        // give proper value to target
        strcpy(target, str+begin);
        *(target+end-begin) = '\0';
        *(output+arrayIndex) = target;
        // increase proper indexes
        begin = i + 1;
        arrayIndex++;

    }
}

The worst of all is that I cannot debug it because a window with assembly code pops up the instance I step over the calloc function when debugging. Assembly code window

I used gdb too, but it didn't work:

56 char target = (char) calloc(length, sizeof(char));

(gdb) n

warning: Critical error detected c0000374

Thread 1 received signal SIGTRAP, Trace/breakpoint trap.

0x00007ffded8191f3 in ?? ()

(gdb) bt

#0 0x00007ffded8191f3 in ?? ()

Backtrace stopped: previous frame identical to this frame (corrupt stack?)

(gdb) continue

Continuing.

gdb: unknown target exception 0xc0000374 at 0x7ffded819269

Community
  • 1
  • 1
Jason
  • 15
  • 1
  • 5
  • That's just the `calloc` function's assembly. You can ignore it, it's pointless. Instead, use a real command-line debugger like `gdb`. – S.S. Anne Oct 16 '19 at 00:08
  • 1
    @JL2210 Sorry I meant to say "step over" instead of "step into". I used gdb and it didn't work either. Just edited my question. – Jason Oct 16 '19 at 00:15
  • Once you get `SIGTRAP`, type in `bt` or `continue`. – S.S. Anne Oct 16 '19 at 00:16
  • @JL2210 It shows "Backtrace stopped: previous frame identical to this frame (corrupt stack?)" – Jason Oct 16 '19 at 00:17
  • Looks like you're corrupting the stack. I have to go. Good night. – S.S. Anne Oct 16 '19 at 00:18
  • This isn't a [MCVE]. We have no idea where `str` comes from, nor is it compilable on its own. You clearly have issues with `output` (which you allocate as if you mean to make a contiguous 2D array of `char`, but then use like an array of `char*`; [there's a difference](https://stackoverflow.com/q/17477234/364696).), and don't understand the behavior of `strcpy` (`strcpy(target, target);` is violates the `strcpy` API requirements, which demand no overlap in arguments), but neither of those problems should cause stack corruption; you probably have yet more problems in the code you haven't shown – ShadowRanger Oct 16 '19 at 01:07
  • 1) See the answer here for proper use of calloc for what you are trying to do: https://stackoverflow.com/questions/24345089/using-calloc-to-set-up-char-array-also-freeing-array-when-done. 2) Make sure 'str' is null-terminated. Otherwise strlen will probably not return a correct value – MikeFromCanmore Oct 16 '19 at 02:01

1 Answers1

0

(edit) indexing was OK. Couple of notes: sizeof(char) is always 1 -- just use 1. Further, there is no need to cast the return of malloc, it is unnecessary. See: Do I cast the result of malloc?. Additionally, how many times do you call strlen(str)? (hopefully optimization limits the loop condition to a single call, but you could potentially be calling strlen(str) for every iteration). Same with endOfString = (i+1)==strlen(str). Save the length of the string before entering the loop and then just use the saved value to compare against.

While you can count indexes as you have and loop character-by-character looking for delimiters, it is far more efficient to let strchr (pointer, '|') advance to the next delimiter (or return NULL indicating no more delimiters remain). Then rather than worrying about indexes, simply keep a pointer p and end-pointer ep to advance down the string, e.g.

#define NFIELDS 20
...
char **splitstr (const char *s, const char delim, size_t *n)
{
    const char  *p = s,     /* pointer for parsing */
                *ep = s;    /* end pointer for parsing */
    *n = 0;                                 /* zero string counter */
    while (*n < NFIELDS && *p) {            /* loop while less than NFIELDS */
        size_t len;
        if ((ep = strchr (p, delim)))       /* get pointer to delim */
            len = ep - p;
        else
            len = strlen (p);               /* or get length of final string */
        if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
        ...
        memcpy (output[*n], p, len);        /* copy chars to output[n] */
        output[(*n)++][len] = 0;            /* nul-terminate to make string */
        if (!ep)                            /* if delim not found, last */
            break;
        p = ++ep;                           /* update p to 1-past delim */
    }
    ...

Adding appropriate error checking and returning the pointer to the allocated strings, you could do:

char **splitstr (const char *s, const char delim, size_t *n)
{
    const char  *p = s,     /* pointer for parsing */
                *ep = s;    /* end pointer for parsing */
    char **output = calloc (NFIELDS, sizeof *output);   /* pointer to output */

    if (!output) {
        perror ("calloc-output");
        return NULL;
    }

    *n = 0;                                 /* zero string counter */
    while (*n < NFIELDS && *p) {            /* loop while less than NFIELDS */
        size_t len;
        if ((ep = strchr (p, delim)))       /* get pointer to delim */
            len = ep - p;
        else
            len = strlen (p);               /* or get length of final string */
        if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
            perror ("malloc-output[n]");
            while ((*n)--)          /* free prior allocations on failure */
                free (output[*n]);
            free(output);
            return NULL;
        }
        memcpy (output[*n], p, len);        /* copy chars to output[n] */
        output[(*n)++][len] = 0;            /* nul-terminate to make string */
        if (!ep)                            /* if delim not found, last */
            break;
        p = ++ep;                           /* update p to 1-past delim */
    }

    return output;      /* return pointer to allocated strings */
}

In your case a complete short example would be:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NFIELDS 20
#define MAXC  1024

char **splitstr (const char *s, const char delim, size_t *n)
{
    const char  *p = s,     /* pointer for parsing */
                *ep = s;    /* end pointer for parsing */
    char **output = calloc (NFIELDS, sizeof *output);   /* pointer to output */

    if (!output) {
        perror ("calloc-output");
        return NULL;
    }

    *n = 0;                                 /* zero string counter */
    while (*n < NFIELDS && *p) {            /* loop while less than NFIELDS */
        size_t len;
        if ((ep = strchr (p, delim)))       /* get pointer to delim */
            len = ep - p;
        else
            len = strlen (p);               /* or get length of final string */
        if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
            perror ("malloc-output[n]");
            while ((*n)--)          /* free prior allocations on failure */
                free (output[*n]);
            free(output);
            return NULL;
        }
        memcpy (output[*n], p, len);        /* copy chars to output[n] */
        output[(*n)++][len] = 0;            /* nul-terminate to make string */
        if (!ep)                            /* if delim not found, last */
            break;
        p = ++ep;                           /* update p to 1-past delim */
    }

    return output;      /* return pointer to allocated strings */
}

int main (void) {

    char buf[MAXC],         /* buffer for input */
        **output = NULL;    /* pointer to split/allocated strings */
    size_t n = 0;           /* number of strings filled */

    if (!fgets (buf, MAXC, stdin)) {    /* validate input */
        fputs ("error: invalid input.\n", stderr);
        return 1;
    }
    buf[strcspn (buf, "\n")] = 0;       /* trim newline from buf */
    /* split buf into separate strings on '|' */
    if (!(output = splitstr (buf, '|', &n))) {
        fputs ("error: splitstr() failed.\n", stderr);
        return 1;
    }

    for (size_t i = 0; i < n; i++) {    /* loop outputting each & free */
        printf ("output[%2zu]: %s\n", i, output[i]);
        free (output[i]);   /* free strings */
    }
    free (output);          /* free pointers */
}

Example Use/Output

$ echo "abc|def||jk" | ./bin/splitstr
output[ 0]: abc
output[ 1]: def
output[ 2]:
output[ 3]: jk

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.

$ echo "abc|def||jk" | valgrind ./bin/splitstr
==32024== Memcheck, a memory error detector
==32024== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32024== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==32024== Command: ./bin/splitstr
==32024==
output[ 0]: abc
output[ 1]: def
output[ 2]:
output[ 3]: jk
==32024==
==32024== HEAP SUMMARY:
==32024==     in use at exit: 0 bytes in 0 blocks
==32024==   total heap usage: 5 allocs, 5 frees, 172 bytes allocated
==32024==
==32024== All heap blocks were freed -- no leaks are possible
==32024==
==32024== For counts of detected and suppressed errors, rerun with: -v
==32024== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Look things over and let me know if you have further questions.

David C. Rankin
  • 81,885
  • 6
  • 58
  • 85