-7

How can I free() the variable in freeme later if I don't know the length of freeme ? I have a handle but I don't know the length. I can add a counter and then just loop it at the end of the functions that calls this but there's usually a better style than the most obvious way to do it. I'd rather not have another variable in the function. I'm already sceptical about including booleans .

bool parse(bool b1, int i, int *j, int *e, char **ptr, int q, char **pString, char *str,
           char **freeme) {

    for (*e = 0; *(ptr + q + *e); (*e)++) {
        b1 = true;
        if (*(ptr + *e + q)) {
            str = concat(*pString, *(ptr + *e + q));
            *pString = concat(str, " ");
            free(str);
            freeme[*e] = *pString;
        }
        *j = *e;
    }
    return b1;
}

If I understand correctly I must free() all *pString in the freeme array when I'm done. concat looks like this and I took it from an answer here at SO. How do I concatenate two strings in C?

char *concat(char *s1, char *s2) {
    char *result = malloc(strlen(s1) + strlen(s2) + 1);//+1 for the zero-terminator
    //in real code you would check for errors in malloc here
    if (result == NULL) {
        fprintf(stderr, "malloc failed!\n");
        return (char *) '0';
    }
    strcpy(result, s1);
    strcat(result, s2);
    return result;
}
Community
  • 1
  • 1
Montao
  • 245
  • 1
  • 3
  • 12
  • By letting the OS free the memory used? – MikeCAT May 05 '16 at 05:35
  • @MikeCAT But this is my OS – Montao May 05 '16 at 05:36
  • 1
    You should provide a little bit more background then. What is your function trying to do? – Misguided May 05 '16 at 05:38
  • What does `concat` do differently than `strcat` ? Your `parse` function is very strange looking and doesn't look well formed. Perhaps just telling us what you want this function to do and what the meaning of all those parameters such as i,j, and e are. I could probably suggest a better version.... – selbie May 05 '16 at 05:40
  • @Julián It is part of a scanner/lexer/tokenizer, it's a helper function for tokenizing shell command language. I will work on the names so that the functions can stand for themselves with more meaningful variable names, but I wrote 2000 lines in just a few days that I must refactor so I let the IDE make the variable names for me and then I refactor it so right now the variable names are not that meaningful. But the function is parsing and tokenizing a text that is a shell program so that I can execute shell code with my C program. – Montao May 05 '16 at 05:41
  • @selbie I posted the source for `concat`. I'm a beginner at this detailed level of C and I'm not familiar with all the libraries. My targets are OpenBSD and Linux. – Montao May 05 '16 at 05:42
  • @kaylum It's now included: `char *result = malloc(strlen(s1) + strlen(s2) + 1);/` – Montao May 05 '16 at 05:43
  • `return (char *) '0';` is not a null pointer!! – Paul Rooney May 05 '16 at 05:44
  • While this might not be liked by others, I dare say that you probably should not use raw C language for parsers unless you have no other choice. If you used e.g. C++ you could save a lot of time searching bugs related to raw strings, memory management, ... – BitTickler May 05 '16 at 05:44
  • @BitTickler I also do this to learn C and learn to use C debuggers. I come from OOP (Java/Python) and I could do this i python but I want to learn C. I also feel that the things should be "object" - would be great to invoke methods on my objects and data. But I want to use types that everybody understands. – Montao May 05 '16 at 05:46
  • He's writing his own OS, he doesn't have a choice in using C++. @Montao, why are you checking *(ptr + *e + q) inside the loop? that's unnecesary. Aside from that, you seem to overwrite str as soon as you get it, why even declare it as a parameter?.... – Misguided May 05 '16 at 05:47
  • 1
    Are you asking how many elements need to be freed in `freeme`? Assuming `freeme` is dynamically allocated then C does not provide any automatic way for you to know how large that memory block is (ie, how many elements it has if used as an array). You need to keep track of the allocated size yourself. – kaylum May 05 '16 at 05:48
  • 1
    *coughs* He could use Rust language *coughs* C is not the only system programming language. – BitTickler May 05 '16 at 05:49
  • @BitTickler But Rust will have its own type system nobody recognizes. Everybody recognizes a `char`. – Montao May 05 '16 at 05:50

3 Answers3

2

One important aspect in learning C is to learn making good choices regarding abstractions. A lexer/parser is complicated enough by itself and should not be littered with raw low level pointer and string manipulations, which are error prone and reduce readability.

So, instead of writing your parse code, you might want to write yourself a few robust and testable abstractions (as far as abstractions go in C...) first.

Your freeme argument of your parse function, for example looks like a StringList to me. And so it would be worth it to write yourself a StringList module first, with the usual operations of a list.

Also, especially in the context of parsers, strings and other dynamically allocated objects often are kept in more than one data structure. As such, the question arises, which container "owns" the instance. I.e. who can free it? This would be another fundamental aspect which, if addressed properly could help writing the parser itself in a more concise and less error prone way.

With all that in place, your question as such is obsolete, as the StringList module would address those nitty gritty details instead of your parser code.

Last not least, you will be able to pick ready-made abstractions for such things, as they have been written and used thousands of times before. FreeBSD kernel code is pure C language and has very clear and well tested code and I am sure you will find something suitable.

Applying this advice might make your code look a bit more like this:

bool parse(InputIterator * source_iter, StringList * tokens ) 
{
     char currentToken[MAX_TOKEN_LENGTH];
     /* ... */
     for( ; !InputIterator_end(source_iter); InputIterator_advance(source_iter, 1)) {
         char current = InputIterator_current(source_iter);
         if(isws(current)) {
            StringList_append( tokens, currentToken);
            skip_whitespaces(source_iter); /* another helper function of yours*/
         }
         else {
             string_append_char( currentToken, sizeof(currentToken), current); // no need to write this logic 100 times all over your parser.
         }
     }
     return true; 
}

In my experience, this style of C programming works well, yields testable, small functions and is more readable compared to the "all in one" approach.

BitTickler
  • 10,905
  • 5
  • 32
  • 53
2

If I read your parse function correctly, it appears to be iterating over a list of strings specified by ptr. The list itself is null terminated and the starting offset in the list is at index q.

The parse function's goal appears to be to produce an output string that is the concatenation of all the input strings but with a delimiter string, str and a space inserted between each. Something like the following:

 result = ""
 for each string s in ptr:
     result += s
     result += delimiter
     result += " "
 return result;

What makes it confusing is that some of the input parameters to the parse function are actually just local variables or not used at all. This includes e',i, andb1`.

So what makes your problem hard is the memory management of concatenating each string together and trying to keep track of freeing each temporary allocation later. What you really need is a string class. But this is C, not C++. Fortunately, we can overcome this with a struct type and some worthy helper functions.

I would suggest let's start with a worthy string class optimized for concatenation. Let's define a "SimpleString" as follows:

typedef struct _SimpleString
{
    char* str;          // the actual null terminated string
    size_t length;      // length of psz, not including null char
    size_t allocated;   // amount of memory malloc'd including room for null char
} SimpleString;

Where the str of this struct is the raw string pointer.

Now let's create a simple "constructor" function to create a string:

SimpleString* create_string()
{
    SimpleString* s = (SimpleString*)malloc(sizeof(SimpleString));
    if (s == NULL)
    {
        return NULL; // out of memory
    }

    s->str = malloc(1);

    if (s->str == NULL)
    {
        free(s);
        return NULL;
    }

    s->str[0] = '\0';
    s->length = 0;
    s->allocated = 1;

    return s;
}

The above function will "malloc" an instance of SimpleString and return the pointer to it. The inner member of SimpleString, str, is initialized to an empty string itself.

Now whenever we want to concatenate into this string, we can use our helper function as follows:

int concat_string(SimpleString* s, const char* p)
{
    size_t needed = 0;
    size_t p_len = p ? strlen(p) : 0;

    if (p == NULL)
    {
        return 0;
    }

    if (p_len == 0)
    {
        // nothing to do
        return 1;
    }

    if (s->str)
    {
        needed += s->length;
    }

    needed += p_len;
    needed += 1; // for null char

    if (needed > s->allocated)
    {
        size_t newallocation = needed * 2; // allocate more than needed so that we don't have to reallocate and re-copy the buffer on each call to concat_string
        char* newstring = malloc(newallocation);
        if (newstring == NULL)
        {
            // out of memory
            return 0;
        }

        newstring[0] = '\0';

        s->allocated = newallocation;

        if (s->str && (s->length > 0))
        {
            memcpy(newstring, s->str, s->length);
        }

        free(s->str);
        s->str = newstring;
    }

    memcpy(s->str + s->length, p, p_len);
    s->str[s->length + p_len] = '\0';
    s->length += p_len;
    return 1;
}

The above function will take care of reallocating additional memory (and freeing old memory) as the string grows.

Now finally a helper function to release the string:

void release_string(SimpleString* s)
{
    if (s)
    {
        free(s->str);
        free(s);
    }
}

Now we can greatly simplify your parse function as follows:

SimpleString* parse(char **list, int q, const char *delimiter)
{
    SimpleString* result = NULL;
    char *current = list[q];
    int count = 0;

    while (current != NULL)
    {
        if (result == NULL)
        {
            result = create_string();
            if (result == NULL)
            {
                // error! out of memory
                break;
            }
        }

        concat_string(result, current);
        concat_string(result, delimiter);
        concat_string(result, " ");

        count++;
        current = list[q + count];
    }

    return result;
}

In the above function, I renamed your ptr parameter to be list and your str parameter to be delimiter. I left q alone as the initial offset into the list.

I suspect that you really wanted to only concatenate the trailing space on all but the last iteration. If that's the case, I'll leave that as an exercise for you.

And in your original implementation, you were using parameter j to indicate the count of strings used and returned as an out parameter. You can easily add that back. (e.g. *j = count;)

Sample usage:

int main()
{
    char* list[] = { "Mecury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune", NULL };

    SimpleString* result = parse(list, 0, ",");

    printf("%s\n", result->str);

    release_string(result);

    return 0;
}
selbie
  • 100,020
  • 15
  • 103
  • 173
1

You can introduce another parameter to count the elements in freeme.

bool parse(bool b1, int i, int *j, int *e, char **ptr, int q, char **pString, char *str,
           char **freeme, int *freeme_len) {

    for (*e = 0; *(ptr + q + *e); (*e)++) {
        b1 = true;
        if (*(ptr + *e + q)) {
            str = concat(*pString, *(ptr + *e + q));
            *pString = concat(str, " ");
            free(str);
            //freeme[*e] = *pString;
            freeme[(*freeme_len)++] = *pString;
        }
        *j = *e;
    }
    return b1;
}

After calling parse() you iterate all freeme elements and free them.

void main() {
  //The following line outlines that freeme is allocated somewhere and it's type is char**
  //and should not be taken literally. It's size should be calculated, according 
  //to the situation, or should be dynamically resized during operations.
  char **freeme = malloc(1000 * sizeof(char*));

  //freeme_len should be initialized to 0 before first calling to parse().
  int freeme_len = 0;

  //here follows some program code, that we don't know about, which declares and 
  //initializes all remaining variables
  parse(b1, i, &j, &e, ptr, q, &pString, str, freeme, &freeme_len);

  //After calling, free all
  for(i=0; i<freeme_len; i++) {
    free( freeme[i] );
  }

  free(freeme);
}
Anton Angelov
  • 1,223
  • 7
  • 19
  • 1
    Simple and straight! It worked with a very small change. BTW you can't change my `e`, I need that for a tokenization. So I make 2 lines from the one you change and just increment on one line and keep my old line. Now I have 0 errors in Valgrind and 0 memory leaks. – Montao May 05 '16 at 08:00
  • This is still really dangerous code - how much *capacity* is available in `freeme`? It looks like it's being used as a list with no bounds checking. (Actually, the entire interface looks wrong, but as the question has no `main()` or other context, we're left guessing...) – Toby Speight May 05 '16 at 08:53
  • The code in main() is symbolical and it's aim is to show how to use the variable `freeme_len` to unallocate the remaining strings. We don't know what capacity does `freeme` have. I wrote this malloc(1000*4) just to outline that it has to be allocated somewhere. I'll edit the code to add comments about that. I don't argue if it's right or wrong. We don't know the bigger picture. It's just simple concrete solution for concrete problem. – Anton Angelov May 05 '16 at 09:08