-9

it is realloc problem again. It seems that I could not find any similar problems in a lot of previous realloc statements. I would appreciate your interest.

I am trying to read text input of format: g:<expression>;0,1,0,1,0. I have a set of lines of this format in the source text file and a snippet of code (that follows) reading this line in a "for" loop. expression between : and ; is read into propstr 2D char array. Everything after ; determines a vector (of size DIM) of numbers separated by comma ,. This vector is read into upstr (just as a string) and then converted into array of integers upvec by function process_update_vector. At every iteration of the loop realloc is used to adjust the sizes of the mentioned arrays (propstr,upstr and upvec). Number of lines read so far in the loop is NREAC. Here is the code:

/*Before this point, current line in the source is read into `temp'*/
NREAC++;
for(i=0;i<strlen(temp);i++){
 if(temp[i]==':') colon=i;//Here we find colon
 if(temp[i]==';') semicolon=i;//...and semicolon positions
}
memset(temp1,'\0',STRLEN);
if(NREAC==1)
 ptrchar=(char **)malloc(sizeof(char *));
else
 ptrchar=realloc(propstr,NREAC*sizeof(char *));
if(ptrchar==NULL){
 fprintf(stderr,"Error: could not allocate memory for propstr\n");
 if(propstr!=NULL) free(propstr);
 return 1345;
}else{propstr=ptrchar;ptrchar=NULL;}
propstr[NREAC-1]=(char *)malloc((semicolon-colon)*sizeof(char));
if(propstr[NREAC-1]==NULL){
 fprintf(stderr,"Error: couldn't get memory for propstr[NREAC-1]\n");
 return 1344;
}
for(i=colon+1;i<semicolon;i++)/*Copy the propensity part of the line*/
 temp1[i-colon-1]=temp[i];
temp1[i-colon-1]='\n';/*Include newline symbol for correct parsing*/
strcpy(propstr[NREAC-1],temp1);
memset(temp1,'\0',STRLEN);
if(NREAC==1)
 ptrchar=(char **)malloc(sizeof(char *));
else
 ptrchar=realloc(upstr,NREAC*sizeof(char *));
if(ptrchar==NULL){
 fprintf(stderr,"Error could not allocate memory for upstr\n");
 if(upstr!=NULL) free(upstr);
 return 1343;
}else{upstr=ptrchar;ptrchar=NULL;}
upstr[NREAC-1]=(char *)malloc((strlen(temp)-semicolon-1)*sizeof(char));
if(upstr[NREAC-1]==NULL){
 fprintf(stderr,"Error: couldn't get memory for upstr[NREAC-1]\n");
 return 1342;
}
if(strlen(temp)-semicolon==2){/*No vector is specified*/
 fprintf(stderr,"Error: no update vector found:\n");
 fprintf(stderr,"`%s'",temp);
 return 1;
}
if(NREAC==1)
 ptrint=(int **)malloc(sizeof(int *));
else
 ptrint=(int **)realloc(upvec,NREAC*(sizeof(int *)));/*!!!!!!!!!!!!!!!!!!!!*/
if(ptrint==NULL){
 fprintf(stderr,"Error: could not allocate memory for upvec\n");
 if(upvec!=NULL) free(upvec);
 return 1341;
}else{upvec=ptrint;ptrint=NULL;}
upvec[NREAC-1]=(int *)malloc(DIM*sizeof(int));
if(upvec[NREAC-1]==NULL){
 fprintf(stderr,"Error: couldn't get memory for upvec[NREAC-1]\n");
 return 1340;
}
for(i=semicolon+1;i<strlen(temp)-1;i++)
 temp1[i-semicolon-1]=temp[i];
temp1[i-semicolon-1]='\n';/*Include newline for more convenient way of parsing*/
strcpy(upstr[NREAC-1],temp1);
/*Get update vector*/
upvec[NREAC-1]=process_update_vector(upstr[NREAC-1],upvec[NREAC-1]);

memset(temp1,'\0',STRLEN);
memset(temp,'\0',STRLEN);
continue;

This snippet appears in the for loop. The "invalid pointer" error appears in the place marked with /*!!!!!!!!!!!!!!!!!!!!*/.

Conditions of the error. For small enough DIM everything works fine and always worked. At some point, I had to increase DIM up to 11 and then the error occured in the middle of the parsing procedure (It is usual error, I guess, *** glibc detected *** dinamica: realloc(): invalid pointer: 0x000000000165d190 ***). The value of NREAC seems to be not affecting the realloc behavior. It is always the same place in the code where the error occurs. Do I wrongly allocate memory for int type variable, since the allocation for char type was never a problem?

The process_update_vector function:

int * process_update_vector(const char *upstr,int *upvec)
{
   int i,j,k;
   char symbuf[5];/*5 symbols, max 99999 is possible*/
   i = 0;
   j = 0;
   k = 0;
   while(upstr[i] != '\n'){
      if(upstr[i] == ','){/*',' is the delimiter*/
         symbuf[j] = '\0';
         j = 0;
         upvec[k] = atoi(symbuf);
         k++;
         i++;
         continue;
      }
      symbuf[j] = upstr[i];
      j++;
      i++;
   }
   /*For the last entry*/
   upvec[k] = atoi(symbuf);
   k++;
   return upvec;
}
Sourav Ghosh
  • 133,132
  • 16
  • 183
  • 261
  • 13
    wall of ugly code, down vote – thecoshman Oct 24 '12 at 12:22
  • 3
    seriously, learn about coding conventions and stick to them *at least* in some cases. (Spaces are *not* poisoned!) – Niklas R Oct 24 '12 at 12:32
  • 3
    Some tips for making your code readable: 1. Use more than one space for indentation. 2. Don't put multiple statement in one line. 3. Put spaces around operators. 4. Instead of duplicating blocks of code, put it in a function. – interjay Oct 24 '12 at 12:32
  • Can *you* even read your own code?! – Kerrek SB Oct 24 '12 at 12:54
  • The first time upvec is even mentioned in the big wall of code is on the line where libc is telling you that it's wrong. How are we supposed to figure out how it's wrong when we don't know what it is and how it's been handled before? – Art Oct 24 '12 at 13:02

2 Answers2

2

Ok, I tried to look at your code. My eyes were sore but I managed to get through the code. here my changes for the first part, where you read the expression between the colon and the semicolon. I changed some types but let more or less the same error handling, even if I think it is overkill or better said, it is at the wrong place (I tend to separate the allocation/error work from the business code, makes it easier to debug).

/*Before this point, current line in the source is read into `temp'*/

char **propstr=NULL;         /* I hope this variable was initialized to NULL or else you get problems */
NREAC++;                     /* This is bad naming, all uppercase is by convention reserved for macros */
char *colon = strchr(temp, ':');     /* There a lib function to do the searching, use them */
char *semicolon = strchr(temp, ';');

if(!colon || !semicolon) {
  fprintf(stderr,"Error: syntax error\n");
  return 2112;  /* whatever */
}

ptrchar=realloc(propstr,NREAC*sizeof(char *));     /* realloc called with a NULL pointer is the same as a malloc, typecasts of mallocs/reallocs are not good. */
if(!ptrchar) {
 fprintf(stderr,"Error: could not allocate memory for propstr\n");
 free(propstr);   /* The check against NULL is also done by free, it's therefoe redundant */
 return 1345;
}
else
  propstr=ptrchar;   /* There's no point in NULLing a variable that will be overwritten anyway */

size_t lenexpr = semicolon-colon;   /* The length of the expression can be found by subtracting both pointers */
propstr[NREAC-1]=malloc(lenexpr+1);  /* +1 for the \n */
if(!propstr[NREAC-1]) {
 fprintf(stderr,"Error: couldn't get memory for propstr[NREAC-1]\n");
 return 1344;
}
memcpy(propstr[NREAC-1], colon+1, lenexpr);   /* We copy directly without a temporary that way */
propstr[NREAC-1][lenexpr] = '\n';             /* add the linefeed */
propstr[NREAC-1][lenexpr+1] = 0;              /* terminate the string */

Here I stopped, because there is a fundamental error in your second part that I do not understand. Do you want to store the vector as a string or as an integer array. If the former, then you have to allocate chars and not sizeof (int), if the latter there must be some atoi or strtol somewhere.

There are several other things that would be nice when you submit a question to SO, you should include the declarations of the variables you use, you should show the defines of the macros you use.

EDIT: second part

// Edit3 ptrchar=realloc(upstr, NREAC*sizeof(char *));

// Edit3 if(!ptrchar) {
// Edit3   fprintf(stderr,"Error could not allocate memory for upstr\n");
// Edit3   free(upstr);
// Edit3   return 1343;
// Edit3 }
// Edit3 else
// Edit3   upstr=ptrchar;

// Edit3 upstr[NREAC-1] = malloc(strlen(semicolon)+1);   /* +1 for the \n */

// Edit3 if(!upstr[NREAC-1]) {
// Edit3  fprintf(stderr,"Error: couldn't get memory for upstr[NREAC-1]\n");
// Edit3  return 1342;
// Edit3 }
if(strlen(semicolon)<2) {/*No vector is specified*/
  fprintf(stderr,"Error: no update vector found:\n'%s'", temp);
  return 1;
}
ptrint = realloc(upvec, NREAC*sizeof(int *));/*!!!!!!!!!!!!!!!!!!!!*/
if(!ptrint) {
  fprintf(stderr,"Error: could not allocate memory for upvec\n");
  free(upvec);
  return 1341;
}
else
  upvec=ptrint;

upvec[NREAC-1] = malloc(DIM*sizeof(int));
if(!upvec[NREAC-1]) {
  fprintf(stderr,"Error: couldn't get memory for upvec[NREAC-1]\n");
  return 1340;
}
// Edit3 memcpy(upstr[NREAC-1], semicolon+1, strlen(semicolon+1)+1);  /* +1 will include the \0 */
// Edit3 strcat(upstr[NREAC-1], "\n"); /*Include newline for more convenient way of parsing*/

/*Get update vector*/
// Edit3 upvec[NREAC-1] = process_update_vector(upstr[NREAC-1], upvec[NREAC-1]);
// Edit3, let's reuse our initial pointer, it's still valid.
process_update_vector(semicolon+1, upvec[NREAC-1]);
continue;

The signature of that function process_update_vector seems odd, does it realloc upvec[NREAC-1]? If not, there's no point in returning it and reassigning it. So it would be a good idea to show that function too.

Conclusion: The only errors I detected in your code are, the length of the allocation were 1 too short because of the appended \n. Other point: By replacing the if(first) malloc else realloc by my realloc you have to make sure that the pointer is NULL initially or else you have a problem.

EDIT2: Here an updated version of process_update_vector, that wasn't incorrect per se, but a was bit complicated for what it does. It also had a high buffer overflow risk, with the temporary buffer of only 5 characters!

This version doesn't need a temp buffer.

void process_update_vector(const char *upstr, int *upvec)
{
const char *p = strchr(upstr, ',');     /* There are fine library functions for string handling */
int k = 0;
  while(p) {
    upvec[k++] = atoi(upstr);
    upstr = p+1;                        /* Position upstr to after , */

    p = strchr(upstr, ',');
  }
  upvec[k++] = atoi(upstr);
  /* We don't need to return upvec, it doesn't change in the function */
}

Two comments: - There's no check of the DIM, so on broken input we can have a buffer oveflow. - There isn't any white space handling, often strings are entered with spaces after commas (it's more readable), this case wouldn't work in that case, but several while(*p==' ') p++; placed at the right places can take care of that.

EDIT3: The change in the called function also changes the caller, you don't need to copy to the upstr, so that allocation can be removed completely. I added // Edit3 comments in the 2nd listing. Unless of course if you intend to reuse the copied string elsewhere.

PS: On SO, thanking is done by upvoting the answers.

Patrick Schlüter
  • 11,394
  • 1
  • 43
  • 48
  • Sorry for NREAC to be in uppercase,there is no macro here, but it is really global thing, it is what all program is about and sorry for your eyes, that are much more important. I hope there is no fundamental error, since I want to have everything after ";" both as a string(in `upstr`) and as an integer array(in `upvec`). For those, the declarations are `char **propstr;`, `char **upstr;` and `int **upvec;`. It is quite fairly mentioned, that there must be some `atoi` in the code. It is there, where the conversion from `upstr` to `upvec` takes place, namely, in `process_update_vector` function – user1768508 Oct 25 '12 at 09:21
  • Please let me know if you need code for the function `process_update_vector`. – user1768508 Oct 25 '12 at 09:24
  • Yes, it would be helpful to have that function. At least its signature. – Patrick Schlüter Oct 25 '12 at 10:19
  • I have edited my question post by adding the `process_update_vector` function definition. Thanks for your help! – user1768508 Oct 25 '12 at 11:24
  • Yes! You are right! The allocation was 1 too short. The problematic source text is working now. THANK YOU very much! I am really ashamed to bother you guys for such a stupid error. But to be honest I have learnt a lot from you. Now I have many things to alter in the code. Richard's posts were really useful as well. Thank you once more. @richardc – user1768508 Oct 25 '12 at 11:36
  • BTW, with the modified function you don't need to append the `\n` at your input strings. – Patrick Schlüter Oct 25 '12 at 12:56
  • On SO, I am a newbie it does not allow me to vote. I need newlines for further parsing down the program. white spaces are controlled before this code in the program. The `temp` array, containing lines from the source, is already without any white space and tab. DIM is also controlled, because it is gotten from the source as well, it is the number of lines of different formatting. Thanks. Vote up. – user1768508 Oct 25 '12 at 13:48
  • It was only to show that it is possible to use algorithm that need less preconditions. Often one has to do some strange things in one (or several) parts of a program, because downstream someone chose a not so ideal algorithm. Fixing that downstream stuff can then simplify things for the callers. That is what I wanted to show. – Patrick Schlüter Oct 25 '12 at 14:42
  • Yes, you are right. I have to reconsider whole program and/or caller function. But some of your ideas and code I have already implemented. Your code is much simpler and more effective. – user1768508 Oct 25 '12 at 15:14
0

Using Microsoft Visual Studio 2005 I pasted the code into a temp file and used the reformat command to reformat it to the text below.

See this Wikipedia article on Programming style which can help you from getting flamed for stackoverflow submissions.

Also see this sample of a C coding styles document.

EDIT begin The runtime is complaining about an invalid pointer which indicates that the pointer you are passing to realloc() is not a pointer that was created with a call to malloc() or to calloc(). It knows because whenever you do a malloc() there is a memory management header that is part of the data area and the pointer you are given is a pointer to allocated memory after the header. See this question How does realloc know how much to copy?

Finally, I rewrote this using a state machine approach (see also state machine explanation) and put that source code at the bottom of the reformatted example. EDIT end

Looking at the source this appears to be part of a loop. It appears that the variable upvec may not be initialized and it is some kind of an array so is it actually malloced or not?

See section with //-------------------- indicated.

Reformatted source follows:

    /*Before this point, current line in the source is read into `temp'*/
    NREAC++;
    for(i=0;i<strlen(temp);i++){
        if(temp[i]==':') colon=i;//Here we find colon
        if(temp[i]==';') semicolon=i;//...and semicolon positions
    }
    memset(temp1,'\0',STRLEN);

    if(NREAC==1)
        ptrchar=(char **)malloc(sizeof(char *));
    else
        ptrchar=realloc(propstr,NREAC*sizeof(char *));
    if(ptrchar==NULL){
        fprintf(stderr,"Error: could not allocate memory for propstr\n");
        if(propstr!=NULL) free(propstr);
        return 1345;
    } else {
        propstr=ptrchar;
        ptrchar=NULL;
    }
    propstr[NREAC-1]=(char *)malloc((semicolon-colon)*sizeof(char));
    if(propstr[NREAC-1] == NULL){
        fprintf(stderr,"Error: couldn't get memory for propstr[NREAC-1]\n");
        return 1344;
    }
    for(i=colon+1;i<semicolon;i++)/*Copy the propensity part of the line*/
        temp1[i-colon-1]=temp[i];
    temp1[i-colon-1]='\n';/*Include newline symbol for correct parsing*/
    strcpy(propstr[NREAC-1],temp1);
    memset(temp1,'\0',STRLEN);
    if(NREAC==1)
        ptrchar=(char **)malloc(sizeof(char *));
    else
        ptrchar=realloc(upstr,NREAC*sizeof(char *));
    if(ptrchar==NULL){
        fprintf(stderr,"Error could not allocate memory for upstr\n");
        if(upstr!=NULL) free(upstr);
        return 1343;
    } else {
        upstr=ptrchar;
        ptrchar=NULL;
    }
    upstr[NREAC-1]=(char *)malloc((strlen(temp)-semicolon-1)*sizeof(char));
    if(upstr[NREAC-1]==NULL){
        fprintf(stderr,"Error: couldn't get memory for upstr[NREAC-1]\n");
        return 1342;
    }
    if(strlen(temp)-semicolon==2){/*No vector is specified*/
        fprintf(stderr,"Error: no update vector found:\n");
        fprintf(stderr,"`%s'",temp);
        return 1;
    }

  // -----------------------------------------------------
    if(NREAC==1)
        ptrint=(int **)malloc(sizeof(int *));
    else
        ptrint=(int **)realloc(upvec,NREAC*(sizeof(int *)));/*!!!!!!!!!!!!!!!!!!!!*/
    if(ptrint==NULL){
        fprintf(stderr,"Error: could not allocate memory for upvec\n");
        if(upvec!=NULL) free(upvec);
        return 1341;
    } else {
        upvec=ptrint;
        ptrint=NULL;
    }

    upvec[NREAC-1]=(int *)malloc(DIM*sizeof(int));
    if(upvec[NREAC-1]==NULL){
        fprintf(stderr,"Error: couldn't get memory for upvec[NREAC-1]\n");
        return 1340;
    }
  // ---------------

    for(i=semicolon+1;i<strlen(temp)-1;i++)
        temp1[i-semicolon-1]=temp[i];
    temp1[i-semicolon-1]='\n';/*Include newline for more convenient way of parsing*/
    strcpy(upstr[NREAC-1],temp1);
    /*Get update vector*/
    upvec[NREAC-1]=process_update_vector(upstr[NREAC-1],upvec[NREAC-1]);

    memset(temp1,'\0',STRLEN);
    memset(temp,'\0',STRLEN);
    continue;

EDIT A suggested approach for this problem using a state machine with a function that will parse each line of text.

#include <malloc.h>
#include <stdlib.h>

// pLine is a line of text containing a zero terminated string of the format of
//    g:<expression>;0,1,0,1,0
// This function will process the line and return the expression as a string
// and a list of the integers.
void processExpression (char *pLine, char *pExpression, int **pIntList)
{
    int  stateMachineIndex = 1;
    int  integerCount = 0;
    char *pLineSave = 0;
    int  iIntListIndex = 0;

    *pIntList = 0;

    while (*pLine) {
        switch (stateMachineIndex) {
            case 1:
                // initial state
                if (*pLine == ':') {
                    // colon found so now start getting the expression
                    stateMachineIndex = 2;
                }
                pLine++;
                break;
            case 2:
                if (*pLine != ';') {
                    *pExpression++ = *pLine++;
                } else if (*pLine) {
                    // if we have not reached end of string yet then go to the
                    // next state of parsing the list of integers.
                    stateMachineIndex = 3;
                    pLine++;
                    pLineSave = pLine;
                }
                break;
            case 3:
                // at this point we begin to process the list of integers.
                // however we are not sure how many there are so we will count them first
                if (*pLine == ',') {
                    integerCount++;
                }
                pLine++;
                break;
            case 4:
                // we now have a count of the integers we expect however it
                // may be off by one so we will allocate a smidge more space
                *pIntList = (int *)calloc ((integerCount + 2), sizeof(int));
                stateMachineIndex = 5;
                *pExpression = 0;         // and while we are at it lets terminate our expression string
                break;
            case 5:
                // now we get an integer value from the list of integers
                (*pIntList)[iIntListIndex++] = atoi (pLine);
                // eat up characters to the next integer in the list
                while (*pLine && *pLine != ',') pLine++;
                if (*pLine == ',') pLine++; // if we found a comma, skip it to the next field
                break;
            default:
                break;
        }
        if (*pLine == 0 && stateMachineIndex < 4) {
            // end of the string so now lets do our integer thing
            // if we are still in the first phase of processing
            if (pLineSave && *pLineSave && integerCount > 0) {
                stateMachineIndex = 4;
                pLine = pLineSave;    // restart our parser back to the integer area
            } else {
                break;
            }
        }
    }
}

// simple test harness to test the concept.
int main(int argc, char* argv[])
{
    char *pLine = "g:expression and stuff;1,2,3,4,5";
    char expressionBuffer[128];
    int  *pIntList = 0;

    processExpression (pLine, expressionBuffer, &pIntList);
    return 0;
}
Community
  • 1
  • 1
Richard Chambers
  • 16,643
  • 4
  • 81
  • 106
  • OK! Next question for OP - what happens when he/she debugs it? – Martin James Oct 24 '12 at 13:15
  • I also use auto-reformat and wrapping things within Emacs, but copy-paste here did not work properly (even that single space indents, that are not ENOUGH, I did manually). Thanks for your answer once again and thanks to those who always intend eat you all when something wrong with anything disregard whether it is IMPORTANT or not. This is my first question here and looks like the last. When NREAC=1(when the parser first meets a line of the specified format) then malloc is called. Isn't it? `ptrint=(int **)malloc(sizeof(int *));` Then this ptrint is copied to the upvec. – user1768508 Oct 24 '12 at 13:36
  • Even before I malloc-ed immediately to upvec, but changed now, since "the rules and conventions of programming would make commentators here blame that I pass realloc(...) output immediately to the array being reallocated, although I do not event care about that previous array if it is failed to be reallocated even once. – user1768508 Oct 24 '12 at 13:36
  • @user1768508, I have put a version up in my posting using a state machine. It appears to do what you are wanting to do for each line of text. You should be able to take that function and use it as a tool for doing multiple text lines by getting a text line and then feeding it into the function. You will need some additional memory allocation for each line of text to process. I may add something to it for that however you should probably use a struct for each line of text processed and the struct would contain the elements for that line of text. – Richard Chambers Oct 24 '12 at 14:51
  • Dear Richard. I really appreciate your effort. It works for me just fine, even for multiple lines of my source file. The concept itself is very interesting as well, and I have a lot to learn from your approach and implementation. Anyway, I would like to understand what happens to the `realloc` in my example. Because I have many such statements for lines of very different formats and this quite annoys me when I cannot fully understand what my code does, in every particular example the behavior is different. I hope you understand my will to get to the point. – user1768508 Oct 24 '12 at 17:04
  • @user1768508, the run time is complaining of an invalid pointer. What this means is that the pointer that you are passing to the realloc() function is not a pointer to a malloced area of memory. However it may also be that you overwrote the memory management header in some way. Here is something about memory management http://stackoverflow.com/questions/12433947/why-does-malloc-not-work-sometimes/12434865#12434865 – Richard Chambers Oct 24 '12 at 17:27
  • @user1768508, also see this stackoverflow posting. http://stackoverflow.com/questions/3476448/how-does-realloc-know-how-much-to-copy?rq=1 – Richard Chambers Oct 24 '12 at 17:30
  • Thank you very much. Very useful postings, I've learnt many things. Following your advice to rarely use malloc/realloc calls, what do you think if in my example I would do two consecutive readings of the source file: first, to scan and determine how many lines are there and other parameters, and second, to allocate memory for all arrays in a single malloc call and then read those lines? I had this idea in the very beginning of my implementation. – user1768508 Oct 25 '12 at 03:49