I am writing a function normalize that prepares a string for processing. This is the code:

/* The normalize procedure examines a character array of size len 
 in ONE PASS and does the following:
 1) turns all upper-case letters into lower-case ones
 2) turns any white-space character into a space character and, 
 shrinks any n>1 consecutive spaces into exactly 1 space only
 3) removes all initial and final white-space characters

 Hint: use the C library function isspace() 
 You must do the normalization IN PLACE so that when the procedure
 returns, the character array buf contains the normalized string and 
 the return value is the length of the normalized string.
  */
int normalize(char *buf, /* The character array containing the string to be normalized */
              int len    /* the size of the original character array */)
{
    /* exit function and return error if buf or len are invalid values */
    if (buf == NULL || len <= 0)
        return -1;

    char *str = buf;
    char prev, temp;
    len = 0;

    /* skip over white space at the beginning */
    while (isspace(*buf))
        buf++;

    /* process characters and update str until end of buf */
    while (*buf != '\0') {
        printf("processing %c, buf = %p, str = %p \n", *buf, buf, str);

        /* str might point to same location as buf, so save previous value in case str ends up changing buf */
        temp = *buf;

        /* if character is whitespace and last char wasn't, then add a space to the result string */
        if (isspace(*buf) && !isspace(prev)) {
            *str++ = ' ';
            len++;
        }

        /* if character is NOT whitespace, then add its lowercase form to the result string */
        else if (!isspace(*buf)) {
            *str++ = tolower(*buf);
            len++;
        }

        /* update previous char and increment buf to point to next character */
        prev = temp;
        buf++;
    }

    /* if last character was a whitespace, then get rid of the trailing whitespace */
    if (len > 0 && isspace(*(str-1))) {
        str--;
        len--;
    }

    /* append NULL character to terminate result string and return length */
    *str = '\0';
    return len;
}

However, I am getting a segmentation fault. I have narrowed down the problem to this line:

*str++ = *buf;

More specifically, if I try to dereference str and assign it a new char value (e.g. *str = c), the program crashes. However, str was initialized at the beginning to point to buf, so I have no clue why this is happening.

EDIT: This is how I am calling the function:

char *p = "string goes here";
normalize(p, strlen(p));

Ryan

2 Answers

You can't call your function with p when p is declared as char *p = "Some string";, because p then points to a string literal. String literals must not be modified, and attempting to do so is undefined behavior; the literal is typically placed in read-only memory, which is why you get a segfault. You can, of course, make p point somewhere else, namely to a writable sequence of characters.
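One way to get such a writable sequence is to copy the literal onto the heap and pass the copy instead. The following is only a minimal sketch: writable_copy is a hypothetical helper name that does not appear in the question, and the caller is assumed to free() the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the original question): returns a
 * heap-allocated, writable copy of s, or NULL if allocation fails.
 * The caller owns the result and must free() it. */
char *writable_copy(const char *s)
{
    size_t n = strlen(s) + 1;      /* +1 for the terminating '\0' */
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);        /* copies the terminator as well */
    return copy;
}
```

With this, something like char *p = writable_copy("string goes here"); followed by normalize(p, strlen(p)); and later free(p); would operate on modifiable memory, provided the NULL return is checked first.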

Alternatively, you could declare p as an array of characters. You can initialize it just like you did with the pointer declaration, but the array declaration gives you your own writable copy of the string:

char p[] = "Some string";
normalize(p, strlen(p));

Remember that arrays are not modifiable l-values, so you will not be able to assign to p, but you can change the content in p[i], which is what you want.

Apart from that, note that your code reads prev in the first loop iteration while it still holds a garbage value, because you never initialize it. Since you only use prev to test whether the previous character was a space, a better approach might be to keep a flag prev_is_space rather than storing the previous character itself. This also makes the loop easy to start: initialize prev_is_space to 0, or to 1 if leading white space should be dropped (this really depends on how you want your function to behave).
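The flag idea could look roughly like the sketch below. It is one possible rewrite rather than the only correct one, and it makes two assumptions that the question does not state: buf is writable, and there is room for a terminator at buf[len] (true when len == strlen(buf)). The name dst is introduced here for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Sketch of normalize() using a prev_is_space flag.
 * Assumes buf is writable and that buf[len] may hold the terminator. */
int normalize(char *buf, int len)
{
    if (buf == NULL || len <= 0)
        return -1;

    char *dst = buf;          /* next write position, never ahead of the read position */
    int prev_is_space = 1;    /* starting at 1 drops leading white space */

    for (int i = 0; i < len && buf[i] != '\0'; i++) {
        /* cast to unsigned char: the <ctype.h> functions require it */
        unsigned char c = (unsigned char)buf[i];

        if (isspace(c)) {
            if (!prev_is_space)
                *dst++ = ' ';             /* first space of a run only */
            prev_is_space = 1;
        } else {
            *dst++ = (char)tolower(c);
            prev_is_space = 0;
        }
    }

    /* at most one trailing space can have been written */
    if (dst > buf && dst[-1] == ' ')
        dst--;

    *dst = '\0';
    return (int)(dst - buf);
}
```

Because the flag starts at 1, the separate loop that skips leading white space is no longer needed, and no variable is read before it has been assigned.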

Filipe Gonçalves

I don't see where you initialized prev before using it in isspace(prev).

Michael