2

I have the following program which is a minimal example of repeatedly appending the character 'a' to a string by creating a new char pointer (newRow), allocating memory for the new pointer, setting the first character to 'a', copy the previously updated string to the position after 'a' in the new string, then reassign the original string to the new char pointer. However, there are two things not working as I expected.

Firstly, I would expect that for each iteration, the address of newRow should change, since it is newly created by malloc (and the previous has not been freed), and then once newRow is assigned to row, row's address should be that of the previous iterations newRow, but this is not the case.

Secondly, after the 18th iteration, additional characters are being appended to the end of the string, and I don't know why.

Below is source compiled on Linux with gcc -Wall -Wextra -pedantic -std=c99 -g c_programs/playground.c, and the output from running the program.

#include <stdio.h>
#include <stdlib.h>

int main()
{
  char lineBuf[] = "hello";
  char *row = (char *)malloc(strlen(lineBuf) + 1);
  strcpy(row, lineBuf);

  for (size_t i = 0; i < 40; i++)
  {
    char *newRow = (char *)malloc(strlen(row) + 1);
    printf("%ld:  &row: %p, &newRow: %p, len: %ld", i, (void *)&row, (void *)&newRow, strlen(row));
    newRow[0] = 'a';
    strcpy(newRow + 1, row);
    printf(", string is: %s\n", newRow);
    row = newRow;
  }
}
0:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 5, string is: ahello
1:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 6, string is: aahello
2:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 7, string is: aaahello
3:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 8, string is: aaaahello
4:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 9, string is: aaaaahello
5:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 10, string is: aaaaaahello
6:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 11, string is: aaaaaaahello
7:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 12, string is: aaaaaaaahello
8:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 13, string is: aaaaaaaaahello
9:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 14, string is: aaaaaaaaaahello
10:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 15, string is: aaaaaaaaaaahello
11:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 16, string is: aaaaaaaaaaaahello
12:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 17, string is: aaaaaaaaaaaaahello
13:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 18, string is: aaaaaaaaaaaaaahello
14:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 19, string is: aaaaaaaaaaaaaaahello
15:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 20, string is: aaaaaaaaaaaaaaaahello
16:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 21, string is: aaaaaaaaaaaaaaaaahello
17:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 22, string is: aaaaaaaaaaaaaaaaaahello
18:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 23, string is: aaaaaaaaaaaaaaaaaaahello
19:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 25, string is: aaaaaaaaaaaaaaaaaaaahello1
20:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 26, string is: aaaaaaaaaaaaaaaaaaaaahello1
21:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 27, string is: aaaaaaaaaaaaaaaaaaaaaahello1
22:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 28, string is: aaaaaaaaaaaaaaaaaaaaaaahello1
23:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 29, string is: aaaaaaaaaaaaaaaaaaaaaaaahello1
24:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 30, string is: aaaaaaaaaaaaaaaaaaaaaaaaahello1
25:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 31, string is: aaaaaaaaaaaaaaaaaaaaaaaaaahello1
26:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 32, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaahello1
27:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 33, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaahello1
28:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 34, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1
29:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 35, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1
30:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 36, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1
31:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 37, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1
32:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 38, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1
33:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 39, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1
34:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 41, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1A
35:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 42, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1A
36:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 43, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1A
37:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 44, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1A
38:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 45, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1A
39:  &row: 0x7fff662312f8, &newRow: 0x7fff66231300, len: 46, string is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaahello1A
Max888
  • 3,089
  • 24
  • 55

2 Answers2

3

strcpy(newRow + 1, row); fails as newRow points to insufficient memory (by 1).
This leads to undefined behavior (UB).

 // char *newRow = (char *)malloc(strlen(row) + 1);
 char *newRow = (char *)malloc(strlen(row) + 1 + 1);
 //                                          ^   ^
 //                              new character   \0
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
3

The pointer values you're printing don't change because you're printing the address of the pointer itself, not its value. Try using:

printf("%zu:  row: %p, newRow: %p, len: %zu", i, row, newRow, strlen(row));

(%zu is the correct format specifier for a size_t, which is what strlen() returns, and what the i variable is)

As far as the extra character, you're not allocating enough memory for the new string:

char *newRow = (char *)malloc(strlen(row) + 1);

should be + 2 since you need space for both one more character, and a terminating null character, which strlen() doesn't count.

A few more points:

  • you don't need to, and shouldn't, cast the return value of malloc(), and as John Bollinger points out, you should check the return value to ensure that malloc() succeeded
  • the preferred signature for main() when not using command line arguments is int main(void) not int main() (alternatives may be available but are implementation-defined)
  • #include <string.h> when using strcpy()
sj95126
  • 6,520
  • 2
  • 15
  • 34
  • One shouldn't cast the return value of `malloc`, but one *should* check whether it's a null pointer before using it. Allocation failure is not the problem here, but it pays to cultivate good programming habits by exercising them even in small, throwaway programs such as the OP's. – John Bollinger Oct 25 '21 at 15:26
  • @JohnBollinger: sure, fair point. I've revised to include that and a few more points that are good habits – sj95126 Oct 25 '21 at 15:36
  • To cast or not to cast, that is the [question](https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc). Tis a holy war, perhaps 80% preferring _no cast_. As with such style issues, best to follow your group’s coding standard. `void` or not in a non-`main()` definition is optional. `main()` is special and _may_ need `void`. IMO, another holy war. Interesting you side with non-verbose in one and verbose in the other. – chux - Reinstate Monica Oct 25 '21 at 15:41
  • @chux-ReinstateMonica: I was speaking of the fact that the C standard says that there are only two* acceptable signatures for `main` and `int main()` isn't one of them, so personal choices of verbosity are irrelevant. (* three, if you count `envp`) – sj95126 Oct 25 '21 at 15:45
  • C spec also has "or in some other implementation-defined manner.". It does not say "only two". IAC, best hashed out is a quesiton than comments. – chux - Reinstate Monica Oct 25 '21 at 15:49
  • @chux-ReinstateMonica: OK, fair point, technically there are alternatives. I have revised the answer to clarify that. Personally I don't think it's a good idea to encourage implementation-defined behavior if it's not absolutely necessary, but you're right, it's an issue for a separate question. – sj95126 Oct 25 '21 at 15:58