I'm trying to create an array of strings in C. My plan is for the program to read lines from a file and then gradually build this array.

But I noticed that after I allocate memory to this array, some of its indices are 0x0, and some are trying to access memory in a weird manner. For example:

char** arr_docs = (char**)malloc(sizeof(char*));

In gdb, I'll try to see the memory addresses of many of this array's indexes:

arr_docs[0]
> 0x0
arr_docs[1]
> 0x0
arr_docs[2]
> 0x0
arr_docs[3]
> 0x1fae1 <error: Cannot access memory at address 0x1fae1>
arr_docs[4]
> 0x0

Wait, what?? Why is arr_docs[3] trying to access that address?

I have also noticed that when I'm building the array of strings, the program correctly puts the intended string in arr_docs[0], but at some point in the loop (the debugger shows it happens when i == 4), arr_docs[0] gets overwritten! Here's the for-loop code and the behavior arr_docs[0] shows in the debugger:

void getlinha(char* buf, FILE* arq){
    fgets(buf, 50, arq);
    int size = strlen(buf);
    // final \n replaced by \0
    buf[size-1] = '\0';
}
char* temp = (char*)malloc(sizeof(char));
for(i = 0; i < 6; i++){
        //char* temp = (char*)malloc(50);
        arr_docs[i] = (char*)malloc(sizeof(char));
        getlinha(temp, input);
        strcpy(arr_docs[i], temp);
    }

In the debugger, when i < 4:

> arr_docs[0]: 0x55555555a530 "sigaa 2"

When i == 4 (More specifically, when arr_docs[4] = (char*)malloc(sizeof(char));):

> arr_docs[0] : 0x55555555a530 "\260\245UUUU"

I'm completely lost.


Update

Following the recommendations, I edited the code. I dropped dynamic memory allocation, since I know how many strings to store. Still, some problems remain. The new code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void getlinha(char* buf, FILE* arq){
    fgets(buf, 50, arq);
    int size = strlen(buf);
    // final \n replaced by \0
    buf[size-1] = '\0';
}

int main(void){
    char* arr_docs[6];
    int i;
    FILE* input = fopen("file.input", "r");
    for(i = 0; i < 6; i++){
        getlinha(arr_docs[i], input);
    }
}

In this program, fgets raises a segmentation fault. In the debugger, arr_docs[0] is correctly assigned, but arr_docs[1] throws the error:

arr_docs[1]
> 0x5555555552bd <__libc_csu_init+77> "H\203\303\001H9\335u\352H\203\304\b[]A\\A]A^A_\303ff.\017\037\204"

fgets(arr_docs[1], 50, input)
> Program received signal SIGSEGV, Segmentation fault.
> __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:314


Update 2

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void getlinha(char* buf, FILE* arq){
    fgets(buf, 50, arq);
    int size = strlen(buf);
    // final \n replaced by \0
    buf[size-1] = '\0';
}

int main(void){
    char* arr_docs[6];
    int i;
    FILE* input = fopen("file.input", "r");
    for(i = 0; i < 6; i++){
        // assuming the upper bound size of one doc is 50 chars
        arr_docs[i] = malloc(50 * sizeof(char));
        getlinha(arr_docs[i], input);
    }
    for(i = 0; i < 6; i++){
        free(arr_docs[i]);
        arr_docs[i] = NULL;
    }
    fclose(input);
    return 0;
}
  • `malloc()` doesn't initialize the memory that it returns, so it can contain anything. – Barmar Jul 30 '22 at 17:27
  • `calloc()` gives you a zero-initialized buffer – dave Jul 30 '22 at 17:28
  • `arr_docs[i] = (char*)malloc(sizeof(char));` only allocates 1 character. You can't read a 50-character line into that buffer. – Barmar Jul 30 '22 at 17:29
  • [In C you shouldn't cast the return of `malloc` (or `calloc`)](https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc/). – Some programmer dude Jul 30 '22 at 17:29
  • How many strings do you expect to store, having allocated `sizeof(char*)` bytes? – GSerg Jul 30 '22 at 17:29
  • `malloc(sizeof(char*))` only allocates memory for 1 pointer. – Barmar Jul 30 '22 at 17:29
  • So you're causing lots of buffer overflows in your code, resulting in undefined behavior. – Barmar Jul 30 '22 at 17:30
  • Hey guys: I'm planning on storing 6 strings in the array. I added an update to the post, where I eliminated the cast and applied some other recommendations. Still, some errors arise. How should I proceed? –  Jul 30 '22 at 18:02
  • If you know how many strings you want to store, or even just an upper bound on the number of strings, then why are you messing with dynamic allocation? – John Bollinger Jul 30 '22 at 18:09
  • You're still blasting through buffer overflows. Ex: how big a string do you think you can store in the buffer allocated with `char *temp = malloc(sizeof(*temp));` ? Do you know how many chars that buffer is (e.g. do you know how many chars `sizeof(*temp)` is) ? Run your code in a *debugger*. Each time it tries to puke on itself the debugger will *probably* halt and the stack trace can reveal how you got to where you are. Memory watch like valgrind, or `-g -fsanitize=address,undefined` can also help immensely. – WhozCraig Jul 30 '22 at 18:12
  • `char* arr_docs[6];` is an array of 6 pointers. Those pointers are not initialized, they point to "random" addresses. Passing any of those pointers as the first argument of `fgets` invokes *undefined behavior* – UnholySheep Jul 30 '22 at 18:51
  • I finally got it. I added a final update. It works as expected. It was my fault. I was thinking that allocating a pointer to char wouldn't need to specify the upper bound of chars that string could contain. –  Jul 30 '22 at 19:01
  • your final update is full of errors – 0___________ Jul 30 '22 at 19:29
  • @0___________ mind explaining them? I corrected some errors I noticed (like forgetting `fclose`), but I can't see the "full of errors" part. –  Jul 30 '22 at 19:38
  • for example free(input) – 0___________ Jul 30 '22 at 19:41
  • Corrected that already. If you notice any other errors, I would be happy to correct them. –  Jul 30 '22 at 19:47

2 Answers

I suspect that the "0x0" in the first part of your question is not the address of the array element but rather the value stored at that position, which gdb prints as a pointer. malloc() hands you memory as it is: it is not cleared of whatever values were stored there before it was assigned to you. If you want it to be initialized to 0 instead, use `calloc()`.
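As a minimal sketch of that difference, assuming you want room for six pointers as in your later updates (the helper name `make_doc_array` is made up for illustration): `calloc` takes an element count and an element size and returns zero-filled memory. Note that your original `malloc(sizeof(char*))` reserves only one slot, so every index past `arr_docs[0]` is outside the block no matter which function you use.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate room for `count` string pointers.
 * calloc zero-fills the block, so each slot starts out as all-bits-zero
 * (a null pointer on mainstream platforms) instead of leftover heap data.
 * Returns NULL if the allocation fails. */
char **make_doc_array(size_t count) {
    return calloc(count, sizeof(char *));
}
```

A caller would write something like `char **arr_docs = make_doc_array(6);`, check the result against NULL, and eventually `free(arr_docs)`. Each `arr_docs[i]` still needs its own buffer before a string can be copied into it.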

Fulvio

There are still some problems in your latest update:

  • you do not check whether fopen() failed to open the file
  • you do not check for memory allocation failure
  • you do not check for fgets() failure
  • if a line is longer than 49 bytes including the newline, fgets() reads only the first 49 bytes and the rest of the line spills into the next array element.
  • you have undefined behavior if strlen(buf) returns 0.
  • you overwrite the last byte of buf without checking that it is a newline.
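The last four points above can be addressed while still using fgets(). This is only a sketch under a few assumptions: the name `getlinha_checked` and its return convention (0 on success, -1 on end of file or error) are invented here, and overlong lines are silently truncated rather than reported.

```c
#include <stdio.h>
#include <string.h>

/* Sketch: read one line into buf (capacity `size`) using fgets.
 * Returns 0 on success, -1 on end of file or read error.
 * If the line does not fit, the remainder is discarded so that it
 * does not spill into the next call. */
int getlinha_checked(char *buf, int size, FILE *arq) {
    if (size <= 0 || fgets(buf, size, arq) == NULL)
        return -1;                      /* nothing was read */
    size_t len = strcspn(buf, "\n");    /* index of '\n', or strlen(buf) if none */
    if (buf[len] == '\n') {
        buf[len] = '\0';                /* strip the newline only if present */
    } else {
        int c;                          /* long or unterminated line: skip the rest */
        while ((c = getc(arq)) != EOF && c != '\n')
            ;
    }
    return 0;
}
```

Using strcspn() avoids the `buf[size-1]` problem entirely: it never indexes before the start of the buffer, and it only replaces a byte that really is a newline.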

Here is a modified version without dynamic memory allocation that truncates long lines:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// read a line of input
// return -1 and set buf to the empty string at end of file
// otherwise return the line length
// line read is truncated to fit in buf and null terminated if size>0
// truncation can be detected by testing if the return value >= buffer length.
int getlinha(char *buf, int size, FILE *fp) {
    int c, i, j;
    for (i = j = 0; (c = getc(fp)) != EOF && c != '\n'; i++) {
        if (i + 1 < size)
            buf[j++] = (char)c;
    }
    if (size > 0)
        buf[j] = '\0';
    if (i == 0 && c == EOF)
        return -1;
    return i;
}

int main(void) {
    char arr_docs[6][50];
    int i, n;
    FILE *input = fopen("file.input", "r");
    if (input == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", "file.input", strerror(errno));
        return 1;
    }
    for (n = 0; n < 6; n++) {
        // assuming the upper bound size of one doc is 50 chars
        if (getlinha(arr_docs[n], sizeof arr_docs[n], input) < 0)
            break;
    }
    for (i = 0; i < n; i++) {
        printf("%d: %s\n", i + 1, arr_docs[i]);
    }
    fclose(input);
    return 0;
}
chqrlie
  • Thanks for these corrections. I assumed the maximum size of the string is 50 because my teacher is using a file where the longest line has 15 chars. But in a production or serious environment, I should consider the possibility of a line with 50+ chars. –  Jul 30 '22 at 22:22
  • @PedroVinícius: you should consider using `getline` – chqrlie Jul 31 '22 at 04:35
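For reference, a rough sketch of what the `getline` suggestion could look like. `getline` is a POSIX function, not standard C, so this assumes a POSIX system (it is not available with plain MSVC, for example). The helper name `read_docs` and its interface are invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: read up to max_lines lines from fp into docs[].
 * getline() allocates each buffer and grows it as needed, so there is
 * no fixed 50-char limit. The caller must free() every docs[i].
 * Returns the number of lines stored. */
int read_docs(FILE *fp, char *docs[], int max_lines) {
    int n = 0;
    while (n < max_lines) {
        char *line = NULL;       /* NULL + 0 asks getline to allocate */
        size_t cap = 0;
        ssize_t len = getline(&line, &cap, fp);
        if (len < 0) {           /* end of file or error */
            free(line);          /* the buffer may be allocated even on failure */
            break;
        }
        if (len > 0 && line[len - 1] == '\n')
            line[len - 1] = '\0';    /* strip the trailing newline if present */
        docs[n++] = line;
    }
    return n;
}
```

With `char *docs[6];` and an opened `FILE *input`, a call such as `int n = read_docs(input, docs, 6);` fills the first `n` slots, and each one should later be released with `free(docs[i])`.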