
I am working on a short program that reads a .txt file. Initially, I was playing around in the main function, and I had gotten my code to work just fine. Later, I decided to abstract it into a function. Now I cannot seem to get my code to work, and I have been hung up on this problem for quite some time.

I think my biggest issue is that I don't really understand what is going on at a memory/hardware level. I understand that a pointer simply holds a memory address, and a pointer to a pointer simply holds the memory address of another memory address: a short breadcrumb trail to what we really want.

Yet, now that I am introducing malloc() to expand the amount of memory allocated, I seem to lose sight of what's going on. In fact, I am not really sure how to think of memory at all anymore.

So, a char takes up a single byte, correct? If I understand correctly, would a char* then take up a single byte of memory? If we were to have:

char* str = "hello";

Would it be safe to assume that it takes up 6 bytes of memory (including the null character)?

And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.

int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));

Is this syntactically correct so far? Now, please judge my interpretation. We are telling the compiler that we need "size" contiguous memory locations reserved for chars. If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?

Now, if we could go one step further.

int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);

This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:

int file_read(char* filename, int size, char** buffer) {
    // Set up FILE stream 

    // Allocate memory to buffer
    buffer = malloc(size * sizeof(char));

    // Add characters to buffer
    int i = 0;
    char c;
    while((c=fgetc(file))!=EOF){
         *(buffer + i) = (char)c;
         i++;
    }

Adding the characters to the buffer and allocating the memory is what I cannot seem to wrap my head around.

If **buffer is pointing to *str which is equal to null, then how do I allocate memory to *str and add characters to it?

I understand that this is lengthy, but I appreciate the time you all are taking to read this! Let me know if I can clarify anything.

EDIT:

Whoa, my code is working now, thanks so much!

Although, I don't know why this works:

*((*buffer) + i) = (char)c;
karafar
  • `buffer = malloc(size * sizeof(char));` --> `*buffer = malloc(size * sizeof(char));` – BLUEPIXY Sep 21 '17 at 21:05
  • `str = (char*)(size * sizeof(char);` should be `str = (char*)malloc(size * sizeof(char);` – odin Sep 21 '17 at 21:06
  • `char c; while((c=fgetc(file))!=EOF){ *(buffer + i) = (char)c;i++; }` --> `int c; while((c=fgetc(file))!=EOF && i < size-1){ (*buffer)[i++] = c;} (*buffer)[i] = 0;` – BLUEPIXY Sep 21 '17 at 21:07
  • So, does **buffer point to some character and *buffer points to *str? – karafar Sep 21 '17 at 21:07
  • `*buffer` can be regarded as an alias for `str`. – BLUEPIXY Sep 21 '17 at 21:11
  • "default a char* takes up a single byte of memory". Only on very unusual implementations. – EOF Sep 21 '17 at 21:16
  • That's a novel, not a question. Learn [ask] and provide a [mcve]. – too honest for this site Sep 21 '17 at 21:23
  • @EOF Yeah, I know there are some variations depending on the system, but for the questions sake I went with the value that was commonly referenced. – karafar Sep 21 '17 at 21:23
  • @FaridKaradsheh As BLUEPIXY pointed out, the variable c should be declared with type int instead of char, because char may behave as an unsigned type. In that case the condition (c=fgetc(file))!=EOF will always be true. :) – Vlad from Moscow Sep 21 '17 at 21:26
  • @Olaf I know it was long, but I felt like I needed to explain my logic all the way through, so that I knew the predicate was true as well as the conclusion. Thanks for the links, though. I will read them before I post again! – karafar Sep 21 '17 at 21:27
  • @FaridKaradsheh Who claimed that a `char *` taking up a single byte was the most common value? – EOF Sep 21 '17 at 21:27
  • @EOF Oops, I misread that. I thought you had said char, not char*. I meant to word that as a question, so "would a char* take up a single byte of memory?" I'll go ahead and fix that. – karafar Sep 21 '17 at 21:42

2 Answers


So, a char takes up a single byte, correct?

Yes.

If I understand correctly, by default a char* takes up a single byte of memory.

Your wording is somewhat ambiguous. A char takes up a single byte of memory. A char * can point to one char, i.e. one byte of memory, or a char array, i.e. multiple bytes of memory.

The pointer itself takes up more than a single byte. The exact size is implementation-defined, usually 4 bytes (32-bit) or 8 bytes (64-bit). You can check the exact value with printf( "%zu\n", sizeof(char *) ).

If we were to have a char* str = "hello", would it be safe to assume that it takes up 6 bytes of memory (including the null character)?

Yes.

And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.

int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));

Is this syntactically correct so far?

Do not cast the result of malloc. And sizeof(char) is by definition always 1.

If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?

Yes. Well, almost. str* makes no sense, and it's 10 chars, not 10 memory addresses. But str would point to the first of the 10 chars, yes.

Now, if we could go one step further.

int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);

This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:

int file_read(char* filename, int size, char** buffer) {
    // Set up FILE stream 

    // Allocate memory to buffer
    buffer = malloc(size * sizeof(char));

No. You would write *buffer = malloc( size );. The idea is that the memory you are allocating inside the function can be addressed by the caller of the function. So the pointer provided by the caller -- str, which is NULL at the point of the call -- needs to be changed. That is why the caller passes the address of str, so you can write the pointer returned by malloc() to that address. After your function returns, the caller's str will no longer be NULL, but contain the address returned by malloc().

buffer is the address of str, passed to the function by value. Allocating to buffer would only change that (local) pointer value.

Allocating to *buffer, on the other hand, is the same as allocating to str. The caller will "see" the change to str after your file_read() returns.


Although, I don't know why this works: *((*buffer) + i) = (char)c;

  • buffer is the address of str.
  • *buffer is, basically, the same as str -- a pointer to char (array).
  • (*buffer) + i is pointer arithmetic -- the pointer *buffer plus i means a pointer to the element at index i of the array.
  • *((*buffer) + i) is dereferencing that pointer to the ith element -- a single char.
  • to which you are then assigning (char)c.

A simpler expression doing the same thing would be:

(*buffer)[i] = (char)c;
DevSolar
  • I realize now that I forgot the malloc(). I didn't copy my source exactly. Would it be the first of 10 or 11 memory addresses (+1 for the null character)? Thanks, the final bit really clarifies it for me! Also, since sizeof(char) is equal to 1, we don't need to use sizeof(), right? Which would then make the code more efficient. – karafar Sep 21 '17 at 21:20
  • Minor: "safe to assume that it takes up 6 bytes " -- more likely 6 for the literal and 4 or 8 for the pointer. – chux - Reinstate Monica Sep 21 '17 at 21:25
  • @chux Is it 4 or 8 due to padding the location with zeros? – karafar Sep 21 '17 at 21:29
  • @FaridKaradsheh: I am not quite sure what chux is about, either. The pointer itself takes up some space -- usually 4 (32bit) or 8 (64bit) bytes these days. The string literal it points to takes up 6 bytes -- 5 for the characters and one for the terminating null byte. But you were referring to what the pointer *points* to, not the space taken up by the pointer itself. – DevSolar Sep 21 '17 at 21:32
  • @DevSolar Yes, that's it. – chux - Reinstate Monica Sep 21 '17 at 21:32
  • @DevSolar How do you figure that? It's effectively impossible to figure out the size of the memory a pointer points to, so saying that "a char* takes up a single byte of memory" can only really mean what it obviously means: `sizeof(char*) == 1`, which is not common. – EOF Sep 21 '17 at 21:48
  • @EOF: It is `sizeof(char) == 1`, not `sizeof(char*) == 1`. – DevSolar Sep 21 '17 at 21:53
  • @DevSolar That's a more than questionable interpretation of "a char* takes up a single byte of memory". I'm not sure any sane person would be willing to follow it. – EOF Sep 21 '17 at 21:54
  • I think the confusion lies in the syntax. Some of you thought I was referring to the pointer itself, others thought I was referring to what the pointer points to. I was referring to how much space a pointer takes. So, if we had char* x, how large is the block of memory at &x? Apologies for the confusing question. – karafar Sep 21 '17 at 21:56
  • @FaridKaradsheh Depends on the implementation. – EOF Sep 21 '17 at 21:57
  • @FaridKaradsheh: You can check this with `printf( "%zu\n", sizeof(char*) );` -- as mentioned, that is implementation-defined, and usually 4 or 8 (32/64 bit). And indeed I was thinking you were referring to the character it points to, because a one-byte pointer would be pretty useless. (256 bytes of addressable memory?) – DevSolar Sep 22 '17 at 05:28

With char **buffer, buffer is the pointer to the pointer to the char, *buffer accesses the pointer to a char, and **buffer accesses the char value itself.

To pass back a pointer to a new array of chars, write *buffer = malloc(size).

To write values into the char array, write *((*buffer) + i) = c, or (probably simpler) (*buffer)[i] = c

See the following snippet demonstrating what's going on:

#include <stdio.h>
#include <stdlib.h>

void generate0to9(char** buffer) {

    // *buffer dereferences the pointer-to-pointer once, i.e. it writes a (new)
    // pointer value into the address passed in by `buffer`
    *buffer = malloc(11);
    if (*buffer == NULL) {
        return;  // allocation failed; the caller's pointer is NULL
    }
    for (int i=0;i<=9;i++) {
        //*((*buffer)+i) = '0' + i;
        (*buffer)[i] = '0' + i;
    }
    (*buffer)[10]='\0';
}

int main(void) {

    char *b = NULL;
    generate0to9(&b);  // pass a pointer to the pointer b, such that the pointer's value can be changed in the function
    if (b == NULL) {
        return 1;
    }
    printf("b: %s\n", b);
    free(b);
    return 0;
}

Output:

b: 0123456789
Stephan Lechner
  • So, '*buffer = malloc(size)' is changing *str? That is what I was thinking as well. Could you explain what is going on in your third paragraph/sentence? – karafar Sep 21 '17 at 21:11
  • @Stephan Lechner *buffer[i] = c; is an invalid statement. Should be (*buffer)[i] = c – Vlad from Moscow Sep 21 '17 at 21:13