1
char** names=(char**)malloc(count*sizeof(char*));
//while loop
names[count]=(char*)malloc(size+1);//why malloc again?

So char** names=(char**)malloc(count*sizeof(char*)); creates a pointer to a location hosting 4 times count bytes and then stores pointer location to names?

Then in the while loop, size+1 bytes long memory is allocated and its address is given to names[count], which refers to the location pointed to by names? So here the memory created from the first malloc stores the location of memory created by the second malloc? And is the size of memory pointer 4 bytes, so that I can access each names[count] by moving to the next 4 bytes starting from the start of the memory location?

If my thinking is correct, are these correct NASM implementations of these two lines of c code:

;char** names=(char**)malloc(count*sizeof(char*));
  mov eax, dword count
  ;lea ebx, [eax*4]
  imul ebx, eax, 4
  push ebx
  call _malloc
  mov names, eax
  add esp, 4

;names[count]=(char*)malloc(size+1);
  mov ebx, size
inc ebx
push ebx
call _malloc
add esp,4
mov ebx, names
mov ecx, dword count
mov [ebx+ecx*4], eax

Fyi, also, these two lines of code are part of the following c code to extract names from a file and sort them:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  char path[]="names.txt";

  char szName[50];

  int count=0;

  int size=0;

  int x=0;
  int y=0;

  char* temp=0;
  int pos=0;

  FILE* file=fopen(path,"rt");

  if (file){

  while (!feof(file)){

  fscanf(file,"%50s",szName);
  count++;


  }

  }
  else{
       printf("Error opening file\n");
       return 1;
  }

  printf ("Count: %d\n",count);

  char** names=(char**)malloc(count*sizeof(char*));

  rewind(file);

  count=0;



  while (!feof(file)){

  fscanf(file,"%50s",szName);

  size=strlen(szName);

  names[count]=(char*)malloc(size+1);

  strcpy(names[count],szName);

  count++;


  }

  printf("Original file\n");

  for (x=0;x<count;x++){

  printf("Name %d:\t%s\n",x+1,names[x]);

  }


  for (x=0;x<count;x++){

  temp=names[x];
  pos=x;

  for (y=x;y<count;y++){

      if (strcmp(temp,names[y])>0){

      temp=names[y];
      pos=y;

      }    
  }

  temp=names[x];
  names[x]=names[pos];
  names[pos]=temp;    
  }

  printf("Sorted names\n");

  for (x=0;x<count;x++){

  printf("Name %d:\t%s\n",x+1,names[x]);

  }

  system("PAUSE");  
  return 0;
}
haris
  • 149
  • 1
  • 14
  • Do not do `names[count]=(char*)malloc(size+1);`. It is out-of-bound access. – MikeCAT Dec 05 '15 at 14:56
  • It's part of working code given by my instructor. I will post the code in the op. – haris Dec 05 '15 at 14:58
  • Working? It seems **happened** to work. It is considered as dangerous, so do not execute the code! – MikeCAT Dec 05 '15 at 15:00
  • Your instructor is teaching you to write bad code. Want a dynamic array with two levels of indirection? `char (*array)[y] = malloc(x * sizeof *array);`... Want three levels of indirection? `char (*array)[y][z] = malloc(x * sizeof *array);`... As you can see, **there is no need for multiple `malloc`s to produce an array**, and in fact using multiple `malloc`s is wrong. **An array is a contiguous block of memory, in a single allocation**, and most functions rely upon this. **Using multiple `malloc`s breaks quite a few standard functions in serious ways**. – autistic Dec 05 '15 at 15:04
  • Okay I won't use it, but please still help me understand the logic behind it. – haris Dec 05 '15 at 15:04
  • Why do you want to understand broken and/or unnecessary logic? Just get a new professor (one that's less broken and/or unnecessary)... – autistic Dec 05 '15 at 15:05
  • How about asking to the instructor who gave the code? – MikeCAT Dec 05 '15 at 15:06
  • Additionally, you should not try to understand C as though it corresponds with a series of machine code instructions. C is described in terms of an *abstract machine*, which gives credibility to what @MikeCAT said earlier regarding the dangers of undefined behaviour. Your *physical machine* may have given that *undefined behaviour* a definition, but that isn't required to be the same definition for all machines, or even the same definition on your machine at a different time. – autistic Dec 05 '15 at 15:09
  • @Seb I you are using a library which cannot handle an array of dynamically allocated strings, then you may want to start looking for a new library. – Jeffery Thomas Dec 05 '15 at 15:09
  • @JefferyThomas You may want to start writing a new library... – autistic Dec 05 '15 at 15:13
  • I'm very sorry. I missed the `count=0;`. It won't be such a serious problem. By the way, they say `!feof(file)` won't work well and you should use `fscanf(file,"%50s",szName)==1` to tell whether it read some data. Also they say [you shouldn't cast the result of `malloc()` in C](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc). – MikeCAT Dec 05 '15 at 15:14

2 Answers2

2

So char** names=(char**)malloc(count*sizeof(char*)); creates a pointer to a location hosting 4 times count bytes and then stores pointer location to names?

Yes, if sizeof(char*) in your environment is 4.

Then in the while loop, size+1 bytes long memory is allocated and its address is given to names[count], which refers to the location pointed to by names?

The formar part is correct, but the latter is wrong. The address is given to count elements ahead from the location pointed to by names.

So here the memory created from the first malloc stores the location of memory created by the second malloc?

Yes, it does.

And is the size of memory pointer 4 bytes,

It depends on the environment.

so that I can access each names[count] by moving to the next 4 bytes starting from the start of the memory location?

Yes, if sizeof(char*) in your environment is 4.

If my thinking is correct, are these correct NASM implementations of these two lines of c code:

I found no error in the code, if int is a 32-bit integer in your environment.

MikeCAT
  • 73,922
  • 11
  • 45
  • 70
  • _"The address is given to count elements ahead from the location pointed to by names, which is out-of-range."_ `count` is reset to zero before the second loop, so when are you suggesting that `count` will be out of bounds? – Michael Dec 05 '15 at 15:09
  • @Michael I was wrong and I missed the `count=0;`. I'm very sorry. – MikeCAT Dec 05 '15 at 15:14
  • so with the first malloc let's say 8 bytes are allocated: 00000000. Then with the second two 4 bytes are allocated and their addresses are stored in above as follows: 11112222, where 1111 is the address of the location of the first 4 byte memory. – haris Dec 05 '15 at 15:25
1

Your professor is creating an array of strings. The code as written is unnecessarily confusing, but the idea is sound.

Here is a hopefully less confusing example

// copy an existing string just like strdup()
char* stringDuplicate(const char* string) {
    size_t size = strlen(string) + 1; // 1 additional for `\0` terminator
    char* result = (char*)malloc(size * (sizeof(char)));
    strcpy(result, string);
    return result;
}

// copy an existing string array
char** stringArrayDuplicate(const char** stringArray, size_t size) {
    // Allocate an array which can hold size number of strings
    char** result = (char**)malloc(size * sizeof(char *));

    // For each string in stringArray, copy the string
    for (size_t i = 0; i < size; ++i) {
        result[i] = stringDuplicate(stringArray[i]);
    }

    return result;
}
Jeffery Thomas
  • 42,202
  • 8
  • 92
  • 117
  • The main problem I have with this answer is the reference to "array of strings". Unfortunately it is a significant problem, because it's not uncommon for conflicting definitions of "string" to get in the way of a correct understanding of the C standard library. **edit**: I'm also not fond of explicit `void *` casts, particularly of the return value of `malloc`. – autistic Dec 05 '15 at 15:40