0

I'm implementing a K-means algorithm in C. It works well most of the time, but debugging it with Valgrind tell me that I'm doing an "Invalid read of size 8 - Invalid write of size 8 - Invalid read of size 8" using '''memcpy''' at the beginning. I think the problem isn't there, but where I assign a value to the multidimensional float array element, which memory is dynamically allocated with '''malloc''' with a for loop at some point. 'Cause Valgrind also tell "Address 0x572c380 is 0 bytes after a block of size 80 alloc'd".

I've tried to add 1 to the number of bytes that I allocate, cause I thought that maybe '''malloc''' "needed" more memory to do its job, but nothing changed. I know maybe it's a basic error, but I'm quite new to the language and at my course it wasn't explain anything so "technical". I've tried to search the answer and explanation of the error but I have only found problems with '''char''' arrays, and with those I'd understood the function '''strcpy''' can resolve the issue. What about float arrays? It's the first time a use '''memcpy'''.

Here are pieces of code that raise those Valgrind messages.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void main(){
    FILE* fp; //used to open a .txt file to read
    char buf[100];
    float ** a;
    char * s;
    int i;
    int j;
    int rows = 10;
    fp = fopen("data.txt", "r");
    if(fp==NULL){
        perror("Error at open file.");
        exit(1);
    }

    a = (float**) malloc(rows*sizeof(float*));
    for(i=0; i<rows; i++){
        s = fgets(buf, 100, fp); //reading .txt file
        if (s==NULL){
            break;
        }
        a[i] = malloc(dim*sizeof(float));
        a[i][0] = atof(strtok(s, ","));
        for(j=1; j<dim; j++){
            a[i][j] = atof(strtok(NULL,","));  //save as float value the                token read from a line in file, for example, from line "1.0,2.0,3.0" as first line -> get a[0][1] = 2.0
        }
    }
    fclose(fp);

    m = (float**) malloc(rows*sizeof(float*));
    for (i=0; i<rows; i++){
        m[i]=malloc(dim*sizeof(float)); //not initialized
    }

    memcpy(m, a, rows*dim*sizeof(float));
}

Can someone also help me understand why it works but Valgrind raises these error messages?

erika
  • 15
  • 6
  • 1
    Show the valgrinds verbatim, in full and as text, please. – Yunnosch Jan 14 '19 at 18:38
  • 1
    Also please provide a [mcve]. – Yunnosch Jan 14 '19 at 18:39
  • 3
    That `memcpy` call does *not* do what you apparently think it does. It does a byte-wise copy from the source, which is a *shallow* copy. – Some programmer dude Jan 14 '19 at 18:41
  • Sorry, I've edited: there is no ```righe``` variable, just ```rows``` – erika Jan 14 '19 at 19:08
  • @Someprogrammerdude can you please explain me the correct use of ```memcpy```, in this case? Should I memcpy every float in ```a[i][j]``` in ```m[i][j]``` with a for loop or it's the same issue? – erika Jan 14 '19 at 19:12
  • @Someprogrammerdude I used ```memcpy``` because ```m[i][j] = a[i][j]``` "links" pointers ```a```and ```m```, but I don't want it – erika Jan 14 '19 at 19:19
  • Possible duplicate of [Correctly allocating multi-dimensional arrays](https://stackoverflow.com/questions/42094465/correctly-allocating-multi-dimensional-arrays) – Andrew Henle Jan 15 '19 at 11:24

1 Answers1

0

You're first allocating an array of float*, then allocating several arrays of float so your last memcpy(m, a, rows*dim*sizeof(float)) copies an array of float* (pointers to float) to another one, but using rows * dim floats, which @SomeProgrammerDude rightfully noted. That would copy pointers, and not values.

Also, as pointed by @xing, you're allocating rows but using righe (which you didn't show). It might be a cause of problems.

I would suggest allocating the whole array at once on the first row, then having all other rows pointing to adequate rows:

a = malloc(rows * sizeof(float*));
a[0] = malloc(dim * rows * sizeof(float)); // Allocate the whole matrix on row #0
for (i = 1; i < rows; i++) {
    a[i] = a[0] + i * dim; // sizeof(float) automatically taken into account as float* pointer arithmetics
}
...
m = malloc(rows * sizeof(float*));
m[0] = malloc(dim * rows * sizeof(float));
memcpy(m[0], a[0], dim * rows * sizeof(float));

(add NULL checks of course)

Matthieu
  • 2,736
  • 4
  • 57
  • 87
  • As I said in a previous comment, there is no ```righe``` variable, just ```rows``` (it was just a translation of the variable name, to be more comprehensible! ahah). Anyway, thank you, cause this solution solved the problem also in other points of the code! In the beginning, I didn't understand the difference, maybe I didn't understand well the theory of pointers. Thanks! – erika Jan 15 '19 at 11:28
  • @erika I guess you missed one `righe` from the edit (it still shows in the `for` loop). I also translate my code before pasting it here so I get you :p Pointers are *the* feature I love about C so it's really worth spending some time to fully understand it, as it is *really* powerful (and also error-prone. But as they say "*with great powers ...*" ;)) – Matthieu Jan 15 '19 at 15:22
  • 1
    Ooh, you're right! Thank you @Matthieu! I will surely spend more time on them... as I'm sure one day I'll get "great responsibility" ;) – erika Jan 16 '19 at 14:26