1

I have some problems with my C code.

The code is intended to read a specific type of txt file (it's used in a C project of a basic search engine by similarity, which can search image, audio or text files).

Here's the code :

typedef struct HISTOGRAMME_E{
    int ** valeurs;
    int nbcolonne;
    int nbligne;
    char type;
}HISTOGRAMME;

HISTOGRAMME * lireDescripteur(FILE * read){

    HISTOGRAMME * retour = malloc(sizeof(HISTOGRAMME));


    int etat = 0;
    int id, type, nbligne, nbcolonne;
    unsigned int max;
    unsigned int cpt = 0;
    int i;
    char canswitch = 1;
    char* val = malloc(sizeof(char));

    int ** histoTempo;


    while (fscanf(read,"%s",val) == 1) {
        // Fonctionnement en MAE

        // Actions
        if(etat == 1){
            id = atoi(val);
            etat = 0;
        }

        if(etat == 2){
            if(strcmp(val, "RGB"))
                type = 3;
            else
                type = 1;
            etat = 0;
        }

        if(etat == 3){
            nbcolonne = atoi(val);
            etat = 0;
        }

        if(etat == 4){
            nbligne = atoi(val);
            etat = 0;
        }



        // Valeurs

        if(etat == 5){
            max = nbligne * nbcolonne;
            histoTempo = malloc(sizeof(int*)*2);
            histoTempo[0] = malloc(sizeof(int)*max);
            histoTempo[1] = malloc(sizeof(int)*max);
            cpt = 0;
            canswitch = 0;
            histoTempo[0][0] = (int)strtol(val, NULL, 0);
            etat = 52;
        }

        if(etat == 51 && canswitch){
            histoTempo[0][cpt] = (int)strtol(val, NULL, 0);
            etat = 52;
            canswitch = 0;
        }


        if(etat == 52  && canswitch){

            histoTempo[1][cpt] = atoi(val);
            etat = 51;
            canswitch = 0;
            cpt += 1;
        }





        // Changement d'états
        if(strcmp(val, "<id>") == 0 && (etat == 0))
            etat = 1;

        if(strcmp(val, "<type>") == 0 && (etat == 0))
            etat = 2;

        if(strcmp(val, "<nbcolonne>") == 0 && (etat == 0))
            etat = 3;

        if(strcmp(val, "<nbligne>") == 0 && (etat == 0))
            etat = 4;

        if(strcmp(val, "<valeurs>") == 0 && (etat == 0))
            etat = 5;


        //if(strcmp(val, "</valeurs>") == 0 && ((etat==51) || (etat == 52))) 
        if(strcmp(val, "</valeurs>") == 0)
            {
                //affichage debug
                printf("id:%u, type:%u, nbcolonne:%u, nbligne:%u\n", id, type, nbcolonne,nbligne);
                /*for(i=0;i<cpt;i++){
                    printf("%x : %u \n", histoTempo[0][i], histoTempo[1][i]);
                }*/

                int ** histogramme = malloc(sizeof(int*)*2);
                histogramme[0] = malloc(sizeof(int)*cpt);
                histogramme[1] = malloc(sizeof(int)*cpt);

                for(i=0;i<cpt;i++){
                    histogramme[0][cpt] = histoTempo[0][cpt];
                    histogramme[1][cpt] = histoTempo[1][cpt];
                }
                cpt = 0;
                etat = 0;
                retour->valeurs = histogramme;
                retour->nbcolonne = nbcolonne;
                retour->nbligne = nbligne;
                retour->type = type;


                nbligne = 0;
                nbcolonne = 0;
                type = 0;
                free(histoTempo[0]);
                free(histoTempo[1]);
                free(histoTempo);
                free(val);
                return retour;
            }
        canswitch =1;
    }
    return retour;
}

void test()
    {
    FILE * read =  fopen(FICHIER_DESCRIPTEUR,"r");


    HISTOGRAMME * test;
    int i = 0;

    test = lireDescripteur(read);

    //printf("%i\n\n\n", test->valeurs[1][1]);
    fclose(read);
    free(test);
}

Here's Valgrind log:

> 220 bytes in 1 blocks are indirectly lost in loss record 1 of 3
>> ==21968==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>> ==21968==    by 0x402AF2: lireDescripteur (index_img.c:905)             
>> ==21968==    by 0x402C62: test (index_img.c:942)                             
>> ==21968==    by 0x402C8F: main (index_img.c:958)               


> ==21968== 220 bytes in 1 blocks are indirectly lost in loss record 2 of 3
>> ==21968==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>> ==21968==    by 0x402B13: lireDescripteur (index_img.c:906)            
>> ==21968==    by 0x402C62: test (index_img.c:942)                  
>> ==21968==    by 0x402C8F: main (index_img.c:958)



> ==21968== 456 (16 direct, 440 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
>> ==21968==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>> ==21968==    by 0x402ADF: lireDescripteur (index_img.c:904)         
>> ==21968==    by 0x402C62: test (index_img.c:942)                 
>> ==21968==    by 0x402C8F: main (index_img.c:958)

It seems it's comming from the malloc of "histogramme" in "lireDescripteur" but I can't grasp why. (The three line of code pointed by valgrind :)

int ** histogramme = malloc(sizeof(int*)*2);
histogramme[0] = malloc(sizeof(int)*cpt);
histogramme[1] = malloc(sizeof(int)*cpt);

The function "lireDescripteur" should return a pointer on a structure of type "HISTOGRAMME", with its "valeurs" pointing on "histogramme".

Complete code

Complete valgrind log

Gautier A.
  • 13
  • 5
  • 2
    `char* val = malloc(sizeof(char));`: that's allocating 1 byte. not enough – Jean-François Fabre Jan 04 '18 at 21:19
  • 1
    too many `malloc`s IMO. I only `malloc` if I don't know how much data I need until run time, or if I need "a lot" of data. Otherwise, just put everything in automatic storage. It will reduce the possibility for headaches like this. For example, if you know at compile time you only need one `HISTOGRAMME` .. just put it in automatic storage rather than `malloc`ing from the heap. – yano Jan 04 '18 at 21:27
  • note that you also have this issue in some other part of the linked code: `lireBase`. Don't malloc 1 element. Use local arrays that can hold your tokens. – Jean-François Fabre Jan 04 '18 at 21:29
  • 1
    @yano Thank you for your response. The problem is that it indeed needs dynamic allocation. I'm reading a file which can be of different lenght (it can have basically 2³ to 2¹² differents variables (2¹² because of 4 bits for each RVB channel of the image), and I don't think I can allocate 2¹² bits for each. (I'm trying to stock the "HISTOGRAMME" structure type in a LIFO). – Gautier A. Jan 04 '18 at 21:58
  • the posted code does not compile! Amongst other things, the posted code is missing the needed `#include` statements. Do you expect us to guess as to which header files your code actually includes? – user3629249 Jan 06 '18 at 05:17
  • when calling `fopen()`, always check (!=NULL) the returned value to assure the operation was successful. (and if not successful, call `perror()` so the enclosed text and the reason the system thinks the function failed to `stderr` – user3629249 Jan 06 '18 at 05:20
  • for flexibility, the definition of a struct should be separate from any `typedef` for that struct. – user3629249 Jan 06 '18 at 05:22
  • when calling any of the heap allocation functions: (malloc, calloc, realloc) always check (!=NULL) the returned value to assure the operation was successful. If not successful, then call `perror()` to output the enclosed text and the reason the system thinks the error occurred to `stderr` – user3629249 Jan 06 '18 at 05:27
  • regarding: `char* val = malloc(sizeof(char)); ... while (fscanf(read,"%s",val) == 1)` The size of the allocated heap for `val` is only one byte` So the call to `fscanf()` must not actually input anything as the input/format specifier '%s' will always append a NUL byte to the input. Note: when using that input/format specifier, always use a MAX CHARACTERS modifier that is one less than the length of the input buffer. This will avoid any buffer overflow, which would be undefined behavior and can lead to a seg fault event. – user3629249 Jan 06 '18 at 05:38
  • for ease of readability and understanding: 1) consistently indent the code. 2) follow the axiom: *only one statement per line and (at most) one variable declaration per statement.* – user3629249 Jan 06 '18 at 05:40
  • the posted code is missing a `main()` function, so we cannot reproduce the problem. please post a [mcve] – user3629249 Jan 06 '18 at 05:43
  • regarding: `retour->type = type;` this is assigning a `int` to a `char` with out a cast, so this will result in the compiler outputting a warning message. – user3629249 Jan 06 '18 at 05:46
  • regarding: `FILE * read = fopen(FICHIER_DESCRIPTEUR,"r");` the ?macro? `FICHIER_DESCIPTEUR` is not defined anywhere in the posted code. So the posted code will not compile – user3629249 Jan 06 '18 at 05:48
  • in function: `test()`, the variable `i` is not used, so it results in the compiler outputting a warning message – user3629249 Jan 06 '18 at 05:49
  • regarding: `for(i=0;i – user3629249 Jan 06 '18 at 05:51
  • the function: `atoi()` does not announce when an error occurs. Strongly suggest using `strtol()` – user3629249 Jan 06 '18 at 05:53

1 Answers1

4

The issue valgrind points out seems to be outside the code you're showing, but in the code you're showing, you also have a memory leak because you're not freeing val in all return paths: here:

    canswitch =1;
}
return retour;  // you're not freeing `val` here

but more importantly you have undefined behaviour because val is an array of 1 byte, and you're putting way more data in it (see: How dangerous is it to access an array out of bounds?).

I'd suggest a local array, with a correct size (increase if you have bigger words):

char val[100];

int ** histoTempo;


while (fscanf(read,"%99s",val) == 1) {  // safe fscanf: cannot read more than 99 chars

and don't free(val) in the end. That way you're solving the memory leak and the memory out-of-bounds that can crash your program.

(your program has this issue several times, from the link you provided. As a general rule, use properly sized local arrays when you don't need to return the buffer to the caller)

Jean-François Fabre
  • 137,073
  • 23
  • 153
  • 219
  • Thank you for your comment. Indeed, it was another memory leak, and I fixed it thanks to you. I'll look forward to correctly size my variables. Concerning the issue valgrind points out, may I ask you if you're sure it's not in the code ? I honestly don't know where it can come from. I'm only compiling and executing the code that I gave in the description, and I can give you the file I read. – Gautier A. Jan 04 '18 at 21:55
  • at some point in your full code in `lireDescripteur` you're allocating `histogramme` and assign it to `retour->valeurs = histogramme;` but I never saw a free with `valeurs` so the memory leak may well be there. But I'm not putting that in my (partial) answer since your question only links to your code. – Jean-François Fabre Jan 04 '18 at 22:03
  • Indeed, it this free was lacking. It solved my problem, thank you a lot. – Gautier A. Jan 04 '18 at 22:08
  • you're welcome. Feel free to accept the answer, and after that, look up C++ and its wonderful `std::vector` class, which allows very smart memory management & reallocation (C is horrible in that aspect). – Jean-François Fabre Jan 04 '18 at 22:13
  • May I ask you an additional question ? As I said before, the main idea of this function is to read a file, and create a structure of type "Histogramme". After having creating this structure, I want to store them up in a LIFO. The problem is that it crashes after something like 3 creation of descriptor, out of 63. I can't really "skirt" the malloc... Do you have any idea how to get around this problem ? Again, thanks for your help ! – Gautier A. Jan 04 '18 at 22:13
  • Sadly, I can't use C++. It's a school project, and C is imposed. The use of LIFO is also imposed... But thank you for your suggestion. – Gautier A. Jan 04 '18 at 22:15
  • so when you quit school, drop C and use C++ for this (or python if you don't need compiled language). additional question: well, that would be another question. Try to reduce your case to a [mcve] so it'll be well recieved and answered. But don't edit that one. That would make this Q&A confusing. – Jean-François Fabre Jan 04 '18 at 22:17