0

Hello. I got a project about the Longest Common Subsequence

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int max(int a, int b);


int lcs(char *X,char *Y,int m,int n){
    //int L[m+1][n+1];
    int i, j;
    int **L; 

    L = (int **)malloc(1*sizeof(int));

    for(i=0;i<=m;i++){
        for(j=0;j<=n;j++){
            L = (int **) realloc(L[i],j*sizeof(int*));
            if(i==0 || j==0)
                L[i][j]=0;
            else if(X[i-1]==Y[j-1])
                L[i][j]=L[i-1][j-1]+1;
            else
                L[i][j]=max(L[i-1][j],L[i][j-1]);
        }
    }
/*  
    printf("\n");
    for(i=0;i<=m;i++){
        for(j=0;j<=n;j++){
            printf("%d ",L[i][j]);
        }
        printf("\n");
    }
*/
   return L[m][n];
}

int max(int a, int b)
{
    int max;
    if(a>b)
        max=a;
    else 
        max=b;
    return max;
}


int main(){
    FILE *file;
    char c;
    int m=0, n=0, flag=0;
    file=fopen("in1.txt","r");
    if(file==NULL){
        printf("Error opening file.\n");
        exit(0);
    }

    if(file){
        while((c=fgetc(file))!=EOF){
            if(c!=' ' && c!='\n' && flag==0)
                m++;
            if(c=='\n')
                flag=1;
            if(c!=' ' && c!='\n' && flag==1)
                n++;
        }
    }

    char X[m], Y[n];
    int i=0;
    rewind(file);
    flag=0;
    while((c=fgetc(file))!=EOF){
        if(c!=' ' && c!='\n' && flag==0){       
            X[i]=c;
            i++; 
        }
        if(c=='\n'){
            flag=1;
            i=0;
        }               
        if(c!=' ' && c!='\n' && flag==1){
            Y[i]=c;
            i++;        
        }
    }


    printf("Length of LCS is %d .\n", lcs( X, Y, m, n ) );
    fclose(file);
    return 0;
}

For small matrixs it worked fine but for bigger sequences I get a stack overflow. I was advised "You shouldn't store 2D-matrix inside main() function or any other function (it may cause stack overflow). Maybe move it outside of functions or allocate dynamically." but I dont know what the person means by moving it outside of functions and I dont know to allocate 2D matrix.

Sorry for the long post. Thank you

João Vieira
  • 121
  • 1
  • 9
  • "Move outside of functions" means "use global variables"; it works (at least for simple programs), but it's not really recommended, and it becomes hard to manage and error prone for more complex programs. "Allocate dynamically" means just that, use dynamic memory allocation (`malloc()`, `free()` and friends); it's more complicated you need to learn anyway. – AlexP Dec 05 '17 at 20:03
  • int **L; L = (int **)malloc(1*sizeof(int)); for(i=0;i<=m;i++){ for(j=0;j<=n;j++){ L = (int **) realloc(L[i],j*sizeof(int*)); i've tried using malloc and realloc. still cant get result – João Vieira Dec 05 '17 at 20:30
  • You haven't used `malloc` in the code. Even if you did, `int *L; L = (int **)malloc(1*sizeof(int));` is going nowhere. Please don't cast the return value from `malloc`, and enable compiler warnings. What is the actual question? Show the code where you allocate memory: in the question. – Weather Vane Dec 05 '17 at 20:34
  • Aside: `char c;` ==> `int c;` because most library functions do not return (or take) `char` but `int`. – Weather Vane Dec 05 '17 at 20:38
  • Aside: `max` and `min` are macros provided by MSVC and (AFAIK) gcc libraries. – Weather Vane Dec 05 '17 at 20:39
  • as mentioned i need to show the length of the longest common subsequence. i get it from a file and then i need to do the matrix. heres the code of the matrix https://pastebin.com/1BUFuP3n – João Vieira Dec 05 '17 at 20:40
  • `int **L;` is a pointer to a pointer to `int`. So `L = (int **)malloc(1*sizeof(int));` is wrong. You must at least allocate memory for one pointer, which may not be the same size as `int`. So `L = malloc(1*sizeof(int*));`. But after that, I can't follow your logic. The function is passed `int m,int n` so why aren't you using them to allocate enough memory, and forget the `realloc` malarky? – Weather Vane Dec 05 '17 at 20:53
  • I could declare L with both m and n right away but when I do that I cant use my program for big sequences. I believe it gets a stack overflow since the L matrix would be to big to be done.About the code i posted, the idea was to alocate a L[0][0] matrix and then everytime i needed another spot i would give it necessary memory which doesnt seem to be working either cuz i cant use that method or i dont know how to use dynammic memory on 2D vectors – João Vieira Dec 05 '17 at 21:02
  • If you have a real C compiler, that is at least C99, allocate a 2D matrix as `int (*mat)[Y] = malloc(sizeof(int[X][Y]));` and use it as usual. No need to allocate the rows separately etc. see also https://stackoverflow.com/questions/24680166/allocating-a-2d-contiguous-array-within-a-function – Jens Gustedt Dec 05 '17 at 21:10

1 Answers1

1

Replace

char X[m], Y[n];

by

char *X = malloc(m * sizeof(char));
char *Y = malloc(n * sizeof(char));

and at the end of your main add:

free(X);
free(Y);

Now you have dynamic allocation on the heap. What you did isn't standard anyways (variable length array on the stack, see here). Apart from that I haven't checked your code. At a first glance, it seems strange that you first split into n and m when counting and then use i for both arrays when filling.

Update:
Now I see you edited your code from your original post. Here is how to do it with a double pointer.

Allocation:

int **L = malloc((m + 1) * sizeof(int*));
for (int i = 0; i < m + 1; i++)
    L[i] = malloc((n + 1) * sizeof(int));

Freeing:

for (int i = 0; i < m + 1; i++)
    free(L[i]);
free(L);

And remove the realloc. And don't forget that you can't return L[m][n] after freeing, so store that value, then free, then return the value.

Pedro
  • 842
  • 6
  • 16
  • I did it and i still cant get it working. the problem seem to be building the matrix in lcs function – João Vieira Dec 05 '17 at 22:34
  • Then it's not a stack overflow (at least not any more) but buggy code. Now, looking at your `lcs` function, I can say at least the allocation part is definitely buggy. Check out how to use double pointers (see e.g. [here](https://stackoverflow.com/a/16001894/563940)) or how to implement 2D address calculation yourself (e.g. see [here](https://stackoverflow.com/a/2151141/563940)). – Pedro Dec 05 '17 at 22:59