
I want to store the words of a text in a two-dimensional character array, where identical words are stored exactly once. The problem is that the program seems not to store the last few words. I have included the whole program, because I can't figure out the problematic part. Here's my attempt:

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#define N 20
#define M 110

int main() { 
    char text[M], c[N][N], word[N];
    int i, j, k=0, l=0, d, v;

    for(i=0; i<N; i++) {
        for(j=0; j<N; j++) {
            c[i][j]='\0';
        }
    } 
    for(i=0; i<M; i++) {
        text[i]='\0';
    }    
    for(i=0; i<N; i++) {
        word[i]='\0';
    }    
    for(i=0; i<M; i++) {
        d=scanf("%c", &text[i]);  
        if(d==EOF) {
            text[i]='\0';
            break;
        }    
    }        
    for(i=0; i<strlen(text); i++) {    
        if(isspace(text[i])==0) {
            word[l]=tolower(text[i]);
            l++;
        }
        if((i==strlen(text)-1)||((isspace(text[i])!=0)&&(isspace(text[i+1])==0))) {    
            for(v=0; v<i; v++) {
                for(j=0; j<N; j++) {
                    if(word[j]!=c[v][j]) {
                        break;
                    }
                }
                if(j==N) {
                    l=0;
                    break;
                }
            }
            if(v==i) {  
                for(j=0; j<l; j++) { 
                    c[k][j]=word[j];
                }
                k++;
                l=0;
            }
        }
    }
    printf("\n\n");
    for(i=0; i<N; i++) { 
        for(j=0; j<N; j++) {
            printf("%c", c[i][j]);
        }    
        printf(" ");
    }        
    return 0;
}

For some inputs, not all words get printed (or perhaps not stored?). For example, with the input a b c d e f g h i k l m n o p followed by Ctrl+D, the output is: a b c d e f g h i k

I noticed that if I increase N and give the same input, the number of words printed also increases. Any help will be much appreciated.

EDIT:

If we replace

for(v=0; v<i; v++)
for(j=0; j<N; j++)
//...
if(j==N)
//...
if(v==i)

with

for(v=0; v<k; v++)
for(j=0; j<l; j++)
//...
if(j==l)
//...
if(v==k)

the program works properly.

John Kall
  • `"I have included the whole program, because I can't figure out the problematic part."` -- Have you tried running your code line by line in a debugger while monitoring the values of all variables, in order to determine at which point your program stops behaving as intended? If you did not try this, then you may want to read this: [What is a debugger and how can it help me diagnose problems?](https://stackoverflow.com/q/25385173/12149471) You may also want to read this: [How to debug small programs?](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/). – Andreas Wenzel Dec 29 '21 at 20:28
  • Even if using a debugger does not actually solve the problem, it should at least help you to isolate the problem and to create a [mre] of the problem, so that it will be easier for other people to help you. Most people will not be willing to debug your whole program for you, as that should be your job. – Andreas Wenzel Dec 29 '21 at 20:32
  • In addition to using a debugger, another tip is to use meaningful variable names. At any one point you have almost 10 one-letter variables, which makes the code very hard to read and debug. – kaylum Dec 29 '21 at 20:33
  • In addition to what @kaylum said, breaking this down into small functions with focused purposes would be very helpful. Though I concede you may not have covered that in your class yet. – Chris Dec 29 '21 at 20:48
  • I haven't used a debugger since I don't really know how to handle it yet. Also I agree, I did poorly with the variable names. Perhaps I should edit the question to change them? – John Kall Dec 29 '21 at 21:02
  • @JohnKall: As long as there are no existing answers that would be invalidated by changing the variable names, there is nothing wrong with changing them. Currently, your question has no answers, only comments (which are unimportant and therefore may be invalidated). – Andreas Wenzel Dec 29 '21 at 21:08
  • @JohnKall: `"I haven't used a debugger since I don't really know how to handle it yet."` -- Then this may be the ideal opportunity to change that? If you are using an [IDE](https://en.wikipedia.org/wiki/Integrated_development_environment), it probably has a built-in debugger, or one can easily be installed. If you are compiling without an IDE, then you will probably want to use GDB. You can search for `GDB tutorial` in Google. – Andreas Wenzel Dec 29 '21 at 21:18
  • I wrote the program on onlinegdb.com , which has a debugger. I'm currently learning its use and I will probably be able to use it soon. – John Kall Dec 29 '21 at 21:37
  • Your sample input is not alphabetical. There seems to be a `j` missing in `a b c d e f g h i k l m n o p`. Is this intentional? – Andreas Wenzel Dec 29 '21 at 22:30
  • @JohnKall: I'm not sure if I would recommend onlinegdb.com for debugging, because I have encountered multiple bugs when using it. – Andreas Wenzel Dec 29 '21 at 22:49
  • Sounds like it would be ideal to use a hash table instead of an array? – Neil Dec 30 '21 at 01:05

1 Answer

How to find the bug with a debugger

First of all, so that you don't have to enter a long line of input whenever you restart your program in the debugger, I suggest that you replace the code

for(i=0; i<M; i++) {
    d=scanf("%c", &text[i]);  
    if(d==EOF) {
        text[i]='\0';
        break;
    }    
} 

with:

strcpy( text, "a b c d e f g h i j k l m n o p" );

That way, the input that you specified in your question will be hard-coded into the program, and keyboard input will no longer be required.

If you run your program in a debugger, setting breakpoints in key places and then stepping through the program line by line, you will notice that in the line

if(word[j]!=c[v][j]) {

the variable v sometimes has the value 20. This is accessing the array c out of bounds, because valid indices are only 0 to 19. This is causing undefined behavior, and causing your program to falsely believe that the word already exists in the array, so that it does not print it.

How to fix the bug

The bug described in the previous section is probably due to the line

for(v=0; v<i; v++) {

being wrong. The variable v should be limited to N, not i. So you should change that line to the following:

for(v=0; v<N; v++) {

If you then also change the line

if(v==i) {

to

if(v==N) {

then the program has the intended output.
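
Another way to keep the comparison inside the array is to loop only over the rows that have been filled so far and to compare whole strings. This is a different bound than `v<N`: it uses the number of stored words (the `k` of the original program), as the asker's edit also does. The following is a minimal sketch, assuming every stored row and every word is null-terminated; `find_word` and `add_unique` are hypothetical helper names, not part of the original program.

```c
#include <string.h>

#define N 20 /* as in the question: number of rows in c and bytes per row */

/* Hypothetical helper: returns the row index of word among the first
   count rows, or -1 if the word has not been stored yet. */
static int find_word(char words[N][N], int count, const char *word) {
    for (int v = 0; v < count; v++) { /* bounded by count, never by the text index */
        if (strcmp(words[v], word) == 0) {
            return v;
        }
    }
    return -1;
}

/* Hypothetical helper: stores word if it is new, fits in a row together
   with its terminator, and a free row is left.
   Returns the updated number of stored words. */
static int add_unique(char words[N][N], int count, const char *word) {
    if (count < N && strlen(word) < N && find_word(words, count, word) < 0) {
        strcpy(words[count], word);
        count++;
    }
    return count;
}
```

Calling `add_unique` with "a", "or" and "a" in turn on a zero-initialized array leaves two stored words, because the second "a" is found in row 0 and skipped.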

Other remarks

It is also worth noting that you're sending nearly 400 null characters to the output stream. Your terminal/console seems to be ignoring them instead of displaying them, but some terminals may behave differently. Therefore, you may want to change the lines

for(i=0; i<N; i++) {
    for(j=0; j<N; j++) {
        printf("%c", c[i][j]);
    }
    printf(" ");
}

to the following:

for(i=0; i<N; i++) {
    printf("%s", c[i]);
    printf(" ");
}

That way, the null characters will not be sent to the output stream. However, if you do this, you can only store 19 characters instead of 20 characters in every string, because one extra byte is now required for the terminating null character. You may want to change your code to enforce this limit.
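
One possible way to enforce that limit is to cut each word off at N-1 characters while it is being extracted from the text. The following is only a sketch under that assumption; `next_word` is a hypothetical helper name, and silently truncating long words is a design choice that the original program does not make.

```c
#include <ctype.h>
#include <stddef.h>

#define N 20 /* as in the question: bytes available per stored word */

/* Hypothetical helper: copies the next whitespace-delimited word from
   text into word, lowercased, keeping at most N-1 characters so that
   the terminating null character always fits.
   Returns a pointer just past the word, or NULL if no word is left. */
static const char *next_word(const char *text, char word[N]) {
    size_t len = 0;

    /* skip leading whitespace */
    while (*text != '\0' && isspace((unsigned char)*text)) {
        text++;
    }
    if (*text == '\0') {
        return NULL;
    }
    /* consume the whole word, but store only the first N-1 characters */
    while (*text != '\0' && !isspace((unsigned char)*text)) {
        if (len < N - 1) {
            word[len++] = (char)tolower((unsigned char)*text);
        }
        text++;
    }
    word[len] = '\0';
    return text;
}
```

The casts to `unsigned char` are there because passing a negative `char` value to `isspace` or `tolower` is undefined behavior.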

Andreas Wenzel
  • Thanks a lot for taking the time to explain! All words get stored now. I found a logical error in the initial code: In the buggy lines instead of `i` , I wanted to write `k` . That change seems to also solve the problem. – John Kall Dec 30 '21 at 09:57
  • Unfortunately, there is another problem: if a word has more than one letter, the following words always get stored again. For example, `a or a` prints `a or a` instead of `a or`. My question has been answered, so maybe I'll try to debug it myself ;) – John Kall Dec 30 '21 at 10:14
  • I made some changes and now it works normally. Again, thank you for your help. Next time, I will use a debugger before asking such a question. – John Kall Dec 30 '21 at 12:09
  • @JohnKall: I am pleased I was able to help. Although I helped you solve your immediate problem, I believe the most valuable lesson that you learnt was how to identify the exact problem yourself, using a debugger. – Andreas Wenzel Jan 02 '22 at 03:07