0

I am trying to write a C program that filters lines: when the input contains consecutive duplicate lines, it should print only one of them. I have to use arrays of chars to compare the lines. The size of the arrays is inconsequential (set at 79 chars for the project). I have declared the arrays like this:

char newArray [MAXCHARS];
char oldArray [MAXCHARS];

and have filled the array by using this for loop, to check for newlines and the end of file:

 for(i = 0; i<MAXCHARS;i++){
         if((newChar = getc(ifp)) != EOF){
                 if(newChar != '/n'){
                           oldArray[i] = newChar;
                           oldCount++;
                  }
                  else if(newChar == '/n'){
                           oldArray[i] = newChar;
                           oldCount++;
                           break;
                  }
         }
         else{
              endOf = true;
              break;
         }
}      

To cycle through the next line(s) and search for duplicates, I am using a while loop that is initially set to true. It fills the next array up to the newline and tests for EOF as well. Then, I use two for loops to test the arrays. If they are the same at each position in the arrays, duplicate remains unchanged and nothing is printed. If they are not the same, duplicate is set to false and a function (testArrays) is called to print the contents of each array.

 while(duplicate){
         newCount = 0;
         /* fill second array, test for newlines and EOF*/
         for(i =0; i< MAXCHARS; i++){
                if((newChar = getc(ifp)) != EOF){
                       if(newChar != '/n'){
                           newArray[i] = newChar;
                           newCount++;
                       }
                       else if(newChar == '/n'){
                              newArray[i] = newChar;
                              newCount++;
                              break;
                       }
                }
                else{                 
                        endOf = true;
                         break;
                }
         }
/* test arrays against each other to spot duplicate lines*
  if they are duplicates, continue the while loop getting new 
  arrays of characters in newArray until these tests fail*/
        for(i =0; i< oldCount;  i++){
               if(oldArray[i] == newArray[i]){
                     continue;
               }
              else{
                    duplicate = false;
                     break;
               }
        }
        for(i =0; i <newCount; i++){
                if(oldArray[i] == newArray[i]){
                       continue;
                }
                else{
                     duplicate = false;
                     break;
                }
        }

        if(endOf && duplicate){
                testArray(oldArray);
                break;
         }
}      
if((endOf && !duplicate) || (!endOf && !duplicate)){
         testArray(oldArray);
         testArray(newArray);
}      

I find that this does not work: consecutive identical lines are still being printed. I cannot figure out how this could be happening. I know this is a lot of code to wade through, but it is pretty straightforward and I think that another set of eyes will spot the problem easily. Thanks for the help.

z.rubi

4 Answers

3

Is there a reason why you read a character at a time instead of calling fgets() to read a whole line?

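char instr[MAXCHARS];
int iline;

/* pass the real buffer size to fgets so it cannot write past instr */
for( iline = 0; fgets( instr, sizeof instr, ifp ) != NULL; iline++ ) {

    /* strcmp() the current line against the previous line here */

}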

EDIT: You might want to declare two character arrays and three char pointers: one pointing to the current line, one pointing to the previous line, and a third used as a temporary to swap the other two after each read.

CFan
  • What happens if the line is shorter than the maximum number of characters (in this case MAXCHARS is set to 79)? Say the line is only 30 chars followed by a newline char. Would fgets treat this as a string of 31 chars, or would it think that the next 48 chars should also be read in? – z.rubi May 01 '18 at 02:54
  • Then `fgets` stops reading right after the newline and stores only the characters it read, followed by the *nul-terminating* character. (You also need to remove the `'\n'` that `fgets` includes in the buffer it fills by overwriting it with the *nul-terminating* character, e.g. `'\0'` or just `0`.) – David C. Rankin May 01 '18 at 03:04
1

You need to use a function to read lines — either fgets() or one you write (or POSIX getline() if you are familiar with dynamic memory allocation).

You then need to use an algorithm equivalent to:

  1. Read first line into old.
  2. If there is no line (EOF), stop.
  3. Print the first line.
  4. For every extra line read into new.
    • If there is no line (EOF), stop.
    • If new is the same as old, go to step 4.
    • Print new.
    • Copy new to old.
    • Go to step 4.

Those 'go to' steps would be part of normal loop controls, not actual goto statements.

Jonathan Leffler
0

I would do it by strings instead of char by char. I would use fgets(str, MAX_CHARS, stdin) to get the full input line and strcmp it against the previous string. (Do not use gets() for this: it has no way to limit how much it writes into the buffer.) strcmp assumes your strings are nul-terminated, and you may need special EOF handling, but something like what's below should work:

#include <stdio.h>
#include <string.h>

#define MAX_CHARS 80 /* assumed value: 79 chars plus the terminating nul */

int main(void){
  char newStr[MAX_CHARS] = {0}; //string for new input
  char oldStr[MAX_CHARS] = {0}; //previous line, kept for the comparison

  // Loop over input as long as there is something to read
  while(fgets(newStr, sizeof(newStr), stdin) != NULL){
    if(strcmp(newStr,oldStr) != 0){
      printf("%s", newStr); //fgets keeps the '\n', so none is added here
    }
    else{
      //This is the case when you have duplicate strings. Don't print
    }

    memset(oldStr, 0, sizeof(oldStr)); //clear out old string in case it was longer
    strcpy(oldStr, newStr); //copy new string into old string for future compare
  }
  return 0;
}
Bwebb
  • So is there a way to add a null terminate on the end if you get the input line by line? How would you count in order to reference the right element or the right pointer in memory? – z.rubi May 01 '18 at 02:21
  • I am not sure what you mean by "count in order to reference the right place in memory". memset(oldStr, 0 , sizeof(oldStr)) will make sure the string is nul terminated on the next read. newStr should read a whole line and be something like "adsfadfasdfasdf\n\0" if you use the fgets(newStr, MAX_CHARS, stdin) implementation. – Bwebb May 01 '18 at 02:31
  • This example main function only has two strings, oldStr and newStr, and it does not save copies of them, except when oldStr is assigned the contents of newStr for the next comparison. If you need to access or save the strings you're reading in, you need to create more variables than the sample code provides. – Bwebb May 01 '18 at 02:32
  • The [`gets()` function is too dangerous to be used — ever!](https://stackoverflow.com/questions/1694036/why-is-the-gets-function-dangerous-why-should-it-not-be-used). – Jonathan Leffler May 01 '18 at 03:34
0

At the part where you test for duplicates, maybe you could check whether oldCount == newCount first? My reasoning is that if the new line is a duplicate, oldCount will equal newCount. Only if that is true do you need to compare the two arrays element by element.