0

I wrote this short program to read a text file and copy its contents to a new txt file, doing some character substitution in the process.

My problem is that the code does everything it is supposed to do, but it doesn't end. It can't find the EOF special character at the end of the file (arq1) that would tell it to finish processing.

What is the problem?

#include<stdio.h>
#include<stdlib.h>

typedef enum {false, true} Boolean;

int main(){

  FILE *arq1, *arq2;
  char filename[30];
  char first = '*' , second = '*';
  Boolean f_use = false, s_use = false;
  char aux;

  printf("Filename: ");
  scanf("%s", filename);

  arq1 = fopen(filename, "r");
  arq2 = fopen("Codes_out.txt", "w");

  while(fscanf(arq1, "%c", &aux) != EOF && aux != '\n')
    fprintf(arq2, "%c", aux); /*Copy first line*/

  fprintf(arq2, "\n");

  while(aux != EOF){ //#

    printf("Processing new line\n"); // TEST
    f_use = false;
    s_use = false; 

    while(aux != '\t' && aux != ' ' && aux != EOF){ /* Copy locus ID*/
      printf("%c", aux);//TEST
      fscanf(arq1, "%c", &aux);
      fprintf(arq2, "%c", aux);
      printf("Copying ID\n");//TEST
    }
    printf("ID copied\n");//TEST

    while(fscanf(arq1, "%c", &aux) != EOF && aux != '\n'){ //##

      /*If a code for nitrogen base is found, identify
        as first or second state and substitute
        for a numeric code (1 or 2)*/
      if(aux == 'C' || aux == 'G' || aux == 'A' || aux == 'T' ||
         aux == 'c' || aux == 'g' || aux == 'a' || aux == 't'){

        if(f_use == false){
          /*First base not yet identified*/
          first = aux;
          f_use = true;
          printf("OK 6a\n\n"); //TEST
          printf("first = %c\n", first); //TEST
        }
        else if(s_use == false && aux != first){
          /*second base not yet identified
            and aux different from first base*/
          second = aux;
          s_use = true;
          printf("OK 6b\n\n"); //TEST
          printf("second = %c\n", second); //TEST
        }

        if(aux == first){
          fprintf(arq2, "1");
          printf("OK 5a\n\n"); //TEST
        }
        else if(aux == second){
          fprintf(arq2, "2");
          printf("OK 5b\n\n"); //TEST
        }
      }

      else if(aux == ' ')
        fprintf(arq2, "%c", aux);

      else if(aux == 'N' || aux == 'n')
        fprintf(arq2, "%c", aux);

      else
        fprintf(arq2, "3");

    } //##
    printf("%c ", aux);
    fprintf(arq2, "\n"); /*add line break*/
    printf("OK 7\n\n"); //TEST
  } //#

  printf("Processing finished\n"); //Control
  fclose(arq1);
  fclose(arq2);

  return 0;
}

Here is the link for the input file

Kaiser
  • possible duplicate of [How to use EOF to run through a text file in C?](http://stackoverflow.com/questions/1835986/how-to-use-eof-to-run-through-a-text-file-in-c) – edmz Jan 27 '15 at 14:18
  • The printf calls marked with //TEST are temporary, for debugging – Kaiser Jan 27 '15 at 14:19
  • @Kaiser your code actually needs a lot of improvement, but the main issue is addressed in my answer. I suggest you make your code follow its own logic in a clean way; I don't even want to try to guess what it does. – Iharob Al Asimi Jan 27 '15 at 14:38

2 Answers

3

EOF is not a character, so you can't read it with fscanf(arq1, "%c", &aux) and then test aux != EOF: aux only ever holds the last character that was successfully read. Instead, check fscanf()'s return value, which is either EOF or the number of matched arguments.

Try with something like this

int status;

while ((status = fscanf(arq1, "%c", &aux)) != EOF)
{
    /* ... */
    if (status == 1)
        fprintf(arq2, "%c", aux);
    /* ... */
}
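
The same idea can be sketched with fgetc(), which returns an int so that EOF can be told apart from every valid character. This is only an illustration: copy_stream is a hypothetical helper, not part of the question's code, and it copies bytes unchanged rather than doing the substitution.

```c
#include <stdio.h>

/* Hypothetical helper: copies every byte from in to out until end of file
   (or a read/write error). Returns the number of bytes copied. */
long copy_stream(FILE *in, FILE *out)
{
    int ch;          /* int, not char: it must be able to hold EOF */
    long count = 0;

    /* The loop condition tests the value returned by the read itself. */
    while ((ch = fgetc(in)) != EOF) {
        if (fputc(ch, out) == EOF)
            break;   /* write error: stop copying */
        count++;
    }
    return count;
}
```

Calling copy_stream(arq1, arq2) after both fopen() calls have succeeded would copy the whole file and return as soon as the read reports EOF.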

Also, for that very long if condition, I would recommend a switch like this:

switch (aux)
{
case 'C':
case 'G':
case 'A':
case 'T':
case 'c':
case 'g':
case 'a':
case 't':
    /* code here */
    break;
}
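
As a rough sketch of how that switch could carry the per-line substitution from the question, assuming I have read its logic correctly: the struct line_state and the function encode below are hypothetical names, and a return value of 0 means "write nothing", which is what the question's code does for a third distinct base on the same line.

```c
#include <stdio.h>

/* Hypothetical per-line state: 0 means "not seen yet on this line". */
struct line_state {
    int first;
    int second;
};

/* Returns the character to write for ch, or 0 if nothing should be written. */
int encode(struct line_state *st, int ch)
{
    switch (ch) {
    case 'C': case 'G': case 'A': case 'T':
    case 'c': case 'g': case 'a': case 't':
        if (st->first == 0)
            st->first = ch;                 /* first base on this line */
        else if (st->second == 0 && ch != st->first)
            st->second = ch;                /* second distinct base */

        if (ch == st->first)
            return '1';
        if (ch == st->second)
            return '2';
        return 0;                           /* third distinct base: no output */

    case ' ':
    case 'N':
    case 'n':
        return ch;                          /* copied unchanged */

    default:
        return '3';                         /* anything else */
    }
}
```

The state would be reset to {0, 0} at the start of each line, in the same place where the question resets f_use and s_use.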
Iharob Al Asimi
-1

Use the feof(arqN) function to check for end of file. The EOF code (0x1A) does not exist in all text files.
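
A minimal sketch of where such a feof() check can go, assuming a hypothetical helper count_chars: feof() only reports true after a read has already run into the end of the file, so the sketch reads first and queries feof() afterwards to tell end of file apart from a read error.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters left in fp.
   *hit_eof is set to 1 if reading stopped at end of file,
   or to 0 if it stopped for another reason (for example a read error). */
long count_chars(FILE *fp, int *hit_eof)
{
    char c;
    long n = 0;

    /* Loop on the return value of the read, not on feof(). */
    while (fscanf(fp, "%c", &c) == 1)
        n++;

    /* Only now, after a failed read, is feof() meaningful. */
    *hit_eof = (feof(fp) != 0);
    return n;
}
```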

i486
  • And `feof()` is almost never useful. – Iharob Al Asimi Jan 27 '15 at 14:21
  • There is such a code, but you don't know it: the 0x1A code (Ctrl-Z), used in the past in DOS. – i486 Jan 27 '15 at 14:22
  • Yes, but you are talking about DOS. `EOF` is a macro, I suppose because it could have different values; on Linux, for example, it's `-1`. – Iharob Al Asimi Jan 27 '15 at 14:23
  • It is a real 0x1A byte in the file. If you enter `type myfile.txt` at the prompt and the Ctrl-Z code is reached, the file is closed. – i486 Jan 27 '15 at 14:25
  • ...when you open the file with `fopen`, you have to use "rb" mode for binary. If you open it with "rt" or "r", the 0x1A code will be interpreted as EOF and you cannot read any further. (The other difference is that 0x0A codes are replaced with 0x0D 0x0A.) – i486 Jan 27 '15 at 14:31
  • Apparently that is a DOS-specific thing; it's really weird. And you give me one more reason to hate the MS OS. – Iharob Al Asimi Jan 27 '15 at 14:35
  • Many things are not excellent in DOS but that is the reality. And many of them are inherited in Windows from DOS. – i486 Jan 27 '15 at 15:09
  • @iharob `0x1A` as an EOF for text files preceded DOS. Many successful OSes of the day used that approach as a space-efficient file system EOF marker; I believe it was used in earlier tape systems too. It worked well in the paradigm of the day, as many text-based files were sequential and not randomly accessible. Save hate for evil things, like "user input". – chux - Reinstate Monica Jan 27 '15 at 15:57
  • @chux I am sorry I just wanted to express that I hate it, and I didn't know that. – Iharob Al Asimi Jan 27 '15 at 16:00