I'm new to the forum and to C in general, so please bear with me. I'm trying to write a C program that takes a text file and parses all words and characters, then saves them to an output text file. I'm using C99, Windows 7 64-bit, MinGW, Notepad, Notepad++, and ANSI format for .txt files. I've read that fgets() is better to use than fscanf() for reading input because it has buffer overflow protection, so I decided to try it, but it's having issues with some punctuation in the test file (I think it's the carriage return \r). I tried fscanf(), and aside from it skipping all of the whitespace (which I can add back in later, so I'm not concerned with that), it seems to take in all of the text just fine and print it to the output file.

Here's my test code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

void main(int argc, char* argv[])
{
    int limit=100, flimit=0, flimitmax=1900000000; //I stopped flimitmax short of the 2GB mark 
    char name[limit], copyname[limit];
    FILE *data, *output;

//Gets the value of a specified data file for reading   
    printf("\nPlease specify a file to format for data input.  NOTE: CHARACTERS LIMITIED TO %d\n", limit);
    fgets(name, limit, stdin); //takes input of size 'limit' and assigns it to 'name'
    size_t ln = strlen(name);   //gets the size of name[]
    if(name[ln-1]=='\n') name[ln-1]='\0'; //gets rid of newline character read in by fget
    strncpy(copyname, name, limit); //stores the value of the specified file name for use making the input and output files 
    strcat(name, ".txt"); //appends .txt file extension to the file name
    printf("\nYou chose file %s\n", name);

    data = fopen(name, "r"); //Checks to see if the specified data file exists and if it can be read
    if(data==NULL)
    {
        fprintf(stderr, "\nCan't open file %s!!!\n", name);
        exit(1);
    }

//Gets the size of the data file being worked.  Used later when the file is copied into the program using fgets.    
    fseek(data, 0, SEEK_END); // seek to end of file
    flimit = ftell(data)+1; // get current file pointer
    fseek(data, 0, SEEK_SET); // seek back to beginning of file
    if((flimit > flimitmax) || (flimit < 0))//Checks to see if flimit falls between 0 and 1.9GB.  If not, the file is larger than 1.9GB
    {
        printf("Error, max file size exceeded.  Program terminating\n");
        exit(1);
    }
    printf("File size is %d bytes\n", flimit);

//Creates a name for the output file
    strncpy(name, copyname, limit); //reassigns original value to name to make output file name
    strcat(name, "OUT.txt"); //appends OUT.txt file extension to the file name
    printf("\nOutput file is %s\n", name);

    output = fopen(name, "w"); //checks to see if the Input file exists and if it can be read
    if(output==NULL)
    {
        fprintf(stderr, "\nCan't open file %s!!!\n", name);
        exit(1);
    }

//Reads the data file and assigns values to the input and output files  

    char filein[flimit]; //I created this variable here to avoid issues of array resizing.
    //fgets attempt
    fgets(filein, flimit, data); //scans the whole datafile and stores it in the char array.

    printf("\n%s\n", filein);
    fprintf(output, filein);

    memset(&filein[0], 0, sizeof(filein)); //clears the filein array

    fseek(data, 0, SEEK_SET); // seek back to beginning of file 
    //fscanf attempt
    while(fscanf(data, "%s", &filein)!=EOF)
    {
        printf("\n%s\n", filein);
        fprintf(output, filein);
    }

//Closes the files and ends the program
    printf("\nDONE!!!\n");
    fclose(data);
    fclose(output);
}

Here's the text I use in the data file:

Things/Words and punctuation:  The Test

This is a test (mostly to see if this program is working).

Here's the output I get from the output file:

Things/Words and punctuation:  The Test
Things/Wordsandpunctuation:TheTestThisisatest(mostlytoseeifthisprogramisworking).

Why is fgets() getting hung up? It reads the first line just fine, then it gets stuck.

Thanks in advance for taking the time to look at this. If you have any other recommendations for my code, feel free to let me know.

zentath05
  • I always wonder what might be the reason for putting `!!!` somewhere... – glglgl Dec 13 '15 at 17:11
  • It isn't clear why you think it's "stuck". The program is too messy to reason about. Write a very short program that reads a file with `fgets` line by line and prints it back, and does nothing more. That's something that can be analyzed. – n. m. could be an AI Dec 13 '15 at 17:22
  • fscanf can be safe--you just need to specify a maximum width when parsing string (eg, the "%10s" format specifier). See http://stackoverflow.com/questions/1621394/how-to-prevent-scanf-causing-a-buffer-overflow-in-c – forkrul Dec 13 '15 at 17:25
  • Maybe you are thinking of `fread`, not `fgets`, since `fgets` stops at the first newline. With `fread`, on the other hand, you can read the whole file, provided you have allocated the space. Your comment after `fgets` suggests that you want to read the whole file, so `fread` would be better; just terminate the string with `\0`. – AndersK Dec 13 '15 at 17:50
  • I found it a bit difficult to understand your question, but now it seems to me that you thought that `fgets` would read the entire contents of the file in one big chunk? That is not correct, since it just reads a single line. – Thomas Padron-McCarthy Dec 14 '15 at 07:16
  • Hey guys, thanks for the input. Sorry about the wordiness of my program. Most of what's up top is used to get file information, while the question was about the section at the bottom. I'm also still figuring out how to write code efficiently. I'll make my question programs more concise in the future. Thanks again guys. – zentath05 Dec 24 '15 at 16:33
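
For anyone who does want the whole file in one buffer, as the `fread` comment above suggests, here is a minimal sketch. The helper name `read_whole_file` is made up for illustration and is not from the question. It assumes the file fits in memory and that `fseek`/`ftell` report the size correctly for a stream opened in binary mode, which holds on common platforms but is not strictly guaranteed by the C standard.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads an entire file into a heap buffer and
   NUL-terminates it. Returns NULL on failure; the caller frees the result.
   out_len may be NULL if the length is not needed. */
char *read_whole_file(const char *path, size_t *out_len)
{
    /* Binary mode: no \r\n translation, so the ftell size matches what fread delivers. */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    char *buf = NULL;
    if (fseek(fp, 0, SEEK_END) == 0) {
        long size = ftell(fp);
        if (size >= 0 && fseek(fp, 0, SEEK_SET) == 0) {
            buf = malloc((size_t)size + 1);      /* +1 for the terminator */
            if (buf != NULL) {
                size_t got = fread(buf, 1, (size_t)size, fp);
                buf[got] = '\0';                 /* fread does not terminate the string */
                if (out_len != NULL)
                    *out_len = got;
            }
        }
    }
    fclose(fp);
    return buf;
}
```

Using `malloc` instead of a variable-length array like `char filein[flimit]` keeps a large file off the stack, where a multi-megabyte array can easily overflow. Because the file is opened in binary mode, any `\r` characters from Windows line endings stay in the buffer.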

1 Answer

"The fgets function will stop reading when flimit characters are read, the first new-line character is encountered in filein, or at the end-of-file, whichever comes first."

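Your fgets() call encounters the new-line character at the end of the first line and stops there. That's why you need a while loop that keeps calling it until the end of the file is reached. Note that the limit is really flimit-1 characters, because fgets() reserves one byte for the terminating '\0'. On success, fgets() returns its first argument (here, filein). If end-of-file is reached before any characters are read, or if a read error occurs, it returns a null pointer.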

The following code block solves your problem:

...

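    while(fgets(filein, flimit, data) != NULL) // NULL means end-of-file or a read error
    {
        printf("%s", filein);   // fgets keeps the '\n' that ends each line
        fputs(filein, output);  // don't pass file contents as a printf format string
    }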

...
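
As a self-contained variant of the loop above, here is a rough sketch of a line-by-line copy. The function name `copy_lines` and the 256-byte buffer size are assumptions made for illustration, not part of the original program. It shows that the buffer does not need to be as large as the file: a line longer than the buffer just arrives over several `fgets` calls.

```c
#include <stdio.h>

/* Hypothetical helper: copies the text file src to dst one chunk at a time.
   Returns 0 on success and -1 if a file cannot be opened or an I/O error occurs. */
int copy_lines(const char *src, const char *dst)
{
    FILE *in = fopen(src, "r");
    if (in == NULL)
        return -1;

    FILE *out = fopen(dst, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char line[256];  /* small fixed buffer; it does not have to hold the whole file */

    /* fgets stores at most sizeof line - 1 characters and stops after a '\n'.
       A longer line is delivered in several pieces, so nothing is lost. */
    while (fgets(line, sizeof line, in) != NULL)
        fputs(line, out);  /* writes the text as-is, including any '\n' */

    int failed = ferror(in) || ferror(out);
    fclose(in);
    if (fclose(out) != 0)
        failed = 1;
    return failed ? -1 : 0;
}
```

Calling it as `copy_lines("test.txt", "testOUT.txt")` (file names chosen only as an example) should leave the output with the same lines and spacing as the input, since `fgets` preserves whitespace.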

The following link explains the correct usage of the fgets function with examples.

Reference: http://pubs.opengroup.org/onlinepubs/009695399/functions/fgets.html

Hope it helps,

Berk Soysal
  • Using `feof` to detect end of file is almost always the wrong way to read a file in C. In this case, you can simply get rid of your `if` statement, and just use the `while` loop. – Thomas Padron-McCarthy Dec 14 '15 at 07:04
  • What is the source of your citation? You should also explain why and how this code block solves the problem. – Chnossos Dec 14 '15 at 09:03
  • Thanks Berk Soysal. I tried the while loop and it worked. Thinking about it, it makes a certain amount of sense since the fscanf also had a while loop. I didn't pick up on that. Thanks again! – zentath05 Dec 24 '15 at 16:30
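
On the `fscanf` side, the width limit mentioned in the comments under the question can be sketched like this. The helper `count_words` and the 64-byte buffer are hypothetical choices for illustration. Note that `%s` still skips all whitespace, and a word longer than 63 characters would be split and counted more than once.

```c
#include <stdio.h>

/* Hypothetical helper: counts whitespace-separated words in an open stream. */
long count_words(FILE *in)
{
    char word[64];
    long count = 0;

    /* %63s stores at most 63 characters plus the '\0' in the 64-byte buffer.
       Pass word, not &word, and compare against 1 (one item converted)
       rather than only checking for EOF. */
    while (fscanf(in, "%63s", word) == 1)
        count++;

    return count;
}
```

The same loop shape works for writing each word out; the point is only that the width in the format string must be one less than the buffer size.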