2

I'm trying to build a C portable code (for Windows, MacOS and Linux) that creates an output .txt file to receive the results of a numerical simulation.

Summarizing, the code takes the name of the file and the extension and checks if the file already exists in the directory. If so, it creates another file with the same name, but with a number between parenthesis (#) in the end to distinguish the old from the new one.

The problem is: it is working properly on the mac environment, however when I compile and run it on windows, the file is not created in the end of execution. I could not find what I'm doing wrong.

Also, I'm using Intel C/C++ Classic compiler. If I use another compiler, for example, the Intel® oneAPI DPC++/C++ Compiler for windows, it complains about the usage of sizeof(src) when I call the strncat(...) function.

So far, this is the version of the code:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>

bool file_exists(char *filename);
FILE *create_output_file(char *name, char *ext, bool (*file_exists)(char *));

#define funcname "test"

int main(void) {
    
    // Create output files to store results
    char *output_filename = malloc(200 * sizeof(*output_filename));
    sprintf(output_filename, "%s_out", funcname);
    printf("output_filename = %s\n", output_filename);
    char *ext = ".txt";
    FILE *output_file = create_output_file(output_filename, ext, file_exists);
}

bool file_exists(char *filename) {
    // Try to open file with same name as filename
    FILE *testfile = fopen(filename, "r");
    // Bool variable to check the existence of file
    bool exists = false; 
    // Check the existence of a file called filename
    if (testfile != NULL) {
        // Returns true if the file exists
        exists = true; 
    }
    // Close the file
    fclose(testfile); 
    // Returns the existence of a file (1) = does exist, (0) = does not exist
    return exists;
}

FILE *create_output_file(char *name, char *ext, bool (*file_exists)(char *)) {
    // Form the full filename
    name = strncat(name, ext, sizeof(ext));
    printf("fullfilename = %s\n", name);
    // Check if a file with the filename exists in the directory
    if (file_exists(name)) {
        // If exists, assign the same name with "(number)" to differentiate the new version
        int j = 1;
        char *numb = malloc(10 * sizeof(*numb));
        sprintf(numb, "(%i)", j);
        // Remove the extension from the name string
        name[strlen(name) - strlen(ext)] = '\0';
        // Add (number) to the name and then add the file extension again
        name = strncat(name, numb, sizeof(numb));
        name = strncat(name, ext, sizeof(ext));
        // Check if the name with numbers exists until it doesn't
        int limit = 1e1;
        while (file_exists(name)) {
            j++;
            sprintf(numb, "(%i)", j);            
            if (j == limit) {
                name[strlen(name) - strlen(numb) + 1 - strlen(ext)] = '\0';
                limit = limit*10;    
            } else {
                name[strlen(name) - strlen(numb) - strlen(ext)] = '\0';
            }
            name = strncat(name, numb, sizeof(numb));
            name = strncat(name, ext, sizeof(ext));
        }
        // Free allocated memory
        free(numb); 
    }
    // After assign the proper name, create the output file
    FILE *output_file = fopen(name, "w");
    // Returns the file
    return output_file;
}

What am I missing here?

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
Lgcos
  • 91
  • 7
  • 3
    in `create_output_file` `sizeof(ext)` is the size of a pointer, same with `sizeof(numb)`. Use `strlen` to get the length of a 0 terminated string. The compiler was correct to warn you, and you should always pay attention to and correct warnings. You might also consider not using `malloc` at all in this code. It is completely unnecessary. – Retired Ninja May 29 '22 at 22:41
  • 1
    Complains how? What warning do you get? Something about `sizeof(numb)` I assume? That's the size of a pointer (8 bytes on x86-64), unrelated to string lengths. – Peter Cordes May 29 '22 at 22:41
  • BTW, your function looks vastly over-complicated. You don't need to redo all this concatenation or keep increasing `limit`, you can just find the end of the base part and tell sprintf to write its output there. Like `sprintf(name+some_length, "(%i)%s", j, ext);`. When you initially allocate space for `name`, make space for `strlen(name)+strlen(ext)+7` or something, and then just set `limit=7-1-2` (-1 for terminating `\0`, -2 for the `()`, leaving space for up to 4 digits. If you want, use `snprintf` with a limit = 7+length of ext. – Peter Cordes May 29 '22 at 22:46
  • 1
    But keep track of your lengths instead of redoing string concat and strlen all the time. You're doing so much calculation on different lengths (or trying to derive them from the wrong thing) that it's easy to get wrong. See also [strcpy() return value](https://stackoverflow.com/a/51547604) re: why it's not a very useful function anyway, and if you know the lengths you should just memcpy instead of redoing string-searching work. – Peter Cordes May 29 '22 at 22:46
  • 1
    Thanks for the advice @Petercordes. Indeed I'm confusing `sizeof()` with `strlen()`. I'll study all the suggestions and fix the code. – Lgcos May 29 '22 at 22:55

1 Answers1

3

There are multiple problems:

  • In file_exists you call fclose(testfile) even if fopen failed This has undefined behavior.

  • in create_output_file , your usage of sizeof is incorrect: in strncat(name, ext, sizeof(ext)); sizeof is applied to a pointer, hence it evaluates to the size of a pointer, not the length of the string it points to. You could write

    strncat(name, ext, strlen(ext));
    

    but it would be exactly equivalent to strcat(name, ext);

    The function strncat is defined as

      char *strncat(char *dest, const char *src, size_t n);
    

    it copies at most n characters plus a null terminator from the string pointed to by src at the end of the string pointed to by dest.

  • The code is too complicated, you have multiple memory leaks and you do not check for buffer overflow when composing the filename.

Here is a simplified version:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define funcname "test"

bool file_exists(const char *filename) {
    // Try to open file with same name as filename
    FILE *testfile = fopen(filename, "r");
    if (testfile != NULL) {
        fclose(testfile);
        return true;
    } else {
        return false;
    }
}

#define MAX_FILE_NUM  10000

FILE *create_output_file(char *name, size_t size,
                         const char *ext,
                         bool (*file_exists)(const char *))
{
    size_t len = strlen(name);
    int i = 0;

    snprintf(name + len, size - len, "%s", ext);
    while (file_exists(name)) {
        if (++i > MAX_FILE_NUM) {
            // all file versions tried.
            return NULL;
        }
        snprintf(name + len, size - len, "(%d)%s", i, ext);
    }
    return fopen(name, "w");
}

int main() {
    // Create output files to store results
    char output_filename[200];
    snprintf(output_filename, sizeof output_filename, "%s_out", funcname);
    printf("output_filename = %s\n", output_filename);
    const char *ext = ".txt";
    FILE *output_file = create_output_file(output_filename,
                            sizeof output_filename, ext, file_exists);
    if (output_file == NULL) {
        printf("could not open output file, last try: %s\n", output_filename);
    } else {
        printf("actual output_filename = %s\n", output_filename);
        fclose(output_file);
    }
    return 0;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • I always hate to see a loop with `if (i==0)` to special-case the first iteration. I think we can peel that special iteration by doing that one `snprintf` ahead of the loop, and doing `for(int i=1 ; file_exists(name) ; i++) {` `if (i >= limit) return NULL;` `snprintf(..., i, ext); }`. A `for` loop that declares and increments `i` but doesn't use it in the condition is not as easy to read as one would like either, with the other option being a `while(exists(name)){}` declaring `int i` ahead of the loop. (@Lgcos). – Peter Cordes May 30 '22 at 00:11
  • It's unfortunate that ISO C doesn't have a more efficient way to test for file existence. Like in normal usage without filename conflicts, this code will try to `fopen` read only and fail, then open with creation. In POSIX you'd simply `open(name, O_EXCL|O_CREAT)`. It fails with EEXIST if the file already exists, otherwise creates+opens it like `fopen`. It avoid TOCTOU races where something else could create a file between the `fopen(name,"r")` check and the `fopen(name, "w")` use, as well as being more efficient. But it's well known that ISO C file stuff is primitive. – Peter Cordes May 30 '22 at 00:17
  • 1
    @PeterCordes: I agree with your analyses, I changed the answer to use the `while (file_exists(name))...` alternative. It is indeed unfortunate that neither the POSIX nor the C committees extended the mode string with options to implement the various `open` flags. – chqrlie May 30 '22 at 00:27
  • Indeed this revised version with `while loop` is more readable. @chqrlie – Lgcos May 30 '22 at 01:15
  • I understand why you add `size - len` in the second argument of `snprintf`: to ensure that the number of `chars` added in the string are limited to the `sizeof (name)`. But why did you add the `name + len` in the buffer (first argument)? (@chqrlie) – Lgcos May 30 '22 at 01:20
  • 1
    @Lgcos: `name + len` is the position of the end of the filename. `snprintf` formats the file number in brackets followed by the extension at this position for all versions tested until one is found to not exist. – chqrlie May 30 '22 at 01:36