2

I am learning C language. Here is a simple program I did to create 1000 text files.

#include <stdio.h>
#include <string.h> 

char * create_filename(char *, char *, int);

int main(void)
{ 
    char prefix_name[50] = "file_no_";
    char snum[5];
    int no_of_files = 1000;

    FILE * fp = NULL;


    for (int i = 0; i < no_of_files; i++)
    {
        fp = fopen( create_filename(prefix_name, snum, i + 1), "w");
        fprintf(fp, "This is file no %d", i+1);
        fclose(fp);
        fp = NULL;
        strcpy(prefix_name, "file_no_");
    }
    
    return 0;
   
}  

char * create_filename(char * prefix_name, char * snum, int i)

{

    sprintf(snum, "%d", i);
    strcat(prefix_name, snum);
    strcat(prefix_name, ".txt");

    return prefix_name;
}

This runs as expected. But I want to know, how can I make this more efficient and as portable as possible. If I want to scale this up to, say 10000 text files, are there other approaches which will be better ?

Thanks

user9026
  • 852
  • 2
  • 9
  • 20
  • 1
    First of all, one quick and easy "optimization" would be to use another buffer for the constructed file-name, instead of using `strcpy(prefix_name, "file_no_");` in your loop. I also suggest you try to minimize the number of string-function calls, and construct your whole file-name using a single `snprintf` call (using the new buffer as destination). Oh, and you don't need the `fp = NULL` assignment. – Some programmer dude Jul 06 '22 at 13:04
  • 1
    It's pure standard C so it is about as portable as it gets. As for efficiency, the bottleneck is the fopen calls. I'm not sure if parallel processing would improve things or not, it mainly depends on the hard drive and OS. – Lundin Jul 06 '22 at 13:06
  • 2
    It's not a good idea to code up the parts as string literals in different functions when they all belong to the same file name. I would use the easy-to-follow and low-maintenance `sprintf(filename, "file_no_%d.txt", i + 1);`. The whole filename spec is all in the same place. – Weather Vane Jul 06 '22 at 13:12
  • @Lundin I deleted when I noticed that. It's high maintenance, and hard to follow. – Weather Vane Jul 06 '22 at 13:13
  • Wanting to optimize exagerately, you should remove all `printf`s: `puts` is way more efficient – Marco Balo Jul 06 '22 at 13:13
  • 1
    @MarcoBalo What printf? Also compilers tend to replace printf calls without the format string with a puts call. – Lundin Jul 06 '22 at 13:14
  • Before making it efficient, make it readable. What is `snum`? Why do you need it in `main`? – n. m. could be an AI Jul 06 '22 at 13:15
  • Thanks for all the comments. I took Udemy's "C Programming For Beginners - Master the C Language" course by Jason Fedin and now practicing my skills. – user9026 Jul 06 '22 at 13:15
  • Passing `snum` to the function rather than using a variable local to the function is weird — not actually wrong, but not necessary and probably marginally (immeasurably) less efficient than using a local variable. If it was my code, I'd probably use an interface more like: `void create_filename(size_t buflen, char buffer[buflen], const char *prefix, int serial)`, and then use `snprintf(buffer, buflen, "%s%d", prefix, serial)` to format the file name. The `strcpy()` would not be needed then. – Jonathan Leffler Jul 06 '22 at 13:15
  • @Lundin , i intended `fprintf` and `sprintf`. I don't know exactly the optimizations made by the compilers, but some year ago I did a school project which purpose was to optimize as much as we could and substituting all `printf`/`scanf` with `gets`/`puts` and similars gave my program a decent boost. – Marco Balo Jul 06 '22 at 13:19
  • 1
    @Roi only a `static` variable can be returned. – Weather Vane Jul 06 '22 at 13:20
  • 1
    @Roi — you can't reliably return a local array. You can return values (like `int`), but you can't return pointers to local variables such as character arrays (though you can return pointers to dynamically allocated variables, of course). – Jonathan Leffler Jul 06 '22 at 13:21
  • 1
    @MarcoBalo In this case the integer to string conversion needs to be done somewhere. Sure, it is more efficient to hack it out manually than to have printf do it. But the performance bottleneck is the file I/O so that's where any manual optimization attempts should look at. – Lundin Jul 06 '22 at 13:32
  • 1
    @MarcoBalo — I trust your mention of `gets()` is a typo for `fgets()`. [You cannot use `gets()` safely — ever!](https://stackoverflow.com/q/1694036/15168) And `gets()` is no longer a part of standard C — it has the unique privilege of being the only function removed from the C standard (so far). – Jonathan Leffler Jul 06 '22 at 13:46
  • 1
    user9026, `how can I make this .... as portable as possible` --> check return value of `fopen()` before writing. – chux - Reinstate Monica Jul 06 '22 at 14:47

1 Answers1

1

how can I make this more efficient and as portable as possible.

  1. More error checking. Example: a failed fopen() can readily occur.

  2. Realize that a huge amount of time will occur in fopen() and local code likely will have scant time improvements.

  3. Avoid re-writing the prefix.

  4. Use a helper function.

Example:

// Return count of successfully written files.
int create_many_files(const char *prefix, int count, const char *suffix) {
  int n;
  int len = snprintf(NULL, 0, "%s%n%d%s", prefix, &n, count, suffix);
  if (len < 0) {
    return 0;
  }

  // Use len to determine longest name and use a VLA or allocation.
  // Consider using a fixed array when len is not too big.
  char *filename = malloc((size_t)len + 1u);
  if (filename == NULL) {
    return 0;
  }
  strcpy(filename, prefix);
  char *offset = filename + n;

  for (int i = 0; i < count; i++) {
    sprintf(offset, "%d%s", i + 1, suffix);
    FILE *fp = fopen(filename, "w");
    if (fp == NULL) {
      free(filename);
      return i;
    }
    fprintf(fp, "This is file no %d", i + 1);
    fclose(fp);
  }

  free(filename);
  return count;
}

Other potential error checks:

  • prefix == NULL
  • cout < 0
  • suffix == NULL
  • fprintf() < 0
  • fclose() != 0
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • _"Realize that a huge amount of time will occur in `fopen()`"_ Can you explain why? – Zakk Jul 06 '22 at 15:16
  • 1
    @Zakk, The amount of checks and set-up for a new stream dwarfs the small amount of code here. Parsing the filename and searching file system namespace is extensive. – chux - Reinstate Monica Jul 06 '22 at 15:20