-2

I am doing a C programming homework assignment. Bonus points are awarded for writing the file quickly in their upload test system.

I am trying to write a number of lines, each comprising three space-delimited decimal integer strings followed by '\n', to a file. The problem is that fprintf is too slow (their reference time is roughly a third faster).

I have tried a lot of possibilities (everything is in one for loop). fprintf (too slow):

fprintf(f, "%d %d %d\n", a[i], b[i], c[i]);

converting to a string and then putting the string into the file - even worse:

sprintf(buffer, "%d", a[i]); /* or: _itoa(a[i], buffer, 10); */
fputs(buffer, f);
fputc(' ', f);

Is there any quick way to write integer numbers to a simple text file (.txt)? (The last solution takes 220 ms; the reference is 140 ms, to give you a picture of the times.) I have been trying and googling like hell, but nothing works. But if the reference time is this short, there has to be some way!

PS: The numbers are always integers, 4 bytes in size, always in the format:

a0 b0 c0
a1 b1 c1
a2 b2 c2
a3 b3 c3
etc...

More info: when I submit the solution, I send only two files: file.h and file.c. No main, etc., so everything else is under their control. The solution should be in the commands/algorithm (even the description of the problem states that fprintf is too slow and that we should try something else to speed things up).

Thank you for everything!

Edit: since you want the whole code, here it is:

void save(const str_t * const str, const char *name)
{
  FILE* f;
  int i;

  if(str->cnt == 0)
      return;

  f = fopen(name, "w");
  if(f == NULL)
      return;

  for(i = 0; i < str->cnt; i++)
  {
      fprintf(f, "%d %d %d\n", str->a[i], str->b[i], str->c[i]);
  }
  fclose(f);
}
Clifford
Stepan
  • Show the entire source code and the compilation command you are using (and also give more details about your operating system, computer & compiler). – Basile Starynkevitch Dec 18 '16 at 16:11
  • In your example, you are not writing an "integer", you are writing a decimal string representation of an integer - the time may depend significantly on the number of digits output. The question makes little sense; the actual write time will be almost entirely determined by the speed of the hard drive, but since in most cases (i.e. on a desktop OS) the write will be buffered and delayed by OS write caching, the `fprintf()` call itself should take only a matter of microseconds. I think in your question you need to include exactly what you are measuring and how you are measuring it. – Clifford Dec 18 '16 at 16:15
  • Read "more info". I am not. I just send two files; their system runs them and compares my writing time with their reference program. Sadly, I am not measuring it myself. – Stepan Dec 18 '16 at 16:17
  • Further, if the write truly takes 220ms, I am willing to bet that it will make little difference to the time if you wrote 30 bytes or 1Mb - what you are measuring is largely the OS/hard-drive performance, and that can be affected by many things including other processes running and/or writing to the drive. The benchmark is meaningless unless you are writing large blocks of data and test repeatedly to get an average/minimum/maximum. – Clifford Dec 18 '16 at 16:20
  • Unless your code is running on exactly the same machine the original benchmark was determined on, and built with the same compiler and compiler options, comparison is meaningless. – Clifford Dec 18 '16 at 16:20
  • Yes, it is running on the same machine. As I said: I upload only two files to their server: file.h and file.c, where my functions are. The machine will run and test them, then compare against their reference solution of this problem. If I am fast enough, I get the points. – Stepan Dec 18 '16 at 16:26
  • Ah - hang on; you said you were writing three integers and that it took 220ms - but reading the question more carefully it seems clear that you are writing possibly many more than three integers in a loop - how many *lines* are you writing? You've already been asked to post the complete solution and much confusion would have been avoided were you just to do that! – Clifford Dec 18 '16 at 16:26
  • Well, I don't know how many. I just wrote an algorithm that reads a text file and stores the data in arrays, then another function to save that data to a file. Their system loads some quantity of data, then saves it and measures the time. The whole function is simple: a check that you can open the file, and then a simple for loop (for(i = 0; i < cnt; i++)). – Stepan Dec 18 '16 at 16:29
  • So you should *still* post all that code - i.e. the entirety of your part of the solution. You are really not helping yourself with your caginess. At the very least the clarifications you have given in comments should be added to the question rather than left in comments. The whole code is necessary so that at least we can see whether you are responsible for opening and closing the file or whether you are just given a file pointer - that may affect the possible solution. – Clifford Dec 18 '16 at 16:32
  • "even in the description of the problem, is statement, that fprintf is too slow" (your comment is below an answer). If `fprintf` is too slow, then write your own integer to string function. – Weather Vane Dec 18 '16 at 16:37
  • The functions are there at the end. It is simple, which is why I didn't post it in the first place. Just simple: load (that is OK) and then write to the file (that is slow). – Stepan Dec 18 '16 at 16:42
  • If you are reallocating memory for every integer read, you will never get it to run very fast. If the problem statement did not state any limits, you can speed that part up by allocating space for, say, another 10000 integers each time. – Weather Vane Dec 18 '16 at 16:42
  • Well, that I want to change later. Right now I am talking about writing, not loading. – Stepan Dec 18 '16 at 16:46
  • What challenge problem starts timing halfway through the task? – Weather Vane Dec 18 '16 at 16:47
  • Slightly grudging provision of the requested information; it is not clear why you would think that this was not necessary! If you want others to read your code, however, you should at the very least indent it conventionally - fixed it for you. – Clifford Dec 18 '16 at 17:21
  • The presence of the `load()` function is a distraction since that is not involved in the timing (and is extraordinarily inefficient) - you should remove that, otherwise you are likely to get a lot of "noise" about that to the detriment of your actual question. If you have a question about the `load()` code, post it separately. – Clifford Dec 18 '16 at 17:35
  • Just to be sure, I posted it too. That part is deleted now. – Stepan Dec 18 '16 at 17:41
  • "...size is 4 bytes,..." - Size of *what* is 4 bytes? Do you mean that all the integers are 4 decimal digits, or that they are all 32 bit integers? If the first, then you can determine the size of the output file in advance and that may lead to an even more efficient implementation - for example you can allocate a single buffer of the necessary size and use a single write. – Clifford Dec 18 '16 at 17:48
  • Yes, the size of the integer. This was just for information, to be sure. – Stepan Dec 18 '16 at 17:50

3 Answers

4

Using any variation of printf, the function has to scan the format string to find %d, parse it for any extra options (such as %-03d), and act accordingly. That is a lot of processing time. printf is awesome because it is super-flexible, not because it is fast.

If you use an itoa type function to write each number, you're still going to turn your integer into a string, then copy that string to the file. You will spend all your processing time moving between string-buffers and file writes.

I think your fastest approach will be to make an in-memory, really big buffer, write everything to that, then do ONE AND ONLY ONE write to dump that entire buffer to the file.

Outline:

char buffer[10000];   /* must be large enough to hold the entire output */
int offset = 0;

for(i = 0; i < str->cnt; i++)
{
    /* append one line; sprintf returns the number of characters written */
    offset += sprintf(&buffer[offset], "%d %d %d\n", str->a[i], str->b[i], str->c[i]);
}

fwrite(buffer, offset, 1, my_file);  // One fast write.
abelenky
  • This occurred to me too, but I still need to convert all the numbers to strings; I cannot avoid that anyway. Thanks, I will try. – Stepan Dec 18 '16 at 16:53
  • You can still use `itoa` to convert to strings, just do not write each one to a file individually. Would you rather do 20 file-writes, or one big file-write? – abelenky Dec 18 '16 at 17:06
  • Well, it doesn't have to be the fastest on earth. Just a few writes would be better than writing each line. – Stepan Dec 18 '16 at 17:09
  • Since file I/O operations are certainly predominant in this timing, I doubt that the overhead of formatted I/O is at all significant in achieving the required performance. – Clifford Dec 18 '16 at 17:32
1

It might be operating system and implementation specific.

Perhaps you could explicitly set the buffering using setvbuf(3) (I would recommend using at least a 32-kilobyte buffer, and probably more).
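A minimal sketch of how that might look (the `save_fast` name, the 64 KB size, and the demo data are my own assumptions, not from the answer):

```c
#include <stdio.h>

#define IO_BUF_SIZE (64 * 1024)   /* my own choice; the suggestion is >= 32 KB */

/* Hypothetical sketch: write cnt lines through a large, fully buffered
 * stdio stream so data reaches the OS in big chunks rather than per line. */
int save_fast(const char *name, int cnt)
{
    static char io_buf[IO_BUF_SIZE];
    FILE *f = fopen(name, "w");
    int i;

    if (f == NULL)
        return -1;

    /* setvbuf() must be called before the first I/O operation on the stream */
    if (setvbuf(f, io_buf, _IOFBF, IO_BUF_SIZE) != 0) {
        fclose(f);
        return -1;
    }

    for (i = 0; i < cnt; i++)
        fprintf(f, "%d %d %d\n", i, i + 1, i + 2);

    return fclose(f);   /* 0 on success */
}
```

Whether this alone reaches the reference time depends on how small the platform's default stdio buffer is to begin with.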

Don't forget to explicitly ask your compiler to optimize, e.g. use gcc -Wall -O2

You could also explicitly code your own integer-to-decimal-representation routine (hint: writing a routine which, given some int x like 1234, fills a given small array of char with the digits in reverse order (e.g. "4321") is really simple, and it will run fast).
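A sketch of such a routine (the name `fmt_uint` and the final reversing copy are my own additions; it assumes a non-negative value):

```c
#include <stddef.h>

/* Hypothetical sketch: convert a non-negative int to decimal text.
 * Digits come out least-significant first (cheap to produce), then are
 * reversed into the caller's buffer.  Returns the number of characters
 * written; no terminating '\0' is added. */
size_t fmt_uint(int x, char *out)
{
    char tmp[12];      /* enough digits for any 32-bit int */
    size_t n = 0, i;

    if (x == 0)
        tmp[n++] = '0';
    while (x > 0) {
        tmp[n++] = (char)('0' + x % 10);   /* least significant digit first */
        x /= 10;
    }
    for (i = 0; i < n; i++)                /* reverse into output buffer */
        out[i] = tmp[n - 1 - i];
    return n;
}
```

Because the length is returned directly, lines can be appended into one big buffer without any strlen-style scanning, avoiding both the format-string parsing of fprintf and repeated small writes.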

Basile Starynkevitch
  • Well, when I send the solution, I send only two files: file.h and file.c. No main, etc., so everything else is under their control. The solution should be in the commands (even the description of the problem states that fprintf is too slow and that we should try something else to speed things up). – Stepan Dec 18 '16 at 16:14
1

You can reduce the overhead of file I/O by writing to the file in large blocks to reduce the number of individual write operations.

#define CHUNK_SIZE 4096
char file_buffer[CHUNK_SIZE + 64] ;    // 4Kb buffer, plus enough
                                       // for at least one line
int buffer_count = 0 ;
int i = 0 ;

while( i < cnt )
{
    buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", a[i], b[i], c[i] ) ;
    i++ ;

    // if the chunk is big enough, write it.
    if( buffer_count >= CHUNK_SIZE )
    {
        fwrite( file_buffer, buffer_count, 1, f ) ;
        buffer_count = 0 ;
    }
}

// Write remainder
if( buffer_count > 0 )
{
    fwrite( file_buffer, buffer_count, 1, f ) ;    
}

There may be some advantage in writing exactly 4096 bytes (or some other power of two) in a single write, but that is largely file-system dependent and the code to do that becomes a little more complicated:

#define CHUNK_SIZE 4096
char file_buffer[CHUNK_SIZE + 64] ;
int buffer_count = 0 ;
int i = 0 ;

while( i < cnt )
{
    buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", a[i], b[i], c[i] ) ;
    i++ ;

    // if the chunk is big enough, write it.
    if( buffer_count >= CHUNK_SIZE )
    {
        fwrite( file_buffer, CHUNK_SIZE, 1, f ) ;
        buffer_count -= CHUNK_SIZE ;
        memcpy( file_buffer, &file_buffer[CHUNK_SIZE], buffer_count ) ;
    }
}

// Write remainder
if( buffer_count > 0 )
{
    fwrite( file_buffer, 1, buffer_count, f ) ;    
}

You might experiment with different values for CHUNK_SIZE - larger may be optimal, or you may find that it makes little difference. I suggest at least 512 bytes.


Test results:

Using VC++ 2015, on the following platform:

(platform details screenshot omitted)

With a Seagate ST1000DM003 1TB 64MB Cache SATA 6.0Gb/s hard drive.

The results for a single test writing 100000 lines are very variable, as you might expect on a desktop system running multiple processes sharing the same hard drive, so I ran the tests 100 times each and selected the minimum time result (as can be seen in the code below the results):

Using default "Debug" build settings (4K blocks):

line_by_line: 0.195000 seconds
block_write1: 0.154000 seconds
block_write2: 0.143000 seconds

Using default "Release" build settings (4K blocks):

line_by_line: 0.067000 seconds
block_write1: 0.037000 seconds
block_write2: 0.036000 seconds

Optimisation had a proportionally similar effect on all three implementations; the fixed-size chunk write was marginally faster than the "ragged" chunk write.

When 32K blocks were used, the performance was only slightly higher, and the difference between the fixed and ragged versions was negligible:

Using default "Release" build settings (32K blocks):

block_write1: 0.036000 seconds
block_write2: 0.036000 seconds

Using 512-byte blocks was not measurably different from 4K blocks:

Using default "Release" build settings (512 byte blocks):

block_write1: 0.036000 seconds
block_write2: 0.037000 seconds

All the above were 32-bit (x86) builds. Building 64-bit code (x64) yielded interesting results:

Using default "Release" build settings (4K blocks)- 64-bit code:

line_by_line: 0.049000 seconds
block_write1: 0.038000 seconds
block_write2: 0.032000 seconds

The ragged block write was marginally slower (though perhaps not statistically significant); the fixed block write was significantly faster, as was the line-by-line write (but not by enough to make it faster than any block write).

The test code (4K block version):

#include <stdio.h>
#include <string.h>
#include <time.h>


void line_by_line_write( int count )
{
  FILE* f = fopen("line_by_line_write.txt", "w");
  for( int i = 0; i < count; i++)
  {
      fprintf(f, "%d %d %d\n", 1234, 5678, 9012 ) ;
  }
  fclose(f);       
}

#define CHUNK_SIZE (4096)

void block_write1( int count )
{
  FILE* f = fopen("block_write1.txt", "w");
  char file_buffer[CHUNK_SIZE + 64];
  int buffer_count = 0;
  int i = 0;

  while( i < count )
  {
      buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", 1234, 5678, 9012 );
      i++;

      // if the chunk is big enough, write it.
      if( buffer_count >= CHUNK_SIZE )
      {
          fwrite( file_buffer, buffer_count, 1, f );
          buffer_count = 0 ;
      }
  }

  // Write remainder
  if( buffer_count > 0 )
  {
      fwrite( file_buffer, 1, buffer_count, f );
  }
  fclose(f);       

}

void block_write2( int count )
{
  FILE* f = fopen("block_write2.txt", "w");
  char file_buffer[CHUNK_SIZE + 64];
  int buffer_count = 0;
  int i = 0;

  while( i < count )
  {
      buffer_count += sprintf( &file_buffer[buffer_count], "%d %d %d\n", 1234, 5678, 9012 );
      i++;

      // if the chunk is big enough, write it.
      if( buffer_count >= CHUNK_SIZE )
      {
          fwrite( file_buffer, CHUNK_SIZE, 1, f );
          buffer_count -= CHUNK_SIZE;
          memcpy( file_buffer, &file_buffer[CHUNK_SIZE], buffer_count );
      }
  }

  // Write remainder
  if( buffer_count > 0 )
  {
      fwrite( file_buffer, 1, buffer_count, f );
  }
  fclose(f);       

}

#define LINES 100000

int main( void )
{
    clock_t line_by_line_write_minimum = 9999 ;
    clock_t block_write1_minimum = 9999 ;
    clock_t block_write2_minimum = 9999 ;

    for( int i = 0; i < 100; i++ )
    {
        clock_t start = clock() ;
        line_by_line_write( LINES ) ;
        clock_t t = clock() - start ;
        if( t < line_by_line_write_minimum ) line_by_line_write_minimum = t ;

        start = clock() ;
        block_write1( LINES ) ;
        t = clock() - start ;
        if( t < block_write1_minimum ) block_write1_minimum = t ;

        start = clock() ;
        block_write2( LINES ) ;
        t = clock() - start ;
        if( t < block_write2_minimum ) block_write2_minimum = t ;
    }

    printf( "line_by_line: %f seconds\n", (float)(line_by_line_write_minimum) / CLOCKS_PER_SEC ) ;
    printf( "block_write1: %f seconds\n", (float)(block_write1_minimum) / CLOCKS_PER_SEC ) ;
    printf( "block_write2: %f seconds\n", (float)(block_write2_minimum) / CLOCKS_PER_SEC ) ;
}
Clifford
  • @Stepan : When you have tried this, I'd be very interested to know what the benchmark result was, what CHUNK_SIZE you used, and whether you compared the two versions. – Clifford Dec 18 '16 at 17:36
  • Well, there is a limited number of uploads (so I can't just upload whenever I want). Anyway, I don't see "line_buffer" declared anywhere, and then you use "file_buffer", which is not initialized - or am I missing something? These "buffers" etc. are kind of new to me. That is why I posted this question (after deep searching). – Stepan Dec 18 '16 at 17:52
  • @Stepan : `line_buffer` was a hangover from an initial idea I had, but changed. It should be `file_buffer` - fixed. The `sprintf()` call was not quite correct - fixed that too. `file_buffer` needs no initialisation - it is written to by `sprintf()`. You could use the possibly more efficient `itoa()`, but the subsequent string processing to determine the end of the string would probably be more expensive - `sprintf()` returns the number of characters written, so you do not have to scan for the end of the string. – Clifford Dec 18 '16 at 18:00
  • Thanks. I thought that line_buffer = file_buffer, but still - better to ask. – Stepan Dec 18 '16 at 18:04
  • @Stepan : I have made a number of other corrections - make sure you have my latest version. That is the problem with posting code fragments without the ability to test them. – Clifford Dec 18 '16 at 18:07
  • Great! Thank you for your help and patience. I will post the results (the timing) in the near future (probably in two days). – Stepan Dec 18 '16 at 18:10
  • This answer is completely wrong. `fwrite` already performs buffering internally; that is why we have `fflush`. This code will actually be worse for performance (because it will be buffering twice); it will be faster on exactly zero platforms. – Nemo Dec 18 '16 at 18:25
  • So, when I read this, I made dinner and continued. It was easier since you (@Clifford) updated your answer. Anyway, here are the results, and it IS better! First, the smallest file: Reading: me = 92 ms, reference = 120 ms; Converting: me = 184 ms, reference = 208 ms; Writing: me = 92 ms, reference = 88 ms. But in bigger files (there are multiple tests - larger and larger files), for the largest file: Reading: me = 288 ms, reference = 368 ms; Converting: me = 576 ms, reference = 664 ms; Writing: me = 288 ms, reference = 296 ms. Maybe change the chunk for smaller files. (+ need to change realloc in loading). – Stepan Dec 18 '16 at 19:30
  • @Nemo : I added test results. The block write is significantly faster (as I expected based on experience); nearly twice as fast in 32-bit code, and for 64-bit code the difference was not as great but still significant. I suggest that "completely wrong" may have been overstating it; I'd have accepted "in need of verification", but you were at least right to query it without test results. – Clifford Dec 18 '16 at 21:16
  • Good job on the testing. Anyway, an interesting bug: I generated a txt file, then loaded it into my arrays and saved that to a new file. When I use your _second_ solution (with memcpy), at some point there is this line: ** 5678 9012** (no first column) and then normal 1234 5678 9012. The _first_ solution is fine. – Stepan Dec 19 '16 at 17:32
  • @Stepan : I'll take a look at that, but not right now. – Clifford Dec 19 '16 at 21:47