19

I'm adding some functionality to an existing code base that uses pure C functions (fopen, fwrite, fclose) to write data out to a file. Unfortunately I can't change the actual mechanism of file i/o, but I have to pre-allocate space for the file to avoid fragmentation (which is killing our performance during reads). Is there a better way to do this than to actually write zeros or random data to the file? I know the ultimate size of the file when I'm opening it.

I know I can use fallocate on linux, but I don't know what the windows equivalent is.

Thanks!

George Kagan
user1024191
  • 1
    Since C doesn't have a notion of disks, file systems or fragmentation, you cannot hope for a "pure C" answer. Ask your OS. – Kerrek SB Nov 01 '11 at 17:28
  • 4
    This answer (http://stackoverflow.com/questions/455297/creating-big-file-on-windows/455302#455302) might help. – Robᵩ Nov 01 '11 at 17:31
  • 1
    Unless you have a very large file, memory mapped files will probably provide the best performance. Not only is file creation fast, but the read/write operations are also generally faster than C/C++ file writing/streaming functions. – Gene Bushuyev Nov 01 '11 at 17:39
  • Agreed. Unfortunately, in this case, I don't have the ability to change the actual mechanism since I'm modifying an existing code base that isn't structured to allow me to change it in the time allotted. I have used memory-mapped files extensively elsewhere, and you're absolutely right. – user1024191 Nov 02 '11 at 15:40

6 Answers

19

Programmatically, on Windows you have to use Win32 API functions to do this:

SetFilePointerEx() followed by SetEndOfFile()

You can use these functions to pre-allocate the clusters for the file and avoid fragmentation. This works much more efficiently than pre-writing data to the file. Do this prior to doing your fopen().

If you want to avoid calling the Win32 API directly, you can also use the system() function to issue the following command:

fsutil file createnew filename filesize
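
A minimal sketch of how that command could be issued from C via system(); the path and size below are placeholders, and (as noted in the comments below) fsutil needs administrative privileges to succeed:

#include <stdio.h>
#include <stdlib.h>

char cmd[512];
snprintf(cmd, sizeof(cmd), "fsutil file createnew \"%s\" %lld",
         "C:\\data\\prealloc.bin", 2147483648LL);   /* placeholder path and size */
if (system(cmd) != 0) {
    /* command failed: fsutil not found, no admin rights, or file already exists */
}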
Michael Goldshteyn
  • 71,784
  • 24
  • 131
  • 181
  • 3
    `SetFilePointerEx` followed by `SetEndOfFile` still requires writing out zeros to disk, unless the file is sparse -- see [Why does my single-byte write take forever?](http://blogs.msdn.com/b/oldnewthing/archive/2011/09/22/10215053.aspx) – Adam Rosenfield Nov 01 '11 at 17:43
  • 1
    If you want to set the valid data length, you can use SetFileValidData(), but that requires the SE_MANAGE_VOLUME_NAME privilege, which a general application is not likely to have. I thought the author only wanted to avoid fragmentation, as mentioned explicitly in the question. This answer does solve that problem. – Michael Goldshteyn Nov 01 '11 at 17:46
  • @Adam That's not an issue if you're reserving the space but do your actual writing at the beginning of the file. – Aaron Klotz Nov 01 '11 at 23:04
  • In my case, I'm reserving space for subsequent writes that will happen over the next 24 hours. I'll look into using the SetFilePointerEx/SetEndOfFile method - I'm not terribly opposed to the idea of writing zeros other than that it seems like a hack rather than the "right way" to do it. Thanks! – user1024191 Nov 02 '11 at 15:44
  • `fsutil` requires administrative privileges. This little detail is probably worth mentioning... Another rumor, that it creates sparse files, proved to be false though. – ivan_pozdeev Oct 20 '17 at 06:27
7

You can use the SetFileValidData function to extend the logical length of a file without having to write out all that data to disk. However, because it can allow you to read disk data to which you would not otherwise have been privileged, it requires the SE_MANAGE_VOLUME_NAME privilege. Carefully read the Remarks section of the documentation.

I'd recommend instead just writing out the zeros. You can also use SetFilePointerEx and SetEndOfFile to extend the file, but doing so still requires writing zeros to disk (unless the file is sparse, but that defeats the point of reserving disk space). See Why does my single-byte write take forever? for more info on that.
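
A rough sketch of how these calls fit together (this assumes an already-open HANDLE named handle with GENERIC_WRITE access, and that the SE_MANAGE_VOLUME_NAME privilege has already been enabled on the process token, e.g. via AdjustTokenPrivileges; the 1 GiB size is illustrative):

#include <windows.h>

LARGE_INTEGER size;
size.QuadPart = 1024LL * 1024 * 1024;              /* 1 GiB, illustrative */

SetFilePointerEx(handle, size, NULL, FILE_BEGIN);
SetEndOfFile(handle);                              /* extend the logical file size */
if (!SetFileValidData(handle, size.QuadPart)) {
    /* typically fails with ERROR_PRIVILEGE_NOT_HELD if the
       SE_MANAGE_VOLUME_NAME privilege is not held/enabled */
}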

Elethom
Adam Rosenfield
  • So I guess just writing out zeros (however that is accomplished) is probably the best way to do this. Thanks for the response. – user1024191 Nov 02 '11 at 15:50
  • 2
    *You can also use SetFilePointerEx and SetEndOfFile to extend the file, but doing so still requires writing out zeros* - note that this is only true if you are writing the contents of the file non-sequentially. If you are writing the file from beginning to end, explicitly writing out zeros first is unnecessary. – Harry Johnston Jun 08 '16 at 21:45
5

Sample code; note that it isn't necessarily faster, especially with smart filesystems like NTFS.

HANDLE handle = ::CreateFile(fileName, GENERIC_WRITE, 0, NULL,
                             CREATE_ALWAYS, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
if (handle != INVALID_HANDLE_VALUE) {
        // preallocate a 2 GiB disk file
        LARGE_INTEGER size;
        size.QuadPart = 2048LL * 1024 * 1024;    // 2 GiB; 64-bit constant to avoid overflow
        ::SetFilePointerEx(handle, size, NULL, FILE_BEGIN);
        ::SetEndOfFile(handle);
        ::SetFilePointer(handle, 0, NULL, FILE_BEGIN);   // rewind before writing real data
}
Martin Beckett
1

You could use the _chsize() function.
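
A minimal sketch of how this could be wired into fopen-based code (the file name and target size are placeholders; _chsize_s is the 64-bit variant and takes the descriptor returned by _fileno):

#include <stdio.h>
#include <io.h>                         /* _chsize_s, _fileno */

FILE *fp = fopen("data.bin", "wb");     /* placeholder file name */
if (fp != NULL) {
    /* extend (zero-fill) the file to 500 MB before handing fp to the
       existing fwrite-based code */
    if (_chsize_s(_fileno(fp), 500LL * 1024 * 1024) != 0) {
        /* allocation failed */
    }
    fclose(fp);
}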

Ferruccio
  • Thanks for the answer! That's interesting. What is the relationship between _sopen and fopen? Is there a downside to using one over the other? – user1024191 Nov 02 '11 at 15:47
  • You could use _open or _sopen to get a file descriptor. You could also use fopen and then use _fileno to convert the FILE* returned by fopen to a file descriptor. I don't think it makes any difference which one you use to open the file. – Ferruccio Nov 02 '11 at 16:16
  • It uses `_lseeki64` + repeated `_write` internally (as can be seen in `chsize.c` in VC source code bundled with VS). I.e. it's inferior to `SetFilePointerEx`+`SetEndOfFile`. – ivan_pozdeev Oct 20 '17 at 06:13
  • The link is dead – 김선달 Jun 16 '20 at 01:39
0

The following article from Raymond may help.

How can I preallocate disk space for a file without it being reported as readable?

Use the SetFileInformationByHandle function, passing function code FileAllocationInfo and a FILE_ALLOCATION_INFO structure. “Note that this will decrease fragmentation, but because each write is still updating the file size there will still be synchronization and metadata overhead caused at each append.”

The effect of setting the file allocation info lasts only as long as you keep the file handle open. When you close the file handle, all the preallocated space that you didn’t use will be freed.
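
A rough sketch of that call (handle here stands for an already-open HANDLE with write access; the 1 GiB allocation size is illustrative):

#include <windows.h>

FILE_ALLOCATION_INFO alloc;
alloc.AllocationSize.QuadPart = 1024LL * 1024 * 1024;    /* reserve 1 GiB */

if (!SetFileInformationByHandle(handle, FileAllocationInfo,
                                &alloc, sizeof(alloc))) {
    /* reservation failed; GetLastError() has the details */
}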

jgx
0

Check out this example on Code Project. It looks pretty straightforward to set the file size when the file is initially created.

http://www.codeproject.com/Questions/172979/How-to-create-a-fixed-size-file.aspx

FILE *fp = fopen("C:\\myimage.jpg", "ab");
if (fp != NULL) {
    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);                      /* current file size */

    long pad = 500L * 1024 - size;              /* padding needed to reach 500 KB */
    if (pad > 0) {
        char *buffer = (char*)calloc(pad, 1);   /* zero-filled padding */
        if (buffer != NULL) {
            fwrite(buffer, pad, 1, fp);
            free(buffer);
        }
    }
    fclose(fp);
}
Jim Fell
  • This is the method I looked at initially, and realized that I'm just writing out zeros to the file to pre-allocate space. Hence the question posted here... – user1024191 Nov 02 '11 at 15:45