My task is to reorganize a big (~1 GB) binary file. I have to read values of different types and write them back into one single big file, transposed. The original file looks like this (V stands for Value):
V1.1,V2.1,V3.1...VX.1,V1.2,V2.2,V3.2,...VX.2... ...VX.Y
The Output file should look like this: V1.1,V1.2...V1.Y,V2.1,V2.2...VX.Y.
What I am doing now is opening a bunch of temporary files and writing all V1 values into the first, all V2 into the second, and so on. Once I am through the original file, I concatenate all the temporary files.
My Limitations are:
- Memory (that's most important; 0 would be best)
- Speed (my task is to do this as fast as possible)
My problem now is:
- When using file streams or FILE* I am limited to 2048 open files per process, and there might be more than 2000 values in the original file.
- Using CreateFile is very, very slow.
How I am reading the data: I know how many values are in one block (e.g. V1.1 - VX.1 --> X = 1000). The input file is an ifstream; I read one block into a vector of bytes, then write every value into its FILE* via fwrite(). Then I read the next block V1.2 - VX.2, and so on.
My question now is:
Is there a way to handle such a situation correctly? I know I will have to compromise somehow. How can I speed this up without increasing the memory footprint too much?
Thanks in advance, Nicolas
Edit: OS is Windows XP Embedded, .NET 4.0.
Edit: Source file size is ~1 GB.
Edit: My first approach was to create a skeleton file and fill it with data using fseek, but that was even slower than my current approach.
Edit: The program will run on a hard disk RAID-1 setup.