It's perfectly simple, I'd say: you open a file (fopen returns a FILE *), which you can then read in a loop, using fread to specify the maximum number of bytes to read on each iteration. Since you're using a loop anyway, you can increment a simple int to keep track of the chunk file names, use snprintf to create each name, write the bytes read by fread to that file, and continue until you're done.
Some details on fread that might be useful to you.
A basic example (needs some work, still):
    #include <stdio.h>

    int main( void )
    {
        int chunk_count = 0;
        size_t chunk_size = 256, bytes_read;
        char buffer[256];
        FILE *src_fp,
             *target_fp;
        char chunk_name[50];
        //"source.txt" is a placeholder name; the NULL check is still missing
        src_fp = fopen("source.txt", "rb");
        //item size is 1, so fread returns the number of bytes it actually read
        while (chunk_size == (bytes_read = fread(buffer, 1, chunk_size, src_fp)))
        {//read a full chunk
            ++chunk_count;//increase chunk count
            snprintf(chunk_name, sizeof chunk_name, "chunk_part%d.txt", chunk_count);
            target_fp = fopen(chunk_name, "wb");
            //write to chunk file
            fwrite(buffer, 1, bytes_read, target_fp);
            fclose(target_fp);//close chunk file
        }
        //don't forget to write the last chunk, if it's not 0 in length
        if (bytes_read)
        {
            ++chunk_count;//increase chunk count
            snprintf(chunk_name, sizeof chunk_name, "chunk_part%d.txt", chunk_count);
            target_fp = fopen(chunk_name, "wb");
            //write only the bytes that were read, not the whole buffer
            fwrite(buffer, 1, bytes_read, target_fp);
            fclose(target_fp);//close chunk file
        }
        fclose(src_fp);
        printf("Written %d files, each of max 256 bytes\n", chunk_count);
        return 0;
    }
Note that this code is not exactly safe to use as it stands. You'll need to check the return values of fopen (it can, and at some point will, return NULL). The fread-based loop simply assumes that, if its return value is less than the chunk size, we've reached the end of the source file. That isn't always the case: a short read can also mean a read error, and only feof and ferror can tell you which one it was. You'll have to handle NULL pointers and the ferror stuff yourself, still. Either way, the functions to look into are:
fread
fopen
fwrite
fclose
ferror
snprintf
That should do it.
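To illustrate the checks mentioned above, here is a minimal sketch of the same loop with the fopen, fwrite, fclose and ferror results tested. The function name split_file and its "return -1 on any error" convention are my own assumptions for this example, not something the question asked for.

```c
#include <stdio.h>

/* Hypothetical helper: splits src_name into chunk_partN.txt files of at
 * most 256 bytes. Returns the number of chunks written, or -1 on error. */
int split_file(const char *src_name)
{
    char buffer[256];
    char chunk_name[50];
    size_t bytes_read;
    int chunk_count = 0;
    FILE *src_fp = fopen(src_name, "rb");

    if (src_fp == NULL)
        return -1; /* could not open the source file */

    /* item size 1: fread returns the number of bytes actually read */
    while ((bytes_read = fread(buffer, 1, sizeof buffer, src_fp)) > 0)
    {
        FILE *target_fp;
        size_t written;
        int close_status;

        ++chunk_count;
        snprintf(chunk_name, sizeof chunk_name, "chunk_part%d.txt", chunk_count);
        target_fp = fopen(chunk_name, "wb");
        if (target_fp == NULL)
        {
            fclose(src_fp);
            return -1;
        }
        written = fwrite(buffer, 1, bytes_read, target_fp);
        close_status = fclose(target_fp);
        if (written != bytes_read || close_status != 0)
        {
            fclose(src_fp);
            return -1;
        }
    }
    /* fread returned 0: was that the end of the file or a read error? */
    if (ferror(src_fp))
    {
        fclose(src_fp);
        return -1;
    }
    fclose(src_fp);
    return chunk_count;
}
```

Because the loop runs for any non-zero read, the final (shorter) chunk no longer needs a separate block after the loop. Chunks that were already written are left on disk when an error occurs; whether to clean them up is a choice this sketch leaves open.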
Update, just for the fun of it.
You might want to pad the numbers of your chunk file names (chunk_part0001.txt). To do this, you can try to predict how big the source file is, divide that by 256 to work out how many chunks you're actually going to end up with, and use that many digits of zero padding. How to get the file size is explained here, but here's some code I wrote some time ago:
    long file_size = 0,
         factor = 10;
    int padding_cnt = 1;//at least 1 digit, ensures correct padding
    fseek(src_fp, 0, SEEK_END);//go to end of file
    file_size = ftell(src_fp);//size in bytes (-1 on error, not checked here)
    file_size = file_size / 256 + 1;//divided by chunk size, +1 for the remainder
    rewind(src_fp);//return to beginning of file
    while (1 <= (file_size/factor))
    {//one extra digit for every power of 10 the chunk count reaches
        factor *= 10;
        ++padding_cnt;
    }
    //padded chunk file names:
    snprintf(chunk_name, sizeof chunk_name, "chunk_part%0*d.txt", padding_cnt, chunk_count);
If you want, I could explain every single statement, but the gist of it is this:
- fseek + ftell get the size of the file (in bytes). Dividing that by the chunk size (256) gives the number of full chunks you'll create, plus one more for the remainder. padding_cnt is initialized to 1 because every chunk number has at least one digit.
- The while loop divides the chunk count by increasing powers of 10: each time factor is multiplied by 10, the padding count increases by one.
- The format passed to snprintf changed to %0*d, which means: "print an int, padded with zeroes to a fixed width of n characters". If you end up with 123 chunks, the first chunk file will be called chunk_part001.txt, the tenth file will be chunk_part010.txt, all the way up to chunk_part123.txt.
- Refer to the linked question: the accepted answer uses sys/stat.h to get the file size, which is more reliable (though it can pose some minor portability issues). Check the stat wiki for alternatives.
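For comparison, a small sketch of the stat-based route together with the digit count. It assumes a POSIX system (sys/stat.h is not part of standard C), and the helper names file_size_stat and padding_width are made up for this example.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file in bytes, or -1 if stat fails.
 * Assumes a POSIX system and a size that fits in a long. */
long file_size_stat(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Hypothetical helper: number of digits needed for the highest chunk
 * number, given the file size and the chunk size in bytes. */
int padding_width(long file_size, long chunk_size)
{
    long chunks = (file_size + chunk_size - 1) / chunk_size; /* round up */
    int width = 1; /* every number has at least one digit */

    while (chunks >= 10)
    {
        chunks /= 10;
        ++width;
    }
    return width;
}
```

The result of padding_width can be passed straight to snprintf as the width argument of %0*d. Rounding the division up means a file that is an exact multiple of the chunk size does not get counted as having one chunk more than it really has.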
Why? Because it's fun, and it makes the output files easier to sort by name. It also lets you predict how big the char array that holds the target file name must be, so if you have to allocate that memory using malloc, you know exactly how much memory you'll need, and don't have to allocate 100 chars (which should be enough either way) and hope that you don't run out of space.
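As a rough sketch of that last point: since C99, calling snprintf with a NULL buffer and a size of 0 writes nothing and just returns the length the formatted string would have, so you can malloc exactly that much. The helper name make_chunk_name is an assumption for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds a zero-padded chunk file name in freshly
 * allocated memory. The caller must free() the result. Returns NULL on
 * failure. */
char *make_chunk_name(int width, int chunk_number)
{
    /* NULL buffer + size 0: snprintf only reports the required length */
    int len = snprintf(NULL, 0, "chunk_part%0*d.txt", width, chunk_number);
    char *name;

    if (len < 0)
        return NULL; /* formatting error */
    name = malloc((size_t)len + 1); /* +1 for the terminating '\0' */
    if (name == NULL)
        return NULL;
    snprintf(name, (size_t)len + 1, "chunk_part%0*d.txt", width, chunk_number);
    return name;
}
```

For a width of 3 and chunk number 10 this yields chunk_part010.txt, which needs 17 characters plus the terminator.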
Lastly: the more you know, the better IMO, so I thought I'd give you some links and refs you might want to check.