
I have a large binary file that is read and compressed with bzip2. I am trying to reduce the compression time, which is currently around 1 minute 30 seconds.

One thing I wanted to try was expanding the buffer used by fopen. However, I noticed that the memory allocated during compression barely exceeds 7,000 KB.

Here is my code:

int bzipError = BZ_OK;
BZFILE *bzipLogFile = BZ2_bzWriteOpen(&bzipError, CompressedLogFile, 9, 0, 30);
const int BUF_SIZE = 200000;
char* Buffer = new char[BUF_SIZE];

while (!feof(LogFile)) {
    const size_t BytesRead = fread(Buffer, (size_t)1, BUF_SIZE, LogFile);
    BZ2_bzWrite(&bzipError, bzipLogFile, Buffer, (int)BytesRead);
}
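For reference, the stdio buffer behind fopen can be enlarged with setvbuf before the first read. This is only a sketch of that idea; the helper name is mine, not part of my program:

```cpp
#include <cstddef>
#include <cstdio>

// Open `path` for binary reading with a caller-chosen stdio buffer size.
// setvbuf must be called after fopen but before the first I/O on the
// stream; passing a null buffer pointer lets the library allocate (and
// later free) the buffer itself.
FILE* OpenWithBigBuffer(const char* path, std::size_t bufSize) {
    FILE* f = std::fopen(path, "rb");
    if (!f) return nullptr;
    if (std::setvbuf(f, nullptr, _IOFBF, bufSize) != 0) {
        std::fclose(f);   // could not install the buffer; fail loudly
        return nullptr;
    }
    return f;
}
```

Whether a bigger stdio buffer actually helps the total time here is a separate question, since fread is already being called with a 200,000-byte chunk.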

I realize there are default limits on how much an application can allocate on the stack and heap, so I used

#pragma comment(linker, "/STACK:200000")
#pragma comment(linker, "/HEAP:200000")

to try to circumvent this. Clearly I am wrong.
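As far as I can tell (worth verifying against the MSVC linker docs), /HEAP and /STACK set initial reserve/commit sizes, not hard ceilings, so allocations made with new are not bounded by them. A small sketch of that assumption, with a helper name that is mine, for illustration:

```cpp
#include <cstddef>
#include <new>

// The /HEAP and /STACK linker options set the process's initial
// reserve/commit sizes; the CRT heap grows on demand beyond them.
// This helper shows new[] succeeding far past 200000 bytes.
bool CanAllocate(std::size_t bytes) {
    char* p = new (std::nothrow) char[bytes];
    bool ok = (p != nullptr);
    delete[] p;
    return ok;
}
```

For example, CanAllocate(100 * 1024 * 1024) should succeed on any machine with that much free address space, regardless of the pragmas above.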

user0000001
  • `sizeof(Buffer)` is incorrect. It is the size of the pointer, not the allocated block of memory it points to. – Retired Ninja Jun 07 '16 at 19:30
  • @RetiredNinja Oh wow. Thanks for pointing that out. I made the changes but I am still restrained at around 7,000K. – user0000001 Jun 07 '16 at 19:38
  • Please read [Why is “while ( !feof (file) )” always wrong?](http://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong). – Some programmer dude Jun 07 '16 at 19:39
  • As for your problem, *why* is it a problem? What is the actual problem you try to solve by increasing the internal buffer size? – Some programmer dude Jun 07 '16 at 19:41
  • Most likely, you don't need to increase the size for `fopen`, but the size of the buffers used by `fread`. The `fopen` function may only be using minimal storage for file attributes. – Thomas Matthews Jun 07 '16 at 19:44
  • Profile. Is the bottleneck the file reading? Profile. Is the bottleneck the ZIP operation? Profile. Have you tried *memory mapping*? Did I say to Profile? – Thomas Matthews Jun 07 '16 at 19:46
  • I think two, three more times might do it, @ThomasMatthews – user4581301 Jun 07 '16 at 19:53
  • @ThomasMatthews Did you say Profile? :) Thanks for the suggestions. As I understood it, memory mapping is good for files that are accessed frequently, not for files that are accessed once and deleted. I'm probably wrong but I will definitely look into this. Thanks. – user0000001 Jun 07 '16 at 20:14
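Following up on the feof comment above, a read loop driven by fread's return value could look like this. It is only a sketch; the BZ2 call is indicated in a comment rather than wired up, and the helper name is mine:

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Read `stream` to end-of-file in fixed-size chunks, driving the loop off
// fread's return value instead of feof(). Returns the total bytes read.
std::size_t DrainStream(std::FILE* stream, std::size_t chunk = 200000) {
    std::vector<char> buffer(chunk);
    std::size_t total = 0;
    std::size_t n;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), stream)) > 0) {
        // This is where BZ2_bzWrite(&bzipError, bzipLogFile,
        // buffer.data(), (int)n) would go in the code above.
        total += n;
    }
    return total;
}
```

Unlike the feof-driven version, this never processes a zero-byte final iteration, and it stops on read errors as well as on end-of-file.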
