
I'm using the following code:

const size_t MonaBuffSize = 1024 * 1000;
std::ifstream file(Path.string(), std::ifstream::binary);
MD5_CTX md5Context;
MD5_Init(&md5Context);
char buf[MonaBuffSize];
while (file.good()) {
    file.read(buf, sizeof(buf));
    MD5_Update(&md5Context, buf, file.gcount());
}
unsigned char result[MD5_DIGEST_LENGTH];
MD5_Final(result, &md5Context);

When I set MonaBuffSize to 1024 * 1024, the program crashes when file.read is called.

Is there a limit on the buffer size that ifstream::read() can handle? I know it is probably better to use smaller buffers, but I would like to understand what the problem is here.

I'm using Visual Studio 2019, C++17.

EDIT: As recommended in comments, I returned to the previous method that I used with boost::iostreams::mapped_file_source:

const std::string md5_from_file(const std::filesystem::path Path, std::uintmax_t Size)
{
    unsigned char result[MD5_DIGEST_LENGTH];
    if (Size > 0) {
        try {
            boost::iostreams::mapped_file_source src;
            src.open(Path.string());
            MD5((unsigned char*)src.data(), src.size(), result);
        }
        catch (std::ios_base::failure const& e) {
            MD5((unsigned char*)"", 0, result);
            //MessageBox(NULL, e.what(), "EXCEPTION", MB_YESNO | MB_ICONINFORMATION);
        }
    }
    else {
        MD5((unsigned char*)"", 0, result);
    }
    std::ostringstream sout;
    sout << std::hex << std::setfill('0');
    for (auto c : result) sout << std::setw(2) << (int)c;
    return sout.str();
}
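For comparison, the original chunked-read loop also works once the buffer is moved to the heap (std::vector instead of a stack array), which avoids the stack overflow the comments diagnosed. This is only a sketch: to keep it compilable without OpenSSL, a simple FNV-1a hash stands in for the MD5_Update/MD5_Final calls, and the function name is made up for illustration:

```cpp
#include <cstdint>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Chunked file hashing with a heap-allocated buffer. FNV-1a is only a
// stand-in for the MD5_* calls so this sketch compiles without OpenSSL.
std::string hash_file_chunked(const std::string& path)
{
    const std::size_t BuffSize = 1024 * 1024;   // 1 MiB is fine on the heap
    std::vector<char> buf(BuffSize);            // heap allocation, not stack

    std::uint64_t h = 14695981039346656037ull;  // FNV-1a offset basis
    std::ifstream file(path, std::ifstream::binary);
    while (file.good()) {
        file.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        for (std::streamsize i = 0; i < file.gcount(); ++i) {
            h ^= static_cast<unsigned char>(buf[i]);
            h *= 1099511628211ull;              // FNV-1a prime
        }
    }

    std::ostringstream sout;
    sout << std::hex << std::setfill('0') << std::setw(16) << h;
    return sout.str();
}
```

With OpenSSL available, the inner FNV loop would be replaced by MD5_Update(&md5Context, buf.data(), file.gcount()) and MD5_Final after the loop, as in the original code above.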

Additionally, I realized that it was Windows Defender slowing the disk reads down so much, so my attempts to speed them up were pointless.

Mona
    According to [this](https://learn.microsoft.com/en-us/cpp/build/reference/stack-stack-allocations?view=vs-2019), the default Windows stack size is only 1MB, so your buffer will be too big for that. – BoBTFish Oct 05 '20 at 08:43
  • Thank you. So I can safely keep 1000*1024 and it won't crash on the other configurations / OS versions? – Mona Oct 05 '20 at 08:49
  • No you can't - and why use a buffer that big? Just read smaller chunks (`BUFSIZ` is probably good enough). – Ted Lyngmo Oct 05 '20 at 08:51
  • 2
    And preferably allocate your buffer on the heap! – Botje Oct 05 '20 at 08:56
  • I was using the boost::iostreams::mapped_file_source before, but in the task manager, the maximum read speed from a NVMe disk was only ~50MB/s (usually lower), while CPU usage < 5%. I thought that greater buffer could speed it up. Unfortunately, with the new ifstream method I'm getting exactly the same speeds. So I will return to the mapped_file_source. – Mona Oct 05 '20 at 08:58
  • 2
    @Mona That is pretty large for the stack anyway, I certainly wouldn't be comfortable deploying that to multiple platforms. There are other things using the stack too, it's very hard to make any guarantees. I'd put it on the heap if you really need a buffer that large anyway, but maybe you really would be better to memory map the file? – BoBTFish Oct 05 '20 at 08:59
  • @BoBTFish, Yes. Thank you. I will return to the mapped_file_source. Now I realized that it was Windows Defender slowing it down so much. – Mona Oct 05 '20 at 09:02
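For completeness: the Microsoft documentation linked in the first comment describes raising the default 1 MB stack reservation at link time with the /STACK linker option. A typical MSVC invocation (the file name here is hypothetical) might look like this, though moving the buffer to the heap is the usual fix:

```
:: Reserve an 8 MB stack instead of the default 1 MB (MSVC /STACK linker option)
cl /EHsc md5tool.cpp /link /STACK:8388608
```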

0 Answers