7

I tried to read a 3GB data file using ifstream, and it reports the wrong file size, whereas for a 600MB file it reports the correct size. Besides the wrong file size, I am also unable to read the entire file using ifstream.

Here is the code that I used

        std::wstring name;
        name.assign(fileName.begin(), fileName.end());
        __stat64 buf;
        if (_wstat64(name.c_str(), &buf) != 0)
            std::cout << -1; // error, could use errno to find out more

        std::cout << " Windows file size : " << buf.st_size << std::endl;


        std::ifstream fs(fileName.c_str(), std::ifstream::in | std::ifstream::binary);
        fs.seekg(0, std::ios_base::end);

        std::cout << " ifstream  file size: " << fs.tellg() << std::endl;

The output for 3GB file was

 Windows file size : 3147046042
 ifstream  file size: -1147921254

Whereas the output for 600 MB file was

 Windows file size : 678761111
 ifstream  file size: 678761111

Just in case, I also tested a 5GB file and a 300MB file.

The output for 5GB file was

 Windows file size : 5430386900
 ifstream  file size: 1135419604

The output for 300MB file was

 Windows file size : 318763632
 ifstream  file size: 318763632

It looks to me like it is reaching some limit.

I am testing the code using Visual Studio 2010 on a Windows Machine which has plenty of memory and disk space.

I am trying to read some large files. Which is a good stream reader to use if ifstream can't read large files?

veda
  • I noticed you're invoking _wstat64 directly. Are you compiling 32bit binaries? Did you try 64bit binaries for your `ifstream` test? – WhozCraig May 01 '13 at 19:24
  • @WhozCraig: 32-bit code should be able to handle files in excess of 2GB if the filesystem does, if not it's a pretty bad bug. – Ben Voigt May 01 '13 at 19:26
  • @BenVoigt I concur, I was only curious if `stat()` (not `_wstat64()`) behaved similarly on a 32bit implementation, and if `ifstream` behaved *differently* on a 64bit implementation. – WhozCraig May 01 '13 at 19:29
  • @WhozCraig: The platform I am compiling for is Win32. I didn't try a 64bit binary. – veda May 01 '13 at 19:31

3 Answers

7

I think you want to say:

std::cout << " ifstream  file size: " << fs.tellg().seekpos() << std::endl;

At least that works correctly for a 6GB file I have lying around. But I'm compiling with Visual Studio 2012, and even your original code works fine in that environment.

So I suspect that this is a bug in the std library on VS 2010 that got fixed in VS 2012. Whether the bug is in the operator overloading for the pos_type or whether that class isn't 64-bit aware, I don't know. I'd have to install VS 2010 to validate, but that is likely the problem.
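Since `seekpos()` is not a standard member, a portable way to get a number out of `tellg()` is the standard conversion from `std::streampos` to `std::streamoff`. The sketch below assumes a standard library whose stream position type holds 64-bit offsets (as reported above for VS 2012); on a library with the truncation bug it may still return a wrong value. The helper name `stream_file_size` is hypothetical, not something from the original code.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: returns the file size in bytes as seen through
// std::ifstream, or -1 if the file cannot be opened or tellg() fails.
std::streamoff stream_file_size(const std::string& path) {
    std::ifstream fs(path.c_str(), std::ios::in | std::ios::binary);
    if (!fs) {
        return -1;  // could not open the file
    }
    fs.seekg(0, std::ios::end);
    const std::streampos end = fs.tellg();
    if (end == std::streampos(-1)) {
        return -1;  // tellg() reports failure as pos_type(-1)
    }
    // Standard pos_type -> off_type conversion, no vendor extension needed.
    return static_cast<std::streamoff>(end);
}
```

Printing the result of `stream_file_size(fileName)` in place of `fs.tellg()` keeps the code independent of the vendor-specific `seekpos()` member.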

selbie
  • According to the Standard, it doesn't look like `fpos<char_traits<char>::state_type>` (the type returned by `tellg()`) is supposed to have a public `seekpos()` member. Is this an implementation-specific extension? – Ben Voigt May 01 '13 at 19:35
  • Yes, it did solve the problem. Now I am getting the correct result. – veda May 01 '13 at 19:37
  • Note that MS deprecated `seekpos()` with Visual Studio 15.8, they also changed the implementation to always return 0. – Brandlingo Aug 28 '18 at 07:16
4

I modified your code slightly, to something that would compile:

#include <fstream>
#include <iostream>
#include <string>
#include <sys/types.h>
#include <sys/stat.h> // declares __stat64 and _wstat64 (MSVC-specific)

int main() { 

    std::wstring name(L"whatever.txt");

    __stat64 buf;
    if (_wstat64(name.c_str(), &buf) != 0) {
        std::cout << -1 << std::endl; // error, could use errno to find out more
        return 1; // buf is not valid here, so do not print it
    }

    std::cout << " Windows file size : " << buf.st_size << std::endl;


    std::ifstream fs(name.c_str(), std::ifstream::in | std::ifstream::binary);
    fs.seekg(0, std::ios_base::end);

    std::cout << " ifstream  file size: " << fs.tellg() << std::endl;

    return 0;
}

I tried this on a ~3 Gigabyte file. With VS 2012 (either 32- or 64-bit) it produced:

 Windows file size : 3581853696
 ifstream  file size: 3581853696

With 32-bit VS 2008 (sorry, don't have a copy of VS 2010 installed right now) I got:

 Windows file size : 3581853696
 ifstream  file size: -713113600

So, it would appear that older versions of VS/VC++ used a 32-bit signed number for file sizes, so their practical limit for iostreams was probably 2 gigabytes. With VS 2012, that has apparently been corrected.
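The wrong values are consistent with the real size being cut down to its low 32 bits and then read as a signed number. The sketch below reproduces that arithmetic; the helper name `wrap_to_int32` is hypothetical, and the code only models the observed numbers, not the actual library internals.

```cpp
#include <cstdint>

// Hypothetical helper: keeps the low 32 bits of a 64-bit size and interprets
// them as a signed 32-bit two's complement value, returned as int64_t.
std::int64_t wrap_to_int32(std::uint64_t size) {
    const std::uint64_t low = size & 0xFFFFFFFFull;  // size mod 2^32
    if (low >= 0x80000000ull) {
        // Top bit set: the signed reading is low - 2^32.
        return static_cast<std::int64_t>(low) - 0x100000000ll;
    }
    return static_cast<std::int64_t>(low);
}
```

With the sizes reported in this thread, `wrap_to_int32(3147046042)` gives -1147921254, `wrap_to_int32(5430386900)` gives 1135419604, and `wrap_to_int32(3581853696)` gives -713113600, which match the `ifstream` outputs above. Sizes below 2 GB, such as 678761111, come back unchanged.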

Jerry Coffin
0

The maximum file sizes are determined by the compiler and the OS.

The compiler has control over the sizes of the variables used to access the file sizes.

The OS determines the largest file size that it can support.

The C++ language does not limit the file size.

Example 1:
The compiler could allocate 16 bits for the file position variable, while the OS may use a 32-bit pointer for the maximum file size. In this case, the compiler is the limiting factor.

Example 2:
The compiler could use 32 bits for the file position variable, but the OS uses 24 bits. In this example, the OS is the limiting factor.

In summary, the maximum file size depends on both the compiler and the OS.
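To see which limit a given toolchain imposes, one can inspect the width of the offset type the stream library uses. This is a minimal sketch; the helper names `max_addressable_bytes` and `stream_offset_bits` are hypothetical, and the signed integer types stand in for whatever position type an implementation might choose.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <limits>

// Hypothetical helper: largest file offset representable in a signed
// offset type OffsetT.
template <typename OffsetT>
unsigned long long max_addressable_bytes() {
    return static_cast<unsigned long long>(std::numeric_limits<OffsetT>::max());
}

// Hypothetical helper: number of bits in the library's stream offset type.
std::size_t stream_offset_bits() {
    return sizeof(std::streamoff) * CHAR_BIT;
}
```

A signed 16-bit position tops out at 32767 bytes and a signed 32-bit one at 2147483647 bytes (just under 2 GB), so a library that stores positions in 32 bits cannot report the sizes of the 3GB and 5GB files in the question, whatever the OS supports.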

Thomas Matthews
  • I think you mean the Standard library, not the compiler... but the Standard library can only be the limiting factor if you're not using a library appropriate for the OS. – Ben Voigt May 01 '13 at 19:33
  • Does the Standard Library determine the size of the `filepos` variable or the compiler? I know that length of `size_t` is set by the compiler. – Thomas Matthews May 02 '13 at 01:26
  • The Standard library does. The types used for file position are not specified to be `size_t` or `ssize_t`, and generally should not be, since filesystem limits greatly exceed pointer limits (even on 64-bit systems, on which filesystems are starting to use 128-bit lengths!). – Ben Voigt May 02 '13 at 02:16