2

I am trying to analyze a basic read operation using ifstream with Procmon.

Part of the code used for the read operation, where I was trying to read 16 KB of data from a file:

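char * buffer = new char[128000];
ifstream fileHandle("file.txt");
fileHandle.read(buffer, 16000);
// buffer is not null-terminated, so print only the bytes actually read
cout.write(buffer, fileHandle.gcount());
cout << endl;
fileHandle.close();
delete[] buffer;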

In Procmon there were 4 ReadFile operations with the following details:

Offset: 0, Length: 4,096, Priority: Normal
Offset: 4,096, Length: 4,096
Offset: 8,192, Length: 4,096
Offset: 12,288, Length: 4,096

So does this mean that there were 4 operations of 4 KB each? If so, why did that happen instead of a single ReadFile operation of 16 KB?

moooni moon

2 Answers

2

So does this mean that there were 4 operations of 4 KB each?

Yes.

If so, why did that happen instead of a single ReadFile operation of 16 KB?

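Probably because the standard library shipped with your compiler sets the default buffer size of file streams to 4 KB. Since the read operation has to go through that buffer, it has to be filled (through OS calls) and emptied 4 times before your request is satisfied. Notice that you can change the internal buffer of an fstream using fileHandle.rdbuf()->pubsetbuf(...). Its effect is implementation-defined: some standard libraries only honor the call before the file is opened, others only once it is open, and in either case it should come before any I/O.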
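As a rough illustration of the pubsetbuf idea, here is a minimal sketch. The helper name readWithBuffer and its parameters are made up for this example, and placing the call before open() is an assumption that suits some standard libraries only; on others the call may need to come after open(), so check the ReadFile sizes in Procmon rather than taking the result for granted.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): read up to `count` bytes
// from `path` through a caller-supplied stream buffer of `bufSize` bytes.
std::vector<char> readWithBuffer(const std::string& path,
                                 std::size_t count,
                                 std::size_t bufSize) {
    // Declared before the stream so that it outlives the stream.
    std::vector<char> streamBuf(bufSize);

    std::ifstream in;
    // Assumption: this implementation honors pubsetbuf before open().
    // The effect is implementation-defined; some libraries ignore the call
    // here and only accept it once the file is open.
    in.rdbuf()->pubsetbuf(streamBuf.data(),
                          static_cast<std::streamsize>(streamBuf.size()));
    in.open(path, std::ios::binary);

    std::vector<char> data(count);
    in.read(data.data(), static_cast<std::streamsize>(count));
    // Keep only the bytes that were actually read (0 if the open failed).
    data.resize(static_cast<std::size_t>(in.gcount()));
    return data;
}
```

Calling readWithBuffer("file.txt", 16000, 16384) would request a 16 KB stream buffer; whether that turns the four 4 KB reads into one depends on the standard library in use.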

Matteo Italia
  • I will experiment more by changing the internal buffer and see how that works. – moooni moon Aug 25 '15 at 23:09
  • For 16 KB I doubt you'll manage to see any significant difference... the initial IO cost is going to completely hide the cost of the extra syscalls, and if you are going for performance `fstream` is a dead end anyway. – Matteo Italia Aug 25 '15 at 23:27
  • Thanks. I was able to read 16 KB in a single operation using fileHandle.rdbuf. – moooni moon Sep 08 '15 at 20:04
1

So does this mean that there were 4 operations of 4 KB each?

That is exactly what it is saying.

If so, why did that happen instead of a single ReadFile operation of 16 KB?

Just because you asked for 16000 bytes does not mean ifstream can actually read 16000 bytes in a single operation. File systems do not usually allow for such large reads; there is usually a cap. Even if you increase the size of the internal buffer that ifstream uses, there is still no guarantee that the file system will honor a larger read size.

The contract of read() is that it returns the requested number of bytes unless an EOF/error is encountered. HOW it accomplishes that reading internally is an implementation detail. In this case, ifstream had to read four 4KB blocks in order to return 16000 bytes.
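To make that contract concrete, here is a small sketch. The helper name readUpTo is hypothetical; it asks for a given number of bytes and uses gcount() to report how many were actually delivered, however many underlying reads that took.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not from the original post): request `count` bytes
// from `path` into `dest` and return how many bytes were actually stored.
std::size_t readUpTo(const std::string& path, char* dest, std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    // read() either delivers all `count` bytes or stops early at EOF/error,
    // in which case it sets eofbit/failbit on the stream.
    in.read(dest, static_cast<std::streamsize>(count));
    // gcount() reports the number of bytes the last read() stored.
    return static_cast<std::size_t>(in.gcount());
}
```

The caller never sees how many ReadFile calls were issued; only the returned count and the stream state tell it whether the full request was satisfied.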

Remy Lebeau
  • I don't think that, at this level, the filesystem implementation matters in any other way than suggesting to the designers of the CRT that they use 4096 as the default block size. `ReadFile` per se is quite a high-level API: you can ask it for any size (and pass any kind of file handle) and it will happily oblige, blocking until it fetches all the requested data; the filesystem driver limitations are way below it. – Matteo Italia Aug 25 '15 at 21:56