I have a fairly basic question about HFiles.
When a put/insert request is initiated, the value is first written to the WAL and then to the memstore. The values in the memstore are stored in the same sorted manner as in the HFile. Once the memstore is full, it is flushed to a new HFile.
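To make my understanding of the write path concrete, here is a minimal Python sketch of it. This is a toy model, not HBase code: the class `ToyStore`, the `memstore_limit` parameter (counted in entries rather than bytes), and the use of integer row keys are all assumptions made for illustration.

```python
import bisect


class ToyStore:
    """Toy model of the write path: WAL append, sorted memstore, flush to a file."""

    def __init__(self, memstore_limit):
        self.memstore_limit = memstore_limit  # hypothetical flush threshold (entry count)
        self.wal = []      # append-only log, kept in arrival order
        self.keys = []     # memstore row keys, kept sorted at all times
        self.values = {}   # memstore row key -> value
        self.hfiles = []   # flushed files: immutable tuples of (key, value), sorted by key

    def put(self, key, value):
        self.wal.append((key, value))        # 1. write to the log first
        if key not in self.values:
            bisect.insort(self.keys, key)    # 2. insert into the sorted memstore
        self.values[key] = value
        if len(self.keys) >= self.memstore_limit:
            self.flush()                     # 3. memstore full, write a new file

    def flush(self):
        if not self.keys:
            return
        # The memstore is already sorted, so the file is written in key order.
        self.hfiles.append(tuple((k, self.values[k]) for k in self.keys))
        self.keys, self.values = [], {}


store = ToyStore(memstore_limit=3)
for k in (30, 10, 20, 5):
    store.put(k, f"v{k}")

print(store.wal)     # arrival order: 30, 10, 20, 5
print(store.hfiles)  # one flushed file, sorted by key: 10, 20, 30
print(store.keys)    # key 5 is still in the memstore
```

In this sketch the WAL keeps arrival order, while the flushed file comes out in key order because the memstore was sorted before the flush.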
Now, I have read that the HFile stores the data in sorted order i.e. the sequential rowkeys will be next to each other.
Is this 100% true?
For example: I first write rows with rowkeys 1 to 1000, except rowkey 500. Assume that the memstore is now full and so it will create a new HFile, call it HFile1. Now, this file is immutable.
Next, I write rows 1001 to 2000 and then rowkey 500. Assume that the memstore is full again and is flushed to another HFile, call it HFile2.
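Here is the scenario as a small Python sketch, assuming the flush works the way I described (the memstore is sorted, and a flush writes it out as is). The `flush` helper is hypothetical, and row keys are modelled as integers for simplicity; real HBase row keys are byte arrays compared lexicographically.

```python
def flush(memstore):
    """Return an immutable, key-sorted snapshot of the memstore (toy model)."""
    return tuple(sorted(memstore))


# First batch: row keys 1..1000 except 500, flushed to HFile1.
hfile1 = flush(k for k in range(1, 1001) if k != 500)

# Second batch: row keys 1001..2000, then the late row key 500, flushed to HFile2.
hfile2 = flush(list(range(1001, 2001)) + [500])

print(hfile1[498:500])  # (499, 501): HFile1 has a gap at 500
print(hfile2[:3])       # (500, 1001, 1002): 500 sorts first within HFile2
```

Under these assumptions each file is sorted on its own, but the key ranges of the two files overlap, which is exactly the case I am asking about.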
So, is this how it happens?
If yes, then rowkey 500 is not in HFile1, so the rowkeys in the HFiles are not in sorted order. So, is the statement above (that the HFile stores the data in sorted order) correct?
And when a read is issued in this situation, how is it served?