I am trying to analyze the behavior of the BTRFS writing process. I need to create a simple test program which produces (at user level, obviously) the same compressed blobs that the BTRFS module writes to the physical hard disk.
What are the exact steps of writing files under a compression-enabled BTRFS filesystem? Are files split into filepages/extents? How are the filepage sizes determined? Is the compression process deterministic? Some filepages are smaller than 128 KiB (the maximal capacity) despite massive free space. How come?
What is the exact rule of compression? Some filepages are not compressed despite a high compression ratio. How come?
My findings (and further questions) after reading some documentation and the source code (linux/fs/btrfs/zlib.c and linux/fs/btrfs/inode.c) and testing the FS with small files (smaller than 10 KiB) follow; please correct me where I am wrong:
*Smaller files are divided into 4096-byte pages for compression. The compressed blobs are contiguous: with zlib, a smaller file is saved as a single blob (which begins with 0x785E). Bigger files are saved as separate, non-contiguous blobs: with zlib, a bigger file is saved as multiple 0x785E blobs (see the reproduction sketch after this list). What are the steps of fragmenting such big files before compression? How are the fragment sizes determined?
*If the file is smaller than the sectorsize (512 bytes), the file is saved raw. Correct?
*If the compression attempt fails to save at least 1× PAGE_SIZE of space (i.e. at least 4096 bytes), the file is saved raw and marked as incompressible. Correct?
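For reference, here is my minimal user-level sketch for reproducing a 0x785E blob with stock zlib. The compression level 3 is an assumption on my part (my reading of the BTRFS default); zlib emits the 0x78 0x5E header for any level from 2 to 5:

```c
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void) {
    /* one 4096-byte input page filled with compressible data */
    unsigned char in[4096];
    memset(in, 'A', sizeof(in));

    unsigned char out[8192];
    uLongf out_len = sizeof(out);

    /* level 3 is an assumption (my reading of the BTRFS default) */
    if (compress2(out, &out_len, in, sizeof(in), 3) != Z_OK)
        return 1;

    printf("header: %02X %02X, compressed size: %lu\n",
           out[0], out[1], (unsigned long)out_len);
    return 0;
}
```

Compiled with `cc test.c -lz`, this prints `header: 78 5E`, matching the blobs I see on disk.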
Pseudocode of the writing process would be nice. Please excuse my ignorance of the BTRFS specification. Thank you in advance for the help.
UPDATE: After doing further tests I can already answer some of my own questions.
UPDATE: I discovered another corner case in which filepages are not 128 KiB in size. I still need to know the exact rule of the compression process.
UPDATE: I changed my questions. Answering my previous questions:
(a) Yes, files are split into pages, usually 128 KiB big; some pages are smaller. Some small files (I still don't know how small they must be) are saved either compressed or raw into inline extents.
(b) (I still don't know the full answer to this question.) Usually, when there is plenty of free space, the filepages are the full 128 KiB in size. In some cases (non-contiguous free space and other conditions yet to be identified) some filepages are sized in multiples of 0x1000 bytes (4096 bytes).
(c) The compression function is deterministic. However, since identical files are processed differently in different operating environments (different filepage sizes, different compression flagging), the compressed blobs of identical files differ across disks. Were the files processed in exactly the same environment, their compressed blobs would be identical across disks. (A determinism check is sketched right below.)
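To convince myself of (c), I used a small check like the following: compressing the same buffer twice with the same zlib version and level must produce byte-identical output (the level-3 setting is again my assumption):

```c
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void) {
    /* one 128 KiB filepage worth of repeatable test data */
    static unsigned char in[131072];
    for (size_t i = 0; i < sizeof(in); i++)
        in[i] = (unsigned char)((i * 31) % 251);

    static unsigned char a[140000], b[140000];
    uLongf a_len = sizeof(a), b_len = sizeof(b);

    if (compress2(a, &a_len, in, sizeof(in), 3) != Z_OK ||
        compress2(b, &b_len, in, sizeof(in), 3) != Z_OK)
        return 1;

    puts(a_len == b_len && memcmp(a, b, a_len) == 0
             ? "deterministic" : "differs");
    return 0;
}
```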
The main rule:
The file is divided into multiple 128 KiB pages. Each page (at least with zlib) is compressed in 4096-byte steps, at most 32 steps per page (128 KiB / 4096 B = 32). The compression heuristic check (which ensures that the compressed output stays smaller than the raw input) runs from the 3rd step to the end of the page; no check is done for the first 2 steps. From the 3rd step on, if the compressed output becomes bigger than the raw input consumed so far, the compression of the current page is cancelled and the page is saved as a raw blob.
Upon success the page is saved as a compressed blob; under default zlib settings the blob begins with 0x785E. The compressed blob is padded at the end with 0x00.
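Here is my user-space sketch of that per-page loop, mirroring what I believe linux/fs/btrfs/zlib.c does (the step size, the bail-out threshold, and level 3 are my assumptions; the real code works on kernel pages and a different zlib API, and the 0x00 padding observed above is left out):

```c
#include <string.h>
#include <zlib.h>

#define STEP_SZ 4096            /* one compression step */
#define PAGE_SZ (128 * 1024)    /* one 128 KiB filepage */

/* Compress one filepage in 4096-byte steps; from the 3rd step on,
 * give up as soon as the output outgrows the input consumed so far.
 * Returns the compressed length, or 0 if the page must stay raw.
 * Assumes out_cap is large enough (e.g. from compressBound()). */
static size_t compress_filepage(const unsigned char *in, size_t in_len,
                                unsigned char *out, size_t out_cap)
{
    z_stream zs;
    memset(&zs, 0, sizeof(zs));
    if (deflateInit(&zs, 3) != Z_OK)        /* assumed default level 3 */
        return 0;

    zs.next_out = out;
    zs.avail_out = (uInt)out_cap;

    size_t step = 0, off = 0;
    while (off < in_len) {
        size_t n = in_len - off < STEP_SZ ? in_len - off : STEP_SZ;
        zs.next_in = (unsigned char *)(in + off);
        zs.avail_in = (uInt)n;
        off += n;
        step++;

        int flush = (off == in_len) ? Z_FINISH : Z_NO_FLUSH;
        if (deflate(&zs, flush) == Z_STREAM_ERROR)
            break;

        /* heuristic bail-out: skipped for the first 2 steps */
        if (step > 2 && zs.total_out > zs.total_in) {
            deflateEnd(&zs);
            return 0;                       /* save this page raw */
        }
    }

    size_t out_len = zs.total_out;
    deflateEnd(&zs);
    return out_len;
}
```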
The side rule:
If the file is smaller than 512 bytes and is stored in an inline extent, it is saved raw.
If the size of the compressed file + sectorsize (512 bytes by default) is smaller than or equal to the size of the raw file, the compression output is accepted; otherwise the output is discarded and the file is marked incompressible (see the sketch after these rules).
If free space for the file is very scarce, the file is divided into unequal pages; some pages will not be 128 KiB in size.
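Put together, the acceptance test I described would look roughly like this (512-byte sectorsize per my setup; on many systems it is 4096):

```c
#include <stdbool.h>
#include <stddef.h>

#define SECTORSIZE 512  /* per my setup; often 4096 elsewhere */

/* Accept the compressed blob only if it saves at least one full
 * sector compared to the raw data; otherwise keep the raw file
 * and mark it incompressible. */
static bool accept_compressed(size_t raw_len, size_t compressed_len)
{
    return compressed_len + SECTORSIZE <= raw_len;
}
```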