A few years ago MongoDB caught some heat for having an unsafe default relating to disk persistence (see this question for instance). What measures must a database implementation go through to ensure that writes to disk are safe? Is it sufficient to call fsync()
after a write, or must other precautions be taken such as journaling or particular ways of using the disk?

- 1
- 1

- 22,421
- 2
- 50
- 77
1 Answers
Calling fsync()
would flush the dirty pages in the buffer cache to the disk. This depends on the load on your server, as having a large number of dirty pages in the cache and initiating a flush could causes the system to hung or get to an unresponsive state. However its recommended tune some of the kernel turntables with optimal values for vm.dirty_expire_centisecs
, vm.dirty_background_ratio
to make sure all writes a safe and quick and not kept in the cache for a long time. Having lower values could slow average I/O speed as constantly trying to write dirty pages out will just trigger the I/O congestion code more frequently.
Alternatively, some of the databases provide Direct I/O as a feature of the file system whereby file reads and writes go directly from the applications to the storage device, bypassing caches. Direct I/O is mostly used in applications (databases) that manage their own caches with the O_DIRECT flag.

- 6,501
- 30
- 43