7

I am learning MySQL architecture. I come up with the following illustration:

enter image description here

There are 4 concepts that I don't understand well:

  • double write buffer
  • log buffer
  • write-ahead log
  • redo log

I read from many documents, Write-Ahead Log (WAL) is a mechanism for database durability. MySQL WAL Design Wikipedia WAL

Like the above image, there are 2 types of buffers when flushing data from the memory buffer pool to disks: double write buffer and log buffer. Why do we need 2 buffers, and how are they related to WAL?

Last but not least, what are the differences between redo logs and WAL. I think WAL can help database recover when something wrong happens (e.g.: power outage, server crashes ...). What do we need redo log alongside with WAL?

Trần Kim Dự
  • 5,872
  • 12
  • 55
  • 107

1 Answers1

17

The WAL design document you linked to gives a clue:

All the changes to data files are logged in the WAL (called the redo log in InnoDB).

That means WAL and redo log are two different terms for the same log. There is no difference.

The log buffer is an allocation in RAM. All writes to the redo log are saved in the log buffer first, because it's very fast to save some data in RAM. A transaction could be made of many changes affecting many individual rows, and writing to disk for every one of these rows would be too slow. So changes on their way to the redo log are saved in the log buffer first. Periodically, a group of changes in the log buffer are saved to disk, in the redo log. This happens when:

  • You commit a transaction
  • The log buffer is full (the log buffer has a fixed size)
  • Every 1 second regardless of whether the log buffer is full

The double-write buffer has a totally different purpose. It is actually a segment of the InnoDB tablespace on disk, not in RAM (I think it's confusing that the term "buffer" is used for storage in both RAM and disk).

The purpose of the double-write buffer is to prevent data corruption from partial page writes, while modified pages are copied from the innodb buffer pool to the tablespace. That is, if MySQL Server were to crash while InnoDB is writing a given page to disk, it could overwrite a page on disk partially. Even with the redo log, there would be no way to recover this page.

So InnoDB writes first every page to a small subset of the tablespace called the doublewrite buffer. Once it has finished writing that page, it can then save the page again to the proper page in the tablespace. If this fails partially, it's okay because the page has also been written to the doublewrite buffer. Once the page has been saved to its proper location in the tablespace, the copy of that page in the doublewrite buffer is not needed, and it can be overwritten the next time there's a page flush from the buffer pool.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • 2
    Thanks for very great detail answer. I have one more question: if the redo log/write buffer log is saved before committing a transaction, then the transaction is reverted, how does mysql handle it? It is outside of my question, but I am thrilled if I can get the answer elsewhere. – Trần Kim Dự Jul 01 '19 at 08:34
  • 2
    Have you read https://dev.mysql.com/doc/refman/8.0/en/innodb-multi-versioning.html ? – Bill Karwin Jul 01 '19 at 13:38
  • 2
    You might also like to read [High Performance MySQL](http://shop.oreilly.com/product/0636920022343.do), which is a well-written book that covers a lot of these concepts. I used to work for the authors. – Bill Karwin Jul 01 '19 at 13:39
  • awesome, double-write-buffer prevents data corruption from partial page writes, then what if double-write-buffer has data corruption ? – http8086 Sep 18 '19 at 12:33
  • 1
    All pages, including the double-write buffer, have page checksums. If there's data corruption in a double-write buffer page, it will be detected. The good news is that the change can be recovered because the change is still in the InnoDB redo log, and the original page to apply the change to is still in the tablespace. – Bill Karwin Sep 18 '19 at 21:34
  • 1
    In addition to this, I'm sharing this link which has additional information: http://dimitrik.free.fr/blog/archives/2011/01/mysql-performance-innodb-double-write-buffer-redo-log-size-impacts-mysql-55.html. Example: "[...] Even having transactional logging [...] your database is still not protected from half-written pages. This may still happen during a server crash [...] when the page write operation was interrupted in the middle. So, you'll get a corrupted data, and it will be impossible to repair it from the redo log as **there are only changes saved within redo and not a whole page**." – Rarylson Freitas Dec 26 '19 at 20:27
  • 1
    @user3824832: The redo log is not used to apply changes to the data file on disk. Writing to the data file is only done by copying modified pages from the buffer pool to the respective page on disk. The redo log file is used only to reproduce modified pages in the buffer pool after a crash. A modified page that was not able to be copied to disk must be recreated by reading the original page from disk into the buffer pool, then applying redo log changes to the page in the buffer pool. Then the modified page can be flushed back to disk. – Bill Karwin May 25 '20 at 18:49