4

When running queries while using myisam engine, because its not transactional, long queries (as far as I understand) don't affect the data from other queries.

In InnoDB, one of the things it warns is to avoid long queries. When InnoDB snapshots, is it snap shotting everything?

The reason I am asking this is: say a query for whatever reason takes a longer time than normal and eventually rolls back. Meanwhile, 200 other users have updated or inserted rows into the database. When the long query rolls back, does it also remove the updates/inserts that were made by the other users? or are the rows that involved the other users safe, unless they crossed over with the one that gets rolled back?

jeremycole
  • 2,741
  • 12
  • 15
Lawrence Cooke
  • 1,567
  • 3
  • 26
  • 52

2 Answers2

5

Firstly, I think that it would be useful as background to read up on multi-version concurrency control (MVCC) as a background to this answer.

InnoDB implements MVCC, which means it can use non-locking reads for regular SELECT. This does not require creating a "snapshot" and in fact InnoDB doesn't have any real concept of a snapshot as an object. Instead, every record in the database keeps track of its own version number and maintains a "roll pointer" to an "undo log" record (which may or may not still exist) which modifies the row to its previous version. If an older version of a record is needed, the current version is read and those roll pointers are followed and undo records applied until a sufficiently old version of the record is produced.

Normally the system is constantly cleaning up those undo logs and re-using the space they consume.

Any time any long-running transaction (note, not necessarily a single query) is present, the undo logs must be kept (not purged) in order to sufficiently recreate old enough versions of all records to satisfy that transaction. In a very busy system, those undo logs can very quickly accumulate to consume gigabytes of space. Additionally if specific individual records are very frequently modified, reverting that record to an old enough version to satisfy the query could take very many undo log applications (thousands).

That is what makes "long-running queries" expensive and frowned upon. They will increase disk space consumption for keeping the undo logs in the system tablespace, and they will perform poorly due to undo log record application to revert row versions upon read.

Some databases implement a maximum amount of undo log space that can be consumed, and once they have reached that limit they start throwing away older undo log records and invalidating running transactions. This generates a "snapshot too old" error message to the user. InnoDB has no such limit, and allows accumulation indefinitely.

jeremycole
  • 2,741
  • 12
  • 15
2

Whether your queries affect concurrency or not have to do with the types of queries. Having many read queries won't affect concurrency in MyISAM or InnoDB (besides performance issues).

Inserts (to the end of an index with InnoDB, or the end of a table with MyISAM) also don't impact concurrency.

However, as soon as you have an update query, rows get locked in InnoDB, and with MyISAM, it's the entire table that gets write locked. When you try to update a record (or table) that has a write lock, you must wait until the lock is released before you can proceed. In MyISAM, updates are served before reads, so you have to wait until the updates are processed.

MyISAM can be more performant because table locks are faster than record locks (though record locks are fast). However, when you start making a significant number of updates, InnoDB is generally preferred because different users are generally not likely to contend for the same records. So, with InnoDB, many users can work in parallel without affecting each other too much, thanks to the record level locking (rather than table locks).

Not to mention the benefit of full ACID compliance that you get with InnoDB, enforcement of foreign key constraints, and the speed of clustered indexes.

Snapshots (log entries) are kept long enough to complete the current transaction and are discarded if they are rolled back or committed. The longer a transaction runs, the more likely it is that other updates will occur, which grows the number of log entries required to roll back.

There will be no "cross-over" due to locking. When there is write contention for the same records, one user must wait until the other commits or rolls back.

You can read more about The InnoDB Transaction Model and Locking.

Marcus Adams
  • 53,009
  • 9
  • 91
  • 143
  • 2
    There's no physical manifestation of a snapshot per se, there's only old versions of the rows that have been updated. Long-running transactions inhibit the garbage-collection of those old rows. I.e. if any outstanding transaction still needs to "see" the database in the state it was when that transaction started (and the transaction is repeatable-read), then the old row versions necessary to complete its view of the state of data cannot be purged. The longer the transaction, the more data tends to accumulate in the rollback segment, assuming updates continue to happen. – Bill Karwin Jan 15 '14 at 15:02
  • I updated my answer. Let me know if something's still not accurate. – Marcus Adams Jan 15 '14 at 15:08
  • Yep, that looks better. – Bill Karwin Jan 17 '14 at 02:06