17

While searching trough SO, I've found two contradicting answers (and even a comment that stated that) but no definitive answer:

The problem is: is there any performance benefit, if you store a TEXT/BLOB field outside of a table?

We assume:

  • You SELECT correctly (only selection the TEXT/BLOB if required, no SELECT *)
  • Tables are indexed properly, where it makes sense (so it's not a matter of 'if you index it')
  • The database design doesnt really matter. This is a question to identify the MySQL behaviour in this special case, not to solve certain database design problems. Let's assume this Database has only one table (or two, if the TEXT/BLOB gets separated)
  • used engine: innoDB (others would be interesting too, if they fetch different results)

This post states, that putting the TEXT/BLOB into a separate table, only helps if you're already SELECTing in a wrong way (always SELECTing the TEXT/BLOB even when it's not necessary) - basically stating, that TEXT/BLOB in the same table is basically the better solution (less complexity, no performance hit, etc) since the TEXT/BLOB is stored seprately anyway

The only time that moving TEXT columns into another table will offer any benefit is if there it a tendency to usually select all columns from tables. This is merely introducing a second bad practice to compensate for the first. It should go without saying the two wrongs is not the same as three lefts.

MySQL Table with TEXT column


This post however, states that:

When a table has TEXT or BLOB columns, the table can't be stored in memory

Does that mean that it's already enough to have a TEXT/BLOB inside a table, to have a performance hit?

MySQL varchar(2000) vs text?


My Question basically is: What's the correct answer?

Does it really matter if you store TEXT/BLOB into a separate table, if you SELECT correctly?

Or does even having a TEXT/BLOB inside a table, create a potential performance hit?

Community
  • 1
  • 1
Katai
  • 2,773
  • 3
  • 31
  • 45
  • 1
    Any specific storage engine you're using? InnoDB/MyISAM/NDB, etc. – gertvdijk Dec 17 '12 at 13:23
  • @gertvdijk I added it - I'm especially interested in a solution regarding InnoDB - but basically, if there is a difference, MyISAM and other storage engines would be interesting too – Katai Dec 17 '12 at 15:32

2 Answers2

19

Update: Barracuda is the default InnoDB file format since version 5.7.

If available on your MySQL version, use the InnoDB Barracuda file format using

innodb_file_format=barracuda

in your MySQL configuration and set up your tables using ROW_FORMAT=Dynamic (or Compressed) to actually use it.

This will make InnoDB to store BLOBs, TEXTs and bigger VARCHARs outside the row pages and thus making it a lot more efficient. See this MySQLperformanceblog.com blog article for more information.

As far as I understand it, using the Barracuda format will make storing TEXT/BLOB/VARCHARs in separate tables not valid anymore for performance reasons. However, I think it's always good to keep proper database normalization in mind.

gertvdijk
  • 24,056
  • 6
  • 41
  • 67
  • `barracuda` seems to be the default value, right? – XCS Mar 25 '21 at 12:19
  • 1
    The answer was written in 2012 and the default has changed in more recent versions. [The innodb_file_format default value was changed to Barracuda in MySQL 5.7.](https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_file_format) – gertvdijk Mar 25 '21 at 21:35
  • 1
    Thanks for the response, I noticed the answer was old. So this means that by default there is no reason to store BLOB/TEXT in a separate table. – XCS Mar 26 '21 at 13:00
6

One performance gain is to have a table with fixed length records. This would mean no variable length fields like varchar or text/blob. With fixed length records, MySQL doesn't need to "seek" the end of a record since it knows the size offset. It also knows how much memory it needs to load X records. Tables with fixed length records are less prone to fragmentation since space made available from deleted records can be fully reused. MyISAM tables actually have a few other benefits from fixed length records.

Assuming you are using innodb_file_per_table, keeping the tex/blob in a separate table will increase the likelihood that the file system caching will be used since the table will be smaller.

That said, this is a micro optimization. There are many other things you can do to get much bigger performance gains. For example, use SSD drives. It's not going to give you enough of a performance boost to push out the day of reckoning when your tables get so big you'll have to implement sharding.

You don't hear about databases using the "raw file system" anymore even though it can be much faster. "Raw" is when the database accesses the disk hardware directly, bypassing any file system. I think Oracle still supports this. But it's just not worth the added complexity, and you have to really know what you are doing. In my opinion, storing your text/blob in a separate table just isn't worth the added complexity for the possible performance gain. You really need to know what you are doing, and your access patterns, to take advantage of it.

Brent Baisley
  • 12,641
  • 2
  • 26
  • 39
  • Yeah, but what does it mean for a table that actually has a TEXT inside of it? If MySQL stores the TEXT externally, it should basically only be a pointer to that external storage, so it should be about the same as storing the TEXT in a separate table, manually. That's basically the question, does that already happen automatically? (separate storing) What I'm trying to find out, if it's nonsense or not, to do it manually. How is that automatic 'reference' to the TEXT handled? – Katai Dec 24 '12 at 10:38
  • There is a difference between MySQL storing the TEXT "external" and storing TEXT externally in a separate file. MySQL stores it "external" within the file, just like index and data are stored in the same file with Innodb. If you store it in a separate table, it's a separate file. It's not nonsense to do it manually, but you are not getting even a 1% performance boost. It won't even be measurable until you have many millions of records. – Brent Baisley Dec 24 '12 at 19:02