2

I tried to google this, but any results I found were related to importing data from a txt file to populate the database as opposed to storing data.

To me, it seems strange that the contents of a file should be stored in a database. We're working on building an eCommerce site, and each item has a description. I assumed the standard would be to store the description in a txt file and the URL in the database, and not to store the huge contents in the database to keep the file size low and speeds high. When you need to store images in a database, you reference it using a URL instead of storing all the pixel data - why would text be any different? That's what I thought, but everyone seems to be arguing about VARCHAR vs TEXT, so what really is the best way to store text data up to 1000 characters or so? Thanks!

Josh Lyness
  • 102
  • 1
  • 12

1 Answers1

2

Whether you store long text data or image data in a database or in external files has no right or wrong answer. There are pros and cons on both sides—despite many developers making unequivocal claims that you should store images outside the database.

Consider you might want the text data to:

  • Allow changes to be rolled back.
  • Support transaction isolation.
  • Enforce SQL access privileges to the data.
  • Be searchable in SQL when you create a fulltext index.
  • Support the NOT NULL constraint, so your text is required to contain something.
  • Automatically be included when you create a database backup (and the version of the text is the correct version, assuring consistency with other data).
  • Automatically transfer the text to replica database instances.

For all of the above, you would need the text to be stored in the database. If the text is outside the database, those features won't work.

With respect to the VARCHAR vs. TEXT, I can answer for MySQL (though you mentioned VARCHAR(MAX) so you might be referring to Microsoft SQL Server).

  • In MySQL, both VARCHAR and TEXT max out at 64KB in bytes. If you use a multibyte character set, the max number of characters is lower.

  • Both VARCHAR and TEXT have a character set and collation.

  • VARCHAR allows a DEFAULT, but TEXT does not.

  • Internally the InnoDB storage engine, VARCHAR and TEXT are stored identically (as well as VARBINARY and BLOB and all their cousins). See https://www.percona.com/blog/2010/02/09/blob-storage-in-innodb/

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • ahh sorry I was reading an SQL answer on this site to a similar question when I wrote this, I do use MySQL. I've been using UTF8MB4 so the max number of characters is much lower. A lot of the disadvantages you mentioned about using a text file can be implemented by us, things like backups, and I don't see a case where a large amount of text would need to be searchable during a SQL query. It's kinda a big problem we're facing, there's no real easy option here. 64kb is nothing to play around with, especially for product descriptions. Thanks, I'll have a think and come up with something. – Josh Lyness Nov 07 '17 at 22:55
  • doing the math, 64kb is a lot more than it seems... 16k characters is more than enough for product descriptions. Also I wasn't aware that MySQL 4.x measured VARCHAR's by bytes rather than characters. Thankfully 5.x measures it in characters! That's where I got the VARCHAR(MAX) from :) – Josh Lyness Nov 07 '17 at 23:01
  • 1
    In MySQL 4.0, they didn't support multibyte characters, so bytes and characters is equivalent. In MySQL 4.1, they started counting the string length in characters. Note that both 4.0 and 4.1 are years past their end-of-life. – Bill Karwin Nov 07 '17 at 23:12
  • I didn't know that, I assumed it was version 5.0 and above that started allowing multibyte characters. Yep we use 10.1 so we're pretty safe, thanks for clarifying. – Josh Lyness Nov 08 '17 at 17:03
  • 1
    10.1 sounds like a MariaDB version number. MariaDB is a fork based on the MySQL 5.5 source, circa 2010. – Bill Karwin Nov 08 '17 at 17:32