23

I need to store on average a paragraph of text, which would be about ~800 characters in the database. In some rare cases it may go up to 2000-2500~ characters. I've read the manual and I know there are many of these questions already, but I've read over 10+ questions on stackoverflow and I still find it a bit hard to figure out whether I should simply use text or something like varchar(2000). Half seem to say use varchar, while the other half say text. Some people say always use text if you have more than 255 characters (yea, this was after 5.0.3 which allowed varchar up to 65k). But then I thought to myself if I were to use text everytime the characters were over 255, then why did mysql bother increasing the size at all if that was always the best option?

They both have a variable size in storage I've read, so would there be no difference in my situation? I was personally leaning towards varchar(2000) then I read that varchar stores the data inline while text doesn't. Does this mean that if I constantly select this column, storing the data as varchar would be better, and conversely if I rarely select this column then using text would be better? If that is true, I think I would now choose the text column as I won't be selecting this column many of the times I run a query on the table. If it matters, this table is also frequently joined to as well (but won't be selecting the column), would that also further the benefit of using text?

Are my assumptions correct that I should go with text in this case?

Joker
  • 1,133
  • 7
  • 18
  • 22
  • 2
    Think about it in terms of memory usage. VARCHAR fields are part of the row, where TEXT columns are just pointed to on disk by an address in the row. Your decision will affect the query cache (how many result sets can fit in that memory), the table cache (how many tables can fit in that memory), how many rows fit in various per-query buffers, etc. – Dan Grossman Feb 19 '11 at 21:48

2 Answers2

18

When a table has TEXT or BLOB columns, the table can't be stored in memory. This means every query (which doesn't hit cache) has to access the file system - which is orders of magnitude slower than the memory.

Therefore you should store this TEXT column in a seperate table which is only accessed when you actually need it. This way the original table can be stored in memory and will be much faster.

Think of it as separating the data into one "memory table" and one "file table". The reason for doing this is to avoid accessing of the filesystem except when neccessary (i.e. only when you need the text).

You don't earn anything by storing the text in multiple tables. You still have to access the file system.

Sorry what I meant was for example, a forum script, in the posts table they might be >storing 20 columns of post data, they also store the actual post as a text field in the >same table. So that post column should be separated out?

Yes.

It seems weird to have a table called post, but the actual post isn't stored there, maybe >in another table called "actual_post" not sure lol.

You can try (posts, post_text) or (post_details, posts) or something like that.

I have a tags table that has just three fields, tag_id, tag, and description. So that >description column should also be separated out? So I need a tags table and a >tags_description table just to store 3 columns?

If the description is a TEXT column and you run queries against this table that doesn't need the description it would certainly be preferable.

  • Sorry, what do you mean when you say, "tables using TEXT can't be in-memory but requires file I/O." So does that mean even if I wanted to use text in this case I couldn't (if I kept it in the same table). – Joker Feb 19 '11 at 22:17
  • Thanks for explaining in detail. Although if it is as drastic as it sounds, doesn't this mean that every text column should be stored in a separate table since even a column with less than 2000 chars needs to be separated? Do you have any examples of when a text column should be kept in the same table of relevant data? – Joker Feb 19 '11 at 22:44
  • Sorry what I meant was for example, a forum script, in the posts table they might be storing 20 columns of post data, they also store the actual post as a text field in the same table. So that post column should be separated out? It seems weird to have a table called post, but the actual post isn't stored there, maybe in another table called "actual_post" not sure lol. I have a tags table that has just three fields, tag_id, tag, and description. So that description column should also be separated out? So I need a tags table and a tags_description table just to store 3 columns? – Joker Feb 19 '11 at 22:59
  • Why do people rarely do this then? Even in a lot of commercial applications that care a lot about performance don't seem to do this. The funny thing is I used to do this, but later changed to adding everything to the main table. Now I should go back to what I did before... Is this also something you should do right from the outset or keep the data in the same table and only if you see problems then try separating them out? Wouldn't I just be better off to use varchar(2000) in the main table instead? – Joker Feb 19 '11 at 23:30
  • 2
    I guess it just seems weird to me, imagine if you had 50 tables, and 30 of those tables need descriptions, now you need 30 more tables that just have 1 column to hold each description. – Joker Feb 19 '11 at 23:31
  • 2
    I was going to see if there were any more opinions. But wouldn't I just be better off to use varchar(2000) in the main table instead of all the hassle? It also means when I update or delete records I also need to do it multiple tables. I also thought that using text is also better if you don't plan to select it out all the time, but you said the opposite. I also noticed you said for uncached queries it would be much slower, but in general most queries will become cached right? – Joker Feb 21 '11 at 20:17
  • @Joker You can of course use VARCHAR(2000) and chance later if the performance sucks. However, if you sometimes SELECT from the table but don't need the paragraph of text, you would be better off storing it in a seperate table. –  Feb 21 '11 at 22:30
  • 7
    When you said, "When a table has TEXT or BLOB columns, the table can't be stored in memory" I don't think it's true. I just read from mysql.com that only if you actually select the value you get a performance penalty because it uses disk. Which is why they say avoid SELECT *. So if your queries don't actually select the text column I don't think it will be as bad as you say. Which is why you got me confused because as I said in my first post I thought if you don't select the value out as much text is actually better than varchar. – Joker Feb 22 '11 at 21:21
  • 4
    My understanding was that TEXT columns are stored separately anyway, so storing them in a separate table is pointless. In fact, there is a question about this here and people reply in this favour: http://stackoverflow.com/questions/1849269/mysql-table-with-text-column - what are your thoughts on this? Do you still stand by what you have said here? Trying to work out which is right, thanks! – Mike Jun 21 '12 at 21:53
1

I think you summarized it well. Another thing you could consider is just moving the "text" to another table... and join back to the master record. That way every time you are actually using the master table, that extra data of where the "text" is isn't even taking up space in the master record. When you need it you can join to that table. This way you can store it as a varchar just in case you want to do something like " where text like... "

John Sobolewski
  • 4,512
  • 1
  • 20
  • 26
  • I used to do it this way, but I was told otherwise as unless I was doing many queries per second it's better to store it in the same table. I also figured 2000 characters of data wouldn't call for putting it in a separate table. Another case of when some say one thing and another says something else. :P Maybe I should keep the column in the table for now and if I ever have problems move it out? – Joker Feb 19 '11 at 22:15
  • The answers to [this question](http://stackoverflow.com/questions/1849269/mysql-table-with-text-column) seem to suggest that we should not be storing in another table. Thoughts anybody? @Joker? Just trying to get a solid answer on this. Cheers. – Mike Jun 21 '12 at 21:54