223

In multiple courses, books, and jobs, I have seen text fields defined as VARCHAR(255) as kind of the default for "shortish" text. Is there any good reason that a length of 255 is chosen so often, other than being a nice round number? Is it a holdout from some time in the past when there was a good reason (whether or not it applies today)?

I realize, of course, that a tighter limit would be more ideal, if you somehow know the maximum length of the string. But if you are using VARCHAR(255) that probably indicates that you don't know the max length, only that it is a "shortish" string.


Note: I found this question (varchar(255) v tinyblob v tinytext), which says that VARCHAR(n) requires n+1 bytes of storage for n<=255, n+2 bytes of storage for n>255. Is this the only reason? That seems kind of arbitrary, since you would only be saving two bytes compared to VARCHAR(256), and you could just as easily save another two bytes by declaring it VARCHAR(253).

Kip

10 Answers

216

255 is used because it's the largest number of characters that can be counted with an 8-bit number. It maximizes the use of the 8-bit count, without frivolously requiring another whole byte to count the characters above 255.

When used this way, VarChar only uses the number of bytes + 1 to store your text, so you might as well set it to 255, unless you want a hard limit (like 50) on the number of characters in the field.
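
The length-prefix rule described above can be sketched in a few lines. This is a simplified illustration (the function name `varchar_overhead` is hypothetical, and real engines vary in the details), assuming the common scheme where the prefix is one byte when the declared maximum fits in 8 bits and two bytes otherwise:

```python
def varchar_overhead(declared_max: int) -> int:
    """Bytes of length-prefix overhead for VARCHAR(declared_max), under the
    common rule: 1 byte if the declared max fits in 8 bits, else 2 bytes."""
    return 1 if declared_max <= 255 else 2

# The prefix size depends on the declared maximum, not the stored value:
print(varchar_overhead(255))  # 1
print(varchar_overhead(256))  # 2
print(varchar_overhead(253))  # 1 -- no smaller than VARCHAR(255)
```

Note that declaring a smaller maximum like 253 buys nothing: the prefix is one byte either way, which is why 255 is the natural cutoff.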

Robert Harvey
  • 178,213
  • 47
  • 333
  • 501
  • 8
    Does this hold true for DBs where varchars are UTF-8? – antak Jul 17 '15 at 02:27
  • 2
    @antak: In MySQL, using InnoDB, any key column cannot be larger than 767 bytes. If a VARCHAR column is UTF8 (meaning each char might take up to 3 bytes), the maximum allowed length of the column is floor(767/3) = 255. I'm assuming "767" was chosen for exactly that reason. – BlueRaja - Danny Pflughoeft Oct 08 '16 at 20:12
  • 3
    **If the charset is `utf8`**, `varchar(85)` is the limit over which crossing tips the *length byte* from one to two bytes. If it's `utf8mb4`, it's `varchar(63)`. These are significant because they're the maximum to which a [VARCHAR's length can be extended through the use of online ALTER TABLE](https://dev.mysql.com/doc/refman/5.7/en/innodb-create-index-overview.html#innodb-online-ddl-column-properties). Consequently, I derived those numbers by creating a table with a `varchar(2) charset utf8` column and seeing how far I was able to extend it given `ALGORITHM=INPLACE`. – antak Apr 11 '17 at 07:57
  • 1
    It makes even more sense when you consider that many "databases" Back In The Day were stored on magnetic tape. It was very common to read data in "blocks" that were sized in multiples of two. This way, data was stored most efficiently (and when you were running on an old mainframe, small efficiencies like that were make-it-or-break-it optimizations). – TMN Jan 14 '19 at 18:52
129

Historically, 255 characters has often been the maximum length of a VARCHAR in some DBMSes, and it sometimes still winds up being the effective maximum if you want to use UTF-8 and have the column indexed (because of index length limitations).
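
The index-limit arithmetic referred to here is easy to check. Assuming InnoDB's classic 767-byte index-key limit and MySQL's `utf8` charset reserving 3 bytes per character (4 for `utf8mb4`), as discussed in the comments below:

```python
INDEX_KEY_LIMIT = 767        # InnoDB's classic index-key limit, in bytes
BYTES_PER_CHAR_UTF8 = 3      # MySQL "utf8" (utf8mb3) reserves 3 bytes/char
BYTES_PER_CHAR_UTF8MB4 = 4   # utf8mb4, i.e. full 4-byte UTF-8

# Longest fully indexable VARCHAR under each charset:
print(INDEX_KEY_LIMIT // BYTES_PER_CHAR_UTF8)     # 255
print(INDEX_KEY_LIMIT // BYTES_PER_CHAR_UTF8MB4)  # 191
```

So floor(767 / 3) = 255 for `utf8`, and floor(767 / 4) = 191 for `utf8mb4`, which is why the 191 figure also shows up in discussions of indexed UTF-8 columns.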

chaos
  • ms access suggests this length by default, for example – matt eisenberg Aug 02 '09 at 00:03
  • "winds up being the effective maximum length of a varchar ..." what's the justification for this? If you had left out the word effective, I would buy it. With that word, you imply that in those DBMSs you refer to, you can have larger columns, but they are ineffective. This is not true; I am not even sure what it means. – Charles Bretana Feb 17 '16 at 13:00
  • 4
    @CharlesBretana: if you read the rest of the sentence you quoted, you will find the exact explanation you are requesting. – chaos Feb 17 '16 at 16:07
  • Respectfully, no it doesn't. What does the specific value 256 have to do with the effectiveness of indices? Why not 254, or 355, or 257? ... and, since B-tree indices consist of the same columns as the tables, the length of index columns must be limited by the same constraint placed on column length. – Charles Bretana Feb 17 '16 at 16:47
  • @CharlesBretana: If your maximum index length is 767 bytes, and you're using a fake version of UTF-8 that reserves 3 bytes per character, your maximum indexable VARCHAR is 255 characters. All of which was precisely the case in MySQL last I looked. As to your statement regarding indexes and columns necessarily having the same length limit, your conclusion is demonstrably wrong (you can have fake-3-byte UTF-8 VARCHARs in MySQL longer than 255 characters, you just can't index them fully), so I imagine one of your premises must also be wrong. – chaos Feb 21 '16 at 11:51
  • @chaos, I confess I do not know what a "fake UTF-8" character is, so until I do some research I cannot speak to that, but regardless of what it might be, it did not exist when the 255 character limit was the norm on databases column widths. And (although as I said I'm not sure what a "fake UTF-8 is or how it works), I would find it astounding that the technical details about that have anything at all to do with why there was a (specifically) 255 character limit on column lengths, or why people still use this value as a default when it is unnecessary. – Charles Bretana Feb 21 '16 at 15:37
  • 3
    @CharlesBretana: By "fake UTF-8" I mean MySQL's "utf8" encoding, which as I mentioned reserves (and is limited to) 3 bytes per character. This isn't a very good version of UTF-8; if you want decent UTF-8 in MySQL, you have to use its "utf8mb4" encoding. But people are much more likely to not know that and go with "utf8", and much more likely to want UTF-8 than any other encoding, so, presto, they wind up with a maximum indexable length of 255 characters in a VARCHAR. Your astoundment notwithstanding. – chaos Feb 22 '16 at 19:02
    @chaos, I just reviewed some docs on this... at `https://serversforhackers.com/mysql-utf8-and-indexing` As I read this (I'm not a MySQL guy, so this is, admittedly, new to me), there is still nothing in there that specifies a specific limit of 255 characters. In fact, as I read this, for three-byte utf8 columns, it says you should limit your varchars to 191 characters. So (unless I am misreading this), I again ask you, what does this have to do with the historical tendency to use varchar(255) in specifying column datatypes? – Charles Bretana Feb 22 '16 at 20:25
  • 4
    @CharlesBretana: I have now explained it three times and not a single thing has changed. MySQL's index length limit is still 767 bytes, the number of bytes needed to encode a 3-byte UTF-8 character is still 3, and floor(767 / 3) is still 255. Your determination to find something to be confused about beggars belief. – chaos Feb 23 '16 at 21:34
  • 1
    @CharlesBretana (Sorry for being late to this whole party) I'm no DB specialist, but I think what chaos is saying is: yes, a 'Fake UTF-8' column can be more than 255 characters long, but the index will only work on the first 255 characters of the varchar, making that effectively the maximum for a column if you want it fully indexed. Now that's only what I understood of his explanations; I may be wrong, I'm not an expert in SQL indexes at all. – Noémie Lord Apr 06 '17 at 15:43
  • 1
    @Francis, Even if it were true that indices could, for some database products, at some point in database history, only handle 255 characters, any such index limitation would be irrelevant to this discussion. The historical reasons for the 255 character limit on column length cannot, obviously, have been retroactively affected by index limitations on database products designed years after those length limitations became common. The 255 character limit has been in common usage since the early 80s, when the first relational databases were created. *BASIC* UTF wasn't created til 92 – Charles Bretana Apr 06 '17 at 16:10
  • 2
    @CharlesBretana If you look properly at Chaos' answer, you'll notice it is separated into 2 parts: 1. The historical reason behind Varchar(255) being so common (it used to be the maximum on some older DBMSs), 2. Even today, it is still a limitation for some because of the index limitations discussed previously. Part 1 and 2 are not linked. Part 1 is the actual answer to the question; part 2 is a side note which is still relevant to the question because it explains why even today it may still be a limitation. (CONTINUED ->) – Noémie Lord Apr 06 '17 at 18:03
  • @CharlesBretana (Continuation) Nowhere is he stating that the index limitations of today's system are the historical reason for Varchar(255) being so prevalent. – Noémie Lord Apr 06 '17 at 18:05
  • Perhaps, but, although I have not reached into the minds of all those who use `varchar(255)` as a matter of course, as stated in the ops question, "... in multiple courses, books, and jobs..." without understanding (or knowledge of why), I would have no hesitation arguing that the very great majority of them are not doing it because of index limitations. ... and the ops question, is, again, (my italics), "Is there .. good reason that a length of 255 is *chosen so often*, ... " Not whether there's a good reason now, in any specific circumstances, but why is it so often chosen. – Charles Bretana Apr 06 '17 at 20:34
  • *maximum index length is 767 bytes* that's a MySQL/InnoDB thing. That doesn't for example happen on PostgreSQL. `CREATE TABLE foo ( bar varchar(255) PRIMARY KEY ); INSERT INTO foo (bar) SELECT repeat('', 255);` – Evan Carroll Jul 11 '17 at 16:18
  • 1
    @CharlesBretana "In fact, as I read this, for three-char utf8 columns, it says you should limit you varchars to 191 characters" you probably figured this out already by now, but the serversforhackers author was using utf8mb4 when making this suggestion, since that is what uses 1 to 4 bytes (unlike the utf8 charset which only supports up to 3 bytes) – georaldc Oct 26 '18 at 00:21
  • Right, a large part of the confusion here is due to MySQL restricting what it incorrectly terms as 'UTF-8' [to three bytes](https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-utf8.html) rather than the four that it actually can be. "utf8 is an alias for the utf8mb3 character set" – ijoseph May 10 '20 at 22:12
26

Probably because both SQL Server and Sybase (to name two I am familiar with) used to have a 255-character maximum for a VARCHAR column. For SQL Server, this changed with version 7.0 in the late 1990s... but old habits sometimes die hard.

Charles Bretana
22

I'm going to answer the literal question: no, there isn't a good reason you see VARCHAR(255) used so often (there are indeed reasons, as discussed in the other answers, just not good ones). You won't find many examples of projects that have failed catastrophically because the architect chose VARCHAR(300) instead of VARCHAR(255). This would be an issue of near-total insignificance even if you were talking about CHAR instead of VARCHAR.

MusiGenesis
  • 1
    1 byte out of 255 is 0.4%. Sometimes you care about the last half a percent or so. Sometimes you don't. If your hosting and perf costs run into the tens of dollars, you probably don't care. If they run into the millions, they probably do. – Edward Brey Oct 14 '17 at 14:58
  • It's more about standards, in the same vein as _linting_, than it is about preventing catastrophic failures. If you also think there are no good reasons for linting, then this view would be consistent with that. – Jon May 11 '23 at 03:21
16

2^8 is 256, but counting in computing starts from 0, so the largest value you can store in 8 bits is 255. You can see the same number in an IP address or in a subnet mask.

255 is the maximum value of an 8-bit integer: 11111111 = 255

Does that help?

exec.-
  • 1
    With integers, you count starting from 0 and you end at 255. But with places in a string, you count starting from the 1st place, so doesn't it make sense to end at the 256th place, because you started at 1 instead of 0? I'm not agreeing with varchar(256) entirely just yet, because of string_length() results, but I really am not certain. – HoldOffHunger Mar 16 '16 at 23:18
  • 1
    @HoldOffHunger strings in a database can have a length of zero characters, so the permissible range of lengths when the length is stored in eight bits is between 0 and 255. If you wanted to say that strings all must have at least one character then you could support 256-character strings with an eight-bit length. – phoog Jan 14 '19 at 19:44
9

Note: I found this question (varchar(255) v tinyblob v tinytext), which says that VARCHAR(n) requires n+1 bytes of storage for n<=255, n+2 bytes of storage for n>255. Is this the only reason? That seems kind of arbitrary, since you would only be saving two bytes compared to VARCHAR(256), and you could just as easily save another two bytes by declaring it VARCHAR(253).

No, you don't save two bytes by declaring 253. The implementation of a varchar is most likely a length counter plus a variable-length, non-terminated array. This means that if you store "hello" in a VARCHAR(255) you will occupy 6 bytes: one byte for the length (the number 5) and 5 bytes for the five letters.
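
That layout can be sketched in a few lines of Python. This is a hypothetical illustration of a length-prefixed encoding (the function name `pack_varchar` is made up, and real storage engines add their own row headers and padding):

```python
def pack_varchar(value: str, declared_max: int = 255) -> bytes:
    """Hypothetical length-prefixed layout: a 1-byte count followed by the
    raw bytes. Assumes declared_max <= 255 and a single-byte encoding."""
    data = value.encode("ascii")
    assert len(data) <= declared_max <= 255
    return bytes([len(data)]) + data

stored = pack_varchar("hello")
print(stored)       # b'\x05hello'
print(len(stored))  # 6 -- one length byte plus five letters
```

The declared maximum never appears in the stored bytes; it only determines how wide the length counter has to be.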

Stefano Borini
  • 3
    This statement is not true of all databases. Many databases use varchar fields of the given size in the tables so that they don't have to move rows around when that field is changed for a row. – SingleNegationElimination Aug 01 '09 at 22:11
  • Yes, you are right; it's implementation-dependent. You have to check the vendor manual to see which is the case. – Stefano Borini Dec 10 '09 at 17:48
  • 5
    It may be permissible, but implementing `VARCHAR` that way defeats the whole *point* of using `VARCHAR` instead of `CHAR`. – dan04 Sep 07 '10 at 03:00
4

An unsigned 1-byte number can contain the range [0-255] inclusive. So when you see 255, it is mostly because programmers think in base 10 (get the joke?) :)

Actually, for a while, 255 was the largest size you could give a VARCHAR in MySQL, and there are advantages to using VARCHAR over TEXT with indexing and other issues.

gahooa
4

In many applications, like MS Office (until version 2000 or 2002), the maximum number of characters per cell was 255. Moving data between programs capable of handling more than 255 characters per field and those applications was a nightmare. Nowadays, the limit is less and less of a hindrance.

4

0000 0000 → this is an 8-bit binary number. Each digit represents a bit.

You count like so:

0000 0000 → (0)

0000 0001 → (1)

0000 0010 → (2)

0000 0011 → (3)

Each bit can be one of two values: on or off. The highest number that can be represented is given by multiplication:

2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 - 1 = 255

Or

2^8 - 1. 

We subtract one because the first number is 0.

A range of 0-255 covers quite a bit (no pun intended) of values.

As we use more bits, the max value goes up exponentially, so for many purposes adding more bits is overkill.
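
The counting above is easy to verify directly; a quick check of the 8-bit maximum and how the range grows with more bits:

```python
# Highest value representable in n bits is 2**n - 1 (we subtract one
# because counting starts at 0):
print(2 ** 8 - 1)    # 255
print(0b11111111)    # 255 -- eight ones, matching the listing above
print(2 ** 16 - 1)   # 65535 -- each extra byte multiplies the range by 256
```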

ScottyBlades
1

Another reason may be that in very old data access libraries on Windows, such as RDO and ADO (the COM version, not ADO.NET), you had to call a special method, GetChunk, to get data from a column with more than 255 chars. If you limited a varchar column to 255, this extra code wasn't necessary.

Booji Boy