3

I see some developers make fields like VARCHAR(50) / VARCHAR(100), and other developers make fields as VARCHAR(64) / VARCHAR(128) / VARCHAR(256).

Is there any reason, other than personal preference, to use 2^n numbers to specify field length when creating a VARCHAR field vs. a round number, assuming that both of the field lengths can support the incoming data?

Lil' Bits
  • 898
  • 2
  • 9
  • 24
  • Many computery types do think of e.g. 128 or 256 as "round number"s – Damien_The_Unbeliever Feb 26 '18 at 15:33
  • 1
    It's a programmer superstition that using powers of 2 might align better in memory or in disk. It's not true but most people just do it anyway. – apokryfos Feb 26 '18 at 15:34
  • It lines up better with the word size of the computer architecture, so it might make it slightly more efficient/store more data for the same computational cost. – Aiden Grossman Feb 26 '18 at 15:34
  • 2
    Ideally, you pick neither `10n` nor `2^n` but instead research the data type and pick a size that is *appropriate* for the type of data being stored - e.g. if there are laws or standards that are applicable to the data type, you pick the size that matches what the standard says. – Damien_The_Unbeliever Feb 26 '18 at 15:36
  • https://stackoverflow.com/q/8295131/330315 –  Feb 26 '18 at 15:37
  • No. In fact, the powers-of-2 superstition becomes actively harmful if used with non-variable length types, because SQL Server cares about *pages* of data (or at worst *rows*) not whether individual columns have some sort of "nice round size". So this likely leads to overallocation (and hence extra I/O) without doing anything for, say, alignment. In theory you can carefully align rows as a whole; in practice this would require preternatural design skills and discipline, taking things like the null bitmap into account as well. Let the engine handle that stuff. – Jeroen Mostert Feb 26 '18 at 15:39
  • @apokryfos is that the same for MySQL NDB Cluster? – Aleksey Solovey Feb 26 '18 at 15:39
  • @AlekseySolovey It probably doesn't make a difference there either. There isn't even any indication that the entire data will be stored on the same device let alone the same block. I don't know, I haven't developed NDB to know its inner workings but in the very least a VARCHAR does not take the maximum allocated space and only the string length so the OS will take responsibility in optimally storing it anyway. – apokryfos Feb 26 '18 at 15:53

1 Answers1

1

There is no advantage or disadvantage, the number simply defines the maximum length of the stored string.

MJH
  • 1,710
  • 1
  • 9
  • 19
  • 1
    I disagree with there being no disadvantage. The length of a column is an implicit constraint. By rounding up, you're relaxing the constraint with possible implications to data integrity. Take, for example, a column defined to hold a US ZIp Code. Since it can be either 5 or 10 characters, varchar(10) makes sense. But, if you "round up" to the nearest power of 2, you get a varchar(16). Now you can add anything up to and including 16 characters worth of data. Anything between 11 and 16 characters is invalid and should be rejected from being in the database at all. – Ben Thul Feb 26 '18 at 20:29
  • Ben, you are of course correct, if you know what the length of the column should be then setting a sensible max length makes sense. However this post is asking about e.g. VARCHAR(100) vs VARCHAR(128) so it's relatively safe to assume these columns are set to a "safe" length to prevent truncation errors etc. – MJH Feb 26 '18 at 22:24