3

Char and varchar are datatypes in SQL, as they are in many other languages(So this question could be multi-language).

From what I understand, the difference is that if I declared a Char as Char(20) it would allocate 20 (bytes/bits) [Could someone clarify this bit too? For now, I'll use bytes.]. Then if I only used 16 bytes, I would still have four allocated to that field. (A waste of 4 bytes of memory.)

However, if I declared a varchar as varchar(20) and only used 16 bytes, it would only allocate 16 bytes.

Surely this is better? Why would anyone choose char? Is it foe legacy reasons, or is there something I'm missing?

KidCode
  • 3,990
  • 3
  • 20
  • 37
  • 1
    Damn, I didn't see that. I wonder why it didn't come up when I was asking. At least I have an answer, Thank you! – KidCode Oct 25 '14 at 20:16
  • 1
    To answer your other question, `Char(20)` would allocate enough space for 20 characters. How much space that is depends on your what charset you're using, but regular English text (`latin1` is the example in the docs) requires only 1 bytes per character. So it would be 20 bytes in a `latin1` charset, but might be more if you're using a non-English language. – Jason Baker Oct 25 '14 at 20:19
  • For **SQL Server**: `varchar(n)` always carries at least 2 bytes overhead - so `varchar(2)` isn't using 0-2 bytes - but 2-4 bytes - while `char(2)` **always** uses just 2 bytes (e.g. for two-chars country codes). So if you store 16 characters of text in a `varchar(20)` column, it uses 18 bytes of storage – marc_s Oct 25 '14 at 20:27
  • 1
    @Jason the discussion would have to shift to nchar/nvarchar for that to be relevant... – Aaron Bertrand Oct 25 '14 at 20:29

2 Answers2

12

Prefer VARCHAR.

In olden days of tight storage, it mattered for space. Nowadays, disk storage is cheap, but RAM and IO are still precious. VARCHAR is IO and cache friendly; it allows you to more densely pack the db buffer cache with data rather than wasted literal "space" space, and for the same reason, space padding imposes an IO overhead.

The upside to CHAR() used to be reduced row chaining on frequently updated records. When you update a field and the value is larger than previously allocated, the record may chain. This is manageable, however; databases often support a "percent free" setting on your table storage attributes that tells the DB how much extra space to preallocate per row for growth.

VARCHAR is almost always preferable because space padding requires you to be aware of it and code differently. Different databases handle it differently. With VARCHAR you know your field holds only exactly what you store in it.

I haven't designed a schema in over a decade with CHAR.

codenheim
  • 20,467
  • 1
  • 59
  • 80
  • 1
    Storage is cheap, but who cares? This isn't about wasting disk space. It's about wasting I/O and precious space in memory. (Memory, while also cheap, is often much harder to expand after the fact than storage space.) – Aaron Bertrand Oct 25 '14 at 20:28
  • I said "that is no longer the reason", it was mentioned for historical means. Though I should have mentioned the optimal buffer / IO characteristics of varchar as well, slipped my mind. – codenheim Oct 25 '14 at 20:30
  • You still seemed to be focused on just the storage today: `Nowadays, storage is cheap.` It's a common adage that bears correcting. – Aaron Bertrand Oct 25 '14 at 20:32
  • Storage is storage. When I said storage, I mean space, no matter where it sits. You read it as disk but thats one kind of storage. Granted, I should and will rewrite it, but in my mind I was thinking space. – codenheim Oct 25 '14 at 20:42
  • Well, even today, space isn't cheap in memory, nor when you have to move a lot more 8kb pages in I/O either. So I still disagree with your premise. – Aaron Bertrand Oct 25 '14 at 20:43
  • The premise in your comment ("I mean space, no matter where it sits.") – Aaron Bertrand Oct 25 '14 at 20:54
  • Ah. Well I was trying to communicate to the OP that there are space / physical issues as well as coding/logic issues to be understood. That is what I meant. I didn't intend to get so detailed on the answer, but that was a mistake. – codenheim Oct 25 '14 at 20:58
  • 1
    We're peers, trying to better each other's answers so that the OP (and future readers) are best informed. I thought your original answer was murky and still think your comments could be taken as misleading. Don't be so defensive. P.S. re-read my first comment and consider the difference between "but who cares about the cost of the storage?" (which is what I meant) and "but who cares what you say?" (which I believe is how you took it). Again, don't be so defensive. I was merely trying to clarify that there is a *big* difference between storage on disk and storage in memory. – Aaron Bertrand Oct 25 '14 at 21:03
  • You need to get used to the Internet. It is definitely not the South. – Aaron Bertrand Oct 25 '14 at 21:09
  • Because people say "but who cares?" and assume that the reader will give them the benefit of the doubt that it was relevant to the rest of that sentence, and not flipping you off? Again, *shrug*, don't know how to help you. – Aaron Bertrand Oct 25 '14 at 21:17
2

FROM Specification

char[(n)]

Fixed-length non-Unicode character data with length of n bytes. n must be a value from 1 through 8,000. Storage size is n bytes. The SQL-92 synonym for char is character.

So Char(20) will allocate fixed 20 Bytes space to hold the data.

Usage:

For example if you have a column named Gender and you want to assign values like only M for Male (OR) F for female and you are sure that the field/column are non-null column . In such case, it's much better to define it as CHAR(1) instead like

Gender CHAR(1) not null

Also, varchar types carries extra overhead of 2 bytes as stated in document . The storage size is the actual length of the data entered + 2 bytes.

In case of char that's not the case.

Rahul
  • 76,197
  • 13
  • 71
  • 125
  • Downvoter: care to leave a comment along with the downvote? – Rahul Oct 25 '14 at 20:47
  • 1
    Not me but if I had to guess it's because you've just regurgitated the documentation and not really addressed the main question - you've said that for gender it's "much better" to define it as `CHAR(1)` but haven't bothered to explain why. – Aaron Bertrand Oct 25 '14 at 20:56
  • Ahh! Yes but I kind a explained by giving the example. With that example what I mean when you know about the size in early and the size is going to be fixed, use `char` – Rahul Oct 25 '14 at 21:03
  • 2
    You're still not coming anywhere near explaining *why*... – Aaron Bertrand Oct 25 '14 at 21:04
  • I don't think so except the point about overhead carried by `varchar` type which already mentioned by Marc_s in comment. – Rahul Oct 25 '14 at 21:10
  • You mean the comment he left 5 minutes after you posted your answer? [Comments are not permanent. And don't assume that the reader of your answer will even see the comment, never mind read and understand it, even in the case it still exists](http://meta.stackexchange.com/a/140710/165455). I'm not going to argue with you about this. You asked why someone might have down-voted your answer, I've told you my opinion, you can do something with the information I've given, or not. I'm not the person you need to convince. – Aaron Bertrand Oct 25 '14 at 21:13
  • Aaron, Thanks for pointing that. Edited my answer to include that point. – Rahul Oct 25 '14 at 21:33