
I have read up on this and could not clearly understand...

Having (for the purpose of shortening this question)

DECLARE @temp VARCHAR(4001); 
--update: 4001 is for example only to avoid varchar(MAX) discussions
-- I am aware of 8000
SET @temp = 'a';  

does SQL Server reserve 4001 bytes for @temp, so that the remaining 4000 of its 4001 bytes cannot be reused for ANY other purpose until @temp is disposed?
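
A quick check with DATALENGTH shows that the value itself occupies only 1 byte, though of course this alone does not tell me what is reserved internally:

DECLARE @temp VARCHAR(4001);
SET @temp = 'a';
SELECT DATALENGTH(@temp); -- 1: only the actual value's bytes are reported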

If it doesn't, then (update: sorry, this was a rhetorical question; I believe it did not change the sense of my questions):

  • Why does SQL Server need to know the maximum size of varchar values?
  • Why can't a string bigger than 4001 bytes be (re)assigned to @temp?
    update: if 4001 bytes are not pre-allocated or reserved for @temp
  • How does SQL Server use this size?

The same questions apply to a table containing one varchar(4001) column.
What should I call the 4001 in varchar(4001) in order to avoid ambiguity with varchar(max),
i.e. without using the word "maximum"?
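
For a table, DATALENGTH likewise reports per-row sizes that follow the stored values rather than the declared 4001 (a minimal check; the table name is arbitrary):

CREATE TABLE t (c VARCHAR(4001));
INSERT INTO t VALUES ('a');
INSERT INTO t VALUES ('abc');
SELECT c, DATALENGTH(c) FROM t; -- 1 and 3 bytes respectively
DROP TABLE t;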

[1], for example, hints that it is the column size:

The varchar data type is a variable-length data type. Values shorter than the size of the column... [1]

from which I could have understood that this size is reserved, i.e. that the column has a fixed size (across all rows), regardless of the shorter values stored there.
Correct? If not, then what is the size of a varchar(4001) column, using the terminology from [1]?

Update
I wanted (by specifying 4001) to keep this topic focused on in-row value storage (and away from, for example, SQL Server's varchar(max), text, etc. types) as well as from other topics not specifically asked about here.

The questions are not about how to use n in varchar(n) but about how it is used (by SQL Server).

I welcome specialists in other DBMSs, since these should be DBMS-agnostic questions, though the concrete figures are given in the context of SQL Server 2000 (or 2005+ without varchar(MAX)).

Update2:
According to [2], the length prefix used when storing table data in SQL Server is always 2 bytes. Why not just use varchar() and let SQL Server do its job with the actual size?
Besides, I (intentionally) based my question on a variable.
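
Note that omitting n is not the same as letting SQL Server decide: in a variable declaration, plain varchar is treated as varchar(1) (and as varchar(30) in CAST/CONVERT), e.g.:

DECLARE @v VARCHAR; -- no length given: treated as varchar(1)
SET @v = 'abc';
SELECT @v; -- result is: a
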
Also, the digit specified really truncates the data:

DECLARE @temp VARCHAR(2); 
SET @temp = 'aaaaaa';
select @temp
 -- result is: aa
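
For a table column, by contrast, the same oversized value is rejected rather than silently truncated (assuming the default SET ANSI_WARNINGS ON):

CREATE TABLE t (c VARCHAR(2));
INSERT INTO t VALUES ('aaaaaa');
-- Msg 8152: String or binary data would be truncated.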

[1] MSDN, "Using char and varchar Data", http://msdn.microsoft.com/en-us/library/ms175055.aspx
[2] MSDN, "Estimating the Size of a Heap", http://msdn.microsoft.com/en-us/library/ms189124.aspx

2 Answers


How is n in varchar(n) used?

It's implementation-dependent.

In SQLite, n is completely ignored.
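
For instance, this sketch stores and returns the full string despite the declared limit:

CREATE TABLE t (c VARCHAR(2));
INSERT INTO t VALUES ('aaaaaa');
SELECT c FROM t; -- 'aaaaaa': the declared length is not enforced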

In PostgreSQL, varchar(n) is essentially equivalent to TEXT with the constraint CHECK (LENGTH(TheColumn) <= n). There is no performance advantage to specifying a maximum size.
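
A minimal illustration, running the same experiment against PostgreSQL:

CREATE TABLE t (c VARCHAR(5)); -- roughly: c TEXT CHECK (LENGTH(c) <= 5)
INSERT INTO t VALUES ('aaaaaa');
-- ERROR:  value too long for type character varying(5)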

In MySQL, n determines the size of the length prefix, so VARCHAR(255) uses 1 byte to store the length, and VARCHAR(65535) uses 2 bytes.
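
The prefix size is internal and cannot be queried directly, so this is only an annotated sketch of the rule (assuming a single-byte character set, since the threshold is measured in bytes):

CREATE TABLE t (
  a VARCHAR(255), -- values need at most 255 bytes: 1-byte length prefix
  b VARCHAR(256)  -- values may exceed 255 bytes: 2-byte length prefix
);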

For MS SQL Server, see the question [Why not use] varchar(max) everywhere?

– dan04
  • Unfortunately, the "[Why not use] varchar(max) everywhere?" question doesn't have any definitive answers, just a bunch of speculation. Does anybody have a definitive answer for SQL Server? – Gabe Oct 16 '10 at 18:00

Why would one choose tinyint rather than integer? If the data for a given column has 15 discrete values, then tinyint, obviously.
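
For example, tinyint rejects out-of-range values outright:

DECLARE @x TINYINT;
SET @x = 300;
-- Msg 220: Arithmetic overflow error for data type tinyint, value = 300.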

The same applies to varchar: it tells SQL Server what values/lengths are expected in this column. SQL Server throws an error if the data would be truncated.

You could apply the same argument to NULL/NOT NULL, foreign keys, CHECK constraints, etc.: they are all there to keep your data correct. See "declarative referential integrity".

For example, I'd want to disallow someone from storing 500k of XML in my 100-byte plain-text name column just because they can. If someone did succeed, what do you think would happen to the other clients that expected 100 bytes maximum?

It's also important for storage efficiency. It's OK to declare "String" for a C# object that you instantiate, use, and discard, and that stays in memory (mostly) for its short life. Persisting a billion rows of "string" is an unnecessary overhead.

One may go further and ask: why use varchar at all? Why not use nvarchar everywhere? Again, I have a table that stores currency codes and is approaching a billion rows. nchar(3) vs char(3) costs me an extra 3 GB of storage (+ indexes + row-length changes).
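
The arithmetic, assuming a round billion rows: nchar(3) takes 6 bytes per value and char(3) takes 3, a difference of 3 bytes per row:

SELECT 3 * 1000000000 / 1024.0 / 1024 / 1024 AS extra_gb; -- ~2.79, i.e. roughly 3 GB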

Summary: it's a constraint on your data.

– gbn
  • Thinking of `VARCHAR(n)` as a constraint on the length makes sense (although I personally *hate* maximum-width fields), but I don't see why it has an advantage for storage efficiency. Isn't the whole point of `VARCHAR` that its storage size is determined by the size of the actual data rather than the declaration? – dan04 Oct 17 '10 at 16:51
  • @dan04: I've assumed one aspect of the question: "why can't I have a .NET-like string and not worry about n". Over a billion rows this is a lot of uncertainty, given how you'd store up to 2GB per row *without* any constraints on the length – gbn Oct 17 '10 at 17:07