59

Every time I create a table, I wonder whether there is any performance difference between declaring a column nvarchar(100) or nvarchar(1000), assuming the actual string will always be shorter than 100 characters. So is there?

jeteon
ren

7 Answers

65

According to the documentation:

nvarchar [ ( n | max ) ]

Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size, in bytes, is two times the actual length of data entered + 2 bytes.

So, only the actual length of the data entered matters when calculating the storage size.

The documentation doesn't say why it's there, but the length parameter is useful because it enforces simple limit constraints (for example, so someone can't enter 2 GB of text as their "name").
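If you want to verify this yourself, here is a minimal sketch (the table name and values are made up); DATALENGTH reports the bytes actually stored, and it comes out the same regardless of the declared maximum:

    -- Hypothetical demo table: two columns with different declared maximums
    CREATE TABLE dbo.LengthDemo (
        ShortCol nvarchar(100),
        LongCol  nvarchar(1000)
    );

    INSERT INTO dbo.LengthDemo (ShortCol, LongCol)
    VALUES (N'hello', N'hello');

    -- Both return 10 (5 characters * 2 bytes); the declared length plays no part
    SELECT DATALENGTH(ShortCol) AS ShortBytes,
           DATALENGTH(LongCol)  AS LongBytes
    FROM   dbo.LengthDemo;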

Brendan Long
  • 1
    Indeed, if there's a separate reason why something should be constrained then that's a good reason to add a constraint. Maybe there's a business rule that says "data element X may never be more than Y characters." Or perhaps there's a constraint on another part of the system and, to maintain data integrity throughout the system as a whole, the constraint should be maintained throughout the system. – David Apr 02 '12 at 21:11
  • 7
    Correct. Do note that nvarchar(max) is stored differently and should not simply be used as a replacement for nvarchar(x), though. The storage space is the same, but it is stored more like a BLOB than an inline field. – Andrew Barber Apr 02 '12 at 21:17
  • 2
    I just verified @AndrewBarber's comment by tweaking the tests at http://rusanu.com/2010/03/22/performance-comparison-of-varcharmax-vs-varcharn/ to use nvarchar instead of varchar. – StriplingWarrior Mar 05 '13 at 21:25
  • I'm reminded by the answer by @HLGEM that there is a tipping point - after a size of 4000 - where nvarchar's behavior changes. The docs don't make it clear what happens if you declare a size of 4001 or larger. – Darren Griffith Dec 04 '14 at 17:32
  • 4
    The wording on the linked documentation page has changed (page edit date is 2018-10-22) to "The storage size is two times n bytes + 2 bytes" which contradicts this answer. I'm hoping that this is a case of erroneous or misleading documentation, can anyone confirm? – Joe Lee-Moyet Feb 22 '19 at 14:32
  • @JoeLee-Moyet Using SQL Server 2016 at least, and getting the DATALENGTH() of the field, nvarchar(y) is returning 2n bytes for n characters in Unicode range 0-65,535. Compare that to using nchar(y) which returns 2y bytes for n characters in Unicode range 0-65,535. I don't know where the +2 has gone, I assume DATALENGTH returns the data of the field itself and the +2 is some pointer that SQL Server must store somewhere. – David Jacobsen Apr 05 '22 at 15:43
22

The reason you should not use nvarchar(1000) when you need nvarchar(10) is to help prevent bad data from being entered into your database. Unless you like it when phone numbers contain things like 'call the fat secretary, not the cute one, if you want a real answer' (a not-so-random example I once found in an actual customer file sent to us). Users figure out pretty quickly which fields are big enough to store notes in, which tends to make the data in those fields useless over time. See the sketch below for how a tight length acts as a guard.
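For illustration, a sensible declared length is itself a cheap guard, and a CHECK constraint can tighten it further; this sketch uses a hypothetical schema:

    -- Hypothetical table: the length cap plus a CHECK keep notes out of Phone
    CREATE TABLE dbo.Customers (
        CustomerId int IDENTITY PRIMARY KEY,
        Phone      nvarchar(20) NOT NULL,  -- room for country code and extension
        CONSTRAINT CK_Customers_Phone CHECK (Phone NOT LIKE N'%[^0-9 +()-]%')
    );

    -- Fails with a truncation error instead of silently storing a note
    INSERT INTO dbo.Customers (Phone)
    VALUES (N'call the fat secretary if you want a real answer');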

And as for nvarchar(max), it is a bad idea to use it at all unless you expect to have over 4,000 characters. Look up indexing and varchar(max) to see why.

HLGEM
  • 2
    On the other hand, I consider this only the "last sanity check" in most cases. That is, for a telephone number I would probably use `varchar(20)` (I like to round to 10, 20, 100, 200) to account for different area/country codes, extensions and such. There should usually be *additional* data validation constraints (part of the DB, DAL, or otherwise) to make sure the data conforms with whatever business rules are in place. –  Apr 02 '12 at 21:39
18

Yes, it matters from a performance point of view.

The query optimizer looks at this metadata when it plans the query. It estimates the row size based on the declared length, and that estimate can cause performance problems. For example, when you need to sort a varchar(10) column, the optimizer may plan an in-memory sort, while the same query against a varchar(1000) column may be given a much larger memory grant or spill to disk.
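A rough way to see this for yourself (table names are made up): load identical data into a narrow and a wide declaration and compare the estimated row size and memory grant in the two actual execution plans.

    -- Same data, different declared widths; the optimizer assumes a
    -- variable-length column is about half full, so the estimates differ
    CREATE TABLE dbo.Narrow (val varchar(10));
    CREATE TABLE dbo.Wide   (val varchar(1000));

    INSERT INTO dbo.Narrow VALUES ('abc'), ('xyz');
    INSERT INTO dbo.Wide   SELECT val FROM dbo.Narrow;

    -- Compare the memory grants in the actual plans for these two sorts
    SELECT val FROM dbo.Narrow ORDER BY val;
    SELECT val FROM dbo.Wide   ORDER BY val;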

I try to use domain knowledge to estimate the required size, and then leave some room for future maintenance. For example, if you think your data will have at most 50 characters, use varchar(70) instead of varchar(50) so the column can handle unpredictable future changes in application usage.

I got to know about it from this blog post (I am NOT the author): http://aboutsqlserver.com/2010/08/18/what-is-the-optimal-size-for-variable-width-columns/


NOTE: Don't choose smaller lengths blindly, either. Changing a field's size later can become a big maintenance headache. I remember choosing a small length for a LastName field, and some users could not sign up in the system because of it. We had to update a critical in-use database (it takes time to increase a field length), recompile the program, and redeploy. If I had chosen a proper field size, I could have avoided all of those headaches.
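For what it's worth, the schema change itself is a one-liner (the table and column names here are placeholders), but it needs a schema-modification lock on a live table, and every layer that hard-coded the old limit still has to be found and redeployed:

    -- Widening the column; restate NOT NULL or the column may become nullable
    ALTER TABLE dbo.Users ALTER COLUMN LastName nvarchar(100) NOT NULL;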

You may also want to read about the differences between nvarchar(max) and nvarchar(n): n is capped at 4,000, and anything larger must be declared as nvarchar(max), which is stored differently. (Are there any disadvantages to always using nvarchar(MAX)?)

mohghaderi
9

As for size versus performance, remember that SQL Server stores only the actual value for nvarchar/varchar, but the whole declared width for nchar/char. For example, an nvarchar(1000) column holding the 9-character string 'test data' initially takes 9*2 = 18 bytes of space, while an nchar(1000) column takes 1000*2 bytes (2,000 bytes) no matter what.
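You can check that arithmetic quickly with variables:

    DECLARE @v nvarchar(1000) = N'test data';
    DECLARE @c nchar(1000)    = N'test data';

    -- 18 bytes for the variable type; 2000 for the fixed type, space-padded
    SELECT DATALENGTH(@v) AS VariableBytes,
           DATALENGTH(@c) AS FixedBytes;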

SQL Server then goes on its merry way, adding the next rows to the page (which is 8 KB) until the page meets (or comes close to) the fill factor set for the table, and then it starts a new page. Now suppose a user updates that row and enters something with some substance in the previously small field, say something 800 characters long. The value now has to grow significantly, but the page is full, so the page must split to make way for the data (unless the fill factor is low enough to allow for the growth).

Those page splits accumulate as index fragmentation and result in slower search/seek times and longer update times. So there may be a real performance impact if the data changes significantly.
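You can watch that fragmentation accumulate with the physical-stats DMV (the object name below is a placeholder):

    -- avg_fragmentation_in_percent climbs as page splits pile up
    SELECT index_id, avg_fragmentation_in_percent, page_count
    FROM   sys.dm_db_index_physical_stats(
               DB_ID(), OBJECT_ID(N'dbo.MyTable'), NULL, NULL, 'LIMITED');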

As is often the case, the answer is: "depends".

Jonas G. Drange
phil vuollet
  • So you are saying that nvarchar(1000) will reserve 1000*2 bytes of space, even if the content takes 40 bytes only? Because I thought it would not. – Didier A. Sep 03 '13 at 18:47
3

At least in SQL Server, you are not allowed to create a unique constraint on a column of type nvarchar(max). The column has to be limited to nvarchar(450) (900 bytes) or less for the constraint to be added successfully.
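A quick sketch of both cases (table names are made up):

    -- Fails: nvarchar(max) is not valid as an index key column
    CREATE TABLE dbo.T1 (val nvarchar(max) UNIQUE);

    -- Works: 450 characters * 2 bytes = 900 bytes, the classic key-size limit
    CREATE TABLE dbo.T2 (val nvarchar(450) UNIQUE);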

Bronek
  • 1
    And not allowed to create primary keys when the total size of the columns exceeds 900 bytes, which can easily be the case if you use two nvarchar columns with max size, say, 450 as composite primary key. – Varvara Kalinina Jun 09 '15 at 16:30
2

Since nvarchar is a variable-length data type, it stores only the data you assign to it (2 bytes per character) plus 2 bytes of length information, and it is primarily useful for languages with double-byte characters, like Chinese.

Personally, I use varchar(n) when I know of a specific limitation (i.e. a URL query string limit, a file path size limit, or my own limit). I use varchar(max) when the maximum length is undefined and could go beyond 8,000 characters. And I almost never use nvarchar, primarily because our application will never go international.

Chris Gessler
0

There is some confusion in the documentation about the actual size occupied by varchar(n) and nvarchar(n) fields. @Brendan Long probably quoted text from the Microsoft documentation that has since been changed. Today that documentation says that the actual size occupied by a varchar(n) or nvarchar(n) field depends only on the value of n, and not on the actual length of the stored string. Another strange thing is that the documentation for the varbinary(n) data type states that the actual storage size for this type is the actual length of the data entered plus 2 bytes. This information seems to be incorrect or inconsistent.

cigien
  • 1
    Do you have a link to the documentation you are referring to? If so it would be good to include a link to it in your answer. – Ryan White May 02 '22 at 14:34