708

What is meant by nvarchar?

What is the difference between char, nchar, varchar, and nvarchar in SQL Server?

PeterAllenWebb
MrDatabase

12 Answers

972

Just to clear up... or sum up...

  • nchar and nvarchar can store Unicode characters.
  • char and varchar cannot store Unicode characters.
  • char and nchar are fixed-length: they reserve storage space for the number of characters you specify, even if you don't use all of that space.
  • varchar and nvarchar are variable-length: they only use space for the characters you store, and do not reserve storage like char or nchar.

nchar and nvarchar will take up twice as much storage space, so it may be wise to use them only if you need Unicode support.
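A quick way to see all four behaviours is to compare DATALENGTH (bytes used) across the types. A minimal sketch, assuming a single-byte collation (as a later answer shows, some collations use more than one byte per varchar character):

```sql
-- Minimal sketch, assuming a single-byte (e.g. Latin1) collation.
DECLARE @c   char(10)     = 'abc',
        @vc  varchar(10)  = 'abc',
        @nc  nchar(10)    = N'abc',
        @nvc nvarchar(10) = N'abc';

SELECT DATALENGTH(@c)   AS char_bytes,      -- 10: fixed length, space-padded
       DATALENGTH(@vc)  AS varchar_bytes,   --  3: only the characters stored
       DATALENGTH(@nc)  AS nchar_bytes,     -- 20: fixed length, 2 bytes per character
       DATALENGTH(@nvc) AS nvarchar_bytes;  --  6: 2 bytes per character stored
```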

shA.t
Brian Kim
  • char and varchar aren't meant to store Unicode, but with some additional coding tricks and extra logic you can still misuse a [var]char field for Unicode storage. – Wim ten Brink Nov 02 '09 at 10:41
  • Whether or not the `n...` versions take up twice as much storage space is collation-dependent, [as my answer shows](http://stackoverflow.com/q/8250586/73226). – Martin Smith Nov 23 '11 at 23:48
  • What's the advantage of reserving storage? – mlissner Feb 20 '13 at 00:11
  • On the last point: the Unicode types nchar and nvarchar are still better in most cases: better collation, flexibility for users, and fewer future compatibility issues. And storage space is not really an issue here, since using collations without Unicode is a lot of hassle, and memory prices will continue to fall. – Jaison Varghese May 24 '13 at 21:05
  • Actually, neither char/varchar nor nchar/nvarchar can store the full range of Unicode characters, so it's misleading to say that one can store Unicode characters while the other can't. A fixed two-bytes-per-character encoding (SQL Server uses one called UCS-2 for nchar and nvarchar) is not large enough to cover the full Unicode range. See my answer for more details. – PeterAllenWebb Jun 27 '13 at 05:57
  • However, with [n]varchar you can store fixed-length strings if you want to, like varchar(20). So what's the difference between char(20) and varchar(20)? – Ben Caine Jun 20 '14 at 15:00
  • @BenCaine char(20) will use 20 bytes (assuming an 8-bit collation); varchar(20) will use len(data)+2 bytes, i.e. 22 for 20 bytes of data, but only 12 for 10 bytes of data. The extra two bytes record the length. If your data will always be the full length, use a char, as it saves space and may be faster. And don't ever use a varchar(1), or indeed anything smaller than a varchar(4): a single character in varchar format uses three bytes, so a char(3) will never use more space than a varchar(3). – Richard Gadsden Jul 03 '14 at 15:31
  • @RichardGadsden Wouldn't an empty-string varchar(3) take up less space than an empty-string char(3)? 2 bytes vs 3 bytes? – andrewb Sep 23 '14 at 02:29
  • @andrewb Technically, but you'd more usually have a null, which is no bytes at all either way. – Richard Gadsden Sep 23 '14 at 07:14
  • One other case: when trying to port data from a program (like an ASP.NET Web API) into SQL Server, a column set to char will not let you update or insert the current type into it. Is this incorrect, or is it still possible to, say, cast the string to char? Example: a field of "StateAbbrv char(2)" on the server and `public string StateAbbrv {get;set;}` in the app. – Edward Mar 22 '17 at 02:39
  • What version of Unicode do nchar and nvarchar support in the different versions of SQL Server? – Alexey Shevelyov Aug 30 '21 at 16:21
  • https://en.wikipedia.org/wiki/List_of_Unicode_characters So I can store _any_ of those characters? Including emojis? – carloswm85 Jun 03 '22 at 14:34
109

All the answers so far indicate that varchar is single byte and nvarchar is double byte. The first part of this actually depends on collation, as illustrated below.

DECLARE @T TABLE
(
    C1 VARCHAR(20) COLLATE Chinese_Traditional_Stroke_Order_100_CS_AS_KS_WS,
    C2 NVARCHAR(20) COLLATE Chinese_Traditional_Stroke_Order_100_CS_AS_KS_WS
);

INSERT INTO @T
    VALUES (N'中华人民共和国', N'中华人民共和国'),
           (N'abc', N'abc');

SELECT C1,
       C2,
       LEN(C1)        AS [LEN(C1)],
       DATALENGTH(C1) AS [DATALENGTH(C1)],
       LEN(C2)        AS [LEN(C2)],
       DATALENGTH(C2) AS [DATALENGTH(C2)]
FROM   @T;

Returns

[Image: result grid showing LEN and DATALENGTH for each column; some characters in the VARCHAR column appear as ?]

Note that some of the characters were still not representable in the VARCHAR version and were silently replaced with ?.

In fact there are no Chinese characters at all that can be represented by a single byte in that collation. The only single-byte characters are the typical Western ASCII set.

Because of this, an insert from an nvarchar(X) column into a varchar(X) column can fail with a truncation error (where X denotes the same number in both instances).

SQL Server 2012 adds SC (Supplementary Character) collations that support UTF-16. In these collations a single nvarchar character may take 2 or 4 bytes.

Community
Martin Smith
  • 5
    The kind of answer I was looking for. Also to save time to the likes of me - the non-english text translates to "People's Republic of China" https://translate.google.com/#auto/en/%E4%B8%AD%E5%8D%8E%E4%BA%BA%E6%B0%91%E5%85%B1%E5%92%8C%E5%9B%BD – Igand Mar 14 '18 at 13:17
43

nchar and char operate in pretty much exactly the same way as each other, as do nvarchar and varchar. The only difference between the pairs is that nchar/nvarchar store Unicode characters (essential if you require extended character sets) whilst char/varchar do not.

Because Unicode characters require more storage, nchar/nvarchar fields take up twice as much space (so, for example, in earlier versions of SQL Server the maximum size of an nvarchar field is 4,000 characters).

This question is a duplicate of this one.

Community
Luke Bennett
  • 6
    You forget one thing: nchar uses a fixed-length so nchar(10) always needs to receive ten characters. And varchar(10) is indeed Unicode and will accept any number of characters, up to 10 characters. Also see http://msdn.microsoft.com/en-us/library/ms186939.aspx – Wim ten Brink Nov 02 '09 at 10:29
39

Just to add something more: nchar pads the data with trailing spaces; nvarchar does not.

So, if you are going to filter your dataset by an nchar field, you may want to use RTRIM to remove the spaces. E.g. an nchar(10) field called BRAND stores the word NIKE. It adds 6 spaces to the right of the word, so when filtering, the expression should read: RTRIM(Fields!BRAND.Value) = "NIKE"
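A small sketch of that padding (the BRAND/NIKE value is just the example above). Note that in T-SQL itself, `=` comparisons ignore trailing spaces, so the trimming matters mainly in expressions like the report filter shown:

```sql
DECLARE @brand nchar(10) = N'NIKE';

SELECT DATALENGTH(@brand)        AS padded_bytes,   -- 20: padded to 10 characters * 2 bytes
       DATALENGTH(RTRIM(@brand)) AS trimmed_bytes;  --  8: 4 characters * 2 bytes
```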

Hope this helps someone out there because I was struggling with it for a bit just now!

Dimuthu
28

My attempt to summarize and correct the existing answers:

First, char and nchar will always use a fixed amount of storage space, even when the string to be stored is smaller than the available space, whereas varchar and nvarchar will use only as much storage space as is needed to store that string (plus two bytes of overhead, presumably to store the string length). So remember, "var" means "variable", as in variable space.

The second major point is that nchar and nvarchar store strings using exactly two bytes per character, whereas char and varchar use an encoding determined by the collation code page, which will usually be exactly one byte per character (though there are exceptions; see below). By using two bytes per character, a very wide range of characters can be stored, so the basic thing to remember is that nchar and nvarchar tend to be a much better choice when you want internationalization support, which you probably do.

Now for some finer points.

First, nchar and nvarchar columns always store data using UCS-2. This means that exactly two bytes per character will be used, and any Unicode character in the Basic Multilingual Plane (BMP) can be stored by an nchar or nvarchar field. However, it is not the case that any Unicode character can be stored. For example, according to Wikipedia, the code points for Egyptian hieroglyphs fall outside of the BMP. There are, therefore, Unicode strings that can be represented in UTF-8 and other true Unicode encodings that cannot be stored in a SQL Server nchar or nvarchar field, and strings written in Egyptian hieroglyphs would be among them. Fortunately your users probably don't write in that script, but it's something to keep in mind!

Another confusing but interesting point that other posters have highlighted is that char and varchar fields may use two bytes per character for certain characters if the collation code page requires it. (Martin Smith gives an excellent example in which he shows how Chinese_Traditional_Stroke_Order_100_CS_AS_KS_WS exhibits this behavior. Check it out.)

UPDATE: As of SQL Server 2012, there are finally collations that support UTF-16, for example Latin1_General_100_CI_AS_SC, which can truly cover the entire Unicode range.
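As a hedged illustration of that last point: a character outside the BMP is stored in nvarchar as a surrogate pair (two UTF-16 code units), and only the `_SC` collations count it as one character:

```sql
-- Sketch: U+1F600, a supplementary character outside the BMP, in nvarchar.
DECLARE @s nvarchar(10) = N'😀';

SELECT DATALENGTH(@s) AS bytes,   -- 4: two UTF-16 code units of 2 bytes each
       LEN(@s)        AS chars;   -- 2 under a non-SC collation, 1 under an _SC collation
```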

PeterAllenWebb
15
  • char: fixed-length character data with a maximum length of 8,000 characters.
  • nchar: fixed-length Unicode data with a maximum length of 4,000 characters.
  • char uses 8 bits (one byte) per character.
  • nchar uses 16 bits (two bytes) per character.
sth
ss.
  • `char` couldn't have an 8-bit length. It doesn't have to store the length, and the fixed length can be up to 8000 characters. – John B. Lambe Jul 16 '19 at 10:36
14

nchar[(n)] (national character)

  • Fixed-length Unicode string data.
  • n defines the string length and must be a value from 1 through 4,000.
  • The storage size is two times n bytes.

nvarchar [(n | max)] (national character varying.)

  • Variable-length Unicode string data.
  • n defines the string length and can be a value from 1 through 4,000.
  • max indicates that the maximum storage size is 2^31-1 bytes (2 GB).
  • The storage size, in bytes, is two times the actual length of data entered + 2 bytes.

char [(n)] (character)

  • Fixed-length, non-Unicode string data.
  • n defines the string length and must be a value from 1 through 8,000.
  • The storage size is n bytes.

varchar [(n | max)] (character varying)

  • Variable-length, non-Unicode string data.
  • n defines the string length and can be a value from 1 through 8,000.
  • max indicates that the maximum storage size is 2^31-1 bytes (2 GB).
  • The storage size is the actual length of the data entered + 2 bytes.
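The per-type figures above can be checked directly with DATALENGTH. A sketch (note that DATALENGTH reports only the data bytes; the "+ 2 bytes" is the length prefix in row storage for the variable-length types):

```sql
CREATE TABLE #t (c char(8), v varchar(8), nc nchar(8), nv nvarchar(8));
INSERT INTO #t VALUES ('abcde', 'abcde', N'abcde', N'abcde');

SELECT DATALENGTH(c)  AS char_bytes,     --  8: n bytes
       DATALENGTH(v)  AS varchar_bytes,  --  5: actual length
       DATALENGTH(nc) AS nchar_bytes,    -- 16: two times n
       DATALENGTH(nv) AS nvarchar_bytes  -- 10: two times the actual length
FROM #t;

DROP TABLE #t;
```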
Rasel
12

nchar(10) is a fixed-length Unicode string of length 10. nvarchar(10) is a variable-length Unicode string with a maximum length of 10. Typically, you would use the former if all data values are 10 characters and the latter if the lengths vary.

Jason Kresowaty
10

nchar requires more space than nvarchar whenever the stored string is shorter than the declared length.

E.g.

An nchar(100) will always store 100 characters even if you only enter 5; the remaining 95 characters will be padded with spaces. Storing 5 characters in an nvarchar(100) uses only the space of those 5 characters.

Venkataraman R
  • 7
    Not completely true, since you're required to fill a char(100) with up to 100 characters. You would use this when you're eg store phone numbers in your database, or order numbers with a fixed length. Because the field length is fixed, you have no choice to fill it up to the maximum number of characters. But when all your data is 100 characters per record, a char(100) will take less storage than a varchar(100) because it doesn't need a length indication: every value would be exactly 100 characters. – Wim ten Brink Nov 02 '09 at 10:44
8

The differences are:

  1. n[var]char stores Unicode while [var]char just stores single-byte characters.
  2. [n]char requires a fixed number of characters of the exact length, while [n]varchar accepts a variable number of characters up to and including the defined length.

Another difference is length. Both nchar and nvarchar can be up to 4,000 characters long, and char and varchar can be up to 8,000. But in SQL Server you can also use [n]varchar(max), which can hold up to 2^31-1 bytes (2 GB) of data.
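A sketch of the (max) variants mentioned above (the table and column names are illustrative, not from the question):

```sql
-- Hypothetical table using the (max) variants discussed above.
CREATE TABLE Documents
(
    Id    int IDENTITY PRIMARY KEY,
    Body  varchar(max),   -- non-Unicode, up to 2^31-1 bytes (2 GB)
    NBody nvarchar(max)   -- Unicode, up to 2^31-1 bytes (about one billion characters)
);
```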

Wim ten Brink
4
  • nchar is fixed-length and can hold Unicode characters. It uses two bytes of storage per character.

  • varchar is variable-length and cannot hold Unicode characters. It uses one byte of storage per character.

Manu
  • Wrong. Unicode can use 1 to 4 bytes (in general) for every character. Also, a varchar can hold unicode, but it's not recognised as unicode. As a result, a varchar is considered unreliable for unicode storage. (Especially since there's a risk that the code that accesses the field will translate it incorrectly.) – Wim ten Brink Nov 02 '09 at 10:39
  • @Alex: I think you made your point but I still do not agree with you. What you are saying is that an int CAN hold a long if the long happens to be smaller than 2^32. This is not only 'unreliable', it is an inherent limitation which makes it impossible to cover the whole value range. – Manu Nov 05 '09 at 08:36
  • 4
    @Workshop Alex: Wrong. Unicode encoded as `UCS-2` (which happens to be the encoding used by SQL Server) stores every character in *exactly* two bytes, see http://msdn.microsoft.com/en-us/library/bb330962%28v=sql.90%29.aspx: `SQL Server stores Unicode in the UCS-2 encoding scheme... UCS-2 is a fixed-length encoding that represents all characters as a 16-bit value (2 bytes)`. SQL Server 2008 can use SCSU compression, but is still compression of the UCS-2 encoded Unicode strings: http://msdn.microsoft.com/en-us/library/ee240835.aspx – Remus Rusanu Dec 18 '10 at 06:23
2

NVARCHAR can store Unicode characters and takes 2 bytes per character.

Gustavo Rubio
  • 1
    WRONG! Unicode uses between 1 and 4 bytes per character! Many people forget this! Even the use of UTF-16 might result in some characters taking 4 bytes instead of 2, although the common length will be 2 bytes. Certain other subformats of Unicode might take even more than 4 bytes! – Wim ten Brink Nov 02 '09 at 10:37
  • 8
    @WimtenBrink - The question is about SQL Server and `nvarchar` always takes 2 bytes per character. – Martin Smith Nov 23 '11 at 23:15
  • 1
    @Wim, you are correct there are several encodings for Unicode that can produce a different number of bytes. But SQL Server does not give you a choice about Unicode encoding. SQL Server prior to 2012 only used UCS-2, two bytes wide, so Martin was correct at the time he wrote the answer. As other answers above have said, SQL Server 2012 now provides UTF-16, so two bytes for many characters (those in the Unicode Basic Multiliingual Plane), four bytes for others. – Concrete Gannet Feb 07 '14 at 00:55