11

Let's say I have a table that has a varchar field. If I do an insert like this:

INSERT MyTable 
 SELECT N'the string goes here'

Is there any fundamental difference between that and:

INSERT MyTable 
 SELECT 'the string goes here'

My understanding was that you'd only have a problem if the string contained a Unicode character and the target column wasn't unicode. Other than that, SQL deals with it just fine and converts the string with the N'' into a varchar field (basically ignores the N).

I was under the impression that N in front of strings was a good practice, but I'm unable to find any discussion of it that I'd consider definitive.

gdoron
  • 147,333
  • 58
  • 291
  • 367
jcollum
  • 43,623
  • 55
  • 191
  • 321

3 Answers3

6

You should prefix strings with N when they are destined for an nvarchar(...) column or parameter. If they are destined for a varchar(...) column or parameter, then omit it, otherwise you end up with an unnecessary conversion.

It's definitely not a "best practice" to stick N in front of every string regardless of what it's for.

Aaronaught
  • 120,909
  • 25
  • 266
  • 342
  • 1
    But if you put the N in front, you don't really have to know what the target column's datatype is. Worrying about the extra conversion feels like micro-optimization. – jcollum Apr 26 '10 at 19:25
  • 2
    @jcollum: The fact remains that the best practice is to use what's actually correct. Someone may look at your query and incorrectly conclude that the column supports Unicode characters when it actually doesn't. – Aaronaught Apr 26 '10 at 19:28
  • What I wonder then is why so many tools put N in front of strings, regardless of the destination type. – jcollum Apr 26 '10 at 19:32
  • 1
    @jcollum: That is a very good question. Might just be laziness - easier to use the same code for all character types than to write special-case code based on Unicode settings. – Aaronaught Apr 26 '10 at 19:38
3

Short answer: fine for scripts, bad for production code.

It is not considered a best practice. There is a downside, it creates a minuscule performance hit as 2 byte characters are converted to 1 byte characters.

If one doesn't know where the insert is going, or doesn't know where the source text is coming from (say this is a general purpose data insertion utility that generates insert statements for an unknown target, say when exporting data), N'foo' might be the more defensive coding style.

So the downside is small and the upside is that your script code is much more adaptable to changes in database structure. Which is probably why you see it in bulk data-insert scripts.

However, if the code in question is something meant for re-use in an environment where you care about the quality of the code, you should not use N'the string' because you are adding a conversion where none is necessary.

jcollum
  • 43,623
  • 55
  • 191
  • 321
MatthewMartin
  • 32,326
  • 33
  • 105
  • 164
  • 1
    Ah, defensive, I like that term here. +1 – jcollum Apr 26 '10 at 19:55
  • I think this, combined with the one above is the answer. I think I'll edit this and call it the answer. – jcollum Apr 27 '10 at 19:12
  • @jcollum: It's not just about the performance hit of a conversion. You're indicating the wrong semantics to future maintainers. It's a bit like writing predicates as `WHERE SomeCol = '1'`, as if it's a `varchar` column, when it's really an `int` column. It may work, but that doesn't make it less wrong, and it may lead to subtle collation-related errors in the future as the script gets changed. Generated code is obviously not held to as high a quality standard as written code; the same is generally true of any code gen, try looking at the designer code-behind of a Windows Form. – Aaronaught Apr 28 '10 at 01:24
  • @aaron: there are any number of places in scripts where there are datetimes that are passed in as strings '2010-04-28 12:31AM'. I'm not seeing what you mean about the collation errors. I.e. I don't understand how asking sql to convert and nvarchar to a varchar (before inserting data for a column) could result in collation errors. – jcollum Apr 29 '10 at 21:54
  • @aaron: my edit of this answer also addressed not doing this in production code, which I think is appropriate. I don't see any downside to doing this in scripts. Please illustrate if you see an issue that I don't. – jcollum Apr 29 '10 at 21:56
  • @jcollum: My issue was with your last sentence, "because you are adding a conversion where none is necessary." The implication is that's the only reason, but a more important reason is that it's simply misleading. Code generators don't care about that because machine-generated code is not intended for human consumption. – Aaronaught Apr 29 '10 at 22:09
  • @aaron: I think you're reading an "only reason" in there when it doesn't say that. It's just the first reason that I picked and the most quantifiable one. – jcollum Apr 30 '10 at 14:21
0

From INSERT (Transact-SQL)

When referencing the Unicode character data types nchar, nvarchar, and ntext, 'expression' should be prefixed with the capital letter 'N'.

Also have a read at Why do some SQL strings have an 'N' prefix?

And

Server-Side Programming with Unicode

Unicode string constants that appear in code executed on the server, such as in stored procedures and triggers, must be preceded by the capital letter N. This is true even if the column being referenced is already defined as Unicode. Without the N prefix, the string is converted to the default code page of the database. This may not recognize certain characters.

Adriaan Stander
  • 162,879
  • 31
  • 289
  • 284
  • Doesn't answer the question. I wasn't asking why it's there. I was wondering if there's a downside or if it's a best practice to have it there regardless. – jcollum Apr 26 '10 at 19:27