4

What is the full syntax of a string literal in T-SQL?

I could not find any explicit definition of string syntax in the official documentation, and any blog post I could find only mentioned certain aspects (like the N prefix). Additionally, any posts on Stack Overflow or other sites that asked how to escape a string had answers that pointed to using parameters instead of answering the question, which doesn't provide me the information I'm looking for.

From what I can tell, a string literal starts with an optional N followed by a wrapper of single quotes (or double quotes, if a setting is set a certain way), single quotes within the string are escaped by doubling the single quote, and backslashes are escaped by doubling the backslash (possibly only if a setting is set a certain way?).

Is there an authoritative and/or comprehensive source for this information? Even just a list of reserved/special characters would be better than what I was able to find.

GPHemsley
  • 535
  • 1
  • 7
  • 18
  • 6
    Have you referred to this documentation? https://learn.microsoft.com/en-us/sql/t-sql/data-types/constants-transact-sql?view=sql-server-ver15. – Gordon Linoff Jul 15 '20 at 14:34
  • 1
    A backslash has no specific meaning in tsql. Presumably your context is another language like csharp where it does need to be escaped. If so, you should make that clear. – SMor Jul 15 '20 at 14:38
  • @GordonLinoff No, I missed that. _It's the first I've seen mention that `?` is a special character, but it doesn't mention escaping single quote or backslash._ Nevermind, read "quotation mark" as "question mark" for some reason. – GPHemsley Jul 15 '20 at 14:38
  • `?`isn't a special character. And `N` means the literal is a Unicode literal. There's no special syntax. What is the actual problem you want to solve? What special characters do you want to use? If it's Unicode, you can just type the Unicode characters directly. – Panagiotis Kanavos Jul 15 '20 at 14:39
  • 1
    The docs do mention that single quotes need escaping, in the third paragraph of [Character string constants](https://learn.microsoft.com/en-us/sql/t-sql/data-types/constants-transact-sql?view=sql-server-ver15#character-string-constants). Anything else is just a character. There are no escape sequences. SQL, the language itself, isn't great at text handling and T-SQL is worse than usual. Entering tabs or non-English characters is trivial - just type the character. There's no need to escape – Panagiotis Kanavos Jul 15 '20 at 14:44
  • @PanagiotisKanavos Sorry, the `?` was a misread on my part. I am generating a query file from data, and I'm trying to generate the strings properly. Doubling the single quotes was the first thing I did, but then my data was seemingly hanging up on backslashes. – GPHemsley Jul 15 '20 at 14:49
  • `I am generating a query file from data` that's the bug then. What if that file contained `' or 1=1; drop table users; --`? Just don't use such code, use parameterised queries. If you want to quickly insert data from a CSV or fixed-length file into a table use SqlBulkCopy or SQL Server's `bcp` and `BULK INSERT` – Panagiotis Kanavos Jul 15 '20 at 14:51
  • @PanagiotisKanavos Parameterized queries are not an option. As I mentioned in my OP, that answer does not help me. I am fully aware of the risks of SQL injection. Using a different bulk insertion method may be an option, but I would still like an answer to my question. – GPHemsley Jul 15 '20 at 14:54
  • You already got that answer. You haven't asked your real question yet. *Why* are parameterised queries not an option? What are you trying to do? Parameterised queries *are* the most common and easiest way to send text data. String concatenation is *not* a bulk insertion technique on the other hand. You have to ask a specific question if you want an answer. If you asked "How can I import a CSV in code?" I'd tell you to use CsvReader to load the data, FastMember's ObjectReader to wrap the records into an IDbDataReader and pass it to SqlBulkCopy. – Panagiotis Kanavos Jul 15 '20 at 14:57
  • Does this answer your (real) question? [Why do we always prefer using parameters in SQL statements?](https://stackoverflow.com/q/7505808/2029983) – Thom A Jul 15 '20 at 14:59
  • If you said `How can I import an Excel file in code?` I'd say "Use ExcelDataReader and pass the DbDataReader it creates to SqlBulkCopy". If you said `I want to perform bulk updates` I'd say there's no such thing, insert the new data into a staging table and run an `UPDATE target FROM source INNER JOIN target` etc – Panagiotis Kanavos Jul 15 '20 at 15:00
  • The only thing that came close to answering my question was @GordonLinoff's original response, a link to documentation that I had missed in my original search. I was going to add that as an answer and accept it, but the question got closed before I could. – GPHemsley Jul 15 '20 at 15:01
  • If you said "I want to replace the contents of a table with new data" I'd say "Forget updates and deletes, load the data into a staging table identical to the target and *swap the partitions* – Panagiotis Kanavos Jul 15 '20 at 15:01
  • 1
    @GPHemsley if you did, it would probably be downvoted. You haven't asked a concrete question yet. It's obvious you have some *other* problem and think escaping would be a solution. It's not – Panagiotis Kanavos Jul 15 '20 at 15:02
  • @PanagiotisKanavos I'm not asking for help with my other problem. I'm asking for a syntax reference. – GPHemsley Jul 15 '20 at 15:04
  • 3
    You got it, and actually found it yourself in all those articles and docs you read before posting the question. There is no special syntax apart from escaping single quotes. That's why all articles and SO answers said you should use parameterised queries – Panagiotis Kanavos Jul 15 '20 at 15:05

2 Answers2

2

The documentation I could not find is here (thanks @GordonLinoff): https://learn.microsoft.com/en-us/sql/t-sql/data-types/constants-transact-sql?view=sql-server-ver15

Character string constants are enclosed in single quotation marks and include alphanumeric characters (a-z, A-Z, and 0-9) and special characters, such as exclamation point (!), at sign (@), and number sign (#).

If the QUOTED_IDENTIFIER option has been set OFF for a connection, character strings can also be enclosed in double quotation marks, but the Microsoft SQL Server Native Client Provider and ODBC driver automatically use SET QUOTED_IDENTIFIER ON. We recommend using single quotation marks.

If a character string enclosed in single quotation marks contains an embedded quotation mark, represent the embedded single quotation mark with two single quotation marks. This is not required in strings embedded in double quotation marks.

Unicode strings have a format similar to character strings but are preceded by an N identifier (N stands for National Language in the SQL-92 standard). The N prefix must be uppercase.

So, by that definition, a Unicode string in the recommended format has the following syntax (as PCRE):

N'(?:[^']|'')*'

GPHemsley
  • 535
  • 1
  • 7
  • 18
  • As you want it *technically exact* you might add, that - by default - the `N` does not mean *unicode*, but just the range covered by `UCS-2`-encoding. Starting with v2019 there are enhancements covering supplementary characters with a special collation, [read this](https://learn.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-ver15#Unicode_Defn) – Shnugo Jul 16 '20 at 07:47
1

A String literal is anything enclosed in a single quote ('). The N prefix isn't "optional" it means that the string following uses the National language character set. A string without the N prefix is a varchar and with is an nvarchar.

Thom A
  • 88,727
  • 11
  • 45
  • 75
  • This does not answer my question. – GPHemsley Jul 15 '20 at 14:44
  • @GPHemsley: yes it does. At least the question "*What is the full syntax of a string literal*" –  Jul 15 '20 at 14:45
  • @a_horse_with_no_name I am looking for a technical answer, with a syntactic definition (like a regular expression or ABNF), not a prose one. This answer does not even mention escaping single quotes. – GPHemsley Jul 15 '20 at 14:50
  • 1
    @GPHemsley you got the answer to the question you posted. The *real* question, why string concatenation produced an invalid SQL statement, is completely different. The answer there is `just don't do it`. It's easier, faster (by orders of magnitude) and infinitely safer to use parameterised queries or SqlBulkCopy – Panagiotis Kanavos Jul 15 '20 at 14:52
  • 1
    Classic example of an [XY Problem](http://xyproblem.info) @GPHemsley . You asked Y, got an answer for Y. Obviously Y doesn't answer X, thus the answer to Y is "not" an answer to Y in your eyes, even though it actually is. Ask X, get an answer to X. – Thom A Jul 15 '20 at 14:57