3

I'm looking for a way to determine if a column contains any non-ASCII data such that I can't convert it from nVarChar to varChar.

Ideally I would like to do this in T-SQL.

Jonathan Allen
  • 68,373
  • 70
  • 259
  • 447
  • 1
    https://dba.stackexchange.com/questions/167489/detect-if-any-values-in-nvarchar-columns-are-actually-unicode – Hadi Jan 21 '18 at 10:36
  • 2
    https://stackoverflow.com/questions/3891331/find-non-ascii-characters-in-varchar-columns-using-sql-server – Hadi Jan 21 '18 at 10:36
  • "non-ASCII" seems overly specific. Are you going for a test that doesn't depend on the varChar collocation? If not, @Pரதீப் 's [answer](https://stackoverflow.com/a/48363762/2226988) would work for the database collation, allowing whichever non-ASCII characters it supports. (SQL Server doesn't have a collation with ASCII as the character set.) – Tom Blodget Jan 21 '18 at 14:15
  • What I am really looking for is to answer the question "Does this column need to be nVarChar or could I change it to nVarChar without losing data?" – Jonathan Allen Jan 21 '18 at 19:00
  • 1
    Possible duplicate of [Find non-ASCII characters in varchar columns using SQL Server](https://stackoverflow.com/questions/3891331/find-non-ascii-characters-in-varchar-columns-using-sql-server) – Vanity Slug - codidact.com Feb 06 '18 at 17:55

2 Answers2

8

one simple way is by converting to varchar

select * 
from yourtable 
where nvarcharcolumn <> cast(nvarcharcolumn as varchar(8000))  
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
Pரதீப்
  • 91,748
  • 19
  • 131
  • 172
7

The other current answer is missing a step. All it checks is that after converting to varchar they still compare equal - not that they are actually the same character.

The string N'⁴"*8?' would silently be converted to '4"*8?' and still compare equal under many collations.

See

SELECT nvarcharcolumn,
       CAST(nvarcharcolumn AS VARCHAR(4000))
FROM   ( VALUES(N'⁴"*8?')) V(nvarcharcolumn )
WHERE  nvarcharcolumn <> cast(nvarcharcolumn AS VARCHAR(4000)) 

I would check that it round trips to the exact same value after casting to varchar and back again.You can do this with

WHERE nvarcharcolumn <> cast(nvarcharcolumn as varchar(4000)) collate Latin1_General_100_BIN 
Martin Smith
  • 438,706
  • 87
  • 741
  • 845