0

I have a contact table where it stores names of the contact persons. So our research team copies the names from websites and pastes it into application, while copying we got some special characters got stored into database. Check below examples which will reflect as "?"(ASCII code=63) while extracted to text file. Examples of contact Last names are listed as follows.

EX: 1) Sheffield 2) Griffin-Smith 3) Lhoёst

Is there a way to query list of all special characters available with ASCII code as 63 in the "Last_name" column of my contact table so that I could figure out and send them to researchers to update the correct names.

Thanks in advance!

MGM
  • 65
  • 2
  • 7

6 Answers6

3

If you want to find all entries which contain non ASCII Characters you can do the following:

select * from TheTable where Last_name != Cast(Last_name AS VARCHAR(1000)) 
Narti
  • 181
  • 7
  • Hi Narti, your answer is useful in some perspective but it will not answer my question. Anyways thanks!! – MGM Jun 16 '14 at 10:04
1

I would prefer to leave a comment but my reputation wasn't enough to comment Narti's answer. Anyway although MGM marked Narti's answer as correct answer, he/she commented that it does not exactly answer his/her question. Well it answered my exact same question.

I modified my script that didn't replace ascii 63 and I was able to replace it by casting and then replacing.

From this: (This doesn't work)

update TheTable set last_name = replace(last_name, char(63),'');

To this:(When you cast and make it an actual '?', now you can replace '?' and remove it

update TheTable set last_name = replace(Cast(last_name AS VARCHAR(1000)), '?','');

Just wanted to add this to this thread.

Thanks,

Yeshim
  • 11
  • 3
0

The character with ASCII code 63 is not some strange character, it's the question mark. (It's no surprise that it is shown as a question mark, as it is a question mark.)

To find the records that contains the question marks, you can use the like operator:

select * from TheTable where Last_name like '%?%'
Guffa
  • 687,336
  • 108
  • 737
  • 1,005
  • yeah agree, but in the database the char is not stored in the format of "?" and it is stored as "Sheffield" but at the time of copying it to notepad it is converting into "?" so i want all the names that would turn as "?" while copying into notepad. Hope you understood my question. – MGM Jun 16 '14 at 09:27
  • @Manojgade: Aha, you said that they were stored as ASCII code 63 in the database, so I throught that they were... To find exactly the characters that would not be supported in some specific encoding would be complicated, but you can get the names containing potentially troublesome characters using something like `select * from TheTable where Last_name like '%[^A-Za-z\- ]%' escape '\'`. – Guffa Jun 16 '14 at 10:04
0

You can directly use that character.

If you know that it is in keyboard like 63 is ascii of ? then run this query as

declare @t table (col varchar(50))

insert into @t values('ajay'),('bbb?bb'),('bbb?cc'),('55?bb')

select * from @t where col like '%?%'

Else

you can use regular expression for this.

To find out not alphabet, simple query as

SELECT Col1
FROM TABLE
WHERE Col1 like '%[^a-Z0-9]%'

Find all special characters in a column in SQL Server 2008

Check if field contains special character in SQL

The logic behind '%[^a-Z0-9]%' is,

  1. a-Z0-9 is search all alphabetic character
  2. ^ is used to not operator, means no alphabets
  3. Both % give occurance, zero or more.

For escaping charcter search.

http://web.archive.org/web/20150519072547/http://sqlserver2000.databases.aspfaq.com:80/how-do-i-search-for-special-characters-e-g-in-sql-server.html

Community
  • 1
  • 1
Ajay2707
  • 5,690
  • 6
  • 40
  • 58
  • your query gives me all the special characters available in the column, but my requirement is to get special characters only with ASCII value as 63. – MGM Jun 16 '14 at 09:30
  • Hi please check my above specified examples and try to copy them into your notepad that will tell you the story and to add to the before I have data stored in my table in the examples I specified and not with "?" in the middle of the word. My challenge is to identify such special characters and indeed i don't want to query for all special characters available in my column I am specific to special chars with ASCII value as 63. Hope you understood now. – MGM Jun 16 '14 at 09:47
0
create function fnSpecialChars(@MyString as varchar(100)) returns bit as 
begin
    declare @result as bit
    declare @i as int
    declare @n as int
    set @result = 0
    set @i = 1 
    if @MyString = '' or @MyString is null 
        set @result = 0 
    else
        begin
            set @n = datalength(@MyString)
            while @i <= @n and @result = 0
                begin
                    if ascii(substring(@MyString, @i, 1)) = 63 --between 128 and 255 
                        set @result = 1
                    else
                        set @i = @i + 1
                end
        end
    return @result
end
GO




SELECT * FROM tbl_crm_h_contact WHERE dbo.fnSpecialChars(last_nm) = 1
MGM
  • 65
  • 2
  • 7
0

Be aware that ASCII characters are those in the 0-127 code point range. Characters in the 128-255 range (sometimes casually called extended ASCII characters) depend on the code page of the column collation. When you insert characters that do not map to a character in the target collation code page, SQL Server will either change the character to '?' or map it to a close alternate character (e.g. 'Ǜ' to 'U'). I mention this because it is not only '?' that may be an issue. This character loss occurs when data is converted so it is not possible to identify the original character after data has been inserted.

The script below will list most problem characters according to your default database collation. Note that this checks all mathematically possible UCS-2 code points rather than the subset of valid UCS-2 characters.

WITH 
      t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
    , t256 AS (SELECT 0 AS n FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d)
    , t64k AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) - 1 AS num FROM t256 AS a CROSS JOIN t256 AS b)
SELECT 
      num AS UnicodeCodePoint
    , NCHAR(num) AS UnicodeChar
    , CAST(NCHAR(num) AS char(1)) AS AsciiChar
FROM t64k
WHERE 
    CAST(NCHAR(num) AS char(1)) = '?';
Dan Guzman
  • 43,250
  • 3
  • 46
  • 71