0

My CUSTOMER table has some characters that are causing a downstream XML process to bark. It appears to be where LAST_NAME has a two-byte character. I've reviewed this SQL Server question which explained collation. But I'm still stuck. I notice that some characters are 2-bytes. I'm using AL32UTF8 and UTF8 character sets.

How can I find the last_names with 2 byte characters? Is that the question I should be asking if I want find all 2-bytes characters regardless of the collation?

2byte character

Community
  • 1
  • 1
zundarz
  • 1,540
  • 3
  • 24
  • 40
  • 1
    It depends on what you consider a "special character." A character that requires 2-bytes to store is a very strange definition of "special character," and an extremely poor one if you ask me. The same character could use one byte in one character set and multiple bytes in another. – NullUserException Jan 18 '13 at 16:11
  • Sounds fair, and you've educated me about character sets. Given that I'm using AL32UTF8 and UTF8 character sets, how do I find all characters in LAST_NAME that are 2-bytes long? – zundarz Jan 18 '13 at 16:32

2 Answers2

3

for the basic check of bytes != characters you can do something like

select * 
  from customer
 where length(last_name) != lengthb(last_name); 

but as NullUserException stated, depends how you really want to define special characters.

DazzaL
  • 21,638
  • 3
  • 49
  • 57
  • @@DazzaL - I respectfully think that this is not how you check bytes in Oracle. – Art Feb 25 '13 at 16:47
  • @Art `lengthb` returns the number of bytes used for a varchar/char field. what is the problem you see here for this datatype (not dates etc, where we wouldn't use lengthb)? – DazzaL Feb 25 '13 at 17:28
  • DazzaL- I misred your lengthb, sorry... Although lenghtb applies to Java mostly... – Art Feb 25 '13 at 17:43
1

Look here:

http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions233.htm#SQLRF06162

SELECT last_name, VSIZE (last_name) "BYTES"      
  FROM employees
 WHERE department_id = 10
ORDER BY employee_id;

LAST_NAME            BYTES
--------------- ----------
Whalen                   6
Art
  • 5,616
  • 1
  • 20
  • 22