Character encoding issue in Oracle PL/SQL

Question

I'm facing a character discrepancy issue while extracting data from db tables.

I've written a PL/SQL code to spool some data to .txt file from my db tables and running this sql using unix shell but when I'm getting the spooled file, the result set is a changed one from the one at back end.

For example:

At back end: SADETTÝN

In Spooled txt file :  SADETTŸN

If you look at the Y character, it is a changed one. I want to preserve all the characters the way they are at back end.

My db's character set:

SELECT * FROM v$nls_parameters WHERE parameter LIKE 'NLS%CHARACTERSET'
PARAMETER              VALUE 
NLS_CHARACTERSET       WE8ISO8859P1 
NLS_NCHAR_CHARACTERSET WE8ISO8859P1

And Unix NLS_LANG parameter :

$ echo $NLS_LANG
AMERICAN_AMERICA.WE8ISO8859P1

I tried changing NLS_LANG parameter to WE8ISO8859P9(Trukish characterset) but no help!

Could anyone let me know the solution to this problem?

Your DB's character set may be that but there's no guarantee that the terminal you're using for spooling can support `Ý`, or that your operating system can. — Ben, Sep 15 '12 at 21:47
How are you generating the text files? Are you using `UTL_FILE`? Or are you doing something else? Have you opened the file in a hex editor to see what hex value is actually in the file that is being interpreted as Ÿ in your text editor? — Justin Cave, Sep 15 '12 at 22:38
@JustinCave I'm using simple spooling concept in SQL*PLUS. I'll try opening my file using hex editor and let you know. Thanks. — prashant1988, Sep 16 '12 at 13:17
To see the character codes in SQL you can "select dump(column_name) from table". — Rene, Sep 17 '12 at 09:20
@Justin Cave edited my question with brief explanation. Thanks in advance. — prashant1988, Sep 17 '12 at 11:26
@prashant1988 - It would be helpful to post both the output of the `dump` function and the hex value stored in the file. Since you are posting the `NLS_LANG` from the Unix box, are you stating that you are invoking SQL*Plus from the Unix machine, connecting to the database, issuing a `SPOOL` command, and then executing a `SELECT` statement to generate the file? — Justin Cave, Sep 17 '12 at 13:32
@Justin Cave Dump value : `Typ=96 Len=8: 83,65,68,69,84,84,221,78` Hex Values: `Typ=96 Len=8 CharacterSet=WE8ISO8859P1: 53,41,44,45,54,54,dd,4e` Yes, I'm invoking SQL*plus from Unix machine and using SPOOL command then. for more info, My unix locale: `$ locale LANG= LC_CTYPE="C" LC_NUMERIC="C" LC_TIME="C" LC_COLLATE="C" LC_MONETARY="C" LC_MESSAGES="C" LC_ALL=` Though I don't know much about locale thing, Please help. — prashant1988, Sep 17 '12 at 14:36
@Justin Cave Sorry, Hex value for SADETTŸN in hex editor is `53 41 44 45 54 54 9F 4E` And for SADETTÝN : `53 41 44 45 54 54 DD 4E` And they are in this format ONLY. — prashant1988, Sep 17 '12 at 15:08
Is the value in the file that is generated on the server (before you copy it to your client machine) a 0xDD? Or a 0x9F? — Justin Cave, Sep 18 '12 at 04:39
@Justin Cave In beckend table: 0xDD. txt file in unix environment(opening via vi editor), Ý is appearing as \335 - hex value 5C 33 33 35. When I pcput the file to my disk, hex value: 0x9F Hope this clarifies more. Thanks — prashant1988, Sep 18 '12 at 10:55

score 3 · Accepted Answer · answered Sep 17 '12 at 12:01

3

I presume that you are trying to visualize your file with "vi" or something similar.NLS_LANG parameter is used only by your database to export to your file.For your editor(vi), you need to set the LANG parameter to the corresponding value to your NLS_LANG. Exemple : For ISO8859P1 american english you have to do export LANG=en_US.ISO8859-1 In other words your file is just fine it's your editor who doesn't know what to do with your Turkish characters.

answered Sep 17 '12 at 12:01

BulentB

318
1
4

Thanks for the suggestion. As my file is txt file, I'm just using pcput to save the file to my disk and then opening it using simple notepad. The result is a 'changed character' Nevertheless I tried changing LANG parameter as suggested by you : $ echo $export LANG=en_US.ISO8859-1 LANG=en_US.ISO8859-1 The result is same in notepad and vi editer. fyi..the result in VI EDITOR is : SADETT\335N (same as before) Please advise. – prashant1988 Sep 17 '12 at 12:28
1

Like i said your file is just fine.The "/335" which you see is the oct code for HEX DD which is "Ý" in ISO8859-1.(See [link](http://www.ascii-code.com/)).Sometimes vi don't show extended characters of the ascii table(character code 128-255) but not being a unix connaisseur i can't help you there.To be able to see special characters you need to use an editor with the correct encoding.For exemple if you use notepad++ and choose iso 8859 9 in the menu Encoding/Character Sets/Turkish you will even be able to see the "İ" which is the character you want to see for your "SAADETTİN" :). – BulentB Sep 18 '12 at 12:28
1

And i would not use "pcput" to tranfer the file because as you said above pcput seems to change the file during the transfer.I would use ftp/sftp instead. – BulentB Sep 18 '12 at 12:37

score 2 · Answer 2 · edited May 23 '17 at 10:34

2

You should use NCHAR data types. More information is available at Oracle Documentation - SQL and PL/SQL Programming with Unicode

For spooling from SQL*Plus, you need to set the NLS_LANG environment variable correctly. Here is a similar question in stackoverflow.

edited May 23 '17 at 10:34

Community

1
1

answered Sep 15 '12 at 22:19

Geordee Naliyath

1,799
17
28

but my char and nchar character set are same in db, so I don't think there is any neccesity to use NCHAR data type. `SELECT * FROM v$nls_parameters WHERE parameter LIKE 'NLS%CHARACTERSET'` `PARAMETER VALUE NLS_CHARACTERSET WE8ISO8859P1 NLS_NCHAR_CHARACTERSET WE8ISO8859P1` Please correct me if I'm wrong. – prashant1988 Sep 16 '12 at 13:13
Just some more info. Even my nls_lang parameter on Unix client OS is same as on DB: $ echo $NLS_LANG AMERICAN_AMERICA.WE8ISO8859P1 – prashant1988 Sep 16 '12 at 14:41

Character encoding issue in Oracle PL/SQL

2 Answers2