How can I convert unicode characters to ascii codes in delphi 7?

Question

Yes, we're talking about ASCII codes. My apologies, I'm not the Delphi dev here.

The title doesn't really say it all! I'm not sure you understand Unicode, or else I misunderstood your question. — Steve, Nov 20 '08 at 13:09
We already knew we were talking about ASCII codes because that's what you said in the title. But you haven't said what kind of conversion you mean. Can you give some example inputs and the outputs you expect them to yield? Maybe you should have the Delphi developer ask the question instead. — Rob Kennedy, Nov 21 '08 at 02:58

score 6 · Accepted Answer · edited May 23 '17 at 11:45

For Delphi 7, I'd get the free Unicode Library by Mike Lischke, who is the author of Virtual Treeview.

The library includes a lot of conversion functions to go to and from Unicode, so you can use the ones that make the most sense in your application.

Or you can upgrade to Delphi 2009 which has built-in encoding routines, and its own library of conversion functions.


answered Nov 21 '08 at 01:04 by lkessler

Thanks for letting me know @Zéiksz - The page changed and I've now updated it. – lkessler Jul 30 '12 at 20:13

score 3 · Answer 2 · answered May 24 '09 at 03:19

Let's get a few things straight. A character set (charset) and a character encoding are two related but different concepts. A character set is an abstract list of characters, each associated with an integer character code. A character encoding is basically an algorithm that describes how those characters are represented in bytes.

ASCII acts as both a character set and an encoding. It uses 7 bits to express 128 characters (95 printable, counting the space). Unicode, on the other hand, is a character set with 1,114,112 code points. There are several encodings for representing Unicode strings; the most notable are UTF-8, UTF-16, UTF-16LE, and UTF-32. In other words, a single Unicode character can be represented in different ways depending on the encoding.

How can I convert unicode characters to ascii codes in delphi 7?

I think the question could be interpreted in two ways.

1. I have a Unicode string in some encoding that only includes printable ASCII characters. How can I convert the string into a byte array in ASCII encoding?
2. I have a Unicode string in some encoding that also includes non-ASCII characters, such as Chinese characters. How can I encode the string into an ASCII encoding without losing information, and later decode it back to the original Unicode string?

If you mean the first, you can load the Unicode string into WideString like Osman is saying and do

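var
  original: WideString;
  s: AnsiString;
begin
  // The cast converts using the system's default ANSI code page.
  // Pure ASCII text survives unchanged.
  s := AnsiString(original);
end;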
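
The cast is only lossless when every character is in the ASCII range. A minimal sketch of a guard you could run first, assuming a hypothetical helper named IsPureAscii (my own name, not part of the RTL):

```pascal
// Hypothetical helper: returns True when every character of ws lies in
// the 7-bit ASCII range (code values 0..127).
function IsPureAscii(const ws: WideString): Boolean;
var
  i: Integer;
begin
  Result := True;
  for i := 1 to Length(ws) do
    if Ord(ws[i]) > 127 then
    begin
      Result := False;
      Exit;
    end;
end;
```

With that in place, something like `if IsPureAscii(original) then s := AnsiString(original);` lets you decide separately what to do with strings that contain anything else.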

If you mean the second, you would need a generic encoding algorithm like Base64 encoding. You can use DCPBase64.pas included in David Barton's DCPcrypt v2 Beta 3.
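
If pulling in a Base64 unit is not an option, the same round-trip idea can be sketched with plain hex digits. This is a swapped-in technique, not Base64 and not DCPcrypt's API; the unit and function names below are hypothetical, and the output is larger than Base64 would produce (four ASCII characters per UTF-16 code unit).

```pascal
unit HexAscii; // hypothetical unit name

interface

function WideToHexAscii(const ws: WideString): AnsiString;
function HexAsciiToWide(const s: AnsiString): WideString;

implementation

uses
  SysUtils; // IntToHex, StrToInt

// Encodes each UTF-16 code unit as four hex digits, so the result
// contains only the ASCII characters 0-9 and A-F.
function WideToHexAscii(const ws: WideString): AnsiString;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(ws) do
    Result := Result + IntToHex(Ord(ws[i]), 4);
end;

// Reverses WideToHexAscii. Assumes the input length is a multiple of 4;
// StrToInt raises EConvertError if a group is not valid hex.
function HexAsciiToWide(const s: AnsiString): WideString;
var
  i: Integer;
begin
  Result := '';
  i := 1;
  while i + 3 <= Length(s) do
  begin
    Result := Result + WideChar(Word(StrToInt('$' + Copy(s, i, 4))));
    Inc(i, 4);
  end;
end;

end.
```

Because every code unit is kept, HexAsciiToWide(WideToHexAscii(ws)) should give back the original string, including surrogate pairs.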

score 1 · Answer 3 · answered Nov 20 '08 at 12:42

It depends on what your definition of conversion is. If you want to map the 128 lowest characters (codes 0 to 127) to their Unicode equivalents, you can use an explicit cast. But this creates garbage if the string contains higher characters.

If you want mappings like ë -> e and û -> u, you can write your own code. But be aware that there are always characters that can't be converted.

answered Nov 20 '08 at 12:42 by Toon Krijthe

You do NOT want to write your own code to do character set conversion! – lkessler Nov 21 '08 at 01:05

Steve · Answer 4 · 2008-11-21T11:05:18.633

As an example, the letter A is represented in Unicode as U+0041 and in ANSI as just hex 41, so converting that would be pretty simple. But you must first find out how the Unicode text is encoded. The most common encodings are UTF-16 and UTF-8. UTF-16 is basically two bytes per character, but even that is an oversimplification, as characters outside the Basic Multilingual Plane take four bytes (a surrogate pair). UTF-8 sounds as if it means one byte per character, but a character can take anywhere from one to four bytes. To further complicate matters, UTF-16 can be little endian or big endian (A is stored as the bytes 41 00 or 00 41, respectively).

Where your question makes no sense is if you wanted to, for example, convert the Arabic letter ain (U+0639) to ANSI on an English locale. You can't.
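
To make the size difference concrete, here is a small console sketch that compares byte counts for the two characters mentioned, using the UTF8Encode routine from the Delphi 7 RTL. The program name is arbitrary.

```pascal
program EncodingSizes; // arbitrary name for this sketch
{$APPTYPE CONSOLE}

var
  ws: WideString;
begin
  // Latin capital letter A, U+0041
  ws := 'A';
  Writeln(Length(ws) * SizeOf(WideChar)); // bytes as UTF-16: prints 2
  Writeln(Length(UTF8Encode(ws)));        // bytes as UTF-8:  prints 1

  // Arabic letter ain, U+0639
  ws := WideChar($0639);
  Writeln(Length(ws) * SizeOf(WideChar)); // bytes as UTF-16: prints 2
  Writeln(Length(UTF8Encode(ws)));        // bytes as UTF-8:  prints 2
end.
```

The ASCII letter shrinks to one byte in UTF-8, while the Arabic letter still needs two bytes in either encoding and has no single-byte ASCII form at all.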

The question is fine. He simply wants to convert Unicode to ASCII. Obviously, he'll have to decide on what to do with letters that can't convert. — lkessler, Nov 21 '08 at 01:11

score 1 · Answer 5 · answered Nov 20 '08 at 21:45

"ASCII" is the name of a specific mapping of characters to numbers, but some people say "ASCII code" when they don't really mean ASCII at all; they just want the numeric value of a character, whatever mapping is in effect at the time. Does that description apply to you?

If so, then you can use the Ord standard function to get the Unicode code-point value of whatever Unicode character you have.

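var
  wc: WideChar;
  ws: WideString;
  x: Word;
begin
  x := Ord(wc);    // numeric value of a single WideChar
  x := Ord(ws[1]); // numeric value of the first character of the string
end;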

If you really meant ASCII, though, then you'll have to be more specific about what sort of conversion you have in mind.

score 1 · Answer 6 · edited May 23 '17 at 12:33

See related questions on converting from Unicode to ASCII:

In general, a character set of hundreds of thousands of entries cannot be converted to a character set of 128 entries without some loss of information or an encoding scheme.

score 1 · Answer 7 · answered Nov 22 '08 at 20:31

You can use the function at http://swissdelphicenter.ch/en/showcode.php?id=1692
It converts a Unicode string to an ANSI string using a specified code page.
If you want to convert using the default system code page (defined in the regional options as the non-Unicode code page), you can simply do the following:

var
  ws: WideString;
  s: string;
begin
  s := string(ws); // in Delphi 7, string is AnsiString
end;
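
For the explicit code page case, a rough sketch built on the Windows API call WideCharToMultiByte might look like the following. This is my own illustration, not a copy of the linked code; the unit and function names are hypothetical.

```pascal
unit WideToCodePage; // hypothetical unit name

interface

function WideToAnsiCP(const ws: WideString; CodePage: Cardinal): AnsiString;

implementation

uses
  Windows; // WideCharToMultiByte

// Converts ws to a byte string in the given Windows code page.
// Returns an empty string if the input is empty or the call fails.
function WideToAnsiCP(const ws: WideString; CodePage: Cardinal): AnsiString;
var
  Len: Integer;
begin
  Result := '';
  if ws = '' then
    Exit;
  // First call: ask how many bytes the converted text needs.
  Len := WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    nil, 0, nil, nil);
  if Len <= 0 then
    Exit;
  SetLength(Result, Len);
  // Second call: perform the conversion into the allocated buffer.
  WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    PAnsiChar(Result), Len, nil, nil);
end;

end.
```

Passing CP_ACP should match what the plain cast does, and 20127 is the Windows identifier for US-ASCII. What happens to characters that have no equivalent in the target code page is decided by Windows, so check the output before relying on it.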
