I have migrated from Delphi 6 to Delphi 11 (64-bit edition), where I use the Indy and ZLib components. As part of this I moved from Indy 9 to Indy 10, which I use to post to an API. Before posting, I write the String to a Stream via:

var 
  XML: String; 
  stream: TStream;
begin
  ...
  stream.WriteBuffer(XML[1], Length(XML) * 2);

I then compress it using the ZLib component. But when the message reaches the server, a space appears after every letter:

< h t m l > u s e r < / h t m l >

Any idea on how to resolve this issue?

AmigoJack
sn_na_v
  • Most likely you now have [UTF-16](https://en.wikipedia.org/wiki/UTF-16) instead of [ASCII](https://en.wikipedia.org/wiki/ASCII) - handling a `String` like bytes was okay in Delphi 6 (which never had a 64 bit version), but now you _really_ deal with text, not bytes anymore - now text may need 2 or 4 bytes per character. In fact you don't see "_spaces_" but instead NULLs. – AmigoJack Jul 24 '23 at 18:14
  • Yes, I have to multiply the Length by two to receive the full message. How should I handle this? – sn_na_v Jul 24 '23 at 18:16
  • You have not understood my comment. And you have to convert your UTF-16 `XML` content to whatever encoding you want in your `stream`... but [XML also supports UTF-16 directly](https://en.wikipedia.org/wiki/XML#Encoding_detection) - you may not have understood XML, too. – AmigoJack Jul 24 '23 at 18:21
  • Required read: https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/ – HeartWare Jul 25 '23 at 05:36

1 Answer

Since Delphi 2009, the string type has been a UTF-16 encoded UnicodeString. You are writing the raw character bytes of a string as-is to your TStream, which is why you need to multiply the string's length by 2, since SizeOf(Char) is 2 bytes. The spaces you are seeing are actually #$00 bytes, as ASCII characters in UTF-16 have their high byte set to zero.

But in Delphi 6, the string type was an 8-bit AnsiString instead, and SizeOf(Char) was 1 byte.
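To make the difference visible, here is a minimal console sketch for Delphi 2009 or later that dumps the bytes of a short string in both encodings. The `HexDump` helper and the sample text `<a>` are hypothetical names chosen for illustration only; `TEncoding.Unicode` is used here because it is UTF-16LE, which matches the in-memory layout of a UnicodeString.

```delphi
program Utf16Demo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: returns the bytes of S in the given encoding
// as a space-separated hex string.
function HexDump(const S: string; Encoding: TEncoding): string;
var
  Bytes: TBytes;
  B: Byte;
begin
  Result := '';
  Bytes := Encoding.GetBytes(S);
  for B in Bytes do
    Result := Result + IntToHex(B, 2) + ' ';
  Result := Trim(Result);
end;

begin
  // 2 in Delphi 2009 and later
  Writeln('SizeOf(Char) = ', SizeOf(Char));
  // UTF-16LE: every ASCII character is followed by a 00 byte
  Writeln('UTF-16: ', HexDump('<a>', TEncoding.Unicode)); // 3C 00 61 00 3E 00
  // UTF-8: ASCII characters stay one byte each
  Writeln('UTF-8:  ', HexDump('<a>', TEncoding.UTF8));    // 3C 61 3E
end.
```

The `00` bytes in the UTF-16 line are what the server displays as "spaces".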

To get the same behavior in Delphi 11 that you had in Delphi 6, you need to convert your UnicodeString characters into encoded bytes before writing them to your TStream. The default/preferred encoding for XML is UTF-8 (but can be any other charset you choose to specify in the XML's prolog), eg:

var
  XML: string;
  utf8: UTF8String;
  stream: TStream;
...
utf8 := UTF8String(XML); // use UTF8Encode() in Delphi 6
// PAnsiChar() avoids indexing utf8[1], which fails range checking if the string is empty
stream.WriteBuffer(PAnsiChar(utf8)^, Length(utf8));

Alternatively, Indy has overloaded WriteStringToStream() functions in the IdGlobal unit, which have an optional ADestEncoding parameter, eg:

uses
  ..., IdGlobal;

var
  XML: string;
  stream: TStream;
...
WriteStringToStream(stream, XML, IndyTextEncoding_UTF8);

Alternatively, you can use Delphi's own TStringStream class instead, which has an optional TEncoding parameter in Delphi 2009+, eg:

var
  XML: string;
  stream: TStream;
...
stream := TStringStream.Create(XML, TEncoding.UTF8);
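For the TStringStream variant, a small self-contained sketch can confirm that the stream holds one byte per ASCII character rather than two. The `Utf8StreamSize` helper is a hypothetical name introduced for illustration, and the sample text is taken from the question; the `try..finally` block is there because the stream must be freed by whoever creates it.

```delphi
program StringStreamDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: encodes S as UTF-8 through a TStringStream and
// returns how many bytes ended up in the stream.
function Utf8StreamSize(const S: string): Int64;
var
  Stream: TStream;
begin
  Stream := TStringStream.Create(S, TEncoding.UTF8);
  try
    Result := Stream.Size;
  finally
    Stream.Free; // the creator owns the stream
  end;
end;

begin
  // 17 ASCII characters encode to 17 bytes in UTF-8 (34 as raw UTF-16)
  Writeln('UTF-8 bytes:  ', Utf8StreamSize('<html>user</html>'));
  Writeln('UTF-16 bytes: ', Length('<html>user</html>') * SizeOf(Char));
end.
```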

Alternatively, simply don't use a TStream at all. Indy's TIdIOHandler class has a DefStringEncoding property for textual reads/writes, and its Write(string) and WriteLn(string) methods also have an optional AByteEncoding parameter, eg:

var
  XML: string;
...
Connection.IOHandler.DefStringEncoding := IndyTextEncoding_UTF8;
...
Connection.IOHandler.Write(XML);

or:

var
  XML: string;
...
Connection.IOHandler.Write(XML, IndyTextEncoding_UTF8);
Remy Lebeau