I have encountered a problem when saving a string containing German letters to a text file. The MCVE looks like this:
procedure TForm1.Button1Click(Sender: TObject);
var
  s: string; // alias for UnicodeString
  tf: TextFile;
  ms: TMemoryStream;
begin
  s := 'ßüÜöÖäÄФфшШ';
  AssignFile(tf, 'b:\tmp.txt');
  Rewrite(tf);
  Write(tf, s);
  CloseFile(tf);
  ms := TMemoryStream.Create;
  try
    ms.WriteBuffer(Pointer(s)^, Length(s) * SizeOf(s[Low(s)]));
    ms.Position := 0;
    ms.SaveToFile('b:\tmp2.txt');
  finally
    ms.Free;
  end;
end;
If the string is written directly to the file, tmp.txt contains ?uUoOaAФфшШ. The German letters are altered, although the Cyrillic letters remain intact. If the string is saved through TMemoryStream, the result is correct: tmp2.txt contains ßüÜöÖäÄФфшШ. What is the reason for this?
Appended
I decided to add the HEX values for the given string saved in different ways:
For the Write method:
data: array[0..10] of byte = (
$3F, $75, $55, $6F, $4F, $61, $41, $D4, $F4, $F8, $D8
);
For the Write method called after AssignFile(tf, 'b:\tmp.txt', CP_UTF8):
data: array[0..21] of byte = (
$C3, $9F, $C3, $BC, $C3, $9C, $C3, $B6, $C3, $96, $C3, $A4, $C3, $84, $D0, $A4,
$D1, $84, $D1, $88, $D0, $A8
);
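For reference, the call sequence behind the dump above might look like the following console sketch. It assumes a Delphi version with unit scope names (Winapi.Windows) and an AssignFile overload that accepts a code page, as the call quoted above implies. It also assumes the source file is saved as UTF-8 so the literal survives compilation, and that drive b: exists as in the MCVE. The program name is hypothetical.

```delphi
program WriteUtf8Demo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  Winapi.Windows; // declares CP_UTF8

var
  s: string; // UnicodeString (UTF-16)
  tf: TextFile;
begin
  // Assumption: this source file is saved as UTF-8, otherwise the
  // compiler may misread the non-ASCII literal.
  s := 'ßüÜöÖäÄФфшШ';
  // The third argument selects the code page used when text is written.
  AssignFile(tf, 'b:\tmp.txt', CP_UTF8);
  Rewrite(tf);
  try
    Write(tf, s);
  finally
    CloseFile(tf);
  end;
end.
```

Without the third argument, Write converts the UTF-16 string to a single-byte code page chosen by the runtime, which is where characters that have no mapping in that code page can be replaced.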
For TMemoryStream:
data: array[0..21] of byte = (
$DF, $00, $FC, $00, $DC, $00, $F6, $00, $D6, $00, $E4, $00, $C4, $00, $24, $04,
$44, $04, $48, $04, $28, $04
);
For TStringList:
data: array[0..27] of byte = (
$FF, $FE, $DF, $00, $FC, $00, $DC, $00, $F6, $00, $D6, $00, $E4, $00, $C4, $00,
$24, $04, $44, $04, $48, $04, $28, $04, $0D, $00, $0A, $00
);
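The TStringList code is not shown in this post. One way to obtain a dump of this shape (a UTF-16 LE BOM, the text, then a CR LF pair) would be the sketch below. The use of TEncoding.Unicode, the file name tmp3.txt, and the program name are assumptions, not the code that was actually run.

```delphi
program StringListDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  sl: TStringList;
begin
  sl := TStringList.Create;
  try
    // Assumption: this source file is saved as UTF-8.
    sl.Add('ßüÜöÖäÄФфшШ');
    // TEncoding.Unicode is UTF-16 LE. SaveToFile writes the encoding's
    // preamble (BOM) first, and the list text ends with a line break.
    sl.SaveToFile('b:\tmp3.txt', TEncoding.Unicode); // hypothetical file name
  finally
    sl.Free;
  end;
end.
```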
Appended upon the valued advice of @Remy-Lebeau:
This method generates a 25-byte file. Its hex dump matches the one produced by the Write method called after AssignFile(tf, 'b:\tmp.txt', CP_UTF8), with 3 additional leading bytes (BOM?).
data: array[0..24] of byte = (
$EF, $BB, $BF, $C3, $9F, $C3, $BC, $C3, $9C, $C3, $B6, $C3, $96, $C3, $A4, $C3,
$84, $D0, $A4, $D1, $84, $D1, $88, $D0, $A8
);
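Byte listings like the ones in this post can be reproduced with a small helper along these lines. This is only a sketch: the program name HexDump and the function BytesToHex are hypothetical, and the path is the one from the MCVE.

```delphi
program HexDump; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

// Formats each byte as '$XX' and joins the items with ', '.
function BytesToHex(const Bytes: TBytes): string;
var
  i: Integer;
begin
  Result := '';
  for i := 0 to High(Bytes) do
  begin
    if i > 0 then
      Result := Result + ', ';
    Result := Result + '$' + IntToHex(Bytes[i], 2);
  end;
end;

begin
  // Read the raw file contents without any text decoding.
  Writeln(BytesToHex(TFile.ReadAllBytes('b:\tmp.txt')));
end.
```

Reading the file as raw bytes avoids any decoding step, so the output shows exactly what each saving method put on disk.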