So here's the setup:

  1. Make a new Delphi 7 application with a TRichEdit control on it. We are talking about non-Unicode applications here.
  2. In Windows' Regional and Language Options, install a new input language whose encoding differs from that of your default language for non-Unicode programs, for example Greek.
  3. Add a button to the application, and in its OnClick handler add Button1.Caption := RichEdit1.Text;. Set the button's Font.Charset to the charset of the input language you just installed (GREEK_CHARSET if we stick to this example).
  4. Run the application, switch to your new (Greek) input language, type a few letters in the RichEdit and press the button. The button's caption now shows ???? instead of Greek characters.
  5. Now, if you set your default language for non-Unicode programs to Greek (Windows restart required), the problem disappears and Greek characters appear properly. Set it back to what it was before and the problem returns.

So I would guess that TRichEdit works with Unicode internally, because changing its Font.Charset value never changes anything. The RichEdit accepts any installed input language properly, and if you have installed two different non-Latin languages that use different character sets (for example Greek with GREEK_CHARSET and Russian with RUSSIAN_CHARSET), it accepts them both without any change to its Font.Charset.

I would also guess that when you read the .Text (or .Lines[i]) value of the TRichEdit, it converts its internal Unicode text to ANSI, based on Windows' default language for non-Unicode programs.

Furthermore, the problem is not limited to assigning .Text to a String (AnsiString): assigning it to a WideString or a UnicodeString doesn't work properly either (the text is once again ???? instead of the proper characters).

So here's the question:

I want to be able to convert the text of a RichEdit to a String (ANSI) properly, based on a character set of my choosing instead of the system's default Language for non-Unicode programs. How can I do that? I would prefer a solution that doesn't involve third party components, but, of course, if not possible - anything would do.

Thanks!

P.S.: Switching to Delphi 2009 or later is not an acceptable solution.

TLama
jedivader

1 Answer

Send the underlying rich edit window the EM_GETTEXTEX message. You pass a GETTEXTEX struct which specifies the code page.

So, something like this would pull the text out into a UTF-16 encoded WideString:

function GetRichEditText(RichEdit: TRichEdit): WideString;
var
  GetTextLengthEx: TGetTextLengthEx;
  GetTextEx: TGetTextEx;
  Len: Integer;
begin
  GetTextLengthEx.flags := GTL_DEFAULT;
  GetTextLengthEx.codepage := 1200;
  Len := SendMessage(RichEdit.Handle, EM_GETTEXTLENGTHEX, 
    WPARAM(@GetTextLengthEx), 0);
  if Len=E_INVALIDARG then
    raise Exception.Create('EM_GETTEXTLENGTHEX failed');
  SetLength(Result, Len);
  if Len=0 then
    exit;
  GetTextEx.cb := (Length(Result)+1)*SizeOf(WideChar);
  GetTextEx.flags := GT_DEFAULT;
  GetTextEx.codepage := 1200;
  GetTextEx.lpDefaultChar := nil;
  GetTextEx.lpUsedDefChar := nil;
  SendMessage(RichEdit.Handle, EM_GETTEXTEX, WPARAM(@GetTextEx), 
    LPARAM(PWideChar(Result)));
end;

You can then convert that UTF-16 string to whatever code page you like. If you'd rather pull it out in a specific code page directly, then do it like this:

function GetRichEditText(RichEdit: TRichEdit; AnsiCodePage: UINT): AnsiString;
var
  GetTextLengthEx: TGetTextLengthEx;
  GetTextEx: TGetTextEx;
  Len: Integer;
begin
  GetTextLengthEx.flags := GTL_DEFAULT;
  GetTextLengthEx.codepage := AnsiCodePage;
  Len := SendMessage(RichEdit.Handle, EM_GETTEXTLENGTHEX, 
    WPARAM(@GetTextLengthEx), 0);
  if Len=E_INVALIDARG then
    raise Exception.Create('EM_GETTEXTLENGTHEX failed');
  SetLength(Result, Len);
  if Len=0 then
    exit;
  GetTextEx.cb := (Length(Result)+1)*SizeOf(AnsiChar);
  GetTextEx.flags := GT_DEFAULT;
  GetTextEx.codepage := AnsiCodePage;
  GetTextEx.lpDefaultChar := nil;
  GetTextEx.lpUsedDefChar := nil;
  SendMessage(RichEdit.Handle, EM_GETTEXTEX, WPARAM(@GetTextEx), 
    LPARAM(PAnsiChar(Result)));
end;
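
If you take the first route (pull the text out as UTF-16 and convert it afterwards), the conversion step could look something like the sketch below. It is not part of the original answer: `WideToCodePage` is a hypothetical helper name, and the demo assumes code page 1253 (Windows Greek) to match the example in the question. It relies only on the Win32 `WideCharToMultiByte` function as declared in the `Windows` unit.

```pascal
program WideToCodePageDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper (not from the answer): converts UTF-16 text to the
// given ANSI code page using the Win32 WideCharToMultiByte function.
function WideToCodePage(const S: WideString; CodePage: UINT): AnsiString;
var
  Len: Integer;
begin
  Result := '';
  if S = '' then
    Exit;
  // First call: ask how many bytes the converted text needs.
  Len := WideCharToMultiByte(CodePage, 0, PWideChar(S), Length(S),
    nil, 0, nil, nil);
  if Len = 0 then
    RaiseLastOSError;
  SetLength(Result, Len);
  // Second call: perform the conversion into the allocated buffer.
  if WideCharToMultiByte(CodePage, 0, PWideChar(S), Length(S),
    PAnsiChar(Result), Len, nil, nil) = 0 then
    RaiseLastOSError;
end;

var
  W: WideString;
  A: AnsiString;
  I: Integer;
begin
  // Build two Greek letters (alpha, beta) without relying on source encoding.
  SetLength(W, 2);
  W[1] := WideChar($03B1);
  W[2] := WideChar($03B2);
  A := WideToCodePage(W, 1253); // assumed target: Windows Greek code page
  // Print the resulting byte values in hex.
  for I := 1 to Length(A) do
    Write(IntToHex(Ord(A[I]), 2), ' ');
  Writeln;
end.
```

In a real application you would pass the WideString returned by the first `GetRichEditText` overload instead of the hand-built `W`.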
David Heffernan
  • @TLama Are you sure that `RICHEDIT` does not support Unicode? – David Heffernan Apr 01 '13 at 11:15
  • By [`this description`](http://msdn.microsoft.com/en-us/library/windows/desktop/bb787873(v=vs.85).aspx#_win32_Rich_Edit_Version_1.0) at least editing shouldn't support Unicode, but I haven't verified this. – TLama Apr 01 '13 at 11:24
  • @TLama Your link seems to state this for RichEdit control 1.0, i.e. the one included in Windows 98/Me, AFAIR. Good old days... From my experiments, the RichEdit component has been Unicode for a long time. – Arnaud Bouchez Apr 01 '13 at 15:31
  • +1 Note that the function returning an AnsiString will work with Delphi < 2009, but with a Unicode version of Delphi it may be confusing, since AnsiString there carries the system-wide code page from the non-Unicode applications setting. This is not the case for Delphi 7, so the answer is perfect for what the OP asked. Nice! – Arnaud Bouchez Apr 01 '13 at 15:33
  • @ArnaudBouchez Yes, this was written with pre-Unicode Delphi in mind. I guess in Unicode Delphi you'd return `RawByteString`. – David Heffernan Apr 01 '13 at 15:40
  • @DavidHeffernan - Thanks! That works like a charm! Now I also need to do the opposite - set the RichEdit text (the `????` are there again if I simply use the `.Text` property). I was thinking it would be as simple as using `EM_SETTEXTEX`, however this message and the `SETTEXTEX` structure are defined in RichEdit v.3, and my Delphi 7 RichEdit unit is v.2.0... Is there another way to set the RichEdit text properly, or do I HAVE to use RichEdit v.3? If I have to use RichEdit v.3, is it compatible with Delphi 7? Sorry if those are dumb questions, I'm not that advanced yet. Thanks! – jedivader Apr 01 '13 at 16:48
  • Did you try `EM_SETTEXTEX`? It looks so similar to `EM_GETTEXTEX` that it seems hard to imagine that your control handles one and not the other. – David Heffernan Apr 01 '13 at 17:01
  • @DavidHeffernan Yes, I tried it. It IS hard to imagine, but nonetheless `EM_SETTEXTEX` is defined in RichEdit v.3, and `EM_GETTEXTEX` is defined in RichEdit v.2. Here's the link: [`EM_SETTEXTEX`](http://msdn.microsoft.com/en-us/library/windows/desktop/bb774284(v=vs.85).aspx) - it says Rich Edit 3.0. I checked the code of my RichEdit unit, it's v.2.0, it has EM_GETTEXTEX and GETTEXTEX, but not EM_SETTEXTEX and SETTEXTEX. What should I do, any advice? – jedivader Apr 01 '13 at 18:11
  • I should think you can do it with `EM_STREAMIN` passing `SF_TEXT or SF_UNICODE`. You'd need to convert from your ANSI code page to UTF-16, and then send that wide text. – David Heffernan Apr 01 '13 at 19:08
  • You could also try `EM_SETTEXTEX`. You'd need to define it, and the struct. The versions of rich edit seem very hard to fathom. – David Heffernan Apr 01 '13 at 22:14
  • Thanks, @David - here's what I came up with: `procedure SetRichEditText(RichEdit: TRichEdit; Text: String; AnsiCodePage: UINT); const EM_SETTEXTEX = WM_USER + 97; type _settextex = packed record flags: DWORD; codepage: UINT; end; SETTEXTEX = _settextex; TSetTextEx = _settextex; var TheSetTextEx: TSetTextEx; begin TheSetTextEx.flags := GTL_DEFAULT; TheSetTextEx.codepage := AnsiCodePage; SendMessage(RichEdit.Handle, EM_SETTEXTEX, WPARAM(@TheSetTextEx), LPARAM(PWideChar(Text))); end;` Note that this requires Win 2k or later to work (won't work on Win 98). – jedivader Apr 03 '13 at 08:51
  • Just one more note - it should be `LPARAM(PChar(Result))` in `GetRichEditText` and `LPARAM(PChar(Text))` in `SetRichEditText` (typecast to PChar instead of to PWideChar), otherwise the compiler warns about `Suspicious typecast of String to PWideChar`, although everything works correctly anyway. – jedivader Apr 03 '13 at 15:14
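
For readability, the setter from the last two comments can be laid out as a small unit. This is only a sketch under a few assumptions: `RichEditTextUtils` is a made-up unit name, `EM_SETTEXTEX`, `ST_DEFAULT` and the record are declared locally because the Delphi 7 RichEdit unit reportedly lacks them, and the flags field uses `ST_DEFAULT` (the `SETTEXTEX` counterpart of the `GTL_DEFAULT` value used in the comment, both being zero). Whether the message is honoured depends on the rich edit DLL the control actually loads.

```pascal
unit RichEditTextUtils; // hypothetical unit name

interface

uses
  Windows, Messages, ComCtrls;

// Replaces the control's text with Text, interpreted in AnsiCodePage.
procedure SetRichEditText(RichEdit: TRichEdit; const Text: AnsiString;
  AnsiCodePage: UINT);

implementation

const
  // Declared here on the assumption that the RichEdit unit does not
  // provide them (values as given in the comment and the Windows SDK).
  EM_SETTEXTEX = WM_USER + 97;
  ST_DEFAULT = 0;

type
  // Mirrors the Win32 SETTEXTEX structure.
  TSetTextEx = packed record
    flags: DWORD;
    codepage: UINT;
  end;

procedure SetRichEditText(RichEdit: TRichEdit; const Text: AnsiString;
  AnsiCodePage: UINT);
var
  SetTextEx: TSetTextEx;
begin
  SetTextEx.flags := ST_DEFAULT;
  SetTextEx.codepage := AnsiCodePage;
  // With an ANSI code page, lParam points to null-terminated ANSI text.
  SendMessage(RichEdit.Handle, EM_SETTEXTEX, WPARAM(@SetTextEx),
    LPARAM(PAnsiChar(Text)));
end;

end.
```

A call such as `SetRichEditText(RichEdit1, GreekText, 1253);` would then mirror `GetRichEditText(RichEdit1, 1253)`, where `GreekText` stands for any AnsiString already encoded in that code page.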