I have a Unicode string that may contain characters from a right-to-left language such as Arabic or Hebrew, but may also contain text from left-to-right languages. I need to know which end of the string is the start, and in which direction to step when walking through it from beginning to end, depending on the language it contains. Is there a standard way of dealing with this?

TMemo appears to handle this the way I want. When I paste some Hebrew text into a TMemo, the caret moves in the opposite direction to the arrow key I press. I can even mix English and Hebrew text in the same memo, and the direction the caret moves depends on whether it is within an English or a Hebrew section of the text. I'd like to replicate this behaviour. I looked through the Delphi source, including FMX.Memo and FMX.Text, but couldn't find the code responsible. I suspect the code that handles this may be hidden in a DLL. I could write my own code containing a list of all right-to-left Unicode characters and use it to test whether a character is RTL or LTR, but I'd rather use code that already exists if possible. Can anyone point me in the right direction?

I do know about the Unicode right-to-left mark (RLM), an invisible character used to mark a section of RTL text, but I don't think TMemo relies on it. The Hebrew text I'm pasting into the TMemo doesn't contain RLM or any other invisible character.

XylemFlow
  • AFAIK, RTL vs LTR only applies to the *display* of text, not the *processing* of text in memory. Unicode characters only go in one direction, so you should always be able to iterate your strings from beginning to end. Note that Unicode has special codepoints for denoting when text switches between LTR and RTL within a string. – Remy Lebeau Dec 07 '21 at 16:54
  • AFAIK, FMX itself still has no support for BiDi processing. Whatever behavior you are seeing in the UI is likely being handled by the underlying OS, not by FMX. – Remy Lebeau Dec 07 '21 at 16:59
  • The UI bidi rules can be found [here](http://www.unicode.org/reports/tr9/) - this explains how the `TMemo` can automatically switch context based on the text the cursor finds itself within. Characters are considered to have a strong, weak, or neutral bidirectional affinity. Strong characters force the context to change, weak characters may not, and neutral characters have no effect. You can also switch the context between LTR and RTL with explicit directional formatting characters if the surrounding text isn't enough. As Remy says, though, this doesn't affect the layout of the string in memory. – J... Dec 07 '21 at 17:07
  • I've discovered a way to know if a string contains RTL text. Create a TTextLayout and set the Text property to the string. Then use TTextLayout.RegionForRange with TTextRange pointing at just the first character (Pos=0, Length=1). If the resulting region has Left=0 then you have LTR text, otherwise you have RTL text. However, this only works if there's more than one character in the string. – XylemFlow Dec 10 '21 at 17:49
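
To make the strong/weak/neutral idea from the comments concrete, here is a minimal sketch in Python rather than Delphi, using only the standard `unicodedata` module. It is not what TMemo or the OS does internally, and it is far from a full implementation of the Unicode Bidirectional Algorithm (UAX #9). The helper names `char_direction`, `base_direction`, and `direction_runs` are hypothetical, and attaching weak and neutral characters to the preceding run is a simplifying assumption. The string is always walked in logical (memory) order; only the reported direction changes.

```python
import unicodedata

# Bidi classes that UAX #9 treats as "strong":
# L = left-to-right, R = right-to-left, AL = Arabic letter (right-to-left).
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def char_direction(ch):
    """Return 'LTR' or 'RTL' for a strong character, None for weak/neutral."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in STRONG_LTR:
        return "LTR"
    if bidi_class in STRONG_RTL:
        return "RTL"
    return None


def base_direction(text, default="LTR"):
    """Direction of the first strong character, or `default` if there is none.

    This is a simplified form of the first-strong heuristic in UAX #9.
    """
    for ch in text:
        direction = char_direction(ch)
        if direction is not None:
            return direction
    return default


def direction_runs(text, default="LTR"):
    """Split text, in logical order, into (direction, substring) runs.

    Assumption: weak and neutral characters (spaces, digits, punctuation)
    simply stay with the preceding run, unlike the full algorithm.
    """
    runs = []
    current = base_direction(text, default)
    start = 0
    for i, ch in enumerate(text):
        direction = char_direction(ch)
        if direction is not None and direction != current:
            if i > start:
                runs.append((current, text[start:i]))
            current = direction
            start = i
    if start < len(text):
        runs.append((current, text[start:]))
    return runs
```

Unlike the `RegionForRange` trick, this works on a single character too, because it looks at each character's bidi class instead of its laid-out position. A caret implementation could use the run a position falls in to decide which way an arrow key should move, though real editors also have to handle digits, punctuation, and explicit formatting characters as the specification describes.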

0 Answers