
Combining left-to-right (LTR) and right-to-left (RTL) text with `paste` can produce an unexpected display order in the result:

(x = paste(c('green', 'أحمر', 'أزرق'), collapse=' ')) # Arabic for red and blue
#> [1] "green أحمر أزرق"
paste(x, 'yellow')
#> [1] "green أحمر أزرق yellow"
paste(x, 123)
#> [1] "green أحمر أزرق 123"

Is there any known solution to this - i.e. a way to ensure concatenation in the same sequence as the arguments are given? Perhaps the answer is don't concatenate different alphabets!

Henrik
geotheory
  • Even typing `x = paste(c('green',, 123, collapse=' '))` in a text editor gives me a similar issue. I can't even format it correctly here... – CPak Jul 14 '17 at 15:51
  • I'm unsure, but this is due to numbers being always encoded as LTR (left-to-right), therefore it's `123`. But in case of Arabic it's added on the left of the Arabic text - therefore later in the Arabic text. – m0nhawk Jul 14 '17 at 16:02

1 Answer


You can use the Unicode control character 'left-to-right embedding', U+202A ("Treat the following text as embedded left-to-right"):

paste(x, "\u202A", 123)
# [1] "green أحمر أزرق ‭ 123"

See also Terminating Explicit Directional Embeddings and Overrides (U+202C, 'pop directional formatting') and the thorough description in the Unicode Bidirectional Algorithm.
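As a rough sketch of how the embedding could be opened and then terminated again, the snippet below wraps a value in U+202A ... U+202C before pasting. The helper name `embed_ltr` is hypothetical (it is not part of base R), and the sketch assumes a UTF-8 session; how the result is drawn still depends on the console or editor doing the rendering.

```r
# Unicode directional formatting characters
LRE <- "\u202A"  # LEFT-TO-RIGHT EMBEDDING: open an LTR embedding
PDF <- "\u202C"  # POP DIRECTIONAL FORMATTING: close the most recent embedding

# Hypothetical helper (an assumption, not a base R function):
# wrap a value so the embedding does not leak into text pasted after it
embed_ltr <- function(s) paste0(LRE, s, PDF)

x <- paste(c("green", "أحمر", "أزرق"), collapse = " ")
y <- paste(x, embed_ltr(123))
y

# paste() always stores the characters in argument order; only the display
# is reordered. Inspecting the trailing code points confirms this:
tail(utf8ToInt(y), 5)
#> [1] 8234   49   50   51 8236
```

Closing the embedding with U+202C keeps the control character's effect limited to the wrapped value, which may matter if more text is appended to `y` later.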

Henrik