
The title is almost self-explanatory: how can I use Aramaic/Syriac script for C/C++ variable identifiers?

Example: char *ܫܠܡܐ = "Hello World";

I tried the -fextended-identifiers compiler flag, but it didn't work.

I'm asking because Visual Studio allows me to do this, even with #define identifiers, but I want to do the same on Linux with GCC. What should I do?

EDIT: This question was marked as duplicate of an older question but this is not a duplicate on the basis that I'm using a different language/alphabet system rather than symbols. Secondly, this question pertains more to gcc for C rather than g++.

Nergal
  • I don't know whether GCC supports this as an implementation-defined extension but I do know for certain that it is not supported by standard C++. Are you sure that the benefits you gain by using those identifiers outweigh the cost of being non-portable? As an aside, using non-English identifiers will make your code practically unreadable by the majority of programmers worldwide, which will further decrease “portability”. – 5gon12eder May 23 '16 at 22:16
  • Are you trying to implement the Old Testament in C? Because it feels like a contemporary language? Hm... In the beginning there was `void`, and the `void` was `main`. And the Lord said, let there be `int`, and the `int` was `EXIT_SUCCESS`. – Kerrek SB May 23 '16 at 22:24
  • @5gon12eder Yes, all of my employees speak Aramaic and have a desire and motivation to read and write Aramaic. Many of us have agreed to take this step to become more proficient in Aramaic but we cannot do so if GCC will not allow it. Secondly, does portability matter when one can use an IDE to replace all the Aramaic words with different identifiers? – Nergal May 23 '16 at 22:25
  • *"This question was marked as duplicate of an older question but this is not a duplicate on the basis that I'm using a different language/alphabet system rather than symbols."* Doesn't matter, it's either ASCII or Unicode. Ö, , ܠܡܐ , it's the same thing in this context. And gcc and g++ are basically the same software from this point of view. – Baum mit Augen May 23 '16 at 22:26
  • @KerrekSB best comment right there! – Nergal May 23 '16 at 22:26
  • Well, that's your decision then. Regarding your edit that this question is not a duplicate, I still think it is. You are using characters that GCC doesn't accept; that's what the two questions have in common. Where these characters come from is not really important. And I'm pretty sure that there is no difference between the C and C++ front ends of GCC in that regard. – 5gon12eder May 23 '16 at 22:27
  • going back to the topic, how can I do this in 2016? – Nergal May 23 '16 at 22:34
  • @Nergal: For what it's worth, [Clang is happy](http://melpon.org/wandbox/permlink/1kLTBym4QL70Od67) with it. – Kerrek SB May 23 '16 at 22:38
  • @5gon12eder: You know for certain that it is not supported? There is a _huge_ range of Unicode characters supported for identifiers. Do you have personal experience that Aramaic characters are not among those supported? – gnasher729 May 23 '16 at 22:41
  • @gnasher729 I know for certain that Aramaic is not in the C++ basic character set. – 5gon12eder May 23 '16 at 22:49
  • @5gon12eder: But it doesn't need to be. Identifiers can also consist of universal characters, not just basic characters. – Kerrek SB May 23 '16 at 23:06
  • @KerrekSB: But universal character names need to be written in the basic character set. Writing them in extended characters would be an extension, and GCC doesn't provide that. Since the linked duplicate was answered, gcc has implemented UCNs in identifiers, and those work: http://coliru.stacked-crooked.com/a/e991f23759f5c6c9 . However, I don't think that naming an identifier `\u072b\u0720\u0721\u0710` will help a human being practice their Aramaic. – rici May 23 '16 at 23:20
  • @nergal: The GCC FAQ entry linked by the answer to the duplicate question includes a sample Makefile and utility which provides what is probably the best answer to your question as of today (2016-05-23 Gregorian). For what it's worth, here's the link again: https://gcc.gnu.org/wiki/FAQ#What_is_the_status_of_adding_the_UTF-8_support_for_identifier_names_in_GCC.3F – rici May 23 '16 at 23:22
  • @rici: Well, that's Phase 1 of the translation - the mapping from source file to basic character set happens in an implementation-defined way. So it's not so much a matter of extension as of specification of the implementation. – Kerrek SB May 23 '16 at 23:32
  • @KerrekSB: Yeah, the fact that gcc can handle the characters in strings makes it harder to argue that it is under no obligation to handle them at all (although in fact it is under no obligation to accept programs encoded in UTF-8 which include characters outside the basic character set). In any case, the documentation does specify that you can use UCNs in identifiers (as required by the standard) but that you cannot yet use the UTF-8 encoding of such characters in identifiers. It's not necessary to know which character block the character comes from to interpret that documentation. – rici May 23 '16 at 23:42
  • From what Kerrek and others have said, it seems I should (temporarily) stick to Clang until GCC properly supports non-Latin identifiers beyond its current range. – Nergal May 24 '16 at 00:34
  • How do you plan to sanely handle the mix of right-to-left and left-to-right scripts? The standard display conventions really don't seem suitable for programming. Things like `ܫܠܡܐ1-3` and `3-ܫܠܡܐ1` would be semantically different even though substituting `rtlWord` for the rtl text copied from above, the first is `rtlWord1-3` and the second is `3-rtlWord1`. – supercat Feb 28 '17 at 23:00
  • It wouldn't be a mix of directions. I already have #defines for main, the data types, etc. I could do something more sane but more complex, like forking GCC, isolating the C front end, and modifying it to work with Aramaic, but that's a stretch. – Nergal Mar 04 '17 at 20:26
