Can the keywords (and standard library) of C++ be localised? (Modifiable parser syntax)

Question

Hang on, this will be long! I'll need to explain a few things before asking my questions.

According to the C++ standard (and as described in this question and its answers), a compiler should support Unicode (and even more precisely UTF-8 in source) in the names of identifiers (variables, functions, etc.) I know that Clang supports that fully (I mean you can use UTF-8 encoded source files) and GCC supports it only if you use \u codes in the identifiers, but let's assume we live in a perfect world where this works properly on all compilers.

That is great! Now I no longer have to write my code in English and can finally do it in my native Bulgarian, or maybe Esperanto. That's the point of this requirement of the standard, after all. Except there is still a huge problem with that. Let's see some (not really meaningful) code:

First using identifiers in English (ASCII):

int i = 0;
while(i < 100)
{
    auto f = static_cast<float>(i);
    std::string currentName = "name_" + toString(f);
    std::cout << getPrettyName(currentName) << ": " << getSalary(currentName) << std::endl;
}

And then using identifiers in Bulgarian (as it shows the problem very clearly):

int и = 0;
while(и < 100)
{
    auto д = static_cast<float>(и);
    std::string текущоИме = "име_" + превърниВНиз(д);
    std::cout << красивоИме(текущоИме) << ": " << заплата(текущоИме) << std::endl;
}

As you can see, the second code is still mainly in English because of keywords and the standard library. There are two problems with that:

It doesn't help non-English-speaking Bulgarians understand the code (assuming they do not know C++ that well). They still have to know English to be proper programmers, and isn't that part of the point of this whole thing?
What is actually worse, at least for me, is that this is very annoying to write. Those of you who speak a language whose alphabet is not based on the Latin script know that, to write in a different alphabet, you have to switch the keyboard layout (most people use Alt+Shift). I had to switch the layout 4 times to write each line. This is very annoying and slow.

This applies to all languages that are not written in the Latin script: Chinese, Arabic, Russian, Hindi, …

The obvious solution (at least for me) is that the C++ language should support localised keywords (and standard library classes) in order for this whole Unicode-identifiers thing to make any sense. That has been done for ALGOL 68 and possibly others, and there are other more modern examples in the same article. That way the code in Bulgarian would look better and be much easier to write (I don't claim that the Bulgarian words used must be exactly these):

цяло и = 0;
докато(и < 100)
{
    авт д = статично_преобр<дробно>(и);
    стд::низ текущоИме = "име_" + превърниВНиз(д);
    стд::изх << красивоИме(текущоИме) << ": " << заплата(текущоИме) << стд::кред;
}

So, on to the questions:

Is this actually allowed/possible according to the standard right now? I may be missing something…
Is there any way to make a workaround in a decent way myself? Macros would work for the keywords but that would be awful. using would work for standard library classes (namespace стд { using низ = std::string; }) but there is no way to deal with methods (std::string::size() -> размер()?) apart from subclassing… or is there?
In case that is not possible or even considered, how should one go about suggesting this idea to the C++ gurus that make the standard?

Just to be clear, I don't mean that there should be different versions of C++ for the different languages — more like that it should be possible for it to support all at once via some setting or include or whatever, if needed.

You could write your own scripting language to achieve this: https://accu.org/index.php/journals/2252 — doctorlove, Aug 22 '16 at 10:59
I'd say you should stick to english for variables etc as well, so your code can be broadly understood. You just have to accept that learning english is pretty much a requirement for programming in C++. — Jesper Juhl, Aug 22 '16 at 11:08
actually in Japanese you don't need to change keyboard layout at all. After typing the sentence/word just press space and the IME will convert that to the correct form automatically for you, or you can specify it manually if needed — phuclv, Aug 22 '16 at 11:13
If one wants to follow programming or any significant scientific related career, they should learn English, otherwise it'll be difficult even in finding and reading documents — phuclv, Aug 22 '16 at 11:30

Basile Starynkevitch · Answer 1 · 2016-08-22T12:12:05.057

8

No, keywords are fixed in the C++ standard (C++11, C++14, etc.). You cannot change them (or else the language won't be C++ anymore).

You might use preprocessor tricks like:

#define стд std

(or, as you commented, an alias; note that the valid spelling for a namespace is namespace стд = std;, because using стд = std; only works for types. Proper keywords like while can be "replaced" only with the preprocessor.) But I am not sure that it works, and I really believe it is a very bad idea.
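To make the trick concrete, here is a minimal sketch of what such a "localisation header" could look like. It assumes a compiler that accepts UTF-8 identifiers in source (Clang, or GCC 10 and later). Every Bulgarian name in it (цяло, докато, върни, сума_до, поздрав) is a hypothetical choice made for illustration, not something any standard or library defines.

```cpp
#include <string>

// Hypothetical localisation layer; none of these names are standard.

// Keywords can only be renamed by the preprocessor.
#define цяло   int
#define докато while
#define върни  return

// A namespace can be renamed with a namespace alias, no macro needed.
namespace стд = std;

// Sums the integers 0 .. граница-1 using the macro "keywords".
цяло сума_до(цяло граница)
{
    цяло и = 0;
    цяло сума = 0;
    докато (и < граница)
    {
        сума += и;
        ++и;
    }
    върни сума;
}

// стд::string is plain std::string reached through the alias.
стд::string поздрав()
{
    върни "здравей";
}
```

After preprocessing this is ordinary C++, which is exactly why it confuses readers: the compiler, the debugger and every error message still speak in terms of int, while and return.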

A C++ programmer is expecting the names mentioned in the standard. Don't confuse him.

And programming is not about coding in a near-natural language (that was the ambition of COBOL, which failed completely in that respect). The point is that programming is difficult, so it takes ten years to learn; you can therefore expect programmers to be able to use English-looking keywords and read technical documentation in English.

edited Aug 22 '16 at 12:12

answered Aug 22 '16 at 10:58


`using стд = std;` is better than that. As I said macros are an awful solution but I meant macros for keywords as `using` won't work for that. My idea is that maybe the standard should allow that. If it does, "a C++ programmer" will recognise them, because they will be in the standard. Plus, the whole point of this is for people that don't want to use English in the code. If I'll use Bulgarian names for identifiers, some person that doesn't speak Bulgarian will already not be able to understand the code, even though the keywords are in English. – Lyubomir Vasilev Aug 22 '16 at 11:05

@LyubomirVasilev, if you use Bulgarian for your identifiers, you will be promptly fired from any programming job, even in most Bulgarian companies. Because most companies either work with people/companies abroad and need them to be able to make sense of their code, or expect to do so one day. – Jan Hudec Aug 22 '16 at 11:13

Jan Hudec · Answer 2 · 2016-08-22T12:22:20.997

That is great! Now I no longer have to write my code in English and can finally do it in my native Bulgarian, or maybe Esperanto. That's the point of this requirement of the standard, after all.

I am pretty sure it isn't. The point of the standard seems to be purely compatibility with other programming systems that may generate such symbols. After all, the specification does not require accepting actual UTF-8 anywhere. The only thing it requires is the \u escapes, which GCC supports.
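As a small illustration of what the \u escapes mean, the sketch below declares a parameter with the escape and uses it with the literal character. It assumes a compiler that accepts both spellings in identifiers (Clang, or GCC 10 and later); the function name удвои is made up for the example.

```cpp
// \u0438 is CYRILLIC SMALL LETTER I, so the parameter declared with the
// escape and the name used in the body are one and the same identifier.
int удвои(int \u0438)
{
    return и * 2;
}
```

A compiler that only supports the escape form would need every identifier written this way, which is hardly something a person would type by hand.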

Is this actually allowed/possible according to the standard right now? I may be missing something…

No, it isn't. The specification specifies the exact symbol names.

Is there any way to make a workaround in a decent way myself? Macros would work for the keywords but that would be awful. using would work for standard library classes (namespace стд { using низ = std::string; }) but there is no way to deal with methods (std::string::size() -> размер()) apart from subclassing… or is there?

You could cover them with #defines, but obviously a macro applies to the same name everywhere it appears, which is rarely appropriate.
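A short sketch of why that is a problem, again assuming a compiler that accepts UTF-8 identifiers. The macro размер and the helper names (дължина_на_име, брой, Кутия, прочети) are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace стд { using низ = std::string; }  // type alias from the question

// Object-like macro: every token `размер` becomes `size`, in any context.
#define размер size

std::size_t дължина_на_име(const стд::низ& име)
{
    return име.размер();   // expands to име.size()
}

std::size_t брой(const std::vector<int>& в)
{
    return в.размер();     // the same macro also applies to std::vector
}

// Pitfall: after preprocessing, this member is really named `size`.
struct Кутия { int размер; };

int прочети(const Кутия& к)
{
    return к.size;         // compiles, because the member was renamed too
}
```

The macro cannot tell a method call from an unrelated variable or member, so any code that happens to use the word размер is silently rewritten.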

In case that is not possible or even considered, how should one go about suggesting this idea to the C++ gurus that make the standard?

Forget it. It is an extremely bad, borderline evil, idea. Remember that most code out there is, or one day will be, maintained, or at least reviewed, by somebody on the other side of the world who has a different native language. English makes that possible. Switching away from it would be very, very bad, at least for the big software companies, and keep in mind that the key people on the C++ standards committee represent big software companies.

@LyubomirVasilev, if you ask about whether it makes sense to suggest this idea to the C++ committee, politics *must* be discussed, because the C++ committee decision *will* be pretty much a *political* one. — Jan Hudec, Aug 22 '16 at 11:45
