7

Let me explain. Suppose I want to teach Python to someone who only speaks Spanish. As you know, in most programming languages all keywords are in English. How complex would it be to create a program that will find all keywords in a given source code and translate them? Would I need to use a parser and stuff, or will a couple of regexes and string functions be enough?

If it depends on the source programming language, then Python and JavaScript would be the most important.

By "how complex would it be" I mean: would it be enough to have a list of keywords and scan the source code for keywords that are not inside quotes? Or are there enough syntactic quirks that something more complicated is required?

JasonMArcher
Javier
  • 9
    If the goal is to actually learn the language then what you're proposing would be counterproductive. – Azeem.Butt Oct 31 '09 at 02:46
  • It doesn't really matter. I first thought of the idea to actually teach someone, but later discarded that. It's just for fun. Anyway, I could still use the 'translated' language for teaching basic programming concepts. – Javier Oct 31 '09 at 02:48
  • I remember there were some serious attempts to translate the BASIC language to Spanish, French etc., IIRC something like "para" was the Spanish "for" statement, unfortunately I can't find any references to this on the web - but anyway, it never caught on. – Artelius Oct 31 '09 at 02:58
  • 1
    Is it that difficult to teach simple English keywords like `for`, `while`, `function`? Besides, these are **keywords**. If they ever one day need to google up for help on certain programming topics, they'll find themselves not being able to understand what others are writing. – mauris Oct 31 '09 at 03:00
  • 1
    I've seen a program written with such a translated programming language years ago. If I remember correctly it was in Basic in French. I am a native speaker of French and I can tell you that even if such a "translated" language still existed today I wouldn't use it. Getting help on the internet is much harder if nobody understands your code. – Marcel Gosselin Oct 31 '09 at 03:04
  • Is the English language Turing-complete? – yfeldblum Oct 31 '09 at 03:12
  • 1
    There are several extant SO questions concerning programming languages in non-English human languages. In particular there is http://stackoverflow.com/questions/202723/coding-in-other-spoken-languages , but also http://stackoverflow.com/questions/440052/should-identifiers-and-comments-be-always-in-english-or-in-the-native-language-of and http://stackoverflow.com/questions/250824/do-you-use-another-language-instead-of-english – dmckee --- ex-moderator kitten Oct 31 '09 at 04:05

9 Answers

8

If all you want is to translate keywords, then (while you definitely DO need a proper parser, as otherwise avoiding any change in strings, comments &c becomes a nightmare) the task is quite simple. For example, since you mentioned Python:

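# Python 2 code: relies on cStringIO and on `print` being a keyword.
import cStringIO
import keyword
import token
import tokenize

samp = '''\
for x in range(8):
  if x%2:
    y = x
    while y>0:
      print y,
      y -= 3
    print
'''

# Very partial Italian translation table.
translate = {'for': 'per', 'if': 'se', 'while': 'mentre', 'print': 'stampa'}

def toks(tokens):
  # Swap keyword NAME tokens for their translation; pass everything else through.
  for tt, ts, src, erc, ll in tokens:
    if tt == token.NAME and keyword.iskeyword(ts):
      ts = translate.get(ts, ts)
    yield tt, ts

def main():
  rl = cStringIO.StringIO(samp).readline
  toki = toks(tokenize.generate_tokens(rl))
  # Only (type, string) pairs are passed on, so untokenize regenerates the spacing.
  print tokenize.untokenize(toki)

main()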

I hope it's obvious how to generalize this to "translate" any Python source and in any language (I'm supplying only a very partial Italian keyword translation dict). This emits:

per x in range (8 ):
  se x %2 :
    y =x 
    mentre y >0 :
      stampa y ,
      y -=3 
    stampa 

(strange though correct whitespace, but that could be easily enough remedied). As an Italian speaker I can tell you this is terrible to read, but that's par for the course for any "programming language translation" as you desire. Worse, NON-keywords such as range remain un-translated (as per your specs) -- of course, you don't have to constrain your translation to keywords-only (it's easy enough to remove the if that does that above;-).
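
For readers on Python 3, where `print` is a function rather than a keyword and `cStringIO` no longer exists, the same tokenizer-based idea can be sketched roughly as follows. This is an illustration and not part of the original answer: the `translate_keywords` helper and the partial Spanish table are assumptions made for the example. It splices replacements in by token position instead of calling `untokenize`, so the original spacing is kept.

```python
import io
import keyword
import token
import tokenize

# Hypothetical, partial Spanish table; the word choices are illustrative only.
SPANISH = {"for": "para", "in": "en", "if": "si", "while": "mientras"}


def translate_keywords(source, mapping):
    """Return `source` with keyword tokens replaced according to `mapping`."""
    lines = io.StringIO(source).readlines()
    readline = io.StringIO(source).readline

    # Collect (row, start_col, end_col, replacement) for every keyword token.
    # Strings and comments are separate token types, so they are never touched.
    edits = []
    for tok in tokenize.generate_tokens(readline):
        if (tok.type == token.NAME
                and keyword.iskeyword(tok.string)
                and tok.string in mapping):
            row, start_col = tok.start
            edits.append((row - 1, start_col, tok.end[1], mapping[tok.string]))

    # Apply from the end backwards so earlier column offsets stay valid.
    for row, start_col, end_col, new in sorted(edits, reverse=True):
        line = lines[row]
        lines[row] = line[:start_col] + new + line[end_col:]
    return "".join(lines)


sample = '''\
for x in range(8):
    if x % 2:
        label = "for if while"  # for: not translated
        while x > 0:
            x -= 3
'''

print(translate_keywords(sample, SPANISH))
```

The string literal and the comment on the third line come out unchanged, which is exactly the case a plain regex over a keyword list would get wrong.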

Alex Martelli
0

The problem you will encounter is that, unless you have strict coding standards, people will not necessarily follow a consistent pattern in how they write code. And in any dynamic language, code passed to an eval function will contain keywords inside quotes.
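
To make the eval point concrete, here is a small Python 3 sketch (an illustration added here, not code from this answer). It shows that a tokenizer sees code handed to `exec` as one opaque string token, so a keyword translator working on tokens would leave the `for` inside it alone:

```python
import io
import token
import tokenize

# Code that builds and runs more code from a string.
source = 'exec("for i in range(3): print(i)")\n'

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Identifier-like tokens and string tokens, as the tokenizer sees them.
names = [t.string for t in tokens if t.type == token.NAME]
strings = [t.string for t in tokens if t.type == token.STRING]

print(names)    # only `exec` is visible as a name
print(strings)  # the inner `for` is hidden inside this one string token
```

Whether the inner code should be translated as well depends on what ends up evaluating it, which a purely textual tool cannot know.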

If you are trying to teach a language, you could create a DSL with Spanish keywords, so that you can teach in your own language, and have it processed by Python or JavaScript. You would basically be making your own language, with just the constructs you want for teaching.

Once they understand how to program, they will then need to start learning languages with the "English" keywords, so that they can communicate with others, but that could come after they understand how to program, if it would make your life easier.

So, to answer your question: there is enough syntactic weirdness that translating the keywords would be considerably more complicated than a simple search-and-replace.

James Black
0

This is not an optimistic answer nor a great one. However, I feel it has some merit.

I can speak about C#, and there the translation is not worth it. Here are the reasons:

  1. C# is based on English but it is not English literature per se. For example, what would "var" or "int" be in Spanish?
  2. It is possible to create a program to let you use Spanish words in place of English keywords like "for", "in" and "as". However, some Spanish equivalent words may be compound words (two words instead of one, dealing with space can get tricky) or an English keyword may not have a direct Spanish equivalent.
  3. Debugging may get tricky. Converting back and forth between English and Spanish has "loaded with bugs" written all over it.
  4. The user will not have the benefit of existing learning resources. All C# code examples are written the way Microsoft designed the language. No one will Spanish-ize the syntax just for the few users who will use your app.


I have seen a few people discuss C# code in languages other than English. In all cases the authors explain the code in their native language but write it in English-looking code, as it naturally is. The best approach seems to be to learn enough English to be comfortable with C# as it naturally is.

Phil
  • I agree with this. The important parts are the documentation and the discussion, not the 20 or so keywords. I mean, most keywords aren't even English words (`method` is Greek, `routine` is French (I think), `function` is Latin, lambda isn't even a word, just a letter spelled out). And what kind of word is `=~`??? And even *if* they are English words, they usually don't mean what they mean in English anyway. `Yield` is a good example. Heck, in most programming languages `yield` doesn't even mean what it normally means in computer science, plus it means *different* things in every language. – Jörg W Mittag Oct 31 '09 at 21:30
0

It would be impossible to make a translation that would handle every case. Take for example this JavaScript code:

var x = Math.random() < 0.5 ? window : { location : { href : '' } };
var y = x.location.href;

The x variable can either become a reference to the window object, or a reference to the newly created object. It would only make sense to translate the members if it's the window object, otherwise you would have to translate the variable names too, which would be a mess and could easily cause problems.

Besides, it's not really useful to know a language in the wrong language. All the documentation and examples out there are going to be in the original language, so they would be useless.

Guffa
0

Keep in mind that English is the de facto language for tokens in commonly used programming languages. So, for purely educational purposes, teaching in a translated language can be harmful to your student(s). But if you really want to translate a computer language's tokens, you should consider the following issues:

  • You should translate the language's primitive constructs. This is easy: you have to learn and use a basic parser generator like yacc or ANTLR.
  • You should translate the language's APIs. This can be painful and difficult: first, modern APIs like Java's are very extensive; second, you have to translate the APIs' documentation... enough said.
JPCF
0

While I don't have an answer to the question, I think it's an interesting one. It brings up some issues which I have been thinking about:

  • As developing countries start introducing their population to higher technologies, naturally some will be interested in learning to program. Will English-only programming languages be an impediment?

  • Let's say a programming language was developed in a non-English part of the world: the keywords were written in the native language for that area and it used the native punctuation (e.g., «» instead of " ", a comma as the decimal point (123,45), and so forth). It's a fantastic programming language, generating lots of buzz. Do you think it would see widespread adoption? Would you use it?

Most English-speaking people answer "no" to the first question. Even non-English (but educated) people answer no. But they also answer "no" to the second question, which seems to be a contradiction.

Barry Brown
0

At one point I was thinking about something like this for bash scripts, but the idea can be implemented in other languages too:

#!/bin/bash

PrintOnScreen() {
    echo "$1 $2 $3 $4 $5 $6 $7 $8 $9"
}
PrintOnScreenWithoutNewline() {
    echo -n "$1 $2 $3 $4 $5 $6 $7 $8 $9"
}
MathAdd() {
    expr $1 + $2
}

With the above saved as HumanLanguage.sh, we can then source it from another script:

#!/bin/bash
. HumanLanguage.sh
PrintOnScreen Hello
PrintOnScreenWithoutNewline "Some number:"
MathAdd 2 3

This will produce:

Hello
Some number: 5
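
The same wrapper idea carries over to Python, where built-in functions are ordinary objects that can be bound to new names. A minimal sketch, assuming hypothetical Spanish names (`imprimir`, `rango`, `sumar`) chosen only for illustration:

```python
# Hypothetical Spanish aliases for built-ins; the names are illustrative only.
imprimir = print
rango = range


def sumar(a, b):
    """Add two numbers (a counterpart of MathAdd in the bash version)."""
    return a + b


# Functions can be renamed this way, but keywords such as `for` cannot.
for i in rango(3):
    imprimir(i, sumar(i, 10))
```

This covers library names only; translating the keywords themselves still needs a source-to-source step.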
lauriys
0

You might find Perl's Lingua::Romana::Perligata interesting: it allows you to write your Perl programs in Latin. It's not quite the same as your idea, as it essentially restructures the language semantics around Latin ideas, rather than just translating the strings.

Andrew Aylett
0

It is relatively easy to translate the keywords of a programming language into another natural language. There are several non-English-based programming languages, including Chinese Python, which replaces English keywords with Chinese keywords.

It would be much more difficult to translate each individual variable name from English into another natural language. If two different English variable names had only one translation in another language, there would be a name collision.
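
The collision problem can be checked mechanically before any renaming is applied. A minimal Python sketch, assuming a hypothetical `find_collisions` helper and a made-up English-to-Spanish name table:

```python
from collections import defaultdict


def find_collisions(mapping):
    """Return {translated_name: [original names]} for translations used more than once."""
    by_target = defaultdict(list)
    for original, translated in mapping.items():
        by_target[translated].append(original)
    return {t: names for t, names in by_target.items() if len(names) > 1}


# Illustrative table only: "house" and "home" both map to "casa".
names = {"house": "casa", "home": "casa", "count": "cuenta"}

print(find_collisions(names))
```

Any non-empty result means two distinct variables would end up sharing one name, so the translator would have to disambiguate them (for example with a suffix) or refuse to rename them.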

Anderson Green