Languages like Haskell allow you to create your own operators. The following answer explains which punctuation characters are allowed in operators: https://stackoverflow.com/a/10548541/783743

Languages like JavaScript, on the other hand, do not allow you to use punctuation characters (besides $ and _) in your variable names. [1]

I am writing a compiler which compiles a subset of Haskell to JavaScript and I don't know how to convert the operators into valid JavaScript identifiers.

Hence I decided to map each punctuation character to a lowercase basic Latin letter (i.e. a-z). For example:

& = a
| = l
@ = q
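
A minimal sketch of this idea in JavaScript, using only the three mappings listed above. The name `encodeOperator` and the choice to throw on characters without a mapping are assumptions made for illustration, not part of any standard.

```javascript
// Hypothetical table: only the three mappings from the example above.
const LETTER_FOR = { "&": "a", "|": "l", "@": "q" };

// Replace each punctuation character of an operator with its letter.
function encodeOperator(op) {
  return Array.from(op, (ch) => {
    if (!Object.prototype.hasOwnProperty.call(LETTER_FOR, ch)) {
      throw new Error("no mapping defined for " + JSON.stringify(ch));
    }
    return LETTER_FOR[ch];
  }).join("");
}

console.log(encodeOperator("&&")); // aa
console.log(encodeOperator("||")); // ll
```

Throwing on unmapped characters keeps the sketch from silently producing an invalid identifier while the table is incomplete.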

However, instead of deciding the character mapping myself, I first want to know whether anybody else has already done the same thing, or whether there is a standard that defines how to map them.

I realize that this question could become primarily opinion based (which for some reason is strictly disallowed on StackOverflow). Hence I'm only looking for canonical answers which state definitively that "this is the way to do it" (perhaps with a link). If you want to opine then you can do so in the comments.

There are currently 19 characters which I wish to map to letters:

! # $ % & * + . / < = > ? @ \ ^ | - ~

Although $ is a valid character for identifiers in JavaScript, it would be nice to map it to a letter too.


[1] Property names can contain special characters, but that's an ugly hack.

Aadit M Shah
  • [Haskell -> JS?](http://www.haskell.org/haskellwiki/The_JavaScript_Problem#Haskell_-.3E_JS) – merlin Jun 18 '14 at 06:59
  • The question is: do you wish your js code to be human readable or not? – didierc Jun 18 '14 at 08:20
  • @didierc In my opinion `True.aa(True)` is more human readable than `True["&&"](True)`. The latter case is more descriptive but in my opinion it looks ugly. – Aadit M Shah Jun 18 '14 at 08:43
  • What I mean is: if you care about readability, of course you'll try to stick to common idioms (usage of methods rather than array selectors), but if you don't, then it might make your life simpler to use whichever way allowing a direct mapping from haskell identifiers to js ones. – didierc Jun 18 '14 at 08:51
  • @didierc Yes, I do want the generated code to be readable. I would like people to be able to understand the generated code and integrate it with their JavaScript applications. – Aadit M Shah Jun 18 '14 at 09:26
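
The two call styles compared in these comments can be tried side by side. This is only a sketch: the `Bool` class and the way `&&` is attached to its prototype are hypothetical and say nothing about how a real Haskell-to-JavaScript compiler represents values.

```javascript
// Hypothetical runtime representation of a Haskell Bool, for illustration only.
class Bool {
  constructor(value) {
    this.value = value;
  }
}
const True = new Bool(true);
const False = new Bool(false);

// Haskell's (&&): True && x = x; False && _ = False.
function and(other) {
  return this.value ? other : False;
}

Bool.prototype["&&"] = and; // raw operator name: call sites need bracket access
Bool.prototype.aa = and;    // letter-encoded name: call sites can use dot access

console.log(True["&&"](True) === True); // true
console.log(True.aa(False) === False);  // true
```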

1 Answer

GHC uses what it calls Z-encoding. For example, >>= is encoded as zgzgze. See https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/SymbolNames
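
A rough JavaScript sketch of the idea, covering only part of the table. The codes below are the ones I recall from the GHC commentary linked above, so check them against that page before relying on them. The name `zEncode` is made up for this example, and characters outside the table (which GHC encodes as hex escapes) are simply rejected here.

```javascript
// Subset of GHC's Z-encoding table for operator characters.
// Characters outside this table are not handled by this sketch.
const Z_CODES = {
  "&": "za", "|": "zb", "^": "zc", "$": "zd", "=": "ze", ">": "zg",
  "#": "zh", ".": "zi", "<": "zl", "-": "zm", "!": "zn", "+": "zp",
  "\\": "zr", "/": "zs", "*": "zt", "%": "zv",
  // The escape letters themselves are doubled so decoding stays unambiguous.
  "z": "zz", "Z": "ZZ",
};

function zEncode(name) {
  return Array.from(name, (ch) => {
    if (Object.prototype.hasOwnProperty.call(Z_CODES, ch)) return Z_CODES[ch];
    if (/^[A-Za-z0-9]$/.test(ch)) return ch; // plain alphanumerics pass through
    throw new Error("character not covered by this sketch: " + JSON.stringify(ch));
  }).join("");
}

console.log(zEncode(">>=")); // zgzgze
console.log(zEncode("gge")); // gge
```

Because `z` is itself escaped, an ordinary name such as `gge` can never produce the same output as an operator.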

Twan van Laarhoven
  • I appreciate the fact that you found out what GHC officially does. Hence +1. Nevertheless, expanding punctuation characters to two-character codes doubles the size of operators. When readability and understandability count, that is unacceptable. – Aadit M Shah Jun 18 '14 at 09:30
  • The reason for expanding to two characters is to be completely unambiguous. You wouldn't want a function `gge` to conflict with the `>>=` operator. If you know that names don't mix symbols and letters, then you can get away with only an operator marker at the start of the name, say `op_gge`. – Twan van Laarhoven Jun 18 '14 at 10:06
  • True. I was thinking along the lines of simply converting `&&` to `aa`. However if there's already a function named `aa` then I would compile it to `$aa`. Since `$` is not a valid character in varsyms in Haskell and `$` is allowed in identifiers in JavaScript this would resolve all ambiguities, while also keeping the length of the symbol to a minimum. – Aadit M Shah Jun 18 '14 at 10:09
  • But if the `$aa` symbol is already taken, you'll have to find another way. C simply prepends any symbol with an underscore, but the same problem arises, though the standard used to discourage that usage for anything other than system/compiler code. You don't really have that luxury. – didierc Jun 18 '14 at 11:11
  • @didierc The `$aa` symbol can never be taken because Haskell doesn't allow the `$` in varsyms. The compiled JavaScript code will be namespaced. Hence it wouldn't cause any naming conflicts there either. – Aadit M Shah Jun 19 '14 at 03:05
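
The `$`-prefix fallback discussed in this thread could be sketched as below. The function name `resolveName` and the `taken` set are hypothetical; the sketch assumes, as the comments do, that no name coming from an ordinary Haskell identifier can start with `$`.

```javascript
// encoded: the letter-encoded operator name, e.g. "aa" for "&&".
// taken:   Set of JavaScript names already produced from ordinary Haskell identifiers.
function resolveName(encoded, taken) {
  // Fall back to a "$" prefix only when the short name would collide.
  return taken.has(encoded) ? "$" + encoded : encoded;
}

console.log(resolveName("aa", new Set()));       // aa
console.log(resolveName("aa", new Set(["aa"]))); // $aa
```

This keeps the common case short and only pays for the extra character when a collision actually occurs.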