As the title says, which characters are allowed in identifiers (function, variable, and record field names)? a, ö, and ø all seem to be fine, as do ', _, and 9 as long as they are not the first character. <, $, ;, and % are not. Is it documented anywhere which ranges or blocks of Unicode characters and symbols are allowed?

Follow-up question: which characters are allowed in infix operators?

Andreas Hultgren

1 Answer

After reading the Haskell spec (which can be assumed to have influenced Elm) and the JavaScript spec, and after some trial and error, I have arrived at the following rules:

  • An identifier must begin with a character from one of these Unicode categories:
    • Uppercase letter (Lu) (modules, types)
    • Lowercase letter (Ll) (functions, variables)
    • Titlecase letter (Lt) (modules, types)
  • The rest of the characters must belong to any of the following categories:
    • Uppercase letter (Lu)
    • Lowercase letter (Ll)
    • Titlecase letter (Lt)
    • Modifier letter (Lm)
    • Other letter (Lo)
    • Decimal digit number (Nd)
    • Letter number (Nl)
    • Or be _ (except for in module names).

Technically, "Other number" (No) also seems to be accepted by the Elm compiler, but the program crashes once it has been compiled to JavaScript.
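The rules above can be sketched as a small checker. This is only an illustration in Python (not Elm), using the standard `unicodedata` module to look up each character's general category. The function name `is_valid_identifier` and the `is_module_name` flag are made up for this example, and the sketch assumes a single name segment (no dots). It follows the list above literally, so primes and "Other number" (No) characters are rejected.

```python
import unicodedata

# Categories allowed for the first character, per the rules above.
START_CATEGORIES = {"Lu", "Ll", "Lt"}
# Categories allowed for every following character.
REST_CATEGORIES = START_CATEGORIES | {"Lm", "Lo", "Nd", "Nl"}


def is_valid_identifier(name: str, is_module_name: bool = False) -> bool:
    """Check one name segment against the empirically derived rules.

    Hypothetical helper: it mirrors the list in this answer, not the
    actual Elm compiler.
    """
    if not name:
        return False
    if unicodedata.category(name[0]) not in START_CATEGORIES:
        return False
    for ch in name[1:]:
        if ch == "_":
            # Underscore is allowed after the first character,
            # except in module names.
            if is_module_name:
                return False
            continue
        if unicodedata.category(ch) not in REST_CATEGORIES:
            return False
    return True


if __name__ == "__main__":
    for candidate in ["aöø", "foo_9", "_foo", "9foo", "a$"]:
        print(candidate, is_valid_identifier(candidate))
```

Whether the first character is upper, lower, or title case decides what kind of name it is (type/module versus function/variable); the sketch does not distinguish those cases and only checks that the name is well-formed.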

I used this tool to get the ranges for each category.

Andreas Hultgren
  • Primes (') will be disallowed from 0.18 https://github.com/elm-lang/elm-platform/blob/master/upgrade-docs/0.18.md#no-more-primes – swelet Oct 26 '16 at 12:20