The use of symbol literals is not immediately clear from what I've read up on Scala. Would anyone care to share some real world uses?

Is there a particular Java idiom being covered by symbol literals? What languages have similar constructs? I'm coming from a Python background and not sure there's anything analogous in that language.

What would motivate me to use 'HelloWorld vs "HelloWorld"?

Thanks

Joe Holloway

4 Answers

In Java terms, symbols are interned strings. This means, for example, that reference equality comparison (eq in Scala and == in Java) gives the same result as normal equality comparison (== in Scala and equals in Java): 'abcd eq 'abcd will return true, while "abcd" eq "abcd" might not, depending on JVM's whims (well, it should for literals, but not for strings created dynamically in general).
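
A minimal sketch of that comparison, assuming Scala 2.13 or later. It spells the symbol as Symbol("abcd") rather than 'abcd so that it also compiles on Scala 3, and the object and value names are made up for illustration. new String(...) stands in for a string built at runtime.

```scala
object SymbolEquality {
  // Symbol("abcd") is the non-literal spelling of 'abcd.
  val literal: String = "abcd"
  // new String(...) forces a fresh object, like a dynamically created string.
  val dynamic: String = new String("abcd")

  // Two symbols with the same name, one built from the fresh string.
  val s1: Symbol = Symbol(literal)
  val s2: Symbol = Symbol(dynamic)

  def main(args: Array[String]): Unit = {
    println(s1 == s2)            // true: normal equality
    println(s1 eq s2)            // true: same interned instance
    println(literal == dynamic)  // true: same characters
    println(literal eq dynamic)  // false: two distinct String objects
  }
}
```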

Other languages which use symbols are Lisp (which uses 'abcd like Scala), Ruby (:abcd), Erlang and Prolog (abcd; they are called atoms instead of symbols).

I would use a symbol when I don't care about the structure of a string and use it purely as a name for something. For example, if I have a database table representing CDs, which includes a column named "price", I don't care that the second character in "price" is "r", or about concatenating column names; so a database library in Scala could reasonably use symbols for table and column names.
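
To make the column-name idea concrete, here is a rough sketch. It is not any real database library's API: CdTable, Row, column and the sample values are all hypothetical, and it only shows symbols used purely as names.

```scala
object CdTable {
  // Hypothetical column names for an imaginary CD table.
  val Title: Symbol = Symbol("title")
  val Price: Symbol = Symbol("price")

  // A row maps column names to values; the sample values are made up.
  type Row = Map[Symbol, Any]
  val row: Row = Map(Title -> "Example Album", Price -> BigDecimal("9.99"))

  // Look a column up by name; a missing column yields None.
  def column(r: Row, name: Symbol): Option[Any] = r.get(name)

  def main(args: Array[String]): Unit = {
    println(column(row, Price))
    println(column(row, Symbol("artist")))
  }
}
```

Nothing here ever inspects the characters of a column name, which is the point: the symbol is only an identity to look things up by.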

Alexey Romanov
  • It's probably worth remembering that == in Scala does the .equals thing, so really the difference would be when using the "eq" method which does reference equality. One bonus, though, is that comparison between symbols is extremely cheap. – Calum May 28 '09 at 13:59
  • @Calum, as is comparison between two strings. Java interns (more or less) all strings. – Elazar Leibovich Aug 24 '10 at 10:12
  • @Elazar: Is that really true? I was under the impression that Java only interned literals (i.e. almost all strings in trivial examples, and almost no strings in production software). Having said that, the use-cases of symbols are usually as literal values (I doubt you often build them from scratch), so arguably the main advantage you get is just a more descriptive type. – Calum Aug 24 '10 at 11:43
  • @Elazar Actually, it occurs to me now that even in the case of an interned Java String, you only benefit from interning when comparing two identical objects. Since the runtime cannot guarantee that two instances will both be interned, all interning yields is an early-out in the case where the items are the same. It does not bypass the requirement to check character-by-character on Strings of equal length. – Calum Aug 24 '10 at 15:30
  • @Elazar Sorry to keep updating this same answer (there's a time limit!). I've elaborated on this point in the answer to another question: http://stackoverflow.com/questions/3554362/purpose-of-scalas-symbol/3555381#3555381 . For what it's worth, though, I'm certain that Java does not intern dynamically-created Strings. – Calum Aug 24 '10 at 15:39
  • @Calum, indeed Java doesn't intern dynamically created strings. But Symbols are equivalent to Java's string literals, not to just any string. See http://javatechniques.com/public/java/docs/basics/string-equality.html – Elazar Leibovich Aug 25 '10 at 08:06
  • @Elazar They are not, because Symbols can rely upon their value being interned, whereas with Strings it's optional and can optimise some use-cases. Symbols are more like dynamically-created Enums than anything. I elaborated on this in the answer I linked in my last response; please read it for more. – Calum Aug 25 '10 at 09:30
  • So, "hello, world!" is a sequential collection of characters while 'helloWorld is a people-friendly value rather than 14392. – CW Holeman II May 17 '11 at 13:14
  • I should also add that Python has a similar concept (though not really 'first class') via the `intern` built-in. See http://docs.python.org/library/functions.html#intern and http://stackoverflow.com/questions/1136826/what-does-python-intern-do-and-when-should-it-be-used – rlotun Aug 20 '12 at 17:32
  • Just mentioning for reference: in the Lua language, all strings are interned. – akauppi Aug 14 '14 at 08:03

If you have plain strings representing, say, method names in code, and those strings perhaps get passed around, you're not quite conveying things appropriately. This is sort of the data/code boundary issue. It's not always easy to draw the line, but if we say that in this example those method names are more code than they are data, then we want something that clearly identifies them as such.

A symbol literal comes into play because it clearly differentiates any old string data from a construct being used in the code. It's really there for when you want to indicate that this isn't just some string data but is, in some way, part of the code. The idea is that things like your IDE would highlight it differently and, given the tooling, you could refactor on symbols rather than doing a text search and replace.

This link discusses it fairly well.

Saem

Note: Symbol literals will be deprecated and then removed in Scala 3 (Dotty).

Reference: http://dotty.epfl.ch/docs/reference/dropped-features/symlits.html

Because of this, I personally recommend not using Symbols anymore (at least in new Scala code). As the Dotty documentation states:

Symbol literals are no longer supported

it is recommended to use a plain string literal [...] instead
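
As a hedged sketch of what that migration might look like: the old literal appears only in a comment, since it is rejected by Scala 3, and the object and value names are made up. It assumes the scala.Symbol class itself remains available, which is my understanding of the linked page.

```scala
object SymbolMigration {
  // Scala 2 only (symbol literal, no longer supported in Scala 3):
  //   val key = 'price

  // Literal translation that keeps the Symbol type.
  val asSymbol: Symbol = Symbol("price")

  // The replacement the quoted documentation recommends: a plain string literal.
  val asString: String = "price"

  def main(args: Array[String]): Unit = {
    // Symbol#name recovers the underlying string, which helps a gradual migration.
    println(asSymbol.name == asString)  // true
  }
}
```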

juanmirocks

Python maintains an internal global table of "interned strings" with the names of all variables, functions, modules, etc. With this table, the interpreter can perform faster searches and optimizations. You can force this process with the intern function (sys.intern in Python 3).

Also, Java and Scala automatically use "interned strings" for faster searches. With Scala, you can use the intern method to force the interning of a string, but this process doesn't work with all strings. Symbols benefit from being guaranteed to be interned, so a single reference equality check is sufficient to prove either equality or inequality.
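
A small sketch of forcing interning from Scala, using the standard java.lang.String.intern method. The object and value names are illustrative only, and new String(...) again stands in for a string created at runtime.

```scala
object InternDemo {
  val literal: String = "abcd"
  // A fresh String object with the same characters as the literal.
  val dynamic: String = new String("abcd")
  // intern() returns the canonical instance from the JVM's string pool.
  val interned: String = dynamic.intern()

  def main(args: Array[String]): Unit = {
    println(dynamic eq literal)   // false: not interned automatically
    println(interned eq literal)  // true: both refer to the pooled instance
  }
}
```

The reference check is only conclusive when both sides are known to be interned, which is the guarantee symbols give you and arbitrary strings do not.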

ChemaCortes