I've been with Ruby for about a year now and have a language question: are symbols necessary because Ruby strings are mutable and not interned?
No.
Symbol and String are simply two different data types. String is for text, Symbol is for labels.
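
A minimal Ruby sketch of that split (the variable names and hash here are just illustrative):

# Strings carry textual data that you display and manipulate.
greeting = "Hello, world"
puts greeting.upcase           # HELLO, WORLD

# Symbols name things: hash keys, options, method names.
user = { name: "Ada", role: :admin }
user.fetch(:role)              #=> :admin

# A Symbol can name a method to invoke dynamically.
"hello".public_send(:length)   #=> 5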
In, say, Java, strings are immutable and interned.
No, they are not. They are immutable and sometimes interned, sometimes not. If Strings were always interned, then why is there a method java.lang.String.intern() which interns a String? Strings in Java are only interned if

- you call java.lang.String.intern(), or
- the String is the result of a String literal expression, or
- the String is the result of a String-typed constant value expression.

Otherwise, they are not.
So "foo" is always equal to "foo" in value and reference and its value cannot change.
Again, this is not true:
class Test {
    public static void main(String... args) {
        // Value equality: compares character content, so this is true.
        System.out.println("foo".equals(args[0]));
        // Reference equality: args[0] comes from the command line and
        // was never interned, so it is a different object from "foo".
        System.out.println("foo" == args[0]);
    }
}

Call it with

java Test foo
# true
# false
In Ruby, strings are mutable and not interned, so "a".object_id == "a".object_id will be false.
In modern Ruby, that is not necessarily true either:
# frozen_string_literal: true
"a".object_id == "a".object_id
#=> true
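
For contrast, without the pragma each occurrence of the literal still produces a fresh object:

# no frozen_string_literal pragma
"a".object_id == "a".object_id
#=> false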
If Ruby had implemented strings like Java, symbols wouldn't be necessary, right?
No. Like I said, they are different types for different use cases.
Take a look at Scala, for example, which implements "strings like Java" (in fact, in the JVM implementation of Scala, there is no separate String type; Scala's String simply is java.lang.String). Yet it also has a Symbol class.
Likewise, Clojure has not one but two datatypes like Ruby's Symbol: keywords are exactly equivalent to Ruby's Symbols, in that they evaluate to themselves and stand only for themselves, while Clojure's symbols, on the other hand, may stand for something else.
Erlang has immutable strings and atoms, which are like Clojure/Lisp symbols.
ECMAScript has immutable strings and recently added a Symbol datatype. ECMAScript Symbols are not 100% equivalent to Ruby Symbols, though, since they come with an additional guarantee: not only do they evaluate only to themselves and stand only for themselves, they are also unforgeable (meaning it is impossible to create a Symbol which is equal to another Symbol).
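
Ruby Symbols give no such guarantee; two same-named Symbols are always one and the same object, as a quick irb check shows:

:foo.equal?(:foo)                 #=> true (the very same object)
:foo.object_id == :foo.object_id  #=> true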
Note that Ruby is moving away from mutable strings:

- Ruby 2.1 optimizes the pattern 'literal string'.freeze to return a frozen string from a global string pool (see the sketch after this list).
- Ruby 2.3 introduces the # frozen_string_literal: true pragma and the --enable=frozen-string-literal feature toggle to make all string literals frozen (and pooled) by default, on a per-script (pragma) or per-process (feature toggle) basis.
- Ruby 3 is slated to switch the default for both of those to true, so that you will have to explicitly say # frozen_string_literal: false or --disable=frozen-string-literal in order to get the current behavior.
- Some later version is expected to remove support for mutable strings altogether.
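
A minimal irb sketch of the .freeze pooling from the first bullet (on Ruby 2.1 or later):

# Without .freeze: two distinct, mutable string objects.
'literal string'.object_id == 'literal string'.object_id
#=> false

# With .freeze: both literals resolve to the same frozen object
# from the global string pool.
'literal string'.freeze.object_id == 'literal string'.freeze.object_id
#=> true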