44

It's the reverse of this question: Why can't strings be mutable in Java and .NET?

Was this choice made in Ruby only because operations (appends and such) are efficient on mutable strings, or was there some other reason?

(If it's only efficiency, that would seem peculiar, since the design of Ruby otherwise doesn't seem to put a high premium on facilitating efficient implementation.)

Seth Tisue

2 Answers

34

This is in line with Ruby's design, as you note. Immutable strings are more efficient than mutable strings - less copying, as strings are re-used - but make work harder for the programmer. It is intuitive to see strings as mutable - you can concatenate them together. To deal with this, Java silently translates concatenation (via +) of two strings into the use of a StringBuffer object, and I'm sure there are other such hacks. Ruby chooses instead to make strings mutable by default at the expense of performance.

Ruby also has a number of destructive methods such as String#upcase! that rely on strings being mutable.
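To make the difference concrete, here is a minimal sketch of a non-destructive call next to its destructive counterpart. The helper names `shout_copy` and `shout_in_place` are made up for illustration; only `String#upcase` and `String#upcase!` come from Ruby itself. `String.new` is used so the example starts from an unfrozen string regardless of any frozen-literal settings.

```ruby
# Non-destructive: returns a new String and leaves the receiver alone.
def shout_copy(text)
  text.upcase
end

# Destructive: modifies the receiver in place.
# upcase! returns nil when nothing changed, so return the receiver explicitly.
def shout_in_place(text)
  text.upcase!
  text
end

greeting = String.new("hello") # String.new returns an unfrozen string
copy = shout_copy(greeting)
puts greeting # hello
puts copy     # HELLO

same = shout_in_place(greeting)
puts greeting              # HELLO
puts same.equal?(greeting) # true (same object, modified in place)
```

The bang version only works because the receiver itself can change; on an immutable string it could not exist in this form.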

Another possible reason is that Ruby is inspired by Perl, and Perl happens to use mutable strings.

Ruby also has Symbols and frozen Strings, both of which are immutable. As an added bonus, symbols are guaranteed to be unique per possible string value.
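A small sketch of both immutable options, assuming Ruby 2.5 or later (where mutating a frozen object raises `FrozenError`; older versions raise `RuntimeError`). The helpers `try_append` and `same_symbol?` are hypothetical names used only for this example.

```ruby
# Try to append to a string and report whether the mutation was allowed.
def try_append(text, suffix)
  text << suffix
  :mutated
rescue FrozenError
  :rejected
end

# Symbols with the same name are one and the same object.
def same_symbol?(name)
  name.to_sym.equal?(name.dup.to_sym)
end

mutable = String.new("ruby")
frozen  = String.new("ruby").freeze

puts try_append(mutable, "!") # mutated
puts try_append(frozen, "!")  # rejected
puts frozen.frozen?           # true
puts same_symbol?("status")   # true
```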

rjh
  • OK. But why are strings mutable by default? I've edited the question and added "by default". – Seth Tisue Apr 09 '10 at 15:05
  • 2
    It's possible to make a string immutable by calling `.frozen` on it, but you can't really make an immutable string mutable - it would violate the principle. For example, if I pass an immutable string to a function, I wouldn't expect the function to make it mutable and start changing it. – rjh Apr 09 '10 at 15:17
  • It may also be because the ruby string is considered an enumerable. A string is seen as a "list" of characters, the natural expectation is that you can add and remove from a list. Implementing immutable strings by default and still applying enumerable semantics would be painful. – Jean Apr 09 '10 at 16:13
  • @Jean, String is not enumerable in >= 1.9 – horseyguy Apr 09 '10 at 16:22
  • @banister : is it still mutable by default ? (I haven't played with ruby >1.8 yet, but I am surprised that it is not an enumerable anymore, that's a pretty big api change isn't it ? or is it just that it doesn't include Enumerable anymore but it still responds_to? most of the api ?) – Jean Apr 09 '10 at 16:41
  • @Jean, it's still mutable, it just doesn't include Enumerable. The main reason it's not enumerable is that a natural unit couldn't be decided upon. Iteration over lines or characters? for example. – horseyguy Apr 09 '10 at 16:50
  • * should be `.freeze`, not .frozen :S – rjh Apr 09 '10 at 16:54
  • @banister: also, support for native encoding of strings. Do you enumerate over bytes, or characters? That's why we have `String#each_byte` and `String#each_char`. – rjh Apr 09 '10 at 16:58
  • @Jean: String isn't Enumerable because it's seen as a list of characters. If that were the case, it would be Enumerable by character, not by line as it is in 1.8. – sepp2k Apr 09 '10 at 18:22
  • 20
    I'm fascinated by the idea that immutability makes extra work for me as a programmer. My idea of extra work is that I have to be very careful who I show a string to, because anybody might mutate it! I **like** my immutable strings! – Norman Ramsey Apr 10 '10 at 03:36
  • @sepp2k it is both actually (each_byte) but I guess each line as default made more sense at the time ... @Norman maybe I was just traumatized by some uses of StringBuilder... – Jean Apr 10 '10 at 08:27
  • @Jean: Yes, you have each_byte, but all the Enumerable methods (map, select, inject) work on lines because each works on lines. So saying it's Enumerable by characters because it has each_byte (which is by bytes actually, not by characters, but that's tangential to the point) is a bit inaccurate I think, since each_byte has nothing to do with Enumerable. – sepp2k Apr 10 '10 at 11:07
  • @Norman: you should be very worried about passing hashes, arrays or objects to other functions then :) – rjh Apr 10 '10 at 11:36
  • 5
    @rjh: these days my psychiatrist is allowing me nothing but integers and floats---and she's not so sure about the floats. – Norman Ramsey Apr 10 '10 at 18:33
  • 12
    How exactly do immutable strings "make work harder for the programmer"? – cletus Apr 18 '10 at 07:50
  • "Immutable strings are more efficient than mutable strings" in what way? I like my strings immutable just as much as the next guy but there are certainly scenarios when a mutable string is faster. Especially if the number of bytes per character is fixed, which I believe they are not in Ruby. – Jonas Elfström Jan 08 '12 at 22:56
  • Maybe he means more efficient from a memory perspective? Set 5 variables to the same immutable string (at least in Python & Java) and you only have 1 string object (and 5 references) in memory. – Matt Luongo Oct 19 '12 at 00:28
  • 1
    Actually, when it comes to the inner workings of the `+` operator in Java, it doesn't _have to_ be `StringBuffer`, `StringBuilder` can be used as well. [The specification mentions _`StringBuffer` or a similar technique_](http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.18.1) – toniedzwiedz Oct 08 '14 at 10:53
  • 1
    For what it's worth, this is behaviour that will possibly change in Ruby 3.0 – shevy Sep 14 '15 at 07:52
  • Has it changed then? – rjh Jan 06 '23 at 14:09
6

These are my opinions, not Matz's. For purposes of this answer, when I say that a language has "immutable strings", that means all its strings are immutable, i.e. there is no way to create a string that is mutable.

  1. The "immutable string" design sees strings as both identifiers (e.g. as hash keys and other VM-internal uses) and data-storage structures. The idea is that it's dangerous for identifiers to be mutable. To me, this sounds like a violation of single-responsibility. In Ruby, we have symbols for identifiers, so strings are free to act as data stores. It's true that Ruby allows strings as hash keys, but I think it's rare for a programmer to store a string into a variable, use it as a hash key, then modify the string. In the programmer's mind, there is (or should be) a separation of the two usages of strings. Oftentimes a string used as a hash key is a literal string, so there is little chance of it being mutated. Using a string as a hash key is not much different from using an array of two strings as a hash key. As long as your mind has a good grasp on what you're using as a key, then there's no problem.

  2. Having a string as a data-store is useful from a viewpoint of cognitive simplicity. Just consider Java and its StringBuffer. It's an extra data structure (in an already large and often unintuitive standard library) that you have to manage if you're trying to do string operations like inserting one string at a certain index of another string. So on the one hand, Java recognizes the need to do these kinds of operations, but because immutable strings are exposed to the programmer, they had to introduce another structure so the operations are still possible without making us reinvent the wheel. This puts extra cognitive load on the programmer.

  3. In Python, it seems like the easiest way to insert is to grab the substrings before and after the insertion-point, then concatenate them around the to-be-inserted string. I suppose they could easily add a method to the standard library that inserts and returns a new string. However, if the method is called insert, beginners may think it mutates the string; to be descriptive it would have to be called new_with_inserted or something odd like that. In everyday usage, "inserting" means you change the contents of the thing inserted into (e.g. inserting an envelope into a mailbox changes the contents of the mailbox). Again, this raises the question, "why can't I change my data store?"

  4. Ruby provides freezing of objects, so they can be safely passed around without introducing subtle bugs. The nice thing is that Ruby treats strings just like any other data structure (arrays, hashes, class instances); they can all be frozen. Consistency is programmer-friendly. Immutable strings make strings stand out as a "special" data structure, when they're not really, if you use them as data stores.
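A rough sketch of points 1, 3 and 4 in plain Ruby. All method names below are hypothetical and exist only for this illustration; the built-ins used are `String#insert`, `Hash#[]=` and `Object#freeze`. One detail worth knowing for point 1: `Hash#[]=` stores a frozen copy of an unfrozen String key, so mutating the original afterwards does not disturb the hash.

```ruby
# Point 3: String#insert changes the receiver, like the mailbox.
def insert_in_place(text, index, addition)
  text.insert(index, addition)
end

# The immutable-style equivalent: build a new string from two slices.
def insert_as_copy(text, index, addition)
  text[0...index] + addition + text[index..-1]
end

# Point 1: Hash#[]= duplicates and freezes an unfrozen String key,
# so the stored key is unaffected by later changes to the original.
def stored_key_after_mutation(key)
  table = {}
  table[key] = 1
  key << "!"
  table.keys.first
end

# Point 4: freeze works the same way on strings, arrays, hashes and objects.
def frozen_everywhere?
  [String.new("text"), [1, 2], { a: 1 }, Object.new].all? do |obj|
    obj.freeze.frozen?
  end
end

original = String.new("hello world")
copy = insert_as_copy(original, 5, ",")
puts original # hello world
puts copy     # hello, world

insert_in_place(original, 5, ",")
puts original # hello, world

key = String.new("id")
puts stored_key_after_mutation(key) # id
puts key                            # id!
puts frozen_everywhere?             # true
```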

Kelvin