I'm not clear on the value and proper use of symbols.

The benefit seems to be that they remove the need for multiple copies of the same hash by letting it exist only in memory. I wonder whether this is true and what other benefits this brings.

If I were creating a user object with properties such as name, email, and password, and used symbols for each property instead of strings, does that mean that there is only one object for each property? It seems like this would avoid a string copy for the properties in the hash (which seems like a good thing).

Can someone help me understand what a symbol is and when it's better to use one over a string in a hash? What are the benefits and pitfalls of each?

Also, can anyone speak to the memory tradeoffs of each? With scalability being important, I'm curious if symbols would help with speed.

markthethomas
  • http://stackoverflow.com/questions/2341837/understanding-symbols-in-ruby might be useful – JKillian Jul 18 '14 at 17:36
  • Awesome, thanks. Do you know about the relative memory tradeoffs? I can tell that fewer objects are created, which I would assume is better in general. But I'm curious about actual performance experience or more insight about practical tradeoffs, too. – markthethomas Jul 18 '14 at 17:38
  • I don't see why you are mentioning hashes. Does hash have any relevance to your question? And I don't know what you mean by "need for multiple copies of the same hash". – sawa Jul 18 '14 at 18:03
  • Like I said, I'm new to ruby, so maybe I'm not being as clear as I might otherwise be. I mentioned hashes bc the primary area I've seen symbols being used so far is as a replacement for re-used strings in hashes. – markthethomas Jul 18 '14 at 18:05
  • @jkeuhlen The other answer doesn't seem to speak to speed specifically; do you know of another answer where this is addressed? Or, rather, is the immutability of symbols such a significant downside that there are essentially no use cases where using symbols would be wise? – markthethomas Jul 18 '14 at 18:07
  • @markthethomas it's the exact opposite of what you just said. The immutability of strings *is* one of their upsides. Take a look at [this SO question](http://stackoverflow.com/questions/16621073/when-to-use-symbols-versus-strings-in-ruby) – jkeuhlen Jul 18 '14 at 18:10
  • That second link is much better than the first I posted as a duplicate, and the two of them combined should form your answer. If anything is still unclear let me know and I'll try to answer it. – jkeuhlen Jul 18 '14 at 18:11
  • Cool; did you mean symbols when you typed strings? Just getting things straight. My understanding is that strings are mutable while symbols can never be overwritten or changed; is that right? – markthethomas Jul 18 '14 at 18:11
  • Cool :) I'll check it out! – markthethomas Jul 18 '14 at 18:12
  • So the second link is fantastic—thank you. One slight clarification though: it seems like symbols are great for assigning unique identifiers internally (like to a user--but not based on dynamic input). Do you know of use cases besides this one where they might be useful? The point made by the answerer in the question that we should think about what they _ought_ to be was well-taken; I'm just still trying to work through what it is they ought to be. If it's primarily for internal reference/tracking/identifying, then that makes sense. Am I getting that correctly? – markthethomas Jul 18 '14 at 18:19
  • Since this already has an answer, let me add my 2 cents: `arr.collect &:succ` is more or less equivalent to `arr.collect {|x| x.succ}`. Symbols have fewer methods: `String.instance_methods.size #=> 180` `Symbol.instance_methods.size #=> 94`. Although Ruby doesn't have *much* pattern matching, `Symbol` is used in a similar way. `String`, on the other hand, is more powerful and versatile: you can edit it, check whether something exists in it, freeze it so it's immutable (similar to symbols), and paste it into another string (`"one string #{this is pasted} lala"`) – Darek Nędza Jul 18 '14 at 19:13
  • @markthethomas, you're correct. Strings in Ruby are (in general) mutable. Symbols are immutable. I'm not sure why jkeuhlen said strings were immutable in Ruby, I assume just a typo like you said. – JKillian Jul 18 '14 at 20:36
  • Thanks; that's what I figured based on the context. No biggie. – markthethomas Jul 18 '14 at 21:53

4 Answers

Symbols, sometimes described as "interned strings", are useful for hash keys, common arguments, and other places where having many duplicate strings with the same value would be inefficient.

For example:

params[:name]
my_function(with: { arguments: [ ... ] })
record.state = :completed

These are generally preferable to strings because they will be repeated frequently.

The most common uses are:

  • Hash keys
  • Arguments to methods
  • Option flags or enum-type property values
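
To make the "repeated frequently" point concrete, here is a minimal sketch using only core Ruby. The variable names are illustrative, and `String.new` is used purely to force separate string objects regardless of any frozen-string-literal setting.

```ruby
# Two strings with the same content are still two separate objects.
first_string  = String.new("name")
second_string = String.new("name")
strings_equal_content = (first_string == second_string)
strings_same_object   = first_string.equal?(second_string)

# Every occurrence of :name refers to one and the same object,
# including a symbol produced by converting a string.
symbols_same_object  = :name.equal?(:name)
interned_same_object = first_string.to_sym.equal?(:name)

puts "strings: equal content=#{strings_equal_content}, same object=#{strings_same_object}"
puts "symbols: same object=#{symbols_same_object}, via to_sym=#{interned_same_object}"
```

So a hash key written as `:name` in a thousand places costs one object, whereas a thousand separately built `"name"` strings can cost a thousand.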

It's better to use strings when handling user data of unknown composition. Strings can be garbage collected, but in Ruby versions before 2.2 symbols are permanent (from 2.2 on, dynamically created symbols can be collected as well). Converting arbitrary user data to symbols may fill up the symbol table with junk and possibly crash your application if someone is being malicious.

For example:

user_data = JSON.load(...).symbolize_keys

This would allow an attacker to create JSON data with intentionally long, randomized names that, in time, would bloat your process with all kinds of useless junk.
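
One way to sidestep this is to symbolize only the keys you already expect. The following is a rough sketch, not a definitive fix: `ALLOWED_KEYS`, `symbolize_known_keys`, and the sample payload are hypothetical names made up for illustration, and it uses `JSON.parse` from the standard library.

```ruby
require "json"

# Hypothetical whitelist: the only keys this application cares about.
ALLOWED_KEYS = %w[name email].freeze

# Convert a key to a symbol only if it is on the whitelist;
# unknown keys are dropped while they are still plain strings.
def symbolize_known_keys(json_text)
  JSON.parse(json_text).each_with_object({}) do |(key, value), result|
    result[key.to_sym] = value if ALLOWED_KEYS.include?(key)
  end
end

payload = '{"name":"Ann","email":"ann@example.com","zzz_random_junk":1}'
user_data = symbolize_known_keys(payload)
puts user_data.inspect
```

Because the check happens before `to_sym`, attacker-chosen key names never reach the symbol table.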

tadman
  • The security point is well-taken; definitely something I will keep in mind. To clarify: I'd essentially never want to use a symbol to refer to a piece of data that could potentially ever change, right? – markthethomas Jul 18 '14 at 18:25
  • Symbols are hardly ever used to store "data", though they might be used as you might an [enum](http://en.wikipedia.org/wiki/Enumerated_type). Symbolizing string data for no reason is usually counter-productive. – tadman Jul 18 '14 at 18:27
  • Gotcha; so an actual use might be something along the likes of assigning an internal reference to an object for every user? That seems to be (vaguely) the only actual use-case I've seen talked about so far. Can you help me better understand? – markthethomas Jul 18 '14 at 18:29

Besides avoiding the need for repeated memory allocation, symbols can be compared for equality faster than strings, and their hash codes can be computed faster than strings (so both storing and retrieving data from a Hash will be faster when symbol rather than string keys are used).
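
If you want to check the lookup claim on your own machine, a rough timing sketch along these lines may help. The helper `seconds_for`, the iteration count, and the sample hashes are all made up for illustration; results vary by Ruby version and hardware, so treat any difference as indicative only.

```ruby
# Hypothetical helper: run the block `iterations` times and return elapsed seconds.
def seconds_for(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

symbol_keyed = { name: "Ann" }
string_keyed = { "name" => "Ann" }
lookup_key   = String.new("name") # built once, outside the timed loop
iterations   = 200_000

symbol_seconds = seconds_for(iterations) { symbol_keyed[:name] }
string_seconds = seconds_for(iterations) { string_keyed[lookup_key] }

puts format("symbol keys: %.4fs, string keys: %.4fs", symbol_seconds, string_seconds)
```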

Internally, Ruby uses something closely related to symbols to identify methods, the names of classes, and so on. So, for example, when you retrieve a list of the methods an object supports (with obj.methods), you get back an array of symbols. When you want to call a method "dynamically", using a name stored in a variable or passed in as an argument, you normally pass a symbol (a string is accepted too, but Ruby converts it to a symbol internally). The same goes for getting/setting the values of instance variables, constants, and so on.
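
A short sketch of those reflective calls, using a made-up `User` class (the class, its `greeting` method, and the variable names are illustrative only):

```ruby
# Hypothetical class used only to demonstrate reflection with symbols.
class User
  def initialize(name)
    @name = name
  end

  def greeting
    "Hello, #{@name}"
  end
end

user = User.new("Ann")

# obj.methods returns method names as symbols.
lists_greeting = user.methods.include?(:greeting)

# Call a method whose name is held in a variable.
method_name = :greeting
dynamic_result = user.public_send(method_name)

# Instance variables and constants are looked up by symbol as well.
stored_name = user.instance_variable_get(:@name)
float_class = Object.const_get(:Float)

puts dynamic_result
```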

Intuitively, you can think of it this way. If you've ever programmed in C, you have written things like:

 #define SOMETHING 1
 #define SOMETHING_ELSE 2

These defines eliminate the need to use "magic numbers" in your code. The names used (SOMETHING, etc) are not relevant to users of your program, just as the names of functions or classes are not relevant to users. They are just "labels" which are internal to the code, and are of concern only to the programmer. Symbols play a similar role in Ruby programs. They are a data type with performance properties similar to integers, but with a literal syntax which makes them appear as meaningful names to a human programmer.
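
In Ruby the same "internal label" idea might look like the sketch below. The state names and the `describe_state` helper are hypothetical, chosen only to mirror the `#define` example.

```ruby
# Symbols used as enum-like labels: meaningful to the programmer,
# never shown to or supplied by end users.
VALID_STATES = %i[pending completed failed].freeze

def describe_state(state)
  case state
  when :pending   then "waiting to run"
  when :completed then "finished"
  when :failed    then "needs attention"
  else raise ArgumentError, "unknown state: #{state.inspect}"
  end
end

VALID_STATES.each { |state| puts "#{state}: #{describe_state(state)}" }
```

No numeric codes need to be defined or kept in sync; the symbol itself is the value.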

Once you "get" the concept of Ruby symbols, understanding Lisp symbols will be much easier, if you ever program in Lisp. In Lisp, symbols are the basic data type which program code is composed of. (Because Lisp programs are data, and can be manipulated as such.)

Alex D

Think of symbols the way you think of numbers. A symbol is a constant, immutable object that is created on first use (and, in Ruby versions before 2.2, never garbage collected). Use one whenever you need to refer to something that cannot be duplicated, such as:

  • messages aka methods (Ruby doesn't have overloading)
  • hash keys (Ruby doesn't have multi hashes)
Hauleth

Yes, your example is fine.

The keys name, email, and password can all be symbols, even in a hash, while the values stored under them remain string objects.

{
  :name => 'John doe',
  :email => 'foo@hotmail.com',
  :password => 'lassdgjkl23853'
}
shevy
  • Thanks! :) I guess what I was getting more at though is the appropriate use-cases for symbols, not whether they can be stored in a hash. Maybe my question wasn't as clear on that as I would have liked – markthethomas Jul 18 '14 at 18:22