
I do not understand what a symbol table is. Can someone help me understand symbols, starting from the very basics, and explain them thoroughly?

the Tin Man
  • It would be helpful if you say what background you have - for example, do you know any other language? Do you know it well, or you are a beginner in programming? – Arsen7 Jul 19 '11 at 09:30
  • Hi Arsen7: I have programmed for about 2 years, Java and Objective C. Been reading Ruby since early June, and now I have come to a full stop with the Symbols, and I really want to understand them because they are frequently used. – Wei Lie Sho Jul 19 '11 at 09:33

4 Answers


The most basic usage of Symbols is well summarized by the phrase: "A Symbol is a constant integer with a human-readable name" (by Wei Lie Sho).

If in C you type this:

#define USER 1
#define ADMIN 2
#define GUEST 3
[...]
user.type = ADMIN;

then in ruby you just use a Symbol:

user.type = :admin

So, a Symbol in ruby is just some value, in which the only important thing is the name, or put in other words: the value of the Symbol is its name.

The main difference between Symbols and Strings (because this is also correct code: user.type = "admin") is that Symbols are fast to compare: two Symbols are equal exactly when they are the same object, so no character-by-character comparison is needed. The main difference between Symbols and integers (used as in this example) is that Symbols are easily readable for the programmer, while integers are not.
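As a minimal sketch of that basic usage (the User struct and the describe method are hypothetical names introduced only for this illustration), a Symbol can be stored, compared and converted to its name like this:

```ruby
# Hypothetical User type, introduced only for illustration.
User = Struct.new(:name, :type)

# Returns a readable description based on the Symbol stored in user.type.
def describe(user)
  case user.type
  when :admin then "#{user.name} can manage everything"
  when :guest then "#{user.name} can only look around"
  else "#{user.name} is a regular user"
  end
end

user = User.new("alice", :admin)
puts describe(user)        # alice can manage everything
puts user.type == :admin   # true
puts user.type.to_s        # admin
```

The case statement reads almost like the C constants above, but without having to define any numbers.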

Other Symbol properties are not crucial for their basic usage.


While there is some integer value associated with Symbols (for example: .object_id), you should not rely on it. In each run of your program the integer value of a given Symbol may be different. However, while your program runs, each (let's call it so) "instance" of the same Symbol will have the same integer value.
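A small sketch of what this means in practice, assuming the value has to leave the program (for example into a file or a database): store the name, not the object_id, and convert it back when it is read. The variable names are only illustrative.

```ruby
# Within one run, every occurrence of the same Symbol is the same object.
puts :admin.object_id == :admin.object_id   # true

# When a Symbol has to be stored outside the program, save its name.
stored   = :admin.to_s       # "admin", safe to write to a file or database
restored = stored.to_sym     # back to the Symbol when it is read again

puts restored == :admin                           # true
puts restored.object_id == :admin.object_id       # true
```

The actual object_id values are deliberately not printed here, since they are not something to depend on.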

Unlike the integer constants (as in the C example) the Symbols cannot be sorted in Ruby 1.8 - they do not know whether one is greater than another.

So, you can match Symbols (for equality) as quickly as integers are matched, but you cannot sort Symbols directly in Ruby 1.8. Of course, you can sort the String equivalents of Symbols:

user.type == :admin # OK
user.type > :guest # raises an exception in Ruby 1.8

[:two, :one].sort # raises an exception in Ruby 1.8
[:two, :one].sort_by {|n| n.to_s} # => [:one, :two]

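One of the important properties of Symbols is that once a Symbol is encountered in the program (typed in the source code or created 'on-the-fly'), its value is stored until the end of the program, and is not garbage-collected. That's the "Symbol-table" you mentioned. (This holds for the Ruby versions discussed here; since Ruby 2.2, Symbols created dynamically, for example with to_sym, can be garbage-collected.)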

If you create a lot of unique Symbols in your program (I am talking about millions), then it is possible that your program will run out of memory.

So, a rule of thumb is: "Never convert any user supplied value to Symbols".

# A Rails-like example:
user.role = params["role"].to_sym # DANGEROUS!
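
One possible way to follow that rule, sketched under the assumption that the set of valid roles is known in advance, is to look the user-supplied string up in a fixed table instead of calling to_sym on it. ALLOWED_ROLES and role_from_param are hypothetical names, not part of Rails.

```ruby
# Hypothetical whitelist: only these strings ever become Symbols.
ALLOWED_ROLES = {
  "user"  => :user,
  "admin" => :admin,
  "guest" => :guest
}.freeze

# Maps untrusted input to a known Symbol, falling back to :guest.
# No new Symbol is ever created from the input itself.
def role_from_param(value)
  ALLOWED_ROLES.fetch(value.to_s, :guest)
end

puts role_from_param("admin").inspect         # :admin
puts role_from_param("no_such_role").inspect  # :guest
```

Falling back to :guest is just one design choice; raising an error for unknown values would work equally well.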

I believe this set of information may be sufficient for using Symbols efficiently in Ruby.


Note that in Ruby 1.9 and later Symbols include Comparable, and so you can do things like

p :yay if :foo > :bar 
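
A short sketch of what that makes possible, assuming Ruby 1.9 or later; Symbols are ordered by their names:

```ruby
# Symbols compare by name, so sorting works without converting to Strings.
p [:two, :one].sort      # [:one, :two]
p(:foo > :bar)           # true
p(:guest <=> :admin)     # 1
p [:b, :a, :c].max       # :c
```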
Phrogz
Arsen7
  • Arsen7: Thank you, so they are like a constant integer, only with a human readable name? I am maybe starting to understand it, could you explain a little more here? "...So, a Symbol in ruby is just some value, in which the only important thing is the name. It is a name for you - the programmer." Sorry to bother you too much. – Wei Lie Sho Jul 19 '11 at 10:28
  • It's a pleasure ;-) Yes, you understand it right - "a constant with human readable name" is a perfect summary. Unlike with the C-like constants, you can easily convert a ruby Symbol to String: `:admin.to_s => "admin"`, and vice-versa (however that may be a little dangerous if you convert user-supplied strings into symbols - a possible DoS attack on your application). – Arsen7 Jul 19 '11 at 10:38
  • Arsen7: Thank you very much, but if symbols are constant integers with a readable name, then what is the value of a symbol? And I am a little unsure about this sentence: "...So, a Symbol in ruby is just some value, in which the only important thing is the name. It is a name for you - the programmer." I am thinking of the name-part.. – Wei Lie Sho Jul 19 '11 at 10:44
  • Under the hood of the Symbol there is some integer value (some address), but that integer is not something you can rely on, nor something you should consider important. In each run of your program the integer value of a given symbol may be different, but while the program is running, each (let's-call-it) "instance" of the same symbol will have the same integer value. So, you can compare two Symbols as quickly as integers are compared, but if you store the Symbol outside of your program, you should use its name. In general, the integer value of a Symbol is totally unimportant. – Arsen7 Jul 19 '11 at 10:52
  • In other words - you use Symbols only to compare whether 'something' is equal to 'something'. For example you cannot sort Symbols directly. A Symbol is only equal or not equal to another. Of course, you can sort the String equivalents of Symbols: `[:one, :two].sort_by(&:to_s)` – Arsen7 Jul 19 '11 at 10:59
  • Arsen7: Thank you for being patient. I am not 100% sure about what Symbols are, but I probably need to think a little about them. I may have to ask you again here in the comment field. – Wei Lie Sho Jul 19 '11 at 11:10
  • I don't know or really even care about Ruby, but this is one of the best answers I've seen on SO. – matt eisenberg Aug 05 '11 at 16:18

A ruby symbol is a pointer to a place in memory where there is a constant string. And all identical symbols point to the same place in memory.

A pointer is just an integer representing an address in memory, where addresses in memory are much like the addresses of houses in a city. And each address is unique. In effect, ruby transforms each symbol into a single integer: the address of a constant string in memory.

In order for ruby to compare two strings, ruby starts with the first letter of each string and compares their ASCII codes. If they are the same, then ruby moves on to the second letter of each string to compare their ASCII codes, and so on, until ruby finds a difference in the ASCII codes or the end of one of the strings is reached. For instance, with these two strings:

"hello_world_goodbye_mars"
"hello_world_goodbye_max"

ruby has to compare each letter of the two strings until it finds a difference at 'r' and 'x' in order to tell that the strings are not the same.

On the other hand, ruby only has to make one comparison when comparing symbols--no matter how long the symbol is. Because a symbol is effectively just a single integer, in order for ruby to tell whether the following two symbols are different:

:hello_world_goodbye_mars
:hello_world_goodbye_max

ruby only has to do one comparison: the integer representing the first symbol versus the integer representing the second symbol; and the comparison becomes something like:

if 245678 == 345789
  #then the symbols are the same
else
  #the symbols are not the same
end

In other words, comparing two symbols only requires comparing two integers, while comparing strings can require comparing a series of integers. As a result, comparing symbols is more efficient than comparing strings.

Symbols that are different are each given unique integers. Identical symbols are given identical integers.
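
To make that idea concrete, here is a toy model of a symbol table. It is only a sketch of the concept and is not how the Ruby interpreter is actually implemented; ToySymbolTable and its intern method are hypothetical names.

```ruby
# Toy model only: maps each distinct name to one integer, exactly once.
class ToySymbolTable
  def initialize
    @ids = {}   # name (String) => integer
  end

  # Returns the integer for name, assigning the next free one on first use.
  def intern(name)
    @ids[name] ||= @ids.size + 1
  end
end

table = ToySymbolTable.new
mars  = table.intern("hello_world_goodbye_mars")
max   = table.intern("hello_world_goodbye_max")
again = table.intern("hello_world_goodbye_mars")

puts mars == again   # true:  same name, same integer
puts mars == max     # false: different names, different integers
```

Once the names have been turned into integers, telling two of them apart is a single integer comparison, no matter how long the names are.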

Disclaimer: any factual errors with the above description will not harm you in any way.

7stud

I am assuming you are talking about symbols, which are written like this: :something.

Symbols are like strings, with the difference that they are "singleton" (so to speak).

For example,

a = "hello"
b = "hello"

a and b are different objects in memory, both of which have the same value.

But,

a = :hello
b = :hello

a and b point to the same object in memory, which contains the value "hello"

The advantage of using symbols (over constant strings) is that they are allocated only once (I am not familiar with a symbol table in Ruby, but it might be where these allocations are made). So, you can use them in multiple places and they are all pointing to the same memory.
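
A quick way to check this, sketched with equal? (which tests object identity) next to == (which tests value). String.new is used here only to be sure two separate String objects are created.

```ruby
a = String.new("hello")
b = String.new("hello")
puts a == b        # true:  same value
puts a.equal?(b)   # false: two different objects in memory

c = :hello
d = :hello
puts c == d        # true
puts c.equal?(d)   # true:  one and the same object
```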

sparkymat

I see you know Java. Well, the nearest thing to a Java string in Ruby is a symbol: you can't change them, and two symbols with the same name are actually the same object (much like interned string literals in Java).

irb(main):001:0> a = :firstthing
=> :firstthing
irb(main):002:0>  b = :firstthing
=> :firstthing
irb(main):004:0> p a.object_id
467368
=> 467368
irb(main):005:0> p b.object_id
467368
=> 467368

Compare this to Ruby strings, which you can change, more like a StringBuffer or StringBuilder object in Java. (Sorry, my Java is very rusty indeed.)

irb(main):006:0> a = "secondthing"
=> "secondthing"
irb(main):007:0> b = "secondthing"
=> "secondthing"
irb(main):009:0> p a.object_id, b.object_id
8746200
8760220
=> [8746200, 8760220]

So symbols are a good choice as keys to a Hash, or anywhere you need a name that won't change, because comparing them is faster than comparing strings. Just remember that they can't change and that they are not strings: they have none of String's modifying methods, so call to_s when you need a real string. Ruby is more forgiving than Java.
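
A minimal sketch of symbols as Hash keys (the settings hash and its contents are made up for the example):

```ruby
# Hypothetical settings hash with Symbol keys.
settings = { :color => "red", :size => 10 }

puts settings[:color]            # red
puts settings["color"].inspect   # nil: the String "color" is a different key

key = :color
puts key.frozen?                 # true:  a Symbol cannot be modified
puts key.respond_to?(:<<)        # false: no String-style appending
puts key.to_s + "_scheme"        # color_scheme (convert first, then build a new String)
```

Note that :color and "color" are different keys, which is a common source of confusion when a hash is built from mixed input.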

Andy