I am trying to convert a Ruby program to Crystal, and I am stuck because Crystal has no String#to_sym.

I have a BIG XML file, which is too big to fit in memory, so parsing it all at once is out of the question.

Fortunately I do not need all of the information, only a portion of it, so I am parsing it myself and dropping most of the lines. In Ruby I used String#to_sym to store the data, like this:

:param_name1 => 1
:param_name2 => 11
:param_name1 => 2
:param_name2 => 22
:param_name1 => 3
:param_name2 => 33
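
For reference, the Ruby approach described above can be sketched roughly as follows. The `name=value` line format and the `parse_line` helper are hypothetical, invented only for illustration (the real input is XML). The point is that `String#to_sym` returns the same Symbol object for equal names, so each parameter name is held in memory once.

```ruby
# Hypothetical helper: turns a "name=value" line into a one-entry hash
# keyed by a Symbol. The line format is an assumption for illustration.
def parse_line(line)
  name, value = line.strip.split("=", 2)
  { name.to_sym => value.to_i }
end

lines = ["param_name1=1", "param_name2=11", "param_name1=2", "param_name2=22"]
records = lines.map { |line| parse_line(line) }

# Equal names map to the very same Symbol object, so the name itself
# is stored only once no matter how many records use it.
first_key = records[0].keys.first
third_key = records[2].keys.first
same_object = first_key.equal?(third_key)
```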

What should I use in Crystal?

Memory is the bottleneck. I do not want to store param_name1 multiple times.

Grokify
jsaak

1 Answer

If the list of parameters is known in advance, you can use an enum, for example:

enum Parameter
  Name1
  Name2
  Name3
end

a = "Name1"
b = {'N', 'a', 'm', 'e', '1'}.join
pp a.object_id == b.object_id # => false
pp Parameter.parse(a) == Parameter.parse(b) # => true

If the list of parameters is unknown, you can use the less efficient StringPool:

require "string_pool"

pool = StringPool.new

a = "param1"
b = {'p', 'a', 'r', 'a', 'm', '1'}.join

pp a.object_id == b.object_id # => false
a = pool.get(a)
b = pool.get(b)
pp a.object_id == b.object_id # => true
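
Coming from Ruby, the idea behind a pool can be pictured as a plain Hash lookup. This is only a conceptual sketch in Ruby, not Crystal's StringPool implementation; the `TinyPool` class and its `get` and `size` methods are hypothetical names chosen to mirror the snippet above.

```ruby
# Hypothetical, minimal pool: returns one canonical instance per distinct
# string value. Not Crystal's StringPool, just the same idea in Ruby.
class TinyPool
  def initialize
    @seen = {}
  end

  # Returns the stored instance equal to str, storing str on first sight.
  def get(str)
    @seen[str] ||= str
  end

  # Number of distinct string values held by the pool.
  def size
    @seen.size
  end
end

pool = TinyPool.new
a = "param1"
b = %w[p a r a m 1].join # equal content, built at runtime

distinct_before = !a.equal?(b) # two separate String objects
a = pool.get(a)
b = pool.get(b)
shared_after = a.equal?(b)     # both names now refer to one object
```

The lookup costs a hash computation per string, which is why this route is slower than a fixed set of compile-time names.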
Jonne Haß
  • I am a bit confused. So at compile time Crystal has immutable strings and symbols, but I do not see the difference between the two. And there are also enums. And only enums are usable at runtime (getting the Int32 from a name)? And StringPool is a runtime equivalent of Symbol? – jsaak Sep 12 '15 at 11:08
  • Symbols get translated to a unique number at compile time, so their memory representation is a single number. That's why you can't create them dynamically: the values are assigned non-deterministically at compile time and the table can't be expanded at runtime. – Jonne Haß Sep 12 '15 at 11:14
  • Strings are immutable, but their full data is in memory, so you can do operations on their actual string value; symbols have to be converted to their string value first. StringPool is merely a convenience API on top of a lookup Hash/Set that deduplicates instances. – Jonne Haß Sep 12 '15 at 11:15
  • Enums are closer to Symbols, but they are namespaced and each enum is its own type. Likewise, you can't expand their value set at runtime. Being a type allows methods to be added to it; for example, the signal API makes use of this, like `Signal::INT.trap { ... }`. It also allows for easy compile-time generation of the `parse` method: https://github.com/manastech/crystal/blob/master/src/enum.cr#L323-L332 – Jonne Haß Sep 12 '15 at 11:18