
I have a Ruby hash (possibly huge) with time-based keys and string values.

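require 'time' # provides Time#iso8601 and Time.iso8601

# Current UTC time as an ISO 8601 string with millisecond precision
def time_now
  Time.now.utc.iso8601(3)
end

time_1 = time_now #=> "2014-12-04T10:05:07.852Z"
data_1 = "{\"data\": \"some possibly big json data as string\"}"

h = { time_1 => data_1, ... }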

So in my app (BTW, the full code is here: http://github.com/solyaris/pere) I just used ISO 8601 timestamp strings as keys.

That may be OK, but ...

For some reason I have to select a subset of the hash items at run time, which implies comparing key strings:

h.select { |key, value| key > some_string_timestamp }
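
A minimal sketch of this selection, assuming every key is a UTC timestamp in the same fixed-width ISO 8601 format, so that lexicographic order matches chronological order. The sample keys, values and `cutoff` are made up for illustration:

```ruby
# Hypothetical sample data: keys are fixed-width UTC ISO 8601 strings.
h = {
  "2014-12-04T10:05:07.852Z" => "first",
  "2014-12-04T10:05:08.120Z" => "second",
  "2014-12-04T10:05:09.001Z" => "third"
}

cutoff = "2014-12-04T10:05:08.000Z"

# String#> compares character by character. That matches chronological
# order only because all keys share the same format, width and time zone.
recent = h.select { |key, _value| key > cutoff }
```

Here `recent` keeps only the entries whose key sorts after `cutoff`. Keys with mixed precision or mixed time zone offsets would break this assumption.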

So, to save memory and speed up execution, strings are possibly not the best choice for keys.

I thought about using symbols:

time_1_key_as_symbol = time_1.intern

But here I have a problem: the hash could contain a large number of keys (these are timestamps of realtime "cluster" events...), so using symbols as keys could exhaust memory, because Ruby does not garbage-collect symbols (if I understood correctly; see also: Why use symbols as hash keys in Ruby?).

So I concluded that it could be better (for faster comparisons and lower memory use) to use Ruby's internal Time format (64-bit as of Ruby 2.1?) as keys instead of strings:

time_1_as_ruby_time = Time.iso8601 time_1

Is that correct? What do you think?
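
For illustration, a small sketch of how Time keys would behave for lookup and selection. It only shows that the approach works, not that it is faster or smaller; the sample timestamps and `cutoff` are hypothetical:

```ruby
require 'time'

# Hypothetical sample data, parsed from ISO 8601 strings into Time objects.
h = {
  Time.iso8601("2014-12-04T10:05:07.852Z") => "first",
  Time.iso8601("2014-12-04T10:05:08.120Z") => "second"
}

cutoff = Time.iso8601("2014-12-04T10:05:08.000Z")

# Time includes Comparable, so > compares instants rather than characters.
recent = h.select { |key, _value| key > cutoff }

# Time implements hash and eql?, so an equal instant finds the same entry.
found = h[Time.iso8601("2014-12-04T10:05:07.852Z")]
```

Whether this beats string keys in memory or speed would have to be measured on the real data.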

Giorgio Robino
  • I would use (time_1.to_f * 1000).round as the key for speed and simplicity – Vu Minh Tan Dec 04 '14 at 10:33
  • BTW time_1 is a string. Do you maybe mean: `t = (Time.iso8601(time_1).to_f*1000).round`? That way `t.class #=> Bignum`. Or maybe just `Time.iso8601(time_1).to_f`? – Giorgio Robino Dec 04 '14 at 10:49
  • I actually mean (Time.now.to_f * 1000).round :). A Float is not good as a hash key because of floating-point precision problems. I multiply by 1000 to store the time in milliseconds and round it to get an integer. Actually, I think I would use 1000000 to make sure it is unique. I don't have Ruby 2.1 installed, but if Time.iso8601(time_1) returns a unique number, then it should be fine. – Vu Minh Tan Dec 04 '14 at 11:16
  • I'm a bit perplexed about `(Time.now.to_f * 1000).round`, which produces a Bignum. In fact it is again at least a 64-bit value, with the downside that the conversion loses 'precision'... maybe a plain Time.now is better... – Giorgio Robino Dec 08 '14 at 09:24
  • If you don't care about unique keys, Time.now.to_i is the best option; it is a Fixnum and well suited to be a hash key. On the other hand, if you don't want two different moments to have the same key: in terms of precision, Time.now is worse and Time.now.utc.iso8601(3) is as good as (Time.now.to_f * 1000).round, since it stores the time in milliseconds; in terms of performance, a string is not as good as a Bignum. – Vu Minh Tan Dec 08 '14 at 15:12
  • Ruby 2.2 supports the garbage collection of symbols now. – Yu Hao May 03 '15 at 13:24
  • yes... now! (I posted 5 months ago) :-) – Giorgio Robino May 03 '15 at 17:11
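
The integer-millisecond keys suggested in the comments could be sketched as follows. The helper name `to_millis` and the sample data are hypothetical, and the sketch assumes millisecond resolution is enough to keep keys unique:

```ruby
require 'time'

# Hypothetical helper: ISO 8601 string -> integer milliseconds since the epoch.
def to_millis(iso_string)
  (Time.iso8601(iso_string).to_f * 1000).round
end

h = {
  to_millis("2014-12-04T10:05:07.852Z") => "first",
  to_millis("2014-12-04T10:05:08.120Z") => "second"
}

cutoff = to_millis("2014-12-04T10:05:08.000Z")

# Plain integer comparison; no string or Time objects involved.
recent = h.select { |key, _value| key > cutoff }

# Convert a key back to a Time when the timestamp itself is needed.
first_time = Time.at(h.keys.first / 1000.0).utc
```

Two events within the same millisecond would collide on one key, so a finer unit (for example microseconds) may be needed if that can happen.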

0 Answers