Title says it all, what the maximum value that can be returned from 'some random string'.hash
in Ruby?
The docs don't offer much insight.
Title says it all, what the maximum value that can be returned from 'some random string'.hash
in Ruby?
The docs don't offer much insight.
The maximum size String#hash
can output appears to be the maximum size of an unsigned long
in your environment.
The String#hash
function is implemented in rb_str_hash()
:
/* string.c, l. 2290 */
st_index_t
rb_str_hash(VALUE str)
{
int e = ENCODING_GET(str);
if (e && rb_enc_str_coderange(str) == ENC_CODERANGE_7BIT) {
e = 0;
}
return rb_memhash((const void *)RSTRING_PTR(str), RSTRING_LEN(str)) ^ e;
}
st_index_t
is defined as type st_data_t
:
/* st.h, l. 48 */
typedef st_data_t st_index_t;
st_data_t
is an unsigned long
:
/* st.h, l. 20 */
typedef unsigned long st_data_t;
Since the hash is randomly generated (using SipHash), the entire range of values possible in an unsigned long
should be available. In a 64-bit environment, unsigned long
will be 64-bit, of course. SipHash's output is 64-bit, so in a 32-bit environment Ruby stores its output in an array with two 32-bit unsigned integers, and rb_memhash()
combines them with a bitwise XOR.
/* siphash.h, l. 14 */
#ifndef HAVE_UINT64_T
typedef struct {
uint32_t u32[2];
} sip_uint64_t;
#define uint64_t sip_uint64_t
#else
typedef uint64_t sip_uint64_t;
#endif
/* random.c, l. 1306 */
st_index_t
rb_memhash(const void *ptr, long len)
{
sip_uint64_t h = sip_hash24(sipseed.key, ptr, len);
#ifdef HAVE_UINT64_T
return (st_index_t)h;
#else
return (st_index_t)(h.u32[0] ^ h.u32[1]);
#endif
}
Here's Ruby's sip_hash24()
, if you want to look at the implementation.
The Object#hash
method returns a Fixnum
, which:
Holds Integer values that can be represented in a native machine word (minus 1 bit).
Annoyingly, there doesn't appear to be an easy way to determine the exact max value on a particular system (there is an open feature request by Matz - #7517), so you must currently compute it yourself.
The sample code below (https://stackoverflow.com/a/736313/244128) works on some Ruby platforms but not reliably on all of them:
FIXNUM_MAX = (2**(0.size * 8 -2) -1)
FIXNUM_MIN = -(2**(0.size * 8 -2))