0

According to the perl doc page:

perldata

Hashes are unordered collections of scalar values indexed by their associated string key.

I would have thought perl hashes are ordered since they are usually constructed using array:

my %h = ("a",1,"b", 2);

If Perl hashes are unordered, they need to be "hash"-ed to allow access. Question is when does Perl hash the hashes?

If we do:

my %h = ("1",1); #1
print $h{"1"};     #2

I am assuming this line #1 is doing the hashing internally.

If a hash is constructed from array:

#@a = ("1", 1);
my %h = @a;

I am assuming the assignment to %h is doing the hashing.

Please confirm if my assumption is correct and I don't seem to find the details anywhere on the internet.

SwiftMango
  • 15,092
  • 13
  • 71
  • 136
  • 3
    This might help: http://www.perl.com/pub/2002/10/01/hashes.html –  May 12 '15 at 14:39
  • 7
    `("a",1,"b", 2)` is actually a list, not an array: http://stackoverflow.com/questions/6023821/perl-array-vs-list – Hunter McMillen May 12 '15 at 14:45
  • @huntermcmillen Thanks for pointing out but I don't think thats the primary concerns here – SwiftMango May 12 '15 at 14:46
  • 1
    It certainly makes sense for the assignment operator to do the hashing when the operand on the left is a hash. I can imagine another possibility: that a copy of the list would be stored as-is and the hashing would be delayed until the first lookup. That seems very unlikely. But a good answer to this would be somebody pointing directly at the assignment operator's implementation in the perl source code and saying "look, here's the part where it hashes all the keys". It would probably be near the part that generates the `Odd number of elements in hash assignment` error message. –  May 12 '15 at 14:50
  • 5
    What *is* the primary concern? Your question is an esoteric one about language implementation and doesn't seem to be related to any real-world issue of using Perl. –  May 12 '15 at 14:51
  • 2
    @dan1111 It isn't completely impractical to want to understand exactly how much work the interpreter is doing at a particular step in your program. –  May 12 '15 at 14:58
  • @dan1111 I believe it is an acceptable behaviour to try to gather an understanding of how things work? And by that it would also help doing some optimization based on interpreter (in the real world) – SwiftMango May 12 '15 at 15:03
  • 3
    About the question in general: when we say hashes are **unordered** we don't mean you can't put some ordered data into a hash. We mean that you can't expect to get the original order (or any sensible order) back when you retrieve entries from the hash. After `%h1=(a=>1,b=>2);%h2=(b=>2,a=>1);` there is no way to tell the hashes apart. (Note: this doesn't mean they're guaranteed to yield the keys in the same order if you do `keys %h1` and `keys %h2` - it means they *might* be the same or *might not*. There are no guarantees at all.) –  May 12 '15 at 15:14
  • @wumpusqwumbley I guess that won't guarantee O(1) access time as well? – SwiftMango May 12 '15 at 15:22
  • 1
    Esoteric questions like this, I'm all for. It's an interesting thing to ask and answer. Maybe not particularly necessary to know, but still something that's useful and interesting. – Sobrique May 12 '15 at 16:30
  • @WumpusQ.Wumbley: If fact, Perl randomly changes the order every time the program is run. This is to prevent a hash-dos-attack – shawnhcorey May 12 '15 at 20:18
  • @texasbruce, of course that is a fine reason to want to know. But the question would be strengthened by giving the reason, even if it is "I am just curious how it is implemented." Otherwise, we are left wondering whether you are really asking about the implementation or just fundamentally misunderstand something and think this is important for writing your code. –  May 13 '15 at 09:43

1 Answers1

7

Whenever a string is used as a hash key, the hash of that string is computed. If the key is a constant (e.g. "a" in $hash{"a"}) then its hash is computed at compile time and stored in the optree; otherwise it's computed at runtime as needed. This applies the same way to assignment and retrieval; the hash key is used to find the right hash element to store a value into or retrieve a value from.

hobbs
  • 223,387
  • 19
  • 210
  • 288
  • So how does the hash get stored in memory? Is it like a hash table with O(1) access time? Or just an array with O(n) access time? – SwiftMango May 12 '15 at 17:13
  • 2
    @texasbruce it's a hash table, thus the hashing. And the name. Specifically it's a single-hashing table with linked list buckets, and table growth if any bucket gets too big. I suspect there's an answer here that explains the structure in more detail but I can't easily find it as I'm on my phone right now. – hobbs May 12 '15 at 17:16