0

I cannot figure out why this isn't printing the hash map in order. Does it save this way when assigning keys? With the naked eye it seems to be randomized. Is that normal for Rust?

use std::collections::HashMap;
use rand::Rng;

fn main() {
    let mut board = HashMap::new();
    
    for n in 0..99 {
        board.insert(n,0);
    }
    // board.insert(1, 0);

    for (key,value) in &board {
        println!("{}: {}", key, value);
    }
}

Output:

21: 0
15: 0
90: 0
10: 0
52: 0
92: 0
32: 0
61: 0
91: 0
50: 0
28: 0
93: 0
64: 0
72: 0
75: 0
95: 0
98: 0
89: 0
57: 0
88: 0
9: 0
85: 0
87: 0
24: 0
29: 0
37: 0
19: 0
16: 0
44: 0
51: 0
79: 0
53: 0
73: 0
11: 0
7: 0
59: 0
62: 0
3: 0
74: 0
14: 0
96: 0
34: 0
40: 0
23: 0
86: 0
49: 0
82: 0
54: 0
80: 0
22: 0
31: 0
60: 0
76: 0
12: 0
0: 0
2: 0
97: 0
83: 0
27: 0
33: 0
69: 0
26: 0
46: 0
68: 0
43: 0
71: 0
58: 0
77: 0
17: 0
5: 0
35: 0
65: 0
56: 0
20: 0
48: 0
1: 0
13: 0
30: 0
4: 0
41: 0
55: 0
45: 0
25: 0
47: 0
63: 0
66: 0
6: 0
67: 0
38: 0
70: 0
81: 0
84: 0
39: 0
18: 0
42: 0
8: 0
78: 0
36: 0
94: 0
John Kugelman
  • 349,597
  • 67
  • 533
  • 578
SRSLOL
  • 5
  • 1

3 Answers3

9

A hashmap's purpose isn't to keep the order in which a key was inserted. It is used for fast data lookup. This question might help you find a solution

Thomas Q
  • 850
  • 4
  • 10
1

Generally, a hashmap works by mapping keys into buckets into an array by applying a hash function to those keys and then storing the value in that bucket. For general-purposes hashmaps that can accept untrusted data, it is necessary to use a cryptographically secure randomized hash function (technically, a PRF) so that an attacker cannot cause pathological performance by making many keys hash into a bucket.

Rust implements a hashmap that uses this approach, as do many other languages (although, notably, last I checked, PHP did not). Because of this approach, internally a hashmap is not ordered and its items will be stored in an arbitrary order which will often differ from instance to instance.

Some languages, like Ruby, explicitly choose to implement insertion-order traversal of objects, which is extremely convenient in many cases. However, this is not free, and since Rust aims to avoid expensive overhead when possible. As such, you have some alternatives:

  • If you need objects sorted by their keys, or just in some deterministic order (e.g., for tests), you can use a BTreeMap, which provides sorted order by key. Note that this has different performance characteristics than a HashMap and requires different things of its keys (specifically, that they're Ord instead of Hash).
  • If you specifically want insertion order, you may find crates such as indexmap or linked_hash_map that offer this.
bk2204
  • 64,793
  • 6
  • 84
  • 100
  • 1
    "it is necessary to use a cryptographically secure randomized hash function" SipHash (which is Rust's default hash function) is specifically not a cryptographic hash function, and it's really not an option with how short hashes are for hashmaps. However it is designed to mitigate collision prediction attacks ("hash flooding" DOS). – Masklinn Oct 15 '21 at 06:33
  • "Rust implements a hashmap that uses this approach, as do many other languages (although, notably, last I checked, PHP did not)." it is mostly an issue for open-addressing tables and PHP uses separate chaining. – Masklinn Oct 15 '21 at 06:39
  • @Masklinn Pathological performance of a bad hash function (or one broken with maliciously crafted input) is an issue with both open-addressed and linked-list-chained hash tables. In both cases the hash table operations degrade to O(n). – user4815162342 Oct 15 '21 at 12:46
  • @user4815162342 colliding insertions degrade to O(n) with open addressing, but with separate chaining unless you're implementation is really stupid (you use a singly-linked list and append) it remains O(1). Other operations will degrade, but there is a lot less control over them for the attacker. – Masklinn Oct 15 '21 at 18:42
  • It is technically a PRF, which I have noted. It is a hash function, it is randomized, and it is cryptographically secure, but it is not a cryptographically secure hash function. And I have a PoC DoS complexity attack against at least older versions of PHP. – bk2204 Oct 15 '21 at 20:50
  • 1
    @Masklinn No, it has to be O(n) because you have to check every colliding element for equality with the one you're attempting to insert, in order to distinguish between insert and update. And then there is the lookup that would degrade to O(n) even if insertion didn't. Collision attacks on hash tables are in no way specific to open addressing. – user4815162342 Oct 16 '21 at 06:14
-1

Since a Hashmap is a growable structure, it's allocated dynamically by the caller but it's subject to stops during the line of run, due to other tasks the OS performs on the same core you're using.

Due to some OS level restrictions, there are limitations and benefits for hashmaps above other structures.

Runtime Memory Ordering

  • Then a hashmap grows but the memory ordering isn't stricly linear since it don't start with a fixed number of references.


  • Allocation of objects

  • Finally the complete block is represented by the print caller function the way it was allocated, and not in relationship to the data that was writen after the allocation was reordered by OS static linked threads.


  • That structure makes hashmaps extremely fast, due to the refs do not need to be atomically counted and sorted based on that initial count; and the way them are sorted is based on the keys themselves rather than from a preallocated count of references.


  • Such keys of the hashmap are stored in a non-exactly-linear manner by the caller function and reference linked to the object them hold. and Thus represented the way described.


  • VM based Languages

  • One last important thing to notice, is some languages (like rust itself) catch to OS signals while running some code you wrote, this is some form of elementary VM (virtual machine) catching by itself (the rust library embedded in your exe) the possible allocation or runtime errors in the application.

  • Former languages don't do runtime checks while the binary itself is run. Other languages do have a stack on its own, Like Java for an instance which is on itself a complete virtual machine. Virtual machines can ease a lot of problems related to the allocation of references in a timely shape, and then provide the basis for a more solid object references in the languaje model.
  • Kable
    • 7
    • 2
    • 1
      Neither the OS, nor the presence or absence of a VM have _any_ effect whatsoever on hashmap ordering (and precious little on its performance). – Jmb Oct 15 '21 at 06:32