4

I am implementing a 15-puzzle solver by Ant Colony Optimization, and I am thinking a way of efficiently hashing each state into a number, so I waste the least amount of bytes.

A state is represented by a list of 16 numbers, from 0 to 15 (0 is the hole).

Like:

[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0]

So I want to create an unique number to identify that state. I could convert all the digits to a base 16 number, but I don't think that's very efficient Any ideas?.

Thanks

Alfonso Embid-Desmet
  • 3,561
  • 3
  • 32
  • 45

2 Answers2

7

Your state is isomorphic to the permutations of 16 elements. A 45 bit number is enough to enumerate those (log2 16!), but we might as well round up to 64 bit if it's beneficial. The problem reduces to finding an efficient conversion from the state to its position in the enumeration of states.

Knowing that each number in 0..16 occurs only once, we could create 16 variables of log2 16 = 4 bits each, where the ith variable denotes which position the number i occurs. This has quite a bit of redundancy: It takes log2(16) * 16 bits, but that's exactly 64 bit. It can be implemented pretty efficiently (untested pseudocode):

state2number(state):
  idx = 0
  for i in [0;16):
    val = state[i]
    idx |= i << (val * 4)
  return idx

I don't know if this is what you meant by "convert all the digits to a base 16 number". It is insanely efficient, when unrolled and otherwise micro-optimized it's only a few dozen cycles. It takes two bytes more than necessary, but 64 bit is still pretty space efficient and directly using it as index into some array isn't feasible for 64 nor for 45 bit.

  • Thanks! I think it would be `for i in 0..15` but yeah, it worked like a charm, thanks again! – Alfonso Embid-Desmet Dec 24 '13 at 00:18
  • @AlfonsoPérez I'm used to half-open ranges, but obviously `..` is the wrong notation for that, so: Good catch. –  Dec 24 '13 at 00:18
  • 2
    delnan, about "efficient" sorry I didn't specify, I was just wondering that as in this particular puzzle "only" 16!/2 states are reachable, then this way we would be "wasting" a couple of bytes, but I have given another thought to it and I think that it is really not necessary to be that pedantic. – Alfonso Embid-Desmet Dec 24 '13 at 00:25
  • sorry, and wouldn't it be more like: `idx |= val << (i*4)` ? – Alfonso Embid-Desmet Dec 25 '13 at 20:18
  • @AlfonsoPérez Now you planted doubts in my mind. I'm not sure (any more) whether it's correct. I know though that your alternative cannot be correct: Shifting by `i` means it can't touch most bits of `idx` and the various 4-bit variables interfere with each other. You should try my version (and the obvious alternative, `idx |= val << (i * 4)`) on some permutations and check whether the result meets the specification I gave. –  Dec 25 '13 at 20:23
  • I think `idx |= val << (i * 4)` is also correct. For 8 puzzle `idx |= val << (i * 3)` wouldn't be correct, because the biggest number is 8, so it takes 4 bits (binary 1000) - it must be shifted by 3 * 8 to work. But for 15 puzzle it doesn't metter, because biggest is 15 and it takes 4 bits. – alcohol is evil Oct 18 '15 at 10:00
  • But my question is: do all values must be unique? My boolean array for found states in 9 puzzle has more than 150,000,000 elements. It's not too much, but for 16 puzzle it will take gigabytes of memory. Am I right? How to store it then? – alcohol is evil Oct 18 '15 at 10:05
  • 1
    @alcoholisevil Congratulations, you just encountered combinatorial explosion. There are far too many possible game states (16!/2 ~= 10^13, as OP pointed out in a comment above) to enumerate them all or store some data for each and every one of them. I already said so in the answer. There are still useful operations you can do with the state identifier (e.g. use it as key to a hash map or other sparse array), but you can't have a full array with one element per possible state. –  Oct 18 '15 at 10:20
3

There are 16! = 2.09*10^13 possible states which needs about 44.25 bits to be encoded.

So if you want to encode the state in bytes, you need at least 6 bytes to do it.

Why not encode it this way:
Let us name the values a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p

The value can be

b`:= b - (b>a)?1:0;
c`:= c - (c>a)?1:0 - (c>b)?1:0
d`:= d - (d>a)?1:0 - (d>b)?1:0 - (d>c)?1:0
....

hashNumber= a+b*15+c*15*14+d`*15*14*15+....

This will give you a bijective mapping of each possible sate to a number fitting in 6 bytes.

Also converting the number back to its referring state is quite easy, if you need to do it.


Not optimal but fast is:
Use 4 bits for each number (leave out the last one because it can be computed from the previous 15 numbers) that needs 15*4 bits = 60 bits. can be stored in 7.5 bytes or if you are ok to waste more, simply use 8 bytes.

MrSmith42
  • 9,961
  • 6
  • 38
  • 49