In my program I need to store data for a large number of game board states (hundreds of thousands, possibly millions). For that I use a dict.
class BoardState(object):
    def __init__(self, ...):
        # ...
        self.board = [ [ None ] * self.cols for _ in xrange(self.rows) ]

    def __hash__(self):
        board_tuple = tuple([ tuple(row) for row in self.board ])
        return hash(board_tuple)

    # ...
self.board is a 2D list; in my main use case it has 6 rows and 7 columns.
At the beginning I indexed the dict with BoardState objects. But since I don't use the BoardState objects stored in the dict for any purpose other than future lookup, I noticed that I can save memory by indexing with hash(board_state) instead (this version uses 4 times less memory).
What is the chance that two different BoardState objects (with different boards inside) will produce the same value after hashing?
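For a rough sense of scale, here is a minimal sketch of the usual birthday-bound estimate. It assumes hash() behaves like an ideal, uniformly distributed function with a fixed number of output bits, which is only an approximation: the real tuple hash is not guaranteed to be uniform, and the output width depends on the Python build. The function name collision_probability and its parameters are hypothetical, chosen for illustration.

```python
import math

def collision_probability(n_keys, hash_bits=64):
    """Approximate chance that at least two of n_keys distinct inputs
    share a hash value, ASSUMING an ideal uniform hash of hash_bits bits."""
    space = 2.0 ** hash_bits
    # Birthday approximation: p ~= 1 - exp(-n * (n - 1) / (2 * space)).
    # expm1 keeps precision when the exponent is very close to zero.
    return -math.expm1(-n_keys * (n_keys - 1) / (2.0 * space))

for bits in (32, 64):
    for n in (100000, 1000000):
        print(bits, n, collision_probability(n, bits))
```

Under that idealized assumption, a million keys with a 64-bit hash give a probability on the order of 10^-8, while with a 32-bit hash a collision among a million keys is practically certain. A less uniform real-world hash can only make the numbers worse.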
To clarify a bit, this is how I store and retrieve values from the dict:
board_state = BoardState(...)
my_values[hash(board_state)] = { ... }
...
other_val_with_board_state = source_function()
retrieved = my_values[hash(other_val_with_board_state)]
(As I mentioned earlier, I index with the result of hash() to save memory, since I don't use the BoardState objects later.)
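To make the concern concrete, the sketch below shows what a collision would do under each indexing scheme. Board is a hypothetical stand-in for BoardState with a deliberately constant hash so that every instance collides; it is not the real class. A dict keyed by the objects falls back to __eq__ and keeps the entries apart, whereas a dict keyed by hash(...) only ever sees the integer.

```python
class Board(object):
    """Hypothetical stand-in for BoardState with a deliberately weak hash."""

    def __init__(self, cells):
        self.cells = tuple(cells)

    def __hash__(self):
        return 0  # force every instance to collide

    def __eq__(self, other):
        return isinstance(other, Board) and self.cells == other.cells


a = Board([1, 2])
b = Board([2, 1])

# Keyed by object: the dict compares keys with __eq__ after the hash matches,
# so the two boards remain separate entries.
by_object = {a: "value for a", b: "value for b"}

# Keyed by hash(): both boards map to the same integer key,
# so the second assignment silently overwrites the first.
by_hash = {}
by_hash[hash(a)] = "value for a"
by_hash[hash(b)] = "value for b"

print(len(by_object), len(by_hash))
```

So with hash-based indexing a collision does not raise an error; a lookup for one board just returns the data stored for another.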
UPDATE: Now I'm wondering whether using a string representation of board_state.board as the index would be a good solution to my problem.
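A minimal sketch of what such a string key could look like, assuming each cell holds None or one of two player markers. The helper name board_key, the marker values 1 and 2, and the characters they map to are all assumptions for illustration, since the actual cell contents are not shown above.

```python
# Hypothetical mapping from cell contents to one character each.
SYMBOLS = {None: ".", 1: "x", 2: "o"}

def board_key(board):
    """Flatten a 2D board (list of rows) into one string, row by row."""
    return "".join(SYMBOLS[cell] for row in board for cell in row)

rows, cols = 6, 7
empty = [[None] * cols for _ in range(rows)]

played = [list(row) for row in empty]
played[5][3] = 1  # first player drops a piece in the bottom row

my_values = {}
my_values[board_key(empty)] = {"note": "empty board"}
my_values[board_key(played)] = {"note": "one move played"}

print(board_key(played))
```

Because the dict compares string keys for equality after hashing them, two different boards of the same dimensions can never be confused with such a key; the trade-off is that each key is a 42-character string for a 6x7 board rather than a single integer.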