1

I have huge dictionaries that I manipulate. More than 10 million words are hashed. It is too slow, and sometimes it runs out of memory.

Is there a better way to handle such huge data structures?

AlgoMan

2 Answers

9

Yes. It's called a database. Since a dictionary was working for you (aside from memory concerns), I would suppose that an SQLite database would work fine for you. You can use the `sqlite3` module quite easily, and it is very well documented.

Of course, this will only be a good solution if you can represent the values as something like JSON or are willing to trust pickled data from a local file. Maybe you should post details about what you have in the values of the dictionary. (I'm assuming the keys are words; if not, please correct me.)
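As a rough illustration of the idea, here is a minimal sketch that uses `sqlite3` as a word-keyed store with JSON-encoded values. The table name `words` and the helpers `open_store`, `put_many` and `get` are made up for this example, and it assumes the values are JSON-serializable. Pass a file path instead of `":memory:"` to keep the data on disk rather than in RAM.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store; 'words' is a hypothetical table name."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words ("
        "word TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def put_many(conn, items):
    """Insert or overwrite many (word, value) pairs in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO words (word, value) VALUES (?, ?)",
            ((word, json.dumps(value)) for word, value in items),
        )


def get(conn, word, default=None):
    """Look up one word; returns `default` if it is not stored."""
    row = conn.execute(
        "SELECT value FROM words WHERE word = ?", (word,)
    ).fetchone()
    return default if row is None else json.loads(row[0])


conn = open_store()
put_many(conn, [("apple", {"count": 3}), ("pear", {"count": 1})])
print(get(conn, "apple"))
```

Batching the writes through `executemany` inside a single transaction is usually much faster than committing after every insert, which matters at the scale of millions of words.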

You might also want to look at not generating the whole dictionary at once and processing the data in chunks instead. This may not be practical in your particular use case (unfortunately, it often isn't for the sort of thing dictionaries are used for), but if you can think of a way, it may be worth redesigning your algorithm to allow it.
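To make the chunking idea concrete, here is a small sketch under the assumption that the work is something like counting words from a stream. The names `chunks`, `process_in_chunks` and `handle_chunk` are hypothetical; the point is only that each partial dictionary is handed off (for example, written to disk) and can then be discarded, so only one chunk's worth of data is in memory at a time.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_in_chunks(words, size, handle_chunk):
    """Build a small per-chunk dictionary and hand it to `handle_chunk`."""
    for chunk in chunks(words, size):
        partial = {}
        for word in chunk:
            partial[word] = partial.get(word, 0) + 1
        handle_chunk(partial)  # e.g. merge into a database, then drop it


results = []
process_in_chunks(["a", "b", "a", "c", "a"], 2, results.append)
print(results)
```

In a real run `words` would be a lazy source such as a file iterator, and `handle_chunk` would persist the partial result rather than append it to a list.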

aaronasterling
  • In this situation a viable alternative to a full-blown database might be the built-in `shelve` module. It provides a nice Python dictionary-like interface, so conversion to using it should be relatively easy. – martineau Nov 19 '10 at 12:01
  • @martineau, that's a possibility. My line of thinking was that shelve would probably be significantly slower than sqlite but I could well be wrong as I haven't done any profiling on it. – aaronasterling Nov 19 '10 at 12:04
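
For reference, a minimal sketch of the `shelve` approach mentioned in the comments. The file location and the sample values are invented for the example. Note that `shelve` requires string keys and stores values with `pickle`, so the earlier caveat about trusting pickled data applies here too. No claim is made about its speed relative to SQLite.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf file(s).
path = os.path.join(tempfile.mkdtemp(), "words")

# Write: the shelf behaves like a dict, but entries live on disk.
with shelve.open(path) as db:
    db["apple"] = {"count": 3}
    db["pear"] = {"count": 1}

# Read back in a separate session.
with shelve.open(path) as db:
    apple = db["apple"]
    has_kiwi = "kiwi" in db

print(apple, has_kiwi)
```
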
1

I'm not sure what your words point to, but I guess they're quite big structures, if memory is an issue.

I did solve a Python `MemoryError` problem once by switching from 32-bit Python to 64-bit Python. In fact, some Python structures had become too large for the 4 GB address space. You might want to try that, as a simple potential solution to your problem.
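
If you are not sure which build you are running, a quick check along these lines should tell you. This is just a sketch; the variable names are arbitrary.

```python
import struct
import sys

# Size of a C pointer in the running interpreter, in bits.
pointer_bits = struct.calcsize("P") * 8

# sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print(f"{pointer_bits}-bit Python, 64-bit build: {is_64bit}")
```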

Eric O. Lebigot