1

If I instantiate/update a few lists very, very few times, in most cases only once, but I check for the existence of an object in that list a bunch of times, is it worth it to convert the lists into dictionaries and then check by key existence?

Or in other words is it worth it for me to convert lists into dictionaries to achieve possible faster object existence checks?

Derek
  • 11,980
  • 26
  • 103
  • 162
  • 1
    What is wrong with the `if element in list` syntax? Why would you want to convert a `list` into a `dict` for that? – Alex Jul 05 '13 at 07:19
  • 1
    @Alex: Because that's slow! – jason Jul 05 '13 at 07:19
  • Is this code you own? If so and having a lookup of cost O(1) is important, then can't you store in a dictionary/set to begin with? – Mike Jul 05 '13 at 07:20

4 Answers4

4

Dictionary lookups are faster the list searches. Also a set would be an option. That said:

If "a bunch of times" means "it would be a 50% performance increase" then go for it. If it doesn't but makes the code better to read then go for it. If you would have fun doing it and it does no harm then go for it. Otherwise it's most likely not worth it.

Dan D.
  • 73,243
  • 15
  • 104
  • 123
Martin Thurau
  • 7,564
  • 7
  • 43
  • 80
1

You should be using a set, since from your description I am guessing you wouldn't have a value to associate. See Python: List vs Dict for look up table for more info.

Community
  • 1
  • 1
korylprince
  • 2,969
  • 1
  • 18
  • 27
  • 1
    There's some pretty good performance info [here](http://interactivepython.org/courselib/static/pythonds/AlgorithmAnalysis/analysis.html#performance-of-python-data-structures) as well. – Mike Jul 05 '13 at 07:27
1

Usually it's not important to tune every line of code for utmost performance.

As a rule of thumb, if you need to look up more than a few times, creating a set is usually worthwhile.

However consider that pypy might do the linear search 100 times faster than CPython, then a "few" times might be "dozens". In other words, sometimes the constant part of the complexity matters.

It's probably safest to go ahead and use a set there. You're less likely to find that a bottleneck as the system scales than the other way around.

If you really need to microtune everything, keep in mind that the implementation, cpu cache, etc... can affect it, so you may need to remicrotune differently for different platforms, and if you need performance that badly, Python was probably a bad choice - although maybe you can pull the hotspots out into C. :)

John La Rooy
  • 295,403
  • 53
  • 369
  • 502
0

random access (look up) in dictionary are faster but creating hash table consumes more memory. more performance = more memory usages

it depends on how many items in your list.

aifarfa
  • 3,939
  • 2
  • 23
  • 35