4

I was just wondering if there is a more cpu efficient way of writing the following loop as I need to speed my program up?

for char in data:
    if char in self.key:
        match += chr(self.key.index(char))

Thanks in advance for any help.

David Hoerster
  • 28,421
  • 8
  • 67
  • 102
Clinton Moffat
  • 211
  • 1
  • 2
  • 12

3 Answers3

7

Replace self.key with a dictionary; it is the membership testing against a list, as well as the .index() calls that are costing you the most performance; both require scans across the list.

Use str.join() to concatenate a series of characters; that builds one new string object instead of N new objects:

keys = {char: chr(i) for i, char in enumerate(self.key)}
match = ''.join([keys[char] for char in data if char in keys])

Dictionary membership tests and lookups are O(1) constant cost; by building the dictionary with char(..) values you can avoid multiple chr() calls per value; depending on how many values are re-used it could be faster to use char: i instead and move the chr() call to the list comprehension.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Just a little confused: why wouldn't `match = ''.join(keys[char] for char in data if char in keys)` work? Forgive me if it is something stupid I'm missing. –  Aug 31 '13 at 16:44
  • @iCodez: See [list comprehension without \[ \], Python](http://stackoverflow.com/a/9061024) for the reason why; due to how `str.join()` operates a list comp is more efficient than a generator expression. – Martijn Pieters Aug 31 '13 at 16:51
  • Thanks to everyone who responded. Lots of choices, went with Martijn's answer as it had the most votes. No offence to all those who provided other great alternatives as well. It really was a lot quicker than my attempt. Thanks again Clinton. – Clinton Moffat Aug 31 '13 at 16:58
4

Yes, use a dictionary instead of a list. index operation is slow (it's O(log(N)) as is the check if the element is in the list, while dictionary access is O(1))

self.map = dict(zip(self.key, range(len(self.key)))
for char in data:
     if char in self.map:
        match += chr(self.map[char])

Also change the constant adding to the string to just one string concatenation using join and a generator expression inside (avoid list creation):

result = ''.join(chr(self.map[char]) for char in data if char in self.map)
Viktor Kerkez
  • 45,070
  • 12
  • 104
  • 85
  • Thanks to everyone who responded. Lots of choices, went with Martijn's answer as it had the most votes. No offence to all those who provided other great alternatives as well. It really was a lot quicker than my attempt. Thanks again Clinton. – Clinton Moffat Aug 31 '13 at 16:59
  • @ViktorKerkez : Using a list comprehension is actually faster in the case of `join(...)`. – Sukrit Kalra Aug 31 '13 at 17:13
3
match = ''.join(char for char in data if char in self.key)
Ignacio Vazquez-Abrams
  • 776,304
  • 153
  • 1,341
  • 1,358
  • Thanks to everyone who responded. Lots of choices, went with Martijn's answer as it had the most votes. No offence to all those who provided other great alternatives as well. It really was a lot quicker than my attempt. Thanks again Clinton. – Clinton Moffat Aug 31 '13 at 16:59