Optimizing a feature in Python

Question

I have a piece of code that requires hours to run on long lists :

for elt in listCombinaisons_survivor:     
    effects = []
    for idBadge in elt:
        effects.append(dictBadges[idBadge]["Effect"])
    cpt = Counter(effects)
    for type_effect in cpt.keys():
        cpt_effect = cpt[type_effect]
        if cpt_effect > 3:
            listCombinaisons_survivor.remove(elt)
            break

My listCombinaisons_survivor contains all the possible combinations of 6 badges among 57 possibilities ([1,2,3,5,6,7],[1,2,3,5,6,8]...).

For each possible combination, I pick in a dictionary the "Effect" matching the badge ID (Badge ID 1 = Damage), and I collect all the effects in the list "effects".

I then count how much occurencies of each effect I have, and if I have more than 3 badges with the same effect in a combination, I delete this combination.

Is there any way I could optimize this code to make it faster ?

I also tried this but this is not faster :

newlist_combinations = []
for elt in listCombinaisons_survivor:     
    effects = []
    for idBadge in elt:
        effects.append(dictBadges[idBadge]["Effect"])
    cpt = Counter(effects)
    for type_effect in cpt.keys():
        cpt_effect = cpt[type_effect]
        if cpt_effect > 3:
            break
    if not cpt_effect > 3:
        newlist_combinations.append(elt)

I don't want to know how much time it takes, but how to optimize this code.

The whole code can be found there :

https://github.com/yirkkiller/Python/blob/master/badgesRepartition-NEW.py

Thanks !

Use a profile: https://docs.python.org/3/library/profile.html — , Apr 13 '18 at 11:49
Possible duplicate of [How can you profile a script?](https://stackoverflow.com/questions/582336/how-can-you-profile-a-script) — , Apr 13 '18 at 11:49
Not at all, I don't want to profile my script but optimize it to make it work ! :-) — Lory Michel, Apr 13 '18 at 11:53
dictBadges = { '43' = { "Letter" = "A", "Effect" = "Damage", "Rarety" = 5, ... }, ... } — Lory Michel, Apr 13 '18 at 11:57
thats just an empty dictionary, but it actually has data, share a snippet — eagle, Apr 13 '18 at 12:01
I am surprised you do not receive errors from trying to delete item(s) elt from listCombinaisons_survivor while iterating over it... As to your problem; a large list will always be time-itensive. If you could find a way to implement this with dictionaries instead, you would see a massive speed gain. (I am referring to implementing listCombinaisons_survivor as a dictionary versus a list and changing your paradigm some) — , Apr 13 '18 at 12:02
I clicked "add comment" too fast. The dictionary contains dictionaries named by the ID of the badge and containing strings matching the different caracteristics of the badge. dictBadges = { '43' = { "Letter" = "A", "Effect" = "Damage", "Rarety" = 5, ... }, '12' = { "Letter" = "B", "Effect" = "Health", "Rarety" = 4, ... } ... } — Lory Michel, Apr 13 '18 at 12:03
@Lory Michel, dictionary literals are written with {"name": "value"}, not {"name" = "value"} — , Apr 13 '18 at 12:08
I just guessing, but I think it should be `dictBadges = {'43': { "Letter": "A", "Effect": "Damage", "Rarety": 5,}, '12': { "Letter": "B", "Effect": "Health", "Rarety": 4 },}` — , Apr 13 '18 at 12:10
Only generate those combinations of badges that have an effect less than three times. It isn't hard to write and it will beat calling `combinations` and filtering. — Dan D., Apr 13 '18 at 12:13
Please post a [mcve]. Note that 57 choose 6 = 36288252, so if what you are doing with each such combination takes more than a fraction of a millisecond you can expect your code to take hours to run. — John Coleman, Apr 13 '18 at 12:16
@DanD. This is precisely what I have troubles to do. You can find the whole script in https://github.com/yirkkiller/Python/blob/master/badgesRepartition-NEW.py — Lory Michel, Apr 13 '18 at 12:21
@SurestTexas yes about the dictionary, those are not "=" but ":" obviously. — Lory Michel, Apr 13 '18 at 12:21
After 1,5 hour to run the script finished. But this part still has troubles. I find myself with a set of 6 badges with the same effect... :-( — Lory Michel, Apr 13 '18 at 12:23
after looking at your code, you would be helped a lot if you understand more basic and advanced python concepts, and change to an Object-Oriented approach — Maarten Fabré, Apr 13 '18 at 13:21

score 2 · Answer 1 · answered Apr 13 '18 at 12:27

Apart from the very unclear variable names, your algorithm basically comes down to this:

def filter_effects(listCombinaisons_survivor):
    for elt in listCombinaisons_survivor
        cpt = Counter(dictBadges[idBadge]["Effect"] for idBadge in elt)
        if all(value <= 3 for key, value in cpt.items()):
            yield elt

PS. If I look at your complete script, I would recommend you first learn the basics of Python. There are a lot of things that can be done a lot clearer, robuster and simpler

I suggest:

A whirlwind tour of Python (link)
Fluent Python (link)
'Looping like a Pro' by David Baumgold

Thanks Maarten :-) I'm indeed not a developer but a DevOps, so no proper training to development, just learning by myself. I know there are MANY things I don't master yet, I just try to make this script work as much as I can. — Lory Michel, Apr 13 '18 at 13:25

Optimizing a feature in Python

1 Answers1