0

I noticed some really bad performance, and my profiler pointed me to this construct:

right_mapping = next((                    
                        mapping
                        for mapping in ref_dat_mapping.map
                        if all(
                            legacy_object[k] == mapping[k]
                            for k in ref_dat_mapping.mapped_legacy_keys
                        )               
                      ), None)

I replaced it with the following, with instant improvements:

right_mapping = None
for mapping in ref_dat_mapping.map:
    if all(
        legacy_object[k] == mapping[k]
        for k in ref_dat_mapping.mapped_legacy_keys
        ):
            right_mapping = mapping
            break

Why is the latter construct twice as fast? Are there even faster ways of achieving this?

Fontanka16
  • 1,161
  • 1
  • 9
  • 37
  • There's no list comprehension here. – no comment Sep 27 '21 at 15:52
  • list comprehensions are generally faster than the speed of light – rv.kvetch Sep 27 '21 at 15:52
  • Added how much faster – Fontanka16 Sep 27 '21 at 15:54
  • 1
    In response to the question itself - I would quite like to know too, as from a quick glance it does look roughly equivalent (though when you say twice as fast, do you mean more like 0.001 vs 0.002 seconds, or 10 vs 20 seconds). In all fairness though, you should have written it the second way first time, it's easier to read. – Peter Sep 27 '21 at 15:55
  • 1
    This is probably down to the overhead of the execution control being passed back and forth between the generator and the consumer. These context changes don't occur within the for-loop. – user2390182 Sep 27 '21 at 15:59
  • 1
    @schwobaseggl The consumer is `next`. It doesn't switch back. – no comment Sep 27 '21 at 16:01
  • @don'ttalkjustcode True. It's still another stack element. – user2390182 Sep 27 '21 at 16:05
  • Try looking at the disassembled code for each. – Barmar Sep 27 '21 at 16:26
  • @schwobaseggl Yeah, just not as bad as your comment (or maybe an earlier version of it) made it sound to me. And amusingly, what you described *does* happen *more* between the `all` consumer and the generator it gets. Could be improved with `all(False for k in ref_dat_mapping.mapped_legacy_keys if legacy_object[k] != mapping[k])`, I think. Or by turning that into a `for` statement as well. – no comment Sep 27 '21 at 16:36
  • 2
    @Fontanka16 Realistic data would be very helpful for explaining and testing. – no comment Sep 27 '21 at 16:52

1 Answers1

2

If you see that much improvement with that, then probably the search ends very early and you're doing this very often, so other such micro-optimizations in there might help as well.

You could replace

    if all(
        legacy_object[k] == mapping[k]
        for k in ref_dat_mapping.mapped_legacy_keys
        ):

with

    if all(
        False
        for k in ref_dat_mapping.mapped_legacy_keys
        if not legacy_object[k] == mapping[k]
        ):

which avoids multiple back-and-forth between all and the generator.

Or turn that into a for statement as well:

right_mapping = None
for mapping in ref_dat_mapping.map:
    for k in ref_dat_mapping.mapped_legacy_keys:
        if not legacy_object[k] == mapping[k]:
            break
    else:
        right_mapping = mapping
        break

And maybe fetch ref_dat_mapping.mapped_legacy_keys beforehand instead of every time:

right_mapping = None
mapped_legacy_keys = ref_dat_mapping.mapped_legacy_keys
for mapping in ref_dat_mapping.map:
    for k in mapped_legacy_keys:
        if not legacy_object[k] == mapping[k]:
            break
    else:
        right_mapping = mapping
        break

And put it all inside a function if it's not already, so it benefits from faster local variables.

no comment
  • 6,381
  • 4
  • 12
  • 30