1

Background: I need to read the same key/value from a dictionary (exactly) twice.

Question: There are two ways, as shown below,

Method 1. Read it with the same key twice, e.g.,

sample_map = {'A':1,}
...
if sample_map.get('A', None) is not None:
    print("A's value in map is {}".format(sample_map.get('A')))

Method 2. Read it once and store it in a local variable, e.g,

sample_map = {'A':1,}
...
ret_val = sample.get('A', None)
if ret_val is not None:
    print("A's value in map is {}".format(ret_val))

Which way is better? What are their Pros and Cons?

Note that I am aware that print() can naturally handle ret_val of None. This is a hypothetical example and I just use it for illustration purposes.

Georgy
  • 12,464
  • 7
  • 65
  • 73
C. Yan
  • 11
  • 2
  • By the way, the example is fine. Sure, `print` can handle `None`, but it would handle it by saying `A's value in map is None`, which isn't true. So a check like this makes sense. – abarnert Aug 01 '18 at 18:24
  • Thanks all of you for your great inputs. This example is really for illustration purposes, so what value is displayed is less important :). I am looking for comments more from perspectives of performance and readability. Many thanks for your comments again. – C. Yan Aug 03 '18 at 03:32

3 Answers3

1

Under these conditions, I wouldn't use either. What you're really interested in is whether A is a valid key, and the KeyError (or lack thereof) raised by __getitem__ will tell you if it is or not.

try:
    print("A's value in map is {}".format(sample['A'])
except KeyError:
    pass

Or course, some would say there is too much code in the try block, in which case method 2 would be preferable.

try:
    ret_val = sample['A']
except KeyError:
    pass
else:
    print("A's value in map is {}".format(ret_val))

or the code you already have:

ret_val = sample.get('A')  # None is the default value for the second argument
if ret_val is not None:
    print("A's value in map is {}".format(ret_val))
chepner
  • 497,756
  • 71
  • 530
  • 681
0

There isn't any effective difference between the options you posted.

Python: List vs Dict for look up table

Lookups in a dict are about o(1). Same goes for a variable you have stored.

Efficiency is about the same. In this case, I would skip defining the extra variable, since not much else is going on.

But in a case like below, where there's a lot of dict lookups going on, I have plans to refactor the code to make things more intelligible, as all of the lookups clutter or obfuscate the logic:

# At this point, assuming that these are floats is OK, since no thresholds had text values
                if vname in paramRanges:
                    """
                    Making sure the variable is one that we have a threshold for
                    """
                    # We might care about it
                    # Don't check the equal case, because that won't matter
                    if float(tblChanges[vname][0]) < float(tblChanges[vname][1]):

                        # Check lower tolerance
                        # Distinction is important because tolerances are not always the same +/-
                        if abs(float(tblChanges[vname][0]) - float(tblChanges[vname][1])) >= float(
                                paramRanges[vname][2]):
                            # Difference from default is greater than tolerance
                            # vname : current value, default value, negative tolerance, tolerance units, change date
                            alerts[vname] = (
                                float(tblChanges[vname][0]), float(tblChanges[vname][1]), float(paramRanges[vname][2]),
                                paramRanges[vname][0], tblChanges[vname][2]
                            )
                        if abs(float(tblChanges[vname][0]) - float(tblChanges[vname][1])) >= float(
                                paramRanges[vname][1]):
                            alerts[vname] = (
                                float(tblChanges[vname][0]), float(tblChanges[vname][1]), float(paramRanges[vname][1]),
                                paramRanges[vname][0], tblChanges[vname][2]
                            )
Gigaflop
  • 390
  • 1
  • 13
0

In most cases—if you can't just rewrite your code to use EAFP as chepner suggests, which you probably can for this example—you want to avoid repeated method calls.


The only real benefit of repeating the get is saving an assignment statement.

If your code isn't crammed in the middle of a complex expression, that just means saving one line of vertical space—which isn't nothing, but isn't a huge deal.

If your code is crammed in the middle of a complex expression, pulling the get out may force you to rewrite things a bit. You may have to, e.g., turn a lambda into a def, or turn a while loop with a simple condition into a while True: with an if …: break. Usually that's a sign that you, e.g., really wanted a def in the first place, but "usually" isn't "always". So, this is where you might want to violate the rule of thumb—but see the section at the bottom first.


On the other side…

For dict.get, the performance cost of repeating the method is pretty tiny, and unlikely to impact your code. But what if you change the code to take an arbitrary mapping object from the caller, and someone passes you, say, a proxy that does a get by making a database query or an RPC to a remote server?

For single-threaded code, calling dict.get with the same arguments twice in a row without doing anything in between is correct. But what if you're taking a dict passed by the caller, and the caller has a background thread also modifying the same dict? Then your code is only correct if you put a Lock or other synchronization around the two accesses.

Or, what if your expression was something that might mutate some state, or do something dangerous?

Even if nothing like this is ever going to be an issue in your code, unless that fact is blindingly obvious to anyone reading your code, they're still going to have to think about the possibility of performance costs and ToCToU races and so on.

And, of course, it makes at least two of your lines longer. Assuming you're trying to write readable code that sticks to 72 or 79 or 99 columns, horizontal space is a scarce resource, while vertical space is much less of a big deal. I think your second version is easier to scan than your first, even without all of these other considerations, but imagine making the expression, say, 20 characters longer.


In the rare cases where pulling the repeated value out of an expression would be a problem, you still often want to assign it to a temporary.

Unfortunately, up to Python 3.7, you usually can't. It's either clumsy (e.g., requiring an extra nested comprehension or lambda just to give you an opportunity to bind a variable) or impossible.

But in Python 3.8, PEP 572 assignment expressions handle this case.

if (sample := sample_map.get('A', None)) is not None:
    print("A's value in map is {}".format(sample))

I don't think this is a great use of an assignment expression (see the PEP for some better examples), especially since I'd probably write this the way chepner suggested… but it does show how to get the best of both worlds (assigning a temporary, and being embeddable in an expression) when you really need to.

abarnert
  • 354,177
  • 51
  • 601
  • 671