Python 3

Data might be a string, bytes or an EmailMessage.

Is this the most effective way to get the bytes out of data?

    while True:
        try:
            # data is a string
            f.write(data.encode('utf-8'))
            break
        except:
            pass
        try:
            # data is an EmailMessage
            f.write(data.get_bytes())
            break
        except:
            pass
        try:
            # data is bytes
            f.write(data)
            break
        except:
            pass
        else:
            self.log.info('Exiting, unknown attachment type')
            sys.exit(1)
Duke Dougal

2 Answers

    if hasattr(data, "encode"):
        f.write(data.encode('utf-8'))
    elif hasattr(data, "get_bytes"):
        f.write(data.get_bytes())
    else:
        try:
            f.write(data)
        except TypeError:
            self.log.info('Exiting, unknown attachment type')
            sys.exit(1)

Edit:

This is duck typing, as in "if it quacks like a duck, it is a duck; if it encodes, it is a string".

This is preferred over if isinstance(data, str) because it is less restrictive; so long as an object knows how to encode itself to bytes, we don't really care if it is actually a string or not.

Exceptions are relatively slow and should be reserved for handling unexpected or unlikely errors.
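
As a rough sketch of the same idea packaged as a helper, the attribute checks can be moved into one function that returns bytes. The names `to_bytes` and `FakeMessage` are made up for illustration. `get_bytes` is the method name used in the question; the standard-library `email.message.EmailMessage` offers `as_bytes()` instead, so `FakeMessage` is only a hypothetical stand-in for whatever object the asker has.

```python
import io


def to_bytes(data, encoding="utf-8"):
    """Return data as bytes by probing for the methods the caller needs."""
    if hasattr(data, "encode"):
        # str-like: knows how to encode itself
        return data.encode(encoding)
    if hasattr(data, "get_bytes"):
        # message-like, using the method name from the question
        return data.get_bytes()
    try:
        # anything that supports the buffer protocol (bytes, bytearray, ...)
        return bytes(memoryview(data))
    except TypeError:
        raise TypeError(
            f"unknown attachment type: {type(data).__name__}"
        ) from None


class FakeMessage:
    """Hypothetical stand-in for the question's EmailMessage."""

    def get_bytes(self):
        return b"Subject: hi\n\nbody"


buf = io.BytesIO()
for item in ("text", b"raw", FakeMessage()):
    buf.write(to_bytes(item))
```

Raising `TypeError` instead of calling `sys.exit(1)` inside the helper leaves the decision to log and exit with the caller.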

Hugh Bothwell
  • Really? I've been reading about duck typing and that I should be just trying to use expected methods and exception handle them if the method is not present. – Duke Dougal May 25 '14 at 00:47
  • @DukeDougal That's not really a good characterisation of duck typing. Duck typing is "just use the properties of the input that you need"; whether you handle the exceptions that result when those properties do **not** hold is mostly irrelevant. The key point is that if someone can make those properties hold for some other type that's completely different from what you were thinking of when you wrote your code, your code should still work on it. – Ben May 25 '14 at 00:54
  • @DukeDougal In this case, you're trying to make code that works provided the input has an `encode` method, *or* a `get_bytes` method, *or* can be directly handled by `f.write`. Those are the properties you need, so directly testing for those is fine; there's no "virtue" in using exceptions to switch between the different cases you're trying to handle. It's duck typing because you're not checking whether `data` is an `EmailMessage` before treating it like one, you're just seeing whether it has a `get_bytes` method. – Ben May 25 '14 at 00:57
  • `hasattr` is implemented (https://docs.python.org/3/library/functions.html#hasattr) by trying to get the attribute and handling the exception. So performance is a red herring. But your answer is nice otherwise. – aychedee May 25 '14 at 01:15
  • @aychedee, compared to real work, sure. But if this is all you're doing in a tight loop, handling an exception has about a 25% performance penalty. `hasattr` intercepts the exception at the C API level, which is more efficient than handling it in bytecode in the ceval loop. – Eryk Sun May 25 '14 at 01:45
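
For anyone who wants to check the performance point from the comments on their own interpreter, here is a small `timeit` sketch. It compares a `hasattr` test with a `try`/`except AttributeError` on an object that lacks `encode`. The class `NoEncode` and both function names are invented for the measurement, and the numbers will vary by Python version and machine, so no particular ratio is claimed here.

```python
import timeit


class NoEncode:
    """Hypothetical object without an encode method (the failing case)."""


obj = NoEncode()


def with_hasattr():
    # Test for the attribute before using it
    if hasattr(obj, "encode"):
        return obj.encode("utf-8")
    return None


def with_try():
    # Attempt the call and handle the failure in Python code
    try:
        return obj.encode("utf-8")
    except AttributeError:
        return None


t_hasattr = timeit.timeit(with_hasattr, number=200_000)
t_try = timeit.timeit(with_try, number=200_000)
print(f"hasattr: {t_hasattr:.3f}s  try/except: {t_try:.3f}s")
```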

From Wikipedia:

For example, in a non-duck-typed language, one would create a function that requires that the object passed into it be of type Duck, in order to ensure that that function can then use the object's walk and quack methods. In a duck-typed language, the function would take an object of any type and simply call its walk and quack methods, producing a run-time error if they are not defined. Instead of specifying types formally, duck typing practices rely on documentation, clear code, and testing to ensure correct use.

This doesn't mean you can call whatever methods you feel like, and catch an exception if it doesn't exist. You still need to make some effort to only call methods you are sure will be there. Duck-typing is more related to the fact that you don't care what the type of the object is (if you passed it into a function, you don't specify the type as you would in Java), but are still sure that it will respond to a specific method call.

In your case, I would create a wrapper around each of those objects (str, EmailMessage, bytes) with a common method get_data, where each implementation of get_data is specific to the type of object it wraps (encode, get_bytes, etc.). Then the retry loop is no longer needed, and your code would look as follows:

    try:
        # data is one of the wrappers, so get_data() returns bytes
        f.write(data.get_data())
    except AttributeError:
        # data was not wrapped
        self.log.info('Exiting, unknown attachment type')
        sys.exit(1)
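
The wrapper idea described above could be sketched roughly like this. The class names `StrSource`, `BytesSource` and `MessageSource` are invented for illustration. The message wrapper assumes the standard-library `email.message.EmailMessage`, whose serialising method is `as_bytes()` (the question's `get_bytes` is not part of that class).

```python
import io
from email.message import EmailMessage


class StrSource:
    """Wraps a str and encodes it on demand."""

    def __init__(self, text, encoding="utf-8"):
        self._text = text
        self._encoding = encoding

    def get_data(self):
        return self._text.encode(self._encoding)


class BytesSource:
    """Wraps bytes-like data and returns it as bytes."""

    def __init__(self, raw):
        self._raw = raw

    def get_data(self):
        return bytes(self._raw)


class MessageSource:
    """Wraps a stdlib EmailMessage and serialises it."""

    def __init__(self, message):
        self._message = message

    def get_data(self):
        return self._message.as_bytes()


msg = EmailMessage()
msg["Subject"] = "hi"
msg.set_content("body")

buf = io.BytesIO()
for source in (StrSource("text"), BytesSource(b"raw"), MessageSource(msg)):
    # Every wrapper answers the same call, so no type checks are needed here
    buf.write(source.get_data())
```

The type-specific decision is made once, where the wrapper is created, and the writing code only depends on `get_data`.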

Edit: You may also want to see this question which is related to yours: How to handle "duck typing" in Python?

Martin Konecny