5

USAGE CONTEXT ADDED AT END

I often want to operate on an abstract object like a list. e.g.

def list_ish(thing):
    for i in xrange(0,len(thing)):
        print thing[i]

Now this is appropriate if thing is a list, but it will fail if thing is a dict, for example. What is the Pythonic way to ask "do you behave like a list?"

NOTE:

hasattr(thing, '__getitem__') and not hasattr(thing, 'keys')

This works for all cases I can think of, but I don't like defining a duck type negatively, as I expect there are cases that it does not catch.

Really, what I want is to ask:
"hey, do you operate on integer indices in the way I expect a list to?" e.g.

  thing[i],  thing[4:7] = [...],   etc.

NOTE: I do not want to simply execute my operations inside a large try/except, since they are destructive. It is not cool to try and fail here.

USAGE CONTEXT

-- A "point-list" is a list-like thing that contains dict-like things as its elements.

-- A "matrix" is a list-like thing that contains list-like things.

-- I have a library of functions that operate on point-lists and also in an analogous way on matrix like things.

-- For example, from the user's point of view, destructive "spreadsheet-like" operations such as "column-slice" can operate on both matrix objects and point-list objects in an analogous way -- the resulting thing is like the original one, but only has the specified columns.

-- Since this particular operation is destructive, it would not be cool to proceed as if an object were a matrix, only to find out partway through the operation that it was really a point-list or none of the above.

-- I want my 'is_matrix' and 'is_point_list' tests to be performant, since they sometimes occur inside inner loops. So I would be satisfied with a test which only investigated element zero for example.

-- I would prefer tests that do not involve construction of temporary objects, just to determine an object's type, but maybe that is not the python way.
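As a rough illustration of the tests described above, here is a minimal sketch that inspects only element zero. It assumes Python 3 and the `collections.abc` classes; the bodies of `is_point_list` and `is_matrix` are guesses at what such tests might look like, not the asker's actual library code, and treating an empty sequence as acceptable is an arbitrary choice.

```python
from collections.abc import Mapping, MutableSequence


def is_point_list(thing):
    """True if thing is a mutable sequence whose first element is dict-like."""
    if not isinstance(thing, MutableSequence):
        return False
    # Only element zero is inspected; an empty sequence is accepted (assumption).
    return len(thing) == 0 or isinstance(thing[0], Mapping)


def is_matrix(thing):
    """True if thing is a mutable sequence whose first element is list-like."""
    if not isinstance(thing, MutableSequence):
        return False
    # Only element zero is inspected; an empty sequence is accepted (assumption).
    return len(thing) == 0 or isinstance(thing[0], MutableSequence)


points = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]
grid = [[1, 2], [3, 4]]

print(is_point_list(points), is_matrix(points))  # True False
print(is_point_list(grid), is_matrix(grid))      # False True
print(is_matrix("abc"), is_matrix({"a": 1}))     # False False
```

Strings are rejected because `str` is a `Sequence` but not a `MutableSequence`. Note that these checks only recognize classes that inherit from, or are registered with, the ABCs.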

In general I find the whole duck typing thing to be kinda messy, and fraught with bugs and slowness, but maybe I don't yet think like a true Pythonista.

happy to drink more kool-aid...

Dan Oblinger
  • The accepted way to do Duck Typing in Python is to treat it like a duck, then use try/except to find out that it didn't work. Can you be a little more descriptive about why you don't think that will work in your case? – Mark Ransom May 20 '15 at 00:06
  • Just check the type of `thing`? E.g. http://stackoverflow.com/questions/1835018/python-check-if-an-object-is-a-list-or-tuple-but-not-string – 101 May 20 '15 at 00:09
  • Are you just trying to fail-fast? Or do you really need to determine the type? If someone passes in a duck-typed "list" that doesn't derive `list`, how much of the `list` functionality do you require, or expect that they'll implement? – Dan Getz May 20 '15 at 00:12
  • Don't forget that you could make your function create a new list and return it instead of modifying in-place. Then the "try and fail" approach would work much better. – Dan Getz May 20 '15 at 00:42
  • Right off the bat, it's more Pythonic to use `for item in thing: print item` than to iterate over a sequence of indices. – chepner May 20 '15 at 00:44
  • I added a usage context section above. Does that clarify what I am looking for? P.S. @chepner I intentionally operated on a list in a non-Pythonic way, just to highlight that this would fail for a dict, whereas your code would not fail for a dict. – Dan Oblinger May 20 '15 at 00:46
  • Thanks, but I don't get it yet... what do you mean by "column-slice" as a destructive operation? Your description doesn't sound destructive, as you mention returning a new thing different from the original thing. That would be non-destructive, right? – Dan Getz May 20 '15 at 01:25
  • Checking for `keys` is pretty much how most of the Python implementation code I've seen distinguishes sequences and mappings when it needs to. – user2357112 May 20 '15 at 01:27
  • @dangetz Just as 'sort' returns no value and instead directly modifies the list it is sorting, my column_slice operation modifies all points within the point-list, or it modifies the inner lists within a matrix. So it is destructive -- my use case is serial operations over very large datasets, so like the numpy module, I allow the user to modify these structures in place, and require them to explicitly copy them when they want to separate effects in one place from actions in another place. – Dan Oblinger May 20 '15 at 01:30
  • I feel that Python is missing the ability to test if an object adheres to a particular interface requirement -- I understand such interfaces could be arbitrarily complex -- still, in the case of lists and dicts it seems bad that there is not a reliable way to test if one should operate on an object as if it were a list. But maybe with more kool-aid I will yet believe (grin) – Dan Oblinger May 20 '15 at 01:34
  • @DanOblinger: The `collections.abc` classes were supposed to address the problem, but there are a lot of missing `__subclasshook__` methods preventing them from actually being a solution. I'm not sure how possible it is to write satisfactory `__subclasshook__`s for some of them. – user2357112 May 20 '15 at 01:41
  • Well, don't forget that what you "should" do is not a problem of someone else's making, it's you that decided your function should accept multiple types of objects as input. You can make your function accept as many or as few types as you want, and you can make multiple versions of your methods that operate on different types of things. Just document it for your users. Also, you've never really been clear as to what's wrong with `isinstance` for your purposes. Have you really investigated that solution? – Dan Getz May 20 '15 at 01:41
  • @user2357112: Yes, it seems that Python realized late that interfaces are sometimes a good thing, but has not managed to implement a solution for them. Really, the 90% solution would be for the Python spec to write in English a precise list of properties expected of a list_like_thing and a dict_like_thing, and then have a single "one-and-only-one" way for a code writer to assert that an object is such a thing. As it stands now, with all the duck typing around, we are generally supposed to operate correctly on things even if they have not declared themselves to be conformant. – Dan Oblinger May 20 '15 at 02:10
  • @DanGetz: I don't think this perspective reflects the way the Python libraries themselves operate: when it makes sense, a single function (like the 'write' function on a file object) may perform different but analogous operations given different types of arguments. This is a *GOOD* thing, since I don't want to have to remember multiple function names when their action is "the same" at some logical level. So I think my goal is a good one, and is a (rare) example of where strong typing seems to simplify things. – Dan Oblinger May 20 '15 at 02:14
  • @DanGetz: And you asked if 'isinstance' would suffice. Perhaps it could be made to do so, but my plan is to extend numpy objects so that they expose list-of-dict-like views... this would allow me to write nifty scripts operating on these objects as if they were 'normal' Python objects, but under the covers they are funky other things, which do not allow normal Python slicing and dicing. My goal is that my current libraries simply operate on those objects as if they were normal Python list-of-dict or list-of-list. – Dan Oblinger May 20 '15 at 02:29

3 Answers

4

One thing you can do, that should work quickly on a normal list and fail on a normal dict, is taking a zero-length slice from the front:

try:
    thing[:0]
except TypeError:
    list_like = False  # probably not list-like
else:
    list_like = True   # probably list-like

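The slice fails on dicts because slice objects are not hashable, so they cannot be used as dict keys. (This holds for Python 2 and for Python 3 before 3.12; from 3.12 on, slices are hashable and the lookup on a dict raises KeyError instead of TypeError.)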

However, str and unicode also pass this test, and you mention that you are doing destructive edits. That means you probably also want to check for __delitem__ and __setitem__:

def supports_slices_and_editing(thing):
    if hasattr(thing, '__setitem__') and hasattr(thing, '__delitem__'):
        try:
            thing[:0]
            return True
        except TypeError:
            pass
    return False

I suggest you organize the requirements you have for your input, and the range of possible inputs you want your function to handle, more explicitly than you have so far in your question. If you really just wanted to handle lists and dicts, you'd be using isinstance, right? Maybe what your method does could only ever delete items, or only ever replace items, so you don't need to check for the other capability. Document these requirements for future reference.
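To see how the check above behaves on common built-in types, here is a small sketch. The function is repeated so the snippet is self-contained, with one change that is my own assumption rather than part of the answer: it also catches `KeyError`, which is what a dict raises for a slice lookup on Python 3.12 and later.

```python
def supports_slices_and_editing(thing):
    """Variant of the check above; also treats KeyError as 'not list-like'."""
    if hasattr(thing, '__setitem__') and hasattr(thing, '__delitem__'):
        try:
            thing[:0]  # zero-length slice: cheap on real sequences
            return True
        except (TypeError, KeyError):
            pass
    return False


samples = ([1, 2, 3], bytearray(b"abc"), (1, 2, 3), "abc", {"a": 1})
for sample in samples:
    # list and bytearray pass; tuple and str lack __setitem__; dict fails the slice
    print(type(sample).__name__, supports_slices_and_editing(sample))
```

Mappings that create missing keys on lookup (for example `collections.defaultdict`) could still slip through on newer Python versions, so this remains a heuristic.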

Dan Getz
1

When dealing with built-in types, you can use the Abstract Base Classes. In your case, you may want to test against collections.Sequence or collections.MutableSequence:

import collections

if isinstance(your_thing, collections.Sequence):
    pass  # access your_thing as a list

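These ABCs have been available since Python 2.6. In Python 3.3 and later they live in collections.abc (for example collections.abc.Sequence), and the old collections.Sequence alias was removed in Python 3.10.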

If you are using your own classes to build your_thing, I'd recommend that you inherit from these abstract base classes as well (directly or indirectly). This way, you can ensure that the sequence interface is implemented correctly, and avoid all the typing mess.

And for third-party libraries, there's no simple way to check for a sequence interface, if the third-party classes didn't inherit from the built-in types or abstract classes. In this case you'll have to check for every interface that you're going to use, and only those you use. For example, your list_ish function used __len__ and __getitem__, so only check whether these two methods exist. A wrong behavior of __getitem__ (e.g. a dict) should raise an exception.
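One option for third-party classes, sketched here under the assumption that you trust the class to really implement the interface, is to register it as a virtual subclass of the ABC. `ColumnView` below is a hypothetical stand-in for such a class; `register` is the standard ABC mechanism and does not verify that the methods exist or behave correctly.

```python
from collections.abc import MutableSequence


class ColumnView:
    """Hypothetical third-party class: list-like, but not derived from any ABC."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value

    def __delitem__(self, index):
        del self._items[index]

    def insert(self, index, value):
        self._items.insert(index, value)


view = ColumnView([1, 2, 3])
before = isinstance(view, MutableSequence)  # False: no inheritance, not registered

MutableSequence.register(ColumnView)        # declare it a virtual subclass
after = isinstance(view, MutableSequence)   # True after registration

print(before, after)
```

Registration is a declaration, not a check, so it only moves the "is this really list-like?" decision to one explicit place in your code.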

l04m33
  • Relying on those ABCs is dangerous, though; they're missing `__subclasshook__`s, so for example, `issubclass(shelve.Shelf, collections.Mapping)` is `False`, even though `shelve.Shelf` is standard library code. – user2357112 May 20 '15 at 02:15
  • @user2357112 This is a bug in the library then. And I believe it's fixed in 3.4 – l04m33 May 20 '15 at 02:24
  • It's not fixed. `shelve.Shelf` inherits from `MutableMapping` in Python 3.4, so that example doesn't reproduce the bug any more, but you still run into the same problem with any class that doesn't inherit from the ABC and isn't explicitly registered as a subclass. [There's an open bug report on the bug tracker about it.](https://bugs.python.org/issue23864) – user2357112 May 20 '15 at 02:27
1

Perhaps there is no ideal Pythonic answer here, so I am proposing a 'hack' solution, but I don't know enough about the class structure of Python to know if I am getting this right:

def is_list_like(thing):
    return hasattr(thing, '__setslice__')

def is_dict_like(thing):
    return hasattr(thing, 'keys')

My reduced goals here are simply to have performant tests that will:

  • (1) never call a dict-like thing or a string-like thing a list
  • (2) return the right answer for Python's built-in types
  • (3) return the right answer if someone implements a "full" set of core methods for a list/dict
  • (4) be fast (ideally without allocating objects during the test)

EDIT: Incorporated ideas from @DanGetz

Dan Oblinger
  • Looks faster, yes. Are you intending to call it on every element of a sequence? I thought you had sequences of which all elements were of the same type, in which case speed would not be much of an issue. You should be aware that the official documentation does not mention the `__setslice__` special method. The "official" way appears to be calling `__setitem__` with a `slice()` value. – Dan Getz May 20 '15 at 02:47
  • @DanGetz: Thanks for your patience. I incorporated your 'official' approach, but in a way that only performs hasattr. (Yes, in the data marshalling routines I will be performing a large 'type_case' on each element as I recurse through structures, in order to perform the correct specialized dump/load for those structures.) – Dan Oblinger May 20 '15 at 03:44
  • Sorry, I should have been more specific: I wasn't talking about a `list.slice()` method; that doesn't exist. I was referring to the built-in [slice objects](https://docs.python.org/3/library/functions.html#slice), which can be passed in to `__setitem__`. – Dan Getz May 20 '15 at 03:59
  • @DanGetz, got it. Yes, I just verified that 'slice' was not an attr of basestring nor of dict, but did not verify that it was an attr of list! OK, I will revert my answer. (For the record, I came up with my answer by doing set differences on the dir() for the respective classes.) The whole thing is hacky, but it will serve my purpose. Thanks much for your dedication! – Dan Oblinger May 20 '15 at 04:55
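The slice-object mechanism mentioned in the comments can be sketched as follows, assuming Python 3, where `__setslice__` no longer exists and slice assignment goes through `__setitem__`. `Recorder` is a hypothetical class used only to show what index object arrives.

```python
# Slice assignment and an explicit __setitem__ call with a slice object are equivalent.
data = [0, 1, 2, 3, 4]
data[1:3] = ['a', 'b', 'c']

same = [0, 1, 2, 3, 4]
same.__setitem__(slice(1, 3), ['a', 'b', 'c'])

print(data == same)  # True


class Recorder:
    """Hypothetical class that records the index objects passed to __setitem__."""

    def __init__(self):
        self.seen = []

    def __setitem__(self, index, value):
        self.seen.append(index)


r = Recorder()
r[4:7] = []   # arrives as a slice object
r[2] = 0      # arrives as a plain int
print(r.seen)
```

This is why checking for `__setitem__` alone cannot tell a sequence from a mapping: both define it, and only the accepted index types differ.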