Here is a summary of definitions.
container
- An object with a __contains__() method.
generator
- A function which returns an iterator.
iterable
- An object with an __iter__() or __getitem__() method.
- Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict and file.
- When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values.
iterator
- An iterable which has a next() method.
- Iterators are required to have an __iter__() method that returns the iterator object itself.
- An iterator is good for one pass over the set of values.
sequence
- An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence.
- Some built-in sequence types are list, str, tuple, and unicode.
- Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary hashable keys rather than integers.
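To make the iterable/iterator distinction concrete, here is a minimal sketch using only built-ins. The variable names are illustrative, not part of any API. It uses the built-in next() function rather than calling the next() method directly, so it should run on Python 2.6+ as well as Python 3 (where the method is spelled __next__()).

```python
# A list is an iterable: it can hand out many independent iterators.
numbers = [1, 2, 3]
it = iter(numbers)            # iter() returns a fresh iterator

# An iterator's __iter__() returns the iterator object itself.
same_object = iter(it) is it
print(same_object)            # True

# An iterator is good for one pass only.
first_pass = list(it)         # consumes the iterator
second_pass = list(it)        # nothing is left
print(first_pass)             # [1, 2, 3]
print(second_pass)            # []

# The list itself is not consumed; iter() gives a new iterator each time.
fresh = iter(numbers)
print(next(fresh))            # 1

# A dict supports __getitem__() and __len__(), but it is a mapping:
# lookups go by key, not by integer position.
ages = {'alice': 30}
print(ages['alice'])          # 30
print(len(ages))              # 1
```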
There are many ways to test whether an object is an iterable, an iterator, or a sequence of some sort. Here is a summary of these tests and how they classify various kinds of objects:
object          Iterable  Iterator  iter_is_self  Sequence  MutableSeq
[]              True      False     False         True      True
()              True      False     False         True      False
set([])         True      False     False         False     False
{}              True      False     False         False     False
deque([])       True      False     False         False     False
<listiterator>  True      True      True          False     False
<generator>     True      True      True          False     False
string          True      False     False         True      False
unicode         True      False     False         True      False
<open>          True      True      True          False     False
xrange(1)       True      False     False         True      False
Foo.__iter__    True      False     False         False     False

object          Sized  has_len  has_iter  has_contains
[]              True   True     True      True
()              True   True     True      True
set([])         True   True     True      True
{}              True   True     True      True
deque([])       True   True     True      False
<listiterator>  False  False    True      False
<generator>     False  False    True      False
string          True   True     False     True
unicode         True   True     False     True
<open>          False  False    True      False
xrange(1)       True   True     True      False
Foo.__iter__    False  False    True      False
Each column refers to a different way to classify iterables; each row refers to a different kind of object.
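The ABC-based checks and the attribute-based checks can disagree with what iter() actually accepts. This is easiest to see on a class that is iterable only through __getitem__(). The sketch below is an illustration under assumptions: the class Squares and the helper can_iterate are hypothetical names, and the import falls back from collections.abc (Python 3) to collections (Python 2).

```python
try:
    from collections.abc import Iterable   # Python 3.3+
except ImportError:
    from collections import Iterable       # Python 2


class Squares(object):
    """Hypothetical class that is iterable only via __getitem__()."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)   # tells iter() that the data has ended
        return index * index


def can_iterate(obj):
    """Hypothetical helper: True if iter(obj) succeeds, by any protocol."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


squares = Squares()
print(isinstance(squares, Iterable))   # False: the ABC only looks for __iter__
print(hasattr(squares, '__iter__'))    # False
print(can_iterate(squares))            # True: iter() falls back to __getitem__
print(list(squares))                   # [0, 1, 4]
```

Calling iter() inside a try block is the only one of these checks that covers both protocols named in the definition of an iterable, at the cost of creating (and discarding) an iterator.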
import collections
import os
import types  # needed by listtype() and tupletype() below

import pandas as pd


def col_iterable(obj):
    return isinstance(obj, collections.Iterable)


def col_iterator(obj):
    return isinstance(obj, collections.Iterator)


def col_sequence(obj):
    return isinstance(obj, collections.Sequence)


def col_mutable_sequence(obj):
    return isinstance(obj, collections.MutableSequence)


def col_sized(obj):
    return isinstance(obj, collections.Sized)


def has_len(obj):
    return hasattr(obj, '__len__')


def listtype(obj):
    return isinstance(obj, types.ListType)


def tupletype(obj):
    return isinstance(obj, types.TupleType)


def has_iter(obj):
    """Could this be a way to distinguish basestrings from other iterables?"""
    return hasattr(obj, '__iter__')


def has_contains(obj):
    return hasattr(obj, '__contains__')


def iter_is_self(obj):
    """Seems identical to col_iterator"""
    return iter(obj) is obj


def gen():
    yield


def short_str(obj):
    # Shorten reprs such as "<generator object gen at 0x...>" to "<generator>".
    text = str(obj)
    if text.startswith('<'):
        text = text.split()[0] + '>'
    return text


def isiterable():
    class Foo(object):
        """Iterable through __iter__ only; iterating consumes self.data."""

        def __init__(self):
            self.data = [1, 2, 3]

        def __iter__(self):
            while True:
                try:
                    yield self.data.pop(0)
                except IndexError:  # pop from empty list
                    return

        def __repr__(self):
            return "Foo.__iter__"

    filename = 'mytestfile'
    f = open(filename, 'w')  # an open file object, used as a test subject
    objs = [list(), tuple(), set(), dict(),
            collections.deque(), iter([]), gen(), 'string', u'unicode',
            f, xrange(1), Foo()]
    tests = [
        (short_str, 'object'),
        (col_iterable, 'Iterable'),
        (col_iterator, 'Iterator'),
        (iter_is_self, 'iter_is_self'),
        (col_sequence, 'Sequence'),
        (col_mutable_sequence, 'MutableSeq'),
        (col_sized, 'Sized'),
        (has_len, 'has_len'),
        (has_iter, 'has_iter'),
        (has_contains, 'has_contains'),
    ]
    funcs, labels = zip(*tests)
    # One row per object, one column per test.
    data = [[test(obj) for test in funcs] for obj in objs]
    f.close()
    os.unlink(filename)
    df = pd.DataFrame(data, columns=labels)
    df = df.set_index('object')
    # .loc does label-based slicing (inclusive on both ends), as .ix did here.
    print(df.loc[:, 'Iterable':'MutableSeq'])
    print('')
    print(df.loc[:, 'Sized':])


isiterable()
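The docstring of has_iter() asks whether checking for __iter__ could distinguish strings from other iterables. That matches the Python 2 results in the table above, but on Python 3 str does define __iter__, so the trick is not portable. A more explicit alternative is sketched below. The name is_nonstring_iterable is hypothetical, and treating bytes as a string on Python 3 is an assumption made for this example.

```python
try:
    string_types = (basestring,)    # Python 2: covers str and unicode
except NameError:
    string_types = (str, bytes)     # Python 3 (assumption: bytes counts as a string)


def is_nonstring_iterable(obj):
    """Hypothetical helper: True for iterables that are not strings."""
    if isinstance(obj, string_types):
        return False
    try:
        iter(obj)                   # accepts __iter__ and __getitem__ iterables
    except TypeError:
        return False
    return True


print(is_nonstring_iterable([1, 2]))      # True
print(is_nonstring_iterable('string'))    # False
print(is_nonstring_iterable(5))           # False
```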