2

I can't find an __iter__() method defined in rpy2.robjects.DataFrame, or in any of its base classes.*

Yet, I can use this code to convert a DataFrame into a dict:

from rpy2.robjects import DataFrame
dataframe = DataFrame(...)

d = dict(zip(dataframe.names, map(list, list(dataframe))))

Why doesn't list(dataframe) in the above code trigger a TypeError: 'DataFrame' object is not iterable?


* Determined by running the following code:

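def test_attr(cls, attr):
  # Print the name of the nearest class(es) in the hierarchy that define attr.
  if attr in cls.__dict__:
    print(cls.__name__)
  else:
    # Not defined here, so search each base class recursively.
    for base in cls.__bases__:
      test_attr(base, attr)

Python 2.7.8 (default, Oct 18 2014, 05:53:47)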
... 
>>> from rpy2.robjects import DataFrame
>>> test_attr(DataFrame, '__iter__')
  • There is `__getslice__` in rpy2.robjects.Vector. Also, `__getitem__`. – dom0 Nov 12 '14 at 22:25
  • Can you just type `rpy2.robjects.DataFrame.__iter__` at the interactive prompt and see whether it prints out a function or builtin function or raises an `AttributeError`? That would make the answer a lot easier to find. – abarnert Nov 12 '14 at 22:34
  • From your edit, have you still not tried just printing out `DataFrame.__iter__`? While that won't identify the class it's defined in in all cases, it's at least a quick and reliable way to test whether it exists… – abarnert Nov 13 '14 at 19:35

2 Answers

2

I think every robjects class is ultimately built on rinterface.

You can see the __iter__ method in

https://bitbucket.org/lgautier/rpy2/src/08ec0c15bd5ef8170ad8a49c2dc2b4a8dea36d64/rpy/rinterface/_rinterface.c?at=default#cl-2446

At least I think so... it gets pretty tangled pretty quickly.

Joran Beasley
  • I tried looking at the code and decided it wasn't worth trying to untangle… Nice detective work. – abarnert Nov 12 '14 at 22:35
  • 1
    There is a class diagram in the documentation: http://rpy.sourceforge.net/rpy2/doc-2.5/html/robjects.html#class-diagram – lgautier Nov 12 '14 at 23:33
  • 1
    That's good and all, but it doesn't help too much here... it doesn't even show rinterface, even though it is clearly a base class type – Joran Beasley Nov 12 '14 at 23:59
  • @lgautier: Well, I only got as far as `Vector` before giving up, so that might have helped me at least cut part of the search out… but fortunately, Joran found it in less time than it took me to give up anyway. :) – abarnert Nov 13 '14 at 00:01
1

The list function works in terms of the iter function.* And, as the docs for iter say:

Without a second argument, object must be a collection object which supports the iteration protocol (the __iter__() method), or it must support the sequence protocol (the __getitem__() method with integer arguments starting at 0).
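
To make the second half of that sentence concrete, here is a rough sketch of what the sequence-protocol fallback amounts to. It is an approximation written in Python, not the actual CPython implementation; `sequence_iter` and `Squares` are hypothetical names made up for this illustration.

```python
def sequence_iter(obj):
    """Roughly what iter(obj) does when obj defines only __getitem__."""
    index = 0
    while True:
        try:
            item = obj[index]
        except IndexError:
            return  # IndexError signals the end of the sequence
        yield item
        index += 1


class Squares(object):  # hypothetical example class, not part of rpy2
    def __getitem__(self, i):
        if 0 <= i < 5:
            return i * i
        raise IndexError(i)


squares = Squares()
builtin_result = list(squares)                 # built-in fallback
manual_result = list(sequence_iter(squares))   # the sketch above
```

Both lists should come out the same, since each approach asks for index 0, 1, 2, and so on until `IndexError` is raised.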


Here's an example of a class that's iterable** without defining __iter__:

class Range10(object):
    def __getitem__(self, i):
        # list() asks for indices 0, 1, 2, ... until IndexError is raised.
        if i < 10:
            return i
        raise IndexError
r = Range10()
list(r)

The output will be [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].


If you're curious, this "sequence protocol" is effectively how for loops worked in early Python; the modern definition was created for backward compatibility when iterators were added in Python 2.2.*** It could have been removed in 3.0, but there were good arguments for why it was useful, so it stayed.****


* Strictly speaking, at least in CPython, that's not how it's implemented, but it's documented to work as if it were calling iter.

** But notice that it's not an Iterable, even though that's one of the few "automatic ABCs" that you don't have to inherit from/register with. The documentation explicitly doesn't say that Iterable means iterable; it says "See also the definition of iterable".
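
The difference between "iterable" and `Iterable` can be checked directly. A small sketch, reusing the `Range10` class from the example above; the try/except import is only there so the snippet runs on both Python 2 and Python 3:

```python
try:
    from collections.abc import Iterable  # Python 3.3+
except ImportError:
    from collections import Iterable      # Python 2


class Range10(object):
    def __getitem__(self, i):
        if i < 10:
            return i
        raise IndexError


r = Range10()
is_abc_iterable = isinstance(r, Iterable)  # the ABC only looks for __iter__
can_iterate = list(iter(r)) == list(range(10))  # iter() still accepts it
```

Here `is_abc_iterable` is False while `can_iterate` is True, so calling `iter()` is the more reliable test of whether an object can be iterated.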

*** For example, third-party libraries like Numeric, the predecessor to today's NumPy, provided collection classes that worked in for loops in Python 2.1, and they wanted them to keep working even though for loops were now implemented in terms of iterators.

**** I don't remember what exactly the arguments were, but it must have had something to do with certain classes being more readable/easier to understand by thinking in terms of the sequence protocol instead of manually reproducing the same thing in terms of the iteration protocol. You'd have to hunt through the python-3000 list archives for details.

abarnert