
I'm using the itertools.chain() function in Python to chain several Django querysets together. This way I don't touch the database, which is the efficient behaviour I need. However, I'm using a third-party library to paginate these results, and it only accepts list and queryset objects. When I call it with the chain object I get the following error:

Exception Value: 'itertools.chain' object has no attribute '__getitem__'

The line in the library (django-pagemore) that is actually driving me crazy is:

objects = self.objects[page0*self.per_page:1+page*self.per_page]

The problem is that a chain object can't be sliced.
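For illustration, here is a minimal standalone reproduction of the error (the lists stand in for the chained querysets):

```python
from itertools import chain

combined = chain([1, 2, 3], [4, 5, 6])
try:
    combined[0:2]        # chain has no __getitem__, so this fails
    sliced_ok = True
except TypeError:
    sliced_ok = False    # this branch is taken: chain objects can't be sliced
```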

I know that I could easily convert the chain object into a list with list(), but that would evaluate the ENTIRE queryset, which can contain thousands of items.

After some research on how to calculate the size of a Python object, I did some testing: sys.getsizeof(cPickle.dumps(content)) (where content is one of the objects inside the chain) gives me a value of 15,915 bytes, so a chain containing 3,000 of these objects would need approx. 45.53 MB!

Caumons

1 Answer


itertools.chain() returns an iterator, not a sequence. You cannot index or slice an iterator.

Use itertools.islice() to take a subset; when you loop over the islice() result, the underlying iterator is advanced to the start index, then yields items until the stop index:

from itertools import islice

objects = islice(self.objects, page0 * self.per_page, 1 + page * self.per_page)

This consumes the chained iterator up to the slice, so you can no longer access the items before the start index.
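As a sketch of the approach, with stand-in iterables instead of real querysets and hypothetical page bounds:

```python
from itertools import chain, islice

# Stand-in for chained querysets; any iterables behave the same way.
objects = chain(range(0, 5), range(5, 10))

# Take the "page" at positions 3..6 without materialising a full list.
page_items = list(islice(objects, 3, 7))   # [3, 4, 5, 6]

# Everything up to the stop index has now been consumed:
remaining = list(objects)                  # [7, 8, 9]
```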

Martijn Pieters
    Note that this also consumes part or all of the iterable as well. – Ignacio Vazquez-Abrams Aug 14 '13 at 13:00
  • @Caumons: Then you are slicing a chain that doesn't return enough objects, I suspect. It works just fine for me. – Martijn Pieters Aug 14 '13 at 13:21
  • @MartijnPieters OK, the problem was that I was iterating over the chain object before calling `islice()` on it, so the pointer was at the end of the iterable. However, I'm wondering if modifying the library will have side effects, because `islice()` will actually shorten the chain each time, and once a chain has been iterated, trying to iterate it again will do nothing. I've tried to do a `deepcopy()` of the chain but it doesn't seem to work. – Caumons Aug 14 '13 at 13:40
  • @Caumons: That is the nature of an iterator. You **cannot** rewind them or iterate more than once. If you need random access to the items, then you are *forced* to use a list. That, or *recreate* the iterator (call `chain()` on the queries again). – Martijn Pieters Aug 14 '13 at 13:44
  • @Caumons: TLDR: you cannot iterate over an iterator more than once. – Martijn Pieters Aug 14 '13 at 13:44
  • So, if I'm forced to use a list... Which is a better way to calculate the memory used than the method I used? And another thing: working with a 50 MB list would cause problems to a webserver, I mean, is it a terribly huge object in memory that MUST be avoided? – Caumons Aug 14 '13 at 13:47
  • 1
    You are trading memory vs database bandwidth here, btw. The chain is based on a database query, so recreating the chain for a second run-through results in a new series of database queries plus data transfer from database to web server process. loading everything into memory *may* be faster, and memory is cheap. – Martijn Pieters Aug 14 '13 at 13:53
  • You'll need to recursively call `sys.getsizeof()` on the content object and its attributes to calculate the memory size. Calling `sys.getsizeof()` on a pickle string only tells you how much memory that *string* takes, and a pickle is not necessarily a good way of representing how much memory the original object requires, because a pickle needs to be importable cross-platform. – Martijn Pieters Aug 14 '13 at 13:55
  • A Python integer on a 64-bit machine takes more memory than on a 32-bit machine, but the pickle for that integer is *the same size* on either. And if pickle *were* a reasonable representation of a memory footprint, then just using the `len()` of the string would be a much better way to measure the size. – Martijn Pieters Aug 14 '13 at 13:56
  • Sounds like you could benefit from using a sparse list filled only with the objects of the page, fooling the third-party library into retrieving from an almost-empty list. One example is the blist [http://stutzbachenterprises.com/blist/blist.html]. – augustomen Aug 14 '13 at 18:41
  • @MartijnPieters thanks for your comments! I'll take into account what you told me about measuring an object's memory size. (Maybe you want to add a better response to the linked question.) Finally, to solve the problem I simply limited the query and used `list()` in conjunction with `chain()`, as it's the simplest approach I can think of. It may be considered quick & dirty, but at least it will work (I hope). I've accepted your answer because you explained how to slice a chain! Thanks :) – Caumons Aug 16 '13 at 20:10
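To illustrate the recursive `sys.getsizeof()` approach mentioned in the comments, here is a rough sketch; the function name is made up, and the result is an approximation, not exact memory accounting:

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Rough recursive size estimate: object plus reachable containers/attributes."""
    if seen is None:
        seen = set()
    if id(obj) in seen:             # don't double-count shared objects
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    elif hasattr(obj, '__dict__'):  # recurse into instance attributes
        size += deep_getsizeof(vars(obj), seen)
    return size
```

Unlike measuring the pickle, this counts the in-memory footprint of the container and every item it holds, though it still misses things like slots-only objects and interpreter overhead.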