0

I am having difficulties with nested lists of lists in Python (this is the structure of GeoJSON coordinates).

Here is an example (updated to avoid confusion):

DictofCoordinates = {
    'a': [1, 1],
    'b': [[2, 2], [2, 2], [2, 2]],
    'c': [[[3, 3], [3, 3], [3, 3]]],
    'd': [[[41, 41], [41, 41]],
          [[42, 42], [42, 42]]],
}

What I want to get is the lists that contain nothing other than pairs (of coordinates). This is what I call an "atomic list of lists" (for lack of a better term).

so

 - for a: the list [1, 1]
 - for b: [[2, 2], [2, 2], [2, 2]]
 - for c: [[3, 3], [3, 3], [3, 3]]
 - for d: the two lists [[41, 41], [41, 41]] and [[42, 42], [42, 42]]

Taking inspiration from here, this is what I tried:

def ExplodeTolist(xList):
    for x1 in xList:
        if isinstance(x1[0], (float, int, long)):
            yield x1
        else:
            for x2 in ExplodeTolist(x1):
                yield x2

but it does not work

for x in ExplodeTolist(DictofCoordinates.values()):
    print x

any help appreciated. Thanks

Padraic Cunningham
user1043144
  • What is the output you want to get? – tmr232 Jun 19 '14 at 21:27
  • What on earth is an *"atomic list of list"*? – jonrsharpe Jun 19 '14 at 21:28
  • I think he just wants to flatten it .... but the question is what does __*does not work*__ actually mean? – Joran Beasley Jun 19 '14 at 21:33
  • Thanks to tmr232, jon and joran. I have edited the question. I do not want to flatten the list (see edit) – user1043144 Jun 19 '14 at 21:37
  • what output do you want? – Padraic Cunningham Jun 19 '14 at 21:40
  • Okay, you basically showed us the output that you **want**? But what is the output that you actually **get**? – Dan Lenski Jun 19 '14 at 21:43
  • The only real difference I see between your input and output is for the key `c`, where you remove one layer of nesting. Is that correct? – merlin2011 Jun 19 '14 at 21:46
  • When you return "two lists" for key `d` they are either going to be a tuple containing two lists, or a list containing two lists. In the latter case, that is exactly what the input is. – merlin2011 Jun 19 '14 at 21:47
  • Is it important that the original ordering is maintained? For instance, should the atoms of [[1,2],[[[3,4],[5,6]],[[7,8],[9,0]]]] always be [[1,2],[3,4],[5,6],[7,8],[9,0]] or can they be in different orders? – colinro Jun 19 '14 at 21:56
  • Hm. The expected output is inconsistent with the old expected output for `b`. Here, you ask for a list of lists. Before, you wanted the elements separately. Which is it? – jpmc26 Jun 19 '14 at 23:00

3 Answers

2

If I understand correctly, you just want to process the contents of each list, if it contains more lists. Otherwise, you want to return the list itself. I think the big mistake you're making is that your function is recursive, so it goes as deep as possible, and you end up with just an iterator over all points. Try this instead:

# You might want to modify this method to
# return False if it passes isinstance(x, basestring)
def is_iterable(x):
    try:
        iter(x)
        return True
    except TypeError:
        return False

def get_elements(coordinate_dict):
    for v in coordinate_dict.values():
        if is_iterable(v[0]):
            for i in v:
                yield i
        else:
            yield v

When it finds iterable contents, it iterates through the list and returns the elements. If the contents of the list are not iterable, it just returns the list. The key difference is that when it finds iterable contents, it iterates only one layer deep.
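To make the difference concrete, here is a minimal Python 3 sketch (the code in this thread targets Python 2) that runs both strategies on the question's data. The names `explode_recursive` and `explode_one_level` are illustrative and do not come from the question, and a plain `isinstance(..., list)` check is assumed in place of `is_iterable` to keep the example short.

```python
coords = {
    'a': [1, 1],
    'b': [[2, 2], [2, 2], [2, 2]],
    'c': [[[3, 3], [3, 3], [3, 3]]],
    'd': [[[41, 41], [41, 41]], [[42, 42], [42, 42]]],
}

def explode_recursive(items):
    """Recurse until an element starts with a number, so only pairs come out."""
    for item in items:
        if isinstance(item[0], (int, float)):
            yield item
        else:
            yield from explode_recursive(item)

def explode_one_level(items):
    """Unwrap at most one layer of nesting for each top-level value."""
    for item in items:
        if isinstance(item[0], list):
            yield from item
        else:
            yield item

# The recursive version produces 11 bare pairs; the one-level version
# produces 7 items, some of which are still lists of pairs.
print(len(list(explode_recursive(coords.values()))))
print(list(explode_one_level(coords.values())))
```

In Python 3.7 and later, dictionaries keep insertion order, so the one-level items come out in the order a, b, c, d rather than the shuffled order shown for Python 2.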

As seen in the comments, there's some debate over how to test if something is iterable. I recommend seeing "In Python, how do I determine if an object is iterable?" and its answers for more discussion of that topic.
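For readers who want to compare the two tests from the comments side by side, here is a small Python 3 sketch. It assumes `collections.abc.Iterable` (the Python 3 home of the `collections.Iterable` mentioned in the comments), and the class name `Odd` is made up for illustration; it mirrors the `__iter__ = 1` example from the discussion.

```python
from collections.abc import Iterable

def is_iterable(x):
    """Duck-typing test: ask iter() and treat TypeError as 'not iterable'."""
    try:
        iter(x)
        return True
    except TypeError:
        return False

class Odd:
    # Deliberately broken: the attribute exists but is not callable.
    __iter__ = 1

for value in ([1, 2], 3, "ab", Odd()):
    # Print both verdicts so any disagreement is visible.
    print(type(value).__name__, is_iterable(value), isinstance(value, Iterable))
```

The two tests agree on lists, numbers, and strings and differ only on the contrived class. Note that both report strings as iterable, which is why the answer suggests an extra string check if the data might contain them.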

Here is the output. It's a little out of order because dict is unordered, but all the elements are there:

>>> for i in get_elements(d):
...      print i
...
[1, 1]
[[3, 3], [3, 3], [3, 3]]
[2, 2]
[2, 2]
[2, 2]
[[41, 41], [41, 41]]
[[42, 42], [42, 42]]
jpmc26
  • `from collections import Iterable` – Padraic Cunningham Jun 19 '14 at 22:07
  • a faster implementation of is_iterable would return `True if getattr(x, '__iter__', False) else False`. It's 2-3 times slower to rely on exception handling – colinro Jun 19 '14 at 22:15
  • @colinro I based my iterable test off [this answer](http://stackoverflow.com/a/1952481/1394393). In my opinion, it's simply the most reliable way to tell if something is iterable. Any number of the techniques there could be employed. I don't know of any clear consensus on the Pythonic way of doing this test. – jpmc26 Jun 19 '14 at 22:17
  • `if isinstance(v[0],Iterable):` – Padraic Cunningham Jun 19 '14 at 22:19
  • @PadraicCunningham Yes, I understood that. I don't know what risks that entails; seems kind of iffy to me. The method I chose relies on plain old duck typing with the "easier to ask forgiveness" principle, two foundational principles in Python. Regardless, if the OP doesn't like my choice, they can easily pick another. – jpmc26 Jun 19 '14 at 22:21
  • I added an answer using collections.Iterable as I think it is the best solution, I don't see how it would be "iffy"? – Padraic Cunningham Jun 19 '14 at 22:30
  • @PadraicCunningham Try an instance of this class and see what your suggestion returns: `class X(object): __iter__ = 1`. On my system, it shows `True`, which is "what the heck?" worthy in my book. Calling `iter` on an instance of that class throws a `TypeError`. – jpmc26 Jun 19 '14 at 22:37
  • I don't understand what that has to do with this question, you can use collections.Iterable to flatten any iterable containing any iterables which is what this question is related to. – Padraic Cunningham Jun 19 '14 at 22:42
  • @PadraicCunningham You asked why it was "iffy;" I gave an example. The point is that `isinstance(x, Iterable)` is black magic. I chose a simpler, more reliable, easier to understand solution for testing if something is iterable. If I'm going to do something and I can make it more flexible and forgiving with no extra effort, that's a win, especially in Python, where you can never be totally sure what kind of type you're dealing with. It's just a matter of good general practices. – jpmc26 Jun 19 '14 at 22:47
  • I still don't see the "iffy" part when we are dealing with flattening lists, I use a slightly different version of my code to flatten lists with mixed types, tuples,lists etc.. and it works perfectly. How is defining a function simpler than `from collections import Iterable`? – Padraic Cunningham Jun 19 '14 at 22:52
  • @PadraicCunningham It's not. The `try`/`catch` is simpler than the `isinstance` check because the `isinstance` check is doing very weird things behind the scenes. That's all I'm going to say on this. You posted your answer, and this one is mine. I appreciate your initial suggestion, and I made a decision after considering it. I even learned something in the process. Thank you. – jpmc26 Jun 19 '14 at 23:02
  • @jpmc26: thanks a lot. I learned a lot from the code. Your code works , I figure I would need just to add another condition to avoid it exploding also the coordinate lists. I am accepting the answer of Padraic (second function) as it does exactly that. – user1043144 Jun 20 '14 at 04:03
1

You actually only need to check if element[0] is a list.

def flatten(items):
    for elem in items:
        if isinstance(elem[0],list):
            for sub_elem in elem:
                yield sub_elem
        else:
            yield elem

print list(flatten(DictofCoordinates.values())) 
[[1, 1], [[3, 3], [3, 3], [3, 3]], [2, 2], [2, 2], [2, 2], [[41, 41], [41, 41]], [[42, 42], [42, 42]]]

To match your new output:

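def flatten(items):
    for elem in items:
        # Yield elem whole if it is a bare pair or a list of bare pairs.
        if (not any(isinstance(i, list) for i in elem)
                or not any(isinstance(i, list) for i in elem[0])):
            yield elem
        else:
            # Otherwise unwrap exactly one level of nesting.
            for sub_elem in elem:
                yield sub_elem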

print (list(flatten(DictofCoordinates.values())))
[[1, 1], [[3, 3], [3, 3], [3, 3]], [[2, 2], [2, 2], [2, 2]], [[41, 41], [41, 41]], [[42, 42], [42, 42]]]
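The function above unwraps at most one level. If the coordinates can be nested more deeply (as in some GeoJSON geometry types), one possible generalization is to recurse until a value is either a pair or a list made up only of pairs. This is a Python 3 sketch, not part of the original answer; `is_pair` and `atomic_lists` are hypothetical names, and it assumes well-formed input in which every leaf is a two-number list.

```python
def is_pair(x):
    """True for a two-element list of numbers, e.g. [3, 3]."""
    return (isinstance(x, list) and len(x) == 2
            and all(isinstance(n, (int, float)) for n in x))

def atomic_lists(value):
    """Yield every innermost list that holds nothing but coordinate pairs."""
    if is_pair(value) or all(is_pair(v) for v in value):
        yield value
    else:
        for sub in value:
            yield from atomic_lists(sub)

coords = {
    'a': [1, 1],
    'b': [[2, 2], [2, 2], [2, 2]],
    'c': [[[3, 3], [3, 3], [3, 3]]],
    'd': [[[41, 41], [41, 41]], [[42, 42], [42, 42]]],
}

for key, value in coords.items():
    print(key, list(atomic_lists(value)))
```

Calling `atomic_lists` once per dictionary value keeps track of which key each atomic list came from, which the flattened output loses.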
Padraic Cunningham
0

This function will return the atoms as you describe.

def getAtoms(lst):
    for l in lst:
        yield l
ScottO
  • No it won't. That's just all the elements of the outermost list – colinro Jun 19 '14 at 21:51
  • Each value in the dictionary is a list. If you pass this function one of those lists, e.g. the value d: [[[41, 41], [41, 41]], [[42, 42], [42, 42]]], it will yield [[41, 41], [41, 41]] and then [[42, 42], [42, 42]] in two successive gen.next() calls. This is what the OP example shows. – ScottO Jun 19 '14 at 22:28
  • I see what you mean now. I had been interpreting the question differently – colinro Jun 19 '14 at 22:33