
I have the following list:

geo=[
[ ['a'],     ['b','c']     ],
[ ['d','e'], ['f','g','h'] ],
[ ['i']                    ]
]

My aim is to get a list of sublists: the first sublist holds the elements in 1st position of each original sub-sublist, the second holds the elements in 2nd position, the third holds the elements in 3rd position, and so on. In other words, I need:

result=[
['a','b','d','f','i'],
['c','e','g'],
['h']
]

Bear in mind that the number of elements in each sub-sublist may vary, and so may the number of sub-sublists inside each sublist. Unfortunately, I cannot use Pandas or NumPy.

With zip and Alex Martelli's approach to flattening lists, I've been able to get a list with a tuple of the first elements, but I cannot iterate over the remaining elements.

result = list(zip(*[item for sublist in geo for item in sublist]))
# [('a', 'b', 'd', 'f', 'i')]
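
Only one tuple comes back because `zip` stops as soon as its shortest argument is exhausted, and the shortest sub-sublists here (`['a']` and `['i']`) have a single element. A minimal sketch of that behaviour, where `flat` and `truncated` are illustrative names not used in the original snippet:

```python
geo = [
    [['a'], ['b', 'c']],
    [['d', 'e'], ['f', 'g', 'h']],
    [['i']],
]

# Flatten one level: a list of the innermost lists.
flat = [item for sublist in geo for item in sublist]

# zip stops at the shortest input, so only position 0 survives.
truncated = list(zip(*flat))
print(truncated)  # [('a', 'b', 'd', 'f', 'i')]
```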

This is the last thing I need for a project that has kept me busy for the last 4 weeks. I'm almost done. Thank you very much in advance.

Victor

2 Answers


You can use itertools.zip_longest (izip_longest in Python 2), which pads the shorter lists with None instead of stopping at the shortest one:

import itertools
l = [[['a'], ['b', 'c']], [['d', 'e'], ['f', 'g', 'h']], [['i']]]
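# Flatten one level so that each item is one of the innermost lists.
flat = [inner for sub in l for inner in sub]
# Zip position by position, then drop the None padding from each column.
d = [list(filter(lambda x: x is not None, col)) for col in itertools.zip_longest(*flat)]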
print(d)

Output:

[['a', 'b', 'd', 'f', 'i'], ['c', 'e', 'g'], ['h']]
Ajax1234
  • That's a good one, although it should be noted that `None` (or whatever `fillvalue` is passed to `zip_longest` and then filtered on) must not be present in `geo`. Also, a plain `filter` will remove every "falsy" value (e.g. empty string or 0); you could use `filter(lambda i: i is not None, i)` if it were necessary to avoid that. – jdehesa May 23 '18 at 17:01
  • @jdehesa Good point regarding "falsy" values. Please see my recent edit. – Ajax1234 May 23 '18 at 17:02
  • Thank you both @Ajax1234 and dehesa for your help. You guys are terrific!! thank you very much! – Victor May 23 '18 at 17:11
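
The caveat raised in the comments (that `None` must not appear in the data) can be sidestepped by padding with a private sentinel object instead. This is only a sketch under that assumption: `_MISSING` and `transpose_ragged` are hypothetical names introduced for illustration, and the `None` placed in the sample data is there purely to show that it survives.

```python
from itertools import zip_longest

# Hypothetical private sentinel: a unique object that cannot occur in user data.
_MISSING = object()


def transpose_ragged(rows):
    """Group the items of ragged lists by position, dropping only the padding."""
    return [
        [x for x in column if x is not _MISSING]
        for column in zip_longest(*rows, fillvalue=_MISSING)
    ]


# Same shape as the question's data, with one genuine None added.
geo = [[['a'], [None, 'c']], [['d', 'e'], ['f', 'g', 'h']], [['i']]]
flat = [inner for sub in geo for inner in sub]
result = transpose_ragged(flat)
print(result)  # [['a', None, 'd', 'f', 'i'], ['c', 'e', 'g'], ['h']]
```

Comparing with `is not _MISSING` rather than truthiness also keeps falsy values such as `0` or `''`.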

You can do:

from itertools import chain

geo = [
[ ['a'],     ['b','c']     ],
[ ['d','e'], ['f','g','h'] ],
[ ['i']                    ]
]
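# Flatten one level: c is a list of the innermost lists.
c = list(chain.from_iterable(geo))
# For each position idx, collect the idx-th element of every list long enough to have one.
result = [[ci[idx] for ci in c if len(ci) > idx] for idx in range(max(map(len, c)))]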
print(result)

Output:

[['a', 'b', 'd', 'f', 'i'], ['c', 'e', 'g'], ['h']]
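
For comparison, the same grouping can be written as a single pass with plain loops and no imports. This is a sketch of an alternative, not the approach used in the answer above; it grows `result` on demand instead of computing the maximum length first.

```python
geo = [
    [['a'], ['b', 'c']],
    [['d', 'e'], ['f', 'g', 'h']],
    [['i']],
]

result = []
for sublist in geo:
    for inner in sublist:
        for idx, value in enumerate(inner):
            # First time this position is seen: open a new bucket for it.
            if idx == len(result):
                result.append([])
            result[idx].append(value)

print(result)  # [['a', 'b', 'd', 'f', 'i'], ['c', 'e', 'g'], ['h']]
```

Because `enumerate` counts up from 0 one step at a time, `idx` can never exceed `len(result)`, so the equality check is enough to decide when a new bucket is needed.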
jdehesa