19

Similar questions have been asked before, but the solutions to those don't work for my use case (e.g., Making a flat list out of list of lists in Python and Flattening a shallow list in Python. I have is a list of strings and lists, where embedded list can also contain strings and lists. I want to turn this into a simple list of strings without splitting strings into list of characters.

import itertools

list_of_menuitems = ['image10', ['image00', 'image01'], ['image02', ['image03', 'image04']]]
chain = itertools.chain(*list_of_menuitems)

Resulting list:

['i', 'm', 'a', 'g', 'e', '1', '0', 'image00', 'image01', 'image02', ['image03', 'image04']]

Expected result:

['image10', 'image00', 'image01', 'image02', 'image03', 'image04']

What's the best (Pythonic) way to do this?

Community
  • 1
  • 1
Ian Gow
  • 3,098
  • 1
  • 25
  • 31
  • See: http://stackoverflow.com/questions/16176742/python-3-replacement-for-deprecated-compiler-ast-flatten-function – Jon Clements Jul 25 '13 at 17:12
  • I agree that it's an almost-duplicate of http://stackoverflow.com/questions/5286541/how-can-i-flatten-lists-without-splitting-strings. The one dimension here that is missing from that question (which I hadn't found before asking my question) is the issue of arbitrary levels of nesting. However, the solution posted there (and at http://stackoverflow.com/questions/16176742/python-3-replacement-for-deprecated-compiler-ast-flatten-function) handles this issue quite nicely, at least for the case I provided. – Ian Gow Jul 25 '13 at 17:22
  • OP: use `basestring` instead of `str` so that you don't split `unicode`. – 2rs2ts Jul 25 '13 at 17:44

5 Answers5

17

The oft-repeated flatten function can be applied to this circumstance with a simple modification.

from collections import Iterable
def flatten(coll):
    for i in coll:
            if isinstance(i, Iterable) and not isinstance(i, basestring):
                for subc in flatten(i):
                    yield subc
            else:
                yield i

basestring will make sure that both str and unicode objects are not split.

There are also versions which count on i not having the __iter__ attribute. I don't know about all that, because I think that str now has that attribute. But, it's worth mentioning.

(Please upvote the linked answer.)

Community
  • 1
  • 1
2rs2ts
  • 10,662
  • 10
  • 51
  • 95
  • 1
    I do suspect it would have been better to just link to that answer, that way it would serve as direction, and wouldn't introduce duplication. – Morgan Wilde Jul 25 '13 at 19:41
12

Using recursion.

def flatten(A):
    rt = []
    for i in A:
        if isinstance(i,list): rt.extend(flatten(i))
        else: rt.append(i)
    return rt

Test:

>>> list_of_menuitems = ['image10', ['image00', 'image01'], ['image02', ['image0
3', 'image04']]]
>>> flattern(list_of_menuitems)
['image10', 'image00', 'image01', 'image02', 'image03', 'image04']
emehex
  • 9,874
  • 10
  • 54
  • 100
rnbguy
  • 1,369
  • 1
  • 10
  • 28
9

The following works for strings (and would be easily adapted to other types):

def flatten_to_strings(listOfLists):
    """Flatten a list of (lists of (lists of strings)) for any level 
    of nesting"""
    result = []

    for i in listOfLists:
        # Only append if i is a basestring (superclass of string)
        if isinstance(i, basestring):
            result.append(i)
        # Otherwise call this function recursively
        else:
            result.extend(flatten_to_strings(i))
    return result

flatten_to_strings(list_of_menuitems)
Out[2]: ['image10', 'image00', 'image01', 'image02', 'image03', 'image04']
Ian Gow
  • 3,098
  • 1
  • 25
  • 31
  • 2
    This has some redundancy - isinstance takes account of inheritance, and you could switch the none test to the recursive branch – Marcin Jul 25 '13 at 17:13
  • @marcin Thanks for your comment. I tweaked my code to address these points. – Ian Gow Jul 25 '13 at 19:25
  • 2
    I like how straightforward this one is. In python 3.2 I just had to replace basestring with str and it works EXACTLY as I wanted. – user46147 Sep 04 '20 at 17:29
5

In one specialized case when none of the list items contains one of the following delimiters []', you can use the following hack. I have not profiled it, but it looks obvious that, this would have a better performance than the obvious and cleaner recursive solution.

>>> str(list_of_menuitems).translate(None,"[]'").split(',')
['image10', ' image00', ' image01', ' image02', ' image03', ' image04']

I agree, this is a dirty hack, but does the JOB, without much effort.

Abhijit
  • 62,056
  • 18
  • 131
  • 204
  • 1
    I would downvote this, except that you explained clearly its drawbacks. So it's a good reference I suppose. – 2rs2ts Jul 25 '13 at 17:42
1

This is a generic recursive flatten which can be used to work with any combination of types which should or should not be flattened:

import collections
def generic_flatten(seq, flatten_types=(tuple,list,set),atom_types=(basestring,dict),fixtype=True):
    newseq = []
    for item in seq:
        if (not isinstance(collections.Iterable)) or any(isinstance(i,t) for t in atom_types):
           newseq.append(item)
        elif any(isinstance(i,t) for t in flatten_types): # set flatten_types to (object,) or (collections.Iterable,) to disable check
           newseq.extend(generic_flatten(item, flatten_types, atom_types,fixtype)
    if fixtype and type(newseq) is not type(seq):
       newseq = type(seq)(newseq)
    return newseq

yield and chain could be used to create a generic iterator-based version.

Marcin
  • 48,559
  • 18
  • 128
  • 201