I have hundreds of rows of data that look like this:

[[u' 16 '], [u'1x23'], [u'Mr Test', u' (5)'], [u'John Smith'], [u'54.5'], [], [u'10%'], [u'40%'], [u'$26,503']]

Some of the values are nested and some are empty.

I'm trying to massage it to be like this:

['16', '1x23', 'Mr Test', '(5)', 'John Smith', '54.5', '', '10%', '40%', '$26,503']

I've tried some ideas found on here like flattening, including the following routine:

def traverse(o, tree_types=(list, tuple)):
    if isinstance(o, tree_types):
        for value in o:
            for subvalue in traverse(value):
                yield subvalue
    else:
        yield o

This worked for some tables I've already parsed, but only when there are no empty values: an empty sublist yields nothing, so its slot disappears from the output.

croc
  • For clarification, do you want empty values to result in `""`, as I have seen you write, or to be ignored? – Inbar Rose Aug 14 '12 at 11:23
  • Show us the code you used to build `data`. Maybe we can suggest a way to build it in the form you want directly. – unutbu Aug 14 '12 at 11:32
  • One thing to watch out for: not all of your sub-lists are the same size. Some have one element, and others have two; if you flatten the list, then you may lose relationships such as between columns of a file, and end up indexing the wrong thing. A better question is: what do you want to do with the list? There may be an alternate data structure worth considering. – abought Aug 14 '12 at 11:41
  • Could you use some of the info here? http://stackoverflow.com/questions/2158395/flatten-an-irregular-list-of-lists-in-python – TakeS Aug 14 '12 at 11:43
  • Whatever you do, don't look to see if this has been asked before: http://stackoverflow.com/search?q=Flatten+list+of+lists+in+Python&submit=search – msw Aug 14 '12 at 11:52
  • @InbarRose, the empty values still need to be there. stummjr's answer works perfectly for my needs. – croc Aug 14 '12 at 12:19
  • @abought, actually the flattening out is ok as long as the empty values stay in place. The rows need further massaging and are then zipped into a dict structure, before display, logging and finally going into a db. – croc Aug 14 '12 at 12:22
  • @msw, went down that path already, that's where 'traverse' came from. stummjr's mod works like a charm – croc Aug 14 '12 at 12:32

3 Answers

Try this, where `a` is one row of your data:

sum((item or [""] for item in a), [])

Weird huh?
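
For what it's worth, here is a rough sketch of why that one-liner keeps the empty slots: an empty list is falsy, so `item or [""]` swaps it for `[""]` before the sublists are joined. The sketch uses `itertools.chain.from_iterable` instead of `sum` (a swapped-in technique, since `sum(..., [])` builds a new list on every addition), and the `flatten_row` name and the `strip()` step are assumptions added to match the output shown in the question.

```python
from itertools import chain

# Sample row copied from the question.
row = [[u' 16 '], [u'1x23'], [u'Mr Test', u' (5)'], [u'John Smith'],
       [u'54.5'], [], [u'10%'], [u'40%'], [u'$26,503']]

def flatten_row(row):
    """Flatten one level, keeping each empty sublist as a single ''."""
    # An empty sublist is falsy, so `item or [u'']` replaces it with [u''].
    padded = (item or [u''] for item in row)
    # Join the sublists in one pass and strip surrounding whitespace
    # (the strip is an assumption based on the desired output).
    return [value.strip() for value in chain.from_iterable(padded)]

flat = flatten_row(row)
print(flat)
```

This only flattens one level of nesting, which is all the sample row needs; deeper nesting would call for a recursive routine such as `traverse`.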

Jakob Bowyer

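This will flatten the list one level without failing on empty sublists. Note that an empty sublist contributes nothing to the result, so it is dropped rather than kept as '':

import operator
from functools import reduce  # builtin in Python 2; Python 3 needs this import

def flatten(a):
    # Concatenate the sublists left to right: a[0] + a[1] + ...
    return reduce(operator.add, a)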
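
A small check of how `reduce` treats empty sublists, as a sketch only. The shortened `row` and the padding step with `item or [u'']` are assumptions added for illustration; the `[]` initial value just keeps `reduce` from raising on an empty row.

```python
import operator
from functools import reduce  # builtin in Python 2, imported in Python 3

# Shortened sample with one empty sublist in the middle.
row = [[u'54.5'], [], [u'10%']]

# Plain concatenation: the empty sublist adds nothing, so its slot is lost.
plain = reduce(operator.add, row, [])

# Padding empty sublists with [u''] first keeps the slot in place.
padded = reduce(operator.add, (item or [u''] for item in row), [])

print(len(plain), len(padded))
```

If the columns must stay aligned, as the question requires, the padded variant is probably the one to use.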
SlimJim

If your only problem is with empty values, you could check for them inside the first `if`:

def traverse(o, tree_types=(list, tuple)):
    if isinstance(o, tree_types):
        if len(o) == 0:
            yield ''
        for value in o:
            for subvalue in traverse(value, tree_types):
                yield subvalue
    else:
        yield o
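
As a quick check against the sample row from the question, something like the following should work. It is a sketch: the `strip()` step, the `cleaned` and `record` names, and the generated `col0`, `col1`, ... keys are assumptions for illustration (the real column names are not given), and the `dict(zip(...))` step only mirrors the "zipped into a dict" plan mentioned in the comments.

```python
def traverse(o, tree_types=(list, tuple)):
    if isinstance(o, tree_types):
        if len(o) == 0:
            # An empty container becomes one empty string so the slot is kept.
            yield u''
        for value in o:
            for subvalue in traverse(value, tree_types):
                yield subvalue
    else:
        yield o

# Sample row copied from the question.
row = [[u' 16 '], [u'1x23'], [u'Mr Test', u' (5)'], [u'John Smith'],
       [u'54.5'], [], [u'10%'], [u'40%'], [u'$26,503']]

# Flatten, then strip whitespace to match the desired output.
cleaned = [value.strip() for value in traverse(row)]

# Hypothetical column names, generated only to show the zip step.
columns = ['col%d' % i for i in range(len(cleaned))]
record = dict(zip(columns, cleaned))

print(cleaned)
```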
Valdir Stumm Junior