I have a list composed of strings, integers, and floats, plus nested lists of strings, integers, and floats. Here is an example:

data = [
    1.0,
    'One',
    [1, 'Two'],
    [1, 'Two', ['Three', 4.5]],
    ['One', 2, [3.4, ['Five', 6]]]
]

I want each item of the list written to a line in a CSV file. So, given the above data, the file would look like this:

1.0
One
1,Two
1,Two,Three,4.5
One,2,3.4,Five,6

There are lots of resources about how to write a list to a file, but I have not seen any that do so regardless of how deeply the list is nested. I'm sure I could come up with something involving many loops, but does anyone have a more elegant solution?

EDIT: The best thing I have come up with is to convert each item in the list to a string, then remove the extra characters ("[", "]", etc.). Then you concatenate the item strings and write the result to a file:

string = ''
for i in data:
    line = str(i).replace("[","")
    line = line.replace("]","")
    line = line.replace("'","")
    line = line.replace(" ","")
    string+=line + '\n'

# write string to file...

This just feels kludgey, and it is potentially harmful as it assumes the strings do not contain the brackets, quotes, or spaces. I'm looking for a better solution!

Nolan Conaway
  • Please show us what you have tried. If you haven't tried anything yet, once you start off in a direction, someone can help you come up with an accurate solution. In fact, it should not be that hard: open a CSV file in write mode and, in a loop, start writing to the file. – karthikr Sep 30 '15 at 00:47
  • A more elegant solution *than what*? Show what you have tried... – David Zemens Sep 30 '15 at 00:49
  • I think it's simplest to [flatten each item first](http://stackoverflow.com/questions/10823877/what-is-the-fastest-way-to-flatten-arbitrarily-nested-lists-in-python) then save to csv. – zehnpaard Sep 30 '15 at 01:01

1 Answer

What you ask is more-or-less impossible.

CSV is a flat, tabular storage format. The hierarchical nature of "arbitrarily nested lists" simply does not fit into a tabular structure well.

You can definitely flatten the nested list so that each first-level element of your nested list will appear on a single line of the output file. But that isn't CSV, strictly speaking. Some CSV readers may correctly read the data, but others will not. And, once flattened as in your example, you can never reconstruct the original list by reading the file.

Demonstration:

[1, ["Two", "Three"], 4.0]

and

[1, ["Two", ["Three"]], 4.0]

both will emit:

1
Two,Three
4.0

So on reading that file, the reader/parser won't know which of the original lists to return--the first, two-level list, or the second, three-level list. (I can make that counter-example arbitrarily complex and ugly.)
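The ambiguity can be checked directly. The following is a minimal sketch, assuming the same flattening rule as in the question (one line per top-level element, leaves joined by commas); the helper names `leaves` and `to_lines` are made up for this illustration.

```python
def leaves(items):
    """Yield the leaf values of an arbitrarily nested list or tuple."""
    for item in items:
        if isinstance(item, (list, tuple)):
            for leaf in leaves(item):
                yield leaf
        else:
            yield item


def to_lines(nested):
    """Return one text line per top-level element, leaves joined by commas."""
    lines = []
    for row in nested:
        if isinstance(row, (list, tuple)):
            lines.append(",".join(str(leaf) for leaf in leaves(row)))
        else:
            lines.append(str(row))
    return "\n".join(lines)


two_level = [1, ["Two", "Three"], 4.0]
three_level = [1, ["Two", ["Three"]], 4.0]

# The two lists differ, yet their flattened text is identical.
same_text = to_lines(two_level) == to_lines(three_level)
print(two_level == three_level)  # False
print(same_text)                 # True
```

Because both inputs map to the same text, no reader can tell from the file alone which structure produced it.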

In general, nested / hierarchical structures and flat / tabular structures are just not easily or completely compatible.

If you want an easy storage format for an arbitrarily nested list, consider JSON or YAML. They provide easy, high-quality storage for nested data. E.g.:

import json

outpath = 'out.json'
with open(outpath, "w") as f:
    f.write(json.dumps(data))

would write your data to a file. To read it back in:

with open(outpath) as f:
    data = json.load(f)
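As a quick check that the nesting survives, here is a sketch of a full round trip using the example list from the question. The temporary directory is only an assumption for the demo so that nothing is written to the working directory; any writable path would do.

```python
import json
import os
import tempfile

data = [
    1.0,
    'One',
    [1, 'Two'],
    [1, 'Two', ['Three', 4.5]],
    ['One', 2, [3.4, ['Five', 6]]],
]

# Hypothetical output location, chosen only for this demonstration.
outpath = os.path.join(tempfile.mkdtemp(), 'out.json')

# json.dump writes straight to the open file object.
with open(outpath, 'w') as f:
    json.dump(data, f)

# json.load rebuilds the nested lists from the file.
with open(outpath) as f:
    restored = json.load(f)

print(restored == data)  # True
```

One caveat: JSON has no tuple type, so any tuples in the input would come back as lists.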

But if you really want CSV-ish text:

def flatten(l):
    """
    Flatten a nested list.
    """
    for i in l:
        if isinstance(i, (list, tuple)):
            for j in flatten(i):
                yield j
        else:
            yield i

def list2csv(l):
    """
    Return CSV-ish text for a nested list.
    """
    lines = []
    for row in l:
        if isinstance(row, (list, tuple)):
            lines.append(",".join(str(i) for i in flatten(row)))
        else:
            lines.append(str(row))
    return "\n".join(lines)

print(list2csv(data))

Yields:

1.0
One
1,Two
1,Two,Three,4.5
One,2,3.4,Five,6
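The plain `",".join(...)` above does no quoting, so a string containing a comma or a quote would corrupt the line. One possible way around that, sketched here under the assumption that the standard-library `csv` module's default quoting is acceptable, is to flatten each row and hand it to `csv.writer`. The names `iter_leaves` and `write_rows`, and the sample data, are made up for this illustration.

```python
import csv
import io


def iter_leaves(items):
    """Yield the leaf values of an arbitrarily nested list or tuple."""
    for item in items:
        if isinstance(item, (list, tuple)):
            for leaf in iter_leaves(item):
                yield leaf
        else:
            yield item


def write_rows(nested, stream):
    """Write one CSV row per top-level element, letting csv handle quoting."""
    writer = csv.writer(stream, lineterminator="\n")
    for row in nested:
        if isinstance(row, (list, tuple)):
            writer.writerow(list(iter_leaves(row)))
        else:
            writer.writerow([row])


# Sample with a comma and embedded quotes to exercise the quoting rules.
sample = [1.0, 'One, with comma', [1, 'say "Two"', ['Three', 4.5]]]

buffer = io.StringIO()
write_rows(sample, buffer)
text = buffer.getvalue()
print(text)

# Reading it back recovers the fields (as strings), commas and quotes intact.
parsed = list(csv.reader(io.StringIO(text)))
```

When writing to a real file in Python 3, the `csv` documentation recommends opening it with `newline=''`. The rows still have differing lengths, so this remains CSV-ish rather than a proper table, and the original nesting is still lost.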
Jonathan Eunice
  • I'm aware of the incompatibility of the two forms. However, for my purposes, the original structure of the nested list is unimportant (arbitrary, actually). – Nolan Conaway Sep 30 '15 at 01:28
  • Then a flatten approach can work. It's a little more direct work than using an existing I/O module like `json`, and it's still not very well-formed CSV, but you can avoid most problems with flattening. – Jonathan Eunice Sep 30 '15 at 01:32
  • Okay, so I added list-flatten-to-CSV code. It's not ideal, but it will do what you asked. – Jonathan Eunice Sep 30 '15 at 02:04
  • That does it! I had been trying to no effect to put something similar together – Nolan Conaway Sep 30 '15 at 02:05
  • Note that there are some ongoing complexities here. What if there are Unicode characters in your strings? This handles that in Python 3, but needs a little more code in Python 2. Also, what if there are commas or embedded quotes in the strings? A full, production-ready version of this needs more sophisticated quoting and escaping logic. But as a proof of concept, it works pretty well. – Jonathan Eunice Sep 30 '15 at 04:37