
I'm considering switching my Python app to a human-readable file format. Right now I use pickle to store my data in binary. I'm not sure whether XML or JSON is the way to go, but basically my file contains a list of lists that looks like this:

[1, 'the name of the set', [[1, 'data1', 'data2'],[2,'data3','data4']]]

The list that holds the other lists of strings can have many items (even hundreds). Basically, I'd like something with an easy interface for converting to and from a Python list, and I need those integers to stay integers.

user3056783
  • You need to take another look at your data structure. Multiple nested `list`s with a leading numbering element will not help you produce good code. – TigerhawkT3 May 12 '15 at 01:40
  • [YAML](http://pyyaml.org/) ftw -- It's much nicer to read for humans! – James Mills May 12 '15 at 01:45
  • pickle will keep the types; if you go JSON or XML, it won't. A nice alternative is to use protobuf, which has strong typing and allows both human-readable and efficient binary output – Bruce May 12 '15 at 01:45
  • My app already uses pickle so I know it works. Yeah, I know the code doesn't look pretty, but I wanted a user-editable file format, hence the leading numbers. They specify the ordering, and the user can edit just the numbers. I have a parser in my app which then re-orders and re-numbers the items in the list, then saves to a file again. Will check out YAML for future projects. Thank you! – user3056783 May 12 '15 at 03:03
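
The re-ordering and re-numbering step described in the last comment could look roughly like the sketch below. The function name `renumber` is hypothetical, and the code assumes the layout from the question (an id, a name, then inner lists that each start with their ordering number); it is not the asker's actual parser.

```python
def renumber(record):
    """Sort the inner lists by their leading number, then renumber them 1..n.

    Assumes the layout from the question: [set_id, set_name, [[n, ...], ...]].
    """
    set_id, name, items = record
    # sorted() is stable, so items given the same number keep their file order.
    ordered = sorted(items, key=lambda item: item[0])
    # Replace each leading number with its new position, keeping the rest as-is.
    renumbered = [[i] + list(item[1:]) for i, item in enumerate(ordered, start=1)]
    return [set_id, name, renumbered]


# A user edited the leading numbers by hand to move 'data1' to the front.
edited = [1, 'the name of the set', [[5, 'data3', 'data4'], [2, 'data1', 'data2']]]
result = renumber(edited)
```

Because the sort is stable, duplicate numbers typed by the user do not raise an error; the tied items simply keep their relative order.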

2 Answers


I'd use YAML here, specifically PyYAML; IMHO it's much nicer to read for humans!

Example: (by hand)

foo:
    - 1
    - 2
    - 3

(Nesting is left out of the example above.)

Use: yaml.dump() and friends.

Demo:

>>> import yaml
>>> data = {"foo": [[1, 2, 3], [4, 5, 6]]}
>>> print(yaml.dump(data, default_flow_style=None))  # None keeps leaf lists inline on PyYAML 5.1+
foo:
- [1, 2, 3]
- [4, 5, 6]

NB: As JSON is a subset of YAML, neither will preserve complex types by default; only basic types are supported: int, float, list, dict and str.
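
As a rough check that the integers from the question stay integers, here is a minimal round-trip sketch. It assumes PyYAML is installed (it is a third-party package, so the import is guarded), and the helper name `yaml_round_trip` is made up for illustration. It uses `safe_dump`/`safe_load`, which handle only the basic types listed above.

```python
try:
    import yaml  # PyYAML, a third-party package (pip install pyyaml)
except ImportError:
    yaml = None


def yaml_round_trip(data):
    """Dump data to YAML text and load it back using the safe API."""
    if yaml is None:
        raise RuntimeError("PyYAML is not installed")
    # default_flow_style=None writes leaf lists inline, e.g. [1, data1, data2].
    text = yaml.safe_dump(data, default_flow_style=None)
    return yaml.safe_load(text)


sample = [1, 'the name of the set', [[1, 'data1', 'data2'], [2, 'data3', 'data4']]]
if yaml is not None:
    restored = yaml_round_trip(sample)
    print(restored == sample)   # compares the nested lists element by element
    print(type(restored[0]))    # shows the type of the leading number
```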

James Mills
  • Actually @JamesMills, YAML is a superset of JSON: http://stackoverflow.com/a/1729545/1567452 – jwilner May 12 '15 at 01:55
  • Something I have never really used but it is a neat format. – Padraic Cunningham May 12 '15 at 02:03
  • I've been going through a big YAML phase, but I have mixed feelings about it. I use it for all sorts of config files, and I think it shines for more complicated data structures, or when your readers are non-programmers. Still, when it's a simpler data structure, all the white space starts to feel silly, and if all your readers are programmers who are looking for direct mappings to data structures, the curly and square brackets definitely end up being more legible. My coworkers have been ragging on me. – jwilner May 12 '15 at 02:22
  • Well, to each their own; it's just one of those "subjective" things really. Personally I find YAML much more readable, especially compared with raw dumped JSON without any formatting applied :) – James Mills May 12 '15 at 02:23

That's already nearly literal JSON; only the string quotes differ, since JSON requires double quotes. JSON's probably the most popular format out there, and it's hard to argue with its legibility.

In [104]: import json
In [105]: my_list = [1, 'the name of the set', [[1, 'data1', 'data2'],[2,'data3','data4']]]
In [106]: my_list == json.loads(json.dumps(my_list))
Out[106]: True
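
To spell out what that round trip does and does not preserve, here is a small sketch using only the standard-library `json` module. The variable names are illustrative. Integers, strings and lists come back unchanged, but two conversions are worth knowing about if the data structure ever grows beyond lists.

```python
import json

my_list = [1, 'the name of the set', [[1, 'data1', 'data2'], [2, 'data3', 'data4']]]
restored = json.loads(json.dumps(my_list))

# The leading numbers survive the round trip as int, not float or str.
ints_kept = all(isinstance(item[0], int) for item in restored[2])

# Two conversions to be aware of:
tuple_back = json.loads(json.dumps((1, 2)))      # a tuple comes back as a list
keys_back = json.loads(json.dumps({1: 'one'}))   # an int key comes back as the string '1'
```
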
jwilner
  • "hard to argue with its legibility" <-- except for the curly braces of course :) – James Mills May 12 '15 at 01:53
  • Cool, I didn't know that, because I kind of came up with it by accident :) Just a few more questions: 1. To make it more legible, can I format it (maybe each list item on a new line) when I'm dumping it into the file? 2. Can I write it to a file without using binary mode? Could the aforementioned formatting cause issues on different platforms? – user3056783 May 12 '15 at 03:02
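
On those two follow-up questions, a minimal sketch with the standard library: `json.dump` takes an `indent` argument for pretty-printing, and the file is opened in ordinary text mode. The file name and temporary directory are placeholders. Whitespace between JSON tokens is not significant, so the indentation and platform line endings do not change what `json.load` returns.

```python
import json
import os
import tempfile

my_list = [1, 'the name of the set', [[1, 'data1', 'data2'], [2, 'data3', 'data4']]]

# Hypothetical file location; a real app would use its own path.
path = os.path.join(tempfile.mkdtemp(), 'sets.json')

# Text mode ('w'), not binary; an explicit encoding avoids platform defaults.
with open(path, 'w', encoding='utf-8') as fh:
    json.dump(my_list, fh, indent=4)  # indent puts every element on its own line

# Reading it back ignores the added whitespace.
with open(path, encoding='utf-8') as fh:
    loaded = json.load(fh)
```

Note that `indent` breaks out every element, including those of the inner lists, so the file gets long; keeping each inner list on one line would need custom formatting.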