I have a file named example_dict.py:

#This is a valid comment
{
    'key1': 'value1',
    'key2': 'value2',
    'key3': 'value3',
}

Then I read this file and convert the dict to an OrderedDict:

from collections import OrderedDict
with open("example_dict.py") as fp:
    dict_from_file = OrderedDict( eval( fp.read() ) )

But dict_from_file does not have its keys in the same order (key1, key2, key3).

How can I get this dict in the same order as the file?

moylop260
  • Eval will treat the data as an unordered dict which will scramble the keys. Casting to an Ordered Dict will be pointless. You may need to parse the data and feed it in token-by-token. – Mr. Polywhirl Sep 02 '14 at 23:55
  • Possible duplicate of: [Can I get JSON to load into an OrderedDict in Python?](http://stackoverflow.com/questions/6921699/can-i-get-json-to-load-into-an-ordereddict-in-python) – Mr. Polywhirl Sep 02 '14 at 23:56
  • @Mr.Polywhirl: While that question is related, I don't think the answer it describes is appropriate here. The file in this question contains a Python literal, not JSON. It's true that the example data is almost valid JSON (just change the single quotes to double quotes and get rid of the last comma), but I don't know if that will be true for the real data. – Blckknght Sep 03 '14 at 00:15
  • I was going to recommend using a parsing library like [`pyparsing`](http://pyparsing.wikispaces.com/) until I saw Jon Clements' answer. Now I think his idea is probably better… but it's still worth looking at `pyparsing` and playing with it if you never have before. (Or, if you know a bit more about parsers, search PyPI for other parser and pgen libraries that might be in a more familiar style.) – abarnert Sep 03 '14 at 00:40

3 Answers

You can write a custom parser using the ast module, for a starter:

import ast
from collections import OrderedDict

with open('example_dict.py') as fin:
    parsed = ast.parse(fin.read())

first_dict = next(node for node in ast.walk(parsed) if isinstance(node, ast.Dict))
keys = (node.s for node in first_dict.keys)
vals = (node.s for node in first_dict.values)
od = OrderedDict(zip(keys, vals))
# OrderedDict([('key1', 'value1'), ('key2', 'value2'), ('key3', 'value3')])

Note that although this works with your example data, it needs a bit more work to be robust. It should serve as a starting point, though.
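One possible direction for that extra work, as a minimal sketch only: require the source to be a single dict literal instead of taking the first dict found anywhere, and let ast.literal_eval evaluate each key and value so that numbers, tuples, lists and so on are handled, not only strings. The name load_ordered_literal is made up for this illustration, and nested dict values still come back as plain dicts.

```python
import ast
from collections import OrderedDict


def load_ordered_literal(source):
    """Parse one dict literal into an OrderedDict, keeping source order.

    Hypothetical helper for illustration, not part of the answer above.
    """
    # mode='eval' requires the source to be exactly one expression;
    # comment lines around it are ignored by the parser.
    tree = ast.parse(source, mode='eval')
    if not isinstance(tree.body, ast.Dict):
        raise ValueError('expected a dict literal at the top level')
    result = OrderedDict()
    for key_node, value_node in zip(tree.body.keys, tree.body.values):
        if key_node is None:
            # Python 3.5+ represents {**other} unpacking with a None key.
            raise ValueError('dict unpacking is not supported')
        # literal_eval accepts an AST node and only evaluates literals.
        result[ast.literal_eval(key_node)] = ast.literal_eval(value_node)
    return result


with_comment = "#This is a valid comment\n{\n    'key1': 'value1',\n    'key2': 'value2',\n    'key3': 'value3',\n}\n"
print(load_ordered_literal(with_comment))
```

Anything that is not a literal (a function call, a name) makes literal_eval raise ValueError, which is usually what you want for a data file.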

Jon Clements
  • Clever solution! Although there might be a way to take advantage of even more of the `ast` machinery to make it more robust, by using a `NodeTransformer` to translate the dict literal into an `OrderedDict` constructor of tuples. Let me see if I can get that to work… But even if I can, yours will still probably be more readable and easier to understand as a starting point. – abarnert Sep 03 '14 at 00:39
  • @abarnert I don't have time to do so - like I said - this is a starting point... hopefully more inspirational than anything really :) But if you have time and can get the `NodeTransformer` approach working (which I think sounds quite feasible) then **please, please share** - I'd be very keen to see it. – Jon Clements Sep 03 '14 at 00:41

@JonClements' solution is beautiful and simple—but, as he points out, it's not that robust, because you're depending on the fact that each element of the dictionary display will evaluate to itself—and that you've got some arbitrary code of which the first valid dict literal is the only thing you care about.

A related idea would be to use ast.NodeTransformer to transform the dict literal AST into an OrderedDict constructor AST, then just eval that.

Pros:

  • Once you get it working for trivial cases, it automatically works properly for more complex cases.
  • It's trivial to extend it from parsing single dict literals to converting all dict literals in an entire module (which you can then install as part of an import hook).
  • You get to learn more about how Python ASTs work.

Cons:

  • There's a lot more (and uglier) code to write to get it working for trivial cases.
  • Since you're not parsing the elements manually, it's not as easy to add in restrictions for, e.g., safely processing potentially malicious or incompetent input (e.g., by using literal_eval on each element).
  • You have to learn more about how Python ASTs work.

However, it's worth stepping back and asking whether you really want to write and use all this code. You might be a lot happier using something like MacroPy, which automates a lot of the clunky stuff being done here, and a lot of the stuff I'm not doing here (like installing import hooks), to let you concentrate on just the part of the transformation that's interesting to you. (Actually, I think MacroPy even comes with an odict literal as one of its builtin examples…)


Anyway, the transformer looks like this:

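import ast

class DictToOrdered(ast.NodeTransformer):
    def visit_Dict(self, node):
        # Transform nested dict literals first, so inner dicts are converted too.
        self.generic_visit(node)
        return ast.fix_missing_locations(ast.copy_location(
            ast.Call(
                func=ast.Attribute(
                    value=ast.Name(id='collections', ctx=ast.Load()),
                    attr='OrderedDict',
                    ctx=ast.Load()),
                # One positional argument: a tuple of (key, value) tuples.
                args=[ast.Tuple(elts=
                        [ast.Tuple(elts=list(pair), ctx=ast.Load())
                         for pair in zip(node.keys, node.values)],
                        ctx=ast.Load())],
                # starargs/kwargs were removed from ast.Call in Python 3.5,
                # so they are omitted here.
                keywords=[]),
            node))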

This is a little uglier than usual, because dict literals don't have to have a context (because they can't be used as assignment targets), but tuples do (because they can), so we can't just copy the context the way we do the line numbers.
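If you want to see that difference for yourself, here is a small sketch using only the standard library (Python 3 assumed; the variable names are just for illustration):

```python
import ast

# Dict nodes carry only keys and values; Tuple nodes also carry a ctx field,
# because a tuple can appear on the left-hand side of an assignment.
dict_fields = ast.Dict._fields
tuple_fields = ast.Tuple._fields
print(dict_fields)
print(tuple_fields)

# In "a, b = 1, 2" the same node type shows up with both contexts.
assign = ast.parse("a, b = 1, 2").body[0]
target = assign.targets[0]   # the tuple being assigned to
value = assign.value         # the tuple being read
print(type(target.ctx).__name__, type(value.ctx).__name__)  # Store Load
```

Since the tuples built by the transformer are only ever read, ast.Load() is the right context for all of them.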

To use it:

def parse_dict_as_odict(src):
    import collections
    parsed = ast.parse(src, '<dynamic>', 'eval')
    transformed = DictToOrdered().visit(parsed)
    compiled = compile(transformed, '<dynamic>', 'eval')
    return eval(compiled)

That assumes you want to evaluate exactly one expression at a time, and that you want to do so within the current global/local environment, and that you don't mind inserting the collections module into that environment; if you look at the docs for compile, ast.parse, and eval it should be obvious how to change any of those assumptions.
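For example, here is a rough sketch of one such change: evaluating in a private namespace instead of the caller's environment. It restates the transformer compactly so the block runs on its own (Python 3.5+ assumed), and parse_dict_as_odict_isolated is a hypothetical name. Note that this is not a sandbox: eval still makes the builtins available to the evaluated expression.

```python
import ast
import collections


class DictToOrdered(ast.NodeTransformer):
    """Compact restatement of the transformer above (Python 3.5+ ast.Call)."""

    def visit_Dict(self, node):
        # Build one (key, value) tuple node per dict entry, in source order.
        pairs = [ast.Tuple(elts=[k, v], ctx=ast.Load())
                 for k, v in zip(node.keys, node.values)]
        call = ast.Call(
            func=ast.Attribute(
                value=ast.Name(id='collections', ctx=ast.Load()),
                attr='OrderedDict', ctx=ast.Load()),
            args=[ast.List(elts=pairs, ctx=ast.Load())],
            keywords=[])
        return ast.fix_missing_locations(ast.copy_location(call, node))


def parse_dict_as_odict_isolated(src):
    """Hypothetical variant: evaluate in a private namespace."""
    parsed = ast.parse(src, '<dynamic>', 'eval')
    transformed = DictToOrdered().visit(parsed)
    compiled = compile(transformed, '<dynamic>', 'eval')
    # Only `collections` is made visible by name; the caller's globals
    # and locals are neither read nor modified.
    namespace = {'collections': collections}
    return eval(compiled, namespace)


result = parse_dict_as_odict_isolated(
    "{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}")
print(list(result.items()))
```

Passing a different dict as the second argument to eval is all it takes to expose more (or fewer) names to the file being read.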

So:

>>> src = '''
... {
...     'key1': 'value1',
...     'key2': 'value2',
...     'key3': 'value3',
... }
... '''
>>> parse_dict_as_odict(src)
OrderedDict([('key1', 'value1'), ('key2', 'value2'), ('key3', 'value3')])

If you want to learn more, without digging through the source code yourself, Green Tree Snakes is a great resource for understanding Python's ASTs and its ast module that I wish had been written a few years earlier. :)

abarnert
  • I had feared it would be something like that :) But definitely +1 - great work – Jon Clements Sep 03 '14 at 01:15
  • On a side note: is there meant to be a "Cons:" there? – Jon Clements Sep 03 '14 at 01:27
  • @JonClements: Yeah, thanks; the second half of the list is cons. Unless you think uglier and more verbose is a pro. :) I'll edit it. – abarnert Sep 04 '14 at 01:08
  • @abarnert :) Well, all I can say is if I was the OP of the question, I wouldn't hesitate to accept this very well researched, explained and detailed answer. In a toss up between "simple and meets the spec" (which we know normally doesn't turn out to be the *actual* spec :P) and this one for being more robust... I'd pick this one. – Jon Clements Sep 04 '14 at 01:33
  • @JonClements: Honestly, I'd probably pick a link-only answer saying "use MacroPy" over either one, but that doesn't fit at Stack Overflow. :) – abarnert Sep 04 '14 at 06:36

Python dictionaries do not have any inherent order. You probably already know this, since you're trying to put your data into an instance of OrderedDict, which does maintain the order its values are added in.

However, the problem you're having is that your eval expression is producing an ordinary dict instance first, and only after the order has already been lost does it get passed on to OrderedDict.

There's no direct way around this. If you use eval to parse a file with a dictionary literal in it, it's going to give you a regular dict.

There are other options though. You could write your own parsing code, and create the values to put in the OrderedDict directly without creating a regular dict first. This would be somewhat complicated, and you should probably pick a better file format if this is the approach you go for.

If in fact you can change the file's contents, you could simply have the eval call create some other data structure which you can pass to OrderedDict without losing the ordering information. A list of (key,value) 2-tuples would be a good option, requiring no other changes to your code:

[
    ('key1', 'value1'),
    ('key2', 'value2'),
    ('key3', 'value3'),
]
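With the file in that form, reading it might look roughly like this. This is only a sketch: the file_text string stands in for the contents of the modified file, and ast.literal_eval is swapped in for eval because it only accepts literals.

```python
import ast
from collections import OrderedDict

# Stand-in for the contents of the modified file (assumed layout).
file_text = """
[
    ('key1', 'value1'),
    ('key2', 'value2'),
    ('key3', 'value3'),
]
"""

# A list keeps its element order, so nothing is lost before OrderedDict sees it.
pairs = ast.literal_eval(file_text.strip())
dict_from_file = OrderedDict(pairs)
print(list(dict_from_file.keys()))  # ['key1', 'key2', 'key3']
```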

Note that in some future version of Python, keyword arguments passed in function calls may get put into an OrderedDict rather than a dict (as described in PEP 468). If that happens, you could change your file contents to the following, and get an OrderedDict directly from eval:

OrderedDict(
    key1='value1',
    key2='value2',
    key3='value3',
)

Alas, if you try this today you'll run into the same issue your current code does (the keyword arguments are packed into a regular dict which discards their ordering before the OrderedDict code gets a look at them). The keyword arguments to the OrderedDict constructor are not terribly useful.
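For readers on a newer interpreter: PEP 468 was implemented in Python 3.6, so from that version on the keyword-argument form does keep its order. A minimal sketch, assuming Python 3.6 or later and that the file can be changed as described (file_text stands in for the file contents):

```python
import sys
from collections import OrderedDict

# Stand-in for the contents of the modified file (assumed layout).
file_text = """OrderedDict(
    key1='value1',
    key2='value2',
    key3='value3',
)"""

# Ordered **kwargs are only guaranteed from Python 3.6 on (PEP 468).
assert sys.version_info >= (3, 6)

# eval needs OrderedDict to be resolvable by name, so pass it in explicitly.
dict_from_file = eval(file_text, {'OrderedDict': OrderedDict})
print(list(dict_from_file.items()))
```

On anything older than 3.6 the caveat above still applies and the order is not reliable.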

Blckknght
  • Hello @Blckknght, thanks for your reply. But this file belongs to another app, so I don't have access to modify it. – moylop260 Sep 03 '14 at 00:32
  • The solution provided at the end wouldn't work even if it were possible, because `OrderedDict` takes a `**kwargs` dict, meaning it will get those keys and values in arbitrary order, leaving you right where you started. – abarnert Sep 03 '14 at 00:41
  • @abarnert: That's exactly what I said in the answer. It may become an option if PEP 468 is enacted, as that proposes to make all **kwargs dictionaries `OrderedDicts`, but it won't work today. – Blckknght Sep 03 '14 at 08:24