
Summary: I want to take a variable of type 'module' and export it.

I'm importing a python module from a .py file using import and making changes to it. I need to export the module back to a file or obtain a string representation of the complete module that could then be written to disk.

I have been unable to find any way to export a Python module, or to convert the objects within a module to strings in a structured, plain-text, Python-executable format (not JSON, pickle, etc.).

Detailed Question & Use Case: This requirement is part of an internal build process; there are no security requirements, and only our own modules, not built-in modules, are being modified. A Python script runs with business logic to modify a number of other scripts. This process uses information only available at build time. As a result, I do not have the option to import a module with varying data at runtime.

The initial system used a template with placeholder strings that would be replaced, but the current requirements call for more complex modifications to object declarations, where programmatically modifying the object is far easier than string replacement.

What I've Done: With the master generator script, written in Python, I can import multiple modules (which have only variable declarations and no executable code) and make all the substitutions that I need. I'm left with a variable of type module that I need to export back to a file to later be executed.

@abarnert had some good ideas. I was unaware of the repr function. That got me the information but without any formatting. This led me to look at pprint which is as close as I've gotten so far to what I'd prefer.

Example: example.py

sample = {
    'level1_dict_1' : {
        'key1' : 'value1',
        'key2' : {
            'level2_dict' : {
                'key1' : 'value3',
                'key2' : ['list1','list2','list3','list4'],
            }
        }
    },
    'level1_dict_2' : {
        'key1' : 'value1',
        'key2' : 'value2',
    },
}

Greatly simplified (and without any business logic applied) I basically want to do the following:

import pprint

example = __import__('example')  # import first: opening with "w" truncates example.py
# Change one item in a list nested a few levels deep
example.sample['level1_dict_1']['key2']['level2_dict']['key2'][2] = "newlistitem3"
with open("example.py", "w") as outfile:
    outfile.write("sample = \n" + pprint.pformat(example.sample))

I'd love to have the same formatting as my source file but pprint, while readable, has different formatting than I would prefer. This might be as close as I can get to what I need though.

pprint output:

sample = 
{'level1_dict_1': {'key1': 'value1',
                   'key2': {'level2_dict': {'key1': 'value3',
                                            'key2': ['list1',
                                                     'list2',
                                                     'newlistitem3',
                                                     'list4']}}},
 'level1_dict_2': {'key1': 'value1', 'key2': 'value2'}}

EDITS & Clarifications:

  • My goal is to load a module, modify it, and save it back as an executable Python file. That is the reason I'm objecting to pickle, JSON, etc. I need to produce a single executable .py file.
  • Rewrote the use case for clarification.
  • Added examples and more information on things I've tried.

Eric
  • Why do you need to import the files? Can you not use a template system like Django's to produce customized PY files from templates? – jarmod Aug 02 '13 at 01:18
  • Or create _data_ files that your Python files read and generate objects from at runtime, instead of trying to create Python files? – abarnert Aug 02 '13 at 01:20
  • I think this is what you are looking for: http://stackoverflow.com/questions/14163532/is-there-anything-like-python-export – Jocke Aug 02 '13 at 01:22
  • Alternatively… if your objects are really nothing but literal strings, numbers, and builtin constants and (recursively) lists and dicts built out of them… calling `json.dumps`, or `yaml.dump`, or even just `repr` will give you executable source code that generates the same objects. Is that good enough? – abarnert Aug 02 '13 at 01:23
  • Meanwhile, what exactly is wrong with, say, JSON for your use case? It _is_ a structured plain text format, so if the only reason you can't use JSON is that you need a structured plain text format… then you can use JSON. – abarnert Aug 02 '13 at 01:30
  • Why does `pickle` not work for you? Or even `json`? Even if your data doesn't support JSON encoding immediately, you can write a lightweight encoder, pass that as a kwarg to `dump`, and voila, JSON. – roippi Aug 02 '13 at 01:33
  • @abarnert - I used "structured output" in the wrong context. I meant to say I needed an executable Python file as my output. I rewrote the question to clarify why I skipped JSON. @Jocke - I did read that question; however, it made changes to an imported module but didn't provide any way to export it back to disk. – Eric Aug 06 '13 at 01:08
  • One thing to keep in mind with `repr` is that, unlike something designed for persistence or interchange, if you give it something that can't be `repr`-`eval`-looped, instead of getting an error, you'll successfully generate useless output. For example, try it on an instance of a simple custom class, and you'll get something like `<__main__.Foo object at 0x...>`. If you want to verify that the output is usable, you may want to call `ast.literal_eval` on each formatted string before you save it. If that fails, most likely the generated code will also fail. – abarnert Aug 06 '13 at 02:00
  • This link was also helpful. It shows using `json.dumps` with indentation. For my requirements this gave me the output in the format I was looking for. I wasn't looking for data portability, but the output happens to be evaluable as Python for the data types I'm exporting: http://stackoverflow.com/a/3314411/783106 – Eric Aug 07 '13 at 21:21

1 Answer


What you're asking for isn't possible. You're trying to create source code out of object code. While there are decompiler projects that can do this to a greater or lesser extent, a fully general solution is intractable.

If you just want to get the original source to the module, that's easy. For example, you can use inspect.getsource, then write the resulting string to a file. Or just use inspect.getfile and copy from the resulting pathname.

And you can of course modify the original source while copying it. For example:

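import inspect
import foo  # the module whose source is being copied

source = inspect.getsource(foo)
with open('newfoo.py', 'w') as f:  # text mode: getsource returns a str
    f.write(source)
    f.write('spam = Spam(3)\neggs = Eggs(spam)\n')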

But you can't modify foo and then re-generate its source.
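To make that limitation concrete, here is a minimal sketch. The module name `foo_demo`, its contents, and the temporary directory are hypothetical stand-ins created only for this illustration; the point is that `inspect.getsource` re-reads the file on disk and knows nothing about changes made to the imported module object.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path

# Hypothetical module written to a temp directory for this demonstration.
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "foo_demo.py"
module_path.write_text("sample = {'key1': 'value1'}\n")

# Import the module from its file path.
spec = importlib.util.spec_from_file_location("foo_demo", str(module_path))
foo_demo = importlib.util.module_from_spec(spec)
spec.loader.exec_module(foo_demo)

# Modify the in-memory module object.
foo_demo.sample['key1'] = 'changed'

# getsource returns the text of the file, not the current state of the object.
source = inspect.getsource(foo_demo)
print(source)
```

The printed source still shows the original value, which is why the modified objects have to be serialized some other way.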


However, there are probably much better ways to do what you really need to do. For example:

  • Use JSON, pickle (protocol 0), YAML, etc. Despite what you claim, these are structured, readable plain-text formats, just like Python source.

  • Use repr. For string, numeric, and builtin-constant literals, and lists and dicts containing nothing but (recursively) the above types, repr is round-trippable. For example:


with open('newfoo.py', 'w') as f:
    for name, value in foo.__dict__.items():
        # Skip module internals such as __name__ and __builtins__.
        if name.startswith('__'):
            continue
        f.write('{} = {!r}\n'.format(name, value))
  • If you're pretty sure that all of your values are repr-able, but not positive, there are a number of ways to check or sanitize your output before writing it out. For example:

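import ast

with open('newfoo.py', 'w') as f:
    for name, value in foo.__dict__.items():
        if name.startswith('__'):
            continue  # skip module internals such as __builtins__
        # literal_eval raises on non-literals; the comparison catches lossy reprs
        if ast.literal_eval(repr(value)) != value:
            raise ValueError('Tried to save {}'.format(value))
        f.write('{} = {!r}\n'.format(name, value))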
  • Create data files in some nice format that's ideal for your code generator, human readers, etc. instead of trying to generate Python source, then write simple Python code that reads the data files and generates the objects at runtime.
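The last option could look roughly like the sketch below. It assumes JSON as the data format; the file name `sample.json`, the temporary directory, and the helper `load_sample` are made up for the illustration. The build step writes the data, and the shipped Python code only reads it.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical location for the generated data file.
data_path = Path(tempfile.mkdtemp()) / "sample.json"

# Build time: the generator modifies plain data and writes it out.
sample = {
    'level1_dict_1': {'key1': 'value1', 'key2': ['list1', 'list2']},
    'level1_dict_2': {'key1': 'value1', 'key2': 'value2'},
}
sample['level1_dict_1']['key2'][1] = 'newlistitem2'
data_path.write_text(json.dumps(sample, indent=4))


# Runtime: the deployed module rebuilds the object from the data file.
def load_sample(path):
    """Read the generated JSON file and return the resulting dict."""
    with open(path) as f:
        return json.load(f)


loaded = load_sample(data_path)
```

This keeps the generated artifact as data rather than code, at the cost of needing the data file next to the script at runtime.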
abarnert
  • This is very useful and mentioned a few techniques I was not previously aware of. I rewrote my question to reflect exactly what I was trying. Instead of a long use case I made a simple example. Using the techniques specified here I can fully accomplish my task of importing a module, making changes, then writing out to a file each attribute and contents. If this is as far as I get it would be good enough but if there was any way to better control the format of the output that would be ideal. Humans manually modify these files and the whitespace helps with that. – Eric Aug 06 '13 at 01:11
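On the formatting point raised in that last comment, one possible approach is a small recursive formatter that uses `repr` for leaf values and lays out dicts and lists one item per line. This is only a sketch under the assumption that the data consists of nested dicts and lists of simple literals; `format_literal` is a hypothetical helper, not part of the standard library.

```python
import ast


def format_literal(value, indent=4, level=0):
    """Return Python source for nested dicts/lists, one item per line."""
    pad = ' ' * (indent * (level + 1))
    closing_pad = ' ' * (indent * level)
    if isinstance(value, dict):
        if not value:
            return '{}'
        items = ['{}{!r} : {},'.format(pad, k, format_literal(v, indent, level + 1))
                 for k, v in value.items()]
        return '{\n' + '\n'.join(items) + '\n' + closing_pad + '}'
    if isinstance(value, list):
        if not value:
            return '[]'
        items = [pad + format_literal(v, indent, level + 1) + ','
                 for v in value]
        return '[\n' + '\n'.join(items) + '\n' + closing_pad + ']'
    # Leaves (strings, numbers, None, booleans) rely on repr being round-trippable.
    return repr(value)


sample = {
    'level1_dict_1': {
        'key1': 'value1',
        'key2': {'level2_dict': {'key1': 'value3', 'key2': ['list1', 'list2']}},
    },
    'level1_dict_2': {'key1': 'value1', 'key2': 'value2'},
}

source = 'sample = ' + format_literal(sample) + '\n'

# Verify the generated text evaluates back to the same object before saving it.
assert ast.literal_eval(format_literal(sample)) == sample
print(source)
```

Unlike `json.dumps`, this keeps Python spellings such as `True` and `None`, and the `ast.literal_eval` check guards against values whose `repr` is not valid source.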