41

I'm working on pypreprocessor, a preprocessor that takes C-style directives. I've been able to make it work like a traditional preprocessor (it's self-consuming and executes postprocessed code on the fly), except that it breaks library imports.

The problem is this: the preprocessor runs through the file, processes it, writes the output to a temporary file, and calls exec() on the temporary file. Imported libraries need to be handled a little differently, because they aren't executed; instead, they are loaded and made accessible to the calling module.

What I need to do is interrupt the import (since the preprocessor runs in the middle of the import), load the postprocessed code as a tempModule, and replace the original import with the tempModule, so that the calling script believes the tempModule is the original module.

I have searched everywhere and so far have found no solution.

This Stack Overflow question is the closest I've seen so far to providing an answer: Override namespace in Python

Here's what I have.

# Remove the bytecode file created by the first import
os.remove(moduleName + '.pyc')

# Remove the first import
del sys.modules[moduleName]

# Import the postprocessed module
tmpModule = __import__(tmpModuleName)

# Set first module's reference to point to the preprocessed module
sys.modules[moduleName] = tmpModule

moduleName is the name of the original module, and tmpModuleName is the name of the postprocessed code file.

The strange part is that this solution still runs completely normally, as if the first module had finished loading normally. If you remove the last line, however, you get a module-not-found error.

Hopefully someone on Stack Overflow knows a lot more about imports than I do, because this one has me stumped.

Note: I will only award a solution or, if this is not possible in Python, the best, most detailed explanation of why it is not possible.

Update: For anybody who is interested, here is the working code.

if imp.lock_held() is True:
    del sys.modules[moduleName]
    sys.modules[tmpModuleName] = __import__(tmpModuleName)
    sys.modules[moduleName] = __import__(tmpModuleName)

The 'imp.lock_held' part detects whether the module is being loaded as a library. The following lines do the rest.
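For readers on Python 3, where the imp module is deprecated (and removed in 3.12), here is a minimal sketch of the same replacement without imp. It assumes that the import machinery hands back whatever object is registered in sys.modules under the requested name once the module body has finished running, which is how CPython 3 behaves as far as I know. The names demo_original and demo_processed are hypothetical stand-ins for moduleName and tmpModuleName, and the two generated files only imitate what a preprocessor would produce.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical module names, used only for this illustration.
ORIGINAL = "demo_original"
PROCESSED = "demo_processed"

workdir = tempfile.mkdtemp()

# Stand-in for the postprocessed file the preprocessor would write.
with open(os.path.join(workdir, PROCESSED + ".py"), "w") as f:
    f.write("def which():\n    return 'postprocessed'\n")

# Stand-in for the original source. While it is being imported, it loads
# the postprocessed module and registers it under its own name.
with open(os.path.join(workdir, ORIGINAL + ".py"), "w") as f:
    f.write(textwrap.dedent("""\
        import importlib
        import sys

        def which():
            return 'original'

        sys.modules[__name__] = importlib.import_module('demo_processed')
    """))

sys.path.insert(0, workdir)
importlib.invalidate_caches()

# The caller asks for the original module but receives the replacement.
module = importlib.import_module(ORIGINAL)
print(module.which())
```

The caller never sees the half-processed original: by the time the import returns, the name demo_original already points at the postprocessed module.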

Community
Evan Plaice
  • You are writing a pre-processor, you should parse the files before compiling them. I.e. you should be able to change the `import module` into `import post_processed_module` before the python runtime loads your file, by parsing the source, modifying it and putting it in a file. After you preprocessed all the source tree _then_ you may exec() the post-processed root file. – Iacopo Jun 18 '10 at 15:04
  • @Iacopo Unfortunately, the preprocessor needs to be imported into the file it's preprocessing. The idea is: import the preprocessor, and preprocessor directives will work in this file. I.e., it's self-consuming. – Evan Plaice Jun 18 '10 at 15:34

3 Answers

46

Does this answer your question? The second import does the trick.

Mod_1.py

def test_function():
    print("Test Function -- Mod 1")

Mod_2.py

def test_function():
    print("Test Function -- Mod 2")

Test.py

#!/usr/bin/python

import sys

import Mod_1

Mod_1.test_function()

del sys.modules['Mod_1']

sys.modules['Mod_1'] = __import__('Mod_2')

import Mod_1

Mod_1.test_function()
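
As far as I can tell the same swap also works on Python 3. The sketch below is a self-contained variant of Test.py: instead of reading Mod_1.py and Mod_2.py from disk, it builds both modules in memory with types.ModuleType so that it runs as a single file. The make_module helper is hypothetical and exists only for this illustration, and the functions return their message rather than printing it so the result is easy to check.

```python
import sys
import types

def make_module(name, message):
    """Hypothetical helper: build an in-memory module with one function."""
    module = types.ModuleType(name)
    module.test_function = lambda: message
    return module

# Register in-memory stand-ins for Mod_1.py and Mod_2.py.
sys.modules["Mod_1"] = make_module("Mod_1", "Test Function -- Mod 1")
sys.modules["Mod_2"] = make_module("Mod_2", "Test Function -- Mod 2")

import Mod_1
first = Mod_1.test_function()

# Drop the original entry and register Mod_2 under the name Mod_1.
del sys.modules["Mod_1"]
sys.modules["Mod_1"] = __import__("Mod_2")

# This import is resolved from sys.modules, so it yields Mod_2.
import Mod_1
second = Mod_1.test_function()

print(first)
print(second)
```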
Ron
  • 2
    Thank you so much, this is almost identical to my implementation but it helped me get it right with an actual useful working example. Note: the second 'import Mod_1' is redundant because the line before it already takes care of that. – Evan Plaice Jun 20 '10 at 06:58
  • 3
    @EvanPlaice the important thing is that you CAN do the second `import Mod_1`. Doing it doesn't reload or refresh the real module - it has been permanently replaced by `Mod_2`. – jwg Oct 26 '15 at 00:14
  • 1
    @Ron, I suppose this applies only to Python 2.x. – Abhijeet Jul 13 '17 at 13:07
  • How do you do it for Python 3.x? – MathCrackExchange May 01 '19 at 03:28
14

To define a different import behavior or to totally subvert the import process you will need to write import hooks. See PEP 302.

For example,

import sys

class MyImporter(object):

    def find_module(self, module_name, package_path):
        # Return a loader
        return self

    def load_module(self, module_name):
        # Return a module
        return self

sys.meta_path.append(MyImporter())

import now_you_can_import_any_name
print now_you_can_import_any_name

It outputs:

<__main__.MyImporter object at 0x009F85F0>

So basically it returns a new module (which can be any object), in this case itself. You may use it to alter the import behavior by returning processed_xxx on import of xxx.
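
The example above uses the PEP 302 find_module/load_module protocol from Python 2. On Python 3 the equivalent hook is written with find_spec and exec_module. Here is a rough sketch of that, under some loud assumptions: SOURCES is a hypothetical in-memory source table, hooked_demo is a made-up module name, and preprocess is only a placeholder that strips '#define' lines, not pypreprocessor's actual logic.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "source tree": module name -> unprocessed source.
SOURCES = {
    "hooked_demo": "#define debug\nVALUE = 41 + 1\n",
}

def preprocess(source):
    """Placeholder for a real preprocessor: drop '#define' lines."""
    return "\n".join(
        line for line in source.splitlines()
        if not line.startswith("#define")
    )

class PreprocessingFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, fullname, path=None, target=None):
        if fullname not in SOURCES:
            return None  # defer to the normal import machinery
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        # Run the postprocessed source inside the module's namespace.
        source = preprocess(SOURCES[module.__name__])
        exec(compile(source, module.__name__, "exec"), module.__dict__)

# Put the hook first so it is consulted before the default finders.
sys.meta_path.insert(0, PreprocessingFinder())

import hooked_demo
print(hooked_demo.VALUE)
```

The importing script only ever sees the postprocessed code, and no entry in sys.modules has to be deleted or swapped afterwards.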

IMO: Python doesn't need a preprocessor. Whatever you are accomplishing can be accomplished in Python itself due to its very dynamic nature. For example, taking the case of the debug example, what is wrong with having this at the top of the file

debug = 1

and later

if debug:
    print("wow")

?

Peter Mortensen
Anurag Uniyal
  • @Anurag almost... that mimics the current default __import__ functionality. What I needed, was something that could get rid of the old import and load a new one under the old one's name. – Evan Plaice Jun 20 '10 at 06:39
  • 1
    @Anurag To answer "why does python need a preprocessor?": Let's say you have Python 2 and Python 3 code in the same file, so you sprinkle 'if py2x' statements all over the code. Then Python 4 comes out and you decide to drop support for 2; now you have to find all the if statements for py2x in the code. In my preprocessor, it's as easy as telling it to remove all code blocks under '#ifdef py2x'. It's not geared toward functionality; it's for maintainability. I'm trying to create a better alternative for supporting 2x and 3x code, to give library writers more incentive to support 3x. – Evan Plaice Jun 20 '10 at 06:44
  • @Evan Plaice, but why can't an import hook be used to intercept loading of the old module and load the new module instead? – Anurag Uniyal Jun 20 '10 at 06:45
  • @Anurag If you run a self-consuming example like the one in the question, the .pyc output should be stripped of all unnecessary metadata having to do with the preprocessor, including the code that activates the preprocessor itself. There's no point in mentioning all its features here. I'll just say it's not your standard preprocessor. If you want to know more about it, check out the project. – Evan Plaice Jun 20 '10 at 06:51
  • @Anurag it could, but you didn't create an example illustrating how. – Evan Plaice Jun 20 '10 at 07:17
  • @Evan Plaice, I thought replacing a module import by a class object was example enough. – Anurag Uniyal Jun 20 '10 at 07:46
0

In Python 2 there is the imputil module, which seems to provide the functionality you are looking for, but it has been removed in Python 3. It's not very well documented, but it contains an example section that shows how you can replace the standard import functions.

For Python 3 there is the importlib module (introduced in Python 3.1) that contains functions and classes to modify the import functionality in all kinds of ways. It should be suitable to hook your preprocessor into the import system.
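
As a small illustration of what importlib offers here, the following sketch loads a source file from an arbitrary path and registers it under a different module name, using importlib.util.spec_from_file_location and importlib.util.module_from_spec. The load_as helper, the file name postprocessed_output.py, and the module name original_name are all hypothetical; the temporary file merely stands in for a preprocessor's output.

```python
import importlib.util
import os
import sys
import tempfile

def load_as(module_name, file_path):
    """Hypothetical helper: load file_path and register it as module_name."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module  # register before executing
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the temporary file a preprocessor would write.
    path = os.path.join(tmp, "postprocessed_output.py")
    with open(path, "w") as f:
        f.write("ANSWER = 'from postprocessed file'\n")

    module = load_as("original_name", path)

print(module.ANSWER)
```

After load_as returns, a later import of original_name elsewhere in the program is served from sys.modules and gets the postprocessed code.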

user541686
sth
  • I have gone down that path already. importlib was introduced in 3.1, but the guy who created it also has a project on PyPI that back-ports it to Python 2.3. See http://pypi.python.org/pypi/importlib/1.0.2. – Evan Plaice Jun 20 '10 at 06:37