The performance penalty definitely exists. When a function is defined inside another function, a new function object really is created every time the outer function is called. But that penalty is small and can usually be ignored, especially given the obvious point that in most cases you should create a nested function only if it cannot be placed outside.
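A minimal sketch of what "created every time" means, using hypothetical names `outer` and `inner`. Each call produces a distinct function object, but in CPython the compiled body (the code object) is shared between them, which is part of why the cost stays small:

```python
def outer():
    # The def statement below is executed on every call to outer(),
    # so a fresh function object is built each time.
    def inner():
        return 42
    return inner

f1 = outer()
f2 = outer()

print(f1 is f2)                    # False: two separate function objects
print(f1.__code__ is f2.__code__)  # True in CPython: the compiled body is reused
```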
The reason you may need a nested function is the need to access the outer function's scope variables from inside it. Usually that leads to returning the inner function object from the outer function, directly or indirectly (as in decorators), or to passing the inner function somewhere as a callback. The variables accessed by the nested function exist until the nested function object is destroyed, and they are different for different instances of the nested function, since each instance sees the variables of a different scope instance.
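As an illustration of the decorator case, here is a small Python 3 sketch. The names `count_calls`, `add`, and `mul` are made up for this example. Each decorated function gets its own `calls` variable, because every call to `count_calls` creates a new scope:

```python
import functools

def count_calls(func):
    # `calls` lives in count_calls's scope. It stays alive as long as
    # the returned wrapper does, and is private to that wrapper.
    calls = 0

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal calls
        calls += 1
        wrapper.calls = calls  # expose the count for inspection
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper

@count_calls
def add(a, b):
    return a + b

@count_calls
def mul(a, b):
    return a * b

add(1, 2)
add(3, 4)
mul(2, 5)
print(add.calls, mul.calls)  # 2 1
```

A module-level function could not do this without some other place to keep per-function state, which is exactly the extra behavior being paid for.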
To my mind, simply timing the creation of an empty inner function against the same function placed outside is almost pointless. The performance difference arises purely from the difference in code behavior, and the behavior you need is what should decide where to place your function.
Just a small illustration:
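    def outer(n):
        v1 = "abc%d" % n
        v2 = "def"  # never used by inner()

        def inner():
            # Show which names inner() sees as its locals.
            print(list(locals().keys()))
            return v1

        v1 = "_" + v1  # rebind v1 after inner() has been created
        return inner

    f1 = outer(1)
    f2 = outer(2)
    print(f1())
    print(f2())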
The output is:
    ['v1']
    _abc1
    ['v1']
    _abc2
The key points:
The inner function's locals() includes only the outer function's variables that it actually uses (v1, but not v2).
v1 is rebound after the inner function object is created, yet the change is still visible to the inner function, even though the type of v1 (str) is immutable. So what the inner function sees is a live subset of the outer function's variables, not a snapshot of references taken at the moment the function object was created.
Fortunately, the existence of the inner function object does not keep scope variables other than v1 alive. If I replace the value of v2 with an object that prints something when it is destroyed, it prints the message as soon as the outer function exits (at least in CPython, where an object is freed when its reference count drops to zero).
Different instances of inner() do not share a single outer scope instance: their v1 values differ.
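The observation about v2 can be reproduced with a sketch like the following. `Noisy` is a hypothetical helper class introduced here, and the ordering shown relies on CPython's reference counting; other implementations may finalize objects later:

```python
events = []

class Noisy:
    # Hypothetical helper: records when an instance is finalized.
    def __init__(self, name):
        self.name = name

    def __del__(self):
        events.append("destroyed " + self.name)

def outer():
    v1 = Noisy("v1")  # used by inner(), so the closure keeps it alive
    v2 = Noisy("v2")  # not used by inner()

    def inner():
        return v1.name

    return inner

f = outer()
events.append("outer returned")
print(events)  # CPython: ['destroyed v2', 'outer returned']

del f          # dropping the last reference to inner releases v1 too
print(events)  # CPython: ['destroyed v2', 'outer returned', 'destroyed v1']
```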
None of these effects can be achieved without a nested function. That is why nested functions should be used, and in that sense there is no real performance penalty: extra behavior requires extra time. If you need that extra behavior, you should use nested functions. If you don't need it, you should not.