Edit: Why are people downvoting this post? Are Python developers really this inept? It's a legitimate question, not one that's been answered in other places. I searched for a solution. I'm not an idiot. One parameter has a value and the other one is undefined, but if you actually read the post, you will see that both of them appear to be equally scoped.

First of all, I assure you that this question is unlike other questions involving the error message:

UnboundLocalError: local variable referenced before assignment closure method

Looking at this code, the parameter uuidString of the top-level method getStockDataSaverFactory should still be in scope when that method returns its inner method, saveData, as a first-class function object. To my amazement, the parameter tickerName IS in scope and does have the value 'GOOG' when saveData() is called (e.g. by the test method testDataProcessing_getSaverMethodFactory), so we can see that it has an actual value when getDataMethodFactory(..) is called, unlike uuidString.

To make the matter more obvious, I added the lines:

localUuidString = uuidString

and

experimentUuidString = localUuidString

to show that the parameter uuidString has an available value when the method is inspected by a breakpoint.

def getStockDataSaverFactory(self, tickerName, uuidString, methodToGetData, columnList):
    # This method expects that methodToGetData returns a pandas dataframe, such as the method returned by: self.getDataFactory(..)
    localUuidString = uuidString
    def saveData():
        (data, meta_data) = methodToGetData()
        experimentUuidString = localUuidString
        methodToNameFile = self.getDataMethodFactory(tickerName, uuidString)
        (full_filepathname, full_filename, uuidString) = methodToNameFile()
        methodToSaveData = self.getDataFrameSaverFactory(methodToGetData, columnList, full_filepathname)
        # We might want try/catch here:
        methodToSaveData()
        # A parameterless method that has immutable state (from a closure) is often easier to design around than one that expects parameters when we want to pass it with a list of similar methods
        return (full_filepathname, full_filename, uuidString)
    return saveData


def testDataProcessing_getSaverMethodFactory(self):
    dataProcessing = DataProcessing()
    getSymbols = dataProcessing.getSymbolFactory(
        dataProcessing.getNasdaqSymbols(dataProcessing.getListOfNASDAQStockTickers))
    tickers = getSymbols()
    uuidString = 'FAKEUUID'
    columnList = ['low', 'high']
    tickerSubset = tickers[0:2]
    methodsToPullData = map(lambda ticker: dataProcessing.getStockDataSaverFactory(ticker,
                                                                         uuidString,
                                                                         dataProcessing.getDataFactory(
                                                                             ticker),
                                                                         columnList), tickerSubset)
    savedPathTuples = [f() for f in methodsToPullData]
    savedFileNames = [pathTuple[0] for pathTuple in savedPathTuples]


    for fileName in savedFileNames:
        self.assertTrue(os.path.isfile(fileName))
        os.remove(fileName)

Just to make it clear that uuidString has no value but ticker does have a value, I'm including this screenshot:

Screenshot of PyCharm with breakpoint

Notice that in the variable watch window, uuidString is undefined, but ticker has the string value of "A".

Is there something unique about Python (or Python 3) that is resulting in this behavior?

devinbost
  • Can you post a full stack trace? Something isn't adding up here. In practical application, there's nothing _wrong_ with what you've shown. See this [repl.it](https://repl.it/repls/DifficultAltruisticLing) - a function defined and returned maintains access to the scope where it was defined - unless something else is going on that isn't shown. – g.d.d.c Dec 31 '17 at 03:42
  • That's an interesting way to create an object `dataProcessing = DataProcessing.DataProcessing()`? Any particular reason why it isn't just `dataProcessing = DataProcessing()`? – Miket25 Dec 31 '17 at 03:45
  • For goodness sake, make shorter function names....! And why functions that return other functions? You call those functions, and then immediately call the function they return. This seems really overcomplicated. – Ned Batchelder Dec 31 '17 at 04:08
  • @NedBatchelder have you ever used currying in functional programming or promises in Javascript / node.js? Or generic lambdas in C#? I was actually keeping it simple here for the S.O. example. In practice, I would take a list of strings, and then I can pipeline a set of method producer functions that function like decorators (as a form of method composition). Think string -> methodThatGetsData -> methodThatSavesRetrievedData -> methodThatTransformsSavedData -> methodThatSortsTransformedData -> etc. At implementation, you only need to provide a list of method producers to setup the pipeline. – devinbost Jan 04 '18 at 03:10
  • @NedBatchelder it's also a way of combining principles of dependency injection based inversion-of-control with functional programming. And since Python isn't strongly typed, I'd rather have method names that are sufficiently descriptive than ambiguousMethodProducer() since an ambiguous method producer name doesn't provide much of a hint regarding what type of method producer it's expecting to be injected. In C#, we would use an interface to enforce the generic function producer's produced function type, but if you know a way to do something like that in Python, please let me know. – devinbost Jan 04 '18 at 03:19
  • Section 4.1 of this article has an example that might be helpful: http://book.pythontips.com/en/latest/map_filter.html – devinbost Jan 04 '18 at 03:28
  • OK, you do you :) – Ned Batchelder Jan 04 '18 at 12:59
  • I prefer to not have everything hardcoded and need to be re-written every time a single dependency changed. – devinbost Jan 06 '18 at 07:59
  • If you want loose coupling and high flexibility I recommend to take a look at the Zope Component Architecture, especially interfaces and the adapter pattern. The ZCA gives you all that while still allowing for clean and readable code. https://docs.zope.org/zope.component/socketexample.html – thet Jan 09 '18 at 02:01

1 Answer

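The problem is that you reference uuidString in the call to self.getMethodThatProvidesFullFilePathNameForPricesCsvFromUUIDAndTickerName before you assign to it. In the code as shown in the question, that is the call self.getDataMethodFactory(tickerName, uuidString), and the assignment is the tuple unpacking on the following line. The assignment makes uuidString local to the scope of the innermost function, and therefore it is unassigned when you reference it.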

A full description of the scoping rules is provided by: https://stackoverflow.com/a/292502/7517724

This simpler example reproduces your error to make the problem more clear:

class aclass():

    def outer(self, uuidString):
        def inner():
            print(uuidString)
            uuidString = 'new value'
            return uuidString
        return inner

a = aclass()
func = a.outer('a uuid')
val = func()
print(val)

The assignment in inner() causes uuidString to be local to inner(), so it is still unassigned when print(uuidString) is called, which causes Python to raise the UnboundLocalError.
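To see this classification directly, here is a minimal sketch that inspects the compiled code objects of two nested functions. The names reads_only and assigns_later are hypothetical and chosen only for illustration. The first function only reads the outer name, as saveData does with tickerName, and the second assigns to it, as saveData does with uuidString.

```python
def outer(uuidString):
    def reads_only():
        # No assignment to uuidString here, so the name is looked up
        # in the enclosing scope of outer().
        return uuidString

    def assigns_later():
        try:
            # The assignment further down makes uuidString local to this
            # whole function body, so this read fails.
            print(uuidString)
        except UnboundLocalError as exc:
            return type(exc).__name__
        uuidString = 'new value'
        return uuidString

    return reads_only, assigns_later


reads_only, assigns_later = outer('a uuid')

# Names the function takes from an enclosing scope.
print(reads_only.__code__.co_freevars)
# Names the compiler decided are local to the function.
print(assigns_later.__code__.co_varnames)

print(reads_only())
print(assigns_later())
```

The decision is made once for the whole function body when it is compiled, not line by line, which is why a read that comes before the assignment is already treated as a read of the local name.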

You can fix the error by passing the variable into your function as a default argument. Change the definition of saveData to:

def saveData(uuidString=uuidString):

will make it work as you expect.
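As a rough sketch, here is the default-argument fix applied to the simplified example, next to one alternative that is not part of the original answer: binding the new value to a different name so that uuidString is never assigned inside the inner function. The names AClass, outer_default_arg, outer_renamed and newUuidString are made up for this illustration.

```python
class AClass:
    def outer_default_arg(self, uuidString):
        # The default value is evaluated when inner() is defined, so the
        # local uuidString already holds the outer value on entry.
        def inner(uuidString=uuidString):
            before = uuidString
            uuidString = 'new value'
            return before, uuidString
        return inner

    def outer_renamed(self, uuidString):
        # Alternative (an assumption, not from the answer): never assign
        # to uuidString inside inner(), so it is read from the outer scope.
        def inner():
            before = uuidString
            newUuidString = 'new value'
            return before, newUuidString
        return inner


a = AClass()
print(a.outer_default_arg('a uuid')())
print(a.outer_renamed('a uuid')())
```

The default-argument version keeps the original variable name, but it changes the signature of the returned function, so a caller could override the value by accident. The renamed version keeps the function parameterless.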

Craig
  • I found the second answer quite helpful too. It shows the exact scenario as the question. https://stackoverflow.com/a/34094235/891373 Point 7 and 8. – Froyo Jan 05 '18 at 14:16
  • So why isn't this a problem with tickerName? – devinbost Jan 06 '18 at 08:18
  • @devinbost - it isn't a problem because you don't assign a value to `tickerName` inside the nested function. As I explain in my answer, **the assignment** makes it local to the scope of the inner function. If you don't assign to it, Python will search the outer scopes for the variable. – Craig Jan 06 '18 at 21:29
  • @devinbost - Maybe you are assuming that Python is interpreting your code one line at a time and therefore it shouldn't know about the assignment. When you run a script, Python reads the entire script and compiles it into bytecode which it then executes. In the compiling phase, it sees the assignment of `uuidString` and determines that the variable is local. – Craig Jan 06 '18 at 21:33