The sqldf function in the pandasql package takes a second argument for the "session/environment variables", which can be locals() or globals(). What is this argument for? Is there any documentation on when to use locals() and when to use globals()?

https://github.com/yhat/pandasql/

Here is my code. What is pandasql looking for in locals()? And does locals() here mean the namespace inside the function select_first_50?

import pandas
import pandasql

def select_first_50(filename):
    students = pandas.read_csv(filename)
    # Normalize column names: spaces become underscores, everything lower case
    students.rename(columns=lambda x: x.replace(' ', '_').lower(), inplace=True)

    q = "select major, gender from studentstable limit 50"

    # Execute the SQL command against the pandas frame
    results = pandasql.sqldf(q.lower(), locals())
    return results
Lin Ma

1 Answer

locals() and globals() are Python built-in functions that return the corresponding namespace.

In Python, a namespace is a mapping from names to objects, and it is how scope is implemented. The global namespace corresponds to global scope, so names defined there are visible throughout the module.

The local namespace is the namespace that is local to a particular function.

globals() returns a dictionary representing the current global namespace.

What locals() returns depends on where it is called. Called directly at module level (not inside a function), it returns the same dictionary as globals(), that is, the global namespace. Called inside a function, it returns that function's local namespace.
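As a small illustration of the difference, here is a minimal sketch using only built-ins. The names module_level_name, function_level_name and show_namespaces are made up for this example.

```python
module_level_name = "defined at module level"

def show_namespaces():
    function_level_name = "defined inside the function"
    # Take the key sets right away: locals() here only knows about
    # function_level_name, because nothing else has been assigned yet.
    local_names = set(locals())
    global_names = set(globals())
    return local_names, global_names

local_names, global_names = show_namespaces()

print(local_names)                             # {'function_level_name'}
print("module_level_name" in local_names)      # False
print("function_level_name" in global_names)   # False
```

When the file is run as a script, module_level_name and show_namespaces both appear in global_names, while function_level_name appears only in local_names.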

In pandasql, the second argument is this namespace (a dictionary) containing the variables your query refers to. For example, suppose you create a DataFrame called a and then write a query against it. pandasql needs to find the DataFrame that corresponds to the name a, and it does so by looking that name up in the local or global namespace you pass in. That is what the second argument is for.

So you need to decide which one to pass in. For example, if your DataFrame is defined only inside a function and does not exist in global scope, pass in the dictionary returned by locals(). If your DataFrame exists in global scope, pass in the result of globals().
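To make the lookup concrete, here is a rough sketch of the idea using only the standard library. find_tables is a hypothetical helper written for this answer, not pandasql's actual implementation, and a plain list stands in for the DataFrame. It only assumes that table names in the query are resolved as keys of the dictionary you pass.

```python
import re

def find_tables(query, env):
    """Hypothetical helper: resolve table names in `query` against `env`."""
    # Grab the word after each FROM or JOIN (a simplification for illustration).
    names = re.findall(r"\b(?:from|join)\s+(\w+)", query, flags=re.IGNORECASE)
    missing = [name for name in names if name not in env]
    if missing:
        raise NameError(f"not found in the supplied namespace: {missing}")
    return {name: env[name] for name in names}

def query_inside_function():
    # `students` exists only in this function's local namespace.
    students = [{"major": "math", "gender": "f"}]  # stand-in for a DataFrame
    return find_tables("select major, gender from students limit 50", locals())

tables = query_inside_function()
print(list(tables))  # ['students']

# The same query fails against globals(), where `students` is not defined.
lookup_failed = False
try:
    find_tables("select major, gender from students limit 50", globals())
except NameError as exc:
    lookup_failed = True
    print("lookup failed:", exc)
```

The query succeeds with locals() because the name students is a key in the function's local dictionary, and fails with globals() because no module-level variable has that name.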

Anand S Kumar
  • Great answer. I have posted my code in my original post. What is pandasql looking for in locals() in my case? And does locals() mean the namespace inside the function select_first_50? – Lin Ma Aug 16 '15 at 21:51
  • Does your code work? Shouldn't the name of the table inside the SQL be `students`? Inside `locals()` it looks for a `students` table. – Anand S Kumar Aug 17 '15 at 01:24
  • So locals() means the function scope of select_first_50? BTW, it works for me. – Lin Ma Aug 17 '15 at 05:30
  • Yes, it means the function scope. Try printing the result of `locals()` and you will see the function's variables (as string keys) and the data they hold as values. – Anand S Kumar Aug 17 '15 at 05:32