Your question may get closed as being off-topic or too broad, but I think it's a good question if rephrased as "what's the Python equivalent of this code?"
Generally speaking, this is something that a lot of folks coming from MATLAB get confused by. In Python, things are separated into "namespaces" and you need to explicitly import functions/variables/etc. from other files.
Common high-level structure of code
In MATLAB (if I remember correctly, at least in older versions), you can't have functions in the same file with "bare" statements. In Python you can. However, you can't call a function before its definition has been executed.
In other words, you can do:
def foo():
    print('bar')

foo()
but not:
foo()

def foo():
    print('bar')
Therefore, because you typically want the "outline-level" code at the top of the file, it's common to put it into a function and then call that function at the bottom after the other functions have been defined. Typically, you'd call this function main, but you're free to name it whatever you'd like.
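The reason this trick works is that a name inside a function body is only looked up when the function is actually called, not when it's defined. Here's a minimal sketch of that (helper and the values are made up purely for illustration):

```python
# Calling a function before its `def` statement has run fails with a NameError.
try:
    helper()
    called_early = True
except NameError:
    called_early = False


def main():
    # helper() is defined further down the file, but that's fine:
    # the name is only looked up when main() actually runs.
    return helper() + 1


def helper():
    return 41


# By this point both functions exist, so the call succeeds.
result = main()
print(called_early, result)
```

So the order of the def statements doesn't matter, as long as the first real call happens after all of them.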
As a quick example:
def main():
    directory = load_data()
    threshold, fft_size = 10, 1000
    prescreen_fn(directory, threshold)
    plot_prescreen_hits(directory)
    extract_features(directory, fft_size)
    generate_train_test(directory)
    SVM_train_test(directory)

def prescreen_fn(directory, threshold):
    """A prescreen function that is run. Ideally this would be a
    more informative docstring."""
    pass

def plot_prescreen_hits(directory):
    pass

def extract_features(directory, fft_size):
    pass

def generate_train_test(directory):
    pass

def SVM_train_test(directory):
    pass

def load_data():
    pass

if __name__ == '__main__':
    main()
The last part probably looks a bit confusing. What that says is basically "execute the code in this block only if this file is run directly. If we're just importing functions from it, don't run anything yet." (There are a lot of explanations of this, e.g. the question What does if __name__ == "__main__": do?)
If you wanted, you could just do:
def main():
    ...

def other_things():
    ...

main()
If you just run the file, you'll get the same result. The difference is in what happens when we import this code from somewhere else. (In the first example, main wouldn't be called, while in the second it would.)
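If you want to see that difference for yourself, here's a rough sketch. It writes a throwaway module to a temporary folder (the name pipeline_demo.py and its contents are made up for this demo), then runs it once directly and once via import, each in a fresh interpreter:

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical module used only for this demonstration.
MODULE_SOURCE = '''
def main():
    print("main() was called")

print("__name__ is", __name__)

if __name__ == "__main__":
    main()
'''

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "pipeline_demo.py"), "w") as f:
        f.write(MODULE_SOURCE)

    # Case 1: run the file directly, like "python pipeline_demo.py".
    run_output = subprocess.check_output(
        [sys.executable, "pipeline_demo.py"], cwd=tmp, text=True
    )

    # Case 2: import the file from another piece of code.
    import_output = subprocess.check_output(
        [sys.executable, "-c", "import pipeline_demo"], cwd=tmp, text=True
    )

print("Run directly:\n" + run_output)
print("Imported:\n" + import_output)
```

When run directly, __name__ is "__main__" and main() gets called. When imported, __name__ is the module's own name and the guarded block is skipped.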
Calling functions in other files
As things grow, you might decide to split some of that into separate files. For example, we might put some of the functions in a file called data.py and others in a file called model.py. We can then import functions from these files into another file where the "pipeline" is built up (we might even call this one main.py, or maybe something more descriptive).
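Unlike MATLAB, we need to explicitly import these files. I won't go into the details here, but import basically tries to find a file or package (a directory with a specific structure) with the specified name by walking through the directories listed in sys.path in order. When you run a file, its own directory comes first in that list, followed by the "library" locations. That means a local file can shadow a library module of the same name, so avoid naming your own files after standard library modules. (What did change between Python 2 and 3 is how imports work inside packages: sibling modules are no longer picked up implicitly.)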
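If you're ever unsure where an import is going to come from, you can ask Python directly. A small sketch using only the standard library (the module name no_such_module_for_this_demo is deliberately made up so the lookup fails):

```python
import importlib.util
import sys

# The search path is an ordinary list of directory names, tried in order.
# When a script is run, its own directory is normally the first entry.
for entry in sys.path[:3]:
    print(repr(entry))

# find_spec reports where a module would be loaded from, without importing it.
spec = importlib.util.find_spec("json")
print(spec.origin)

# A top-level name that can't be found anywhere on the path gives None.
missing = importlib.util.find_spec("no_such_module_for_this_demo")
print(missing)
```

Checking spec.origin is a quick way to find out whether a local file is accidentally shadowing a library module.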
In the example below, import data will import functions and variables in the file "data.py" (and the same for import model). The functions, etc. in that file are in a "namespace" called data, so we'll need to refer to them that way. (Note that you can do from data import * to bring them into the global namespace, but you really, really should avoid that unless you're in an interactive shell.)
import data
import model
directory = data.load_data()
threshold, fft_size = 10, 1000
data.prescreen_fn(directory, threshold)
data.plot_prescreen_hits(directory)
data.extract_features(directory, fft_size)
model.generate_train_test(directory)
model.SVM_train_test(directory)
Notice that I didn't bother wrapping this one into a main function. We certainly could have. The reason I didn't do that here is that you presumably wouldn't ever want to import something from this short "main.py" file. Therefore we don't need to run things behind an if __name__ == '__main__': conditional.
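For completeness, here's a rough side-by-side of the usual import styles. To keep it runnable on its own, it writes a stand-in data.py to a temporary folder first; the body of load_data and its return value are invented for the demo:

```python
import os
import sys
import tempfile

# Stand-in for the "data.py" file from the example; its contents are made up.
DATA_SOURCE = '''
def load_data():
    return "some/directory"
'''

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "data.py"), "w") as f:
        f.write(DATA_SOURCE)

    # Only needed here because data.py lives in a temporary folder
    # rather than next to the file being run.
    sys.path.insert(0, tmp)

    import data                 # names stay inside the "data" namespace
    from data import load_data  # pulls one name into the current namespace
    import data as d            # same module, shorter alias

    sys.path.remove(tmp)

print(data.load_data())
print(load_data())
print(d is data)
```

All three forms refer to the same function. Importing specific names explicitly keeps it obvious where each one came from, which is exactly what from data import * loses.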
Hopefully these examples help clarify things a bit.