
CONTEXT

I'm trying to pass some parameters to a function from the command line. These parameters can be read directly from the command line, from a config file, or from a combination of both, which makes some pipelines easier. E.g.:

for i in {1..100}
do
    python myprogram.py -i data_${i}.nii.gz -o results_${i} -cf configfile.ini
done

MY PROBLEM

I have two functions, read_configfile() and read_commandline(), within myprogram.py that work correctly and return a dict args in exactly the same format. If you print the args variable returned by either function, you get something like:

{'data_path': '~/problem/data_X/', 'results_path': '~/problem/results_X/', 'base_path': '~/problem/', 'model_name': 'subject_X', 'data_name': 'data.nii.gz', 'image_file': None, 'bvals_file': None, 'bvecs_file': None, 'brainmask_file': None, 'header_init': None, 'init_files': None, 'cov_file': 'Analytic', 'njobs': 14, 'slice_axis': None, 'slice_n': None, 'seed': None, 'get_cov_samples': False, 'num_sticks': 3, 'snr': 30, 'type_noise': 'Rician', 'framework': 'Matlab', 'optimizer': 'fmincon', 'cons_fit': True, 'ard_optimization': False, 'likelihood_type': 'Default', 'reparametrization': False, 'no_run_mcmc': False, 'mcmc_analysis': 'Hybrid1', 'burnin': 1000, 'jumps': 1250, 'thinning': 25, 'adaptive': True, 'period': 50, 'duration': 'whole-run', 'no_ard_mcmc': False, 'type_ard': 'Gaussian', 'ard_fudge': 1, 'bingham_flag': False, 'acg': False}

What I want is to map the content of the dict args to create new variables (i.e. having variableX instead of args.variableX or args['variableX']). The key would be the name of the variable I want to create and the value its respective value. E.g.:

print(data_path)
~/problem/data_X/

print(period)
50

I would like to do this as cleanly and scalably as possible, so I don't want to assign them manually one by one.

I have searched and tried several options for this mapping, like the ones listed here and here (e.g. locals().update(args) or for i in list(args.keys()): exec(f'{i} = {args[i]}')). However, all of them work only partially. These mappings leave most of the variables undefined (the ones that are redefined or reused later in the code), probably because of some conflict with the namespace that I don't understand.

Is there any solution, or do I have to redefine them one by one?

EDIT

def myprogram():
    [...]

    var1, var2, var3, ...., var50 = read_params(logger, sys.argv[1:])
    
    [other stuff]

def read_params(logger, argv):
    # Only CONFIGFILE
    if isinstance(argv, str):   # It is a string, i.e. a path (chain of characters) = Only configfile provided
        logger.info('Reading parameters from configfile....')
        args = read_configfile(argv)
        
    # COMMAND LINE PARAMS
    elif isinstance(argv, list):    # Params introduced by terminal
        logger.info('Reading parameters from command-line....')
        args = read_commandline(argv)

 
    # One (automatic) option that should work 
    locals().update(args)
    print('Variables in locals():')
    print(locals()) # It seems to be updated correctly to locals(). See print shown below

    # Re-definition one by one. It works but I want to avoid this
    # model_name = args['model_name']
    # base_path = args['base_path']
    # data_path = args['data_path']
    # [...] # Repeat for the rest of variables


    # If I use these updated variables from locals().update(args) anywhere in the code, it raises an Error.
    if data_path is None:
        if (os.path.isdir(os.path.join(base_path, 'data/'))):
            data_path = os.path.join(base_path, 'data/')
            sys.path += [data_path]
            logger.info(f'Data path set in from {data_path}')
        else:
            logger.error(f'Error! Data path {data_path} does not exist.')
            sys.exit()

            
    [other similar stuff]
    
    return data_path, ...


def read_configfile(configfile):
    # localconfig is a wrapper on top of configparser (so fully compatible) that makes easier to import the variables in correct data types using the same configfile
    # https://pypi.org/project/localconfig/
    from localconfig import config
    config.read(configfile)
    args = dict(list(config.args))    # returns a dict
    return args

[...]

The print(locals()) and the error:

Variables in locals():
{'logger': <Logger run_myBedpostX (DEBUG)>, 'argv': '~/code/config/configfile_template.py', 'args': {'data_path': '~/code/data/', 'results_path': '~/code/results/', 'base_path': '~/code/', 'image_file': 'data.nii.gz', 'bvals_file': 'bvals', 'bvecs_file': 'bvecs', 'brainmask_file': 'nodif_brain_mask.nii.gz', 'model_name': 'model_1', 'header_init': 'pvmfit', 'init_files': '~/code/data/brain/PVMFIT/', 'cov_file': '~/code/results/3fib/covSPD.npy', 'njobs': 1, 'slice_axis': None, 'slice_n': None, 'seed': 1234, 'get_cov_samples': False, 'full_report': False, 'num_sticks': 3, 'snr': 30, 'type_noise': 'Rician', 'framework': 'None #Matlab', 'optimizer': 'None #', 'cons_fit': True, 'ard_optimization': False, 'likelihood_type': 'Default', 'reparametrization': False, 'run_mcmc': True, 'mcmc_analysis': 'Hybrid1', 'burnin': 1000, 'jumps': 1250, 'thinning': 25, 'adaptive': True, 'period': 50, 'duration': 'whole-run', 'ard_mcmc': True, 'type_ard': 'Gaussian', 'ard_fudge': 1, 'acg': False, 'bingham_flag': False}, 'base_path': '~/code/', 'image_file': 'data.nii.gz', 'bvals_file': 'bvals', 'bvecs_file': 'bvecs', 'brainmask_file': 'nodif_brain_mask.nii.gz', 'model_name': 'model_1', 'header_init': 'pvmfit', 'init_files': '~/code/data/brain/PVMFIT/', 'cov_file': '~/code/results/3fib/covSPD.npy', 'njobs': 1, 'slice_axis': None, 'slice_n': None, 'seed': 1234, 'get_cov_samples': False, 'full_report': False, 'num_sticks': 3, 'snr': 30, 'type_noise': 'Rician', 'framework': 'None #Matlab', 'optimizer': 'None #', 'cons_fit': True, 'ard_optimization': False, 'likelihood_type': 'Default', 'reparametrization': False, 'run_mcmc': True, 'mcmc_analysis': 'Hybrid1', 'burnin': 1000, 'jumps': 1250, 'thinning': 25, 'adaptive': True, 'period': 50, 'duration': 'whole-run', 'ard_mcmc': True, 'type_ard': 'Gaussian', 'ard_fudge': 1, 'acg': False, 'bingham_flag': False}

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1758, in <module>
    main()
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1752, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1147, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "~/code/main.py", line 214, in <module>
    main(args)
  File "~/code/main.py", line 72, in main
    get_cov_samples, type_ard, ard_fudge, bingham_flag, acg = read_params(logger, args)
  File "~/code/utils/read_params.py", line 98, in read_params
    if data_path is None:
UnboundLocalError: local variable 'data_path' referenced before assignment

As you can see, all the items from args also appear in locals() as separate variables (after the args dict). In fact, when debugging in PyCharm, I can access them from the Debug Console. However, the error shown above is raised when the script runs normally (i.e., when the variables are used inside the function).

JP Manzano
  • Might be helpful: https://docs.python.org/3/howto/argparse.html – รยקคгรђשค Sep 23 '20 at 19:08
  • Does this answer your question? [convert dictionary entries into variables - python](https://stackoverflow.com/questions/18090672/convert-dictionary-entries-into-variables-python) – รยקคгรђשค Sep 23 '20 at 19:17
  • Hi @Suparshva, I've done the tutorial and followed the documentation. I can extract the arguments using argparse. My problem is that they are stored in a class, where I have to access them one by one using their names (e.g. args['variable1'] or args['variable2']). What I ask above is how to map these so that I have variable1, variable2, etc. directly. Regarding the SO link you sent, I've posted it in the description as well. It kind of works, but the problem with those approaches is that they don't update all the variables (see above). – JP Manzano Sep 24 '20 at 14:17
  • Did you have a look at the accepted answer of the question in my other comment? There are multiple ways to do it in python. Feel free to use and choose what meets your purpose. – รยקคгรђשค Sep 24 '20 at 14:21
  • Not sure whether I understood you... or I'm not explaining it well. The methods shown in that SO link (e.g. ```locals().update(args)``` or methods using ```exec```) only work partially. Variables that, for any reason, are used later in the same function are added to locals() (if you print locals(), they appear there), but they raise an error ```NameError: name 'variable1' is not defined``` when they are used. I guess this happens because of some conflict with the namespace, but I don't know how to deal with it. I hope this clarifies it a bit. – JP Manzano Sep 24 '20 at 14:43
  • Show how are you trying to use it? Remove any unnecessary and irrelevant code. I can see currently added code doesn't serve any purpose for the question, unneccessarily increasing the size of the question. – รยקคгรђשค Sep 24 '20 at 15:21
  • Edited! Let me know whether something is not clear. Thanks for your help :) – JP Manzano Sep 24 '20 at 18:16

1 Answer

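Updating locals(), or using exec to assign local names, does not work inside a function in Python 3, and the documentation for locals() warns against it. For speed, CPython does not keep a function's local variables in a dict. It stores them in a fixed-size array, and the dict returned by locals() is only a snapshot, so changes to it never reach the real variables. That also explains the UnboundLocalError: read_params assigns to data_path further down, so the compiler treats data_path as a local everywhere in the function, and reading it before that assignment fails regardless of what locals() contains.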

Remember that CPython is compiled to bytecode, which the interpreter runs. When a function is compiled, the local variables are stored in a fixed-size array (not a dict) and variable names are assigned to indexes. This is possible because you can't dynamically add local variables to a function. Then retrieving a local variable is literally a pointer lookup into the list and a refcount increase on the PyObject which is trivial.

Contrast this to a global lookup (LOAD_GLOBAL), which is a true dict search involving a hash and so on. Incidentally, this is why you need to specify global i if you want it to be global: if you ever assign to a variable inside a scope, the compiler will issue STORE_FASTs for its access unless you tell it not to.

From: Why does Python code run faster in a function?
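The behaviour described above can be reproduced with a small sketch. The function names (reads_only, reads_and_assigns, outcome) are made up for illustration and are not part of the question's code; the sketch assumes no module-level variable called data_path exists.

```python
def reads_only(args):
    # Updating the dict returned by locals() does not create real local variables.
    locals().update(args)
    # data_path is never assigned in this function, so the compiler emits a
    # global lookup for it, which fails with NameError.
    return data_path


def reads_and_assigns(args):
    locals().update(args)
    # data_path is assigned below, so the compiler treats it as a local
    # everywhere in this function; reading it here raises UnboundLocalError.
    if data_path is None:
        data_path = "fallback"
    return data_path


def outcome(func, args):
    """Return the name of the exception raised by func(args), or 'ok'."""
    try:
        func(args)
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        return type(exc).__name__
    return "ok"


print(outcome(reads_only, {"data_path": "x"}))
print(outcome(reads_and_assigns, {"data_path": "x"}))
```

The second function mirrors read_params in the question: the later assignment to data_path is what turns the failure into an UnboundLocalError rather than a plain NameError.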

Possible workarounds:

  1. Update globals() instead of locals(). This still works because module globals are stored in a real dict. Warning: this is a bad idea and you should think twice before using it; there are better ways.

  2. Use a namespace. If the dictionary keys are valid identifiers, you can access them directly with the '.' operator; otherwise you need getattr to read them.

>>> from types import SimpleNamespace
>>> args = {"data_path": "xyx/data","base_path": "xyz"} 
>>> n = SimpleNamespace(**args)
>>> n.data_path
'xyx/data'
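As a rough sketch of how the namespace approach could replace the one-by-one assignments in a function like read_params: the helper name resolve_data_path and the short alias p are assumptions for illustration, and the os.path.isdir check from the question is left out to keep the example self-contained.

```python
import os
from types import SimpleNamespace


def resolve_data_path(args):
    """Fill in a default data_path, reading and writing values as attributes."""
    p = SimpleNamespace(**args)  # every dict key becomes an attribute of p
    if p.data_path is None:
        # Attribute assignment works normally, unlike locals().update()
        p.data_path = os.path.join(p.base_path, "data")
    return vars(p)  # convert back to a plain dict if the caller needs one


params = resolve_data_path({"data_path": None, "base_path": "problem"})
print(params["data_path"])
```

Each variable still carries a prefix (p.data_path rather than data_path), but no name has to be assigned by hand, and new config keys become available without changing the function.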

References:

Why does Python code run faster in a function?

https://www.python.org/dev/peps/pep-0558/

https://stackoverflow.com/a/28345836/6741053

รยקคгรђשค
  • Hi, thanks again for the answer. I've tried both workarounds but I'm still at the same point: updating globals has the same issue as locals().update, and the namespace approach also returns an object whose variables have to be accessed with the "." operator, instead of creating a new variable named after each key. I finally decided to assign them manually, as I didn't find any automatic solution. Another partial solution would be to write the configfile as a .py file and import it as a module. – JP Manzano Sep 28 '20 at 14:17
  • Can you show an example of how you are using with globals().update? You shouldn't be using it anyways. In the second approach you can simply add your variable names with a prefix of namespace, in most IDEs it shouldn't be any problem to replace variables in 1 go. – รยקคгรђשค Sep 28 '20 at 14:24