I am developing code that uses a module, data_module, to store common data and parameters. These are not available at the start but are set in the main program. Once they have values, they can be used by functions in other modules, which are in turn called from the main program.

#data_module.py
run_data = {}
predictors = []


#another_module.py
from data_module import run_data, predictors

def test():
    print(run_data)
    print(predictors)

If I run this first version of the main program:

#main1.py
from data_module import run_data, predictors
from another_module import test
run_data['full_data'] = 1
predictors = ['X1','X2']
test()

I will get

{'full_data': 1}
[]

So the dictionary run_data in data_module.py gets updated, while the list predictors does not.

If I want to update the list, I have to use its extend method:

#main2.py
from data_module import run_data, predictors
from another_module import test
run_data['full_data'] = 1
predictors.extend(['X1','X2'])
test()

Please help me understand why the dictionary and the list behave differently in this example, and let me know the Pythonic way to modify variables in a module for later use.

jslatane
  • Possible duplicate of [https://stackoverflow.com/questions/1977362/how-to-create-module-wide-variables-in-python](https://stackoverflow.com/questions/1977362/how-to-create-module-wide-variables-in-python) – jslatane Nov 16 '21 at 04:54

1 Answer

There is no difference in behavior between list and dict here. The difference is in what your code does to them: in the case of the list, you are re-binding the name predictors to a new list, while in the case of the dict you are changing the contents of the dict without changing which object the name run_data points to (it's still the same dictionary).

Since you've imported individual values from your data module, those are duplicate names. After the import, the same object has names in both your data module and the main module. If you put a different object in your main module's run_data variable, that does not change the run_data variable in your data module. The two names have different values now.
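The difference between mutating a shared object and rebinding a name can be reproduced in a single file. This is only a sketch: the module is built in memory with types.ModuleType as a stand-in for data_module.py, and shared_dict and shared_list are names introduced purely for illustration.

```python
import types

# Stand-in for data_module.py, built in memory so the sketch is self-contained.
data_module = types.ModuleType("data_module")
data_module.run_data = {}
data_module.predictors = []

# Roughly what `from data_module import run_data, predictors` does:
# it creates two local names bound to the same two objects.
run_data = data_module.run_data
predictors = data_module.predictors

run_data["full_data"] = 1    # mutates the shared dict in place
predictors = ["X1", "X2"]    # rebinds the local name only

# Hypothetical helper names, used only to show which objects are still shared.
shared_dict = run_data is data_module.run_data
shared_list = predictors is data_module.predictors

print(shared_dict, shared_list)                      # True False
print(data_module.run_data, data_module.predictors)  # {'full_data': 1} []
```

The same thing would happen to the dict if the main program assigned run_data = {'full_data': 1} instead of setting a key.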

The solution to this is to import data_module only and always access its contents through the module name: data_module.run_data, data_module.predictors, etc. This way there is only one name for each of these values. Each module that imports it gets its own data_module name, but all of those names refer to the same module object, and reading or writing the module's attributes never rebinds them.

To save typing you can import data_module as dm and then use dm.predictors for the values.
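A minimal sketch of this approach follows. To keep it runnable as one file, the module is created in memory and registered in sys.modules as a stand-in for a real data_module.py; in the actual project, the import line alone would be enough. The names _mod and result are assumptions made for the example.

```python
import sys
import types

# Stand-in for data_module.py, registered so the import below finds it.
_mod = types.ModuleType("data_module")
_mod.run_data = {}
_mod.predictors = []
sys.modules["data_module"] = _mod

import data_module as dm


def test():
    # The attributes are looked up on the module at call time,
    # so a rebinding done elsewhere is visible here.
    return dm.run_data, dm.predictors


dm.run_data["full_data"] = 1
dm.predictors = ["X1", "X2"]  # rebinds the module attribute itself

result = test()
print(result)  # ({'full_data': 1}, ['X1', 'X2'])
```

Because every module reaches the list through dm.predictors, plain assignment now works the same way as extend.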

I would be remiss to end this answer without mentioning that you're essentially using global variables to store all your state and this will make your program confusing. Instead, you should use classes to bundle related data and code together, pass values as parameters to methods, etc.
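One way this could look, as a rough sketch rather than a prescription: a small dataclass holds the shared state and is passed explicitly to the functions that need it. RunConfig and describe are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RunConfig:
    """Hypothetical container for the state kept in data_module."""
    run_data: dict = field(default_factory=dict)
    predictors: list = field(default_factory=list)


def describe(config: RunConfig) -> str:
    # The function receives its data as a parameter instead of
    # reading module-level globals.
    return f"{config.run_data} {config.predictors}"


config = RunConfig()
config.run_data["full_data"] = 1
config.predictors = ["X1", "X2"]

summary = describe(config)
print(summary)  # {'full_data': 1} ['X1', 'X2']
```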

kindall
  • Thank you @kindall. It is clear to me now. Regarding your suggestion of using classes, I get it. The code is mainly for quick testing/running things. I am also developing an object oriented version but I found it quite challenging to fulfill all of my need. – Hieu Nguyen Nov 16 '21 at 09:44
  • I will admit that I've done this sort of thing for quick-and-dirty configuration. (Wouldn't recommend it for end-users since expecting them to keep Python syntax is problematic.) – kindall Nov 16 '21 at 22:20