I have a nested dictionary containing a bunch of data on a number of different objects (where I mean object in the non-programming sense of the word). The format of the dictionary is allData[i][someDataType], where i is a number designation of the object that I have data on, and someDataType is a specific data array associated with the object in question.

Now, I have a function that I have defined that requires a particular data array for a calculation to be performed for each object. The data array is called cleanFDF. So I feed this to my function, along with a bunch of other things it requires to work. I call it like this:

rm.analyze4complexity(allData[i]['cleanFDF'], other data, other data, other data)

Inside the function itself, I straight away re-assign the cleanFDF data to another variable name, namely clFDF, i.e. the end result is:

clFDF = allData[i]['cleanFDF']

I then have to zero out all of the data that lies below a certain threshold, as such:

clFDF[ clFDF < threshold ] = 0

OK - the function works as it is supposed to. But now when I try to plot the original cleanFDF data back in the main script, the entries that got zeroed out in clFDF are also zeroed out in allData[i]['cleanFDF']. WTF? Obviously something is happening here that I do not understand.

To make matters even weirder (from my point of view), I've tried to do a bodgy kludge to get around this by 'saving' the array to another variable before calling the function. I.e. I do

saveFDF = allData[i]['cleanFDF']

then run the function, then update the cleanFDF entry with the 'saved' data:

allData[i].update( {'cleanFDF':saveFDF} )

but somehow, simply performing clFDF[ clFDF < threshold ] = 0 within the function modifies clFDF, saveFDF and allData[i]['cleanFDF'] in the main friggin' script, zeroing out all the entries at the same array indexes! It is as if they were all linked global variables somehow, but I've made no such declarations anywhere...

I am a hopeless Python newbie, so no doubt I'm not understanding something about how it works. Any help would be greatly appreciated!

asheeshr
  • You might have to do a "deep copy" of your `clFDF = allData[i]['cleanFDF']` line, but I could be wrong; more information here: http://docs.python.org/2/library/copy.html. Basically, assignments in Python work differently than, say, assignments in PHP. – summea Jan 02 '13 at 07:17

2 Answers

You are passing the value at allData[i]['cleanFDF'] by reference (decent explanation at https://stackoverflow.com/a/430958/337678). Any changes made to it will be made to the object it refers to, which is still the same object as the original, just assigned to a different variable.

Making a deep copy of the data will likely fix your issue (Python's standard copy module provides copy.deepcopy, which should do the trick ;)).
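As a rough illustration of the point above, here is a minimal sketch. It assumes a plain list standing in for the cleanFDF array, made-up values, and a hypothetical helper named zero_below in place of the asker's function; none of these come from the original code.

```python
import copy

def zero_below(values, threshold):
    """Zero out, in place, every entry below threshold.

    Hypothetical stand-in for the thresholding step in the question.
    """
    for idx, value in enumerate(values):
        if value < threshold:
            values[idx] = 0
    return values

# Made-up data; a plain list stands in for the cleanFDF array.
allData = {0: {'cleanFDF': [0.2, 1.5, 0.1, 3.0]}}

# Plain assignment only adds a second name for the same object,
# which is why the "save" workaround in the question has no effect.
saveFDF = allData[0]['cleanFDF']
same_object = saveFDF is allData[0]['cleanFDF']

# Handing the function a deep copy leaves the original untouched.
clFDF = zero_below(copy.deepcopy(allData[0]['cleanFDF']), threshold=1.0)

print(same_object)             # True
print(clFDF)                   # [0, 1.5, 0, 3.0]
print(allData[0]['cleanFDF'])  # [0.2, 1.5, 0.1, 3.0]
```

If cleanFDF is a NumPy array, as the boolean-mask assignment in the question suggests, the array's own copy() method would presumably serve the same purpose.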

akaIDIOT
  • Nothing is ever passed by reference in Python. Unless you use a nonstandard definition of pass by reference for no good reason. –  Jan 02 '13 at 08:27
  • Yep - clFDF = copy.deepcopy(allData[i]['cleanFDF']) then passing clFDF to the function fixes my problem. Thanks! – Craig Anderson Jan 02 '13 at 21:23
  • @delnan has a point; it's not exactly passing by reference in Python, but the point is the same: your initial code uses the exact same object no matter which variable you assign it to. To change the object without having the changes reflected in the original, you need a copy :) – akaIDIOT Jan 03 '13 at 09:15

Everything is a reference in Python.

def function(y):
    y.append('yes')
    return y

example = list()
function(example)
print(example)

This prints ['yes'], even though I never assign to the variable 'example' directly: the function appends to the very same list object that 'example' refers to.

See also: "Why does list.append evaluate to false?"; "Python append() vs. + operator on lists, why do these give different results?"; "Python lists append return value".
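To make the distinction a bit more concrete, the sketch below contrasts mutating the object that was passed in with rebinding the local name to a new object. The function names mutate and rebind are made up for this illustration.

```python
def mutate(items):
    # Changes the very object the caller passed in.
    items.append('yes')

def rebind(items):
    # Builds a new list and points the local name at it;
    # the caller's list is left as it was.
    items = items + ['yes']
    return items

example = []
mutate(example)
after_mutate = list(example)  # snapshot of the caller's list after mutation

result = rebind(example)

print(after_mutate)  # ['yes']
print(example)       # ['yes']
print(result)        # ['yes', 'yes']
```

Only the in-place operation is visible to the caller; the rebinding version behaves like working on a copy.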

chirag ghiyad