def function_name(df):
    for i, row in df.iterrows():
        df.set_value(...)
    return df

if __name__ == '__main__':
    # Assume we have a dataframe called idf
    idf = function_name(idf)

In the code above, I pass a dataframe called idf into a function called function_name. In that function, I loop over all rows in the dataframe, make some modifications and return a dataframe which I store back into idf.

I have a feeling that this approach wastes memory. Can someone correct me or point out a better, more Pythonic approach? Please note that I have a good reason to use iterrows, even though it makes everything slower. I just want feedback on the way I am passing the dataframe to a function and getting it back.

---EDIT--

Based on feedback, especially from @Marius, here's what I want to know:

By passing the dataframe into the function, am I making a new copy of the dataframe? That is the memory waste I am concerned about.

user308827
  • "I have a feeling that this approach is wasting memory" - why do you have this feeling? – Ami Tavory Sep 19 '15 at 23:43
  • Don't know for sure, wish I had a better answer – user308827 Sep 19 '15 at 23:55
  • I think your suspicions about memory usage are off, `iterrows` should only be yielding the rows one at a time as needed, not creating a list of rows all at once that then have to be stored in memory. If you want answers about how to improve efficiency, you probably need to be more specific about what the actual problem is. – Marius Sep 20 '15 at 00:25
  • Are you trying to set the value of an existing column by applying a scalar function to each row? If that's the case, instead of iterating over the rows you can consider the apply, map, or applymap methods, depending on your need. This is a pretty good summary: http://stackoverflow.com/questions/19798153/difference-between-map-applymap-and-apply-methods-in-pandas – leroyJr Sep 20 '15 at 00:30
  • @Marius, By passing dataframe into the function, am I making a new copy of the dataframe? That is the memory wastage I am concerned with – user308827 Sep 20 '15 at 00:34
  • @leroyJr I have plenty of if statements inside that function, else I would have used apply – user308827 Sep 20 '15 at 00:34
  • @user308827: No, Python doesn't copy values when you pass them as arguments, it just binds a new name to the existing value. It's hard to explain exactly how Python works in this regard, you can start with something like [this](http://stupidpythonideas.blogspot.com.au/2013/11/does-python-pass-by-value-or-by.html) – Marius Sep 20 '15 at 00:41
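
The point in the last comment can be checked directly. Below is a minimal sketch, not the asker's actual code: the columns `a` and `b`, the `if` condition, and the update rule are hypothetical stand-ins for the elided `set_value(...)` call, and `df.at` is used because `set_value` is no longer available in current pandas releases. It suggests that the function receives the same object the caller holds, so no second dataframe is created by the call itself.

```python
import pandas as pd


def function_name(df):
    # Hypothetical row-wise update; the real logic is elided in the question.
    for i, row in df.iterrows():
        if row["a"] > 1:
            # Write back into the original frame by label.
            df.at[i, "b"] = row["a"] * 10
    return df


# Hypothetical sample data.
idf = pd.DataFrame({"a": [1, 2, 3], "b": [0, 0, 0]})
result = function_name(idf)

print(result is idf)      # True: both names refer to the same object
print(idf["b"].tolist())  # [0, 20, 30]: the caller's frame was modified in place
```

Because the function mutates its argument, the `idf = function_name(idf)` reassignment is harmless but not strictly needed. If an independent copy is wanted, it has to be requested explicitly, for example with `df.copy()` at the top of the function.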
