
I would like to apply a lambda function to several columns, but I am not sure how to loop through the columns. I have columns Column1 through Column50 and want the exact same operation applied to each, but I can't figure out how to iterate through them in place of x.column in the code below. Is there a way to do this?

for column in df:
   df[column] = df.apply(lambda x: x.datacolumn * x.datacolumn2 if x.column >= x.datacolumn3, axis=1)
azro
lilbit
  • Can you be a little more specific about what you are trying to accomplish here? From your code, it looks like you want to do something like "set the value of `df[column]` to `df["datacolumn"] * df["datacolumn2"]` if `df[column] >= df["datacolumn3"]`" for all `column`s in `df`. Am I interpreting your intention correctly? – PaSTE Nov 09 '19 at 00:21

3 Answers


Are you looking for something like map()? map() applies a function to every item in a list (or other iterable); in Python 3 it returns an iterator over the results, which you can pass to list() if you need a list.

Here's an eloquent explanation of how it works (way better than what I could write).

At a certain point, however, declaring a normal function and/or using a for loop might be easier.
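As a minimal sketch of that trade-off, the snippet below applies the same function once with map() and once with a plain for loop. The names `values` and `square` are made up for illustration, and this is plain Python rather than anything pandas-specific.

```python
# Hypothetical input data and function, chosen only for illustration.
values = [1, 2, 3, 4]


def square(n):
    """Return n multiplied by itself."""
    return n * n


# map() returns a lazy iterator in Python 3, so wrap it in list() to get a list.
squares_via_map = list(map(square, values))

# The equivalent explicit loop, which can be easier to read and extend.
squares_via_loop = []
for n in values:
    squares_via_loop.append(square(n))
```

Both approaches produce the same list; the loop tends to be clearer once the body needs more than one expression.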

Michael Noguera
  • Disclaimer: This isn't pandas-specific. I'm not very experienced with pandas, so it's entirely possible that this won't work for pandas data structures. – Michael Noguera Nov 08 '19 at 22:51

First, you are missing the else branch (what should happen when the if condition is False?). Second, to access the elements of the pandas Series that is passed to the lambda function (one row of the DataFrame), you can use positional indexes.

For example, setting the value to 0 if the condition does not hold:

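for column in df:
    # x is one row (a Series); .iloc gives explicit positional access to its elements
    df[column] = df.apply(lambda x: x.iloc[0] * x.iloc[1] if x.iloc[0] >= x.iloc[2] else 0, axis=1)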
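A label-based variant of the same idea might look like the sketch below. It assumes the interpretation from the comment under the question: each target column is compared against `datacolumn3` and replaced by `datacolumn * datacolumn2` when the condition holds, else 0. The column names come from the question; the sample values and the `targets` list are made up for illustration.

```python
import pandas as pd

# Hypothetical data: column names follow the question, values are invented.
df = pd.DataFrame({
    "Column1": [1, 5, 9],
    "Column2": [7, 2, 3],
    "datacolumn": [2, 3, 4],
    "datacolumn2": [10, 10, 10],
    "datacolumn3": [4, 4, 4],
})

# Loop only over the columns to update, not over the helper columns,
# so the inputs of the calculation are never overwritten.
targets = [c for c in df.columns if c.startswith("Column")]

for column in targets:
    # x is one row; look values up by column label instead of by position.
    df[column] = df.apply(
        lambda x: x["datacolumn"] * x["datacolumn2"]
        if x[column] >= x["datacolumn3"]
        else 0,
        axis=1,
    )
```

Restricting the loop to `targets` matters here: iterating over every column in `df` would also overwrite `datacolumn`, `datacolumn2` and `datacolumn3` part-way through.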
y.luis.rojo

It might be easiest to extract each column as a list, perform the operation, then write the result back into the dataframe.

for column in df:
    temp = df.loc[:, column].tolist()   # pull the column out as a list using loc
    if temp[0] > temp[2]:
        temp[0] = temp[0] * temp[1]
    df.loc[:, column] = temp            # overwrite the original df column

The above leaves the data unchanged if the condition is not met.
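If the comparison is meant row by row, as the comment under the question suggests, the same "leave unchanged otherwise" behaviour can be sketched without lists using `Series.mask`. This is a different technique from the list approach above, not a rewrite of it. The column names come from the question; the sample values and the `targets` list are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: column names follow the question, values are invented.
df = pd.DataFrame({
    "Column1": [1, 5, 9],
    "Column2": [7, 2, 3],
    "datacolumn": [2, 3, 4],
    "datacolumn2": [10, 10, 10],
    "datacolumn3": [4, 4, 4],
})

targets = ["Column1", "Column2"]  # assumed list of columns to update
product = df["datacolumn"] * df["datacolumn2"]

for column in targets:
    meets_condition = df[column] >= df["datacolumn3"]
    # mask() replaces values where the condition is True and keeps the rest.
    df[column] = df[column].mask(meets_condition, product)
```

Because the whole column is processed at once, this avoids the per-row Python function call that `apply(..., axis=1)` makes.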

neutrino_logic