-2

I've been using this loop to run a different function on each value of a dataframe,

where xxx is a two-column dataframe:

for i in range(1, len(xxx) + 1):  # + 1 so the last row is not skipped
    row = xxx[i-1:i]  # one-row slice of the dataframe
    do_something(row['value1'])
    do_something_else(row['value2'])

This works fine, but I've always wondered whether there is a way to make the same operation more readable.

Please answer with concepts or libraries that I should check.

  • 2
    Does this answer your question? [How to iterate over rows in a DataFrame in Pandas](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) – Stas Buzuluk Sep 22 '20 at 16:16
  • 1
    If you need to iterate over the rows of your data frame, you should seriously question whether a data frame is the best representation for your data. Almost all uses are better solved by some form of vectorization: apply a function to all rows of the data frame (i.e. let the run-time system manage your iteration). – Prune Sep 22 '20 at 16:44
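
To illustrate the vectorization point above, here is a minimal sketch. Only the column names value1 and value2 come from the question; the sample data and the two operations (doubling and upper-casing) are made-up stand-ins for do_something and do_something_else, so treat this as an assumption about what those functions might do.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
xxx = pd.DataFrame({"value1": [1, 2, 3], "value2": ["a", "b", "c"]})

# Whole-column operations replace the explicit row loop.
doubled = xxx["value1"] * 2        # stand-in for do_something
upper = xxx["value2"].str.upper()  # stand-in for do_something_else

print(doubled.tolist())
print(upper.tolist())
```

If the real functions cannot be expressed as column operations, Series.map (shown in the answers below) is the usual fallback.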

4 Answers

2

Try this:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['A', 'B', 'C', 'D']})

def func(x):
    return x**2

# Series.map calls func on each value of column 'A'
df['A'] = df['A'].map(func)
Vaziri-Mahmoud
1

You can apply a function along an axis of the DataFrame (rows or columns) with apply:

pandas.DataFrame.apply
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)
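
A minimal sketch of what the axis argument changes, assuming a small made-up dataframe; the column names and the helper combine are illustrative only. With axis=1 the function receives one row at a time as a Series, and with axis=0 (the default) it receives one column at a time.

```python
import pandas as pd

# Hypothetical two-column dataframe.
df = pd.DataFrame({"value1": [1, 2, 3], "value2": [10, 20, 30]})

def combine(row):
    # With axis=1, `row` is a Series holding one row, indexed by column name.
    return row["value1"] + row["value2"]

row_sums = df.apply(combine, axis=1)              # called once per row
col_max = df.apply(lambda col: col.max(), axis=0)  # called once per column

print(row_sums.tolist())
print(col_max.tolist())
```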
1

You can also use a lambda function together with the apply method.

Let's say that you have a function that converts an element to a string and then capitalizes that string.

def capitalize(cell):
    return str(cell).capitalize()

You can then apply that function to every value in a chosen column:

df["Column"].apply(lambda x: capitalize(x))
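
For completeness, a runnable sketch of the call above on made-up data (the values in "Column" are an assumption). It also shows that the lambda wrapper is optional here, since apply accepts the function object directly.

```python
import pandas as pd

def capitalize(cell):
    # Convert any value to a string, then capitalize it.
    return str(cell).capitalize()

# Hypothetical column with mixed content.
df = pd.DataFrame({"Column": ["apple", "bANANA", 42]})

with_lambda = df["Column"].apply(lambda x: capitalize(x))
direct = df["Column"].apply(capitalize)  # same result without the lambda

print(with_lambda.tolist())
```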
ats
1

One potential solution is to map regular functions or lambda functions over the columns of the dataframe, which is much faster than an explicit loop (e.g. df.iterrows()).

Here is a summary of efficient DataFrame/Series manipulation methods, based on an answer here:

  • map works for Series ONLY
  • applymap works for DataFrames ONLY
  • apply works for BOTH

Here is a toy example:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4, 2), columns=list('AB'))
print(df)

def square(x):
    return x**2

# mapping a lambda function
print('Result of mapping with a lambda function')
df['A'] = df['A'].map(lambda x: x**2)
print(df)

# mapping a regular function
print('Result of mapping with a regular function')
df['C'] = df['A'].map(square)
print(df)

# applymap
print('Result of applymap with a regular function')
df1 = df.applymap(square)
print(df1)

# apply
print('Result of applying with a regular function')
df2 = df.apply(square)
print(df2)

Output:

          A         B
0 -0.030899 -2.206942
1  0.080991  0.049431
2  1.190754 -0.101161
3  0.794870 -0.969503

Result of mapping with a lambda function
          A         B
0  0.000955 -2.206942
1  0.006560  0.049431
2  1.417894 -0.101161
3  0.631818 -0.969503

Result of mapping with a regular function
          A         B             C
0  0.000955 -2.206942  9.115775e-07
1  0.006560  0.049431  4.302793e-05
2  1.417894 -0.101161  2.010425e+00
3  0.631818 -0.969503  3.991945e-01

Result of applymap with a regular function
              A         B             C
0  9.115775e-07  4.870592  8.309735e-13
1  4.302793e-05  0.002443  1.851403e-09
2  2.010425e+00  0.010234  4.041807e+00
3  3.991945e-01  0.939936  1.593563e-01

Result of applying with a regular function
              A         B             C
0  9.115775e-07  4.870592  8.309735e-13
1  4.302793e-05  0.002443  1.851403e-09
2  2.010425e+00  0.010234  4.041807e+00
3  3.991945e-01  0.939936  1.593563e-01
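
One caveat when running the example on a recent pandas: DataFrame.applymap was deprecated in pandas 2.1 in favour of DataFrame.map. The sketch below, using a made-up dataframe, picks whichever element-wise method the installed version provides; the name elementwise is just an illustrative local variable.

```python
import pandas as pd

def square(x):
    return x**2

# Hypothetical integer dataframe so the result is easy to check by eye.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# DataFrame.map exists from pandas 2.1 onward; older versions only have applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
squared = elementwise(square)

print(squared)
```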
Grayrigel