
I have: df = pd.DataFrame([[1, 2, 3], [2, 4, 6], [3, 6, 9]], columns=['A', 'B', 'C'])

and I need to calculate the difference between the values at rows i+1 and i in each column, and store it back in the same column. The desired output is:

Out[2]: 
   A  B  C
0  1  2  3
1  1  2  3
2  1  2  3

I tried the code below, but I end up with a single list with all the values appended, and I need them stored separately (in lists, or in the same dataframe).

Is there a way to do it?


difs=[]
for column in df:
    for i in range(len(df)-1):
        a = df[column]
        b = a[i+1]-a[i]
        difs.append(b)

for x in difs:
    for column in df:
        df[column]=x
Rose
  • Please have a look at [How to create good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and provide a [mcve] with sample input and output – G. Anderson Aug 08 '19 at 18:12

1 Answer

You can use the pandas function shift to achieve this. According to the docs, this is what it does:

Shift index by desired number of periods with an optional time freq.
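To make that description concrete, here is a minimal sketch of what shift(1) returns for column A of the question's dataframe. The variable names shifted and prev are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 4, 6], [3, 6, 9]], columns=["A", "B", "C"])

# shift(1) moves every value down one row; row 0 has no predecessor, so it becomes NaN
shifted = df["A"].shift(1)
print(shifted.tolist())  # [nan, 1.0, 2.0]

# Replacing that NaN with 0 means row 0 is compared against 0, so it keeps its own value
prev = shifted.fillna(0)
print((df["A"] - prev).tolist())  # [1.0, 1.0, 1.0]
```

The NaN in the first row is also why the column becomes float after the subtraction.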

for col in df:
    df[col] = df[col] - df[col].shift(1).fillna(0)

df
Out[1]:
    A       B       C
0   1.0     2.0     3.0
1   1.0     2.0     3.0
2   1.0     2.0     3.0

Added

If you would rather use a loop, a good approach is probably iterrows (see the pandas docs), since it yields (index, Series) pairs.

difs = []
for i, row in df.iterrows():
    if i == 0:
        x = row.values.tolist() ## so we preserve the first row
    else:
        x = (row.values - df.loc[i-1, df.columns]).values.tolist()
    difs.append(x)

difs
Out[1]:
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]

## Create new / replace old dataframe
cols = [col for col in df.columns]
new_df = pd.DataFrame(difs, columns=cols)

new_df
Out[2]:
   A  B  C
0  1  2  3
1  1  2  3
2  1  2  3
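If the goal is to keep the differences for each column in their own list, as the question asks, one possible sketch is to collect them in a dict keyed by column name instead of appending everything to a single list. The names difs and new_df are reused from above for illustration. The first value of each column is kept so the result matches the desired output.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 4, 6], [3, 6, 9]], columns=["A", "B", "C"])

difs = {}
for column in df:
    values = df[column].tolist()
    # keep the first value, then append each (i+1) - (i) difference
    difs[column] = [values[0]] + [values[i + 1] - values[i] for i in range(len(values) - 1)]

print(difs)  # {'A': [1, 1, 1], 'B': [2, 2, 2], 'C': [3, 3, 3]}

# Build a new dataframe from the per-column lists rather than overwriting df in the loop
new_df = pd.DataFrame(difs)
```

Writing to a separate dict leaves df untouched while it is being read, which avoids the problem of changing a dataframe while iterating over it.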
realr
  • That’s a useful function! Many thanks! Could it also be done with the loop I was using? Kind regards – Rose Aug 08 '19 at 18:53
  • Hi @Rose, it is possible. You just have to be careful not to change a pandas dataframe's own values while iterating over it. But absolutely, you can run the loop and store the results in a separate list. I will edit the response to include a looping example – realr Aug 08 '19 at 19:50
  • That’s absolutely great! Many thanks for the explanation! Best regards! – Rose Aug 08 '19 at 20:17
  • Absolutely @Rose! – realr Aug 08 '19 at 20:34