Code:
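import pandas as pd

# Load only the two columns needed.
df = pd.read_csv('xyz.csv', usecols=['transaction_date', 'amount'])
# Keep only the amounts that occur more than 3 times; rows end up grouped by amount.
df = pd.concat(g for _, g in df.groupby("amount") if len(g) > 3)
df = df.reset_index(drop=True)
print(df)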
Output:
transaction_date amount
0 2016-06-02 50.0
1 2016-06-02 50.0
2 2016-06-02 50.0
3 2016-06-02 50.0
4 2016-06-02 50.0
5 2016-06-02 50.0
6 2016-07-04 50.0
7 2016-07-04 50.0
8 2016-09-29 225.0
9 2016-10-29 225.0
10 2016-11-29 225.0
11 2016-12-30 225.0
12 2017-01-30 225.0
13 2016-05-16 1000.0
14 2016-05-20 1000.0
I need to add another column next to the amount column that gives the difference, in days, between each row's transaction_date and the transaction_date of the row before it. For example:
transaction_date amount delta(days)
0 2016-06-02 50.0 -
1 2016-06-02 50.0 0
2 2016-06-02 50.0 0
3 2016-06-02 50.0 0
4 2016-06-02 50.0 0
5 2016-06-02 50.0 0
6 2016-07-04 50.0 32
7 2016-07-04 50.0 .
8 2016-09-29 225.0 .
9 2016-10-29 225.0 .
10 2016-11-29 225.0
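
For reference, here is a minimal sketch of the kind of result I am after, not a confirmed solution. It assumes transaction_date can be parsed with pd.to_datetime, and it uses a few rows copied from the output above in place of xyz.csv. The column name delta_in_group is my own illustration, for the case where the difference should restart within each amount.

```python
import pandas as pd

# A few rows copied from the output above, standing in for xyz.csv.
df = pd.DataFrame({
    "transaction_date": ["2016-06-02", "2016-06-02", "2016-07-04",
                         "2016-07-04", "2016-09-29", "2016-10-29"],
    "amount": [50.0, 50.0, 50.0, 50.0, 225.0, 225.0],
})

# The dates must be datetime64 values, not strings, for date arithmetic.
df["transaction_date"] = pd.to_datetime(df["transaction_date"])

# Difference from the previous row in whole days; the first row has no
# predecessor, so it comes out as NaN.
df["delta(days)"] = df["transaction_date"].diff().dt.days

# Variant (assumption): restart the difference at the first row of each amount.
df["delta_in_group"] = df.groupby("amount")["transaction_date"].diff().dt.days

print(df)
```

With these rows, the row at index 2 gets 32 in delta(days), which matches the expected value shown for 2016-07-04 above.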