What is a more efficient way to bin the amount
column into different buckets and get the number of rows in each bucket?
The buckets are:
1. amount < 10000
2. amount >=10000 & <100000
3. amount >=100000 & <500000
4. amount > 500000
I tried to implement this with the following sample data and code:
sample_data =
         date   amount  type
0  2018-09-28   4000.0     D
1  2018-11-23   2000.0     D
2  2018-12-27     52.5     D
3  2018-10-02  20000.0     D
4  2018-11-27   4000.0     C
5  2018-06-01    500.0     D
6  2018-07-02   5000.0     D
7  2018-07-02     52.5     D
8  2018-10-31    500.0     D
9  2018-11-26   2000.0     C
sample_data['Transactions_bin_1'] = sample_data[sample_data.amount < 10000]['amount']
sample_data['Transactions_bin_2'] = sample_data[(sample_data['amount'] >= 10000) & (sample_data['amount'] < 100000)]['amount']
sample_data['Transactions_bin_3'] = sample_data[(sample_data['amount'] >= 100000) & (sample_data['amount'] < 500000)]['amount']
sample_data['Transactions_bin_4'] = sample_data[sample_data.amount > 500000]['amount']
# Count the non-null entries in each bin column
bin_classification = {
    'bin1': sample_data.Transactions_bin_1.count(),
    'bin2': sample_data.Transactions_bin_2.count(),
    'bin3': sample_data.Transactions_bin_3.count(),
    'bin4': sample_data.Transactions_bin_4.count()
}
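For comparison, here is a minimal sketch of what a single-pass version might look like with `pd.cut` and `value_counts`, instead of one helper column per bucket. The bucket edges are an assumption: the intervals are treated as left-closed (`right=False`), so an amount of exactly 100000 lands in bin3 and exactly 500000 in bin4. The names `edges`, `labels` and `binned` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table above
sample_data = pd.DataFrame({
    "date": pd.to_datetime([
        "2018-09-28", "2018-11-23", "2018-12-27", "2018-10-02", "2018-11-27",
        "2018-06-01", "2018-07-02", "2018-07-02", "2018-10-31", "2018-11-26",
    ]),
    "amount": [4000.0, 2000.0, 52.5, 20000.0, 4000.0,
               500.0, 5000.0, 52.5, 500.0, 2000.0],
    "type": ["D", "D", "D", "D", "C", "D", "D", "D", "D", "C"],
})

# Assumed bucket edges; right=False makes each interval [lower, upper)
edges = [-np.inf, 10_000, 100_000, 500_000, np.inf]
labels = ["bin1", "bin2", "bin3", "bin4"]

# Assign every amount to a bucket in one vectorised call
binned = pd.cut(sample_data["amount"], bins=edges, labels=labels, right=False)

# Count rows per bucket; reindex so empty buckets are reported as 0
counts = binned.value_counts().reindex(labels, fill_value=0)
bin_classification = counts.to_dict()

print(bin_classification)
```

With the sample data above, nine amounts fall below 10000 and one (20000.0) falls in the second bucket; the other two buckets are empty. If the boundaries should be handled differently, only `edges` and `right` would need to change.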