I'm trying to determine the best way to count transactions per customer within amount ranges for a project. Below is a sample CSV dataset (customer, amount, time).
Customer A,$1.55 ,12:01
Customer B,$12.95 ,12:02
Customer A,$28.77 ,12:03
Customer C,$35.65 ,12:04
Customer A,$11.95 ,12:05
Customer C,$7.65 ,12:06
Customer B,$55.96 ,12:07
Customer C,$44.25 ,12:08
If I have thousands of transactions occurring per second, and I want to know the number (count) of transactions for each customer in each range, say $0-$10, $11-$20, $21-$30, $31-$40, and so on, what is the most efficient way to do this?
If I make a sublist per customer in Python, like
    customer_a = [1, 5, 12, 25, 18, 11]
then I can do it pretty easily with a count function of some sort, like:
    def count_range_in_list(li, low, high):
        """Count the items in li that fall within [low, high], inclusive."""
        # low/high are used instead of min/max to avoid shadowing the built-ins
        ctr = 0
        for x in li:
            if low <= x <= high:
                ctr += 1
        return ctr

    print(count_range_in_list(customer_a, 0, 10))
[source: https://www.w3resource.com/python-exercises/list/python-data-type-list-exercise-31.php ]
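One variation I have considered, instead of calling the function above once per range, is to compute every range in a single pass by deriving a bucket from each amount with integer division. This is only a sketch under my own assumptions: the name `count_by_bucket` and the constant `BUCKET_WIDTH` are made up for illustration, and the buckets are half-open (an amount of exactly $10 lands in the $10 bucket, not the $0 one), which may not match the ranges I listed.

```python
from collections import Counter

BUCKET_WIDTH = 10  # assumed width of each range, in dollars


def count_by_bucket(amounts, width=BUCKET_WIDTH):
    """Return a Counter mapping each bucket's lower bound to its count.

    Buckets are half-open: [0, width), [width, 2 * width), and so on.
    """
    counts = Counter()
    for amount in amounts:
        lower = int(amount // width) * width  # e.g. 18 -> 10, 25 -> 20
        counts[lower] += 1
    return counts


customer_a = [1, 5, 12, 25, 18, 11]
buckets = count_by_bucket(customer_a)
for lower in sorted(buckets):
    print(f"from ${lower} up to ${lower + BUCKET_WIDTH}: {buckets[lower]}")
```

Each amount is touched once, so the cost does not grow with the number of ranges.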
I'm just struggling to find an efficient way to do this on a large dataset. Since I know my customers in advance, could I match each transaction against that customer list to split the data per customer, and then do the count by range?
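To make that idea concrete, here is a rough sketch of grouping by customer while reading the rows, so that no per-customer sublists have to be built first. Everything beyond the sample data is an assumption on my part: the function name `count_transactions`, the $10 bucket width, half-open buckets, and reading from an in-memory string instead of the real transaction feed.

```python
import csv
import io
from collections import Counter, defaultdict
from decimal import Decimal

BUCKET_WIDTH = 10  # assumed width of each range, in dollars

# The sample rows from this post, standing in for the real data source.
SAMPLE = """\
Customer A,$1.55 ,12:01
Customer B,$12.95 ,12:02
Customer A,$28.77 ,12:03
Customer C,$35.65 ,12:04
Customer A,$11.95 ,12:05
Customer C,$7.65 ,12:06
Customer B,$55.96 ,12:07
Customer C,$44.25 ,12:08
"""


def count_transactions(rows, width=BUCKET_WIDTH):
    """Return {customer: Counter({bucket_lower_bound: count})} in one pass."""
    counts = defaultdict(Counter)
    for customer, amount_text, _time in rows:
        # "$1.55 " -> Decimal("1.55"); Decimal avoids float rounding on money
        amount = Decimal(amount_text.strip().lstrip("$"))
        lower = int(amount // width) * width
        counts[customer][lower] += 1
    return counts


counts = count_transactions(csv.reader(io.StringIO(SAMPLE)))
for customer in sorted(counts):
    for lower in sorted(counts[customer]):
        print(customer, f"from ${lower} up to ${lower + BUCKET_WIDTH}:",
              counts[customer][lower])
```

Because the counters are updated row by row, the same loop could in principle consume a live stream rather than a file, though I have not tested it at thousands of rows per second.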
Any help getting started would be nice. Thanks.