I am new to Python and I am trying to rewrite a snippet of my program from two nested for loops into a list comprehension in Python 3. While doing this, I am unable to calculate the sum of more than one column. For example, if each row is a dictionary with columns 1, 2, and 3 of types float, int, and string, I am able to calculate the sum only for column 1 and not for column 2.
The double for loop looks something like this:
import itertools

final_dict = []
for k, g in itertools.groupby(temp_dict, key=lambda x: (x['transaction_category_id'])):
    txn_amount = 0
    distinct_user_count = 0
    for v in g:
        # print(k)
        txn_amount += float(v['transaction_amount'])
        distinct_user_count += v['user_id_count']
        # print(v)
    final_dict.append({'transaction_category_id': k, 'aggregated_transaction_amount': txn_amount,
                       'distinct_user_count': distinct_user_count})
Ideally, the optimised code would look something like this:
final_result = [[k, sum(float(v['transaction_amount']) for v in g), sum(s['user_id_count'] for s in g)]
                for k, g in itertools.groupby(temp_dict, key=lambda x: (x['transaction_category_id']))]
But this code does not add up the values of the user_id_count column; it returns 0 as the sum.
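A minimal sketch of what may be going on here: each group that itertools.groupby yields is an iterator, so it can only be walked once. The two rows below are made up for illustration and are not my real data.

```python
import itertools

# Two hypothetical rows in the same category (illustrative only).
rows = [
    {"transaction_category_id": "4", "user_id_count": 1},
    {"transaction_category_id": "4", "user_id_count": 1},
]

for k, g in itertools.groupby(rows, key=lambda x: x["transaction_category_id"]):
    first_pass = sum(r["user_id_count"] for r in g)   # consumes the group iterator
    second_pass = sum(r["user_id_count"] for r in g)  # nothing left, so the sum is 0
    print(k, first_pass, second_pass)  # prints: 4 2 0
```

This matches the symptom: the first sum() over g works, and the second sum() over the same g sees an empty iterator.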
The sample data looks something like this:
user_id,transaction_amount,transaction_category_id
b2d30a62-36bd-41c6-8221-987d5c4cd707,63.05,3
b2d30a62-36bd-41c6-8221-987d5c4cd707,13.97,4
b2d30a62-36bd-41c6-8221-987d5c4cd707,97.15,4
b2d30a62-36bd-41c6-8221-987d5c4cd707,23.54,5
The ideal output would look like this:
['4', 111.12, 2],
['3', 63.05, 1],
['5', 23.54, 1],
but it actually prints:
['4', 111.12, 0],
['3', 63.05, 0],
['5', 23.54, 0],
I tried the list comprehension with the two sum() calls shown earlier, but the output is not what I expected.
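For reference, here is a sketch of one comprehension that might produce the ideal output, by turning each group into a list before summing over it twice. It rests on assumptions: the sample data does not show a user_id_count column, so I set it to 1 per row, and the rows are assumed to be already sorted by transaction_category_id, since groupby only groups consecutive rows. The round() call is my addition to keep the floating-point totals at two decimals.

```python
import itertools

# Sample rows from the question; user_id_count = 1 per row is an assumption.
uid = "b2d30a62-36bd-41c6-8221-987d5c4cd707"
temp_dict = [
    {"user_id": uid, "transaction_amount": "63.05", "transaction_category_id": "3", "user_id_count": 1},
    {"user_id": uid, "transaction_amount": "13.97", "transaction_category_id": "4", "user_id_count": 1},
    {"user_id": uid, "transaction_amount": "97.15", "transaction_category_id": "4", "user_id_count": 1},
    {"user_id": uid, "transaction_amount": "23.54", "transaction_category_id": "5", "user_id_count": 1},
]

def by_category(row):
    return row["transaction_category_id"]

# The inner generator materialises each group with list(g) while groupby is
# still positioned on that group, so the outer expression can iterate it twice.
final_result = [
    [k,
     round(sum(float(v["transaction_amount"]) for v in group), 2),
     sum(v["user_id_count"] for v in group)]
    for k, group in ((k, list(g)) for k, g in itertools.groupby(temp_dict, key=by_category))
]

for row in final_result:
    print(row)
```

With the four sample rows this gives one row per category, in the order 3, 4, 5, with counts of 1, 2, and 1.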