I have a sorted and nested list. Each element in the list has 3 sub-elements; 'Drugname', Doctor_id, Amount. For a given drugname (which repeats) the doctor ids are different and so are the amounts. See sample list below..
I need an output where, for each drugname, I need to count the total UNIQUE doctor ids and the sum of the dollar amount for that drug. For ex, for the list snippet below..
[
['CIPROFLOXACIN HCL', 1801093968, 61.49],
['CIPROFLOXACIN HCL', 1588763981, 445.23],
['HYDROCODONE-ACETAMINOPHEN', 1801093968, 251.52],
['HYDROCODONE-ACETAMINOPHEN', 1588763981, 263.16],
['HYDROXYZINE HCL', 1952310666, 945.5],
['IBUPROFEN', 1801093968, 67.06],
['INVEGA SUSTENNA', 1952310666, 75345.68]
]
The desired output is as below.
[
['CIPROFLOXACIN HCL', 2, 516.72],
['HYDROCODONE-ACETAMINOPHEN', 2, 514.68]
['HYDROXYZINE HCL', 1, 945.5]
['IBUPROFEN', 1, 67.06]
['INVEGA SUSTENNA', 1, 75345.68]
]
In a database world this is the easiest thing with a simple GROUP BY on drugname. In Python, I am not allowed to use PANDAS, NumPy etc. Just the basic building blocks of Python. I tried the below code but I am unable to reset the count variable to count doctor ids and amounts. This commented code is one of several attempts. Not sure if I need to use a nested for loop or a for loop-while loop combo.
All help is appreciated!
aggr_list = []
temp_drug_name = ''
doc_count = 0
amount = 0
for list_element in sorted_new_list:
temp_drug_name = list_element[0]
if temp_drug_name == list_element[0]:
amount += float(amount)
doc_count += 1
aggr_list.append([temp_drug_name, doc_count, amount])
print(aggr_list)