Python dictionary key length not same as rows returning for the query in mysql

Question

So i am trying to fetch data from the mysql into a python dictionary here is my code.

def getAllLeadsForThisYear():
charges={}
cur.execute("select lead_id,extract(month from transaction_date),pid,extract(Year from transaction_date) from transaction where lead_id is not NULL and transaction_type='CHARGE' and YEAR(transaction_date)='2015'")
for i in cur.fetchall():
    lead_id=i[0]
    month=i[1]
    pid=i[2]
    year=str(i[3])
    new={lead_id:[month,pid,year]}
    charges.update(new)
return charges

x=getAllLeadsForThisYear()

when i prints (len(x.keys()) it gave me some number say 450

When i run the same query in mysql it returns me 500 rows.Although i do have some same keys in dictionary but it should count them as i have not mentioned it if i not in charges.keys(). Please correct me if i am wrong. Thanks

since `charges` is a dict, it can only hold one value for each key. If you have multiple keys, it will overwrite whatever value you had with the new value. — R Nar, Dec 11 '15 at 17:44
Thanks R Nar, Can you please suggest me any other way so that i can fetch all keys in a dictionary, because in my data , key can be same sometimes but dictionary contents(array) are different . — Shubham, Dec 11 '15 at 17:48
the best way would be to set `charges = [defaultdict(list)`](https://docs.python.org/2/library/collections.html#collections.defaultdict) and appending each `(month,pid,year)` to the list at each key, rather than having a single list, but this will still give you the same amount of keys. why do you need to know the amount of keys anyways? — R Nar, Dec 11 '15 at 17:53
I am not concerned about the number it prints but it should have it in the dictionary, because i need to use that dictionary to generate some kind of report but i am having confliction in the data in my report as i was trying to compare it with the data in MYSQL. — Shubham, Dec 11 '15 at 17:57
i am thinking to change it to something like this charges ={id(some uniquenumber) :{lead_id:[month,pid,year]}} Will this work ? — Shubham, Dec 11 '15 at 17:59
yes but unless you have some way to hash that unique id, it would probably defeat the purpose of your dict. If you dont actually need the length, I believe my approach is what you are look for, I will make an answer to clarify what I meant — R Nar, Dec 11 '15 at 18:02

score 1 · Accepted Answer · edited May 23 '17 at 11:59

As I said, the problem is that you are overwriting your value at a key every time a duplicate key pops up. This can be fixed two ways:

You can do a check before adding a new value and if the key already exists, append to the already existing list.

For example:

#change these lines
new={lead_id:[month,pid,year]}
    charges.update(new)
#to
if lead_id in charges:
    charges[lead_id].extend([month,pid,year])
else
    charges[lead_id] = [month,pid,year]

Which gives you a structure like this:

charges = {
    '123':[month1,pid1,year1,month2,pid2,year2,..etc]
    }

With this approach, you can reach each separate entry by chunking the value at each key by chunks of 3 (this may be useful)

However, I don't really like this approach because it requires you to do that chunking. Which brings me to approach 2.

Use defaultdict from collections which acts in the exact same way as a normal dict would except that it defaults a value when you try to call a key that hasn't already been made.

For example:

#change
charges={}
#to
charges=defaultdict(list)

#and change
new={lead_id:[month,pid,year]}
    charges.update(new)
#to
charges[lead_id].append((month,pid,year))

which gives you a structure like this:

charges = {
    '123':[(month1,pid1,year1),(month2,pid2,year2),(..etc]
    }

With this approach, you can now iterate through each list at each key with:

for key in charges:
     for entities in charges[key]:
         print(entities) # would print `(month,pid,year)` for each separate entry

If you are using this approach, dont forget to from collections import defaultdict. If you don't want to import external, you can mimic this by:

if lead_id in charges:
    charges[lead_id].append((month,pid,year))
else
    charges[lead_id] = [(month,pid,year)]

Which is incredibly similar to the first approach but does the explicit "create a list if the key isnt there" that defaultdict would do implicitly.

Thanks! just a quick question how i can access that key which is having more than one list, i.e'123': '[(month1,pid1,year1),(month2,pid2,year2),(..etc] . Something like this if len(charges['123']) > 1 ??? — Shubham, Dec 11 '15 at 18:34
Even keys with only one `(month,pid,year)` value will still be a list (single value of a 3 element tuple). Used the mention for loop to iterate through every value. If you want to find all keys with more than one value then yes, that would work fine. — R Nar, Dec 11 '15 at 18:36

Python dictionary key length not same as rows returning for the query in mysql

1 Answers1