0

So i am trying to fetch data from the mysql into a python dictionary here is my code.

def getAllLeadsForThisYear():
charges={}
cur.execute("select lead_id,extract(month from transaction_date),pid,extract(Year from transaction_date) from transaction where lead_id is not NULL and transaction_type='CHARGE' and YEAR(transaction_date)='2015'")
for i in cur.fetchall():
    lead_id=i[0]
    month=i[1]
    pid=i[2]
    year=str(i[3])
    new={lead_id:[month,pid,year]}
    charges.update(new)
return charges

x=getAllLeadsForThisYear()

when i prints (len(x.keys()) it gave me some number say 450

When i run the same query in mysql it returns me 500 rows.Although i do have some same keys in dictionary but it should count them as i have not mentioned it if i not in charges.keys(). Please correct me if i am wrong. Thanks

Shubham
  • 85
  • 1
  • 9
  • 2
    since `charges` is a dict, it can only hold one value for each key. If you have multiple keys, it will overwrite whatever value you had with the new value. – R Nar Dec 11 '15 at 17:44
  • Thanks R Nar, Can you please suggest me any other way so that i can fetch all keys in a dictionary, because in my data , key can be same sometimes but dictionary contents(array) are different . – Shubham Dec 11 '15 at 17:48
  • the best way would be to set `charges = [defaultdict(list)`](https://docs.python.org/2/library/collections.html#collections.defaultdict) and appending each `(month,pid,year)` to the list at each key, rather than having a single list, but this will still give you the same amount of keys. why do you need to know the amount of keys anyways? – R Nar Dec 11 '15 at 17:53
  • I am not concerned about the number it prints but it should have it in the dictionary, because i need to use that dictionary to generate some kind of report but i am having confliction in the data in my report as i was trying to compare it with the data in MYSQL. – Shubham Dec 11 '15 at 17:57
  • i am thinking to change it to something like this charges ={id(some uniquenumber) :{lead_id:[month,pid,year]}} Will this work ? – Shubham Dec 11 '15 at 17:59
  • yes but unless you have some way to hash that unique id, it would probably defeat the purpose of your dict. If you dont actually need the length, I believe my approach is what you are look for, I will make an answer to clarify what I meant – R Nar Dec 11 '15 at 18:02

1 Answers1

1

As I said, the problem is that you are overwriting your value at a key every time a duplicate key pops up. This can be fixed two ways:

  1. You can do a check before adding a new value and if the key already exists, append to the already existing list.

For example:

#change these lines
new={lead_id:[month,pid,year]}
    charges.update(new)
#to
if lead_id in charges:
    charges[lead_id].extend([month,pid,year])
else
    charges[lead_id] = [month,pid,year]

Which gives you a structure like this:

charges = {
    '123':[month1,pid1,year1,month2,pid2,year2,..etc]
    }

With this approach, you can reach each separate entry by chunking the value at each key by chunks of 3 (this may be useful)

However, I don't really like this approach because it requires you to do that chunking. Which brings me to approach 2.

  1. Use defaultdict from collections which acts in the exact same way as a normal dict would except that it defaults a value when you try to call a key that hasn't already been made.

For example:

#change
charges={}
#to
charges=defaultdict(list)

#and change
new={lead_id:[month,pid,year]}
    charges.update(new)
#to
charges[lead_id].append((month,pid,year))

which gives you a structure like this:

charges = {
    '123':[(month1,pid1,year1),(month2,pid2,year2),(..etc]
    }

With this approach, you can now iterate through each list at each key with:

for key in charges:
     for entities in charges[key]:
         print(entities) # would print `(month,pid,year)` for each separate entry

If you are using this approach, dont forget to from collections import defaultdict. If you don't want to import external, you can mimic this by:

if lead_id in charges:
    charges[lead_id].append((month,pid,year))
else
    charges[lead_id] = [(month,pid,year)]

Which is incredibly similar to the first approach but does the explicit "create a list if the key isnt there" that defaultdict would do implicitly.

Community
  • 1
  • 1
R Nar
  • 5,465
  • 1
  • 16
  • 32
  • Thanks! just a quick question how i can access that key which is having more than one list, i.e'123': '[(month1,pid1,year1),(month2,pid2,year2),(..etc] . Something like this if len(charges['123']) > 1 ??? – Shubham Dec 11 '15 at 18:34
  • Even keys with only one `(month,pid,year)` value will still be a list (single value of a 3 element tuple). Used the mention for loop to iterate through every value. If you want to find all keys with more than one value then yes, that would work fine. – R Nar Dec 11 '15 at 18:36