
My title is not very descriptive, but the problem is difficult to explain in one line. Hopefully the example below shows what I mean:

Here is the dictionary:

    # Every two elements in each list form a pair that should later be retrieved together.
    d = {"A": [33, 333, 11, 111, 27, 272], "B": [44, 444, 23, 233]}

I want to go through each key in the dictionary, iterate over its list (the value for that key), and run a test on each score. If the test passes, I want to recover the element that passed together with its paired element. Here is an example to show what I mean (I apologise for not making it very clear yet, please bear with me):

    i = 0
    for key, value in d.items():
        print key
        score_list = value[0::2] #get every other item (i.e. 33, 11, 27) , this returns a list
        highest_score_in_list = score_list[0]   # gets just 33 for key 'A' and 44 for key 'B'
        threshold = 0.8 * float(highest_score_in_list)  # 26.4 , 35.2
        for index, items in enumerate(score_list):
             i += 1
             id = value[1::2]    # I am hoping to get (333, 111, 272), but I am not getting what I want
             if float(items) <=float(threshold):
                 pass
             else:
                 print index, items, id[i]

The desired output is:

     A
     0 33 333
     2 27 272
     B
     0 44 444

I haven't worked it out correctly, though: I am getting an index error for `taxid[i]`. The threshold check works correctly, but I think the indexing is wrong, perhaps in the way I do `i += 1`. Instead of printing the id that belongs to each pair, the code fails to match them up and raises an error.

Please comment where I need to give any further clarification, and your help is greatly appreciated. I have been trying to solve it for some time. Thank you.

FgS2
  • Are you able to change the formatting of the input dict? If so, you might want a dict of lists of tuples. Eg. `{'a': [(33, 333), (11, 111)]}` – Morgan Thrapp Jul 01 '15 at 15:26
  • What is `taxid`? It is not defined anywhere in the given code. – Sudipta Jul 01 '15 at 15:26
  • @Morgan Thrapp: I guess I could, I don't know how to do it yet. I built the dictionary myself, I thought that this way was the best. I will look into how to make a dictionary of lists of tuples. Any good links? – FgS2 Jul 01 '15 at 15:28
  • Maybe you need to partition those lists using solutions from http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python – Eriks Dobelis Jul 01 '15 at 15:31
  • @FgS2 Links on what? Tuples? Lists? Dictionaries? For any of them I'm going to recommend the [official](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range) [docs](https://docs.python.org/3/library/stdtypes.html#dict). – Morgan Thrapp Jul 01 '15 at 15:32
  • @Morgan Thrapp: I meant on how to build a dictionary of lists of tuples. I can't find anything on that. – FgS2 Jul 01 '15 at 15:36
  • @FgS2 The same way you build a dictionary/list of anything else. I have no idea how you're building the initial dictionary, so I can't say. – Morgan Thrapp Jul 01 '15 at 15:37
  • @Morgan Thrapp The way I am building the dictionary is that I take variables from another list and use them as the keys, with lists of values in the dictionary: `for k, v, c in list: d[k].append(v); d[k].append(c)`. This way, I don't know how I could have grouped v and c together so that the final dictionary would look like the one you suggested. This is where I am stuck. – FgS2 Jul 01 '15 at 15:46
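
The dict of lists of tuples suggested in the comments can be built with almost the same loop. A minimal sketch, assuming the source list holds (key, score, id) triples; the `rows` data below is hypothetical, since the real input list is not shown in the thread:

```python
from collections import defaultdict

# Hypothetical input rows of (key, score, id); the real source list is not shown.
rows = [
    ("A", 33, 333), ("A", 11, 111), ("A", 27, 272),
    ("B", 44, 444), ("B", 23, 233),
]

d = defaultdict(list)
for k, v, c in rows:
    # Append one (score, id) tuple instead of two separate values,
    # so each pair stays together in the list.
    d[k].append((v, c))

print(dict(d))
```

Appending a single tuple per row means the pairing never has to be reconstructed later with slicing.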

2 Answers


You're using i and increasing it every time you see a score, so it's the total number of scores you've looked at so far (regardless of the key), not the location of the id corresponding to a score. You could fix this, but @MorganThrapp's idea to change the data structure is a good one.
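
The drift in `i` can be made visible with a small trace. This is only a sketch that mimics the counter from the question (the helper name `trace_counter` is made up for illustration), and it assumes Python 3.7+ so that the dict is iterated in insertion order:

```python
d = {"A": [33, 333, 11, 111, 27, 272], "B": [44, 444, 23, 233]}

def trace_counter(data):
    """Return (key, index, i, in_range) per score, mimicking the question's counter."""
    i = 0  # set once, before the outer loop, as in the question
    trace = []
    for key, value in data.items():
        scores, ids = value[0::2], value[1::2]
        for index, _ in enumerate(scores):
            i += 1  # incremented before use and never reset per key
            # in_range says whether ids[i] would be a valid lookup
            trace.append((key, index, i, i < len(ids)))
    return trace

for row in trace_counter(d):
    print(row)
```

Whenever `in_range` is false, `ids[i]` would raise an IndexError; even where it is true, `i` is one ahead of `index`, so the wrong id would be printed.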

Using a dictionary or a namedtuple would be a better idea, because then you wouldn't have to remember what each element in a tuple corresponds to (i.e. does score come first or does id?), but you could use zip to pair up values:

>>> vals = d["A"]
>>> vals[::2]
[33, 11, 27]
>>> vals[1::2]
[333, 111, 272]
>>> list(zip(vals[::2], vals[1::2]))
[(33, 333), (11, 111), (27, 272)]

and so

d = {"A": [33, 333, 11, 111, 27, 272,], "B": [44, 444, 23, 233]}
for key, value in sorted(d.items()):
    print(key)
    pairs = list(zip(value[::2], value[1::2]))
    threshold = 0.8 * max(score for score, id in pairs)
    for score, id in pairs:
        if score >= threshold:
            print(score, id)

gives

A
33 333
27 272
B
44 444

where we don't have to use indices at all.
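
For the namedtuple variant mentioned above, a minimal sketch could look like the following. The `Hit` type, its field names, and the `hits_above_threshold` helper are illustrative assumptions, not part of the original data; the threshold uses the maximum score, as in the loop above:

```python
from collections import namedtuple

# Hypothetical record type; the field names are chosen for illustration.
Hit = namedtuple("Hit", ["score", "id"])

d = {"A": [33, 333, 11, 111, 27, 272], "B": [44, 444, 23, 233]}

def hits_above_threshold(values, fraction=0.8):
    """Pair up the flat list and keep pairs scoring at least fraction * max score."""
    hits = [Hit(s, i) for s, i in zip(values[::2], values[1::2])]
    threshold = fraction * max(h.score for h in hits)
    return [h for h in hits if h.score >= threshold]

for key in sorted(d):
    print(key, hits_above_threshold(d[key]))
```

Accessing `h.score` and `h.id` by name removes the need to remember which position in the tuple holds which value.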

DSM
  • Yes, I agree that changing the data structure as Morgan suggested would be ideal. I will work on that as well. – FgS2 Jul 01 '15 at 15:57

The variable `i` is initialized and incremented in the wrong places: it needs to be reset to 0 for each key and incremented only after the current item has been checked.

See the corrected version:

for key, value in d.items():
    i = 0
    print key
    score_list = value[0::2]
    highest_score_in_list = score_list[0]
    threshold = 0.8 * float(highest_score_in_list)
    for index, items in enumerate(score_list):
        id = value[1::2]
        if float(items) <=float(threshold):
            pass
        else:
            print index, items, id[i]
        i += 1

Output:

A
0 33 333
2 27 272
B
0 44 444
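
In the corrected loop, `i` always equals `index`, so the separate counter could be dropped. A minimal Python 3 sketch under that observation (the function name `passing_entries` and the `fraction` parameter are illustrative, not from the question):

```python
d = {"A": [33, 333, 11, 111, 27, 272], "B": [44, 444, 23, 233]}

def passing_entries(data, fraction=0.8):
    """Collect (key, index, score, id) for scores above fraction * first score."""
    results = []
    for key, value in sorted(data.items()):
        scores, ids = value[0::2], value[1::2]
        threshold = fraction * float(scores[0])  # first score, as in the question
        for index, score in enumerate(scores):
            if float(score) > threshold:
                # index from enumerate lines up with the matching id
                results.append((key, index, score, ids[index]))
    return results

for entry in passing_entries(d):
    print(*entry)
```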
Sudipta
  • Works!! I was missing that minor thing you really saved me! Thank you! – FgS2 Jul 01 '15 at 15:55
  • @FgS2 glad to be of help. You can return the favour back by upvoting and accepting the answer! – Sudipta Jul 01 '15 at 16:01
  • I can only mark ("tick") one answer and atm don't have enough points to upvote you. When I do I will come back and upvote. The first answer was also useful so this time I will go with that. Again many thanks! – FgS2 Jul 01 '15 at 16:22