
I'm trying to look up each element of a list in the value lists of a dict and get the matching key, or get the element itself back if it is not found in any of the dict's values.

headersDict = {'Number; SEX AND AGE - Total Population':['TPop'],
               'Number; SEX AND AGE - Male Population':['MPop'],
               'Number; SEX AND AGE - Female Population':['FPop'],
               'Under 5 years': ['<5'],
               '5 to 9 years': ['5_9'],
               '10 to 14 years': ['10_14'],
               '15 to 19 years': ['15_19'],
               '20 to 24 years': ['20_24'],
               '25 to 29 years': ['25_29'],
               '30 to 34 years': ['30_34'],
               '35 to 39 years': ['35_39'],
               '40 to 44 years': ['40_44'],
               '45 to 49 years': ['45_49'],
               '50 to 54 years': ['50_54'],
               '55 to 59 years': ['55_59'],
               '60 to 64 years': ['60_64'],
               '65 to 69 years': ['65_69'],
               '70 to 74 years': ['70_74'],
               '75 to 79 years': ['75_79'],
               '80 to 84 years': ['80_84'],
               '85 years and over': ['85+'],
               'Median age(years)': ['Medage'],
               '16 years and over': ['16+'],
               '18 years and over': ['18+'],
               '21 years and over': ['21+'],
               '62 years and over': ['62+', 'sixty two+'],
               '65 years and over': ['65+', 'sixty five+']}

headersList = [  '1+', '25_29', '85+',
                '65+'
                ]
new_headersList = [k for k, v in headersDict.items() for elem in headersList for val in v if elem == val]


print(new_headersList)

When I run the above, I get this output:

$ python 1.py 
['25 to 29 years', '85 years and over', '65 years and over']

What I require is:

$ python 1.py 
['1+', '25 to 29 years', '85 years and over', '65 years and over']

Thanks in advance for the help

sgp
  • 2
    1+ isn't in your headers dict, right? Also, consider breaking your list comprehension into For loops, it will make it easier to understand and troubleshoot for that complicated of a comprehension. – user1558604 Dec 09 '19 at 03:15
  • 1
    Why do you have a bunch of lists with a single item? This seems like the unfortunate result of poor design. – AMC Dec 09 '19 at 03:15
  • maybe write list comprehension as normal loop and then you will better see how it works and it will be easier to modify it. – furas Dec 09 '19 at 03:16
  • 1
    Your code is going over the dict and printing all keys where the values contain one of the expected headers. But there is no `1+` anywhere in the dict. Is your input data wrong or the business logic? – Hubert Grzeskowiak Dec 09 '19 at 03:16
  • i think thats why `1+` is still `1+` in his expected output – Joran Beasley Dec 09 '19 at 03:16
  • Does this answer your question? [Get key by value in dictionary](https://stackoverflow.com/questions/8023306/get-key-by-value-in-dictionary) – AMC Dec 09 '19 at 03:16
  • JoranBeasley is right. I think I did not articulate it right. If the key is not found, then i just want the not found result back, so that I dont miss on it. AlexanderCécile: I have lists in the value, since I can have a bunch of sources, that will be in different ways to denote or represent, and i'm trying to unify it. – sgp Dec 09 '19 at 03:44
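
As a rough illustration of the loop form suggested in the comments, here is a minimal sketch. It assumes a shortened stand-in for `headersDict` with the same shape as the one in the question. The `for ... else` branch runs only when no value list matched, which is where the unmatched element is kept.

```python
# Shortened stand-in for headersDict from the question (assumption: same shape).
headersDict = {
    '25 to 29 years': ['25_29'],
    '85 years and over': ['85+'],
    '65 years and over': ['65+', 'sixty five+'],
}
headersList = ['1+', '25_29', '85+', '65+']

new_headersList = []
for elem in headersList:
    for k, v in headersDict.items():
        if elem in v:
            # Found the element in this key's value list: keep the key.
            new_headersList.append(k)
            break
    else:
        # No value list contained elem: keep the element itself.
        new_headersList.append(elem)

print(new_headersList)
```

Unlike the original comprehension, the outer loop here runs over `headersList`, so the result follows the order of the input list and every element produces exactly one entry.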

3 Answers


These problems are typically easier if you invert your dict, so that each item in a value list maps back to its key:

inverted_dict = {val:key for key,arr in my_dict.items() for val in arr}

Now you can simply look up your keys, falling back to the key itself when it is missing:

for key in [  '1+', '25_29', '85+', '65+']:
    print(inverted_dict.get(key,key))
Joran Beasley
  • The idea is good, but you might want to make it more applicable to the data structures and variable names of the question. – Hubert Grzeskowiak Dec 09 '19 at 03:19
  • 2
    i would rather make it more broadly applicable to assist others with similar questions in the future ... anyone on here should be able to draw the correct connections – Joran Beasley Dec 09 '19 at 03:20

This code inverts the dictionary so that each value within a list becomes a new key. With the inverted dictionary it is easy to look up individual headers and fall back to the header name when there is no match.

headersDict = {'Number; SEX AND AGE - Total Population': ['TPop'],
               'Number; SEX AND AGE - Male Population': ['MPop'],
               'Number; SEX AND AGE - Female Population': ['FPop'],
               'Under 5 years': ['<5'],
               '5 to 9 years': ['5_9'],
               '10 to 14 years': ['10_14'],
               '15 to 19 years': ['15_19'],
               '20 to 24 years': ['20_24'],
               '25 to 29 years': ['25_29'],
               '30 to 34 years': ['30_34'],
               '35 to 39 years': ['35_39'],
               '40 to 44 years': ['40_44'],
               '45 to 49 years': ['45_49'],
               '50 to 54 years': ['50_54'],
               '55 to 59 years': ['55_59'],
               '60 to 64 years': ['60_64'],
               '65 to 69 years': ['65_69'],
               '70 to 74 years': ['70_74'],
               '75 to 79 years': ['75_79'],
               '80 to 84 years': ['80_84'],
               '85 years and over': ['85+'],
               'Median age(years)': ['Medage'],
               '16 years and over': ['16+'],
               '18 years and over': ['18+'],
               '21 years and over': ['21+'],
               '62 years and over': ['62+', 'sixty two+'],
               '65 years and over': ['65+', 'sixty five+']}


headersDictReversed = {}
for k, v in headersDict.items():
  for new_k in v:
    headersDictReversed[new_k] = k

headersList = ['1+', '25_29', '85+', '65+']
results = []
for header in headersList:
  # Return the value for header and default to the header itself.
  results.append(headersDictReversed.get(header, header))
print(results)

['1+', '25 to 29 years', '85 years and over', '65 years and over']
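
One caveat worth sketching: if the same alias ever appeared under two keys, plain assignment in the inversion loop would silently keep the last key. The variant below assumes you would rather keep the first one. It uses a shortened stand-in dict, and the `'65 and older (hypothetical)'` entry is invented purely to show the collision; it is not in the question's data.

```python
# Shortened stand-in for headersDict; the last entry is a hypothetical
# duplicate alias added only to demonstrate a collision.
headersDict = {
    '25 to 29 years': ['25_29'],
    '62 years and over': ['62+', 'sixty two+'],
    '65 years and over': ['65+', 'sixty five+'],
    '65 and older (hypothetical)': ['65+'],
}

headersDictReversed = {}
for k, v in headersDict.items():
    for new_k in v:
        # setdefault keeps the first key seen for an alias
        # instead of overwriting it with a later one.
        headersDictReversed.setdefault(new_k, k)

headersList = ['1+', 'sixty two+', '65+']
# Same lookup-with-fallback as above, written as a comprehension.
results = [headersDictReversed.get(header, header) for header in headersList]
print(results)
```

Whether first-wins or last-wins is right depends on the data sources; raising an error on a duplicate alias would be another reasonable choice.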

Hubert Grzeskowiak

If you can use pandas, you can use this solution (it assumes `headersDict` and `headersList` are defined as in the question):

import pandas as pd

df1 = pd.DataFrame(headersDict, index=[0,1]).T.reset_index()
df1 = pd.DataFrame(pd.concat([df1[0], df1[1]]).drop_duplicates()).join(df1, lsuffix='_1').drop(columns=['0',1]).rename(columns={'0_1':0}) 
a = pd.DataFrame(headersList).merge(df1, 'outer')[0:len(pd.DataFrame(headersList))].set_index(0)['index'] 
a.fillna(a.index.to_series()).values.tolist() 


# ['1+', '25 to 29 years', '85 years and over', '65 years and over']
oppressionslayer