0

I have some dozens of tuples each containing 2 strings and 1 integer. Ex:(str, str, int). All these tuples are in a list (example below). Each tuple is unique and each tuple's strings and integer are also unique.

Ex.:

[('a','aA', 53),
 ('b','bb', 21),
 ('c','cc', 234),
 ('d','de', 76),
..]

What I want is, to use this data structure like a dictionary and retrieve the entire tuple for any of one of the 3 values I pass.

Ex.:

For value 'a' -> get the whole tuple of:('a', 'aA', 53)

For value 'cc' -> get the whole tuple of:('c', 'cc', 234)

For value '76' - > get the whole tuple of:('d', 'de', 76)

So far I have done: Creating a simple function to iterate through the list of tuples, go through each tuple and its all 3 values to find a match and if there's a match return the tuple, if not return False.

This sounds slow and seems like the very wrong way to do this task.

  1. What would be the right way to achieve this?
  2. Should I create 3 dictionaries and link them to each other?
Community
  • 1
  • 1
Phil
  • 13,875
  • 21
  • 81
  • 126

3 Answers3

1

You'd have to create separate indexes using dictionaries to allow looking up elements by contents:

from collections import defaultdict

index_on_1 = defaultdict(list)
index_on_2 = defaultdict(list)
index_on_3 = defaultdict(list)

for i, (val1, val2, val3) in enumerate(yourstructure):
    index_on_1[val1].append(i)
    index_on_2[val2].append(i)
    index_on_3[val3].append(i)

Now you can look up indices on strings:

from itertools import chain

def lookup(entry):
    if isinstance(entry, str):
        entries = chain(index_on_1.get(entry, []), index_on_2.get(entry, []))
        return [yourstructure[i] for i in entries]
    else:
        return [yourstructure[i] for i in index_on_3.get(entry, [])]

Note that this always returns a list, as your entries could match multiple tuples. If the lookup is a string, we only use the first two indexes, otherwise only the 3rd.

Alternatively, the more general solution that doesn't care about entry type, would be to create a list of indexes, instead of 3 separate variables:

indexes = [defaultdict(list) for _ in range(3)]

for i, values in enumerate(yourstructure):
    for index, val in zip(indexes, values):
        index[val].append(i)

with the lookup becoming:

def lookup(entry):
    entries = chain(*[index.get(entry, []) for index in indexes])
    return [yourstructure[i] for i in entries]

You could bundle this all up in a class, where your indexes are kept up to date as you add or remove elements.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Martijn, thank you for your time and explanations. I learn a lot from you. I believe your first approach is the most sensible way in computer programming so I will create indexes. It makes more sense. Just to keep resource usage low, I should not scramble the code I guess. Thanks a lot! – Phil Oct 15 '12 at 12:12
1

The easy, unsophisticated method is:

>>> your_list
[('a', 'aA', 53), ('b', 'bb', 21), ('c', 'cc', 234), ('d', 'de', 76)]
>>> def get_tuple(list_of_tuples, elem):
...     for item in list_of_tuples:
...             if elem in item:
...                     return item
...     return False
... 
>>> get_tuple(your_list, 'a')
('a', 'aA', 53)
>>> get_tuple(your_list, 'cc')
('c', 'cc', 234)

Though, you didn't specify, what should happen if one element is in more than in one tuple. What should be return for 'a' in list of

[('a','aA', 53),
('b','bb', 21),
('a','ca', 234),
..]
Gandi
  • 3,522
  • 2
  • 21
  • 31
1

To keep a O(1) look up you can create a dictionary like this from those tuples:

In [20]: lis=[('a','aA', 53),
   ....:  ('b','bb', 21),
   ....:  ('c','cc', 234),
   ....:  ('d','de', 76)]

In [22]: dic=dict((y,x) for x in lis for y in x)

In [23]: dic

Out[23]: 
{21: ('b', 'bb', 21),
 53: ('a', 'aA', 53),
 76: ('d', 'de', 76),
 234: ('c', 'cc', 234),
 'a': ('a', 'aA', 53),
 'aA': ('a', 'aA', 53),
 'b': ('b', 'bb', 21),
 'bb': ('b', 'bb', 21),
 'c': ('c', 'cc', 234),
 'cc': ('c', 'cc', 234),
 'd': ('d', 'de', 76),
 'de': ('d', 'de', 76)}

now searching any item becomes easy:

In [24]: dic.get('a','not found')
Out[24]: ('a', 'aA', 53)

In [25]: dic.get('aA','not found')
Out[25]: ('a', 'aA', 53)

In [26]: dic.get('21','not found')
Out[26]: 'not found'

In [27]: dic.get(21,'not found')
Out[27]: ('b', 'bb', 21)
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504