I am trying to write a function that takes a 2-dimensional list and returns a dictionary. Is there a more efficient way than what I have written (e.g. a list comprehension or itertools)? I am relatively new to Python and have read some examples on list comprehensions and the itertools docs (Iterating over a 2 dimensional python list), but I can't seem to apply them to this chunk of code.

Any help would be appreciated. Thank you!

def listToDict(self, lstInputs):        
    dictOutput = dict()
    rows = len(lstInputs)
    cols = len(lstInputs[0])
    if rows == 2:
        for x in range(cols):
            if lstInputs[0][x] is not None:
                if lstInputs[1][x] is not None:
                    dictOutput[lstInputs[0][x].strip()] = lstInputs[1][x].strip()
                else:
                    dictOutput[lstInputs[0][x].strip()] = lstInputs[1][x]
    elif cols == 2:
        for x in range(rows):
            if lstInputs[x][0] is not None:
                if lstInputs[x][1] is not None:
                    dictOutput[lstInputs[x][0].strip()] = lstInputs[x][1].strip()
                else:
                    dictOutput[lstInputs[x][0].strip()] = lstInputs[x][1]
    else:
        pass
    
    return dictOutput
AiRiFiEd
  • Can you provide an example list? – Jan Zeiseweis Jun 16 '17 at 09:41
  • @JanZeiseweis hi! Thanks for your reply! I have read some examples at https://docs.python.org/3/library/itertools.html#itertools.zip_longest and http://jmduke.com/posts/a-gentle-introduction-to-itertools/ but can't seem to find one that would help with this. If I were to chain the list, I can't think of how to assign the key-value pairs to the dictionary when looping – AiRiFiEd Jun 16 '17 at 09:44
  • Please show an example of input and output. – Daniel Roseman Jun 16 '17 at 09:46
  • I was actually asking for an example list that you'd like to convert to a dict. – Jan Zeiseweis Jun 16 '17 at 09:46
  • Here is a cross-reference where this is already answered. https://stackoverflow.com/questions/30387014/make-dictionary-from-2d-array-python – Indika Rajapaksha Jun 16 '17 at 09:47
  • Apologies. For example, `[ [key1, value1], [key2, value2], ...]` OR `[ [key1, key2, ...], [value1, value2, ...] ]`. @IndikaRajapaksha thanks a lot for the link! Given that I have to test whether each element is `None` and that I have to `string.strip()` the inputs if they aren't `None`, would looping via `for k, v in s` be much faster than the current loop? Thanks a lot for the help! Edit: the reason I need to check for `None`s and do `string.strip()` is that I am actually reading data from Excel (user inputs) via `xlwings` and am trying to do some basic clean-up, as I do not trust user inputs – AiRiFiEd Jun 16 '17 at 09:54
  • It is always wise to use the Python implementation rather than coding it yourself. You can always validate user inputs after converting them to a dictionary. Also, using `defaultdict` takes less code, which is easier to maintain in the long run. – Indika Rajapaksha Jun 16 '17 at 10:02
  • @IndikaRajapaksha hey, I have tried using `defaultdict` in another function that I am writing and it works fine! Although I don't really know how to go about testing for efficiency, I too believe the Python implementation should be faster. Thanks again! Would you like to put your reply as the answer? – AiRiFiEd Jun 16 '17 at 10:37
  • @IndikaRajapaksha how would a `defaultdict` help here ??? – bruno desthuilliers Jun 16 '17 at 11:01
  • @brunodesthuilliers refer the link mentioned in the first comment. – Indika Rajapaksha Jun 16 '17 at 11:02
  • @IndikaRajapaksha sorry but I still don't get the point... The OP does not ask about regrouping values for a same key but about transforming a list of key values pairs (or a pair of keys, values lists) into a dict more efficiently than he actually does. There's no mention of duplicate keys or grouping in his post. – bruno desthuilliers Jun 16 '17 at 11:14

2 Answers

Your function is doing way too many things:

  1. Trying to find out whether its input is a sequence of key=>value pairs or a pair of (keys, values) sequences. This is unreliable. Don't try to guess: it's the caller's duty to pass the right structure, because only the caller knows what data it wants to turn into a dict.

  2. Cleaning (currently stripping) keys and values. Here again it only makes sense if both are strings, which is not guaranteed to be the case (at least not from the function's name or documentation). You could of course test whether your keys and/or values are indeed strings, but this adds quite some overhead. Here again it's the caller's duty to do any cleaning that is needed.

To make a long story short, your function should expect a single data structure (either a sequence of key=>value pairs or a pair of (keys, values) sequences) and not apply any cleanup, leaving to the caller the responsibility of providing what's expected.

Actually, building a dict from a sequence (or any iterable) of pairs is so trivial that you don't need a special function; it's just a matter of passing the sequence to the dict constructor:

>>> lst_of_pairs = [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]
>>> dict(lst_of_pairs) 
{0: 'a', 1: 'b', 2: 'c', 3: 'd'}

Or, on more recent Python versions (2.7 and later), using a dict comprehension, which can be faster:

>>> lst_of_pairs = [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]
>>> {k:v for k, v in lst_of_pairs} 
{0: 'a', 1: 'b', 2: 'c', 3: 'd'}

So your first building block is built in and doesn't need any special function.

Note that this works with any iterable as long as (1) it yields only pairs and (2) the keys (the first items of the pairs) are unique (otherwise later pairs silently overwrite earlier ones). So if you want to apply some cleaning before building the dict, you can do it with a generator function or expression. For example, if the caller knows that all the keys are strings that might need stripping and all the values are either strings needing stripping or None, the cleanup can be done on the fly while building the dict:

>>> lst_of_pairs = [(" a ", "1 "), ("b ", None), ("c", " fooo ")]
>>> {k.strip(): v if v is None else v.strip() for k, v in lst_of_pairs}
{'a': '1', 'c': 'fooo', 'b': None}
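
The session above uses a dict comprehension; the generator-expression form mentioned in the text would look roughly like the sketch below. The name `cleaned_pairs` is only illustrative and is not part of the original answer.

```python
lst_of_pairs = [(" a ", "1 "), ("b ", None), ("c", " fooo ")]

# Generator expression: yields cleaned (key, value) pairs one at a time,
# without building an intermediate list.
cleaned_pairs = (
    (k.strip(), v if v is None else v.strip())
    for k, v in lst_of_pairs
)

# dict() accepts any iterable of pairs, including a generator.
result = dict(cleaned_pairs)
print(result)
```

Both forms should produce the same mapping; the generator form is mostly useful when the cleanup logic lives in a separate function from the code that builds the dict.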

Finally, transposing a pair of (keys, values) sequences into a sequence of key=>value pairs is what the builtin zip() and its lazy version itertools.izip() are for:

>>> keys = [' a ', 'b ', 'c']
>>> values = ['1 ', None, ' fooo ']
>>> zip(keys, values)
[(' a ', '1 '), ('b ', None), ('c', ' fooo ')]
>>> list(itertools.izip(keys, values))
[(' a ', '1 '), ('b ', None), ('c', ' fooo ')]

Putting it together, the most "devious" case (building a dict from a sequence of keys and a sequence of values, stripping the keys and conditionally stripping the values) can be expressed as:

>>> {k.strip(): v if v is None else v.strip() for k, v in itertools.izip(keys, values)}
{'a': '1', 'c': 'fooo', 'b': None}

If it's for a one-shot use, that's actually all you need.
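
For readers on Python 3, where itertools.izip() does not exist, the same one-liner can be written with the builtin zip(). This is a minimal sketch reusing the `keys` and `values` lists from the session above:

```python
keys = [' a ', 'b ', 'c']
values = ['1 ', None, ' fooo ']

# zip() pairs each key with the value at the same position;
# keys are always stripped, values only when they are not None.
result = {
    k.strip(): v if v is None else v.strip()
    for k, v in zip(keys, values)
}
print(result)
```

Keep in mind that zip() stops at the shorter of the two inputs, so a missing trailing value drops its key silently.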

Now if you have a use case where you know you will have to apply this from different places in your code with always the same cleaning but either lists of pairs or pairs of lists, you of course want to factor it out as much as possible - but not more:

def to_dict(pairs):
    return {
        k.strip(): v if v is None else v.strip()
        for k, v in pairs
        }

and then leave it to the caller to apply zip() before if needed:

def func1():
    keys = get_the_keys_from_somewhere()
    values = get_the_values_too()
    data = to_dict(itertools.izip(keys, values))
    do_something_with(data)


def func2():
    pairs = get_some_sequence_of_pairs()
    data = to_dict(pairs)
    do_something_with(data)
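
To see both call styles end to end, here is a small self-contained sketch for Python 3. The sample data is hypothetical and simply stands in for the `get_*` helpers above, which are placeholders in the answer as well.

```python
def to_dict(pairs):
    """Build a dict from an iterable of (key, value) pairs, stripping strings."""
    return {
        k.strip(): v if v is None else v.strip()
        for k, v in pairs
    }


# Hypothetical data in "pair of lists" layout (one list of keys, one of values).
keys = [" name ", "city "]
values = [" Ada ", None]

# Hypothetical data in "list of pairs" layout (one [key, value] row per entry).
rows = [[" name ", " Ada "], ["city ", None]]

# The caller decides the layout: zip() first for columns, nothing for rows.
from_columns = to_dict(zip(keys, values))
from_rows = to_dict(rows)

print(from_columns)
print(from_rows)
```

If keys may also be None, as the question's code allows, adding `if k is not None` at the end of the comprehension would skip those pairs instead of raising an AttributeError.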

As to whether you want to use zip() or itertools.izip(), it mostly depends on your Python version and your inputs.

If you're using Python 2.x, zip() will build a new list in memory while itertools.izip() will yield the pairs lazily, so there's a slight performance overhead from using itertools.izip(), but it will save a lot of memory if you're working with large datasets.

If you're using Python 3.x, zip() has been turned into an iterator, thus replacing itertools.izip(), so the question becomes irrelevant ;)
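
One practical consequence of zip() being an iterator in Python 3 is that its result can only be consumed once. A short sketch, with made-up sample data:

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

pairs = zip(keys, values)

# In Python 3, zip() returns an iterator, not a list.
is_iterator = iter(pairs) is pairs

first = dict(pairs)   # consumes every pair
second = dict(pairs)  # the iterator is now exhausted

print(is_iterator, first, second)
```

If the pairs are needed more than once, wrapping the call as `list(zip(keys, values))` keeps them around at the cost of building the list in memory.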

bruno desthuilliers
  • dict comprehension and generator expression...thank you so much for taking the time to answer this in details! – AiRiFiEd Jun 16 '17 at 11:26
l = [[1,2,3],['a','b','c']]

def function(li):
    d = {}
    for num in zip(li[0],li[1]):
        d[num[0]] = num[1]
    print(d)
function(l)
Output:
{1: 'a', 2: 'b', 3: 'c'}
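
For the other layout, a list of [key, value] rows, no zip() is needed because each row is already a pair. A minimal sketch along the same lines (the name `pairs_to_dict` is made up for illustration):

```python
pairs = [[1, 'a'], [2, 'b'], [3, 'c']]


def pairs_to_dict(li):
    d = {}
    # Each inner list unpacks directly into a key and a value.
    for key, value in li:
        d[key] = value
    return d


result = pairs_to_dict(pairs)
print(result)
```

This loop does the same thing as calling `dict(pairs)` directly.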
joshua
  • would it be possible to demonstrate how to implement this with [ [1,'a'], [2, 'b'] ..] instead? apologies! – AiRiFiEd Jun 16 '17 at 10:38