3

If I have a list in Python, say

thing = [[20,0,1],[20,0,2],[20,1,1],[20,0],[30,1,1]]

I want the resulting list to be

thing = [[20,1,1],[20,0,2],[30,1,1]]

That is, if the first element is the same, remove duplicates and give priority to the number 1 in the second element. Lastly, the third element must also be unique for each first element.
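
To make the uniqueness part of the rule concrete, here is a minimal sketch of a check that no two sublists share the same first and third elements. The function name `is_unique_by_first_and_third` is hypothetical, and the sketch assumes every sublist being checked has three elements; it does not test the "priority to 1" part.

```python
def is_unique_by_first_and_third(rows):
    """Return True if no two rows share the same (first, third) pair."""
    # Assumption: every row has at least three elements.
    pairs = [(row[0], row[2]) for row in rows]
    return len(pairs) == len(set(pairs))


expected = [[20, 1, 1], [20, 0, 2], [30, 1, 1]]
print(is_unique_by_first_and_third(expected))  # True

# [20, 0, 1] and [20, 1, 1] share first element 20 and third element 1.
with_duplicate = [[20, 0, 1], [20, 1, 1], [30, 1, 1]]
print(is_unique_by_first_and_third(with_duplicate))  # False
```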

In a previous question we worked out a complicated method that, for a transaction, details the purchased unit. I want to output the other units in that course. If two transactions relate to two units in one course, the units are displayed as duplicates (multiplied for each subsequent unit).

The aim of this question is to ensure that this duplication is stopped. Because the solution is complicated, it has resulted in a series of questions. Thanks to everyone who has helped so far.

Alex Stewart
  • possible duplicate of [Get unique items from list of lists?](http://stackoverflow.com/questions/6926928/get-unique-items-from-list-of-lists) – hivert Jul 26 '13 at 11:02
  • No this is different as you can see we are looking at the individual values in a multidimensional list. – Alex Stewart Jul 26 '13 at 11:08
  • Use sets to remove duplicates. Sets have the disadvantage of being unordered, though; there is something like OrderedDict to keep order (not sure, but check Python's official documentation). – K DawG Jul 26 '13 at 11:10
  • I am ok with no ordering as i will use a for loop and an id statement on the first element in display. My problem is the duplication. Not my strongest point, so it would be awesome and point worthy if someone made a cool loop for me. Thanks – Alex Stewart Jul 26 '13 at 11:12
  • @AlexStewart: So you not only want to make the items unique, you also want to reorder them? I wonder why for your example you would expect `[[20,1,1],[20,0,2],[30,1,1]]` instead of `[[20,1,1],[20,0,1],[30,1,1]]`. – Frerich Raabe Jul 26 '13 at 11:29
  • "..if the first element is the same, remove duplicates.." do you mean 'merge' two sublists if they have the same length and their first elements are the same[, where max value of two values gets priority during the merge]? – RussW Jul 26 '13 at 11:33
  • @FrerichRaabe thanks for your interest, no ordering is required. – Alex Stewart Jul 26 '13 at 11:43
  • @RussW So if the first element is the same remove all duplicates where there exists 0 in the second element. Apologies for the complication. – Alex Stewart Jul 26 '13 at 11:44
  • Shouldn't the resulting list be `thing = [[20,1,1],[30,1,1]]`? Why do you keep the `[20,0,2]` element? And what about the third element? Where should it come from? – twil Jul 26 '13 at 11:49
  • @twil the last element must also be unique. I know its complicated. – Alex Stewart Jul 26 '13 at 11:51
  • @AlexStewart: You wrote `give priority to the number 1 in the second element` - what happens if the second element is, say, 2? or 3? – Frerich Raabe Jul 26 '13 at 11:58
  • @FrerichRaabe, just edited to add further clarification that the third element must be unique to the first element. – Alex Stewart Jul 26 '13 at 11:59
  • @AlexStewart: What does `to be unique to the first element` mean? No two lists with the same first element may have the same third element? – Frerich Raabe Jul 26 '13 at 12:04

3 Answers

2

I am not sure you would like this, but it works with your example:

[list(i) + j for i, j in dict([(tuple(x[:2]), x[2:]) for x in sorted(thing, key=lambda x:len(x))]).items()]
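
For readers unpacking the one-liner, here is a step-by-step sketch of the same idea. Sorting by length puts the two-element `[20, 0]` first, and because a dict keeps only the last value stored for a key, later three-element sublists with the same first two elements overwrite it. The names `ordered`, `by_pair`, and `result` are introduced here for illustration only.

```python
thing = [[20, 0, 1], [20, 0, 2], [20, 1, 1], [20, 0], [30, 1, 1]]

# Shortest sublists first, so [20, 0] is processed before the longer ones.
# sorted() is stable, so sublists of equal length keep their original order.
ordered = sorted(thing, key=len)

# Key on the first two elements; a later sublist with the same key
# overwrites the earlier one, so the last one seen wins.
by_pair = {}
for sub in ordered:
    by_pair[tuple(sub[:2])] = sub[2:]

# Glue each key back together with the remainder that survived.
result = [list(pair) + rest for pair, rest in by_pair.items()]
print(result)
```

Note that this keys on the first two elements, not on the first and third, so it reproduces the example output without checking the "priority to 1" rule.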

EDIT:

Here is a more detailed version (note that it fits your description of the problem better; sorting ONLY by the length of each sublist may not be the best solution):

thing = [[20,0,1],[20,0,2],[20,1,1],[20,0],[30,1,1]]
dico = {}
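for x in thing:
    key = tuple(x[:2])  # group by the first two elements
    if key not in dico:
        dico[key] = x[2:]
        continue
    # key[1] is x[1], so this test is never true as written:
    # the first sublist seen for each key is kept.
    if key[1] < x[1]:
        dico[key] = x[2:]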

new_thing = []
for i, j in dico.items():
    new_thing.append(list(i) + j)
fransua
  • It doesn't work for input `thing = [[20,0,1],[20,0,2],[20,1,1],[20,1,2]]`, output should be `[[20, 1, 2], [20, 1, 1]]` and your solution gives `[[20, 1, 2], [20, 0, 2]]` – Roman Pekar Jul 26 '13 at 12:47
  • @RomanPekar I think it does have to return [[20, 1, 2], [20, 0, 2]]... but I may be misunderstanding something... – fransua Jul 26 '13 at 12:55
  • `[list(i) + j for i, j in {tuple(x[:2]): x[2:] for x in sorted(thing, key=len)}.items()]` would be a cleaner version of the original list comprehension. – sjakobi Jul 26 '13 at 12:58
  • I don't know, maybe it's my English, but I see "give priority to the number 1 in the second element" in the description of the question. Your solution doesn't check for this condition. – Roman Pekar Jul 26 '13 at 13:03
2

You might want to try using the unique_everseen function from the itertools recipes.

As a first step, here is a solution excluding [20, 0]:

from itertools import filterfalse

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

thing = [[20,0,1],[20,0,2],[20,1,1],[30,1,1]]

thing.sort(key=lambda x: 0 if x[1] == 1 else 1)

print(list(unique_everseen(thing, key=lambda x: (x[0], x[2]))))

Output:

[[20, 1, 1], [30, 1, 1], [20, 0, 2]]
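
The two-element sublist `[20, 0]` has no third element, so the key function above would raise an IndexError on it. One possible way to handle the question's full input is to filter such sublists out first. This is only a sketch: dropping short sublists is an assumption (the question does not say what should happen to them), and the helper name `dedupe` is hypothetical.

```python
def dedupe(rows):
    """Keep one row per (first, third) pair, preferring rows whose second element is 1."""
    # Assumption: rows with fewer than three elements have no usable
    # third element and are dropped.
    full_rows = [row for row in rows if len(row) >= 3]
    # Stable sort: rows with 1 in the second position come first
    # (False sorts before True).
    full_rows.sort(key=lambda row: row[1] != 1)
    seen = set()
    result = []
    for row in full_rows:
        key = (row[0], row[2])
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


thing = [[20, 0, 1], [20, 0, 2], [20, 1, 1], [20, 0], [30, 1, 1]]
print(dedupe(thing))
```

For the question's input this yields the same three sublists as the output above.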
sjakobi
2
thing = [[20,0,1],[20,0,2],[20,1,1],[20,0,1],[30,1,1]]

d = {}
for e in thing:
    k = (e[0], e[2])
    if k not in d or (d[k][1] != 1 and e[1] == 1):
        d[k] = list(e)

print(list(d.values()))

[[20, 1, 1], [20, 0, 2], [30, 1, 1]]

If you don't need the initial list:

thing = [[20,0,1],[20,0,2],[20,1,1],[20,0,1],[30,1,1]]

d = {}
for e in thing:
    k = (e[0], e[2])
    if k not in d or (d[k][1] != 1 and e[1] == 1):
        d[k] = e

thing = list(d.values())

[[20, 1, 1], [20, 0, 2], [30, 1, 1]]

If you want to keep the order of your lists, use OrderedDict:

from collections import OrderedDict
d = OrderedDict()
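
Put together with the loop above, a complete version might look like the following sketch. The variable name `result` is introduced here for illustration.

```python
from collections import OrderedDict

thing = [[20, 0, 1], [20, 0, 2], [20, 1, 1], [20, 0, 1], [30, 1, 1]]

d = OrderedDict()
for e in thing:
    k = (e[0], e[2])  # one entry per (first, third) pair
    # Store the first sublist seen, and replace it only when the stored
    # one lacks a 1 in the second position and the new one has it.
    if k not in d or (d[k][1] != 1 and e[1] == 1):
        d[k] = e

result = list(d.values())
print(result)
```

Replacing the value for an existing key does not move the key, so the result follows the order in which each (first, third) pair first appeared. On Python 3.7 and later a plain dict preserves insertion order in the same way.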
Roman Pekar