I have a list of two-string tuples. For example purposes I use the short list below, but of course the real list is generally longer:

[("Hello world","1"), ("Helloworld","2"),("Hi, Hello world","1"),("How are you","3"),("HiHelloworld","2")]

The two strings in each tuple are the message and the sender ID. The messages vary in length; the only thing that doesn't change is the sender ID. I end up with a list containing several messages of different lengths from the same sender ID, and I just want a list with the longest message from each sender. For my example that would be:

[("Hi, Hello world","1"),("How are you","3"),("HiHelloworld","2")]

I'm a little confused, as I don't often work with tuples, so I really don't know how to proceed. I know I should sort the list before doing anything, and I know how to do that, but how do I then get the longest string for each sender, given that each element of the list is a tuple rather than a string or an integer?

Thank you very much!

Laz22434

4 Answers

3

You could build a dictionary by passing a generator expression to the update method:

L = [("Hello world","1"), ("Helloworld","2"),("Hi, Hello world","1"),
     ("How are you","3"),("HiHelloworld","2")]

D = dict()
D.update((s,m) for m,s in L if len(m)>=len(D.get(s,'')))

{'1': 'Hi, Hello world', '2': 'HiHelloworld', '3': 'How are you'}

You could sort the list beforehand, but that would actually be less efficient than the update() approach, since sorting costs O(n log n) while update() makes a single pass over the list:

D = dict(map(reversed,sorted(L,key=lambda ms:len(ms[0]))))

{'2': 'HiHelloworld', '1': 'Hi, Hello world', '3': 'How are you'}
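
The sorted one-liner works because, when a dictionary is built from pairs, a later pair with the same key overwrites the earlier one, so the longest message for each sender is written last. A minimal sketch that spells those steps out, using the same L (the names by_length and longest are illustrative only and not part of the answer):

```python
L = [("Hello world", "1"), ("Helloworld", "2"), ("Hi, Hello world", "1"),
     ("How are you", "3"), ("HiHelloworld", "2")]

# Sort by message length, shortest first.
# sorted() is stable, so messages of equal length keep their input order.
by_length = sorted(L, key=lambda ms: len(ms[0]))

# Key the dict by sender; each later (longer) message overwrites the earlier one.
longest = {}
for message, sender in by_length:
    longest[sender] = message

print(longest)
```

The key order follows the first appearance of each sender in the sorted list, which is why '2' comes first in the output above.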
Alain T.
  • Is it guaranteed or an implementation detail? – Bharel Jan 25 '22 at 15:37
  • @Bharel, I assume you are referring to the use of D.get() within the generator provided to D.update(). I could not find an explicit specification, but it seems unlikely that updating from an iterator would not apply the changes as the elements are consumed (this also works for list.extend, by the way). – Alain T. Jan 25 '22 at 16:02
  • There are plenty of things that could go wrong: locking mechanisms in place preventing reading while memory is being modified, pre-sizing operations to get the length before actually allocating the area (like `str.join()` does), and many other edge cases which can cause an issue. Unless the specification allows it, I tend to think this is an implementation detail (which might be here to stay, otherwise it would break this kind of code, but still). Spidey sense tingles for trouble on that one. – Bharel Jan 25 '22 at 16:12
2

You can use a dictionary (defaultdict) to keep track of the longest message per ID:

from collections import defaultdict

# input
l = [("Hello world","1"), ("Helloworld","2"),("Hi, Hello world","1"),("How are you","3"),("HiHelloworld","2")]

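# default entry: an empty message (the second item is only a placeholder and is never read)
d = defaultdict(lambda:('', float('-inf')))
for msg, ID in l:
    # keep the message only if it is strictly longer than the one stored for this ID
    if len(msg) > len(d[ID][0]):
        d[ID] = (msg, ID)
# one (message, ID) tuple per sender, in order of first appearance
out = list(d.values())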

output:

[('Hi, Hello world', '1'), ('HiHelloworld', '2'), ('How are you', '3')]
mozway
  • Might be worth mentioning that no sorting of the list is needed with this solution, as OP was under the impression that it has to be sorted first: `I know i should sort the list before doing anything` – FlyingTeller Jan 24 '22 at 11:38
2

You can build the mapping with a regular dictionary, comparing each message's length with the one already stored before inserting it:

messages = [("Hello world","1"), ("Helloworld","2"),("Hi, Hello world","1"),("How are you","3"),("HiHelloworld","2")]

def get_longest_messages(messages):
    output = {}
    for message, sender in messages:
        if len(message) > len(output.get(sender, "")):
            output[sender] = message
    return output

print(get_longest_messages(messages))

Output:

{'1': 'Hi, Hello world', '2': 'HiHelloworld', '3': 'How are you'}

I highly suggest leaving the output as a dictionary.
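
If the list-of-tuples shape from the question is still needed later, the dictionary can be converted back on demand. A minimal sketch, assuming a result dictionary shaped like the output above (the names longest and as_tuples are illustrative only):

```python
# Result dictionary shaped like the output shown above: sender ID -> longest message.
longest = {'1': 'Hi, Hello world', '2': 'HiHelloworld', '3': 'How are you'}

# A dictionary gives direct lookup by sender ID.
print(longest['2'])

# Swap each (sender, message) item back into the question's (message, sender) order.
as_tuples = [(message, sender) for sender, message in longest.items()]
print(as_tuples)
```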

Bharel
1

Once you sort the list, you can create an auxiliary list with all the strings from the same sender ID, and then apply the max function in order to get the longest string out of that auxiliary list.

>>> mylist = ['123','123456','1234']
>>> print(max(mylist, key=len))
123456

Different approaches can be seen in this post.
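
Applied to the question's list of tuples, the sort-then-max idea could look like the sketch below. It uses itertools.groupby in place of a hand-built auxiliary list, which is one possible way to do the grouping and not necessarily what the linked post does; the variable names are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

messages = [("Hello world", "1"), ("Helloworld", "2"), ("Hi, Hello world", "1"),
            ("How are you", "3"), ("HiHelloworld", "2")]

# groupby only merges adjacent items, so sort by sender ID first.
by_sender = sorted(messages, key=itemgetter(1))

# For each sender's group, keep the tuple whose message is longest.
longest = [max(group, key=lambda ms: len(ms[0]))
           for _, group in groupby(by_sender, key=itemgetter(1))]

print(longest)
```

Sorting here is by sender ID, not by message length; it is needed only so that each sender's messages sit next to each other.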

Atalajaka