Suppose I have a list:

msg_type =["sent-message", "received-message", "received-message", "sent-message", "received-message", "sent-message", "received-message", "received-message", "sent-message", "sent-message", "received-message", "sent-message", "sent-message", "received-message", "sent-message", "sent-message", "received-message", "sent-message", "received-message", "received-message", "sent-message", "sent-message", "received-message", "sent-message", "received-message", "sent-message", "received-message" ]

How can I group these items so that a new group starts whenever the next item changes back to "sent-message"?

For example, the first group would be:
("sent-message", "received-message", "received-message")
The second group would be:
("sent-message", "received-message")
If "sent-message" repeats, the repeated items should stay in one group:
("sent-message", "sent-message", "received-message")

I can get this to work for every case except the last one.
Basically, I want to group the messages by single exchange, and I need the start and end index of each group.

Expected result:

conversations = [(0, 2), (3, 4), (5, 7), (8, 10), (11, 13), (14, 16), (17, 19), (20, 22), (23, 24), (25, 26)]
Sanket Wagh
  • I don't understand what you're asking. – Frank Yellin Mar 24 '23 at 19:55
  • Why don't you get a new set when the value changes from `sent-message` to `received-message`? – Barmar Mar 24 '23 at 19:56
  • I think there is data missing that connects the messages to each other. You mean that you have (say) emails that are replies within an email thread, and you want to group them somehow into threads based on shared text data? – Lover of Structure Mar 24 '23 at 19:57
  • Suppose you and I are texting. If I send 2 messages and then you send 2 messages, that is one set of conversation. The counts can differ each time: I could then send 5 messages and you could send 1 in reply. The first part has 4 messages, thus (0,3); the second part has 6 messages, thus (4,9); and the output would be [(0,3), (4,9)]. – Sanket Wagh Mar 24 '23 at 20:00

2 Answers

Here's a hacky solution. It returns the expected results, but I haven't fully tested every use case. Hope this helps.

def group_conversations(messages):
    """Return [start, end] index pairs, one per exchange."""
    conversations = []
    start = 0
    for k, v in enumerate(messages[1:]):  # v is messages[k + 1]
        # A new exchange begins where a "sent-message" follows a different value;
        # two sent messages in a row stay in the same exchange.
        if v == "sent-message" and messages[k] != "sent-message":
            conversations.append([start, k])
            start = k + 1
    if messages:  # close the last exchange, which the loop never emits
        conversations.append([start, len(messages) - 1])
    return conversations
TheLazyScripter

You can try:

from itertools import groupby

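out = []
# Group consecutive (index, value) pairs that share the same message type.
for k, g in groupby(enumerate(msg_type), key=lambda pair: pair[1]):
    g = list(g)
    if k == "sent-message":
        # A run of sent messages opens a new exchange at its first index.
        out.append([g[0][0]])
    else:
        # A run of received messages closes the current exchange at its last index.
        # This assumes the list starts with "sent-message"; otherwise out is empty here.
        out[-1].append(g[-1][0])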

print(out)

Prints:

[[0, 2], [3, 4], [5, 7], [8, 10], [11, 13], [14, 16], [17, 19], [20, 22], [23, 24], [25, 26]]
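
For comparison, the same boundaries can probably be found without groupby by looking only at adjacent items. The following is a minimal sketch rather than a tested solution: the name group_exchanges and the opener parameter are made up for illustration, and it returns tuples like the expected result in the question. It treats index 0 as the start of the first exchange even if the list begins with "received-message".

```python
def group_exchanges(messages, opener="sent-message"):
    """Return (start, end) index pairs, one per exchange."""
    if not messages:
        return []
    # An exchange starts at index 0 and wherever `opener` follows a different value.
    starts = [0] + [
        i for i in range(1, len(messages))
        if messages[i] == opener and messages[i - 1] != opener
    ]
    # Each exchange ends just before the next one starts; the last ends at the final index.
    ends = [s - 1 for s in starts[1:]] + [len(messages) - 1]
    return list(zip(starts, ends))


# The case described in the comments: 2 sent + 2 received, then 5 sent + 1 received.
sample = (["sent-message"] * 2 + ["received-message"] * 2
          + ["sent-message"] * 5 + ["received-message"])
print(group_exchanges(sample))  # [(0, 3), (4, 9)]
```

Because the start indices are collected first, a trailing run of sent messages with no reply still gets a complete (start, end) pair instead of a one-element list.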
Andrej Kesely