Python: Difference between two positions in one list depending on rule set for another list

Question

Consider two lists of identical length:

t is a list of times in seconds, recorded at irregular intervals and arranged chronologically
pt is a list of the numbers 1, 2, 3 such that a 1 is followed by a consecutive string of 2s, then by a 3.
- 1 = start of event, 2 = continuation of event, 3 = end of event
- This means that for a single event, the sequence begins with a single 1, is followed by a consecutive string of 2s (how many times it repeats will vary), and finally ends in a single 3.
- There is more than 1 event contained in this vector

For example, the input could look like:

#    |--Event #1-|   |---Event #2----|   |Event #3 | 
pt = [1, 2,  2,  3,  1,  2,  2,  2,  3,  1,  2,  3 ]
t =  [1, 10, 13, 14, 17, 20, 21, 25, 37, 32, 33, 38]

Is there a 1-liner that doesn't involve multiple nested loops that we could use that would calculate the difference in time values in t for each event sequence in pt?

For example, the desired output for the above inputs would be a list of length 3 (because there are 3 events) where the output is

Output: [13, 20, 6]

### Explanation:
# 13 = 14-1  = t[position where pt shows first 3]  - t[position where pt shows first 1]
# 20 = 37-17 = t[position where pt shows second 3] - t[position where pt shows second 1]
# 6  = 38-32 = t[position where pt shows third 3]  - t[position where pt shows third 1]

https://stackoverflow.com/help/how-to-ask StackOverflow does not do your homework for you. — Frank Yellin, Nov 12 '20 at 01:13
@FrankYellin I apologize for the misunderstanding here. Thank you for sharing the link. I'm not a student, but a (clearly inexperienced) Python user working with a medical dataset for a project at work. I've edited the message to make that a little bit clearer. — quant_fin, Nov 12 '20 at 02:48
I clarified below, but this question is based on a medical dataset that was collected from patients who were asked to draw an image on an iPad. At various points in time (i.e. the `t` series above), the (x,y) coordinates were recorded [though I didn't include that info in this question since it wasn't necessary], along with a tag indicating the nature of the point: whether the person had just pressed their Apple Pen to the page (i.e. 1), dragged it (i.e. 2), or lifted it off the page (i.e. 3). That's the `pt` series described above – quant_fin, Nov 12 '20 at 02:48

Ironkey · Accepted Answer · 2020-11-12T05:15:45.713

score 3

Using pure Python:

pt = [1, 2,  2,  3,  1,  2,  2,  2,  3,  1,  2,  3 ]
t =  [1, 10, 13, 14, 17, 20, 21, 25, 37, 32, 33, 38]

l = [y for x, y in zip(pt,t) if x in [1,3]]

print([l[i:i+2][1] - l[i:i+2][0] for i in range(0, len(l), 2)])

[13, 20, 6]

Using more_itertools.chunked():

from more_itertools import chunked

print([y-x for x,y in chunked([y for x, y in zip(pt,t) if x in [1,3]], 2)])

[13, 20, 6]

Explanation

If you look closely, the same list comprehension appears in both versions. It is the core of the solution!

[y for x, y in zip(pt,t) if x in [1,3]]

So, what's going on?

zip pairs each element of pt with the element of t at the same position. If the pt element of a pair (x) is either 1 or 3, we add the matching time (y) to the list.

This gives us the start and end time of every event, in order; the differences we need are between the two times in each consecutive pair.

#|---|  |----|  |----|
[1, 14, 17, 37, 32, 38]
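
To make the filtering step concrete, here is a small sketch that prints the intermediate values. The names `pairs` and `boundaries` are only for illustration and are not part of the answer's code.

```python
pt = [1, 2, 2, 3, 1, 2, 2, 2, 3, 1, 2, 3]
t = [1, 10, 13, 14, 17, 20, 21, 25, 37, 32, 33, 38]

# zip pairs each tag in pt with the time at the same position in t
pairs = list(zip(pt, t))
print(pairs[:4])  # [(1, 1), (2, 10), (2, 13), (3, 14)]

# keep only the times whose tag marks a start (1) or an end (3)
boundaries = [time for tag, time in pairs if tag in (1, 3)]
print(boundaries)  # [1, 14, 17, 37, 32, 38]
```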

Now comes the second part: getting the differences from these. We essentially need to make pairs from this list, and the method I'm going to use here is chunking. The pure Python way to partition a list into chunks is as follows:

#given a list l
chunklen = 2 
[l[i:i+chunklen] for i in range(0, len(l), chunklen)]

Using this, we could partition the [1, 14, 17, 37, 32, 38] list into:

[[1, 14], [17, 37], [32, 38]]

But it's simpler to compute each difference directly instead of building the chunks first:

l[i:i+chunklen][1]-l[i:i+chunklen][0]
#given l[i:i+chunklen] as [1, 14] this would return 14-1 i.e. 13
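
As an alternative sketch for the pairing step (not what the answer above uses), the start and end times can also be split apart by slicing with a step of 2 and then zipped back together. This assumes every 1 is eventually followed by a matching 3, so the filtered list has an even length; the names `starts`, `ends`, and `durations` are illustrative.

```python
# the filtered list of start/end times from the previous step
l = [1, 14, 17, 37, 32, 38]

starts = l[::2]   # every even position: [1, 17, 32]
ends = l[1::2]    # every odd position:  [14, 37, 38]

# subtract each start from its matching end
durations = [end - start for start, end in zip(starts, ends)]
print(durations)  # [13, 20, 6]
```

Slicing avoids the repeated `l[i:i+chunklen]` lookups, at the cost of building two temporary lists.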

edited Nov 12 '20 at 05:15

answered Nov 12 '20 at 01:24


This was very helpful and clear - thank you for sharing an alternative approach using the `more_itertools` package. Haven't had much exposure to that before! – quant_fin Nov 12 '20 at 02:07

no problem, I'll add some extra explanation for this in a little bit! If I could know, where did you get this problem from? – Ironkey Nov 12 '20 at 02:18
Of course! Happy to provide some context. I'm currently working with a medical dataset at work that was collected from patients who were asked to draw an image on an iPad. At various points in time (i.e. `t`), the (x,y) coordinates were recorded [though I didn't include that info in this question since it wasn't necessary], along with a tag indicating the nature of the point: whether the person had just pressed their Apple Pen to the page (i.e. 1), dragged it (i.e. 2), or lifted it off the page (i.e. 3). That's the `pt` series described above – quant_fin Nov 12 '20 at 02:31
Oh that sounds like fun! Happy to help :) – Ironkey Nov 12 '20 at 02:37

Thank you so much for the in-depth explanation you just added - learnt so much from that! – quant_fin Nov 12 '20 at 02:43

score 1 · Answer 2 · answered Nov 12 '20 at 01:14

This seems to work

pt = [1, 2,  2,  3,  1,  2,  2,  2,  3,  1,  2,  3 ]
t =  [1, 10, 13, 14, 17, 20, 21, 25, 37, 32, 33, 38] 

st = 0    # start time of the current event
lst = []  # durations of the completed events
for tag, time in zip(pt, t):
    if tag == 1:
        st = time
    elif tag == 3:
        lst.append(time - st)

print(lst)

Output

[13, 20, 6]
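
The same loop can be wrapped in a reusable function. The sketch below is one possible packaging, not part of the original answer: the name `event_durations` is hypothetical, and raising `ValueError` when a 3 appears without a preceding 1 is an added assumption about how malformed input should be handled.

```python
def event_durations(pt, t):
    """Return (end time - start time) for each 1 ... 3 event in pt."""
    durations = []
    start = None  # time of the most recent unmatched start tag (1)
    for tag, time in zip(pt, t):
        if tag == 1:
            start = time
        elif tag == 3:
            # assumption: an end tag without a start tag is an input error
            if start is None:
                raise ValueError("end tag (3) found with no preceding start tag (1)")
            durations.append(time - start)
            start = None
    return durations


pt = [1, 2, 2, 3, 1, 2, 2, 2, 3, 1, 2, 3]
t = [1, 10, 13, 14, 17, 20, 21, 25, 37, 32, 33, 38]
print(event_durations(pt, t))  # [13, 20, 6]
```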

score 1 · Answer 3 · answered Nov 12 '20 at 01:28

Code:

pt = [1, 2,  2,  3,  1,  2,  2,  2,  3,  1,  2,  3 ]
t =  [1, 10, 13, 14, 17, 20, 21, 25, 37, 32, 33, 38] 

events_t = []  # one list of times per completed event
arr = []       # times collected for the current event
for pt_ele, t_ele in zip(pt, t):
    arr.append(t_ele)
    if pt_ele == 3:
        events_t.append(arr)
        arr = []
print(events_t)
res = [i[-1] - i[0] for i in events_t]
print(res)

Output:

[[1, 10, 13, 14], [17, 20, 21, 25, 37], [32, 33, 38]]
[13, 20, 6]
