Given activity logs in a Pandas data-frame which consists of an "id" going through a sequence of "action" at specified "timestamp" - I want to retain rows corresponding to a specified sequence of actions.
For e.g. Input data
import pandas as pd
# Create a sample data-frame from a dictionary
id = ['A123', 'A123', 'A123', 'A123', 'A123', 'A123', 'A234', 'A234', 'A234', 'A234', 'A341', 'A341', 'A341', 'A341', 'A341', 'A341', 'A341', 'A341', 'A341', 'A341']
action = ['A', 'B', 'C', 'D', 'B', 'A', 'B', 'A', 'C', 'D', 'D', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'B', 'C']
timestamp = ['1', '2', '3', '4', '5', '6', '1', '2', '3', '4', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
the_dict = {'id': id, 'action': action, 'timestamp': timestamp}
# This is the sample data-frame with columns:
# id action timestamp
# Each id when ordered by timestamp then action gives the sequence of actions taken by the id
dataFrame = pd.DataFrame(the_dict)
######################################
# Input data
######################################
# id action timestamp
#0 A123 A 1
#1 A123 B 2
#2 A123 C 3
#3 A123 D 4
#4 A123 B 5
#5 A123 A 6
#6 A234 B 1
#7 A234 A 2
#8 A234 C 3
#9 A234 D 4
#10 A341 D 1
#11 A341 B 2
#12 A341 C 3
#13 A341 D 4
#14 A341 A 5
#15 A341 B 6
#16 A341 C 7
#17 A341 D 8
#18 A341 B 9
#19 A341 C 10
# The sequence of interest
the_sequence = ['B', 'C', 'D']
# Desired output: Group by id, order by timestamp, return all rows which match the given sequence of actions
######################################
# The output data-frame:
######################################
# id action timestamp
#1 A123 B 2
#2 A123 C 3
#3 A123 D 4
#11 A341 B 2
#12 A341 C 3
#13 A341 D 4
#15 A341 B 6
#16 A341 C 7
#17 A341 D 8