I have a dataframe, which has 2 columns,
a b
0 1 2
1 1 1
2 1 1
3 1 2
4 1 1
5 2 0
6 2 1
7 2 1
8 2 2
9 2 2
10 2 1
11 2 1
12 2 2
Is there a direct way to make a third column as below
a b c
0 1 2 0
1 1 1 1
2 1 1 0
3 1 2 1
4 1 1 0
5 2 0 0
6 2 1 1
7 2 1 0
8 2 2 1
9 2 2 0
10 2 1 0
11 2 1 0
12 2 2 0
in which target [1, 2]
is a sublist of df.groupby('a').b.apply(list)
, find the 2 rows that firstly fit the target in every group.
df.groupby('a').b.apply(list)
gives
1 [2, 1, 1, 2, 1]
2 [0, 1, 1, 2, 2, 1, 1, 2]
[1,2]
is a sublist of [2, 1, 1, 2, 1]
and [0, 1, 1, 2, 2, 1, 1, 2]
so far, I have a function
def is_sub_with_gap(sub, lst):
'''
check if sub is a sublist of lst
'''
ln, j = len(sub), 0
ans = []
for i, ele in enumerate(lst):
if ele == sub[j]:
j += 1
ans.append(i)
if j == ln:
return True, ans
return False, []
test on the function
In [55]: is_sub_with_gap([1,2], [2, 1, 1, 2, 1])
Out[55]: (True, [1, 3])