The question is simple.
Suppose we have a Series with these values:
srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
How can I find the position (index) of the subseries 1.0, 2.0, 3.0?
Using a rolling window, we can find the first occurrence of a list a. The function passed to apply puts a 'marker' (e.g. 0; any non-NaN value will do) at the end (right border) of each window that matches. We then use first_valid_index to find the index of the first marker and shift it back by the window size to get the start of the match:
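import numpy as np
a = [1.0, 2.0, 3.0]
# mark the right edge of each matching window with 0, then shift back to the window start
srs.rolling(len(a)).apply(lambda x: 0 if (x == a).all() else np.nan).first_valid_index()-len(a)+1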
Output:
2
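Note that the one-liner raises a TypeError when there is no match, because first_valid_index returns None, and the arithmetic only gives a position when the index is the default 0..n-1 range. A minimal sketch of a more defensive variant follows; the function name first_subseries_position is made up for illustration, and it assumes a numeric Series with unique index labels:

```python
import numpy as np
import pandas as pd


def first_subseries_position(srs, pattern):
    """Return the integer position where `pattern` first starts in `srs`, or None."""
    n = len(pattern)
    target = np.asarray(pattern, dtype=float)
    # raw=True passes each window as a NumPy array; mark matching windows with 1.0
    marks = srs.rolling(n).apply(
        lambda w: 1.0 if np.array_equal(w, target) else np.nan, raw=True
    )
    end_label = marks.first_valid_index()
    if end_label is None:
        return None  # no window matched
    # translate the label of the window's right edge into a position,
    # then step back to the first element of the window
    return srs.index.get_loc(end_label) - n + 1


srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
print(first_subseries_position(srs, [1.0, 2.0, 3.0]))
print(first_subseries_position(srs, [9.0, 9.0]))
```

The first call should print 2 and the second None.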
One naive way is to iterate over the series, take each slice of n consecutive elements, and check whether it equals the given list.
Here is the code:
srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
sub_list = [1.0, 2.0, 3.0]
n = len(sub_list)
index_matching = []
for i in range(srs.shape[0] - n + 1):
    sub_srs = srs.iloc[i: i+n]
    if (sub_srs == sub_list).all():
        index_matching.append(sub_srs.index)
print(index_matching)
# [RangeIndex(start=2, stop=5, step=1)]
Or in one line with a list comprehension:
out = [srs.iloc[i:i+n].index for i in range(srs.shape[0] - n + 1) if (srs.iloc[i: i+n] == sub_list).all()]
print(out)
# [RangeIndex(start=2, stop=5, step=1)]
If you want an explicit list:
real_values = [[i for i in idx] for idx in out]
print(real_values)
# [[2, 3, 4]]
The simplest solution might be to convert the series to a plain list and use a list comprehension:
a = srs.tolist() # [7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0]
b = [1.0, 2.0, 3.0]
[x for x in range(len(a)) if a[x:x+len(b)] == b]
# [2]
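For long series, the same window-by-window comparison can be done without a Python-level loop. The following is only a sketch of that idea, not taken from the answers above: it assumes NumPy 1.20 or later (for sliding_window_view), and find_subseries is a hypothetical helper name. It returns the starting positions of all matches:

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def find_subseries(srs, pattern):
    """Return the start positions of every occurrence of `pattern` in `srs`."""
    values = srs.to_numpy()
    target = np.asarray(pattern)
    n = len(target)
    if n == 0 or n > len(values):
        return []  # nothing sensible to match
    # each row of `windows` is one length-n slice of the series (a view, no copy)
    windows = sliding_window_view(values, n)
    # a row matches when all of its n elements equal the pattern
    hits = (windows == target).all(axis=1)
    return np.flatnonzero(hits).tolist()


srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
print(find_subseries(srs, [1.0, 2.0, 3.0]))
```

For the example series this should print [2]. The positions are 0-based offsets, so with a non-default index they can be mapped to labels with srs.index[...].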