How to find part of series in some series

Question

The question is simple.

Suppose we have a Series with these values:

srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])

How can I find the position (index) of the subseries 1.0, 2.0, 3.0?

You can find where the first and last elements are, then loop through and get everything in between. — Axiumin_, Aug 11 '19 at 19:21
What do you mean exactly? What is your expected output? In this case, index `1, 2, 3, 4` or `2, 3, 4`? — Erfan, Aug 11 '19 at 19:24
I wrote "place (index)", which means that the right answer is `2`. But if there is some way to get the indexes of all the values, `2, 3, 4`, that would be fine too. — Alex-droid AD, Aug 11 '19 at 19:30
Possible duplicate of [Python/NumPy first occurrence of subarray](https://stackoverflow.com/questions/7100242/python-numpy-first-occurrence-of-subarray) — help-ukraine-now, Aug 11 '19 at 20:02
if you want pandas specific answer, it's not a duplicate, sorry. otherwise, it shows the way to find the index of subseries (if it's possible to convert `pd.Series` into a `list` or `np.array`). — help-ukraine-now, Aug 11 '19 at 20:52
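
The conversion to a NumPy array mentioned in the last comment can be sketched as follows. This is not taken from the linked question; it is a minimal illustration that assumes NumPy 1.20 or later (for `sliding_window_view`), and the names `sub`, `windows` and `matches` are chosen only for this example:

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
sub = np.array([1.0, 2.0, 3.0])

# Each row of `windows` is one run of len(sub) consecutive values.
windows = sliding_window_view(srs.to_numpy(), len(sub))

# Positions of the rows where every element equals the pattern.
matches = np.flatnonzero((windows == sub).all(axis=1))
print(matches.tolist())  # [2]
```

Like any approach that goes through a plain array, this yields positions, not index labels.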

PyWin · Answer 1 · 2019-08-12T15:32:07.317

Using a rolling window, we can find the first occurrence of the list a. The applied function puts a 'marker' (e.g. 0; any non-NaN value will do) at the end (right border) of each matching window. We then use first_valid_index to find the index of the first marker and correct it by the window size to get the start of the match:

a = [1.0, 2.0, 3.0]
srs.rolling(len(a)).apply(lambda x: 0 if (x == a).all() else np.nan).first_valid_index()-len(a)+1

Output:

2
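
For reference, here is a self-contained sketch of the same rolling-window idea. It assumes pandas and NumPy are installed. `find_first` is a hypothetical helper name, not part of pandas, and `raw=True` is an added choice so that each window arrives as a NumPy array instead of a Series:

```python
import numpy as np
import pandas as pd


def find_first(series, pattern):
    """Return the position of the first window equal to pattern, or None."""
    pattern = np.asarray(pattern)
    n = len(pattern)
    # Mark the right edge of every matching window with 0.0; other rows stay NaN.
    marks = series.reset_index(drop=True).rolling(n).apply(
        lambda window: 0.0 if (window == pattern).all() else np.nan, raw=True
    )
    end = marks.first_valid_index()  # None when no window matches
    return None if end is None else int(end) - n + 1


srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
print(find_first(srs, [1.0, 2.0, 3.0]))  # 2
```

Resetting the index first means the result is a position, and the `None` check avoids a `TypeError` when the pattern does not occur.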

Alexandre B. · Answer 2 · 2019-08-11T22:57:46.487

One naive way is to iterate over the series, take each window of n consecutive elements, and check whether it equals the given list.

Here is the code:

srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
sub_list = [1.0, 2.0, 3.0]


n = len(sub_list)
index_matching = []
for i in range(srs.shape[0] - n + 1):
    sub_srs = srs.iloc[i: i+n]
    if (sub_srs == sub_list).all():
        index_matching.append(sub_srs.index)

print(index_matching)
# [RangeIndex(start=2, stop=5, step=1)]

Or in one line with list comprehension:

out = [srs.iloc[i:i+n].index for i in range(srs.shape[0] - n + 1) if (srs.iloc[i: i+n] == sub_list).all()]
print(out)
# [RangeIndex(start=2, stop=5, step=1)]

If you want an explicit list:

real_values = [list(idx) for idx in out]
print(real_values)
# [[2, 3, 4]]

score 0 · Answer 3 · answered Aug 11 '19 at 21:39

The simplest solution might be to use a list comprehension:

a = srs.tolist() # [7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0]
b = [1.0, 2.0, 3.0]

[x for x in range(len(a)) if a[x:x+len(b)] == b]
# [2]

answered Aug 11 '19 at 21:39 by help-ukraine-now

This returns the index **position**, not the index values. What happens if the series has `string` labels as its index? – Alexandre B. Aug 12 '19 at 08:14
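
To illustrate the distinction raised in this comment, here is a minimal sketch with a string-labelled index. The labels `a` to `g` and the variable names are made up for the example. The list comprehension yields positions, and slicing `srs.index` by position translates them into the corresponding labels:

```python
import pandas as pd

srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0], index=list("abcdefg"))
values = srs.tolist()
sub = [1.0, 2.0, 3.0]
n = len(sub)

# Start positions of every window equal to `sub`.
positions = [i for i in range(len(values) - n + 1) if values[i:i + n] == sub]

# Translate each matching window from positions to index labels.
labels = [srs.index[i:i + n].tolist() for i in positions]

print(positions)  # [2]
print(labels)     # [['c', 'd', 'e']]
```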
