0

The question is simple.

Suppose we have a Series with these values:

srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])

How can I find the position (index) of the subseries 1.0, 2.0, 3.0?

Alex-droid AD
  • You can find where the first and last elements are, then loop through and get everything in between. – Axiumin_ Aug 11 '19 at 19:21
  • What do you mean exactly? What is your expected output: in this case, index `1, 2, 3, 4` or `2, 3, 4`? – Erfan Aug 11 '19 at 19:24
  • I've written "place (index)", which means that the right answer is `2`. But if there is some way I could get the indexes of all values, `2, 3, 4`, that would be OK too. – Alex-droid AD Aug 11 '19 at 19:30
  • Possible duplicate of [Python/NumPy first occurrence of subarray](https://stackoverflow.com/questions/7100242/python-numpy-first-occurrence-of-subarray) – help-ukraine-now Aug 11 '19 at 20:02
  • No, it's not. `np.array` is not the same as `pd.Series`. – Alex-droid AD Aug 11 '19 at 20:10
  • If you want a pandas-specific answer, it's not a duplicate, sorry. Otherwise, it shows a way to find the index of the subseries (if it's possible to convert the `pd.Series` into a `list` or `np.array`). – help-ukraine-now Aug 11 '19 at 20:52

3 Answers

2

Using a rolling window, we can find the first occurrence of a list a. The lambda puts a marker (0 here; any non-NaN value works) at the right edge of each window that matches. We then use first_valid_index to find the index of the first marker and subtract the window size minus one to get the start of the match. Note that this arithmetic on the index assumes the default integer index:

import numpy as np

a = [1.0, 2.0, 3.0]
srs.rolling(len(a)).apply(lambda x: 0 if (x == a).all() else np.nan).first_valid_index() - len(a) + 1

Output:

2
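If you need every occurrence rather than only the first, the same marker idea can be extended. This is only a sketch: `rolling_match_positions` is a hypothetical helper name, it assumes a non-empty pattern, and it returns 0-based positions rather than index labels, so it does not rely on label arithmetic.

```python
import numpy as np
import pandas as pd


def rolling_match_positions(series, pattern):
    """Return the 0-based start positions of every window equal to pattern."""
    n = len(pattern)
    target = np.asarray(pattern)
    # raw=True hands each window to the lambda as a NumPy array.
    marks = series.rolling(n).apply(
        lambda w: 0.0 if np.array_equal(w, target) else np.nan, raw=True
    )
    # Non-NaN markers sit at the right edge of each match; shift back to the start.
    return (np.flatnonzero(marks.notna().to_numpy()) - n + 1).tolist()


srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
print(rolling_match_positions(srs, [1.0, 2.0, 3.0]))  # [2]
```

If labels are needed, `srs.index[pos]` converts a position back to its label.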
PyWin
  • 101
  • 2
0

One naive way is to iterate over the series, take each window of n elements, and check whether it equals the given list:

import pandas as pd

srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
sub_list = [1.0, 2.0, 3.0]

n = len(sub_list)
index_matching = []
for i in range(srs.shape[0] - n + 1):
    sub_srs = srs.iloc[i: i+n]
    if (sub_srs == sub_list).all():
        index_matching.append(sub_srs.index)

print(index_matching)
# [RangeIndex(start=2, stop=5, step=1)]

Or in one line with a list comprehension:

out = [srs.iloc[i:i+n].index for i in range(srs.shape[0] - n + 1) if (srs.iloc[i: i+n] == sub_list).all()]
print(out)
# [RangeIndex(start=2, stop=5, step=1)]

If you want an explicit list:

real_values = [idx.tolist() for idx in out]
print(real_values)
# [[2, 3, 4]]
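For longer series, the Python-level loop can be replaced by a vectorized comparison. The following is a sketch, not part of the original answer: it assumes NumPy 1.20 or later (for `sliding_window_view`) and a pattern no longer than the series, otherwise `sliding_window_view` raises a `ValueError`.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
sub = np.array([1.0, 2.0, 3.0])

# One row per window of len(sub) consecutive values.
windows = sliding_window_view(srs.to_numpy(), len(sub))

# Start positions of the rows that equal the pattern element by element.
starts = np.flatnonzero((windows == sub).all(axis=1))

# Map each start position back to the index labels it covers.
labels = [srs.index[s:s + len(sub)].tolist() for s in starts]

print(starts.tolist())  # [2]
print(labels)           # [[2, 3, 4]]
```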
Alexandre B.
0

The simplest solution might be to use a list comprehension:

a = srs.tolist() # [7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0]
b = [1.0, 2.0, 3.0]

[x for x in range(len(a)) if a[x:x+len(b)] == b]
# [2]
help-ukraine-now
  • This returns the index **position**, not the index values. What happens if the series has a `string` index? – Alexandre B. Aug 12 '19 at 08:14
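To address the point about positions versus labels, the positions found by the list comprehension can be mapped back through `srs.index`. A minimal sketch, assuming a hypothetical series with string labels `a` to `g` (not from the original question):

```python
import pandas as pd

# Hypothetical string-labelled series, used only for illustration.
srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0], index=list("abcdefg"))
a = srs.tolist()
b = [1.0, 2.0, 3.0]

# 0-based positions where the sublist starts.
positions = [x for x in range(len(a) - len(b) + 1) if a[x:x + len(b)] == b]

# Index labels covered by each match.
labels = [srs.index[x:x + len(b)].tolist() for x in positions]

print(positions)  # [2]
print(labels)     # [['c', 'd', 'e']]
```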