0

So I have a list that is monotonically increasing, but has many repeated values. For example, these are the first few elements:

[0.0, 0.0, 0.0, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 1.0986122886681098, 1.0986122886681098, ...

What is an efficient way to get a list of indices where jumps happen? E.g. here, 2 would be the first index.

900edges

3 Answers

1

You can do so using a list comprehension with enumerate and zip.

x = [0.0, 0.0, 0.0, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 1.0986122886681098, 1.0986122886681098]

print([i for i, t in enumerate(zip(x, x[1:])) if t[0] != t[1]])

>>> [2, 6]

Basically you are zipping the list with itself shifted by 1, which pairs each element with its successor. You keep only the indexes where the two values in a pair differ.
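To make the pairing visible, here is a minimal sketch on a short made-up list (the integers are only for readability and are not from the question):

```python
# Small made-up list, chosen only to keep the pairs readable.
x = [0, 0, 1, 1, 2]

# zip stops at the shorter input, so this yields len(x) - 1 neighbour pairs.
pairs = list(zip(x, x[1:]))
print(pairs)   # [(0, 0), (0, 1), (1, 1), (1, 2)]

# Keep the position of every pair whose two members differ.
jumps = [i for i, (a, b) in enumerate(pairs) if a != b]
print(jumps)   # [1, 3]
```

Each reported index is the position of the last element before the value changes.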

If you have very large lists then you could use the numpy module to do something similar.

import numpy as np

x = np.array([0.0, 0.0, 0.0, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 1.0986122886681098, 1.0986122886681098])

res = np.where(x[:-1] != x[1:])[0]
print(res)

>>> [2 6]
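A possible variant, sketched here as an alternative rather than as part of the answer above, is to take consecutive differences with np.diff and ask for the nonzero positions with np.flatnonzero. It assumes all values are finite (no NaN or inf):

```python
import numpy as np

x = np.array([0.0, 0.0, 0.0,
              0.6931471805599453, 0.6931471805599453,
              0.6931471805599453, 0.6931471805599453,
              1.0986122886681098, 1.0986122886681098])

# np.diff(x)[i] equals x[i + 1] - x[i], so it is nonzero exactly where
# the value changes (assuming finite inputs).
jumps = np.flatnonzero(np.diff(x))
print(jumps)   # [2 6]
```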
ScootCork
0

Here's a simple solution:

jumps = []
for i in range(len(numbers) - 1):
    if numbers[i] < numbers[i+1]:
        jumps.append(i)

Then, if you want to get the values at the jumps in a list, you can do:

[numbers[i] for i in jumps]
Anderson-A
  • Sure, thanks. I am trying to do a vectorized version, though. My thought was something like [i if list[i+1] != list[i]] but this doesn't work. – 900edges Feb 28 '21 at 21:36
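The expression in the comment fails because a comprehension needs a `for` clause, with the filtering `if` placed after it. A minimal sketch of the corrected form follows; the name `values` is a stand-in chosen here to avoid shadowing the built-in `list`. Note that this is still a Python-level loop, not a vectorized operation:

```python
values = [0.0, 0.0, 0.0,
          0.6931471805599453, 0.6931471805599453,
          0.6931471805599453, 0.6931471805599453,
          1.0986122886681098, 1.0986122886681098]

# The "for" clause comes first; the "if" after it acts as a filter.
jumps = [i for i in range(len(values) - 1) if values[i + 1] != values[i]]
print(jumps)   # [2, 6]
```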
0

With numpy you can use array[:-1] != array[1:]. Since the comparison is vectorized, it is much faster than looping over a Python list.

Comparison of methods:

import numpy as np
numbers  = [0.0, 0.0, 0.0, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 0.6931471805599453, 1.0986122886681098, 1.0986122886681098] * 500000
array = np.array(numbers)

Numpy method:

%%time
ok = (array[:-1] != array[1:])
jumps = np.where(ok)[0]

13.8 ms


@ScootCork method:

%%time
jumps = [i for i, t in enumerate(zip(numbers, numbers[1:])) if t[0] != t[1]]

374 ms


@Anderson-A method:

%%time
jumps = []
for i in range(len(numbers) - 1):
    if numbers[i] < numbers[i+1]:
        jumps.append(i)

505 ms
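
As a sanity check alongside the timings, here is a sketch that confirms the three methods return the same indices on non-decreasing input. The `numbers` list below is made-up test data, not the benchmark data. The benchmark list is a block repeated 500000 times, so it drops back to 0.0 at every repetition boundary; there `!=` registers a change but `<` does not, so the three methods would not agree on it.

```python
import numpy as np

# Made-up non-decreasing data: 0.0 three times, 1.0 three times, ... up to 9.0.
numbers = [float(k // 3) for k in range(30)]
array = np.array(numbers)

# Vectorized comparison of each element with its successor.
numpy_jumps = np.where(array[:-1] != array[1:])[0].tolist()

# List comprehension over neighbour pairs.
zip_jumps = [i for i, t in enumerate(zip(numbers, numbers[1:])) if t[0] != t[1]]

# Explicit loop using "<", which matches "!=" only on non-decreasing input.
loop_jumps = []
for i in range(len(numbers) - 1):
    if numbers[i] < numbers[i + 1]:
        loop_jumps.append(i)

assert numpy_jumps == zip_jumps == loop_jumps
print(loop_jumps)   # [2, 5, 8, 11, 14, 17, 20, 23, 26]
```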


politinsa