
I have a graph that looks like a mirrored letter "L", or like this: ___/ The slope of the first part of the graph is approximately zero (but not exactly zero!), and I would like to find the exact point where the graph starts to bend (slope > 0).

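import numpy as np
from scipy import stats

path = '/storage/.../01_python_in/'
test = np.loadtxt(path + 'sample_data.txt', skiprows=0)

window = 10  # number of consecutive rows used for each regression
slope_value = []
for j in range(len(test) - window):
    # fit a straight line to rows j .. j+window-1
    # (a single argument to linregress must be a two-column x, y array)
    slope, intercept, r_value, p_value, std_err = stats.linregress(test[j:j+window])

    if slope > 0.2:
        slope_value.append(slope)
        print(slope)

    else:
        slope_value.append(0)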
This works OK, but I have two issues:

1) My output is a list of slopes, one for each window of 10 elements. How can I find the index of the first element that is not zero, so I can read out the corresponding data point in my 'test' data? (Sorry, this is basic, but I'm a Python newbie.)
2) My actual data doesn't look perfectly linear, as it contains some noise. My solution has two parameters ('window' and the slope threshold of 0.2), which I can only guess (estimate). Is there a more elegant solution? Thanks for helping!

Renan Araújo
burazija
  • Given that the curve is not linear (even within the window it's just an approximation), what about using a spline fit to get a function, either in the window or for the entire data set? It may be a bit complex for a beginner, depending on your needs for the project. http://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.interpolate.UnivariateSpline.html The spline object has a method to get the derivative. – amd Sep 09 '15 at 17:56
  • Have you considered working backwards on the data? – Martin Evans Sep 09 '15 at 18:06
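
The spline idea from the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `first_rising_x`, the synthetic hockey-stick data, the noise level and the threshold of 0.5 are all made up here, and the smoothing factor `s` will most likely need tuning for real data.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def first_rising_x(x, y, slope_threshold=0.2, smoothing=None):
    """Return the first x where the spline derivative exceeds the threshold, or None.

    Hypothetical helper; x must be increasing.
    """
    spline = UnivariateSpline(x, y, k=3, s=smoothing)  # smoothing cubic spline
    dydx = spline.derivative()(x)                      # first derivative at each x
    above = np.flatnonzero(dydx > slope_threshold)     # indices where slope is large
    return x[above[0]] if above.size else None


# Synthetic data (an assumption): flat until x = 6, then slope 1, plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 201)
noise_sd = 0.01
y = np.where(x < 6.0, 0.0, x - 6.0) + rng.normal(0.0, noise_sd, x.size)

# Common rule of thumb for s: number of points times the noise variance.
knee = first_rising_x(x, y, slope_threshold=0.5, smoothing=x.size * noise_sd**2)
print(knee)
```

This replaces the window length with a smoothing factor, so there are still two values to choose, but the derivative is evaluated on one smooth curve rather than on many short regressions.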

2 Answers

0

(1) Try next(i for i,v in enumerate(slope_value) if v > 0), which returns the index of the first positive slope (and raises StopIteration if there is none).
(2) Maybe. I'd say the suitability of your solution depends on whether the model you apply is a good fit for your data, or for the process that generates your data. You'll likely have to go through some trial and error to find optimal values for your two parameters (window length and slope threshold).
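
As a minimal sketch of point (1), here is one way to wrap the `next(...)` expression so that it returns `None` instead of raising when no slope is positive, and to map the result back to the data. The helper name `first_positive_index` and the sample numbers are made up for illustration; the mapping assumes, as in the question's loop, that `slope_value[j]` was fitted on rows `j` to `j + window - 1`.

```python
def first_positive_index(values, threshold=0.0):
    """Index of the first value above threshold, or None if there is none."""
    return next((i for i, v in enumerate(values) if v > threshold), None)


slope_value = [0, 0, 0, 0.25, 0.4, 0.9]  # made-up slopes for illustration
window = 3                               # made-up window length

idx = first_positive_index(slope_value)
if idx is not None:
    # slope_value[idx] belongs to the data rows idx .. idx + window - 1
    window_rows = (idx, idx + window - 1)
    print(idx, window_rows)
```

Which row inside that window counts as "the" bend (first, middle or last) is a choice you have to make; the regression only says the slope over the whole window exceeded the threshold.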

Community
Norman
0

Try a list comprehension.

If the list slope_value is paired with other data use:

>>> slope_value = [0,0,0,0,0,1,2,2,3,4,5,6]
>>> x = [1,2,3,4,5,6,7,8,9,10,11,12]
>>> [X for (X,A) in sorted(zip(x,slope_value)) if A > 0][0]
6

Otherwise you can find the index of the first nonzero like so:

>>> [X for (X,A) in sorted(zip(range(len(slope_value)),slope_value)) if A > 0][0] 
5
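
If the slopes are kept in a NumPy array, the same lookup can be done without building and sorting a list of pairs. This is an alternative sketch, not part of the answer above; it reuses the same example values, and the variable names `first_idx` and `first_x` are made up.

```python
import numpy as np

slope_value = np.array([0, 0, 0, 0, 0, 1, 2, 2, 3, 4, 5, 6])
x = np.arange(1, 13)  # 1 .. 12, paired with slope_value

# Positions of all entries greater than zero, in order.
nonzero = np.flatnonzero(slope_value > 0)

# Guard against the case where no slope is positive.
first_idx = int(nonzero[0]) if nonzero.size else None
first_x = int(x[first_idx]) if first_idx is not None else None
print(first_idx, first_x)
```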
Kevin Johnson