
Please see the attached graph. I need to find the points "a" and "b". Could you suggest a method for doing this in Python?

The graph is plotted from observed run times, which are: x = ([1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]) and y = ([2314, 802, 519, 417, 358, 318, 302, 284, 280])

I need to find the "a" and "b" points so that I can use them individually for other tasks.

Full code:

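import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize

# Float arrays: np.piecewise returns the dtype of x, so integer input would truncate the fitted values
x = np.array([1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000], dtype=float)
y = np.array([2314, 802, 519, 417, 358, 318, 302, 284, 280], dtype=float)

def piecewise_linear(x, x0, y0, k1, k2):
    # Two lines with slopes k1 (left) and k2 (right) that meet at the breakpoint (x0, y0)
    return np.piecewise(x, [x < x0], [lambda x: k1*x + y0 - k1*x0, lambda x: k2*x + y0 - k2*x0])

perr_min = np.inf
p_best = None
for n in range(100):
    # Random starting guess for (x0, y0, k1, k2); these ranges are an assumption based on the data
    k = [np.random.uniform(x.min(), x.max()), np.random.uniform(y.min(), y.max()),
         -np.random.rand(), -np.random.rand()]
    try:
        p, e = optimize.curve_fit(piecewise_linear, x, y, p0=k)
    except RuntimeError:
        continue  # this start did not converge; try another one
    perr = np.sum(np.abs(y - piecewise_linear(x, *p)))
    if perr < perr_min:
        print("success")
        perr_min = perr
        p_best = p

xd = np.linspace(min(x), max(x), 100)
plt.figure()
plt.plot(x, y, "bo")
y_out = piecewise_linear(xd, *p_best)
plt.plot(xd, y_out)
plt.ylabel('Number of KeyFrames')
plt.xlabel('Threshold Values')
plt.show()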

[Image: Graph]

animuson
Rudra
  • Independently of the programming language, you must define the point mathematically. So how are the green lines defined? Describe point `(a, b)` without the plot. – Gribouillis Jul 15 '17 at 08:09
  • As Sam has mentioned below, the data points look like this: (1000, 2314) (2000, 802) (3000, 519) (4000, 417) (5000, 358) (6000, 318) (7000, 302) (8000, 284) (9000, 280) – Rudra Jul 15 '17 at 08:22
  • This does not define the green lines, nor the curve you're talking about. – Gribouillis Jul 15 '17 at 08:37

1 Answer


I don't fully understand your question. Do you want code that extracts the relevant data points from the graph image using computer vision, or do you just want the coordinates of the data points from the x and y lists you defined? If the latter is the case, you can do something like:

change_points = [] # to store the points you want
curr_slope = (y[1] - y[0]) / (x[1] - x[0]) # to be used for comparison
for i in range(2, len(y)):
    prev_slope = curr_slope
    curr_slope = (y[i]-y[i-1]) / (x[i]-x[i-1])
    if not (0.2 <= (curr_slope / prev_slope) <= 5):
        change_points.append((x[i-1], y[i-1]))

for point in change_points:
    print(point)

This prints (2000, 802). Is there anything that defines the green line? Otherwise, here I've set a ratio threshold to add only points which change the slope by a 'large enough' (in this case, by a factor of 5) amount.

Also, the parentheses in your x and y initialisations are redundant. Just use square brackets.
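The slope-ratio check above can also be written without an explicit loop, using `np.diff` and `np.where` as the comments below suggest. This is only a minimal sketch: it assumes `x` and `y` are the lists from the question converted to float arrays, and it reuses the same thresholds (0.2 and 5), which are a judgment call rather than anything derived from the data.

```python
import numpy as np

# Data from the question, as float arrays
x = np.array([1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000], dtype=float)
y = np.array([2314, 802, 519, 417, 358, 318, 302, 284, 280], dtype=float)

# Slope of each segment between consecutive data points
slopes = np.diff(y) / np.diff(x)

# Ratio of each segment's slope to the slope of the segment before it
ratios = slopes[1:] / slopes[:-1]

# Positions where the slope changes by more than the chosen factor
idx = np.where((ratios < 0.2) | (ratios > 5))[0]

# ratios[j] compares the segments on either side of data point j + 1
change_points = [(float(px), float(py)) for px, py in zip(x[idx + 1], y[idx + 1])]

for point in change_points:
    print(point)
```

Like the loop, this divides by the previous slope, so it would need a guard if two consecutive points ever had the same y value.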

Sam Chats
  • I don't need all the data points. The point (a, b) is the data point where the curve changes. I want its exact position. – Rudra Jul 15 '17 at 08:22
  • By "graph changes" you mean "slope changes"? – Sam Chats Jul 15 '17 at 08:52
  • You can adapt @ashti 's answer by using numpy.where(condition...) –  Jul 15 '17 at 08:56
  • Yes, that's the technical term, I guess – Rudra Jul 15 '17 at 08:57
  • @ashti I think we should be using `xd` and `y_out` instead of `x` and `y`. – Sam Chats Jul 15 '17 at 09:25
  • [This answer](https://stackoverflow.com/a/16343791/7345804) (and the numpy docs) cover the where function. If you have all of the (x, y) points, you can use the where function to find which index of x/y meets your specified condition (for example, a slope change above a threshold for two consecutive points). –  Jul 15 '17 at 09:32
  • @SamChats : The line is a regression line defined by piecewise regression (See the code). I can check with the large difference threshold – Rudra Jul 15 '17 at 10:01
  • @ashti Yes, and its data is in `xd` and `y_out`, so why not use them? – Sam Chats Jul 15 '17 at 10:05
  • @SamChats : I am running the code with xd and y_out to check whether it produces the result I am expecting. I will update in some time – Rudra Jul 15 '17 at 10:06
  • @mikey : Sure, I will try using the numpy where function too – Rudra Jul 15 '17 at 10:07
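
On the regression line discussed in the comments: in the question's `piecewise_linear` model the two segments meet at `(x0, y0)`, so after a successful fit the breakpoint is simply the first two fitted parameters. If the random restarts prove unreliable, a deterministic alternative is sketched below. Note that this is a swapped-in technique, not the `curve_fit` approach from the question: it tries every split of the data, fits a straight line to each side with `np.polyfit`, keeps the split with the smallest squared error, and takes the intersection of the two lines as `(a, b)`. The helper name `two_line_breakpoint` is hypothetical.

```python
import numpy as np

# Data from the question, as float arrays
x = np.array([1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000], dtype=float)
y = np.array([2314, 802, 519, 417, 358, 318, 302, 284, 280], dtype=float)

def two_line_breakpoint(x, y):
    """Hypothetical helper: best two-line fit over all splits, returned as the crossing point."""
    best = None
    # Keep at least two points on each side so both lines are defined
    for i in range(2, len(x) - 1):
        k1, c1 = np.polyfit(x[:i], y[:i], 1)  # left line:  y = k1*x + c1
        k2, c2 = np.polyfit(x[i:], y[i:], 1)  # right line: y = k2*x + c2
        sse = (np.sum((y[:i] - (k1 * x[:i] + c1)) ** 2)
               + np.sum((y[i:] - (k2 * x[i:] + c2)) ** 2))
        if best is None or sse < best[0]:
            best = (sse, k1, c1, k2, c2)
    _, k1, c1, k2, c2 = best
    a = (c2 - c1) / (k1 - k2)  # x where the two fitted lines cross
    b = k1 * a + c1            # y at that crossing
    return a, b

a, b = two_line_breakpoint(x, y)
print(a, b)
```

For the data in the question the crossing point lands between x = 2000 and x = 3000, at roughly (2203, 495). In general the intersection is not guaranteed to fall inside the gap where the data was split, so it is worth checking that before using `a` and `b` elsewhere.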