I have 3 arrays, a, b, and c, all of length 15.

a = [950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150]

b = [16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80]

c = [32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003, -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006, -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4, -105.20000000000002]

I am trying to find the value of a at the index where b = c.

The problem is that there is no place where b=c exactly so I need to linearly interpolate between values in the array to find the value of a where b=c. Does that make sense?

I was thinking about using scipy.interpolate to do the interpolation.
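Roughly, this is the setup I have in mind (just a sketch; the last step is where I'm stuck):

import numpy as np
from scipy import interpolate

# a, b, c as above
d = np.asarray(b) - np.asarray(c)   # zero wherever b equals c
f = interpolate.interp1d(a, d)      # d as a function of a
# ... but how do I find the value of a at which f(a) == 0?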

I am having a hard time wrapping my mind around how to solve this problem. Any ideas on this would be great!

HM14
  • Can you add some examples of arrays `a`, `b`, and `c`? – Christian Ternus Sep 12 '16 at 19:31
  • (the arrays `a`, `b`, and `c` were posted here and are now shown in the question above) – HM14 Sep 12 '16 at 19:34
  • Define `interpolate linearly` (what do you mean? linear regression on all points? piecewise-linear? is it a valid assumption in regards to the data?). Also: instead of adding this example as a comment, edit your question and format it nicely! – sascha Sep 12 '16 at 21:07
  • Linear interpolation: https://en.wikipedia.org/wiki/Linear_interpolation. In particular, see "Linear interpolation of a data set": https://en.wikipedia.org/wiki/Linear_interpolation#Interpolation_of_a_data_set – Warren Weckesser Sep 12 '16 at 21:50

3 Answers


Here's a simpler variation of a function from another answer of mine:

from __future__ import division

import numpy as np


def find_roots(t, y):
    """
    Given the input signal `y` with samples at times `t`,
    find the times where `y` is 0.

    `t` and `y` must be 1-D numpy arrays.

    Linear interpolation is used to estimate the time `t` between
    samples at which sign changes in `y` occur.
    """
    # Find where y crosses 0.
    transition_indices = np.where(np.sign(y[1:]) != np.sign(y[:-1]))[0]

    # Linearly interpolate the time values where the transition occurs.
    t0 = t[transition_indices]
    t1 = t[transition_indices + 1]
    y0 = y[transition_indices]
    y1 = y[transition_indices + 1]
    slope = (y1 - y0) / (t1 - t0)
    transition_times = t0 - y0/slope

    return transition_times

That function can be used with t = a and y = b - c. For example, here is your data, entered as numpy arrays:

In [354]: a = np.array([950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150])

In [355]: b = np.array([16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80])

In [356]: c = np.array([32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003, -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006, -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4, -105.20000000000002])

The place where "b = c" is the place where "b - c = 0", so we pass b - c for y:

In [357]: find_roots(a, b - c)
Out[357]: array([ 312.5])

So the linearly interpolated value of a is 312.5.
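Since the question mentions scipy.interpolate, here is one way to cross-check that number (a sketch, under the assumption that only the first sign change matters): restrict the interpolation to the single segment where b - c changes sign, and evaluate a as a linear function of b - c at zero.

from scipy.interpolate import interp1d

d = b - c                                              # zero where b == c
i = np.where(np.sign(d[1:]) != np.sign(d[:-1]))[0][0]  # first bracketing index
a_at_crossing = interp1d(d[i:i + 2], a[i:i + 2])(0.0)  # invert d -> a on that segment
print(a_at_crossing)                                   # 312.5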

With the following matplotlib commands:

In [391]: plot(a, b, label="b")
Out[391]: [<matplotlib.lines.Line2D at 0x11eac8780>]

In [392]: plot(a, c, label="c")
Out[392]: [<matplotlib.lines.Line2D at 0x11f23aef0>]

In [393]: roots = find_roots(a, b - c)

In [394]: [axvline(root, color='k', alpha=0.2) for root in roots]
Out[394]: [<matplotlib.lines.Line2D at 0x11f258208>]

In [395]: grid()

In [396]: legend(loc="best")
Out[396]: <matplotlib.legend.Legend at 0x11f260ba8>

In [397]: xlabel("a")
Out[397]: <matplotlib.text.Text at 0x11e71c470>

I get the plot:

[plot of b and c versus a, with a faint vertical line at the interpolated crossing near a = 312.5]

Warren Weckesser

This is not necessarily a solution to your problem, since your data does not appear to be linear, but it might give you some ideas. If you assume that your lines a, b, and c are linear, then the following idea works:

Perform a linear regression on each of a, b, and c to get their respective slopes (m_a, m_b, m_c) and y-intercepts (b_a, b_b, b_c). Then solve the equation y_b = y_c for x, and evaluate y = m_a * x + b_a to get your result.

Since the linear regression approximately solves y = m * x + b, the equation y_b = y_c can be solved by hand, giving x = (b_b - b_c) / (m_c - m_b).

Using python, you get:

>>> from scipy import stats
>>> m_a, b_a, r_a, p_a, err_a = stats.linregress(range(15), a)
>>> m_b, b_b, r_b, p_b, err_b = stats.linregress(range(15), b)
>>> m_c, b_c, r_c, p_c, err_c = stats.linregress(range(15), c)
>>> x = (b_b - b_c) / (m_c - m_b)
>>> m_a * x + b_a
379.55151515151516

Since your data is not linear, you probably need to go through your vectors one by one and search for overlapping y intervals. Then you can apply the above method, but using only the endpoints of the two bracketing intervals to construct the b and c inputs to the linear regression. In this case you should get an exact result, since a least-squares fit through two points interpolates them perfectly (although there are more efficient ways to do this, since the intersection of two straight lines can be solved exactly in this simple case).
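A sketch of that piecewise idea, with the bracketing pair located via the sign change of b - c (the index lookup is the only added ingredient; the rest is the same stats.linregress as above):

import numpy as np
from scipy import stats

d = np.asarray(b) - np.asarray(c)
i = np.where(np.sign(d[1:]) != np.sign(d[:-1]))[0][0]   # first sign change of b - c
xs = [i, i + 1]                                         # the two bracketing samples

m_b, b_b, r_b, p_b, err_b = stats.linregress(xs, [b[i], b[i + 1]])
m_c, b_c, r_c, p_c, err_c = stats.linregress(xs, [c[i], c[i + 1]])
m_a, b_a, r_a, p_a, err_a = stats.linregress(xs, [a[i], a[i + 1]])

x = (b_b - b_c) / (m_c - m_b)   # exact crossing of the two chords
print(m_a * x + b_a)            # 312.5, in agreement with the piecewise-linear answer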

Cheers.

davhoo
  • Don't let Warren see your solution ;-). I like it. While it's debatable whether a linear regression is the correct approach here (which you questioned), the OP did not give much info. And our approaches are the same, **with the same result**, but yours is more elegant (no need for general-purpose optimizers)! – sascha Sep 12 '16 at 21:49

Another simple solution using:

  • one linear-regressor for each vector (done with scikit-learn as scipy-docs were down for me; easy to switch to numpy/scipy-based linear-regression)
  • general-purpose minimization using scipy.optimize.minimize

Code

a = [950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150]
b = [16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80]
c = [32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003, -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006, -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4, -105.20000000000002]

from sklearn.linear_model import LinearRegression
from scipy.optimize import minimize
import numpy as np

# Fit one line per series, using the sample index 0..14 as the regressor.
reg_a = LinearRegression().fit(np.arange(len(a)).reshape(-1,1), a)
reg_b = LinearRegression().fit(np.arange(len(b)).reshape(-1,1), b)
reg_c = LinearRegression().fit(np.arange(len(c)).reshape(-1,1), c)

# Evaluate each regression line at a (possibly fractional) index x.
funA = lambda x: reg_a.predict(x.reshape(-1,1))
funB = lambda x: reg_b.predict(x.reshape(-1,1))
funC = lambda x: reg_c.predict(x.reshape(-1,1))

# Squared gap between lines b and c; zero exactly at their crossing.
opt_crossing = lambda x: (funB(x) - funC(x))**2
x0 = 1
res = minimize(opt_crossing, x0, method='SLSQP', tol=1e-6)
print(res)
print('Solution: ', funA(res.x))

import matplotlib.pyplot as plt

x = np.linspace(0, 15, 100)
a_ = reg_a.predict(x.reshape(-1,1))
b_ = reg_b.predict(x.reshape(-1,1))
c_ = reg_c.predict(x.reshape(-1,1))

plt.plot(x, a_, color='blue')
plt.plot(x, b_, color='green')
plt.plot(x, c_, color='cyan')
plt.scatter(np.arange(15), a, color='blue')
plt.scatter(np.arange(15), b, color='green')
plt.scatter(np.arange(15), c, color='cyan')

plt.axvline(res.x, color='red', linestyle='solid')
plt.axhline(funA(res.x), color='red', linestyle='solid')

plt.show()

Output

     fun: array([  7.17320622e-15])
     jac: array([ -3.99479864e-07,   0.00000000e+00])
 message: 'Optimization terminated successfully.'
    nfev: 8
     nit: 2
    njev: 2
  status: 0
 success: True
       x: array([ 8.37754008])
Solution:  [ 379.55151658]

Plot

[plot of the fitted lines for a, b, and c with their data points; red lines mark the crossing]

sascha
  • Cool, but 379? Plot the data, and see if you find that answer satisfactory. :) – Warren Weckesser Sep 12 '16 at 21:32
  • @WarrenWeckesser It's all about the model. I asked what he meant by linear interpolation and got no answer, so my approach uses a global linear regression. This might be valid or not; it's a model decision! So I have to admit: I find it satisfactory, and one can easily see the difference between our models (now that you have added a plot too). – sascha Sep 12 '16 at 21:44