How does someone get the y values from the red trend line generated by the source code below? I am no math expert.

The code below is from "How to add trendline in python matplotlib dot (scatter) graphs?":

import numpy as np
import pylab

# random inputs for x and y
x = np.random.uniform(low=0.5, high=13.3, size=(50,))
y = np.random.uniform(low=0.5, high=13.3, size=(50,))

# plot the data itself
pylab.plot(x, y, 'o')

# calc the trendline: fit a degree-1 polynomial (a straight line)
z = np.polyfit(x, y, 1)
p = np.poly1d(z)
pylab.plot(x, p(x), "r--")
# the line equation:
print("y=%.6fx+(%.6f)" % (z[0], z[1]))

When I print the value of p(x), I get the y values used to plot the red trend line:

[7.25072088 7.74580974 7.707636   7.57456601 7.72771792 7.36682509
 7.36216195 7.45937086 7.47592622 7.76663313 7.71256734 7.68601844
 7.34777885 7.2552914  7.28729136 7.4828444  7.25690455 7.47861776
 7.48472596 7.63791435 7.79364877 7.79382845 7.45020348 7.5488981
 7.29478413 7.27191799 7.47409563 7.26783249 7.49132469 7.2515923
 7.40558937 7.55062512 7.46004735 7.4094514  7.69985713 7.23891764
 7.50790404 7.38789488 7.23477781 7.59598148 7.49460819 7.62039958
 7.67580303 7.40553616 7.61933389 7.60038837 7.76048006 7.41307834
 7.28136679 7.5063726 ]

If this is an upward-moving trend, should the array elements not increase from start to end? As you can see, some elements are lower than the ones before them. Should there not be a steady incline, where the next element is ALWAYS higher than the previous element? Call me confused.

Bryan Downing
  • What are the inputs that returned those two elements of the array? Or is that just an example? If `x` only has two elements, a function, presumably `p(x)` would only return two outputs. – AER Jun 05 '18 at 02:58
  • I updated the code with the input of x and y. Both are random. – Bryan Downing Jun 05 '18 at 03:02
  • I've recreated it, it returns an array of the same dimension... So you run this as a script? Then type `p(x)` in the command line? – AER Jun 05 '18 at 03:06
  • I got that. How would one extract y if you were to plot the trend line? pylab.plot(x,p(x),"r--") – Bryan Downing Jun 05 '18 at 03:10
  • When I say array of the same dimension, I mean using your created "data" `x` it has length 50, similarly `p(x)` is an array of 50 – AER Jun 05 '18 at 03:36
  • I edited the question above with a corrected question at the end. Thanks for helping so far. – Bryan Downing Jun 05 '18 at 03:48

1 Answer

Should there not be a steady incline, where the next element is ALWAYS higher than the previous element?

Yes, the fit is a straight line, so higher values of x are always associated with higher (or lower, depending on the slope) values of p(x).

What's happening in your case is that x is not sorted, and so p(x) isn't sorted either.

In [18]: x
Out[18]:
array([  9.95692606,   5.25372625,   9.84277793,   9.75691888,
         3.53691402,   7.47732635,  13.26638669,  10.39011192,
        11.86590794,  10.38592445,   0.5328471 ,   7.69932299,
        ...

As you can see, we're not starting on the left and moving to the right. We're first looking at some point in the middle, then jumping left a lot, then jumping right, then moving a little bit to the left, etc. The corresponding p(x) values are not going to be monotonic either.
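To make this concrete, here is a minimal sketch (not the asker's exact data; the fixed seed and the variable names `trend_y`, `slope` and `intercept` are assumptions for illustration). It shows that `p(x)` is simply `slope * x + intercept` evaluated element by element, so the output comes back in the same order as `x`:

```python
import numpy as np

# Fixed seed so the sketch is reproducible (an assumption, not in the question)
rng = np.random.default_rng(0)
x = rng.uniform(low=0.5, high=13.3, size=50)
y = rng.uniform(low=0.5, high=13.3, size=50)

# Degree-1 fit: coefficients come back highest power first
slope, intercept = np.polyfit(x, y, 1)
p = np.poly1d([slope, intercept])

# These are the y values of the red trend line, one per element of x
trend_y = p(x)
same_as_formula = np.allclose(trend_y, slope * x + intercept)

# x is in random order, so trend_y is in that same (non-monotonic) order
x_is_sorted = bool(np.all(np.diff(x) >= 0))
trend_is_monotonic = bool(
    np.all(np.diff(trend_y) >= 0) or np.all(np.diff(trend_y) <= 0)
)
```

Here `trend_y` answers the "how do I get the y values" part of the question, and the two flags confirm that the apparent jumps come from the ordering of `x`, not from the fit.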

If you sort the points from left to right, you'll see that they indeed always move in the same vertical direction:

In [20]: sorted(zip(x, p(x)))
Out[20]:
[(0.53284710066507301, 5.2982022878459842),
 (0.90494271648495472, 5.3490731826338447),
 (1.2383322417505211, 5.3946523906172272),
 (1.2542322226117251, 5.3968261497778585),
 (1.3243912128123114, 5.4064179064586044),
 (1.4506628234207115, 5.4236810763129437),
 (2.0368566039434102, 5.503822311163459),
 (2.8349103207704576, 5.6129278876274968),
 (3.0174136939304748, 5.637878759123244),
 (3.5369140229038196, 5.7089020269444219),
 (4.932863919562303, 5.8997487268324766),
 (4.943993127936622, 5.9012702518497351),
 (4.9500689452818589, 5.9021009046491208),
 ...
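If you would rather keep NumPy arrays than a list of tuples, one possible variant uses `np.argsort`. This is only a sketch under assumptions: the seed and the names `order`, `x_sorted`, `trend_sorted` and `last_value` are made up for illustration.

```python
import numpy as np

# Fixed seed for reproducibility (an assumption, not in the question)
rng = np.random.default_rng(1)
x = rng.uniform(low=0.5, high=13.3, size=50)
y = rng.uniform(low=0.5, high=13.3, size=50)

z = np.polyfit(x, y, 1)
p = np.poly1d(z)

# Indices that put x in ascending order (left to right on the plot)
order = np.argsort(x)
x_sorted = x[order]
trend_sorted = p(x_sorted)  # trend-line y values, left to right

# Along the sorted x, the line only ever moves in one vertical direction
steps = np.diff(trend_sorted)
if z[0] >= 0:
    monotonic = bool(np.all(steps >= 0))
else:
    monotonic = bool(np.all(steps <= 0))

# Value of the trend line at the largest x
last_value = trend_sorted[-1]
```

With the points in this order, `last_value` is the maximum of the line when the slope `z[0]` is positive and the minimum when it is negative.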
NPE
  • Thanks, but should each element not be higher than the previous one? I would expect the values to be naturally increasing, without the need to sort. Or am I missing something? – Bryan Downing Jun 05 '18 at 04:01
  • @BryanDowning: Since `x` are not in the ascending order, you're not looking at the points from left to right. You're looking at some point in the middle, then jumping left a lot, then jumping right, then moving a little bit to the left, etc. The corresponding `p(x)` values are not going to be monotonic. – NPE Jun 05 '18 at 04:03
  • @BryanDowning: The sorting in my answer _is from left to right_. We're not sorting from top to bottom - that happens automatically. – NPE Jun 05 '18 at 04:05
  • So if I sort (ascending or descending), which gives the correct direction of the trend line as you say, the last sorted element will be the highest point (or max) of the trend line? It should also be the last, or most recent, point of the trend? I am trying to make sure I have a handle on this. – Bryan Downing Jun 05 '18 at 04:07
  • @BryanDowning: The largest `x` is the most recent, and corresponding `p(x)` is the value of the trendline (which will always be either the max or the min, depending on whether the line is trending upwards or downwards). – NPE Jun 05 '18 at 04:09