19

I have a few weeks of data with the units sold for each week:

xs[weeks] = [1,2,3,4]
ys['Units Sold'] = [1043,6582,5452,7571]

From the given series, we can see that although there is a drop from xs[2] to xs[3], the overall trend is increasing. How can I detect the trend in a small time series dataset?

Is finding the slope of a fitted line the best way? And how do I calculate the slope angle of a line in Python?

bguiz
Kanika Singhal
  • 1
    Detecting trends in time series is a whole topic in itself. Since the problem is not strictly defined (there is no hard-and-fast definition of what constitutes a trend and what is just a small variation), there is no definitive answer. See possible closed duplicate [How to detect significant change / trend in a time series data?](https://stackoverflow.com/q/12851208), or a related question I answered some time ago [How to calculate and plot multiple linear trends for a time series?](https://stackoverflow.com/q/41906679/1782792). – jdehesa Apr 12 '19 at 10:18
  • "Is finding a slope for the line is the best way?" The best way surely depends on the underlying models for the data and the noise. If the noise is large and the data set is small, most of the time you'll want to answer with "I don't know the trend with certainty", which would probably be all you can do. – NoDataDumpNoContribution Apr 12 '19 at 11:04
  • Questions like this really belong on the statistics site, https://stats.stackexchange.com. Note that there are probably already several questions and answers on this topic there. – shadowtalker Feb 22 '23 at 04:15

2 Answers

36

I ran into the same issue. I couldn't find a dedicated function for detecting a trend, but numpy.polyfit() turned out to be really helpful:

numpy.polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False)

See the official NumPy documentation for numpy.polyfit.

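You can use the function like this:

import numpy as np

def trenddetector(list_of_index, array_of_data, order=1):
    # Least-squares polynomial fit; coefficients are ordered highest power first.
    result = np.polyfit(list_of_index, list(array_of_data), order)
    # For order=1 this is the slope m of y = m*x + c.
    slope = result[-2]
    return float(slope)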

This function returns a float that indicates the trend of your data. You can interpret it like this:

if the slope is positive --> increasing trend

if the slope is negative --> decreasing trend

if the slope is zero --> no trend

Experiment with this function to find a threshold that suits your problem, and use it as the condition.
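As a minimal sketch of that threshold idea: the function name `classify_trend` and the `threshold` parameter are hypothetical (they are not part of NumPy or of the code above), and the value 100.0 is only an illustrative dead band, not a recommended setting.

```python
import numpy as np

def classify_trend(index, data, threshold=0.0):
    """Label a series by the sign of its least-squares slope.

    `threshold` is a hypothetical dead band: slopes whose magnitude is at
    or below it are reported as "no trend".
    """
    # For degree 1, polyfit returns [slope, intercept].
    slope, _intercept = np.polyfit(index, list(data), 1)
    if slope > threshold:
        return "increasing"
    if slope < -threshold:
        return "decreasing"
    return "no trend"

weeks = [1, 2, 3, 4]
units_sold = [1043, 6582, 5452, 7571]
print(classify_trend(weeks, units_sold, threshold=100.0))  # prints "increasing"
```

A sensible threshold depends on the scale and noise of your data, so treat the value as something to tune rather than a constant.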

Example Code for your Solution

import numpy as np

def trendline(index, data, order=1):
    # Fit a polynomial by least squares; for order=1, coeffs is [slope, intercept].
    coeffs = np.polyfit(index, list(data), order)
    slope = coeffs[-2]
    return float(slope)

index = [1, 2, 3, 4]
units_sold = [1043, 6582, 5452, 7571]
result = trendline(index, units_sold)
print(result)

RESULT

1845.3999999999999

The slope is well above zero (roughly 1845 more units per week on average), which indicates that your data has an overall increasing trend.
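The question also asks for the slope angle. Here is a minimal sketch, assuming the angle is measured against the x-axis with both axes in their raw units (weeks and units sold). Because the angle depends on those units, it may be less informative than the slope itself; `relative_slope` is a hypothetical normalization added here for illustration, not something from the original answer.

```python
import math
import numpy as np

weeks = [1, 2, 3, 4]
units_sold = [1043, 6582, 5452, 7571]

# Degree-1 least-squares fit: returns [slope, intercept].
slope, intercept = np.polyfit(weeks, units_sold, 1)

# Angle of the fitted line against the x-axis, in degrees.
angle_deg = math.degrees(math.atan(slope))

# Hypothetical scale-free alternative: slope as a fraction of the mean level.
relative_slope = slope / np.mean(units_sold)

print(angle_deg, relative_slope)
```

With a slope this large in raw units, the angle comes out just under 90 degrees, which is why a normalized slope is often easier to compare across series.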

Shaido
Majo_Jose
  • 1
    I like this answer. np.polyfit is least-squares polynomial fitting, which is widely used and whose pros and cons are well understood. The other answer's SMA is basically a convolution low-pass filter, which is OK for stream processing but introduces phase delay. – Kenji Noguchi Sep 21 '21 at 20:19
  • @Majo_Jose Thanks for the solution. I am confused about one thing: for time series data, will it find the slope between each pair of consecutive points and sum them all to get the final slope, or will it measure the slope between the starting point and the end point? Thanks – Khalid Usman Oct 26 '21 at 08:26
  • 1
    polyfit is now considered [legacy](https://numpy.org/doc/stable/reference/routines.polynomials.html#transitioning-from-numpy-poly1d-to-numpy-polynomial) – Matthew Hegarty Nov 03 '21 at 18:29
  • 1
    Thanks for the answer. Why is the order/degree set to 1? Is there any specific reason? Setting this to 2, 3, ... may give another answer, correct? – Moh-Spark Dec 08 '22 at 23:16
  • 1
    @Moh-Spark No specific reason. Basically, this function calculates the coefficients; e.g., if the degree is 1, it calculates 'm' and 'c' of the equation y = mx + c. We are just using it for trend indication. Setting different degrees would give different values, but I hope it will follow the same logic. – Majo_Jose Dec 13 '22 at 05:28
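On two points raised in the comments above, here is a hedged sketch using the question's data. A degree-1 least-squares fit uses all points at once, minimizing the squared vertical distances to the line, so it is neither the endpoint slope nor a sum of consecutive slopes. And since np.polyfit is described as legacy in the NumPy docs, the same slope can be obtained with numpy.polynomial.Polynomial.fit. The variable names are illustrative only.

```python
import numpy as np
from numpy.polynomial import Polynomial

weeks = np.array([1, 2, 3, 4], dtype=float)
units_sold = np.array([1043, 6582, 5452, 7571], dtype=float)

# Polynomial.fit works in a scaled domain; convert() maps the result back to
# the original x units. Coefficients are ordered lowest power first.
fit = Polynomial.fit(weeks, units_sold, deg=1).convert()
intercept, ls_slope = fit.coef

# For comparison: the slope between the first and last point, and the mean of
# the slopes between consecutive points (equal here, as weeks are evenly spaced).
endpoint_slope = (units_sold[-1] - units_sold[0]) / (weeks[-1] - weeks[0])
mean_step_slope = np.mean(np.diff(units_sold) / np.diff(weeks))

print(ls_slope, endpoint_slope, mean_step_slope)
```

The least-squares slope (about 1845.4) differs from the endpoint slope (2176.0) because the fit also takes the two middle points into account.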
5

One approach could be to use a moving average (there are many variations; you may see EMA or SMA mentioned), which takes the current time step and the n previous steps, averages them, and uses the result as a 'smoothed' value. This gives a better indication of the way the data is actually moving, since one small decrease isn't going to have a dramatic impact on the gradient of the line.
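A minimal sketch of both variants with NumPy, applied to the question's data. The function names, the window of 2, and the smoothing factor of 0.5 are assumptions chosen for illustration (the series has only four points), not values from this answer.

```python
import numpy as np

def simple_moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(values, kernel, mode="valid")

def exponential_moving_average(values, alpha):
    """EMA with smoothing factor `alpha` in (0, 1], seeded with the first value."""
    smoothed = [float(values[0])]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return np.array(smoothed)

units_sold = [1043, 6582, 5452, 7571]
sma = simple_moving_average(units_sold, window=2)
ema = exponential_moving_average(units_sold, alpha=0.5)
print(sma)
print(ema)
```

Both smoothed series rise at every step for this data, so the single drop from week 2 to week 3 no longer shows up as a decrease.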

Depending on the domain of your problem, it may also be worth checking out some indicators used in the financial sector, such as DMI (Directional Movement Indicator) or MACD (Moving Average Convergence Divergence).

Hope this helps

tadge