This is my first time asking a question here, so sorry if I'm doing anything wrong. I have this data in a pandas DataFrame:
     Year  Month  PassengerCountSum        Date  DateOrd   Prediction
0    2006      9        2720100.000  2006-09-01   732555  2815063.471
1    2007      5        3056934.000  2007-05-01   732797  2908360.055
2    2012      2        2998119.000  2012-02-01   734534  3578013.633
3    2008      4        3029021.000  2008-04-01   733133  3037895.807
4    2006     10        2834959.000  2006-10-01   732585  2826629.163
...   ...    ...                ...         ...      ...          ...
124  2007      7        3382382.000  2007-07-01   732858  2931876.962
125  2009      6        3419595.000  2009-06-01   733559  3202128.637
126  2012      9        3819379.000  2012-09-01   734747  3660130.047
127  2013     10        3910790.000  2013-10-01   735142  3812411.661
128  2011      6        3766323.000  2011-06-01   734289  3483560.480
I need to make a graph with Date on the x-axis and PassengerCountSum on the y-axis. I also need to show the Prediction values as a linear regression line.
There is no problem when I do this:
plt.plot(df_pass_by_year_pd['Date'] , df_pass_by_year_pd['Prediction'])
It draws a perfect linear regression line.
But when I replace df_pass_by_year_pd['Prediction'] with df_pass_by_year_pd['PassengerCountSum'] to show the real values of the DataFrame, like this:
plt.plot(df_pass_by_year_pd['Date'] , df_pass_by_year_pd['PassengerCountSum'])
The graph goes crazy and draws things I don't really understand.
Does anyone see the problem? Thanks, all!
I have tried changing the type of the column and reshaping the array, but I'm pretty new to all of this, so any help or tips are welcome.
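
One possible explanation, offered as a sketch rather than a confirmed diagnosis: plt.plot joins points in the order the rows appear, and the rows in the table above are not in date order (2006, 2007, 2012, 2008, ...). The Prediction values all lie on one straight line, so the order does not show there, but the real values would zig-zag. The example below rebuilds the first five rows shown above and sorts them by Date before plotting. It assumes Date can be parsed with pd.to_datetime; the name df_sorted and the axis labels are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# First five rows from the table above, kept in the same unsorted order.
df_pass_by_year_pd = pd.DataFrame({
    "Date": pd.to_datetime(["2006-09-01", "2007-05-01", "2012-02-01",
                            "2008-04-01", "2006-10-01"]),
    "PassengerCountSum": [2720100.0, 3056934.0, 2998119.0,
                          3029021.0, 2834959.0],
    "Prediction": [2815063.471, 2908360.055, 3578013.633,
                   3037895.807, 2826629.163],
})

# plt.plot connects points in row order, so sort by Date first.
# df_sorted is a hypothetical name used only in this sketch.
df_sorted = df_pass_by_year_pd.sort_values("Date")

fig, ax = plt.subplots()
# Real values, with markers so the individual months stay visible.
ax.plot(df_sorted["Date"], df_sorted["PassengerCountSum"],
        marker="o", label="PassengerCountSum")
# Regression values drawn on the same axes.
ax.plot(df_sorted["Date"], df_sorted["Prediction"], label="Prediction")
ax.set_xlabel("Date")
ax.set_ylabel("Passengers")
ax.legend()
```

If the real values should appear as separate points instead of a connected line, ax.scatter (or ax.plot with linestyle="none") does not depend on row order, so sorting would not be needed for that series.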