5

I have this DataFrame:

      CET    MaxTemp  MeanTemp MinTemp  MaxHumidity  MeanHumidity  MinHumidity  revenue     events
0  2016-11-17   11      9        7            100           85             63   385.943800    rain
1  2016-11-18   9       6        3             93           83             66  1074.160340    storm
2  2016-11-19   8       6        4             93           87             76  2980.857860    
3  2016-11-20   10      7        4             93           84             81  1919.723960    rain-thunderstorm
4  2016-11-21   14     10        7            100           89             77   884.279340
5  2016-11-22   13     10        7             93           79             63   869.071070
6  2016-11-23   11      8        5            100           91             82   760.289260    fog-rain
7  2016-11-24   9       7        4             93           80             66  2481.689270
8  2016-11-25   7       4        1             87           74             57  2745.990070
9  2016-11-26   7       3       -1            100           88             61  2273.413250    rain 
10 2016-11-27  10       7        4            100           81             66  2630.414900    fog

Where:

CET                  object
Mean TemperatureC     int64
Mean Humidity         int64
Events               object
revenue              object
dtype: object

I want to plot all the columns against each other, to see how they vary over time. So the x-axis will be the CET column and the y-axis will show the rest of the columns. How can I do that? I used:

plt.figure();
df.plot(kind='line')
plt.xticks(rotation='vertical')
plt.yticks()
pylab.show()

but I can only see Mean TemperatureC and Mean Humidity. Moreover, the x-axis shows the row number instead of the CET date values.

joasa
  • [This](https://plot.ly/python/multiple-axes/#multiple-yaxes) will surely help you on creating multiple y-axes. – jack jay Jan 04 '17 at 14:18
  • see Multiple y-axes portion on the link page. – jack jay Jan 04 '17 at 15:09
  • I want to use the df columns as x and y-axis (instead of x=[1, 2, 3], y=[40, 50, 60] ) but it gives me a key error, why is that? – joasa Jan 04 '17 at 15:11
  • Simply put, x must be a list of the CET values and y must be a list of the corresponding values from another column. – jack jay Jan 04 '17 at 15:23
  • How can I state x as a list of CET values? – joasa Jan 06 '17 at 09:32
  • Make a list of the CET values shown in your df, then assign it to x. – jack jay Jan 06 '17 at 09:34
  • You mean to make manually a list for all my current CET values? Every day the dataframe will change, so it is not very handy to do it for every single date and having to change it every day. – joasa Jan 06 '17 at 09:42

4 Answers

7

As far as I remember, `df.plot` uses the index for the x values. Try:

df.set_index('CET').plot()

And you should make sure that all your columns have a numeric datatype.
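
A minimal sketch of that conversion, assuming a small hand-made sample shaped like the question's data (a subset of its columns, with every value stored as a string). It uses `pd.to_numeric` with `errors="coerce"`, which is a swapped-in alternative to `astype(float)`: unparseable entries become NaN instead of raising. Parsing CET with `pd.to_datetime` is likewise an assumption, not something the question does.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a plot window
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question; everything starts out as text.
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MeanTemp": ["9", "6", "6", "7"],
    "MeanHumidity": ["85", "83", "87", "84"],
    "revenue": ["385.943800", "1074.160340", "2980.857860", "1919.723960"],
    "events": ["rain", "storm", None, "rain-thunderstorm"],
})

# Parse the dates and move them into the index so they become the x-axis.
df["CET"] = pd.to_datetime(df["CET"])
df = df.set_index("CET")

# Convert only the columns that are meant to be numeric; 'events' stays text.
num_cols = ["MeanTemp", "MeanHumidity", "revenue"]
df[num_cols] = df[num_cols].apply(pd.to_numeric, errors="coerce")

# Plot one line per numeric column, with the tick labels rotated.
ax = df[num_cols].plot(kind="line", rot=90)
plt.tight_layout()
```

Leaving `events` out of `num_cols` is what avoids the "could not convert string to float" error.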

Edit:

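import matplotlib.pyplot as plt

# Move the dates into the index so they are used for the x-axis.
df = df.set_index('CET')
# Convert only the numeric columns; 'events' holds text and is left out.
num_cols = ['MaxTemp',
            'MeanTemp',
            'MinTemp',
            'MaxHumidity',
            'MeanHumidity',
            'MinHumidity',
            'revenue']
df[num_cols] = df[num_cols].astype(float)
df[num_cols].plot()
# Put one tick at every row position and label it with its date.
plt.xticks(range(len(df.index)), df.index)
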
Jan Zeiseweis
  • How can I show all values of 'CET' column on x-axis? It only shows a few as xticks, but not all of them – joasa Jan 04 '17 at 14:28
  • And also, since all of my columns have to be numeric, how can I transform a type "object" to "int64"? I tried `df = df.convert_objects(convert_numeric = True)`, but the "Events" column remains as object – joasa Jan 04 '17 at 14:30
  • There is no 'events' in your example data. But I updated my answer. – Jan Zeiseweis Jan 04 '17 at 14:34
  • You're right, I updated the question adding the Events column. I tried your suggestion, but I get this error `ValueError: could not convert string to float: 'fog-rain' ` – joasa Jan 04 '17 at 14:38
  • Of course you can't convert a string to a number. Which number would represent 'fog'? ;-) So when you convert the values, make sure you only select the numeric columns. – Jan Zeiseweis Jan 04 '17 at 14:46
  • Is there a way to be able to represent how the revenue changes over time, depending on the event? – joasa Jan 04 '17 at 14:47
  • You could add markers (vertical lines) where an event occurred. In this post someone had a similar problem: http://stackoverflow.com/questions/21638727/marking-event-points-on-a-pandas-plot – Jan Zeiseweis Jan 04 '17 at 14:51
  • You should also consider adding multiple y-axes, since the range of the y-values differs a lot between columns (MinTemp [-1, 7], revenue 100+). – Jan Zeiseweis Jan 04 '17 at 14:55
  • In [this](https://plot.ly/python/multiple-axes/#multiple-yaxes) example, for trace1 they use `x=[1, 2, 3], y=[40, 50, 60]`. I want to use `x=df['CET'], y = df['Mean TemperatureC'] `, but it hits a KeyError. Do you know why? – joasa Jan 04 '17 at 15:08
  • Because with `df.set_index('CET')` you move the CET column from the columns to the index. You can also use `df.plot(x='CET', y='Mean TemperatureC')` before setting CET as the index. You can find a lot of example plots here: http://pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html – Jan Zeiseweis Jan 04 '17 at 15:53
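
The suggestions in the comments above (naming CET as `x` before it becomes the index, a second y-axis for revenue, and vertical markers for events) could be combined roughly as follows. This is only a sketch on a hand-made sample copied from the question; the grey dashed marker style is an arbitrary choice. It also assumes CET is kept as plain strings, in which case pandas draws the rows at positions 0, 1, 2, ... so a marker for row `i` goes at x = `i`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a plot window
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20", "2016-11-21"],
    "MeanTemp": [9, 6, 6, 7, 10],
    "MeanHumidity": [85, 83, 87, 84, 89],
    "revenue": [385.9438, 1074.16034, 2980.85786, 1919.72396, 884.27934],
    "events": ["rain", "storm", None, "rain-thunderstorm", None],
})

# CET is still a regular column, so it can be passed as x by name.
# revenue is drawn against a second y-axis on the right because its
# scale is much larger than temperature or humidity.
ax = df.plot(
    x="CET",
    y=["MeanTemp", "MeanHumidity", "revenue"],
    secondary_y=["revenue"],
    rot=90,
)
ax.set_ylabel("temperature / humidity")
ax.right_ax.set_ylabel("revenue")

# Mark every row that has an event with a dashed vertical line.
for pos, event in enumerate(df["events"]):
    if pd.notna(event):
        ax.axvline(pos, color="grey", linestyle="--", linewidth=0.8)

plt.tight_layout()
```

Because the DataFrame itself is passed to `plot`, nothing has to be listed by hand when new dates arrive.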
1

The pandas plotting routines like plot.line or plot.scatter can take the column names for x and y arguments:

E.g.

>>> lines = df.plot.line(x='pig', y='horse')
>>> ax1 = df.plot.scatter(x='length',
...                       y='width',
...                       c='DarkBlue')
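
Applied to the columns from the question, the same calls might look like the sketch below. The sample rows are copied from the question; picking the three temperature columns for the line plot and MeanTemp against revenue for the scatter plot is just an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a plot window
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MaxTemp": [11, 9, 8, 10],
    "MeanTemp": [9, 6, 6, 7],
    "MinTemp": [7, 3, 4, 4],
    "revenue": [385.9438, 1074.16034, 2980.85786, 1919.72396],
})

# y accepts a list of column names, giving one line per column over CET.
lines = df.plot.line(x="CET", y=["MaxTemp", "MeanTemp", "MinTemp"], rot=90)

# A scatter plot shows one column directly against another.
scatter = df.plot.scatter(x="MeanTemp", y="revenue", c="DarkBlue")
```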
shaneb
0

To plot the columns against each other, you could use a pairplot.

noobie
0

Check out seaborn's `pairplot` and pandas' `scatter_matrix` (in `pandas.plotting`).
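
A minimal sketch of the pandas option, assuming a hand-made sample copied from the question and an arbitrary choice of three numeric columns. seaborn's `pairplot` produces a similar grid but is not shown here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a plot window
import pandas as pd
from pandas.plotting import scatter_matrix

# Sample rows copied from the question.
df = pd.DataFrame({
    "MeanTemp": [9, 6, 6, 7, 10, 10],
    "MeanHumidity": [85, 83, 87, 84, 89, 79],
    "revenue": [385.9438, 1074.16034, 2980.85786,
                1919.72396, 884.27934, 869.07107],
})

# One scatter plot per pair of columns, with a histogram of each
# column on the diagonal. The result is a 2-D array of Axes.
axes = scatter_matrix(df, diagonal="hist", figsize=(6, 6))
```

This shows the columns against each other rather than over time, so the CET column is left out.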

roadrunner66