
I have a log which records my home ADSL speeds. Log entries are in the following format, where the fields are datetime;level;downspeed;upspeed;loss;testhost:

2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it

(A full sample file to use with the code below can be downloaded from https://www.dropbox.com/s/tfmj9ozxe5millx/test.log?dl=0)

I wish to plot a matplotlib figure with the download speeds on the left axis, the upload speeds (which span a smaller and lower range of values) on the right axis, and the shortened datetimes under the x tick marks, possibly at a 45-degree angle.

"""Plots the adsl-log generated log."""
import matplotlib.pyplot as plt
# import matplotlib.dates as mdates
import pandas as pd

# set field delimiter and set column names which will also cause reading from row 1
data = pd.read_csv("test.log", sep=';', names=[
                   'datetime', 'severity', 'down', 'up', 'loss', 'server'])

#  we need to filter out ERROR records (with 0 speeds)
indexNames = data[data['severity'] == 'ERROR'].index
data.drop(indexNames, inplace=True)

# convert datetime pandas object to datetime64
data['datetime'] = pd.to_datetime(data['datetime'])

# use a dataframe with just the data I need; cleaner
speeds_df = data[['datetime', 'down', 'up']]
speeds_df.info() # this shows datetime column is really a datetime64 value now
# now let's plot
fig, ax = plt.subplots()
y1 = speeds_df.plot(ax=ax, x='datetime', y='down', grid=True, label="DL", legend=True, linewidth=2,ylim=(100,225))
y2 = speeds_df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label="UL", legend=True, linewidth=2, ylim=(100,225))

plt.show()

I am now obtaining the plot I need but would appreciate some clarification about the roles of the ax, y1 and y2 axes in the above code.

Trenton McKinney
Robert Alexander
  • Stroke of luck ;) I managed to get almost there by adding a fig, ax = plt.subplots() line before my plots and using the ax=ax parameter in both. I am still confused about the semantics of ax and of the y1, y2 generated by my code. – Robert Alexander Jan 09 '20 at 14:53

2 Answers


First, assigning y1 and y2 objects is unnecessary as you will never use them later on. Also, legend=True is the default.

Therefore, you are first initializing an axes object (plt.subplots defaults to nrows=1 and ncols=1, which returns a single Axes rather than an array) and then passing it to the pandas plots. Normally, plotting twice with ax=ax would draw both lines against the same y-axis, but since you employ a secondary y-axis, pandas adds a second y-scale on the right and the two plots overlay each other:

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(figsize=(8,4))

# ASSIGN AXES OBJECTS ACCORDINGLY
speeds_df.plot(ax=axs, x='datetime', y='down', grid=True, label="DL", linewidth=2, ylim=(100,225))
speeds_df.plot(ax=axs, x='datetime', y='up', secondary_y=True, label="UL", linewidth=2, ylim=(100,225))

plt.show()

Single Plot
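
To make the roles of `ax`, `y1` and `y2` from the question concrete, here is a minimal sketch. It assumes a small made-up DataFrame in place of the real log (only the three sample rows from the question) and the non-interactive Agg backend. `right_ax` is an attribute that pandas attaches to the original axes when `secondary_y=True` is used.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the parsed log (the three sample rows from the question)
speeds_df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2020-01-06 18:09:45", "2020-01-06 18:14:39", "2020-01-08 10:51:27"]),
    "down": [211.5, 209.9, 211.6],
    "up": [29.1, 28.1, 29.4],
})

fig, ax = plt.subplots()
y1 = speeds_df.plot(ax=ax, x="datetime", y="down", label="DL")
y2 = speeds_df.plot(ax=ax, x="datetime", y="up", secondary_y=True, label="UL")

same_left = y1 is ax          # the first call hands back the axes it was given
is_right = y2 is ax.right_ax  # the second call hands back the twin right-hand axes
print(same_left, is_right)
```

In other words, `ax` and `y1` are the same left-hand axes, while `y2` is a separate axes object that shares the x-axis and carries the right-hand scale.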


To illustrate how axes objects can be extended, see below with multiple (non-overlaid) plots.

Example of multiple subplots using nrows=2:

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(nrows=2, figsize=(8,4))

# ASSIGN AXES OBJECTS WITH INDEXING AND NO Y LIMITS
speeds_df.plot(ax=axs[0], x='datetime', y='down', grid=True, label="DL", linewidth=2)
plt.subplots_adjust(hspace = 1)
speeds_df.plot(ax=axs[1], x='datetime', y='up', label="UL", linewidth=2)

plt.show()

two row subplots


Example of multiple plots using ncols=2:

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(ncols=2, figsize=(12,4))

# ASSIGN AXES OBJECTS WITH INDEXING AND NO Y LIMITS
speeds_df.plot(ax=axs[0], x='datetime', y='down', grid=True, label="DL", linewidth=2)
speeds_df.plot(ax=axs[1], x='datetime', y='up', label="UL", linewidth=2)

plt.show()

two column subplots


You can even use subplots=True after setting date/time field as index:

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(figsize=(8,4))

# ASSIGN AXES OBJECT PLOTTING ALL COLUMNS
speeds_df.set_index('datetime').plot(ax=axs, subplots=True, grid=True, label="DL", linewidth=2)

plt.show()

Pandas subplots output

Parfait
  • Thanks. What I do not understand in your first code example is the use of the df.plot ylim parameters. They seem to define the left ax limits but how did you specify the right values as you're using the same numbers on both plots? – Robert Alexander Jan 09 '20 at 17:08
  • Both Pandas' `subplots=True` (as shown at bottom) and `secondary_y=True` are convenience shorthands, likely meant for simpler plots, while matplotlib has its own way of doing both. See [dual axes](https://stackoverflow.com/questions/14762181/adding-a-y-axis-label-to-secondary-y-axis-in-matplotlib). As you can see, the pandas `plot` call does not let you specify the y-limits of the secondary axis, so you must set them manually. – Parfait Jan 09 '20 at 17:52
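
The matplotlib route mentioned in the comment above can be sketched roughly as follows, using `Axes.twinx()` so that each y-axis gets its own limits. The DataFrame is a made-up stand-in for the parsed log, and the limits and colors are illustrative assumptions, not values taken from the answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the parsed log (the three sample rows from the question)
speeds_df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2020-01-06 18:09:45", "2020-01-06 18:14:39", "2020-01-08 10:51:27"]),
    "down": [211.5, 209.9, 211.6],
    "up": [29.1, 28.1, 29.4],
})

fig, ax = plt.subplots(figsize=(8, 4))
ax2 = ax.twinx()  # second y-axis that shares the x-axis with ax

ax.plot(speeds_df["datetime"], speeds_df["down"], color="tab:red", label="DL")
ax2.plot(speeds_df["datetime"], speeds_df["up"], color="tab:blue", label="UL")

# Each axes object owns its own y-limits and y-label
ax.set_ylim(100, 225)
ax2.set_ylim(15, 50)
ax.set_ylabel("DL")
ax2.set_ylabel("UL")
```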

So thanks to @Parfait I hope I understood things correctly. Here is the working code:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
###### Prepare the data to plot
# set field delimiter and set column names which will also cause reading from row 1
data = pd.read_csv('test.log', sep=';', names=[
                   'datetime', 'severity', 'down', 'up', 'loss', 'server'])
#  we need to filter out ERROR records (with 0 speeds)
indexNames = data[data['severity'] == 'ERROR'].index
data.drop(indexNames, inplace=True)
# convert datetime pandas object to datetime64
data['datetime'] = pd.to_datetime(data['datetime'])
# use a dataframe with just the data I need; cleaner
speeds_df = data[['datetime', 'down', 'up']]

# now plot the graph
fig, ax = plt.subplots()

color = 'tab:green'
ax.set_xlabel('thislabeldoesnotworkbutcolordoes', color=color)
ax.tick_params(axis='x', labelcolor=color)

color = 'tab:red'
speeds_df.plot(ax=ax, x='datetime', y='down', label="DL", legend=True, linewidth=2, color=color)
ax.set_ylabel('DL', color=color)
ax.tick_params(axis='y', labelcolor=color)

color = 'tab:blue'
ax2 = speeds_df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label="UL", legend=True, linewidth=2, color=color)
ax2.set_ylabel('UL', color=color)
ax2.tick_params(axis='y', labelcolor=color)
# using ylim in the plot command params does not work the same
# cannot show a grid since the two scales are different
ax.set_ylim(10, 225)
ax2.set_ylim(15, 50)

plt.show()

Which gives: output of code above

What I still don't get is: a) why the x-axis label only seems to honour the color but not the string value, and b) why the ylim=(n,m) parameter in the df plot does not work well, so that I have to use the ax.set_ylim calls instead.

Robert Alexander
  • For (a) your pandas `plot` overwrites the x label, `thislabeldoesnotworkbutcolordoes`, so move the `set_xlabel` call after the pandas plotting and it should render. For (b) pandas likely does not have a feature to select secondary y-axis limits. – Parfait Jan 09 '20 at 17:57
  • @Parfait thank you for the clarity you've helped me achieve. That was the case and now all works as expected. – Robert Alexander Jan 09 '20 at 19:13
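
Putting the comments above together, a minimal sketch might look like the following: the x label and both y-limits are set after the pandas plotting calls, and the tick labels are shortened and rotated as the question asked. The DataFrame is a made-up stand-in for the parsed log, and the label text and date format string are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the parsed log (the three sample rows from the question)
speeds_df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2020-01-06 18:09:45", "2020-01-06 18:14:39", "2020-01-08 10:51:27"]),
    "down": [211.5, 209.9, 211.6],
    "up": [29.1, 28.1, 29.4],
})

fig, ax = plt.subplots()
speeds_df.plot(ax=ax, x="datetime", y="down", label="DL", color="tab:red")
ax2 = speeds_df.plot(ax=ax, x="datetime", y="up", secondary_y=True,
                     label="UL", color="tab:blue")

# Set the label and limits after plotting so pandas cannot overwrite them
ax.set_xlabel("Measurement time", color="tab:green")  # hypothetical label text
ax.set_ylim(10, 225)
ax2.set_ylim(15, 50)

# Shortened datetimes under the ticks, rotated by 45 degrees
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d/%m %H:%M"))
fig.autofmt_xdate(rotation=45)
```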