
I have data in a file. It looks like this:

08:00,user1,1   
08:10,user3,2   
08:15,empty,0  
....

How can I plot this as binary data, with hours on the x-axis and users on the y-axis? Each user should get its own marker: for example, user1 drawn as * and user3 drawn as o. The y value is 1 for a user and 0 for empty. The number after the username in the text file is meant to be used in a conditional statement to decide which marker to draw.

Here is a pic of what I want to do.


Mel
Knowledge

1 Answer


You can load the file with np.recfromcsv. The time column has to become datetime objects, so we first define a convtime function and then pass it as a converter when reading your CSV file.

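import datetime
import numpy as np
import matplotlib.pyplot as plt

# Parse an "HH:MM" string into a datetime object
convtime = lambda x: datetime.datetime.strptime(x, "%H:%M")
# converters={0: convtime} parses column 0 with convtime; encoding="utf-8" gives str (not bytes) values on Python 3
all_records = np.recfromcsv("myfilename.csv", names=["time", "user", "val"], converters={0: convtime}, encoding="utf-8")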

Note that since we have given only the time part to datetime, it fills in the date as 1 January 1900. You can attach a relevant date if that matters to you.
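To make the date default concrete, here is a minimal sketch that parses the three sample rows from the question and then attaches a date. It uses the standard-library csv module instead of np.recfromcsv so that it runs on its own. The date in `day` and the in-memory `sample` text are assumptions for illustration, not something from the question.

```python
import csv
import datetime
import io

# Sample rows copied from the question; a real script would open the file instead.
sample = io.StringIO("08:00,user1,1\n08:10,user3,2\n08:15,empty,0\n")

# Hypothetical date, for illustration only; the question does not give one.
day = datetime.date(2015, 6, 1)

records = []
for time_text, user, val in csv.reader(sample):
    parsed = datetime.datetime.strptime(time_text.strip(), "%H:%M")
    # strptime fills in 1900-01-01 when the format has no date fields,
    # so combine the parsed time with the date we actually want.
    stamped = datetime.datetime.combine(day, parsed.time())
    records.append((parsed, stamped, user.strip(), int(val)))

print(records[0][0])  # 1900-01-01 08:00:00
print(records[0][1])  # 2015-06-01 08:00:00
```

If only the time of day matters for the plot, the 1900 default is harmless and this step can be skipped.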

Now, to plot the data. This brings us to a curious limitation: a single plt.scatter call draws all of its points with one marker symbol. To get a different marker per user, we therefore have to use a for loop. First, let us define dicts for the symbol and colour of each user, then plot one record at a time:

symbols = {'user1':'*', 'user3':'o', 'empty':'x'}
colours = {'user1':'blue', 'user3':'red', 'empty':'orange'}
for rec in all_records:
    plt.scatter(rec['time'], rec['val'], c=colours[rec['user']], marker=symbols[rec['user']])

That almost does it. We are still missing the legend. A drawback of this for loop is that labelling the points inside it would create one legend entry for every row in your file. We get around this by creating a custom legend.

import matplotlib.lines as mlines
legend_list = []
for user in symbols.keys():
    legend_list.append(mlines.Line2D([], [], color=colours[user], marker=symbols[user], ls='none', label=user))
plt.legend(loc='upper right', handles=legend_list)
plt.show()

That does it! If your plot appears squished, then use plt.xlim() to adjust limits to your taste.
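As a possible alternative to the row-by-row loop and the custom legend, the points can be grouped by user so that each user gets a single scatter call with a label, and plt.legend then builds one entry per user. The sketch below is an assumption-laden variant, not the method above: the hard-coded `records` list stands in for the loaded file, `parse_time` is a hypothetical helper, the y value follows my reading of the question (1 for a user, 0 for empty) rather than the val column, and the x-limits are arbitrary.

```python
import datetime

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

symbols = {'user1': '*', 'user3': 'o', 'empty': 'x'}
colours = {'user1': 'blue', 'user3': 'red', 'empty': 'orange'}


def parse_time(text):
    """Hypothetical helper: turn "HH:MM" into a datetime (date defaults to 1900-01-01)."""
    return datetime.datetime.strptime(text, "%H:%M")


# Stand-in for the loaded file: the three sample rows from the question.
records = [
    (parse_time("08:00"), "user1", 1),
    (parse_time("08:10"), "user3", 2),
    (parse_time("08:15"), "empty", 0),
]

fig, ax = plt.subplots()
for user, marker in symbols.items():
    times = [rec[0] for rec in records if rec[1] == user]
    if not times:
        continue  # no rows for this user, so no legend entry either
    # Assumed reading of the question: y is 0 for "empty" and 1 for any real user.
    ys = [0 if user == "empty" else 1] * len(times)
    ax.scatter(times, ys, c=colours[user], marker=marker, label=user)

ax.set_yticks([0, 1])
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
# Widen the x-range a little so the points are not squished against the edges.
ax.set_xlim(parse_time("07:45"), parse_time("08:30"))
ax.legend(loc='upper right')
# Call plt.show() or fig.savefig(...) here to view the figure.
```

The trade-off is one pass over the records per user instead of one scatter call per row, which is usually fine for a handful of users.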

VBB