40

I am using Python 2.7 and matplotlib. I am attempting to reach into my database of ambulance calls and count up the number of calls that happen on each weekday.

I will then use matplotlib to create a bar chart of this information to give the paramedics a visual graphic of how busy they are on each day.

Here is the code that works well:

import pyodbc
import matplotlib.pyplot as plt
MySQLQuery = """
SELECT 
 DATEPART(WEEKDAY, IIU_tDispatch)AS [DayOfWeekOfCall]
, COUNT(DATEPART(WeekDay, IIU_tDispatch)) AS [DispatchesOnThisWeekday]
FROM AmbulanceIncidents
GROUP BY DATEPART(WEEKDAY, IIU_tDispatch)
ORDER BY DATEPART(WEEKDAY, IIU_tDispatch)
"""
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=MyServer;DATABASE=MyDatabase;UID=MyUserID;PWD=MyPassword')
cursor = cnxn.cursor()
cursor.execute(MySQLQuery)

# generate a graph to display the data
data = cursor.fetchall()
DayOfWeekOfCall, DispatchesOnThisWeekday = zip(*data)
plt.bar(DayOfWeekOfCall, DispatchesOnThisWeekday)
plt.grid()
plt.title('Dispatches by Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Number of Dispatches')
plt.show()

The code shown above works very well. It returns a nice looking graph and I am happy. I just want to make one change.

The X axis shows integers instead of the names of the days of the week. In other words, Sunday is 1, Monday is 2, etc., and I would like it to show "Sunday", "Monday", and so on.

My fix for this was to rewrite my SQL query to use DATENAME() instead of DATEPART(). Shown below is my SQL code to return the name of the weekday (as opposed to an integer).

SELECT 
 DATENAME(WEEKDAY, IIU_tDispatch)AS [DayOfWeekOfCall]
, COUNT(DATENAME(WeekDay, IIU_tDispatch)) AS [DispatchesOnThisWeekday]
FROM AmbulanceIncidents
GROUP BY DATENAME(WEEKDAY, IIU_tDispatch)
ORDER BY DATENAME(WEEKDAY, IIU_tDispatch)

Everything else in my python code stays the same. However this will not work and I cannot understand the error messages.

Here are the error messages:

Traceback (most recent call last):
  File "C:\Documents and Settings\kulpandm\workspace\FiscalYearEndReport\CallVolumeByDayOfWeek.py", line 59, in <module>
    plt.bar(DayOfWeekOfCall, DispatchesOnThisWeekday)
  File "C:\Python27\lib\site-packages\matplotlib\pyplot.py", line 2080, in bar
    ret = ax.bar(left, height, width, bottom, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 4740, in bar
    self.add_patch(r)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 1471, in add_patch
    self._update_patch_limits(p)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 1489, in _update_patch_limits
    xys = patch.get_patch_transform().transform(vertices)
  File "C:\Python27\lib\site-packages\matplotlib\patches.py", line 547, in get_patch_transform
    self._update_patch_transform()
  File "C:\Python27\lib\site-packages\matplotlib\patches.py", line 543, in _update_patch_transform
    bbox = transforms.Bbox.from_bounds(x, y, width, height)
  File "C:\Python27\lib\site-packages\matplotlib\transforms.py", line 745, in from_bounds
    return Bbox.from_extents(x0, y0, x0 + width, y0 + height)
TypeError: coercing to Unicode: need string or buffer, float found

To sum up: when the x values are integers representing days of the week and the y values are counts of ambulance incidents, Matplotlib produces a nice graph. But when the x values are strings (Sunday, Monday, etc.), Matplotlib fails with the error above.

How to fix this?

David Kulpanowski

3 Answers

89

Your question has nothing to do with the SQL query; the query is simply a means to an end. What you are really asking is how to change the text labels on a bar chart in pylab. The docs for the bar chart are useful for customizing, but if you simply want to change the labels, here is a minimal working example (MWE):

import pylab as plt

DayOfWeekOfCall = [1,2,3]
DispatchesOnThisWeekday = [77, 32, 42]

LABELS = ["Monday", "Tuesday", "Wednesday"]

plt.bar(DayOfWeekOfCall, DispatchesOnThisWeekday, align='center')
plt.xticks(DayOfWeekOfCall, LABELS)
plt.show()


Hooked
  • 27
    Does anyone else find it weird that a bar chart doesn't accept string labels by default? – Owen Jan 10 '17 at 14:12
  • 6
    @Owen. At this point matplotlib is so weird that I suspect no one really understands why anything happens. – Abhishek Divekar Jul 20 '17 at 13:12
  • @Owen. Luckily seaborn (though built on matplotlib) does not seem to have this problem (https://stackoverflow.com/q/32528154/4900327). – Abhishek Divekar Jul 20 '17 at 13:30
  • This is exactly the scenario that I was looking for. I had an array of length with 20 strings, and other array of same length with integers and was thinking to make relationship between these with int array on y axis and string array on x axis. Thank you. – Sidd Thota Mar 23 '18 at 13:03
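On the point raised in the comments: newer Matplotlib releases (2.1 and later, as far as I know) treat a list of strings passed to `bar` as categories, so string labels may work directly without `xticks`. A minimal sketch, reusing the sample numbers from the answer above; the `Agg` backend is an assumption so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, no display needed
import matplotlib.pyplot as plt

day_names = ["Monday", "Tuesday", "Wednesday"]
dispatches = [77, 32, 42]

# With categorical support, each string becomes one bar position,
# in the order the strings are given.
bars = plt.bar(day_names, dispatches)
plt.ylabel("Number of Dispatches")
# plt.show() would display the chart in an interactive session.
```

Note that the bars appear in the order the rows arrive, so a query that sorts by the day name would give alphabetical order (Friday, Monday, ...) rather than calendar order.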
6

Don't change your SQL code just to alter the illustration. Instead, make a small addition to your Python code.

I believe you can do something like this answer. Set the tick labels to be the days of the week.

It may be as simple as adding the following line:

plt.xticks((1, 2, ..., 7), ('Sunday', 'Monday', ..., 'Saturday'))

Documentation: pyplot.xticks
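Written out in full, that line might look like the sketch below. The counts are made-up placeholder values, the Sunday-first numbering follows the question's description (Sunday is 1), and the `Agg` backend is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, no display needed
import matplotlib.pyplot as plt

positions = list(range(1, 8))  # 1..7, the integers the question's query returns
day_names = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]
counts = [10, 12, 9, 11, 13, 15, 14]  # placeholder values, not real data

plt.bar(positions, counts, align="center")
plt.xticks(positions, day_names)  # put a day name at each integer position
# plt.show() would display the chart in an interactive session.
```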

EDIT: Example in response to comment using a fictional table IncidentTypes that maps integer keys to names of incident types.

cursor.execute("""select incident_type_id, count(*), incident_type
    from Incidents join IncidentTypes using (incident_type_id)
    group by incident_type_id, incident_type""")
results = cursor.fetchall()
tickpositions = [int(r[0]) for r in results]
numincidents = [int(r[1]) for r in results]
ticklabels = [r[2] for r in results]

plt.bar(tickpositions, numincidents)
plt.xticks(tickpositions, ticklabels)
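To try the same idea end to end without a database server, here is a sketch using Python's built-in `sqlite3` module with an in-memory database. The table contents are invented for illustration, and the table and column names follow the fictional schema above.

```python
import sqlite3

# In-memory database with two tiny, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE IncidentTypes (incident_type_id INTEGER PRIMARY KEY,
                                incident_type TEXT);
    CREATE TABLE Incidents (incident_id INTEGER PRIMARY KEY,
                            incident_type_id INTEGER);
    INSERT INTO IncidentTypes VALUES (1, 'Fall'), (2, 'Chest pain');
    INSERT INTO Incidents (incident_type_id) VALUES (1), (2), (2);
""")

# One row per incident type: (key, count, name).
results = conn.execute("""
    SELECT incident_type_id, COUNT(*), incident_type
    FROM Incidents JOIN IncidentTypes USING (incident_type_id)
    GROUP BY incident_type_id, incident_type
    ORDER BY incident_type_id
""").fetchall()
conn.close()

tickpositions = [int(r[0]) for r in results]
numincidents = [int(r[1]) for r in results]
ticklabels = [r[2] for r in results]
```

The three lists can then be passed to `plt.bar` and `plt.xticks` as in the lines above. Because the labels come from the database, nothing has to be hard-coded, however many incident types there are.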
Steve Tjoa
  • This looks like it might be a good answer. I am going to try it out right now. Unfortunately, the next bar graph I need to create is the number of types of incidents ambulances respond to. There are about 60 different types of incidents. I cannot hard code 60 different values for the x axis. It is just too prone to error. – David Kulpanowski Feb 01 '12 at 20:19
  • Continuation from previous post. SPSS and SAS easily create bar charts using nominal values. I have a hard time believing this is so difficult for Matplotlib. There has to be something easy that I am missing! But what is it? – David Kulpanowski Feb 01 '12 at 20:21
  • Re first comment: You could add a SQL table that maps integers to days or integers to incident types. Example: `create table IncidentTypes (pk int primary key auto_increment, Name varchar(20))`. Then just join the tables. This is flexible and modular. You can refer to an incident type either by key (int) or name (in Python). – Steve Tjoa Feb 01 '12 at 20:53
  • Re second comment: Adding the line above is not too much of a hassle. If you see [this example in the docs](http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.bar), they do the same thing. To get the labels (the second argument), you could read them in Python from the proposed SQL table in my previous comment. – Steve Tjoa Feb 01 '12 at 20:56
  • 1
    Thank you very much Steve. As soon as StackOverflow allows me to do so, I will post the final code that works. I got this to work and it creates a nice presentation graphic. Will post the final resulting code for others to see. – David Kulpanowski Feb 01 '12 at 21:57
1

Final completed answer that resolved the issue: Thank you very much Steve. You have been a great help. I studied geography in college, not programming, so this is quite difficult for me. Here is the final code that works for me.

    import pyodbc
    import matplotlib.pyplot as plt
    MySQLQuery = """
    SELECT 
      DATEPART(WEEKDAY, IIU_tDispatch)AS [IntegerOfDayOfWeek]
    , COUNT(DATENAME(WeekDay, IIU_tDispatch)) AS [DispatchesOnThisWeekday]
    , DATENAME(WEEKDAY, IIU_tDispatch)AS [DayOfWeekOfCall]
    FROM IIncidentUnitSummary
    INNER JOIN PUnit ON IIU_kUnit = PUN_Unit_PK
    WHERE PUN_UnitAgency = 'LC'
    AND IIU_tDispatch BETWEEN 'October 1, 2010' AND 'October 1, 2011'
    AND PUN_UnitID LIKE 'M__'
    GROUP BY DATEPART(WEEKDAY, IIU_tDispatch), DATENAME(WEEKDAY, IIU_tDispatch)
    ORDER BY DATEPART(WEEKDAY, IIU_tDispatch)
    """
    cnxn = pyodbc.connect("a bunch of stuff I don't want to share")
    cursor = cnxn.cursor()
    GraphCursor = cnxn.cursor()
    cursor.execute(MySQLQuery)

    results = cursor.fetchall()
    IntegerDayOfWeek, DispatchesOnThisWeekday, DayOfWeekOfCall = zip(*results)
    tickpositions = [int(r[0]) for r in results]
    numincidents = [int(r[1]) for r in results]
    ticklabels = [r[2] for r in results]
    plt.bar(tickpositions, numincidents)
    plt.xticks(tickpositions, ticklabels)
    #plt.bar(DayOfWeekOfCall, DispatchesOnThisWeekday)
    plt.grid()
    plt.title('Dispatches by Day of Week')
    plt.xlabel('Day of Week')
    plt.ylabel('Number of Dispatches')
    plt.show()

    cursor.close()
    cnxn.close()

I don't really understand the four lines of code that follow "results = cursor.fetchall()" and create the arrays. I am glad you do, because I look at them and it still does not sink in. Thank you very much. This helps out a lot. David
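For anyone else puzzled by those lines: `fetchall()` returns a list of rows, one per weekday, and each list comprehension walks that list and picks out one column. A small sketch, with made-up rows standing in for the query result (the numbers are placeholders):

```python
# Stand-in for cursor.fetchall(): one (integer, count, name) row per weekday.
results = [(1, 120, "Sunday"), (2, 95, "Monday"), (3, 101, "Tuesday")]

tickpositions = [int(r[0]) for r in results]  # first column of every row
numincidents = [int(r[1]) for r in results]   # second column of every row
ticklabels = [r[2] for r in results]          # third column of every row

# zip(*results) does the same split in one step, giving tuples instead of lists.
positions, counts, names = zip(*results)
```

So the `zip(*results)` line and the three list comprehensions in the code above compute the same columns twice; either approach alone would be enough.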

David Kulpanowski