I have the following sorted DataFrame (the numbers are completely random):

In[1]: df
Out[1]:
            Total  Count
Location 1     20      5
Location 2     15      4
Location 3     13      3
...
Location 10     1      1

Each location has a latitude and longitude.

I would like to plot these locations on a map using circles. The radius of the circles needs to correspond to the amount in Total. In other words, Location 1 needs to have the biggest circle, Location 2 a smaller one, etc.

Also, I would like to have a transition in colors. The biggest circle in red, the next one in orange, the next one in yellow, etc.

Lastly, I would like to make an annotation next to each circle.

I managed to draw blue dots on the map, but I don't know how to draw the circles with the corresponding size and color.

This is my code so far:

m = Basemap(resolution='i', projection='merc', llcrnrlat=49.0, urcrnrlat=52.0, llcrnrlon=1., urcrnrlon=8.0, lat_ts=51.0)
m.drawcountries()
m.drawcoastlines()
m.fillcontinents()

for row_index, row in df.iterrows():
    x, y = db.getLocation(row_index)
    lat, lon = m(y, x)
    m.plot(lat, lon, 'b.', alpha=0.5)
    #This draws blue dots.

plt.title('Top 10 Locations')
plt.show()
JNevens

1 Answer

  • The matplotlib scatter function has s and c parameters which would allow you to plot dots of different sizes and colors.

    The Pandas DataFrame.plot method calls the matplotlib scatter function when you specify kind='scatter'. It also passes extra arguments along to the call to scatter so you could use something like

    df.plot(kind='scatter', x='lon', y='lat', s=df['Total']*50, c=df['Total'], cmap=cmap)
    

    to plot your points.

  • Annotating the points can be done with calls to plt.annotate.

  • The gist_rainbow colormap goes from red to orange to yellow ... to violet. gist_rainbow_r is the reversed colormap, which makes red correspond to the largest values.


For example,

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Total': [20,15,13,1],
                   'lat': [40,0,-30,50],
                   'lon': [40,50,60,70], }, 
                  index=['Location {}'.format(i) for i in range(1,5)])

cmap = plt.get_cmap('gist_rainbow_r')
df.plot(kind='scatter', x='lon', y='lat', s=df['Total']*50, c=df['Total'], cmap=cmap)

for idx, row in df.iterrows():
    x, y = row[['lon','lat']]
    plt.annotate(
        str(idx), 
        xy = (x, y), xytext = (-20, 20),
        textcoords = 'offset points', ha = 'right', va = 'bottom',
        bbox = dict(boxstyle = 'round,pad=0.5', fc = 'yellow', alpha = 0.5),
        arrowprops = dict(arrowstyle = '->', connectionstyle = 'arc3,rad=0'))

plt.show()

yields

[Image: the resulting annotated scatter plot]
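One caveat on sizing: matplotlib documents scatter's s as the marker size in points**2, so s=df['Total']*50 makes the area of each circle, not its radius, proportional to Total. If the radius itself should scale linearly with Total, a minimal sketch could look like the following. The helper name sizes_for_radius and the max_radius_pt parameter are made up for illustration and are not part of matplotlib or pandas.

```python
def sizes_for_radius(totals, max_radius_pt=30.0):
    """Return scatter `s` values whose marker radii scale linearly with `totals`.

    `max_radius_pt` (an illustrative parameter) is the radius, in points,
    given to the largest total.
    """
    largest = max(totals)
    sizes = []
    for total in totals:
        radius = max_radius_pt * total / largest
        # scatter's `s` is in points**2: the square of the marker diameter
        sizes.append((2 * radius) ** 2)
    return sizes


totals = [20, 15, 13, 1]
sizes = sizes_for_radius(totals, max_radius_pt=20.0)
# Pass the result as `s`, e.g. df.plot(kind='scatter', ..., s=sizes)
```

Whether area or radius should track the value is a design choice: area scaling tends to keep small values visible, while radius scaling exaggerates the differences between large and small totals.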


Do not call df.plot or plt.scatter once for each dot. That would become terribly slow as the number of dots increases. Instead, collect the requisite data (the coordinates of every location) in the DataFrame so that the dots can be drawn with one call to df.plot:

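xs, ys = [], []
for row_index, row in df.iterrows():
    # db.getLocation is assumed to return (lat, lon), as the m(y, x) call in the question implies
    lat, lon = db.getLocation(row_index)
    # Calling a Basemap instance converts (lon, lat) in degrees to projected map (x, y)
    x, y = m(lon, lat)
    xs.append(x)
    ys.append(y)
    # Annotate in the same projected coordinates that the dots are drawn in
    plt.annotate(
        str(row_index), 
        xy = (x, y), xytext = (-20, 20),
        textcoords = 'offset points', ha = 'right', va = 'bottom',
        bbox = dict(boxstyle = 'round,pad=0.5', fc = 'yellow', alpha = 0.5),
        arrowprops = dict(arrowstyle = '->', connectionstyle = 'arc3,rad=0'))

# Store the projected coordinates so that all dots are drawn with a single call
df['x'] = xs
df['y'] = ys
cmap = plt.get_cmap('gist_rainbow_r')
ax = plt.gca()  # reuse the axes the map was drawn on instead of opening a new figure
df.plot(kind='scatter', x='x', y='y', s=df['Total']*50, c=df['Total'], 
        cmap=cmap, ax=ax)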
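The same collect-then-plot-once pattern can be tried without Basemap or a database. The sketch below is only an illustration under stated assumptions: FAKE_LOCATIONS stands in for the database lookup, identity_projection stands in for the Basemap instance m, and collect_points and plot_points are hypothetical helper names.

```python
def collect_points(labels, get_location, project):
    """Look up and project every label once, returning parallel x and y lists."""
    xs, ys = [], []
    for label in labels:
        lat, lon = get_location(label)  # hypothetical lookup returning (lat, lon)
        x, y = project(lon, lat)        # same call order as a Basemap instance: (lon, lat)
        xs.append(x)
        ys.append(y)
    return xs, ys


def plot_points(labels, totals, xs, ys):
    """Draw all points with a single scatter call, then label each one."""
    # Imported here so that collect_points has no plotting dependency
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    points = ax.scatter(xs, ys, s=[t * 50 for t in totals], c=totals,
                        cmap=plt.get_cmap('gist_rainbow_r'))
    fig.colorbar(points, ax=ax, label='Total')
    for label, x, y in zip(labels, xs, ys):
        ax.annotate(str(label), xy=(x, y), xytext=(5, 5),
                    textcoords='offset points')
    return ax


# Stand-in data: a dict instead of the database, with made-up coordinates
FAKE_LOCATIONS = {
    'Location 1': (50.8, 4.3),
    'Location 2': (51.2, 4.4),
    'Location 3': (50.6, 5.6),
}


def identity_projection(lon, lat):
    """Placeholder for the map projection: returns the degrees unchanged."""
    return lon, lat


labels = list(FAKE_LOCATIONS)
xs, ys = collect_points(labels, FAKE_LOCATIONS.get, identity_projection)
```

Calling plot_points(labels, [20, 15, 13], xs, ys) followed by plt.show() would then draw the three dots in one scatter call. Keeping the lookup loop separate from the drawing makes it easy to swap in the real database call and the real Basemap instance.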
unutbu
  • Very nice and detailed answer! I have only one more question. I don't have access to the latitude and longitude outside of the `df.iterrows()` loop. I have to fetch these from a database (as you can maybe see from my example code: `x, y = db.getLocation(row_index)`). Can I just use `plt.scatter` in every iteration of the loop to get the same result? – JNevens Apr 09 '15 at 13:05
  • How can I arrange for the colors to use the cmap in this case? – JNevens Apr 09 '15 at 13:13
  • Do not call `plt.scatter` with each iteration of the loop. Doing so for hundreds of points could make the script very slow. Instead collect the `lon` and `lat`s in lists or the DataFrame, so the points can be drawn with one call to `df.plot` or `plt.scatter`. I've edited the post above to show one such way. – unutbu Apr 09 '15 at 13:26
  • I almost got it to work. The problem now is, it plots 2 figures. Once the `Basemap` with the annotations and once the DataFrame. – JNevens Apr 09 '15 at 13:34
  • Oops, my mistake. `df.plot` creates a new axes by default, but if you pass an existing axes object to it through the `ax` parameter, then that axes is used. So `ax = plt.gca()` (gets the current axes) and `df.plot(..., ax=ax)` is the fix. (I've corrected it above.) – unutbu Apr 09 '15 at 13:36