
I am working in a Jupyter Notebook, plotting markers at certain latitudes and longitudes on a map (boundary) using geopandas. I have ~40,000 locations, and I need to mark each of them with a color based on a condition.

Screenshot of the GeoPandas dataframe gdf: [image not included]

Code Snippet:

import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
from time import sleep
from tqdm import tqdm

# we have range of values from 0-15000
threshold1 = [8000,'#e60000']
threshold2 = [500,'#de791e']
threshold3 = [200,'#ff00ff']
threshold4 = [0   ,'#00ff0033']

# Create a dictionary of colors based on threshold
color_dict = {}
for x in gdf.n.to_list():
    if x>= threshold1[0]                    : color_dict[x] = threshold1[1]
    if x>= threshold2[0] and x<threshold1[0]: color_dict[x] = threshold2[1]
    if x>= threshold3[0] and x<threshold2[0]: color_dict[x] = threshold3[1] 
    if x<threshold3[0]                      : color_dict[x] = threshold4[1] 
        
        
# Set labels for the legend                
a_patch = mpatches.Patch(color = threshold1[1], 
                         label= str(threshold1[0]) + '-' + str(max(gdf.n.to_list())))

b_patch = mpatches.Patch(color = threshold2[1], 
                         label= str(threshold2[0]) + '-' + str(threshold1[0]))
                         
c_patch = mpatches.Patch(color = threshold3[1], 
                         label= str(threshold3[0]) + '-' + str(threshold2[0]))
                         
d_patch = mpatches.Patch(color = threshold4[1], 
                         label= str(min(gdf.n.to_list())) + '-' + str(threshold3[0]))





ax = gdf.plot(markersize=0 ,figsize = (20,20))
usa.geometry.boundary.plot(color=None,edgecolor='k',linewidth = 0.5, ax = ax)

# There are ~40,000 values to be iterated here
for x, y, label in zip(tqdm(gdf.geometry.x), gdf.geometry.y, gdf.n):
    ax.annotate('X', weight = 'bold', xy=(x, y), xytext=(x, y), fontsize= 8,color = color_dict[label], ha='center')
    sleep(0.1) 

ax.annotate(label, xy=(x, y), xytext=(x, y), fontsize= 8, color = color_dict[label], ha='center')

usa.apply(lambda x: ax.annotate(text = x.NAME, xy=x.geometry.centroid.coords[0], ha='center', fontsize= 2,color='black'),axis=1);


plt.xlim([-130,-60])
plt.ylim([20,55])


plt.legend(handles=[a_patch, b_patch, c_patch, d_patch])


plt.savefig("state.png",pad_inches=0, transparent=False, format = 'png')

I know it is this line which takes the most time:

for x, y, label in zip(tqdm(gdf.geometry.x), gdf.geometry.y, gdf.n):
    ax.annotate('X', weight = 'bold', xy=(x, y), xytext=(x, y), fontsize= 8,color = color_dict[label], ha='center')
    sleep(0.1) 

but I cannot figure out any other way to label each coordinate without looping. Please help me make it faster. 2-3 hours is far too long for my work! Thank you!

Some references that I used for my code:

  1. GeoPandas Label Polygons
  2. https://www.geeksforgeeks.org/matplotlib-axes-axes-annotate-in-python/
Sulphur
  • I'm not sure what kind of plot you are trying to produce with 40K labeled points unless you are making a mural. :). Not to ask the obvious, but why do you have a `sleep()` command in the loop that you complain is slow? 40K * 0.1sec = 1.1hrs right there... – AirSquid Apr 12 '21 at 23:59
  • I am CRAZY! I checked everything but didn't think about the sleep function. My god! Thank you! I do need the 40K point but it's in 3 minutes now. – Sulphur Apr 13 '21 at 00:09
  • I would suggest you to compare the times between a run with 1000 rows, with and without `tqdm` and `sleep`. The mentioned use of sleep adds a lot of time and printing things out can consume time too. – pazitos10 Apr 13 '21 at 00:28
  • Is this one of those secret camera shows? Am I being pranked? ;) – AirSquid Apr 13 '21 at 00:50
  • @pazitos10, yes, the sleep() function was the culprit. @AirSquid -- I don't know about the show you mentioned, but thanks for the help. Code is fixed! – Sulphur Apr 13 '21 at 00:59
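
The arithmetic and the timing comparison suggested in the comments can be sketched roughly as follows. This is only an illustration: the helper names `sleep_overhead_hours` and `time_loop` are made up for this sketch, and the timed loop is an empty stand-in for the annotate loop, run on a small sample so it finishes quickly.

```python
import time

def sleep_overhead_hours(n_points, sleep_seconds):
    """Total time spent inside sleep() over n_points iterations, in hours."""
    return n_points * sleep_seconds / 3600

def time_loop(n_points, sleep_seconds=0.0):
    """Time an otherwise empty loop, optionally sleeping on every iteration."""
    start = time.perf_counter()
    for _ in range(n_points):
        if sleep_seconds:
            time.sleep(sleep_seconds)
    return time.perf_counter() - start

# 40,000 iterations * 0.1 s = 4,000 s spent sleeping, before any plotting work
overhead = sleep_overhead_hours(40_000, 0.1)

# Small sample (hypothetical sizes) comparing the loop with and without sleep
with_sleep = time_loop(20, sleep_seconds=0.01)
without_sleep = time_loop(20)

print(f"sleep overhead: {overhead:.2f} h")
print(f"sample loop with sleep: {with_sleep:.3f} s, without: {without_sleep:.6f} s")
```

The same pattern should carry over to the real loop: run it on the first 1000 rows with and without `sleep` and `tqdm`, then compare the two timings.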

1 Answer


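Thanks to @AirSquid -- it was the sleep function. The sleep(0.1) call inside the loop ran once for each of the ~40,000 points, which alone adds more than an hour. God knows how I was overlooking this. After removing it, the plot takes about 3 minutes. It is solved.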
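
If the remaining runtime still matters, one possible alternative to the per-point loop is a single `ax.scatter` call with an `'x'` marker. This is not what the original code does (it places a text 'X' with `ax.annotate`), so the result will look slightly different. The sketch below assumes random stand-in coordinates and values in place of `gdf.geometry.x`, `gdf.geometry.y`, and `gdf.n`, and it replaces the `color_dict` loop with `np.digitize` over the same thresholds.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors

# Hypothetical stand-in data; in the question these would be
# gdf.geometry.x, gdf.geometry.y and gdf.n.
rng = np.random.default_rng(0)
xs = rng.uniform(-130, -60, size=40_000)
ys = rng.uniform(20, 55, size=40_000)
n = rng.integers(0, 15_001, size=40_000)

# Same thresholds and colors as in the question, lowest class first.
edges = np.array([200, 500, 8000])
palette = mcolors.to_rgba_array(["#00ff0033", "#ff00ff", "#de791e", "#e60000"])

# Class 0: n < 200, class 1: 200 <= n < 500, class 2: 500 <= n < 8000, class 3: n >= 8000
classes = np.digitize(n, edges)
point_colors = palette[classes]  # one RGBA row per point

fig, ax = plt.subplots(figsize=(20, 20))
# A single call draws every marker instead of 40,000 separate annotations.
ax.scatter(xs, ys, c=point_colors, marker="x", s=20)
ax.set_xlim(-130, -60)
ax.set_ylim(20, 55)
fig.savefig("state_scatter.png", format="png")
```

The boundary plot, the state-name annotations, and the legend patches from the question could be drawn on the same `ax` as before; only the per-point loop is replaced here.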

Sulphur