I have a list (coordpairs) that I am trying to use as the basis for plotting with LineCollection. The list is derived from a Pandas data frame. I am having trouble getting the list into the right format, despite what is admittedly a clear error message. The trimmed data frame contents, code, and error are below. Thank you for any help.

Part of the Data Frame

RUP_ID  Vert_ID Longitude   Latitude
1   1   -116.316961 34.750178
1   2   -116.316819 34.750006
2   1   -116.316752 34.749938
2   2   -116.31662  34.749787
10  1   -116.317165 34.754078
10  2   -116.317277 34.751492
10  3   -116.317206 34.751273
10  4   -116.317009 34.75074
10  5   -116.316799 34.750489
11  1   -116.316044 34.760377
11  2   -116.317105 34.755674
11  3   -116.317165 34.754078

Code

import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
fig = plt.figure()
ax1 = plt.subplot2grid((2, 2), (0, 0), rowspan=2, colspan=1)
for ii in range(1,len(mydf)):
    temp = mydf.loc[mydf.RUP_ID == ii]
    df_line = temp.sort_values(by='Vert_ID', ascending=True)
    del temp
    lat = df_line.Latitude
    lon = df_line.Longitude
    lat = lat.tolist()
    lon = lon.tolist()
    coordpairs = zip(lat, lon)
    lc = LineCollection(coordpairs, colors='r') # this is line 112 in the error
    ax1.add_collection(lc)

# note I also tried:
# import numpy as np
# coordpairs2 = np.vstack([np.array(u) for u in set([tuple(p) for p in coordpairs])])
# lc = LineCollection(coordpairs2, colors='r')
# and received the same plotting error

Error/Outputs

C:\apath\python.exe C:/mypath/myscript.py
Traceback (most recent call last):
  File "C:/mypath/myscript.py", line 112, in <module>
    lc = LineCollection(coordpairs, colors='r')  # this is line 112 in the error
  File "C:\apath\lib\site-packages\matplotlib\collections.py", line 1149, in __init__
    self.set_segments(segments)
  File "C:\apath\lib\site-packages\matplotlib\collections.py", line 1164, in set_segments
    self._paths = [mpath.Path(_seg) for _seg in _segments]
  File "C:\apath\lib\site-packages\matplotlib\path.py", line 141, in __init__
    raise ValueError(msg)
ValueError: 'vertices' must be a 2D list or array with shape Nx2

Process finished with exit code 1
username
  • Python errors are helpful. You claim you have a list, but `zip` returns a `zip` iterator in Python 3. That's essentially what the error tells you. `list(zip(...))` should work. – ImportanceOfBeingErnest Mar 18 '19 at 19:03
  • Oops, should have tagged as Python2.7 ... I tried `coordpairs = list(zip(lat, long))` and still received same error... is zip behavior different in 2.7? – username Mar 18 '19 at 19:16
  • Ok, in that case we need a [mcve]. – ImportanceOfBeingErnest Mar 18 '19 at 19:55
  • thank you @ImportanceOfBeingErnest; I updated the question based on the guidelines in the link, but please let me know if I have missed something – username Mar 18 '19 at 20:51
  • I see. lon and lat are just 1D vectors. May I ask, what is the purpose of the linecollection for such 1D vector? Why not use a simple `plot`? – ImportanceOfBeingErnest Mar 18 '19 at 23:16
  • Yes, `plot` is much easier, but I wanted to see if `LineCollection` was faster as there are actually a few thousand lines. I think you @ImportanceOfBeingErnest previously suggested I try it in the comments [here](https://stackoverflow.com/questions/55191212/matplotlib-duplicate-subplot-in-multiple-figures-without-redrawing-for-each-figu?noredirect=1#comment97122331_55191212) – username Mar 18 '19 at 23:21
  • Yes `LineCollection` is faster with many lines. I was confused by the line collection being inside the loop (that does not make much sense, because it would create many linecollections). – ImportanceOfBeingErnest Mar 18 '19 at 23:43
  • Yeah, there are actually colors assigned to the lines from a dictionary so that was why I had it in a loop (but maybe that is wrong). I removed that part of the code because I thought it violated the 'minimal' part of the guidelines, but I see now it probably matters. Thanks for your help, your answer works based on the code I provided (without the color dictionary) – username Mar 19 '19 at 00:00

1 Answer

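You would want to create one single LineCollection containing several lines, one per RUP_ID value from the first dataframe column. LineCollection expects a sequence of segments, where each segment is itself an Nx2 sequence of (x, y) points; a flat list of coordinate pairs makes each pair look like a one-dimensional segment, which is what triggers the ValueError. So loop over the unique values of that column (not over every row!), append each line's coordinates to a list, and use that list as the input to LineCollection.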

u = """RUP_ID  Vert_ID Longitude   Latitude
1   1   -116.316961 34.750178
1   2   -116.316819 34.750006
2   1   -116.316752 34.749938
2   2   -116.31662  34.749787
10  1   -116.317165 34.754078
10  2   -116.317277 34.751492
10  3   -116.317206 34.751273
10  4   -116.317009 34.75074
10  5   -116.316799 34.750489
11  1   -116.316044 34.760377
11  2   -116.317105 34.755674
11  3   -116.317165 34.754078"""

import io
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

df = pd.read_csv(io.StringIO(u), sep=r"\s+")  # raw string avoids an invalid-escape warning

verts = []
for (RUP_ID, grp) in df.groupby("RUP_ID"):

    df_line = grp.sort_values(by='Vert_ID', ascending=True)

    lat = df_line.Latitude
    lon = df_line.Longitude

    verts.append(list(zip(lon, lat)))

lc = LineCollection(verts, color='r')

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.autoscale()
plt.show()
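
To see why the call in the question fails while the list of lists above works, it may help to compare the shapes involved. The following is only a small sketch using the two vertices of RUP_ID 1; the variable names `flat_shapes` and `wrapped_shapes` are made up for illustration.

```python
import numpy as np
from matplotlib.collections import LineCollection

# Two vertices of a single line as (lon, lat) pairs (RUP_ID 1 from the data).
coordpairs = [(-116.316961, 34.750178), (-116.316819, 34.750006)]

# Passed directly, every pair is treated as its own segment,
# and each such "segment" is one-dimensional.
flat_shapes = [np.asarray(seg).shape for seg in coordpairs]

# Wrapped in an outer list, there is one segment with N rows and 2 columns.
wrapped_shapes = [np.asarray(seg).shape for seg in [coordpairs]]

print(flat_shapes)     # shapes seen when coordpairs is passed as-is
print(wrapped_shapes)  # shapes seen when [coordpairs] is passed

# One segment of shape (N, 2) is accepted.
lc = LineCollection([coordpairs], colors='r')
```

This is why `list(zip(...))` alone did not help in the question: the pairs were already a list there, but the outer "list of lines" level was missing.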
ImportanceOfBeingErnest
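
Regarding the per-line colors from a dictionary mentioned in the comments: a loop that creates many collections should not be needed for that, since `LineCollection` accepts a sequence of colors, one per line. The sketch below assumes a hypothetical dictionary `color_by_id` mapping RUP_ID to a color and a hypothetical fallback `default_color`; neither comes from the original post, and the data is a shortened copy of the sample above.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

data = """RUP_ID Vert_ID Longitude Latitude
1 1 -116.316961 34.750178
1 2 -116.316819 34.750006
2 1 -116.316752 34.749938
2 2 -116.31662 34.749787
10 1 -116.317165 34.754078
10 2 -116.317277 34.751492"""
df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# Hypothetical color lookup (an assumption, not from the question).
color_by_id = {1: "red", 2: "blue"}
default_color = "gray"  # used for any RUP_ID missing from the dictionary

verts, colors = [], []
# Sorting first keeps the vertices in Vert_ID order within each group.
for rup_id, grp in df.sort_values("Vert_ID").groupby("RUP_ID"):
    verts.append(grp[["Longitude", "Latitude"]].to_numpy())  # one (N, 2) array per line
    colors.append(color_by_id.get(rup_id, default_color))

# One collection, one color per line, in the same order as verts.
lc = LineCollection(verts, colors=colors)

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.autoscale()
```

Calling `plt.show()` afterwards displays the figure. Building both lists in the same loop keeps each color aligned with its line.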