
To remove duplicate lists from a list, there are several nice ways in Python - for example:

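a = [[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [ 9.1514622, 47.1166004 ] ]

print(len(a)) # 5
b_set = set(map(tuple, a)) # lists are unhashable, so convert each point to a tuple
b = list(map(list, b_set)) # back to a list of lists (a set does not keep the original order)
print(len(b)) # 4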
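
Since a set does not keep the original order, and the order of points matters for a polygon, here is a minimal order-preserving sketch of the same idea. It assumes Python 3.7+, where dict keys keep insertion order; the name `unique_points` is only illustrative.

```python
a = [[9.1514622, 47.1166004], [9.1513045, 47.1164599],
     [9.1516278, 47.1163001], [9.1517832, 47.1164408],
     [9.1514622, 47.1166004]]

# dict keys are unique and keep insertion order (Python 3.7+), so the
# first occurrence of each point stays in its original position.
unique_points = [list(p) for p in dict.fromkeys(map(tuple, a))]

print(len(unique_points))  # 4
```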

Unfortunately, I have to convert my list to a Shapely Polygon object, because I need to simplify the geometry and apply some other geometric operations.

from shapely.geometry import Polygon
a = [[[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [ 9.1514622, 47.1166004 ] ] ]
polys = [Polygon(item) for item in a] # convert each list of points to a Polygon
print(len(polys[0].exterior.coords)) # prints 5

This answer shows how to remove a duplicate Polygon from a list of Polygons, but how can I remove a duplicate point from the points that make up a single Shapely Polygon?

I guess it's possible to convert it back to a list, remove duplicates, and then re-convert to Polygon.

But that seems overly complicated. Any ideas on how to do this?

Georgy
philshem

1 Answer


Let's use the data in your question as an example. You have a list of coordinates:

L = [[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [ 9.1514622, 47.1166004 ]]

which is then converted into a Polygon:

P = Polygon(L)

Now, it might seem that L is redundant since the last point is the same as the first one. But that's actually not a problem since Shapely would otherwise duplicate the first point anyway (in order to close the boundary of the Polygon). You can see this with:

P = Polygon(L)
print(list(P.exterior.coords))
#[(9.1514622, 47.1166004), (9.1513045, 47.1164599), (9.1516278, 47.1163001), (9.1517832, 47.1164408), (9.1514622, 47.1166004)]

#now skip the last point
P = Polygon(L[:-1])
print(list(P.exterior.coords))
#[(9.1514622, 47.1166004), (9.1513045, 47.1164599), (9.1516278, 47.1163001), (9.1517832, 47.1164408), (9.1514622, 47.1166004)]

If there is a duplicate point "inside" L, as for example in:

L = [[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [9.1514622, 47.1166004 ]]

then it can be eliminated with the simplify method, using a tolerance of zero so that the shape itself is not altered:

print(list(Polygon(L).simplify(0).exterior.coords))
#[(9.1514622, 47.1166004), (9.1513045, 47.1164599), (9.1516278, 47.1163001), (9.1517832, 47.1164408), (9.1514622, 47.1166004)]
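
If the remaining vertices have to be kept exactly as they are, an alternative is to clean the coordinate list before building the Polygon. The following is only a sketch in plain Python, not a Shapely feature: the helper name `drop_consecutive_duplicates` is hypothetical, and it assumes that duplicates only need to be removed when they directly follow each other, so the closing point of the ring is left alone.

```python
from itertools import groupby

def drop_consecutive_duplicates(coords):
    """Collapse each run of identical consecutive points into one point."""
    # groupby groups adjacent equal items; tuples are compared by value.
    return [list(point) for point, _ in groupby(map(tuple, coords))]

L = [[9.1514622, 47.1166004], [9.1513045, 47.1164599],
     [9.1513045, 47.1164599], [9.1516278, 47.1163001],
     [9.1517832, 47.1164408], [9.1514622, 47.1166004]]

cleaned = drop_consecutive_duplicates(L)
print(len(L), len(cleaned))  # 6 5
```

The cleaned list can then be passed to Polygon(cleaned) as before. The first and last points are equal but not adjacent, so the ring stays closed.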
ewcz
  • `simplify` was very useful to me – vpipkt Mar 06 '20 at 16:41
  • `simplify(0)` will remove points which fall exactly on the line between two others, e.g. `[(0,1),(0,2),(0,3)] -> [(0,1),(0,3)]`. This may be fine for you, but it will break cases where those vertices are meaningful, e.g. OpenStreetMap road connectivity rules. – Sideshow Bob Sep 20 '21 at 09:28