
I have a list of co-ordinates which looks like this when plotted: [image: plotted co-ordinates]

The points do not lie on perfect lines. How can I separate them into multiple lists, where each list contains the co-ordinates that appear to lie on the same horizontal line?

Here is some sample data:

[(24, 228), (25, 194), (26, 162), (29, 83), (30, 52), (31, 17), (63, 223), (63, 194), (64, 162), (65, 84), (66, 49), (67, 19), (100, 228), (100, 190), (101, 158), (102, 81), (102, 54), (102, 20), (137, 227), (137, 195), (137, 163), (137, 86), (137, 52), (137, 22), (172, 23), (172, 57), (172, 87), (173, 163), (173, 195), (173, 227), (206, 24), (206, 58), (207, 84), (208, 159), (208, 191), (209, 223)] 
Spektre
Ruhshan
    provide sample data and I'll give you a starting point... – Julien Jan 09 '18 at 05:49
    Axis-aligned or arbitrary? The image hints that vertical lines are a better match. See [Efficiently calculating a segmented regression on a large dataset](https://stackoverflow.com/a/29232658/2521214) and [Given n points on a 2D plane, find the maximum number of points that lie on the same straight line](https://stackoverflow.com/a/20888844/2521214) – Spektre Jan 09 '18 at 16:15

2 Answers

1

Since you are interested in horizontal lines, all you care about for each point is its y co-ordinate. I would sort the y coordinates into ascending order, and then go through this list cutting it into segments where the gaps between neighbouring points are over some threshold. Each segment remaining is a cluster of points on the same horizontal line.
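A minimal sketch of this idea, assuming plain Python tuples and a gap threshold of 10 (which suits the sample data in the question but would need tuning for other inputs). The function name `split_into_rows` and the `gap` parameter are illustrative, not part of the answer above.

```python
def split_into_rows(points, gap=10):
    """Group points into rows by cutting the sorted y values at large gaps."""
    if not points:
        return []
    # Sort by y so that points on the same horizontal line become neighbours.
    ordered = sorted(points, key=lambda p: p[1])
    rows = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[1] - prev[1] > gap:
            rows.append([cur])      # gap too large: start a new row
        else:
            rows[-1].append(cur)    # close enough: same row as the previous point
    # Order each row left to right (by x) for convenience.
    return [sorted(row) for row in rows]


points = [(24, 228), (25, 194), (26, 162), (29, 83), (30, 52), (31, 17),
          (63, 223), (63, 194), (64, 162), (65, 84), (66, 49), (67, 19),
          (100, 228), (100, 190), (101, 158), (102, 81), (102, 54), (102, 20),
          (137, 227), (137, 195), (137, 163), (137, 86), (137, 52), (137, 22),
          (172, 23), (172, 57), (172, 87), (173, 163), (173, 195), (173, 227),
          (206, 24), (206, 58), (207, 84), (208, 159), (208, 191), (209, 223)]

rows = split_into_rows(points)
for row in rows:
    print(row)
```

Because each point is compared with its immediate neighbour in sorted order rather than with a fixed reference, a row that drifts gradually in y still stays in one group, as long as no single step exceeds `gap`.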

mcdowella
1

This should do the trick:

import numpy as np

data = np.array([(24, 228), (25, 194), (26, 162), (29, 83), (30, 52), (31, 17), (63, 223), (63, 194), (64, 162), (65, 84), (66, 49), (67, 19), (100, 228), (100, 190), (101, 158), (102, 81), (102, 54), (102, 20), (137, 227), (137, 195), (137, 163), (137, 86), (137, 52), (137, 22), (172, 23), (172, 57), (172, 87), (173, 163), (173, 195), (173, 227), (206, 24), (206, 58), (207, 84), (208, 159), (208, 191), (209, 223)])

thresh = 10  # maximum y distance for a point to join an existing group
groups = []
for point in data:
    x, y = point
    # Compare against the first point of each existing group.
    for g in groups:
        if abs(g[0][1] - y) < thresh:
            g.append(point)
            break
    else:
        # The loop did not break, so no group was close enough: start a new one.
        groups.append([point])

Up to you to tweak it as you like...

Julien