8

I have a list of points in 2D space that form an (imperfect) grid:

x     x        x         x

 x    x       x           x

                 x
x      x                x

 x     x      x           x

What's the best way to fit these to a rigid grid (i.e. create a two-dimendional array and work out where each point fits in that array)?

There are no holes in the grid, but I don't know in advance what its dimensions are.

EDIT: The grid is not necessarily regular (not even spacing between rows/cols)

user1958735
  • 103
  • 1
  • 6
  • 4
    Why not round each coordinate ? – Denys Séguret Jan 08 '13 at 16:34
  • 1
    You need the size of each row/column or only the row/column count? – bitWorking Jan 08 '13 at 16:39
  • 1
    Looks like a job for some kind of Fourier transform. – n. m. could be an AI Jan 08 '13 at 16:42
  • @n.m. You mean if you don't know in advance what is the virtual grid ? If so yes... – Denys Séguret Jan 08 '13 at 16:42
  • @redreggae: aren't they equivalent? For example, I can work out the size of each row by dividing the total number of points by the column count – user1958735 Jan 08 '13 at 18:24
  • 1
    @dystroy - I can't round the coordinates because points in the same row/col may be more than one unit apart – user1958735 Jan 08 '13 at 18:27
  • By "unit", you mean the side of a cell of the grid, right ? If so, I don't see the problem. – Denys Séguret Jan 08 '13 at 18:30
  • @dystroy - Let's say we had the following points: (1, 1), (5, 57), (102, 6), (90, 65). How would you round them? – user1958735 Jan 08 '13 at 18:41
  • I was supposing the the grid interval was known. if that's the case simply divide the values by the interval before rounding to the nearest integer. If you don't know it, you have to find it. You might maybe take the first period of a Fourier transform as suggested by @n.m. but I'm not sure at all. – Denys Séguret Jan 08 '13 at 19:01
  • FWIW I was just searching for algorithms to pick out 2D sub-grids from "bags of points" and was kind of amazed that there doesn't seem to be any. I had expected there to be some clever version of the Hough transform but for grids instead of lines. – BjornW Aug 25 '21 at 08:14

4 Answers4

5

A little bit of an image processing approach: If you think of what you have as a binary image where the X is 1 and the rest is 0, you can sum up rows and columns, and use a peak finding algorithm to identify peaks which would correspond to x and y lines of the grid:

Your points as a binary image:

enter image description here

Sums of row/columns

enter image description here

Now apply some smoothing technique to the signal (e.g. lowess):

enter image description here

I'm sure you get the idea :-)

Good luck

Omar Wagih
  • 8,504
  • 7
  • 59
  • 75
4

The best I could come up with is a brute-force solution that calculates the grid dimensions that minimize the error in the square of the Euclidean distance between the point and its nearest grid intersection.

This assumes that the number of points p is exactly equal to the number of columns times the number of rows, and that each grid intersection has exactly one point on it. It also assumes that the minimum x/y value for any point is zero. If the minimum is greater than zero, just subtract the minimum x value from each point's x coordinate and the minimum y value from each point's y coordinate.

The idea is to create all of the possible grid dimensions given the number of points. In the example above with 16 points, we would make grids with dimensions 1x16, 2x8, 4x4, 8x2 and 16x1. For each of these grids we calculate where the grid intersections would lie by dividing the maximum width of the points by the number of columns minus 1, and the maximum height of the points by the number of rows minus 1. Then we fit each point to its closest grid intersection and find the error (square of the distance) between the point and the intersection. (Note that this only works if each point is closer to its intended grid intersection than to any other intersection.)

After summing the errors for each grid configuration individually (e.g. getting one error value for the 1x16 configuration, another for the 2x8 configuration and so on), we select the configuration with the lowest error.

Initialization:

  P is the set of points such that P[i][0] is the x-coordinate and
                                   P[i][1] is the y-coordinate
  Let p = |P| or the number of points in P
  Let max_x = the maximum x-coordinate in P
  Let max_y = the maximum y-coordinate in P
     (minimum values are assumed to be zero)

  Initialize min_error_dist = +infinity
  Initialize min_error_cols = -1

Algorithm:

  for (col_count = 1; col_count <= n; col_count++) {
     // only compute for integer # of rows and cols
     if ((p % col_count) == 0) {   
        row_count = n/col_count;

        // Compute the width of the columns and height of the rows
        // If the number of columns is 1, let the column width be max_x
        // (and similarly for rows)
        if (col_count > 1) col_width = max_x/(col_count-1);
        else col_width=max_x;
        if (row_count > 1) row_height = max_y/(row_count-1);
        else row_height=max_y;

        // reset the error for the new configuration
        error_dist = 0.0;
        for (i = 0; i < n; i++) {
           // For the current point, normalize the x- and y-coordinates
           // so that it's in the range 0..(col_count-1)
           //                       and 0..(row_count-1)
           normalized_x = P[i][0]/col_width;
           normalized_y = P[i][1]/row_height;

           // Error is the sum of the squares of the distances between 
           // the current point and the nearest grid point 
           // (in both the x and y direction)
           error_dist += (normalized_x - round(normalized_x))^2 +
                         (normalized_y - round(normalized_y))^2;
        }

        if (error_dist < min_error_dist) {
           min_error_dist = error_dist;
           min_error_cols = col_count;
        }
     }
  }
  return min_error_cols;

Once you've got the number of columns (and thus the number of rows) you can recompute the normalized values for each point and round them to get the grid intersection they belong to.

beaker
  • 16,331
  • 3
  • 32
  • 49
  • Thanks for that! It's almost what I'm looking for, except I forgot to specify that the grid isn't necessarily regular (see other answer) – user1958735 Jan 21 '13 at 15:14
  • Thanks! I ended up adapting this approach for a slightly different problem where I couldn't assume there was no blank (but it can't be too sparse): Considering `x` and `y` coordinates separately, I try for increasing number of `slots` (i.e. cols or rows), stopping when it turns out that more than some percent of slots are empty, and then returning the solution with the smallest sum of square errors. – Davide Apr 12 '16 at 15:37
2

In the end I used this algorithm, inspired by beaker's:

  1. Calculate all the possible dimensions of the grid, given the total number of points
  2. For each possible dimension, fit the points to that dimension and calculate the variance in alignment:
    • Order the points by x-value
    • Group the points into columns: the first r points form the first column, where r is the number of rows
    • Within each column, order the points by y-value to determine which row they're in
    • For each row/column, calcuate the range of y-values/x-values
    • The variance in alignment is the maximum range found
  3. Choose the dimension with the least variance in alignment
user1958735
  • 103
  • 1
  • 6
1

I wrote this algorithm that accounts for missing coordinates as well as coordinates with errors.

Python Code

# Input [x, y] coordinates of a 'sparse' grid with errors
xys = [[103,101],
       [198,103],
       [300, 99],
       [ 97,205],
       [304,202],
       [102,295],
       [200,303],
       [104,405],
       [205,394],
       [298,401]]

def row_col_avgs(num_list, ratio):
    # Finds the average of each row and column. Coordinates are
    # assigned to a row and column by specifying an error ratio.
    last_num = 0
    sum_nums = 0
    count_nums = 0
    avgs = []
    num_list.sort()
    for num in num_list:
        if num > (1 + ratio) * last_num and count_nums != 0:
            avgs.append(int(round(sum_nums/count_nums,0)))
            sum_nums = num
            count_nums = 1
        else:
            sum_nums = sum_nums + num
            count_nums = count_nums + 1
        last_num = num
    avgs.append(int(round(sum_nums/count_nums,0)))
    return avgs

# Split coordinates into two lists of x's and y's
xs, ys = map(list, zip(*xys))

# Find averages of each row and column within a specified error.
x_avgs = row_col_avgs(xs, 0.1)
y_avgs = row_col_avgs(ys, 0.1)

# Return Completed Averaged Grid
avg_grid = []
for y_avg in y_avgs:
    avg_row = []
    for x_avg in x_avgs:
        avg_row.append([int(x_avg), int(y_avg)])
    avg_grid.append(avg_row)

print(avg_grid)

Code Output

[[[102, 101], [201, 101], [301, 101]], 
 [[102, 204], [201, 204], [301, 204]], 
 [[102, 299], [201, 299], [301, 299]], 
 [[102, 400], [201, 400], [301, 400]]]

I am also looking for another solution using linear algebra. See my question here.

Jakub
  • 489
  • 3
  • 13