4

This problem comes from cracking the coding interview chapter 7 problem 6. To me as a mathematician this seems like a simple least squares problem where we find the best fit line. Although, in the solution they take a different approach.

My question is the following: is a developing a least squares approach a sufficient solution or am I not understanding the problem at hand?

  • 1
    If you are sure that least squares approach solves the problem AND it can be coded, its a valid solution. Is it the best solution? Only if it is simpler to code / more efficient than https://stackoverflow.com/questions/4179581/what-is-the-most-efficient-algorithm-to-find-a-straight-line-that-goes-through-m – juvian Sep 18 '18 at 17:33
  • @juvian I mean least squares is what they do in pretty much any industry where you want to use regression. I am just confused why the author takes a totally different approach. Doesnt seem practical to me. –  Sep 18 '18 at 17:37
  • 3
    Least squares approach solves another problem, resulting line might not pass through points at all. – MBo Sep 18 '18 at 17:38
  • @MBo I see, so it is not really a viable solution then? –  Sep 18 '18 at 17:44
  • Yes, it is not suitable instrument – MBo Sep 18 '18 at 17:45
  • Generally, the least squares line will pass through *none* of the points. In fact, it is not a solution at all ! –  Sep 19 '18 at 08:39

3 Answers3

2

Least squares isn't an appropriate solution, it doesn't care about the number of aligned points. The least-squares fit might contain no point at all.

The solution in the link by julian has an O(N²) behavior, assuming that a hash map has O(N) behavior to count duplicates. (With sorting, O(N²Log N) can be guaranteed.)

The main idea is to take every point in turn, to compute the directions to all other points, and count the coincident directions.

0

You can consider using the dual plane.
For any point p with coordinates p_x,p_y, you can convert it into its dual line p* = (y=p_xx-p_y).
For any line l: y=mx+b, you can convert it to its dual point l
= (m, -b)

Colinear points in the plane then form intersecting lines in the dual plane. A line intersection algorithm can then be used to find the intersection point with the largest amount of lines. Translating this intersection point in the dual plane back to a line in the primal plane gives the line intersecting the largest amount of points.

See chapter 8.2 of Computation Geometry by M. de Berg et al for more specifics.

b9s
  • 517
  • 4
  • 13
0

Hough Transform is mostly what you are looking for. You may use Probabilistic version of it to speed up at the cost of some accuracy. OpenCv library has it implemented, but it isn't difficult to reimplement it.