49

The problem:

N points are given on a 2-dimensional plane. What is the maximum number of points on the same straight line?

The problem has O(N2) solution: go through each point and find the number of points which have the same dx / dy with relation to the current point. Store dx / dy relations in a hash map for efficiency.

Is there a better solution to this problem than O(N2)?

Reblochon Masque
  • 35,405
  • 10
  • 55
  • 80
Leonid
  • 22,360
  • 25
  • 67
  • 91
  • 2
    Don't think it's possible. It would be possible if there would exist such a transformation on a single point that would help, but unfortunately any transformation I can think of requires 2 points. A probabilistic approach (like Monte Carlo) could be faster, but there would be no garantee that it found the maximum. – ruslik Nov 14 '10 at 21:17
  • If you substitude point coordinates to line equation, `k*x[i] + b = y[i]`, you'll get equation about `k` and `x`. In {k,x}-space it will be a line. So there it become a problem of maximum lines going through one point. It may have a solution. – Vovanium Nov 15 '10 at 00:54
  • 1
    Note that problem makes sense only for integer coodonates, so `k` and `b` have to be rational numbers, such that `b == y - k*x` , where `y` and `x` are integers. Maybe by transforming the problem in the form "find such rational numbers `b` and `k` that satisfies most equations" will help. – ruslik Nov 15 '10 at 23:33
  • 9
    @leonid, do you think dx/dy is enough? I thought dx/dy is only the slope, but how about y-intercept? – Jack Jun 07 '12 at 16:16

9 Answers9

44

There is likely no solution to this problem that is significantly better than O(n^2) in a standard model of computation.

The problem of finding three collinear points reduces to the problem of finding the line that goes through the most points, and finding three collinear points is 3SUM-hard, meaning that solving it in less than O(n^2) time would be a major theoretical result.

See the previous question on finding three collinear points.

For your reference (using the known proof), suppose we want to answer a 3SUM problem such as finding x, y, z in list X such that x + y + z = 0. If we had a fast algorithm for the collinear point problem, we could use that algorithm to solve the 3SUM problem as follows.

For each x in X, create the point (x, x^3) (for now we assume the elements of X are distinct). Next, check whether there exists three collinear points from among the created points.

To see that this works, note that if x + y + z = 0 then the slope of the line from x to y is

(y^3 - x^3) / (y - x) = y^2 + yx + x^2

and the slope of the line from x to z is

(z^3 - x^3) / (z - x) = z^2 + zx + x^2 = (-(x + y))^2 - (x + y)x + x^2 = x^2 + 2xy + y^2 - x^2 - xy + x^2 = y^2 + yx + x^2

Conversely, if the slope from x to y equals the slope from x to z then

y^2 + yx + x^2 = z^2 + zx + x^2,

which implies that

(y - z) (x + y + z) = 0,

so either y = z or z = -x - y as suffices to prove that the reduction is valid.

If there are duplicates in X, you first check whether x + 2y = 0 for any x and duplicate element y (in linear time using hashing or O(n lg n) time using sorting), and then remove the duplicates before reducing to the collinear point-finding problem.

Community
  • 1
  • 1
jonderry
  • 23,013
  • 32
  • 104
  • 171
  • 2
    What's so special about assuming points as (x, x^3)? Why doesn't the math work out for say, (x, x^2)? – gjain Feb 28 '15 at 16:08
  • I was about to ask about what guarantees that the y-intercept is also the same for these 3 lines (i.e., the answer only shows that the *slope* will be the same), but then it clicked: if two lines have the same slope *and go through some common point*, then they are necessarily the same line -- and that's the case with all 3 possible lines here (xy and xz have the same slope and share x, and yz has the same slope and shares z with xz). – j_random_hacker Aug 19 '15 at 13:37
5

If you limit the problem to lines passing through the origin, you can convert the points to polar coordinates (angle, distance from origin) and sort them by angle. All points with the same angle lie on the same line. O(n logn)

I don't think there is a faster solution in the general case.

Anirudh Ramanathan
  • 46,179
  • 22
  • 132
  • 191
Erwin J.
  • 587
  • 1
  • 5
  • 15
4

The Hough Transform can give you an approximate solution. It is approximate because the binning technique has a limited resolution in parameter space, so the maximum bin will give you some limited range of possible lines.

ergosys
  • 47,835
  • 5
  • 49
  • 70
1

Again an O(n^2) solution with pseudo code. Idea is create a hash table with line itself as the key. Line is defined by slope between the two points, point where line cuts x-axis and point where line cuts y-axis.

Solution assumes languages like Java, C# where equals method and hashcode methods of the object are used for hashing function.

Create an Object (call SlopeObject) with 3 fields

  1. Slope // Can be Infinity
  2. Point of intercept with x-axis -- poix // Will be (Infinity, some y value) or (x value, 0)
  3. Count

poix will be a point (x, y) pair. If line crosses x-axis the poix will (some number, 0). If line is parallel to x axis then poix = (Infinity, some number) where y value is where line crosses y axis. Override equals method where 2 objects are equal if Slope and poix are equal.

Hashcode is overridden with a function which provides hashcode based on combination of values of Slope and poix. Some pseudo code below

Hashmap map;
foreach(point in the array a) {
    foeach(every other point b) {
        slope = calculateSlope(a, b);
        poix = calculateXInterception(a, b);
        SlopeObject so = new SlopeObject(slope, poix, 1); // Slope, poix and intial count 1.
        SlopeObject inMapSlopeObj = map.get(so);
        if(inMapSlopeObj == null) {
            inMapSlopeObj.put(so);
        } else {
            inMapSlopeObj.setCount(inMapSlopeObj.getCount() + 1);
        }
    }
}
SlopeObject maxCounted = getObjectWithMaxCount(map);
print("line is through " + maxCounted.poix + " with slope " + maxCounted.slope);
nhahtdh
  • 55,989
  • 15
  • 126
  • 162
Dheerendra Kulkarni
  • 2,728
  • 1
  • 16
  • 18
1

Move to the dual plane using the point-line duality transform for p=(a,b) p*:y=a*x + b. Now using a line sweep algorithm find all intersection points in NlogN time. (If you have points which are one above the other just rotate the points to some small angle). The intersection points corresponds in the dual plane to lines in the primer plane.

1

Whoever said that since 3SUM have a reduction to this problem and thus the complexity is O(n^2). Please note that the complexity of 3SUM is less than that. Please check https://en.wikipedia.org/wiki/3SUM and also read https://tmc.web.engr.illinois.edu/reduce3sum_sosa.pdf

  • 1
    It's unclear what you are trying to say. Especially the second sentence is uncomplete. – Floyd Feb 04 '21 at 16:47
0

As already mentioned, there probably isn't a way to solve the general case of this problem better than O(n^2). However, if you assume a large number of points lie on the same line (say the probability that a random point in the set of points lie on the line with the maximum number of points is p) and don't need an exact algorithm, a randomized algorithm is more efficient.

maxPoints = 0
Repeat for k iterations:
    1. Pick 2 random, distinct points uniformly at random
    2. maxPoints = max(maxPoints, number of points that lies on the 
       line defined by the 2 points chosen in step 1)

Note that in the first step, if you picked 2 points which lies on the line with the maximum number of points, you'll get the optimal solution. Assuming n is very large (i.e. we can treat the probability of finding 2 desirable points as sampling with replacement), the probability of this happening is p^2. Therefore the probability of finding a suboptimal solution after k iterations is (1 - p^2)^k.

Suppose you can tolerate a false negative rate rate = err. Then this algorithm runs in O(nk) = O(n * log(err) / log(1 - p^2)). If both n and p are large enough, this is significantly more efficient than O(n^2). (i.e. Supposed n = 1,000,000 and you know there are at least 10,000 points that lie on the same line. Then n^2 would required on the magnitude of 10^12 operations, while randomized algorithm would require on the magnitude of 10^9 operations to get a error rate of less than 5*10^-5.)

Tony Cai
  • 3
  • 3
-1

It is unlikely for a $o(n^2)$ algorithm to exist, since the problem (of even checking if 3 points in R^2 are collinear) is 3Sum-hard (http://en.wikipedia.org/wiki/3SUM)

user695652
  • 4,105
  • 7
  • 40
  • 58
-3

This is not a solution better than O(n^2), but you can do the following,

  1. For each point convert first convert it as if it where in the (0,0) coordinate, and then do the equivalent translation for all the other points by moving them the same x,y distance you needed to move the original choosen point.

2.Translate this new set of translated points to the angle with respect to the new (0,0).

3.Keep stored the maximum number (MSN) of points that are in each angle.

4.Choose the maximum stored number (MSN), and that will be the solution

Marcel Valdez Orozco
  • 2,985
  • 1
  • 25
  • 24
mariana soffer
  • 1,853
  • 12
  • 17